From 24951aa10cf0e9e988da7454069c54c58edeb979 Mon Sep 17 00:00:00 2001
From: LeonHvastja
Date: Mon, 25 Sep 2023 09:58:17 +0200
Subject: [PATCH 1/7] from local machine

---
 18-distributions_intuition.Rmd    | 335 ++++++++++
 A2-probability_distributions2.Rmd |   4 +-
 _bookdown.yml                     |   2 +-
 docs/404.html                     |   9 +-
 .../figure-html/unnamed-chunk-14-1.png | Bin 26380 -> 16526 bytes
 .../figure-html/unnamed-chunk-3-1.png  | Bin 18770 -> 16746 bytes
 .../figure-html/unnamed-chunk-7-1.png  | Bin 24298 -> 18251 bytes
 .../figure-html/unnamed-chunk-9-1.png  | Bin 18374 -> 26040 bytes
 docs/distributions-intutition.html  | 591 ++++++++++++++++++
 docs/distributions.html             |   0
 docs/probability-distributions.html | 126 +---
 docs/reference-keys.txt             |   5 +
 docs/search_index.json              |   2 +-
 13 files changed, 973 insertions(+), 101 deletions(-)
 create mode 100644 18-distributions_intuition.Rmd
 create mode 100644 docs/distributions-intutition.html
 create mode 100644 docs/distributions.html

diff --git a/18-distributions_intuition.Rmd b/18-distributions_intuition.Rmd
new file mode 100644
index 0000000..283597e
--- /dev/null
+++ b/18-distributions_intuition.Rmd
@@ -0,0 +1,335 @@
+# Distributions intuition
+
+This chapter is intended to help you familiarize yourself with the different
+probability distributions you will encounter in this course.
+
+Use [Appendix B](#distributions) as a reference for the basic properties of distributions.
+
+
+```{r, echo = FALSE}
+togs <- T
+tog_ex <- T
+```
+
+
+
+
+
+
+
+
+## Discrete distributions
+
+```{exercise, name = "Bernoulli intuition 1"}
+Arguably the simplest distribution you will encounter is the Bernoulli distribution.
+It is a discrete probability distribution used to represent the outcome of a yes/no
+question. It has one parameter $p$, which is the probability of success. The
+probability of failure is $1-p$, sometimes denoted as $q$.
+
+A classic way to think about a Bernoulli trial (a yes/no experiment) is a coin
+flip. A fair coin has the same probability of landing heads (1) as tails (0),
+i.e. $p=0.5$, shown below in *figure a*. Alternatively, we may want to represent
+a process whose outcomes are not equally likely, such as "Will a throw of a fair
+die result in a 6?". In this case $p=\frac{1}{6}$, shown in *figure b*.
+
+Using your knowledge of the Bernoulli distribution, use the throw of a fair die
+to think of events such that:
+
+a) $p = 0.5$
+b) $p = \frac{5}{6}$
+c) $q = \frac{2}{3}$
+```
+
+```{r, echo=FALSE, warning=FALSE, message=FALSE}
+library(ggplot2)
+
+# Create data
+df <- data.frame(
+  outcome = rep(c(0, 1), 2),
+  probability = c(1-0.5, 0.5, 1-1/6, 1/6),
+  p_value = factor(rep(c("a", "b"), each=2))
+)
+
+# Plot
+ggplot(df, aes(x=factor(outcome), y=probability)) +
+  geom_segment(aes(xend=factor(outcome), yend=0), color='black') +
+  geom_point(color="red", size=3) +
+  coord_cartesian(ylim = c(0, 1)) +
+  labs(y = "Probability", x = "Outcome") +
+  theme(legend.position="none") +
+  facet_wrap(~p_value, ncol=2, scales = "free_x", labeller = label_parsed)
+
+```
+
+<details>
+```{solution, echo = togs}
+a. An event that is equally likely to happen or not happen, i.e. $p = 0.5$, would be
+throwing an even number. More formally, we can name this event $A$ and write:
+  $A = \{2,4,6\}$, $P(A) = 0.5$.
+
+
+b. An example of an event with $p = \frac{5}{6}$ is throwing a number
+greater than 1, defined as $B = \{2,3,4,5,6\}$.
+
+c. We need an event that fails $\frac{2}{3}$ of the time. Alternatively, we can
+reverse the problem and look for an event that succeeds $\frac{1}{3}$ of the time,
+since $q = 1 - p \implies p = 1 - q = \frac{1}{3}$. The event that the outcome
+is divisible by 3, $C = \{3, 6\}$, satisfies this condition.
+```
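+
+One way to double-check these answers is to enumerate the six die outcomes in R
+and compute the proportion of outcomes belonging to each event:
+
+```{r, echo = tog_ex, eval = FALSE}
+die <- 1:6
+mean(die %% 2 == 0)  # event A: even number, expect p = 0.5
+mean(die > 1)        # event B: greater than 1, expect p = 5/6
+mean(die %% 3 == 0)  # event C: divisible by 3, expect p = 1/3, i.e. q = 2/3
+```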
+ +```{exercise, name = "Binomial intuition 1"} +The binomial distribution is a generalization of the Bernoulli distribution. +Instead of considering a single Bernoulli trial, we now consider a sequence of $n$ trials, +which are independent and have the same parameter $p$. So the binomial distribution +has two parameters $n$ - the number of trials and $p$ - the probability of success +for each trial. + +If we return to our coin flip representation, we now flip a coin several times. +The binomial distribution will give us the probabilities of all possible outcomes. +Below we show the distribution for a series of 10 coin flips with a fair coin +(left) and a biased coin (right). The numbers on the x axis represent the +number of times the coin landed heads. + +Using your knowledge of the binomial distribution: + +a. Take the [pmf of the binomial distribution](#distributions) and plug in $n=1$, +check that it is in fact equivalent to a Bernoulli distribution. + +b. In our examples we show the graph of a binomial distribution over 10 trials with +$p=0.8$. If we take a look at the graph, it appears as though the probabilities of getting 0,1,2 or 3 +heads in 10 flips are zero. Is it actually zero? Check by plugging in the values +into the pmf. +``` + +
+```{solution, echo = togs}
+a. The pmf of a binomial distribution is $\binom{n}{k} p^k (1 - p)^{n - k}$. Inserting
+$n=1$ we get:
+  $$\binom{1}{k} p^k (1 - p)^{1 - k}.$$
+This is not quite the Bernoulli pmf yet, but note that the support of the binomial
+distribution is defined as $k \in \{0,1,\dots,n\}$, so in our case $k \in \{0,1\}$. Since
+  $$\binom{1}{0} = \binom{1}{1} = 1,$$
+we are left with $p^k (1 - p)^{1 - k}$, the Bernoulli distribution.
+
+b. As we already know, $p=0.8$ and $n=10$, so:
+  $$\binom{10}{0} 0.8^0 (1 - 0.8)^{10 - 0} = 1.024 \cdot 10^{-7}$$
+  $$\binom{10}{1} 0.8^1 (1 - 0.8)^{10 - 1} = 4.096 \cdot 10^{-6}$$
+  $$\binom{10}{2} 0.8^2 (1 - 0.8)^{10 - 2} = 7.3728 \cdot 10^{-5}$$
+  $$\binom{10}{3} 0.8^3 (1 - 0.8)^{10 - 3} = 7.86432\cdot 10^{-4}$$
+So the probabilities are not zero, just very small.
+```
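+
+If you would rather not evaluate the pmf by hand, one option is to let R do it with
+`dbinom()`, which computes binomial probabilities directly:
+
+```{r, echo = tog_ex, eval = FALSE}
+# part b: probabilities of 0, 1, 2 and 3 heads in 10 flips with p = 0.8
+dbinom(0:3, size = 10, prob = 0.8)
+
+# part a: with n = 1 the binomial pmf reduces to the Bernoulli pmf
+dbinom(0:1, size = 1, prob = 0.8)
+```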
+ + +```{r, echo=FALSE, warning=FALSE, message=FALSE} +library(ggplot2) + +# Data for binomial distribution +df1 <- data.frame( + outcome = 0:10, + probability = dbinom(0:10, size=10, prob=0.5), + p_value = "p = 0.5" +) + +df2 <- data.frame( + outcome = 0:10, + probability = dbinom(0:10, size=10, prob=0.8), + p_value = "p = 0.8" +) + +df <- rbind(df1, df2) + +ggplot(df, aes(x=factor(outcome), y=probability)) + + geom_segment(aes(xend=factor(outcome), yend=0), color='black') + + geom_point(color="red", size=2) + + labs(y = "Probability", x = "Outcome") + + theme(legend.position="none") + + facet_wrap(~p_value, ncol=2, scales = "free_x") + +``` + + +```{exercise, name = "Poisson intuition 1"} +Below are shown 3 different graphs of the Poisson distribution. Your task +is to replicate them on your own in R by varying the $\lambda$ parameter. + +Hint: You can use dpois() to get the probabilities. +``` + +```{r, echo=FALSE, warning=FALSE, message=FALSE} +library(ggplot2) +library(gridExtra) + +x = 0:15 + +# Create Poisson data +data1 <- data.frame(x = x, y = dpois(x, lambda = 0.1)) +data2 <- data.frame(x = x, y = dpois(x, lambda = 1)) +data3 <- data.frame(x = x, y = dpois(x, lambda = 7.5)) + +# Create individual ggplot objects +plot1 <- ggplot(data1, aes(x, y)) + geom_col() + + xlab("x") + ylab("Probability") + ylim(0,1) + +plot2 <- ggplot(data2, aes(x, y)) + geom_col() + + xlab("x") + ylab(NULL) + ylim(0,1) + +plot3 <- ggplot(data3, aes(x, y)) + geom_col() + + xlab("x") + ylab(NULL) + ylim(0,1) + +# Combine the plots +grid.arrange(plot1, plot2, plot3, ncol = 3) +``` + +
+```{r, echo = tog_ex, eval=FALSE} +library(ggplot2) +library(gridExtra) + +x = 0:15 + +# Create Poisson data +data1 <- data.frame(x = x, y = dpois(x, lambda = 0.1)) +data2 <- data.frame(x = x, y = dpois(x, lambda = 1)) +data3 <- data.frame(x = x, y = dpois(x, lambda = 7.5)) + +# Create individual ggplot objects +plot1 <- ggplot(data1, aes(x, y)) + geom_col() + + xlab("x") + ylab("Probability") + ylim(0,1) + +plot2 <- ggplot(data2, aes(x, y)) + geom_col() + + xlab("x") + ylab(NULL) + ylim(0,1) + +plot3 <- ggplot(data3, aes(x, y)) + geom_col() + + xlab("x") + ylab(NULL) + ylim(0,1) + +# Combine the plots +grid.arrange(plot1, plot2, plot3, ncol = 3) +``` +
+ + +```{exercise, name = "Poisson intuition 2"} +The Poisson distribution is a discrete probability distribution that models +processes where events occur at a constant mean rate and are independent of each other. + +It has a single parameter $\lambda$, which represents the constant mean rate. + +A classic example of a scenario that can be modeled using the Poisson distribution +is the number of calls received by a call center in a day (or in fact any other + time interval). + +Suppose you work in a call center and have some understanding of probability +distributions. You overhear your supervisor mentioning that the call center +receives an average of 2.5 calls per day. Using your knowledge of the Poisson +distribution, calculate: + +a. The probability you will get no calls today. +b. The probability you will get more than 5 calls today. +``` + +
+```{solution, echo = togs}
+
+First recall the Poisson pmf: $$p(k) = \frac{\lambda^k e^{-\lambda}}{k!}.$$
+
+  As stated previously, our parameter is $\lambda = 2.5$.
+
+a. To get the probability of no calls we simply plug in $k = 0$, so: $$p(0) = \frac{2.5^0 e^{-2.5}}{0!} = e^{-2.5} \approx 0.082$$
+
+b. The support of the Poisson distribution is the non-negative integers. So if we wanted
+to calculate the probability of getting more than 5 calls directly, we would need to add up
+the probabilities of getting 6 calls, 7 calls, and so on up to infinity.
+Let us instead remember that the sum of all probabilities is 1 and
+reverse the problem, asking "What is the probability we get 5 calls or fewer?".
+We can subtract the probability of the opposite outcome (the complement) from 1
+to get the probability of our original question.
+
+  $$P(k > 5) = 1 - P(k \leq 5)$$
+  $$P(k \leq 5) = \sum_{i=0}^{5} p(i) = p(0) + p(1) + p(2) + p(3) + p(4) + p(5) =$$
+  $$= \frac{2.5^0 e^{-2.5}}{0!} + \frac{2.5^1 e^{-2.5}}{1!} + \dots =$$
+  $$=0.957979$$
+
+  So the probability of getting more than 5 calls is $1 - 0.957979 = 0.042021$.
+```
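+
+The same numbers can be obtained in R, for example with `dpois()` for the pmf and
+`ppois()` for the cdf:
+
+```{r, echo = tog_ex, eval = FALSE}
+dpois(0, lambda = 2.5)      # part a: P(k = 0)
+1 - ppois(5, lambda = 2.5)  # part b: P(k > 5) as the complement of P(k <= 5)
+```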
+ +```{exercise, name = "Geometric intuition 1"} +The geometric distribution is a discrete distribution that models the **number of +failures** before the first success in a sequence of independent Bernoulli trials. +It has a single parameter $p$, representing the probability of success. + +*Disclaimer*: There are two forms of this distribution, the one we just described +and another version that models the **number of trials** before the first success. The +difference is subtle yet significant and you are likely to encounter both forms, +though here we will limit ourselves to the former. + +In the graph below we show the pmf of a geometric distribution with $p=0.5$. This +can be thought of as the number of successive failures (tails) in the flip of a fair coin. +You can see that there's a 50% chance you will have zero failures i.e. you will +flip a heads on your very first attempt. But there is some smaller chance that you +will flip a sequence of tails in a row, with longer sequences having ever lower +probability. + +Suppose you are gambling over coin flips with your friend and they propose if they +get 5 tails in a row you must give them 100€ and you get 1€ if they fail. + +a) Does it make sense to accept this wager? +b) They change their mind and claim that getting 10 tails in a row is half as +likely, so now they want 200€ if they succeed, but still only offer to pay 1€ +if they fail. Does it make sense to accept this wager? +c) *Bonus*: Look up the second form of this distribution and redo problems a) and b). + +``` + +```{r, echo=FALSE, warning=FALSE, message=FALSE} +library(ggplot2) + +# Parameters +p <- 0.5 +x_vals <- 0:9 # Starting from 0 +probs <- dgeom(x_vals, p) + +# Data +data <- data.frame(x_vals, probs) + +# Plot +ggplot(data, aes(x=x_vals, y=probs)) + + geom_segment(aes(xend=x_vals, yend=0), color="black", size=1) + + geom_point(color="red", size=2) + + labs(x = "Number of trials", y = "Probability") + + theme_minimal() + + scale_x_continuous(breaks = x_vals) # This line ensures integer x-axis labels + +``` + +
+```{solution, echo = togs}
+a. The probability of 5 tails in a row is $0.5^5 = \frac{1}{32}$. Your expected
+winnings are $\frac{31}{32} \cdot 1€ - \frac{1}{32} \cdot 100€ \approx -2.16€$ per
+wager, so it does not make sense to accept it.
+
+b. Getting 10 tails in a row is not half as likely, it is $2^5 = 32$ times less
+likely, with probability $0.5^{10} = \frac{1}{1024}$. Your expected winnings are now
+$\frac{1023}{1024} \cdot 1€ - \frac{1}{1024} \cdot 200€ \approx 0.80€$ per wager,
+so accepting this wager makes sense.
+
+c. In the "number of trials" parametrization the underlying events are the same,
+only the indexing shifts by one, so the answers to a) and b) do not change.
+```
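+
+One way to check the probabilities involved in this wager is R's `pgeom()`, which
+uses the same "number of failures" parametrization as this chapter:
+
+```{r, echo = tog_ex, eval = FALSE}
+pgeom(4, prob = 0.5, lower.tail = FALSE)  # P(at least 5 failures in a row) = 0.5^5
+pgeom(9, prob = 0.5, lower.tail = FALSE)  # P(at least 10 failures in a row) = 0.5^10
+```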
+ +## Continuous distributions diff --git a/A2-probability_distributions2.Rmd b/A2-probability_distributions2.Rmd index ab7994d..3049971 100644 --- a/A2-probability_distributions2.Rmd +++ b/A2-probability_distributions2.Rmd @@ -1,10 +1,10 @@ -# Probability distributions +# Probability distributions {#distributions} Name | parameters | support | pdf/pmf | mean | variance --- | --- | --- | --- | --- | --- Bernoulli | $p \in [0,1]$ | $k \in \{0,1\}$ | $p^k (1 - p)^{1 - k}$
\@ref(exr:binomialpmf) | $p$
\@ref(exr:Bernev) | $p(1-p)$
\@ref(exr:Bernev) -binomial |$n \in \mathbb{N}$, $p \in [0,1]$| $k \in \{0,1,\dots,n\}$ | $\binom{n}{k} p^k (1 - p)^{n - k}$
\@ref(exr:bincdf) |$np$
\@ref(exr:binomev) | $np(1-p)$
\@ref(exr:binomev) +binomial |$n \in \mathbb{N}$, $p \in [0,1]$| $k \in \{0,1,\dots,n\}$ | $\binom{n}{k} p^k (1 - p)^{n - k}$
\@ref(exr:bincdf) |$np$
\@ref(exr:binomev) | $np(1-p)$
\@ref(exr:binomev) Poisson |$\lambda > 0$| $k \in \mathbb{N}_0$ | $\frac{\lambda^k e^{-\lambda}}{k!}$
\@ref(exr:poisex) | $\lambda$
\@ref(exr:poisev) | $\lambda$
\@ref(exr:poisev) geometric | $p \in (0,1]$ | $k \in \mathbb{N}_0$ | $p(1-p)^k$
\@ref(exr:geocdf) | $\frac{1 - p}{p}$
\@ref(exr:geoev)| $\frac{1 - p}{p^2}$
\@ref(exr:geovar) normal |$\mu \in \mathbb{R}$, $\sigma^2 > 0$ | $x \in \mathbb{R}$ |$\frac{1}{\sqrt{2 \pi \sigma^2}} e^{-\frac{(x - \mu)^2}{2 \sigma^2}}$
\@ref(exr:normalpdf) | $\mu$ \@ref(exr:normev) | $\sigma^2$ \@ref(exr:normev) diff --git a/_bookdown.yml b/_bookdown.yml index 638ca93..074641f 100644 --- a/_bookdown.yml +++ b/_bookdown.yml @@ -1,7 +1,7 @@ book_filename: "bookdown-pou" chapter_name: "Chapter " output_dir: docs -rmd_files: ["index.Rmd", "01-Introduction.Rmd", "02-uncountable_probability_spaces.Rmd", "03-conditional_probability.Rmd", "04-random_variables.Rmd", "05-multiple_random_variables.Rmd", "06-integration.Rmd", "07-expected_value.Rmd", "08-multivariate_random_variables.Rmd", "09-alternative_representations.Rmd", "10-concentration_inequalities.Rmd", "11-convergence_of_random_variables.Rmd", "12-limit_theorems.Rmd", "13-estimation_basics.Rmd", "14-bootstrap.Rmd", "15-maximum_likelihood.Rmd", "16-null_hypothesis_significance_tests.Rmd", "17-Bayesian_inference.Rmd", "Appendix.Rmd", "A1-R_programming_language.Rmd", "A2-probability_distributions2.Rmd", "References.Rmd"] +rmd_files: ["index.Rmd", "01-Introduction.Rmd", "02-uncountable_probability_spaces.Rmd", "03-conditional_probability.Rmd", "04-random_variables.Rmd", "05-multiple_random_variables.Rmd", "06-integration.Rmd", "07-expected_value.Rmd", "08-multivariate_random_variables.Rmd", "09-alternative_representations.Rmd", "10-concentration_inequalities.Rmd", "11-convergence_of_random_variables.Rmd", "12-limit_theorems.Rmd", "13-estimation_basics.Rmd", "14-bootstrap.Rmd", "15-maximum_likelihood.Rmd", "16-null_hypothesis_significance_tests.Rmd", "17-Bayesian_inference.Rmd","18-distributions_intuition.Rmd", "Appendix.Rmd", "A1-R_programming_language.Rmd", "A2-probability_distributions2.Rmd", "References.Rmd"] new_session: no diff --git a/docs/404.html b/docs/404.html index 705a64e..1e124f5 100644 --- a/docs/404.html +++ b/docs/404.html @@ -23,7 +23,7 @@ - + @@ -242,6 +242,11 @@
  • 17.1 Conjugate priors
  • 17.2 Posterior sampling
  • +
  • 18 Distributions intuition
  • Appendix
  • A R programming language
  • -
  • B Probability distributions
  • +
  • B Probability distributions
  • References
  • Published with bookdown
[The remaining hunks of this patch are GIT binary patches for the regenerated figure images
docs/bookdown-pou_files/figure-html/unnamed-chunk-14-1.png, unnamed-chunk-3-1.png,
unnamed-chunk-7-1.png and unnamed-chunk-9-1.png; binary patch data omitted.]
zsKf?|HfqW+pRPDC(wt;&DT&N8H(tn23v3+ad<2!504uRgNd#yZIHwnOunZbu|_dx)(R*Fkim!;dVoh98dKW zR_UEM8+^Mt&crHa=IX}?-4s~uh0GdLDQtX0Am4hzXWEZCSVpRHzNfS&sglEUGVZZ{ z&soVloolc96*FcRVi8@(z3(?U;TD}Wdp;?rFEPmPf?RqyRr{;0$$_xw-v*j`MgbGvmo<4R$@x&8C``z(2E zsF};@qcKf{fV2d-cIMP+2d2BlI@n`iP~oYQIq0b$KhkcHt_>BqWSE%S!)!w4!SPl8 zX3m^+pn3$on07*&0F_UU`IVG2oVEyMK3d%VxbJDH zv=S$-qiGfQNnc#cgV%95lC+-96x>EC1m?U8$*&>$<1@ry7al|wFv#zNT)OzZaQ!a6 z+v}%Gsp=f#S1I@BaqS|imyhE2hLnilBSp@iFH;e}{NM~5*fkW+Dd0zQO&@1V|8Dl) z$KkDHZkoW`X1b3tl`=xBmuo}Mle5K19kS7~3Zpfz4AnCbo79gdPFzI~9xwfJJu&v@ z(C)Q<^DQjE>wn%j@<3R~2h-*(g}A{98s7>- zs&MAc9pkSb{MxHgo*;~$^)z$UkbOD#YCz;uz5z=Wzo~o0=-MUh?@XD8$}akzgf~4Zq<;&_kDCHUy&sX`UJcq%)huW{0jSYKWi03o7(&| z8j(WXkB>X@g@UzA2O5Vut~h)>L~q20%Pjw<&HBVT#>nF;I7REItQJmS^imN9F$b&E z6vDqs&WuK&CUqVOBnwfG*1#yVDyN@U_@Q{njoSwsUGw2n%iEln%^Az{r?f&#u5$}K7(R#qGh)4Y($Ai>V+>9s*c$G&_s?324J zFx87)5^th5D=IfakLJ$`t7?zr{(y>R;FW9fpM2*}u_|(+Y9e%Nb!ZKWpS>F0gZz38 zlj?DbZj+{uTXJfSD!yf!ekd~hF!l7}-Vh7q_W{)(CfJVBlPm&k%HOZ1T#SoQgW8JH zJ(}7p!xI5aa(oFvrT|~U-Xby_o2qeb!Bn~{AY3d3+F}*hRC_T_k|8wy7e=>!jZBmo zmwY90d(t0f^yr{vyrx%c`@nww&UnkUjyk3ppNs*2VkkMh@0i`)k){wyLmA*_P{&2L zw-@!x%3p4@wAqi2fPKvk;(2LE z|DH0ZQ#lVwV-wDbHZy$R_o(caJYE!N0+VhY00$Z+cEYKZi;vyV{m+vkqi-;2Qk^>w zQ~9Msn_n6+-3*^)=?PmrAU&L$JXH53yo%?5z3M6RvKNGd2^zwN++d@ZrLyE?mCtzU zPj{sbIPqK#cq3B$?bsOmp$d7lHhrddUApa+ePPsd`Aa{=qKuUb>+FngpRoIwJ%fi5 z&PwL%_{^N8^-JhXx*vT*`3RLYUW|79O61doO~Fae<86gP?7fwQ@#?W2%_pTpHL=n_ z|8TrV+PB=;Q_xSd4=7>rm(S9Wd-n0p zPx!`E9x%IT>-6`4ou`Z*#J8H}EcIFc9-d@k}v4aks zZsymc`(@3!2hFqmjTCjMk zrbiGw91JW5G@^LJf1NhffqpM-9lfr2PzZSfGZB~ZVh`jy)qiaqR)(x&8n2R&KqBb@fQhw#@)Ized7DQx=VT{R9ngA-lv;Fo}EjQMZ1V>;zH)uX1~l0O{J3rF5udh_gZ>HyMBc?y?zAaBK&%!^_nGob6WD4JOmsB1kPYb2y?qL z=y+9s5MD9J+ipfh&1<`Wo4<;|DPi8}30$XH3!;DJwlF?&GvI_`k@21m=IP8E?+^}- zPFLySJNrtmq2*J7B=A>2Zj8{wvGgxOW4|WWD7|ab@16m(5DYZ# zdWHz$`p?(=2MC`N{$be}FHKKPF*a&5TDq^kr}mz}*TnhJ8$dP!*bgNl>g4CO(w{e@ z^(^l~$Ur!9onSyawpnz}PjO)MMu+pc*mug1A#K0u&7fR!p)&kx4hQNdb`F^$_a|~@ z9HxaEW|mhK6~bq!kKmw4kLIhiFVAUHOu_vTsf}EyYldiq1_cfN1ait%_-)mwGMPn8Q@ zwjK1&jf`HBYsdaQWCajS9@`O3{;ny4f4LQ&<)lK_jHc-{GD;w z3m$7O&!05Ez+i=UM0K^AO(6PJ_hNRa1THN;M)^QKwH+~Mivv-J;=j5mYK+ijDEhw1 zh@FLZn?*}c3gJyQ6{bBdUR%fY8HNK?i`opE0icLTV%Ou&O+^B*9=h^M-;}oF?>iuA z6nMZNfZG2zpT1WFEszpI^fO@PCEKVV^PZ|7LU`qE^i6R^rx4z3`!TK}=pJGOFx4hv z?*^Hcq_o6C$k}cHYBi2T^fQ zy66YL7WnlKzEvtbO4C8GU5jndLB^IoQj6P>hr? 
zu*gv43=qhz`xwckI zPur;o@YKzT7HSN{4>8k~%dCFZLim!WOF^i%pG*j>7_@Ay!MbWWEI1lr5!%1{)XZIy z2AxV6`@N(9apnW)>udG7yJ-FTNnxwC*GFRE{!6~%8wEBWb0M^T!zwZP{QOJlNlBQ~ zSZ9sdb+(*R&szxGPis#rX>~!5mG)}T^1c35q06r0G-#_g-gb45s{>nB6`41|-XHxe z=opCaV}i?(U&*WE3`O=}wzr2$_OYQ7$i#edUGzzEi>nAe9t`@cwWA~mE&ey~a*d$1 z>EIFXpplqywLPsHm~mSY)Pj+rsLon7fU9LzZjwX_77u>oZDBg&Up+Gj5ZUFui>cZy z4KgHH@Jv*7$%lSDlP+EamOp5%OUaTn-rrz2T_QbFZm@5SSWQgTrHcSH;Y-D03?B7h zp;$5RJ>2K4*DC!$ud^H{_45J;94ccNu$}Ex!WfbZ%%h=@(zHCprPECqrTtWQsu~us~pQAh-kX zT_(l&xlVq4U!Lo@Oj2Uc+Oufzi);}@UxoslF}*kcRCNot13hS|LDstda?ot``XDmi z%#YR&f=+!CAiN$1mn2DKfr5m{+0uUdAN5^42NAR*Oz&CR3o=u#+Sr&N)DUz)ET-5AzvRE%f>uq~CtdO`c&A zfLJO5PBWofc{hFhxANNi2y!=dyHc9B;3D*8-so``x9j*w4srbnN9$nq_MlUL6d}0Z zMkZLU)h*tFy?W>W*-pA-{uaW(=>ir|xh`{xfjD*;e`(_ltCH{n$?5oq}P{Bsb0`(HG90mbPG_b=&MpLAtW^54j> zxMH7D%nHfv%N{A;T@C#)wUHumBR8&1m7(Z%sf^cjV9?@1?__FFXK#T)Y>&!83t6A1 z&DTSAr%(TmfUg2YZ@r9}cW}%FNKU%aa_?%NrB#67A?@1n`AKByyg$T7 z7z1W9)iSsTseH?GuxJaI^4Bo_&w-)7>}Pfw9X~%k+crr+e@5t=B9+cXn^`lI6>*j{ zoZVq_{ZZ$eBisZ?{r~WZlD7mPj$~px7l5S8CQ5E%VE<%GPymqs{S#4vu9JLxmb30G z758UKPX+xsMpUJN#Xd_pY@}Omb|&)GpR@M zNlCH*fjWxXOmET0otGiqGffobK|4@^UN1j9L&}*1*GlR?irjM1*nneF1GNF|E#FH4 z!HkwSqDfj!M!0^fcP>~2!nS$&LJz^tY*hr@ky zT+r+UkFB~H$hT4TSFn})MgENM0uJy}0b-SVdB=Px8^J{f8zLyTqb3FF-)V$hcl4t* zKlBG5T&$(S_#M;=jIp8}(mb z^3x?&X@qgdW^9lY?*K#!xuHgs!XZVO2>0lZ&ga4ij>lgQt7D4+zET70o>oZ4#isQ(!om;~4Zd{hWN@~rq7_%1?sE{%_kpg$2cdsI#m9OX>+Kyv#RNTdFy z*y%&F$@>RSu%V_`rx<$8-rHb}95$_I5A@Gxgncuq;!?YLp992V%o|n@mm+e6uEqkL zD}qULWbd%V+82<1Qvt0YMFX3?y!X=`#CUpsaVyABil1BQSm*Geg(|JAKwy6CXr{FM zqSj7IfrIyOw~U`ZRg3`y&6!As18C6B&|$+f#9}%>Mcen7)gGmGWfpKwo!^Bv8JIty z_Ns!<%0BTL8Zm(TsR2->V1iU4em})1UsOdcdx5Bx4m_y1Mmm=HG>n1?#5)Ag@?Pm> z$(PrEZGznuBA`vDXxS$(Pj#1?M@1lN$IB_3YJm;)0qgD%L!k@hIJ}1q5QYF{530K>}eBlJzit3V65|U2GFY@Uv>PW%d;5kP5qnDDRSu2tH=IIE|GO zqvQP%hEQUmwx)ACo7xN5x)*9rlT=A@{=32veIVusx{A4D;!WA9syf(RLfUJ}p6g)< zm9Aej)Iu>x_Nt2m!p;NbQ~!gDj@P*VM}jZ#Sd36%V!%=GlkkDA^(Wy)LH2%e8R4U8 z+d?FXCnbOhD&u!^VukLD^SR8e4=M;@^D&4$d=W>XKf|{M2o6#npWg^{tVEh#@Zn9| zKF^-CxyhzE-F_I~TsXBq_&nc%EgFdIjrVmL!jw80V4KM{N5F`ZKr$Z~&x!Qw0*fOz zy6EY6mw5n>#cXdPrw8w#F{^X0IMH6=0A{--a57+gM~a#TNC7{h5v!a@{yf`*QvWUe zZ;2GX)(ppm>W^Y{HxN15DYXGeJ?F2;07(%i%@$2-TQIFZ@v(jXvqd*jQE$V!(~`>a zXaL8;EN`6J6Yqr9c4bcT=M8JN+Xh>+%dWlc=8k>8G9DCz$o@k#K9O2XQe~9=#o%SY zVC!2nW6R9`G7-xl|5^~urB%L>fC`52Zs1E(8U67{HC7p(@&oA9Y_8#b=<|Onc%5rn z$&0K4ftapjL~VXS+!gQhCI`1jsB@=%@!xZ$NAd5R&lQs*|K;2+h-_Q*d9f#c{Sz&; zEVil#!%$StvX3gqRBZfmL)+HdXrWM(N<;|LHrd%9>A~JiluF~3Nqc#yt5kONpP*%@(V4HSHQk4^ z6G}j$FUWq;Xx|r`{cJYA0tI+Mmy4;W3A;-0zs63;*eo~4h6wEranZ!aa>t(c_o(jP zCbEi0yr6558=39E8YbfobGK7EPB*ruF^@R3H-_*^h zFMrK?;q!p7TK+JB7sG{WWn3}C>6-*S){U`2gA9NH#Zy#({{%Yqlnfueq-yaLuK9-8 zXCzGJtq?j|O~7r+-uSDZ*A@*XcB8ef0H-F9f+jmSMes%s3Z2OfgG>Tn!5@AsH&{=c zThod`_!~13FDh<{nr<8ja;xDF1?L%I>k6yZ%!Y9I;~5H>u`tJfwmS-nN60V-$(5ee ziuu0yPcgjC_2x@x)K4?XO%nm&{u!1F-_g_(?1S?zU~3bNSLbGqL%J!+yOivK)xvAb zz0cBEz)Z{}@M*W{m`>%4{A~H0RIIqQ2^T&~1*@qZKsU(|@R?imy zH011g5%0D?86iYHkOh66-ZC9=ojxvEdm-oU_O7l;LS(O*+2AiRc?hl@$Zl$fuUzor z++q{Ooa!kq+Qi@-k3cqp=9axugcMov|KX-|@D?I7M!_1J)*_x*EKZ%^rM1!gX)go# zTELr*A|%I3`S`RBR=MshvqvZPWs7)HF|fB@B0zR-iOLMIrN*lzRSQeOe+y2jwjOrI z|J*N3F-j!gbi>x$y1Pxm$tkFrl6V^%gYo}yGvV@x981|*w5VPPLM|=8q|Kb77ePwz zKxlQZDx4HpQ-rJ)gew&{WwwzA&)MP3W1ZXuQE5JpaDAScu_}%m=?V?Le?l)UsokM! 
zL3wQ$3oevNps0-c^~_jhM|&zqAl*FC(OC<~lG9IvuE){P92%wE3&2j$TxX+fqwO4w z_SBj~!97k37&Rm8Bl>wYRuoMcHr2G8^}O{8LlHji;4fXPdRdoiztz}hLPfKN7ZO46 zC8xa)8xZ`d=f%HrkCY4JkE#Sb)}2bhs^0CDXDEvALK?sYT;#?(u+wMhcso>2KH@2C zJK8|&r#h~R9$r5{6<6l<*|YPRDTvzl4T~fC*L<4JjUkVpGhrlpbSf)oV8aOGQm^T5 zA;y=a1`Q<9+K~y*o91?>(?5QEWTUw-6adW187;&cqZ zbQGm6L6t+C72!*Bose94g4$YIl^j1)SB|*4fnU|g$iY&=zS!S#NWvJk#D$DQT1kg)ZrM1 zS>iHG9ftg&g?6&Yw%6C@ozGeJ&)56yi`mlH^(eWJJU#4F^8Nku>ezy;7a z1hWdir3C)jE)xlS(`e>6M>T{c(At}~qmXMeylaos_PPAlJI?z~>hAuC;3!CxzngB) z_{_Rre5&>oSe&g4r+u=1f_#k#H!7QojVgUjz4Y+Ttc zv|ZTA#r;JK^5Mwix?An(pSXUDr$G=sY<0UWReLsvZhNFJ`U>fOoft1>lpmj4XD0wF z8t6>LFGVz>Ea3G~Mq@ie+A>9|i5fNg#^(`zmi2NrSJu-0e^(Ku8`ds+Q$(p1&AvkE zb6youFEp_g9QN)TVpq>H?Qn}d)K75A|SY|OJnn(0AYOI3B-@826l~~Bb z6uTtm!RW0Q?53%n|Ge(ao`FLjUb=2|w-;amfHsQux$;_Bk`I0FiIfEzW_0OXiuQ8O zh!|z-Qi}H}%XCEEg*l@g8_T3SG_W5R&T-57@eUQHVz#*$TX>O*PPZxM4Yw#U+EHnskVAD;RA1ez`klfbW=Ps)2%MQ|Dz^0d!tM|CpEIPA@yKw>Up2G|Ic-Fy7`0GAwY)`M21w(P2ammZ6$(@T z>>|;r8ush1lZq*x_P;?Kpe0!1h0*)4|6rk%A!uaikW!#x%;%3DT(9HBBLEa4LPJ` z+;)xUg>GW;i>;xzQ^{{MfltG|yFz#JD(i=W1j|C%(ZYt*Oe%1eswdPbB%A-@NWku5J#4rwCsa673g5X~Yn6S< zvEU$ZTHVpvTC^}i4qfUhbWOpoI~hSopQ{}8wlMF^mIz6?-(=CpApdc%=e{o{ZoGv^ z%UH@3cA&w*%@>qP{S5N;dwutf#Rv1wAuUs>m3V=AgUw8FhfHR`RM=BW-$!6-MexGM z&fAi58aA>EL8kr=^f#!+399K?Yj?&Yv=Xr0kn^ecuC{%oZh*gF-yp9P%=BooRbR*S zusp`|5hTSCe_ngaUfeDUQAu$wbOz2WE$XcEgEsjCuau5*rjmnv#?4A44AJVVbjhkL z^*QWW9KbydJQS?Ky=vjaNT3LH;CWp{CLQdfx|bd!S!=$TPoF4IoCy>wYrtr+9d{m1 z8%S{{Pt{xe>VV9{cud1{&(M`Jkv|W~AH!E@!TNCL!^i8FWHk~iZQhUd3BFB^M(k0N zeWK==7~XWb84i6=y3&ZO{FXU0u>XAWJ-fCIe>2A$NFeYNWTuI=yN9WjO5x4E9Wr$r zlb0`~x+HFLN#JGutq#0`p-q_Qt`xh^zCN!Z=SV?*)Jpydkpg zMY-H$itO}T4gbse+@%H8J3{zx;c=;;H|ixR>gm&(UzT)NZnB|-K5k8M<8GLR9_C=Gca2$dCa07ffy?z79Qey${4u=ZwX*j*C^^zXm1FzQSRdc=+jbUS2hNqAf^%zO zTh1lQi?)Jc&RoJKOHkyD#ev+Jk$ZH|LX_4{nx>)R;H47ySrIihu zdk)+#n-fYy(LzTqKm0PZ#TOJuRYMv(A<|X|zhjVG+Ky{)_0fszRv>YI!zef&)ppP$ z2{KeU!|}&PIv$lOiD#*1s>YaBzHY&T1>4MHzabAvaR<)P@IQVX-y%&cb-@Q93chmt zw~R9rP3FDO=XX}jl%(7QeTwnv^_(s))W`>R6pCUL0RAeG^@~@ROE!d! 
zXV5`aZ(Jzea>$Ar`a&FpyC8FjcxGYw?#xi^Kd1IHd*ULKzn@Cvhk(wEKNt-Cu-5Nw&#zVD`a>G(5 z4P2hZ)U%`C^FzLa^;T$xpc5O*Jke?o+-$Wce{by<3NKuuF60<+*p&hdJN3mJtsg5S zkl2R^KMJV~h(NS&IQlO&7t$rJ&f$KW$*5yzOJ8%f=FY*kDAlHA-GY@ofFerYxfrZQ zdWfPelie6iWL26vuxv0KhRJWOZrAxt;CjNp|AsbIf} zCbnzC7YE{0(ACQWrw_EUpz}6-my+rEBe-HFdBgCBUp%Wdw%DhWkiG1#0m(gqBJN~K zepmy>tI+BhLh#aKi%LDj+Rj2j%Y1_dGnorm%NT&N3f#Hzz~v7}e#c+9+pL$Hly|Ug z6TpkjMJqgQ^a*s|Nyqu-@4(#_bI6gs^je-(LIA-ivFVqx}*^3_MM)+v~Nj zdQf{sff_h)0$7tvhsIC0Ckut#s|HviNszM#WLhV{oE$)9X^*~{65rY}xzXC#PY~GX ztW-`!a&eryVd%4I}S33*k(r|ij z-ec_wt+Eiz4sQfOu6y1-kj0cV5@VVw<1IK(0Rskl_^6BEZ`*pl;6RyptWyPu2R$>$ zbrfbm#|vg_To`2|eA#jS&ua<0Y4)LZx^N~X!2bYev~1oXlI6H>@#-()P$J#g2x6Ft zgt6kwH#e8ObBDboN>5sH&Hm%`0zQZN=(;SiW{(ScRbZ`=`aYipL@_1{GX+^opMYTy zec1)`?r7@?qR$JCzrPj>^0W6Qa>9_*$?wElCUK&Ed#_AnT5}aVnU|$YG<&A*S(t`w zqza(9Q6^&H19%cAVX6Lla38pl#x|v1@?wq`gNHLf`Seu%tX$Nh+|zPOsyT06=+QQ&W!k?h+tDF1rUe^nBr_?DY{Nz{0kMe+-zt?^_23vv^U>CZTnzGu3>C)jG8Gx!_R+GfKER01 z1Cwr=bcuyTkunNz3!=3}j@eZjqobv}(?gb?-y8=C@_?n5o9nb-yjP}h_vICZ@%uRF zQ#dMJpW3GVk-nUof@{i$lBW%5p@#$2)g)I=lz}{6PpYwMc8Tcrab%SPz3@c5E}1Yu zcnENPbW}H+R&!Xxm&fWjRsS)kv={(3!?x7_jtcYv^Dnu<%B|4q5phU((XIZ~EhJJ%_*Kg*>}k`1+TJHU!q z6&tONJwp{%ahjn;_kk0QQzTP5L)fP~W@0J$C?-bK3uTz)#-gUg{!Il;cn{W3W}~ zqaGG3`ZHm`c^|>Cqwsi|zhuSqr@CqBAc87Fb{zNthy{|}4}~Zj5qPq5)ID4~Lc4uP z?AFt}TAFz{BzP;XfQb2eBGn_^cGz!Cb1vsrOK)WiEB|Gio*dI7_z&(&9nEiT*|V6b zXI5TK9oo*yLzDJ2=uc@n5Eg5^$e+f~VYzwk$Yiq-#Qm;3F?_N`oi<^=Z`eSY!+1@P zv=i>>?w_dH4j}fV5GK7JAVz_vO*XLGvU1EG5%#0cFT0OJJOQ3<M8 zQ^+iWupeB!7*7Q(dbDVID0iMbIHBrz%jD{D>g7nn(7?Amt6A3@bKZiR0o~~+=?Os5 z(Ko*kn@HcaTCd`{b+Zy+ZL>g+ub9|go@z{)(Vnfm0x-PqI^IJE*~Kw?pvpV0hFT#! z212`j#}ztb>2lHtcEZkEb$B(czaE_X`^Fp1b^RXw-pT{feo$NonF6dFVaRe`GR#1Gr!cy)AA!8vE=%~!=y z=fU-5N>IhWYj$dj;0Jf$BOLjn5RZ0%MRT0xLBi<{9P1|lb^RB*<3^?2bP!Iyu>&BS zs8a{PqX*^q`48;`6&E2b?w_xI!LftDo>5c5c8`v10Zu4B5i6Op@z3RR@zUR$yZ@f* zf}!bpe_ptaI{oi&>(!sltgh;UtR&q8 zc!euZJk?n!VG8FOms0LTW6N;w0xaas5z7Gt2$kubZYQ$n9 z7d?32j{n+?kJc9^0lL!pLDNHkgIFhzV!%xso=;nM@ZDSwP!QhOC!GcfLT;wq zj_}9x|L(8*R^WUx{C~>1)lA=CKiGTl_6+9Ps|;oVgU}QfgbBbijON%mznRA`H|xiT z`oF7>_aB}4`ProPR`N?eOZ=YtWOM16v>Njh6VF+lfCQ8z%Pqd`yAMp-@%z<7XZiZl zd-cELYm4Upe5m+XruycUM?LJ8mm=9CV?gOdwV9-2sIjI0(~qpp)&H%Y|M%{{b-?A* z{y)!fnsGC=rWOjmu(kiKKG)X%-*R&fk#!)ki^4C?`p5B90UK~!;cShEHvKqvHvXr1 zyFO@H@#kfW&+Py4RGNkBI!MZb`^!B0T|0qmr|W*aI{yFHYmgy)AVWl79RB>X*;szw zzW+z;AEz%U$Q97g>7M8C|I_ro`**p&Jpb1sKJmdTg@6b#+s1p=pFha_JO2Op`%hXn zz$xj($S*9vJ_KK=FZd|D-`hqomubc>gTe~DWM4f2Woyr diff --git a/docs/bookdown-pou_files/figure-html/unnamed-chunk-9-1.png b/docs/bookdown-pou_files/figure-html/unnamed-chunk-9-1.png index 3e02f677a7ef349a6b542788cb00f257b9c3801f..d94c3bba11a5885043f46f939e851e018b58a093 100644 GIT binary patch literal 26040 zcmeHw30zWXyEtfOSw>}+O(so!la-Z~rIVG=RO3{dZD!Uet=eQtWo0Q_)0o+!HjY`D zLaUhzGg@dWP?}q&(qg8BxF(4LD#*^c2eeXw`Tghn&;9QF?u8KZoOe6V@;=M^JP(l@ zH@KRcePsrLK+MMXNWWS z83Xa7I5;?fm(&;3FF$AJ81;vrpR*tMABuAf_-4ARUqSBfQSL$DJJmg^)Z5$JPcx98 zGeq6oIY!+>{fhC60pAoq3e%4QfPI33qJpAQgQ7}KBv31m6G!07|3W zOQV8HOTivMDF85IoS89xObYmpQ8&{3Fqr`Oj|l)RsX;BJQ7xt5*Z`Ny1t$RegM9!d zMSZ^|6=*31TELs?fx*GR9smFkJ+2jlkQ;gU2k&4AWNL)^2O8blwI2eR23fm$<<^ro zI;rLfH`1H*Q2*PtE9EJxaL-c@!Z0tw zi78`5#V;EVdOT#<8~<_{Z#f3Ow>JaQ>G)|aH&Me!%mC(CyA*f$Kf z@93k)ey{Qbmsy)<^pJV@zEji~udrL;xXIHD3pYSq52=JXB8e=Zww{Q>oI+>ITexuu z>m^d&{`YShV_z)w9zs}PYdEiWIqn2r`~W9QJ%g}z(%*JI61Dt&XPDx!_-Rd>2fV{Y zQmkssDGRZ0zBh)rud-|~qg>pX0#TM!g~)9Xk3+g_x@whT+-}v_1^UMS!_I;feIm~T zkSCQyXQ+e+#Mh*&p0wn}AK4P&v8Zj|VD2m?Zza$GbQPR`3S3Xm1;97|7#{|W<@*-9 zPjxA~5qo?Pds@*m*{kau1Vv|3J!U1nmp3F(5kJyY5ngfoQZ}SrT)HcAPI3#1xS3G@ 
z^cA#szO*KpGKN7^9puC0?zZ@xe3{_24YuvB`zTx5sZI6psOuQfpRW%wp#Isv}54(r^oy%~0W9WEY& zFf`~0I{bwHhs9h91l!e0W$|LkY=!hGFBfz>_4N3Q*$Q;YK9d6nbvQaYo)6>1MWE*{ z20cm1kVGfUuS;#YgWasS{YJt@-WhTg(d8XhaOqT0Y#3kJ)QY#^<0$nxgrIn+9k1o1~h`*sp{aK(LaMy2*D~{1Z7ku`Sp+Vo_y6J zadkpLxvbH%6XraE5qtOkq1FoM`K{*bRmW31jfA53KpXxEA}&}}h2r_*GjBrHc-C9r zcdI6~pt9$faO$HsFQ=)B8%4{?3++Sr__=8wja$Arc~p-l_dB1zrPR8T>}~R;X$4nZ z7I^CfnN^A7sEs}IgGJxS*G(X$=x|>hIh4gOfU5{(rE`wau~;1~7ETuhlZc&4ah)_3 zK~)`VW{C8G+lo@nUY*-zOoQ1|S158X86CTzYrwhVNcI(BKm(cUk_X5F7J2Zfvb7hI ziE!2Ndc8wmk?x=FfmRe;O+_8~UA*L)s~6q=Ln-^pM#8-Te!efK0V!-#nXo*5BCfka zY91}dr8Rvd@=x%w2N9~RgrG$_h%&4OeTVO8Z4dzl5nvDjLl4#rW|YB={}1c2f6W&(#+;#a9W_o)sEmX_RPqQIa z0qq^0%1TsF!p%N2Bf)u5_e|2@BlHfeWC;V1MNFD3LnvDp$6A8s$Q0SaUafX4`OH`7 z`3^!``i;GrMvIqV>>PeX8x0Otcu}w97a)KNdZ^`1zM{*Ky}8co7R240{VZ~?_5lT4 zx$qudR>Swj%%K-W^sjM>2 zUf_B-au}YArYupLbiPZzZqYiQkxS6isaZhtdN%%kG!k7Tcrfy7Z3wwSWwF2kU!ej> z2WIupTN|!pTapF3fe}(V){Ze3+N(GfZCo~1h(HUyn#6Ilzz5C#qe;gow>=usn34k* z(D3wYX=+pypP!+P3R5FtUWg?r`+&r8n{}5K?v>M!D6Db)Hl>^&paH7xxgy}FabZ>@ zymOI;F2DDcXwL*6&glW<&C!TZaUvX#@d5Ibk@qgSZ3E3k^!~G4Eue-q1dy^cw><={ z3~kDI61}iuk;ck?8ox>#+M+W;oj}Zn$`tl4IO)^2asZ!2ksUsUic}Nz6uT&GVH{*E zl*6iUTy#g(qK|Dz^h-ft+PAvpi=I2Q;b1`8Gp*HYLgvP5uUZour;T}(biQKu$2@^P zG+|p4eu%h>p3b_}$lQKJ&vd%)V9mkMI348k*Ti!MeH1%THwpRR-Q^ZAotyK8|wD+35 zL9)#+N1K{qMnX>69xu13$crl@-*&RY3F1;!(c2v);Q64QtlKKjP*)Cql2IUnMPGq9g#?{?+A&U zh0M0n{RFr1aXq)o!Ce*xr-z;B-K%tA>|Csy6EpBzGxX7lek#Hx`eZl$zq6Sp)>L~F zWQ}CB;X^#>QnqjjgXy4qfi7kYpo=VWzr$Rrr-5pvY>LR(f@Dw2&t_t~MVI22=h5@- zgHduyQQzho?X&sS(8#;1R2VR)t#%ng#bJ2?A-$Evt6sj7LqE3dU-3AkusN=Vtl8ZW zNAccT5pKOM3M>gBiSPmd57hn?rO$L&2poe|37_)Lpnd6_E}lP+Tce>x_|hk7+Fs$u zbeOQa<{V}}ZiX^8%!_ZvmKT}tLw1`=emK z^n;p7sk$fKDD8&#ETCUjiIzRH75n0+evp;FC%^YOrG$|W!jFz6sS>Wt*?1~gu){Or z8z8s3Ay0tv!H9ag*WdiWUcIA7(iJ5B=+p)%T6wLh6G+dVC!DEy;5=7alRg|OW99iH zq^;G6N@=w!yAgd-BJ-b}L0{Cp$VIs-pkAzW9%q?A5aN zCO#Ys(s*T_zZ|$zQn##(8=5a~n9?Td)6iZTFMs)|>!%}qE3dQoNh%;r6e;f^&sR-< zRWFWN9*v@+wh}6*x-3dNpcOJ|S7~n{wYLxMk>XV^qg9fIHjXQMKmDO6JDli@Ika6n zuG4Egy^<$$Wz;KX)qyw)T!B&MYH*|-b7}Pfl;HH8VIQBP&YnCeRS8}NDBbU*6s{;V z(~8fuavZ~8D;kT?dt_zyop!#?Qks;T^U1G_%0JL5O4V#fN@KT#p+TQ}-_1f2@m;d` zS|v^CdWl#s318sB^U1fv923d+&?w{(4av$)uR0D5YxU!wfnei^WR;Hu54gc9mUobid~2Rg?)xgheG5B%*|IH4oo@K7^Y=WfLf3g zg~4os`>3orc!`9Sl?4YbyM)tM=utf#ktiJJ)OOj-SNddd=7o zY|xreZf&hcE%aa0~C+^o+FUyu$YwKMRP#L#|{asyq3CEI@Fq-|syhmJEt=0jlsr}zDDQXL0L>S&rr_;y>-!8WUGQ}o!64Tw z_6n-#48Aedz0j$lZlp&FKfr&L_qq`a*!uKF ztOYpfrF?-7V-&D@%~zCM07NJt4tMe6HgmQv8Vli@eyqX?Z^Hv+UYC3O%#}rjBCW^6 zO1zc=J$YXE*q%hhwS5{)IXErX##9O2twoA<36+lHL<2`}qo+;nBk;;GC0ZB@Q99@8 z3^h?2 zhn^QeqSI8zNfv!tE4uu=_7or;CDUn=aG{sTRR*g6{<>9cti=6J!d-Upu?uI>dyB?t z15IsK=)R-ROwz+?*s&x$tnX~IX!6u)qx;itu-_nwKy@nWnUIOkY($Uq*4WCl$94hQ znloqzhSbnYcp;8J`44@1ex zsJ0c+*p(0uKNqANwWg2ObT$r|LaL(YR93{V&8OSVL6A(-_^A#u@w#!^wgY=olxEa{LK@Hcl%l}1F zS`T3x1_#sZv`rraSpE`N4iI=XD+%3q?r)vrc}qABl)ayg&xnbGvJO9AACecTZNC3o z1bf9YOx0Rf{?`39bPlrZbCTv*^h6WR+|keJgx`*M+zO{{AKRZY`7B`#)OpmIZu@Eb z-^U)fX+QCo{>;lRzfFY70+M3X50l0J+QGSCvb#TlGv9jxDQm$jz-V*v3NXEM&&vKh zP{R}kK&g{(4HRYI3w7$YVFS^#)7nfm8vZF3z_9D~YjFng^bZiSVlXva(p?`UE~{}u z38V2eCN)HK%U>KaN%R`^xX+}$7W@QM44w{D3@-XygHXE#_rigXud2nku6CFQZ#J^c z987J8xuvb;azrJH$|-nTl1EH;;%R(tV8S@zHYD-PO6jDSMZC2S z8Iwj8^oJdfvL&+Un0QI1@A7v_g`#QABmkQuKST}gJJ0)dsj0OI$9k}?^C|i`G_W;G z?nT`AV~)J(X+V#U#y10|_czQ!w(Z)eGP!dxZ2UZ7(a#I~_>mCoNvX||&e+IN(O_W! z?Eqx!!xKosgB2n1Yh$5E{=`;Xk4gNHIUmuLGav{VA#0fEaphKKoKxGbH@mlGP~GkY z?wBG={`t*4XsA?L(#+z$MppvTau(OaaRTZ7^Mv39>!9oIn=7}hd@zCZ!F9F`DmPTQ z|3+UlvbFrs?~*Rb&feLW`WwcTb^csCXzo*oixfeD5?hcp7Qk50ChfBQL@Sqp&0zrU$EjiZR! 
zUZSQ(o-$Q@t_C5=>yEYhvo7wSO<9CO%0E;9s@kNQ8G6xE{QGJGZ80C)Q*#{7QvBK* z8%D;xYWmn_%3ifu-@2_n2QGuB1hmwAmE}%*uK%Pc6ixLddRYh2CF9_n*Y>C#lCq+_ zC!?a_{NCj*cv-u667EX`vo@6a^ZCy?T`)Q}ad;~V`8>z=<#(vLBl=62G*#S!B!&+r z@4m>etXW9^-Ggbd?@V^C8N=J^H#~H9bJFTj*gvNAmp3s?@r7v(fHF|jKb$Xaj3wFG z8?1AKiSWP58PmGGs*YJXRp6oR0sz%TvL=}bj5Zo53_Qe>=yN*DI6ur-h|8@O9 zkq7q0^&QBr8tj8I;VAlo4kJzulAUG@=_8Hreuyo)8JgSMyR|UQF|qz8WCbnq9+;Xw zaH8@{Neu~+Mh`Wi4%UMOHK-@nXrhCL_&NS@4>^Ivaeg<|<9z$?W7w1$UyAHI$}!9A zpV2czngU(nr!@xpOVcgDBJ2P+<^m_Lo?`Gf1vhH9H?k1z5bjDW{?it*;DP!ZFdSzYjC3tp>3$IHc%RjElEpaX9Uba4`J#w+Tt6B zo|?|l4MA`0Pv6%54BJY0>lT%kl0Re{$&$S~oM{5TuO=pK2*s!0q7oved(iL35Cb1FPG*xbGt`EfbXn%{{*q3pHajCv3+`I^lz~m08F{uTFI>#6dW8g|_D( zBu`M+Qys`KW*gf)!|tjJqz+VCHW8f3b$04_pn-Z;w9%VW)xNXUwO0pfO%q6!slDrl zDsj43Zew!4=K*!|P)+`CSn}h&po;e7l46VY*tb-@O)v z91As_s`iYvW*$^o+GzOuRBwPsLuoJ!y+_oTjUS-Y{pZ`1nlq|-ua)afI2Y{sohOKx z6_}ULxn6Ej%TpfN9&YJ;>cV<$gOjUkdJ?vrjP)c;G&*Jn9>dO0{eWsbjSeKZzSfg37*> zCG}JOm9FW1B{1n=4GMWt`-N>_obJ(P~3yit&lX0&)Ji3z%L zk9scw%wYN9AbJ8n9!qq3u&PE4n1;zC`8aN7ML=90ZkcE744EZpOcVE1LX*3{_gemB?GzE^d?5+!VbpCed6K;t zEWDZ{>u4;aa><1qRB#eR{sZMo(;chqoxS-xJx>LX`F<*GUOM%VD7=-IO>q>W(~&9@ z_;&8u-HPeM?;n(3#}FQrflpAi_Wm*AIQx-H9#`-mZ9B!4K=e{SZx2xsiXWRpT>$woq+zL zwJ++ofPt=HrPFzemL{KuxDqGSKGoqj`l`_S@t?7OfRD}VFAr;|3Oyu6PZNG)C};gY zUqF1>Vq^_B0`casP+(zpXVulU$Fq~V(hFxH+0Pqo-=v4dT7I~byR%C2cJ~Alx>aD& z{+3#17QC62?vLRvFk@sNm5e;jt=|{EG==xpfp*-=gtK$0;+Ndx=*)r}TSK^$+0GY@ zgz46}?^wA`-n%zE$P-;ePyHS$+cpMqGl-Rtw!pTU6giy&^y=DBaX8rvSU zqIqY;V-x6u%_&<~tk?W1+e~->1|MZkAgPPO&o#Nf?(XJqbly7g`H*jmAEG}IrGP{9 zQIbb`G>8O|9@Dlvbe`H0)3&ra({eOBKUnJZWEgS6QY3k6WvwDM>@^0P!$kGmJ6BU^ z>vDR#`=d7Zl}T+|%eO5!)>GMvGO@sHZc*>l4cwk&*avPD*-iDXqiUx)S zjVAUEyS1b`$;sqo(w2EUix?Hl;_;Slg0)AC)m4Qs_!9sX<#ttdSX}!{N?RKsk@7+- zn26I;3X@lyK2QP9SAiecQK;_tGOo_m!TMy^u1aRbZSW)wQ0JP4bfLZgi6W+)vqAJv=@%q84gzLca zvA2eExWD0#q6;Q1&Y+uN9tqsJ2P=Y;o;24G(_VxbUA`Vrl~Ay} zD&@-5U7Qbz<#e-|iaWf+1Tg&&D%GT#1Qpm?#Xu78jc`qN*CZvsa$Mk0O_a6}~vCf^HT_J=NAt z{&^%Kg*l#N!CvE%Brt}>L?hXXQ{B-wN1KVpy*OThO=^hVRj;_VxBjsrbytw0wj|#K z>fG#Oxfp4Shb){cgjQmc?ila8rg_4e;~i^Rjr;;s%|&+)<&}n;5qDsFknAFA#^Wjr zfzSI#e=OGEI@qTqX@a~BiD1h4S42%upSjbe>&kwy709sjdOLyN{lG;c`xZ%4FOmQ* za+$*EvdwkRL*8iIyWIsACwpio>McfCa?qK()MPYf5Qb*+l|SLhqB5!%;+#O(Gr5pl zv%GcWd&#DUKY2{sx&=h*RW+Ti8qLyxY&CI$G(K{9qBOzN#z*NKHxj?9No3mVko1Jx zXhB4#-2>M%O}!qwPe5~--OD$|%39Z7Cc1@IiG)=ld19ADIW-q#hGGFY$@VRsk55C@ zxJ=*kPL0d)Glh5FX?QMBCx873>;0X#;zeL)Axza+4M;?_wfxMY`%!Tn?LHwT2osJc z_BfbqXb<#&=}UBx)1i3}Zhw;(T9er20S)zJ<0a3Zx1oOgDmAzD^HrZoM7M{)#isPx zwbgmN^PuX?4VBglvQ^|P9&uCt5_Ll#u)8dm7AEdUQ~FOpdz59=A9n?HbXrP0aihBP zygBO5XTi=+qIT>ax%gaPBPVsE3@`TheA&mv)t&b0R_DN0HKOh<=cB7CRfKt~EbTsB z1t*?d1XD35`STT1C}TbE?DOL1+JS06nVah6N6UPq^tVK>QBUp#j_X^U9nT)GXdi>w z*5&w{8UV#00QUQ+sxvQuU(%7*G;fxA;3#mqa#_2~bq3UT8^+xiotjJ#&z-}h}9 j|K4|O^!0iRz#8&m7kO_SFIRs9%i1*?R%d*->*xOh#q=z( literal 18374 zcmeHv2UJtp`frS(SEZ>SI3R))Q9%-wee1pdT6g^y$KzRg&aPkmec#?Y_lynoxY+lx zgFqmzb7yrgfbDZ(AkYp=;|peIfd7I(Y9NpUNKFl-20Wud4!~Uq zLjSF(s0e(ZzouUfYHHDH4i0J#z_U;-I$A9ncy=4mFE;~&NCP)F12^EAX%JcKu-u%3 z8VIQ4pcd`m0HW8UU(vuHhiKqg=n&oQPzc;KpfbIxTclfLrdwpKTjpRS(3~4^11~Zo zYX>81Y0+w+XrTM$w&=B^3!@8xyF0p@mg#1Y8R?c;JD530D|7%AI;a&oI21;!6-EP3 zpcHU*7j^?Tt&m2mjWno@bgRvDtE~lU&}wP4?r62{Xov1X;EwKIDM~L6T(oW)Z7|bq zur_kA7HFF`K0ZD(Gc)hw3UmWB1k|K;7t(-ogPF9!TG}8`gx&+tBv6k=qnWEtp#jpe zd7QQM0)h62(f{s<8Y0+(KnFnQbWfT2CE}B43Bs4)&+U&k7}|jk-nHDLaQ&D%gv8zV zr0MlRz7G?RLwVoj5tMwYuDV|>a)X&Bph5@|402%uGl9PBhOmQP zOS1BVuIm4&psox?EcR}g{;zxP|CRg8Ny+AC#P>HqNatgmXG|>R5>BI%ZbJ+9-1+OJ zT%5sA7{gO2D^48~Gr@)zyiREcjOFkADJ>IjZ_1i)%KK8V+L%A~eePjM*I`z(@N3fd 
z>woR!2jlr68SjtspRs@A!jAM|2Y+7eDHN*CwB5~r3DMGfKhb1aHrR8(7_x7T)sXv8 ziB*2yhVF8LtTK&P8XU1I036^ms9mdEK80dcT9X)7e&o-qV!?zI$FQ#Qe!KB)vTTwT zaRF&iqp1!GBk8g#T3^8S9NR*)VqCX%Yun^(TjFJC@&D#SVzMBMgKgOUX^OC(w(Wky zg@qJu9x+?hMfeWF1^rb3M0ZV{Kclh^M3MJE7Ox5+v#aRPh3?dHBcM4vy>D;LPmfJU zh*c?wPE^c=eT*-yww0slY-!;%ipQrWAucNCfvakblMbH@^{J>pDD>a+lbd*G& zO06Kcyv^-u&%NX_#m+;)o+qAq-}ig*yjpS@dePmh%TJi`>fT872hHy_7KhM&K|>J; z>UWR%OyL^pMUA%B{y&z18QsmE6vDaeoUNSvmVj!LZyP=KtUXTx-hS~Be_x{s-L!s( zs0A7kCq3pXdBSMaQIA5YDCmwjB-BX#4EVB&(dZ=cB^jnqoPTqWP;m$L_8=YCagGYCI zjLih;5W+f%WwF1igDuZj!*OY*CRQf-#;C`+WiUzqT#&8%R6UTosGC^R%YOzHcLnuQ zjy|zMUR>>i9I;YCDj~L0IGM*Mq3y9Kb#cC6=G&7v?)bmg+0)ddE82{WNH0g!RtYEj zb|oC!U&?=ukxGRnVz_D}J$Mj8t}z`d)NC0rpU^*D!1HZR3QYOf%CW*2R^MkG%OG_d zsB8n3O+aNE)-H#8+lb^pLL}R;b~y#`-xe~Ovl6=+Y_&rx8GY9EVyM_<5gRyr$UR0U zErZmWZ8F_+fzgq~aS;1bz0in%6u(b`l9*dGLzPdxXYiEQL!pSpprd|s&-LY}zGf>v zX7u#0H0~-HWhv`2R@xkDdx)3+TM(4VwAdyj#2n%!)I+K0xw^$cUPJ}v>730m7v3@# zbGCfKNS?ywfThr<%mOX>Yu+CS>SF}J_fvJkXhi#8#)}6HKsP1BAD2GtSmHP)7VTG*1P+4)?l{){)a_220tQ||Z zC-OBQFrR6$RVtSIzzIfp##)ruw%&oIH$QbKZTind)=z6Fp|-w7x1@|HWY1Xb?6>pJ z0=B+#5dY4*g|Jh|NdtElD}hbKz@#0-+|MsSJ4Mg?s~~!%17I-q>rb$_MJbt>P4^@ zP!c;0wHkkQ4>FyLAzj(aHcfTP%=7m1S?v5vwMHus(_CoWH+ zgxT{o<8p8drety2}~LK_r;O|}TCju`ObUKapi+j}tA>9J2C_;csO zyrz&x*p;-dokGn{A$_PHMnt68iG52Y44U93hoMw@DN6-T>wUW!m92*q4LluqvaSEZ zbJGYzqbr^sqt3{lfK30e{*MTF(G}gz5*xv^i2J5Uv;S5n(T$f!@DRW7J@r*rC_>w6 z@Z<;gOq@^Swv@!wUAiQJypm{Reu-&wE6yLcL~N zYS6?dJ8*>29yX$aLbELqK4CRKNhv;$N~KMZx@5n(PnvZb3T{$JtPRRb?>2XZ-=D={ ztrmDBaUktjsFVmcVl*=}?S!wLbF=T$o@eQ%=Gf-!g&a5c!?_mte1@=blmazklEPki zCV(s7Z}=?d&RU2x_^i9L0G##j%$}59<%i*~IX_O!9xTY|Aj_aTA5u1uLgnXLIKP9?LeYl>VQZ%d_c0RN)4Rp|xD6}RFR zuiJTvU(D83+p(UAxF2%1$+4@bw}(?06>-Rxr=vvsi>sks!2CEt=bx07g#@@xo~#cf zD8>K*x9e!-q@Vh%VW8p*VFHPR(eu@27!4&Zi98EbCzz^=!PRC4mIm+*=BO|pgrZ2r zTqREp9Q_-kd?XT!ryJnXr?*Hu8|)efKhnH%`W@!n*8KeylDM3uI zk_dXc8z1>o6RKsv<#RI~32DECZ<+*duK=!6FJxq2KQ_RNTvMa4->NBN7Mi)=KSgg7 zO^6TJVF=7VFHdj~7qVT=mobTz1$94BbUWeY1hpm=J38t{mkz`pvgsrBMU}XgyPjhkTIcW?gv>E?08;uF|EO zSSIT~IrELj^s-@Zeh}Ocr4S)1?C%k9^5g74Jb9>jLUN?jwrs$r7^-(3K6$u|ZF8na zM!;zR_0eH)h+vWwk9IfAN}RabLOI%VP0VsO=jRWl1&V>sig!E7RZ=R8hiey`2oFbW zJ&}4sdfm_U(u6PZ;*3&wm|H4f=|f4xhNR6l?WGxNAgikU=il2dPY^B*acDpAF7_CY zOovMG6p)5Q4$sYe(?8-#YjS8HzV9_SsOHNL&pKxzs%u4w5$$GRs{k&OPe60FNf#0J zFhr9>yRi`3f;d4bO@~H(BDJB(l!TCbP)CKmsiI%^$-<^RXDwJ^$2YQZW(fWYRT0JR zDwj`1(>j?A<%GkDXyJwWM!il7N2lmY99OFnX} zuzW8X8DE`Bg2r>eCN)11;$*=|jT^xJGn%FAYjZF*5GBu&BBt4@rFCgwupU($Z?ngF z?hkLKfa#GqYeZ8Zs$j+F0Hn1G$M094hLfeqpYGZb0|R&@?Xuu zR1aZNs6mNS@suIKPgmeK9X98~&+_1Dek#_Dc^BZk_F7!go)i$K7!M%xLNdmyo4%K>$li{ig5umD50_Lo1OfMe-t;b?&!3Y( zMxKluzq|vnh2bx4P!YSQ@DE&-0&8aPF1eqyexX2h6aQG8I<59Fn-}NAM=W0q`OGNM z^-POn{S$W^cOymD9EGq1Qk0DCPHh^FKT=RuedEHKflYu4SI}KT59f)lP+N_tWWLey zYxEsC2aG8u^VJ35{kLd;VNLuXi?CIb919&WEEJ(NpG6tn+d?>5q-c;*>92gqb{2($ ze*()WpU$Jg0`r0g2rRbZn2f&AgY48oEGzqXPN5vYni_Avw|`_QQe}e?j*`7t^H+6nD3TcSnY2@r)Y`4y zL}UJ*eJr(%L#YR)D4 zX=5k^u4i+Z*rWtN&m3H-X6h((7Z=4ddUvyo5B|^WmRra zDjxIhDP;R5T8!zN2ml$;T}}sX?K9QI3=KmWK)??Fx_n zq?~as^9{lH{`@67z!_Gh*SaVcu_7@DT3MCZcKix`?Gf^n8F}awD#_CZMN;5q#}i%`1MKQ}yi3zy5lqRJP?v&?Qct4T=CUQfFGF z7jOEZJAbY->=z&Y3jZmue91*})y@-}Fu-~Vl|z`1qJepAYG{^TxP{?{2CYlZJA%Hd zo^@gfhkA(5PGb8+Df^FxELSW+(HpS7o9NCh5}|!yG^jvU_2xQa*u-3H^JKdpZg+-z}Cd|$^S;*wYV;4%NrWkhYzA}L|fI7 zGpHT(+}Va4(6_p6jBR=Q+By@K5zRWCp!B_v+Z`^OplyING%zS!+RX@N4z*mwt2&R*HI0T!jd`jxV+*n#ov{g&wyrsUS*>AgLYU_`gG9k@rEn~Yb0U3{C zw7LjgP`e}}3+Ka1+ucwWk9Z`doL371QsDJ{*R0PBg@zMlROnt3UOQLpxxrhE~U0pg__yiU7;ZR?*St7yj-z3Kv= z{cx@;JDBcBb%f;k1SE+M zEWhUR<*pAsW%LpiB=t|Bsw7uE+Rdz2gNvE8Kb{{j?*D@hjsQ(B 
zFgN$4d`8q=7oVGI-f8_CHOKhT~?(Sj|)d zR|FUq5^KLVV-G|oQFBGPN-FDfIWKX71)7srGz)vx-tvwNGm_o0Wd$PwT$9rg2!V#? zjjW(VELJ7Z;YcPHfXxPN6HgTE9*}wc1P_MZaRRXyDzhf9zWszD7 zFkh7*+K2kEI5NEPhBZD7QS1m^BgU7Xff$cd7*0(S;E`XT{N7gBW%S-tq2+gI-X23y zFh0%|5)eH9A)-3WBFvrqn*66EI>n>Qd@22x$squ=2v*nWps=~2P_c=~B6%@ee>{)A znAOZu+)6vXjb=L_G7W$g`vPq4xr25>!`~8tCfrqBGbv_rq`W7ihg}3MSciTw^9OLs zJK*??78yT3jx~QoU?|3h9HV0*;7H8FOnml22zE3fk%KWvC{cErh0RECc`GlygVw z^ojNL%$4t}M_%SJfAkh@t($Ej{`v9xGUW1cY9J((vIxrwyNMMN1;`?}vE%8NlfY@| z)p)SGH3UoW87Aw5Yd3D%wy@HIw>Bm$C6YW(^;^mnz>#(O!&Eoc!Bf# zf8}EMb$OkpfWFu%Di6Ws>Orpg z&!>Tta3hE@wfe(-&AiBLdty%GxH5MN*_~=~hpG85s9rdwsHp=uxn|pcP;^2Z^ZlzU z?O25(`Amn&yL)-km>P>N`4(VzpL*oC7MyJYtvL`i{+OD;`MTDUAIAwPNgl7Z_29iB zpNDw3Lr|pw92^v?{HpzyZ~jsww4>FGh@f_R%rDX=)Y$xP#WB?{UAnoKCL99oIG%xP zt26PNyKyyHQ4Uioa_n-$ya6|)B(AiH?Si(5)`G-Mc0rX=aCzy{zNw|0{^t34b@1f| z?C6UT?>T3nk}O8uMfTND<>XrlpjP4GlH2br1*Dg(*d{q0J)#$MMJEcGnksfUsU(1d zJ%i`f#)ITpl^OtJgRb!{#({=a6Ts!o$9q9O`W=eFQx8qUuOSr=W7Oxp_;_&vq7&S` z=t>j6yNPFzS{28;6^Ej4fFV!&CMO4(Z4*0p%mVLw0E!zxQF}$Ug zpgHW6UT-j|-AR~8dV=m0<2=%WkOw2H4(~)(J^J|bU|($dyPuC6)VZJq@bHUorEWv} zA0eUF>I*9SfWw#$0y$e*J&Wsb3NeKfpBUh*ai{T3?-O2_n}}hrKSFgNj8w{o zNwLA)CoRKjM3Xc$cH?X-R2LS=%B=i`Ke6&#&Bc}6B;powfL#WGd75XH7K=OK$VhYZ!{U#fQI zsbYv^;;5b75g{D0r3>vW7J>N6O7vX%){>>g!v0PW!Y;4RRH7Vc3%icCNJyzUts$bH z^QPYu*=*?SES9XXkLKqv@^NZJgTuoaio)cYp*kFzg*zt~(ob~T8kiMdKO`>fqg zOOE(M6GTnSd+nc59$lwUNA3Hq<`W)j5>qAgm$Ho0tkbi&J48ao%7=%lC9;lCj{#kB zgG2W=J%^4wEj_mXYd~iCL8m%@RMvo%UUHBq?%TpSLewtlL?2#h5ke=~B)RU)V=oKw zay81r!x$rN_o?=L;1~r3*s;_^BOF7XbW9QrQlLgLe-Q=bGpJOtR3bQVbZP8nzm#ye zETm-8YMu~P11aexJ9jcIfRceJjP!~1I*SDKV^c4W)^R<$vOvX+40~z6AX!>TVq$&m zo-Pg8fi=ri7ZXN0!^2E>{s)%xU+C2T&2O)e5eVs4RoT|-C=hU+J7cJuefsL{{{v|a BLBs$6 diff --git a/docs/distributions-intutition.html b/docs/distributions-intutition.html new file mode 100644 index 0000000..d2f0075 --- /dev/null +++ b/docs/distributions-intutition.html @@ -0,0 +1,591 @@ + + + + + + + Chapter 18 Distributions intutition | Principles of Uncertainty – exercises + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    + +
    + +
    + +
    +
    + + +
    +
    + +
    +
    +

Chapter 18 Distributions intuition

    +

    This chapter is intended to help you familiarize yourself with the different +probability distributions you will encounter in this course.

    +

    Use Appendix B as a reference for the basic properties of distributions.

    + + +
    +

    18.1 Discrete distributions

    +
    +

Exercise 18.1 (Bernoulli intuition 1) Arguably the simplest distribution you will encounter is the Bernoulli distribution. It is a discrete probability distribution used to represent the outcome of a yes/no question. It has one parameter \(p\), which is the probability of success. The probability of failure is \(1-p\), sometimes denoted as \(q\).

    +

A classic way to think about a Bernoulli trial (a yes/no experiment) is a coin flip. A fair coin gives heads (1) and tails (0) with equal probability, i.e. \(p=0.5\), shown below in figure a. Alternatively, we may want to represent a process whose outcomes are not equally likely, such as “Will a throw of a fair die result in a 6?”. In this case \(p=\frac{1}{6}\), shown in figure b.

    +
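If it helps to see this numerically, the die version of a Bernoulli trial can be simulated in R; a minimal sketch (the seed and the number of throws are arbitrary choices):

set.seed(1)                                      # for reproducibility
throws <- sample(1:6, 100000, replace = TRUE)    # simulate throws of a fair die
mean(throws == 6)                                # relative frequency of a 6, close to 1/6
rbinom(10, size = 1, prob = 1/6)                 # the same yes/no experiment as Bernoulli draws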

Using your knowledge of the Bernoulli distribution, use the throw of a fair die to think of events such that:

    +
      +
a. \(p = 0.5\)
b. \(p = \frac{5}{6}\)
c. \(q = \frac{2}{3}\)
    +
    +

    +
    +
    +

    Solution.

    +
      +
a. An event that is equally likely to happen or not happen, i.e. \(p = 0.5\), would be throwing an even number. More formally, we can name this event \(A\) and write: \(A = \{2,4,6\}\), \(P(A) = 0.5\).

b. An example of an event with \(p = \frac{5}{6}\) would be throwing a number greater than 1, defined as \(B = \{2,3,4,5,6\}\).

c. We need an event that fails \(\frac{2}{3}\) of the time. Alternatively, we can reverse the problem and find an event that succeeds \(\frac{1}{3}\) of the time, since \(q = 1 - p \implies p = 1 - q = \frac{1}{3}\). The event that our outcome is divisible by 3, \(C = \{3, 6\}\), satisfies this condition.
    +
    +
    +
    +
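The three events from this solution can also be checked by simulation; a small sketch along the same lines (again, the seed and number of throws are arbitrary):

set.seed(1)
throws <- sample(1:6, 100000, replace = TRUE)
mean(throws %% 2 == 0)   # event A: even number, close to 0.5
mean(throws > 1)         # event B: greater than 1, close to 5/6
mean(throws %% 3 == 0)   # event C: divisible by 3, success probability close to 1/3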

Exercise 18.2 (Binomial intuition 1) The binomial distribution is a generalization of the Bernoulli distribution. Instead of considering a single Bernoulli trial, we now consider a sequence of \(n\) trials, which are independent and have the same parameter \(p\). So the binomial distribution has two parameters: \(n\), the number of trials, and \(p\), the probability of success for each trial.

    +

    If we return to our coin flip representation, we now flip a coin several times. +The binomial distribution will give us the probabilities of all possible outcomes. +Below we show the distribution for a series of 10 coin flips with a fair coin +(left) and a biased coin (right). The numbers on the x axis represent the +number of times the coin landed heads.

    +
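The fair-coin probabilities plotted on the left can be computed directly with R's dbinom(); a quick sketch (the biased case works the same way with prob = 0.8):

k <- 0:10
dbinom(k, size = 10, prob = 0.5)   # P(k heads in 10 flips of a fair coin)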

    Using your knowledge of the binomial distribution:

    +
      +
a. Take the pmf of the binomial distribution and plug in \(n=1\); check that it is in fact equivalent to a Bernoulli distribution.

b. In our examples we show the graph of a binomial distribution over 10 trials with \(p=0.8\). If we take a look at the graph, it appears as though the probabilities of getting 0, 1, 2 or 3 heads in 10 flips are zero. Are they actually zero? Check by plugging the values into the pmf.
    +
    +
    +
    +

    Solution.

    +
      +
a. The pmf of the binomial distribution is \(\binom{n}{k} p^k (1 - p)^{n - k}\); inserting \(n=1\) gives \[\binom{1}{k} p^k (1 - p)^{1 - k}.\] This is not yet obviously a Bernoulli pmf, but note that the support of the binomial distribution is \(k \in \{0,1,\dots,n\}\), so in our case \(k \in \{0,1\}\) and \[\binom{1}{0} = \binom{1}{1} = 1.\] We are left with \(p^k (1 - p)^{1 - k}\), the Bernoulli distribution.

b. As we already know \(p=0.8\) and \(n=10\), so: \[\binom{10}{0} 0.8^0 (1 - 0.8)^{10 - 0} = 1.024 \cdot 10^{-7}\] \[\binom{10}{1} 0.8^1 (1 - 0.8)^{10 - 1} = 4.096 \cdot 10^{-6}\] \[\binom{10}{2} 0.8^2 (1 - 0.8)^{10 - 2} = 7.3728 \cdot 10^{-5}\] \[\binom{10}{3} 0.8^3 (1 - 0.8)^{10 - 3} = 7.86432 \cdot 10^{-4}\] So the probabilities are not zero, just very small.
    +
    +
    +

    +
    +
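Both parts of this solution can be verified numerically with dbinom(); a short sketch:

dbinom(0:1, size = 1, prob = 0.8)    # n = 1 reduces to a Bernoulli(0.8) pmf
dbinom(0:3, size = 10, prob = 0.8)   # small but non-zero probabilities of 0-3 heads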

    Exercise 18.3 (Poisson intuition 1) Below are shown 3 different graphs of the Poisson distribution. Your task +is to replicate them on your own in R by varying the \(\lambda\) parameter.

    +

    Hint: You can use dpois() to get the probabilities.

    +
    +

    +
    +
    library(ggplot2)
    +library(gridExtra)
    +
    +x = 0:15
    +
    +# Create Poisson data
    +data1 <- data.frame(x = x, y = dpois(x, lambda = 0.1))
    +data2 <- data.frame(x = x, y = dpois(x, lambda = 1))
    +data3 <- data.frame(x = x, y = dpois(x, lambda = 7.5))
    +
    +# Create individual ggplot objects
    +plot1 <- ggplot(data1, aes(x, y)) + geom_col() +
    +  xlab("x") + ylab("Probability") + ylim(0,1)
    +
    +plot2 <- ggplot(data2, aes(x, y)) + geom_col() +
    +  xlab("x") + ylab(NULL) + ylim(0,1)
    +
    +plot3 <- ggplot(data3, aes(x, y)) + geom_col() +
    +  xlab("x") + ylab(NULL) + ylim(0,1)
    +
    +# Combine the plots
    +grid.arrange(plot1, plot2, plot3, ncol = 3)
    +
    +
    +

    Exercise 18.4 (Poisson intuition 2) The Poisson distribution is a discrete probability distribution that models +processes where events occur at a constant mean rate and are independent of each other.

    +

    It has a single parameter \(\lambda\), which represents the constant mean rate.

    +

    A classic example of a scenario that can be modeled using the Poisson distribution +is the number of calls received by a call center in a day (or in fact any other +time interval).

    +

    Suppose you work in a call center and have some understanding of probability +distributions. You overhear your supervisor mentioning that the call center +receives an average of 2.5 calls per day. Using your knowledge of the Poisson +distribution, calculate:

    +
      +
a. The probability you will get no calls today.
b. The probability you will get more than 5 calls today.
    +
    +
    +
    +

    Solution. First recall the Poisson pmf: \[p(k) = \frac{\lambda^k e^{-\lambda}}{k!}\]

    +

As stated previously, our parameter is \(\lambda = 2.5\).

    +
      +
a. To get the probability of no calls we simply plug in \(k = 0\), so: \[p(0) = \frac{2.5^0 e^{-2.5}}{0!} = e^{-2.5} \approx 0.082\]

b. The support of the Poisson distribution is the non-negative integers, so if we wanted to calculate the probability of getting more than 5 calls directly, we would need to add up the probabilities of getting 6 calls, 7 calls, and so on up to infinity. Let us instead remember that the sum of all probabilities is 1: we reverse the problem and ask “What is the probability we get 5 calls or fewer?”. We can subtract the probability of this opposite outcome (the complement) from 1 to get the probability of our original question.
    +

    \[P(k > 5) = 1 - P(k \leq 5)\] +\[P(k \leq 5) = \sum_{i=0}^{5} p(i) = p(0) + p(1) + p(2) + p(3) + p(4) + p(5) =\] +\[= \frac{2.5^0 e^{-2.5}}{0!} + \frac{2.5^1 e^{-2.5}}{1!} + \dots =\] +\[=0.957979\]

    +

So the probability of getting more than 5 calls will be \(1 - 0.957979 = 0.042021\)

    +
    +
    +
    +
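Both answers can be double-checked with R's built-in Poisson functions; a minimal sketch:

dpois(0, lambda = 2.5)                        # P(no calls today), about 0.082
1 - ppois(5, lambda = 2.5)                    # P(more than 5 calls), about 0.042
ppois(5, lambda = 2.5, lower.tail = FALSE)    # equivalent, directly P(X > 5)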

    Exercise 18.5 (Geometric intuition 1) The geometric distribution is a discrete distribution that models the number of +failures before the first success in a sequence of independent Bernoulli trials. +It has a single parameter \(p\), representing the probability of success.

    +

    Disclaimer: There are two forms of this distribution, the one we just described +and another version that models the number of trials before the first success. The +difference is subtle yet significant and you are likely to encounter both forms, +though here we will limit ourselves to the former.

    +

In the graph below we show the pmf of a geometric distribution with \(p=0.5\). This can be thought of as the number of successive failures (tails) in flips of a fair coin. You can see that there is a 50% chance you will have zero failures, i.e. you will flip heads on your very first attempt. But there is some smaller chance that you will flip a sequence of tails in a row, with longer sequences having ever lower probability.

    +
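Note that R's dgeom() uses this same parameterization (the number of failures before the first success), so the probabilities shown in the figure can be reproduced directly; a quick sketch:

k <- 0:5
dgeom(k, prob = 0.5)   # P(k tails before the first heads): 0.5, 0.25, 0.125, ...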

Suppose you are gambling over coin flips with your friend. They propose that if they get 5 tails in a row you must give them 100€, and if they fail you get 1€.

    +
      +
a. Does it make sense to accept this wager?
b. They change their mind and claim that getting 10 tails in a row is half as likely, so now they want 200€ if they succeed, but still only offer to pay 1€ if they fail. Does it make sense to accept this wager?
c. Bonus: Look up the second form of this distribution and redo problems a) and b).
    +
    +

    +
    +
    +

Solution.
a. The probability of 5 tails in a row is \(\left(\frac{1}{2}\right)^5 = \frac{1}{32}\), so you lose 100€ with probability \(\frac{1}{32}\) and win 1€ with probability \(\frac{31}{32}\). Your expected winnings are \(\frac{31}{32} \cdot 1 - \frac{1}{32} \cdot 100 \approx -2.16\)€, so it does not make sense to accept this wager.
b. Their claim is wrong: the probability of 10 tails in a row is \(\left(\frac{1}{2}\right)^{10} = \frac{1}{1024}\), which is 32 times smaller than \(\frac{1}{32}\), not half. Your expected winnings are now \(\frac{1023}{1024} \cdot 1 - \frac{1}{1024} \cdot 200 \approx 0.80\)€, so accepting this wager does make sense.

    +
    +
    +
    +
    +
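A small R sketch of the expected-value calculations in this solution (the payouts are the ones proposed in the exercise):

p5  <- 0.5^5    # probability of 5 tails in a row
p10 <- 0.5^10   # probability of 10 tails in a row
(1 - p5)  * 1 - p5  * 100   # expected winnings for wager a), about -2.16
(1 - p10) * 1 - p10 * 200   # expected winnings for wager b), about  0.80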

    18.2 Continuous distributions

    + +
    +
    + + + +
    + + +
    +
    +
    + + +
    + + + + + + + + + + + + + + + + diff --git a/docs/distributions.html b/docs/distributions.html new file mode 100644 index 0000000..e69de29 diff --git a/docs/probability-distributions.html b/docs/probability-distributions.html index 936088f..759f03c 100644 --- a/docs/probability-distributions.html +++ b/docs/probability-distributions.html @@ -23,7 +23,7 @@ - + @@ -54,70 +54,6 @@ - + + + + + + + + + + + +
    + +
    + +
    + +
    +
    + + +
    +
    + +
    +
    +

    B Probability distributions

| Name | parameters | support | pdf/pmf | mean | variance |
|------|------------|---------|---------|------|----------|
| Bernoulli | \(p \in [0,1]\) | \(k \in \{0,1\}\) | \(p^k (1 - p)^{1 - k}\) (1.12) | \(p\) (7.1) | \(p(1-p)\) (7.1) |
| binomial | \(n \in \mathbb{N}\), \(p \in [0,1]\) | \(k \in \{0,1,\dots,n\}\) | \(\binom{n}{k} p^k (1 - p)^{n - k}\) (4.4) | \(np\) (7.2) | \(np(1-p)\) (7.2) |
| Poisson | \(\lambda > 0\) | \(k \in \mathbb{N}_0\) | \(\frac{\lambda^k e^{-\lambda}}{k!}\) (4.6) | \(\lambda\) (7.3) | \(\lambda\) (7.3) |
| geometric | \(p \in (0,1]\) | \(k \in \mathbb{N}_0\) | \(p(1-p)^k\) (4.5) | \(\frac{1 - p}{p}\) (7.4) | \(\frac{1 - p}{p^2}\) (9.3) |
| normal | \(\mu \in \mathbb{R}\), \(\sigma^2 > 0\) | \(x \in \mathbb{R}\) | \(\frac{1}{\sqrt{2 \pi \sigma^2}} e^{-\frac{(x - \mu)^2}{2 \sigma^2}}\) (4.12) | \(\mu\) (7.8) | \(\sigma^2\) (7.8) |
| uniform | \(a,b \in \mathbb{R}\), \(a < b\) | \(x \in [a,b]\) | \(\frac{1}{b-a}\) (4.9) | \(\frac{a+b}{2}\) | \(\frac{(b-a)^2}{12}\) |
| beta | \(\alpha,\beta > 0\) | \(x \in [0,1]\) | \(\frac{x^{\alpha - 1} (1 - x)^{\beta - 1}}{\text{B}(\alpha, \beta)}\) (4.10) | \(\frac{\alpha}{\alpha + \beta}\) (7.6) | \(\frac{\alpha \beta}{(\alpha + \beta)^2(\alpha + \beta + 1)}\) (7.6) |
| gamma | \(\alpha,\beta > 0\) | \(x \in (0, \infty)\) | \(\frac{\beta^\alpha}{\Gamma(\alpha)} x^{\alpha - 1}e^{-\beta x}\) (4.11) | \(\frac{\alpha}{\beta}\) (7.5) | \(\frac{\alpha}{\beta^2}\) (7.5) |
| exponential | \(\lambda > 0\) | \(x \in [0, \infty)\) | \(\lambda e^{-\lambda x}\) (4.8) | \(\frac{1}{\lambda}\) (7.7) | \(\frac{1}{\lambda^2}\) (7.7) |
| logistic | \(\mu \in \mathbb{R}\), \(s > 0\) | \(x \in \mathbb{R}\) | \(\frac{e^{-\frac{x - \mu}{s}}}{s(1 + e^{-\frac{x - \mu}{s}})^2}\) (4.13) | \(\mu\) | \(\frac{s^2 \pi^2}{3}\) |
| negative binomial | \(r \in \mathbb{N}\), \(p \in [0,1]\) | \(k \in \mathbb{N}_0\) | \(\binom{k + r - 1}{k}(1-p)^r p^k\) (4.7) | \(\frac{rp}{1 - p}\) (9.2) | \(\frac{rp}{(1 - p)^2}\) (9.2) |
| multinomial | \(n \in \mathbb{N}\), \(k \in \mathbb{N}\), \(p_i \in [0,1]\), \(\sum p_i = 1\) | \(x_i \in \{0,\dots,n\}\), \(i \in \{1,\dots,k\}\), \(\sum{x_i} = n\) | \(\frac{n!}{x_1!x_2!\dots x_k!} p_1^{x_1} p_2^{x_2}\dots p_k^{x_k}\) (8.1) | \(np_i\) | \(np_i(1-p_i)\) |
    +
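As a practical aside, most of the distributions in this table have a corresponding family of functions in base R: d* for the pdf/pmf, p* for the cdf, q* for quantiles and r* for random sampling. For example:

dbinom(3, size = 10, prob = 0.5)     # binomial pmf at k = 3
pnorm(1.96)                          # standard normal cdf at 1.96
qbeta(0.5, shape1 = 2, shape2 = 5)   # median of a Beta(2, 5)
rpois(3, lambda = 7.5)               # three random draws from a Poisson(7.5)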
    + +
    +
    +
    + + +
    +
    + + + + + + + + + + + + + + + diff --git a/docs/eb.html b/docs/eb.html index 60263e5..44adc0e 100644 --- a/docs/eb.html +++ b/docs/eb.html @@ -23,7 +23,7 @@ - + @@ -242,6 +242,11 @@
  • 17.1 Conjugate priors
  • 17.2 Posterior sampling
  • +
  • 18 Distributions intutition +
  • Appendix
  • A R programming language
  • -
  • B Probability distributions
  • +
  • B Probability distributions
  • References
  • Published with bookdown
  • diff --git a/docs/ev.html b/docs/ev.html index f2b709d..5ec2b00 100644 --- a/docs/ev.html +++ b/docs/ev.html @@ -23,7 +23,7 @@ - + @@ -242,6 +242,11 @@
  • 17.1 Conjugate priors
  • 17.2 Posterior sampling
  • +
  • 18 Distributions intutition +
  • Appendix
  • A R programming language
  • -
  • B Probability distributions
  • +
  • B Probability distributions
  • References
  • Published with bookdown
  • diff --git a/docs/index.html b/docs/index.html index 91bc273..71bb37d 100644 --- a/docs/index.html +++ b/docs/index.html @@ -23,7 +23,7 @@ - + @@ -242,6 +242,11 @@
  • 17.1 Conjugate priors
  • 17.2 Posterior sampling
  • +
  • 18 Distributions intutition +
  • Appendix
  • A R programming language
  • -
  • B Probability distributions
  • +
  • B Probability distributions
  • References
  • Published with bookdown
  • @@ -295,7 +300,7 @@

    Preface

    diff --git a/docs/integ.html b/docs/integ.html index e8cb4de..a068607 100644 --- a/docs/integ.html +++ b/docs/integ.html @@ -23,7 +23,7 @@ - + @@ -242,6 +242,11 @@
  • 17.1 Conjugate priors
  • 17.2 Posterior sampling
  • +
  • 18 Distributions intutition +
  • Appendix
  • A R programming language
  • -
  • B Probability distributions
  • +
  • B Probability distributions
  • References
  • Published with bookdown
  • diff --git a/docs/introduction.html b/docs/introduction.html index f95c454..3ca7b69 100644 --- a/docs/introduction.html +++ b/docs/introduction.html @@ -23,7 +23,7 @@ - + @@ -242,6 +242,11 @@
  • 17.1 Conjugate priors
  • 17.2 Posterior sampling
  • +
  • 18 Distributions intutition +
  • Appendix
  • A R programming language
  • -
  • B Probability distributions
  • +
  • B Probability distributions
  • References
  • Published with bookdown
  • diff --git a/docs/lt.html b/docs/lt.html index e23041f..095b208 100644 --- a/docs/lt.html +++ b/docs/lt.html @@ -23,7 +23,7 @@ - + @@ -242,6 +242,11 @@
  • 17.1 Conjugate priors
  • 17.2 Posterior sampling
  • +
  • 18 Distributions intutition +
  • Appendix
  • A R programming language
  • -
  • B Probability distributions
  • +
  • B Probability distributions
  • References
  • Published with bookdown
  • diff --git a/docs/ml.html b/docs/ml.html index 7f2a07d..7d2bd1f 100644 --- a/docs/ml.html +++ b/docs/ml.html @@ -23,7 +23,7 @@ - + @@ -242,6 +242,11 @@
  • 17.1 Conjugate priors
  • 17.2 Posterior sampling
  • +
  • 18 Distributions intutition +
  • Appendix
  • A R programming language
  • -
  • B Probability distributions
  • +
  • B Probability distributions
  • References
  • Published with bookdown
  • @@ -607,27 +612,27 @@

    15.1 Deriving MLE Sigma_dec$vectors

    ##             [,1]       [,2]        [,3]        [,4]         [,5]        [,6]
    -##  [1,]  0.4158823  0.1488081 -0.26747198  0.08833244 -0.442314456  0.03071237
    -##  [2,] -0.3940515 -0.1520815 -0.16894945  0.24424963  0.368913901 -0.09378242
    -##  [3,] -0.2691057  0.4835374  0.09853273  0.10776276 -0.009754680  0.23002054
    -##  [4,] -0.2122818  0.0278985 -0.85498656 -0.38794393 -0.001876311  0.07454380
    -##  [5,]  0.3558474  0.3521598 -0.18949642 -0.08057457  0.146965351 -0.32692886
    -##  [6,]  0.4334816  0.0695682 -0.12616012  0.38229029 -0.088802794  0.21049130
    -##  [7,] -0.1757923  0.5033347  0.04609969 -0.02558404  0.019358607  0.61491241
    -##  [8,] -0.3840821  0.1495820  0.13687235 -0.14396548 -0.716743474 -0.34776037
    -##  [9,] -0.1799436  0.3719570 -0.19232803  0.60046566  0.095582043 -0.43744387
    -## [10,]  0.1701426  0.4209653  0.22255233 -0.48564231  0.339772188 -0.30032419
    +##  [1,]  0.4158823  0.1488081 -0.26747198 -0.08833244 -0.442314456  0.03071237
    +##  [2,] -0.3940515 -0.1520815 -0.16894945 -0.24424963  0.368913901 -0.09378242
    +##  [3,] -0.2691057  0.4835374  0.09853273 -0.10776276 -0.009754680  0.23002054
    +##  [4,] -0.2122818  0.0278985 -0.85498656  0.38794393 -0.001876311  0.07454380
    +##  [5,]  0.3558474  0.3521598 -0.18949642  0.08057457  0.146965351 -0.32692886
    +##  [6,]  0.4334816  0.0695682 -0.12616012 -0.38229029 -0.088802794  0.21049130
    +##  [7,] -0.1757923  0.5033347  0.04609969  0.02558404  0.019358607  0.61491241
    +##  [8,] -0.3840821  0.1495820  0.13687235  0.14396548 -0.716743474 -0.34776037
    +##  [9,] -0.1799436  0.3719570 -0.19232803 -0.60046566  0.095582043 -0.43744387
    +## [10,]  0.1701426  0.4209653  0.22255233  0.48564231  0.339772188 -0.30032419
     ##             [,7]         [,8]        [,9]       [,10]
    -##  [1,]  0.2543985  0.663712826 -0.10839531 -0.10948045
    -##  [2,]  0.7505343  0.141264141  0.04613910 -0.05580431
    -##  [3,] -0.1106637  0.072505560  0.42247611 -0.65073655
    -##  [4,] -0.1351242 -0.155435871 -0.10206505 -0.11941181
    -##  [5,]  0.1413388 -0.146839303  0.65076229  0.33681395
    -##  [6,]  0.2725296 -0.639003579 -0.20723854 -0.25971800
    -##  [7,]  0.1439726  0.009400445 -0.16724055  0.53450315
    -##  [8,]  0.2732665 -0.276873049 -0.01766443  0.06589572
    -##  [9,] -0.3419099  0.058519366 -0.30619617  0.13093187
    -## [10,]  0.1868704  0.007310045 -0.45688227 -0.24311846
    +## [1,] 0.2543985 0.663712826 -0.10839531 0.10948045 +## [2,] 0.7505343 0.141264141 0.04613910 0.05580431 +## [3,] -0.1106637 0.072505560 0.42247611 0.65073655 +## [4,] -0.1351242 -0.155435871 -0.10206505 0.11941181 +## [5,] 0.1413388 -0.146839303 0.65076229 -0.33681395 +## [6,] 0.2725296 -0.639003579 -0.20723854 0.25971800 +## [7,] 0.1439726 0.009400445 -0.16724055 -0.53450315 +## [8,] 0.2732665 -0.276873049 -0.01766443 -0.06589572 +## [9,] -0.3419099 0.058519366 -0.30619617 -0.13093187 +## [10,] 0.1868704 0.007310045 -0.45688227 0.24311846
    my_pca$rotation
    ##             PC1        PC2         PC3         PC4          PC5         PC6
     ## 100  -0.4158823  0.1488081  0.26747198 -0.08833244 -0.442314456  0.03071237
    diff --git a/docs/mrv.html b/docs/mrv.html
    index 6cdf72c..d2f18d7 100644
    --- a/docs/mrv.html
    +++ b/docs/mrv.html
    @@ -23,7 +23,7 @@
     
     
     
    -
    +
     
       
       
    @@ -242,6 +242,11 @@
     
  • 17.1 Conjugate priors
  • 17.2 Posterior sampling
  • +
  • 18 Distributions intutition +
  • Appendix
  • A R programming language
  • -
  • B Probability distributions
  • +
  • B Probability distributions
  • References
  • Published with bookdown
  • @@ -749,24 +754,24 @@

    8.3 Transformations dmvnorm(x, mu_v, cov_m))) }

    ## [1] "My function: 0.0229514237156383, dmvnorm: 0.0229514237156383"
    -## [1] "My function: 0.00763138915406231, dmvnorm: 0.00763138915406232"
    +## [1] "My function: 0.00763138915406231, dmvnorm: 0.00763138915406231"
     ## [1] "My function: 0.0230688881105741, dmvnorm: 0.0230688881105741"
    -## [1] "My function: 0.0113616213114732, dmvnorm: 0.0113616213114732"
    -## [1] "My function: 0.00151808500121908, dmvnorm: 0.00151808500121908"
    +## [1] "My function: 0.0113616213114731, dmvnorm: 0.0113616213114731"
    +## [1] "My function: 0.00151808500121907, dmvnorm: 0.00151808500121907"
     ## [1] "My function: 0.0257658045974509, dmvnorm: 0.0257658045974509"
     ## [1] "My function: 0.0157963825730805, dmvnorm: 0.0157963825730805"
     ## [1] "My function: 0.00408856287529248, dmvnorm: 0.00408856287529248"
     ## [1] "My function: 0.0327793540101256, dmvnorm: 0.0327793540101256"
     ## [1] "My function: 0.0111606542967978, dmvnorm: 0.0111606542967978"
     ## [1] "My function: 0.0147636757585684, dmvnorm: 0.0147636757585684"
    -## [1] "My function: 0.0142948300412208, dmvnorm: 0.0142948300412208"
    +## [1] "My function: 0.0142948300412207, dmvnorm: 0.0142948300412207"
     ## [1] "My function: 0.0203093820657542, dmvnorm: 0.0203093820657542"
     ## [1] "My function: 0.0287533273357481, dmvnorm: 0.0287533273357481"
     ## [1] "My function: 0.0213402305128623, dmvnorm: 0.0213402305128623"
     ## [1] "My function: 0.0218356957993885, dmvnorm: 0.0218356957993885"
     ## [1] "My function: 0.0250750113961771, dmvnorm: 0.0250750113961771"
     ## [1] "My function: 0.0166498666348048, dmvnorm: 0.0166498666348048"
    -## [1] "My function: 0.0018972510687466, dmvnorm: 0.0018972510687466"
    +## [1] "My function: 0.00189725106874659, dmvnorm: 0.00189725106874659"
     ## [1] "My function: 0.0196697814975113, dmvnorm: 0.0196697814975113"
    diff --git a/docs/mrvs.html b/docs/mrvs.html index 7541054..7aa70c5 100644 --- a/docs/mrvs.html +++ b/docs/mrvs.html @@ -23,7 +23,7 @@ - + @@ -242,6 +242,11 @@
  • 17.1 Conjugate priors
  • 17.2 Posterior sampling
  • +
  • 18 Distributions intutition +
  • Appendix
  • A R programming language
  • -
  • B Probability distributions
  • +
  • B Probability distributions
  • References
  • Published with bookdown
  • diff --git a/docs/nhst.html b/docs/nhst.html index 1e2b245..cd8320a 100644 --- a/docs/nhst.html +++ b/docs/nhst.html @@ -23,7 +23,7 @@ - + @@ -242,6 +242,11 @@
  • 17.1 Conjugate priors
  • 17.2 Posterior sampling
  • +
  • 18 Distributions intutition +
  • Appendix
  • A R programming language
  • -
  • B Probability distributions
  • +
  • B Probability distributions
  • References
  • Published with bookdown
  • diff --git a/docs/reference-keys.txt b/docs/reference-keys.txt index 9c3844f..68bb213 100644 --- a/docs/reference-keys.txt +++ b/docs/reference-keys.txt @@ -114,6 +114,11 @@ exr:unnamed-chunk-263 exr:unnamed-chunk-265 exr:unnamed-chunk-269 exr:unnamed-chunk-273 +exr:unnamed-chunk-277 +exr:unnamed-chunk-280 +exr:unnamed-chunk-283 +exr:unnamed-chunk-286 +exr:unnamed-chunk-288 introduction measure-and-probability-spaces properties-of-probability-measures @@ -167,6 +172,9 @@ nhst bi conjugate-priors posterior-sampling +distributions-intutition +discrete-distributions +continuous-distributions A1 basic-characteristics why-r @@ -189,9 +197,4 @@ functions writing-functions other-tips further-reading-and-references -probability-distributions -exr:unnamed-chunk-5 -distributions-intutition -discrete-distributions -continuous-distributions distributions diff --git a/docs/references.html b/docs/references.html index 8a8531c..20f280c 100644 --- a/docs/references.html +++ b/docs/references.html @@ -23,14 +23,14 @@ - + - + @@ -242,6 +242,11 @@
  • 17.1 Conjugate priors
  • 17.2 Posterior sampling
  • +
  • 18 Distributions intutition +
  • Appendix
  • A R programming language
  • -
  • B Probability distributions
  • +
  • B Probability distributions
  • References
  • Published with bookdown
  • @@ -312,7 +317,7 @@

    References - + diff --git a/docs/rvs.html b/docs/rvs.html index 1f34b75..6286024 100644 --- a/docs/rvs.html +++ b/docs/rvs.html @@ -23,7 +23,7 @@ - + @@ -242,6 +242,11 @@
  • 17.1 Conjugate priors
  • 17.2 Posterior sampling
  • +
  • 18 Distributions intutition +
  • Appendix
  • A R programming language
  • -
  • B Probability distributions
  • +
  • B Probability distributions
  • References
  • Published with bookdown
  • diff --git a/docs/search_index.json b/docs/search_index.json index 55bce71..6ddc581 100644 --- a/docs/search_index.json +++ b/docs/search_index.json @@ -1 +1 @@ -[["distributions-intutition.html", "Chapter 18 Distributions intutition 18.1 Discrete distributions 18.2 Continuous distributions", " Chapter 18 Distributions intutition This chapter is intended to help you familiarize yourself with the different probability distributions you will encounter in this course. Use Appendix B as a reference for the basic properties of distributions. .fold-btn { float: right; margin: 5px 5px 0 0; } .fold { border: 1px solid black; min-height: 40px; } 18.1 Discrete distributions Exercise 18.1 (Bernoulli intuition 1) Arguably the simplest distribution you will enocounter is the Bernoulli distribution. It is a discrete probability distribution used to represent the outcome of a yes/no question. It has one parameter \\(p\\) which is the probability of success. The probability of failure is \\(1-p\\), sometimes denoted as \\(q\\). A classic way to think about a Bernoulli trial (a yes/no experiment) is a coin flip. Real coins are fair, meaning the probability of either heads (1) or tails (0) are the same i.e. \\(p=0.5\\), shown below in figure a. Alternatively we may want to represent a process that doesn’t have equal probabilities of outcomes like “Will a throw of a fair die result in a 6?”. In this case \\(p=\\frac{1}{6}\\), shown in figure b. Using your knowledge of the Bernoulli distribution use the throw of a fair die to think of events, such that: \\(p = 0.5\\) \\(p = \\frac{5}{6}\\) \\(q = \\frac{2}{3}\\) Solution. An event that is equally likely to happen or not happen i.e. \\(p = 0.5\\) would be throwing an even number. More formally we can name this event \\(A\\) and write: \\(A = \\{2,4,6\\}\\), \\(P(A) = 0.5\\) An example of an event with \\(p = \\frac{5}{6}\\) would be throwing a number greater than 1. Defined as \\(B = \\{2,3,4,5,6\\}\\). We need an event that fails \\(\\frac{2}{3}\\) of the time. Alternatively we can reverse the problem and find an event that succeeds \\(\\frac{1}{3}\\) of the time, since: \\(q = 1 - p \\implies p = 1 - q = \\frac{1}{3}\\). The event that our outcome is divisible by 3: \\(C = \\{3, 6\\}\\) satisfies this condition. Exercise 18.2 (Binomial intuition 1) The binomial distribution is a generalization of the Bernoulli distribution. Instead of considering a single Bernoulli trial, we now consider a sequence of \\(n\\) trials, which are independent and have the same parameter \\(p\\). So the binomial distribution has two parameters \\(n\\) - the number of trials and \\(p\\) - the probability of success for each trial. If we return to our coin flip representation, we now flip a coin several times. The binomial distribution will give us the probabilities of all possible outcomes. Below we show the distribution for a series of 10 coin flips with a fair coin (left) and a biased coin (right). The numbers on the x axis represent the number of times the coin landed heads. Using your knowledge of the binomial distribution: Take the pmf of the binomial distribution and plug in \\(n=1\\), check that it is in fact equivalent to a Bernoulli distribution. In our examples we show the graph of a binomial distribution over 10 trials with \\(p=0.8\\). If we take a look at the graph, it appears as though the probabilities of getting 0,1,2 or 3 heads in 10 flips are zero. Is it actually zero? Check by plugging in the values into the pmf. Solution. 
The pmf of a binomial distribution is \\(\\binom{n}{k} p^k (1 - p)^{n - k}\\), now we insert \\(n=1\\) to get: \\[\\binom{1}{k} p^k (1 - p)^{1 - k}\\]. Not quite equivalent to a Bernoulli, however note that the support of the binomial distribution is defined as \\(k \\in \\{0,1,\\dots,n\\}\\), so in our case \\(k = \\{0,1\\}\\), then: \\[\\binom{1}{0} = \\binom{1}{1} = 1\\] we get: \\(p^k (1 - p)^{1 - k}\\) ,the Bernoulli distribution. As we already know \\(p=0.8, n=10\\), so: \\[\\binom{10}{0} 0.8^0 (1 - 0.8)^{10 - 0} = 1.024 \\cdot 10^{-7}\\] \\[\\binom{10}{1} 0.8^1 (1 - 0.8)^{10 - 1} = 4.096 \\cdot 10^{-6}\\] \\[\\binom{10}{2} 0.8^2 (1 - 0.8)^{10 - 2} = 7.3728 \\cdot 10^{-5}\\] \\[\\binom{10}{3} 0.8^3 (1 - 0.8)^{10 - 3} = 7.86432\\cdot 10^{-4}\\] So the probabilities are not zero, just very small. Exercise 18.3 (Poisson intuition 1) Below are shown 3 different graphs of the Poisson distribution. Your task is to replicate them on your own in R by varying the \\(\\lambda\\) parameter. Hint: You can use dpois() to get the probabilities. library(ggplot2) library(gridExtra) x = 0:15 # Create Poisson data data1 <- data.frame(x = x, y = dpois(x, lambda = 0.1)) data2 <- data.frame(x = x, y = dpois(x, lambda = 1)) data3 <- data.frame(x = x, y = dpois(x, lambda = 7.5)) # Create individual ggplot objects plot1 <- ggplot(data1, aes(x, y)) + geom_col() + xlab("x") + ylab("Probability") + ylim(0,1) plot2 <- ggplot(data2, aes(x, y)) + geom_col() + xlab("x") + ylab(NULL) + ylim(0,1) plot3 <- ggplot(data3, aes(x, y)) + geom_col() + xlab("x") + ylab(NULL) + ylim(0,1) # Combine the plots grid.arrange(plot1, plot2, plot3, ncol = 3) Exercise 18.4 (Poisson intuition 2) The Poisson distribution is a discrete probability distribution that models processes where events occur at a constant mean rate and are independent of each other. It has a single parameter \\(\\lambda\\), which represents the constant mean rate. A classic example of a scenario that can be modeled using the Poisson distribution is the number of calls received by a call center in a day (or in fact any other time interval). Suppose you work in a call center and have some understanding of probability distributions. You overhear your supervisor mentioning that the call center receives an average of 2.5 calls per day. Using your knowledge of the Poisson distribution, calculate: The probability you will get no calls today. The probability you will get more than 5 calls today. Solution. First recall the Poisson pmf: \\[p(k) = \\frac{\\lambda^k e^{-\\lambda}}{k!}\\] as stated previously our parameter \\(\\lambda = 2.5\\) To get the probability of no calls we simply plug in \\(k = 0\\), so: \\[p(0) = \\frac{2.5^0 e^{-2.5}}{0!} = e^{-2.5} \\approx 0.082\\] The support of the Poisson distribution is non-negative integers. So if we wanted to calculate the probability of getting more than 5 calls we would need to add up the probabilities of getting 6 calls and 7 calls and so on up to infinity. Let us instead remember that the sum of all probabilties will be 1, we will reverse the problem and instead ask “What is the probability we get 5 calls or less?”. We can subtract the probability of the opposite outcome (the complement) from 1 to get the probability of our original question. 
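(The formal calculation is worked out just below; as a quick numerical cross-check, base R's dpois and ppois give the same complement. The snippet is only an illustrative sketch of that check, not part of the original solution.)

```{r}
# Sketch: P(X > 5) for X ~ Poisson(lambda = 2.5), computed via the complement
lambda <- 2.5
1 - ppois(5, lambda)         # 1 - P(X <= 5), approximately 0.042
1 - sum(dpois(0:5, lambda))  # equivalent check, summing the pmf terms directly
```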
\\[P(k > 5) = 1 - P(k \\leq 5)\\] \\[P(k \\leq 5) = \\sum_{i=0}^{5} p(i) = p(0) + p(1) + p(2) + p(3) + p(4) + p(5) =\\] \\[= \\frac{2.5^0 e^{-2.5}}{0!} + \\frac{2.5^1 e^{-2.5}}{1!} + \\dots =\\] \\[=0.957979\\] So the probability of geting more than 5 calls will be \\(1 - 0.957979 = 0.042021\\) Exercise 18.5 (Geometric intuition 1) The geometric distribution is a discrete distribution that models the number of failures before the first success in a sequence of independent Bernoulli trials. It has a single parameter \\(p\\), representing the probability of success. Disclaimer: There are two forms of this distribution, the one we just described and another version that models the number of trials before the first success. The difference is subtle yet significant and you are likely to encounter both forms, though here we will limit ourselves to the former. In the graph below we show the pmf of a geometric distribution with \\(p=0.5\\). This can be thought of as the number of successive failures (tails) in the flip of a fair coin. You can see that there’s a 50% chance you will have zero failures i.e. you will flip a heads on your very first attempt. But there is some smaller chance that you will flip a sequence of tails in a row, with longer sequences having ever lower probability. Suppose you are gambling over coin flips with your friend and they propose if they get 5 tails in a row you must give them 100€ and you get 1€ if they fail. Does it make sense to accept this wager? They change their mind and claim that getting 10 tails in a row is half as likely, so now they want 200€ if they succeed, but still only offer to pay 1€ if they fail. Does it make sense to accept this wager? Bonus: Look up the second form of this distribution and redo problems a) and b). Solution. aaa 18.2 Continuous distributions "],["404.html", "Page not found", " Page not found The page you requested cannot be found (perhaps it was moved or renamed). You may want to try searching to find the page's new location, or use the table of contents to find the page you are looking for. "]] +[["index.html", "Principles of Uncertainty – exercises Preface", " Principles of Uncertainty – exercises Gregor Pirš and Erik Štrumbelj 2023-09-25 Preface These are the exercises for the Principles of Uncertainty course of the Data Science Master’s at University of Ljubljana, Faculty of Computer and Information Science. This document will be extended each week as the course progresses. At the end of each exercise session, we will post the solutions to the exercises worked in class and select exercises for homework. Students are also encouraged to solve the remaining exercises to further extend their knowledge. Some exercises require the use of R. Those exercises (or parts of) are coloured blue. Students that are not familiar with R programming language should study A to learn the basics. As the course progresses, we will cover more relevant uses of R for data science. "],["introduction.html", "Chapter 1 Probability spaces 1.1 Measure and probability spaces 1.2 Properties of probability measures 1.3 Discrete probability spaces", " Chapter 1 Probability spaces This chapter deals with measures and probability spaces. At the end of the chapter, we look more closely at discrete probability spaces. The students are expected to acquire the following knowledge: Theoretical Use properties of probability to calculate probabilities. Combinatorics. Understanding of continuity of probability. R Vectors and vector operations. For loop. 
Estimating probability with simulation. sample function. Matrices and matrix operations. .fold-btn { float: right; margin: 5px 5px 0 0; } .fold { border: 1px solid black; min-height: 40px; } 1.1 Measure and probability spaces Exercise 1.1 (Completing a set to a sigma algebra) Let \\(\\Omega = \\{1,2,...,10\\}\\) and let \\(A = \\{\\emptyset, \\{1\\}, \\{2\\}, \\Omega \\}\\). Show that \\(A\\) is not a sigma algebra of \\(\\Omega\\). Find the minimum number of elements to complete A to a sigma algebra of \\(\\Omega\\). Solution. \\(1^c = \\{2,3,...,10\\} \\notin A \\implies\\) \\(A\\) is not sigma algebra. First we need the complements of all elements, so we need to add sets \\(\\{2,3,...,10\\}\\) and \\(\\{1,3,4,...,10\\}\\). Next we need unions of all sets – we add the set \\(\\{1,2\\}\\). Again we need the complement of this set, so we add \\(\\{3,4,...,10\\}\\). So the minimum number of elements we need to add is 4. Exercise 1.2 (Diversity of sigma algebras) Let \\(\\Omega\\) be a set. Find the smallest sigma algebra of \\(\\Omega\\). Find the largest sigma algebra of \\(\\Omega\\). Solution. \\(A = \\{\\emptyset, \\Omega\\}\\) \\(2^{\\Omega}\\) Exercise 1.3 Find all sigma algebras for \\(\\Omega = \\{0, 1, 2\\}\\). Solution. \\(A = \\{\\emptyset, \\Omega\\}\\) \\(A = 2^{\\Omega}\\) \\(A = \\{\\emptyset, \\{0\\}, \\{1,2\\}, \\Omega\\}\\) \\(A = \\{\\emptyset, \\{1\\}, \\{0,2\\}, \\Omega\\}\\) \\(A = \\{\\emptyset, \\{2\\}, \\{0,1\\}, \\Omega\\}\\) Exercise 1.4 (Difference between algebra and sigma algebra) Let \\(\\Omega = \\mathbb{N}\\) and \\(\\mathcal{A} = \\{A \\subseteq \\mathbb{N}: A \\text{ is finite or } A^c \\text{ is finite.} \\}\\). Show that \\(\\mathcal{A}\\) is an algebra but not a sigma algebra. Solution. \\(\\emptyset\\) is finite so \\(\\emptyset \\in \\mathcal{A}\\). Let \\(A \\in \\mathcal{A}\\) and \\(B \\in \\mathcal{A}\\). If both are finite, then their union is also finite and therefore in \\(\\mathcal{A}\\). Let at least one of them not be finite. Then their union is not finite. But \\((A \\cup B)^c = A^c \\cap B^c\\). And since at least one is infinite, then its complement is finite and the intersection is too. So finite unions are in \\(\\mathcal{A}\\). Let us look at numbers \\(2n\\). For any \\(n\\), \\(2n \\in \\mathcal{A}\\) as it is finite. But \\(\\bigcup_{k = 1}^{\\infty} 2n \\notin \\mathcal{A}\\). Exercise 1.5 We define \\(\\sigma(X) = \\cap_{\\lambda \\in I} S_\\lambda\\) to be a sigma algebra, generated by the set \\(X\\), where \\(S_\\lambda\\) are all sigma algebras such that \\(X \\subseteq S_\\lambda\\). \\(S_\\lambda\\) are indexed by \\(\\lambda \\in I\\). Let \\(A, B \\subseteq 2^{\\Omega}\\). Prove that \\(\\sigma(A) = \\sigma(B) \\iff A \\subseteq \\sigma(B) \\land B \\subseteq \\sigma(A)\\). Solution. To prove the equivalence, we need to prove that the left hand side implies the right hand side and vice versa. Proving \\(\\sigma(A) = \\sigma(B) \\Rightarrow A \\subseteq \\sigma(B) \\land B \\subseteq \\sigma(A)\\): we know \\(A \\subseteq \\sigma(A)\\) is always true, so by substituting in \\(\\sigma(B)\\) from the left hand side equality we obtain \\(A \\subseteq \\sigma(B)\\). We obtain \\(B \\subseteq \\sigma(A)\\) by symmetry. This proves the implication. 
Proving \\(A \\subseteq \\sigma(B) \\land B \\subseteq \\sigma(A) \\Rightarrow \\sigma(A) = \\sigma(B)\\): by definition of a sigma algebra, generated by a set, we have \\(\\sigma(B) = \\cap_{\\lambda \\in I} S_\\lambda\\) where \\(S_\\lambda\\) are all sigma algebras where \\(B \\subseteq S_\\lambda\\). But \\(\\sigma(A)\\) is one of \\(S_\\lambda\\), so we can write \\(\\sigma(B) = \\sigma(A) \\cap \\left(\\cap_{\\lambda \\in I} S_\\lambda \\right)\\), which implies \\(\\sigma(B) \\subseteq \\sigma(A)\\). By symmetry, we have \\(\\sigma(A) \\subseteq \\sigma(B)\\). Since \\(\\sigma(A) \\subseteq \\sigma(B)\\) and \\(\\sigma(B) \\subseteq \\sigma(A)\\), we obtain \\(\\sigma(A) = \\sigma(B)\\), which proves the implication and completes the equivalence proof. Exercise 1.6 (Intro to measure) Take the measurable space \\(\\Omega = \\{1,2\\}\\), \\(F = 2^{\\Omega}\\). Which of the following is a measure? Which is a probability measure? \\(\\mu(\\emptyset) = 0\\), \\(\\mu(\\{1\\}) = 5\\), \\(\\mu(\\{2\\}) = 6\\), \\(\\mu(\\{1,2\\}) = 11\\) \\(\\mu(\\emptyset) = 0\\), \\(\\mu(\\{1\\}) = 0\\), \\(\\mu(\\{2\\}) = 0\\), \\(\\mu(\\{1,2\\}) = 1\\) \\(\\mu(\\emptyset) = 0\\), \\(\\mu(\\{1\\}) = 0\\), \\(\\mu(\\{2\\}) = 0\\), \\(\\mu(\\{1,2\\}) = 0\\) \\(\\mu(\\emptyset) = 0\\), \\(\\mu(\\{1\\}) = 0\\), \\(\\mu(\\{2\\}) = 1\\), \\(\\mu(\\{1,2\\}) = 1\\) \\(\\mu(\\emptyset)=0\\), \\(\\mu(\\{1\\})=0\\), \\(\\mu(\\{2\\})=\\infty\\), \\(\\mu(\\{1,2\\})=\\infty\\) Solution. Measure. Not probability measure since \\(\\mu(\\Omega) > 1\\). Neither due to countable additivity. Measure. Not probability measure since \\(\\mu(\\Omega) = 0\\). Probability measure. Measure. Not probability measure since \\(\\mu(\\Omega) > 1\\). Exercise 1.7 Define a probability space that could be used to model the outcome of throwing two fair 6-sided dice. Solution. \\(\\Omega = \\{\\{i,j\\}, i = 1,...,6, j = 1,...,6\\}\\) \\(F = 2^{\\Omega}\\) \\(\\forall \\omega \\in \\Omega\\), \\(P(\\omega) = \\frac{1}{6} \\times \\frac{1}{6} = \\frac{1}{36}\\) 1.2 Properties of probability measures Exercise 1.8 A standard deck (52 cards) is distributed to two persons: 26 cards to each person. All partitions are equally likely. Find the probability that: The first person gets 4 Queens. The first person gets at least 2 Queens. R: Use simulation (sample) to check the above answers. Solution. \\(\\frac{\\binom{48}{22}}{\\binom{52}{26}}\\) 1 - \\(\\frac{\\binom{48}{26} + 4 \\times \\binom{48}{25}}{\\binom{52}{26}}\\) For the simulation, let us represent cards with numbers from 1 to 52, and let 1 through 4 represent Queens. set.seed(1) cards <- 1:52 n <- 10000 q4 <- vector(mode = "logical", length = n) q2 <- vector(mode = "logical", length = n) tmp <- vector(mode = "logical", length = n) for (i in 1:n) { p1 <- sample(1:52, 26) q4[i] <- sum(1:4 %in% p1) == 4 q2[i] <- sum(1:4 %in% p1) >= 2 } sum(q4) / n ## [1] 0.0572 sum(q2) / n ## [1] 0.6894 Exercise 1.9 Let \\(A\\) and \\(B\\) be events with probabilities \\(P(A) = \\frac{2}{3}\\) and \\(P(B) = \\frac{1}{2}\\). Show that \\(\\frac{1}{6} \\leq P(A\\cap B) \\leq \\frac{1}{2}\\), and give examples to show that both extremes are possible. Find corresponding bounds for \\(P(A\\cup B)\\). R: Draw samples from the examples and show the probability bounds of \\(P(A \\cap B)\\) . Solution. From the properties of probability we have \\[\\begin{equation} P(A \\cup B) = P(A) + P(B) - P(A \\cap B) \\leq 1. 
\\end{equation}\\] From this follows \\[\\begin{align} P(A \\cap B) &\\geq P(A) + P(B) - 1 \\\\ &= \\frac{2}{3} + \\frac{1}{2} - 1 \\\\ &= \\frac{1}{6}, \\end{align}\\] which is the lower bound for the intersection. Conversely, we have \\[\\begin{equation} P(A \\cup B) = P(A) + P(B) - P(A \\cap B) \\geq P(A). \\end{equation}\\] From this follows \\[\\begin{align} P(A \\cap B) &\\leq P(B) \\\\ &= \\frac{1}{2}, \\end{align}\\] which is the upper bound for the intersection. For an example take a fair die. To achieve the lower bound let \\(A = \\{3,4,5,6\\}\\) and \\(B = \\{1,2,3\\}\\), then their intersection is \\(A \\cap B = \\{3\\}\\). To achieve the upper bound take \\(A = \\{1,2,3,4\\}\\) and $B = {1,2,3} $. For the bounds of the union we will use the results from the first part. Again from the properties of probability we have \\[\\begin{align} P(A \\cup B) &= P(A) + P(B) - P(A \\cap B) \\\\ &\\geq P(A) + P(B) - \\frac{1}{2} \\\\ &= \\frac{2}{3}. \\end{align}\\] Conversely \\[\\begin{align} P(A \\cup B) &= P(A) + P(B) - P(A \\cap B) \\\\ &\\leq P(A) + P(B) - \\frac{1}{6} \\\\ &= 1. \\end{align}\\] Therefore \\(\\frac{2}{3} \\leq P(A \\cup B) \\leq 1\\). We use sample in R: set.seed(1) n <- 10000 samps <- sample(1:6, n, replace = TRUE) # lower bound lb <- vector(mode = "logical", length = n) A <- c(1,2,3) B <- c(3,4,5,6) for (i in 1:n) { lb[i] <- samps[i] %in% A & samps[i] %in% B } sum(lb) / n ## [1] 0.1605 # upper bound ub <- vector(mode = "logical", length = n) A <- c(1,2,3) B <- c(1,2,3,4) for (i in 1:n) { ub[i] <- samps[i] %in% A & samps[i] %in% B } sum(ub) / n ## [1] 0.4913 Exercise 1.10 A fair coin is tossed repeatedly. Show that, with probability one, a head turns up sooner or later. Show similarly that any given finite sequence of heads and tails occurs eventually with probability one. Solution. \\[\\begin{align} P(\\text{no heads}) &= \\lim_{n \\rightarrow \\infty} P(\\text{no heads in first }n \\text{ tosses}) \\\\ &= \\lim_{n \\rightarrow \\infty} \\frac{1}{2^n} \\\\ &= 0. \\end{align}\\] For the second part, let us fix the given sequence of heads and tails of length \\(k\\) as \\(s\\). A probability that this happens in \\(k\\) tosses is \\(\\frac{1}{2^k}\\). \\[\\begin{align} P(s \\text{ occurs}) &= \\lim_{n \\rightarrow \\infty} P(s \\text{ occurs in first } nk \\text{ tosses}) \\end{align}\\] The right part of the upper equation is greater than if \\(s\\) occurs either in the first \\(k\\) tosses, second \\(k\\) tosses,…, \\(n\\)-th \\(k\\) tosses. Therefore \\[\\begin{align} P(s \\text{ occurs}) &\\geq \\lim_{n \\rightarrow \\infty} P(s \\text{ occurs in first } n \\text{ disjoint sequences of length } k) \\\\ &= \\lim_{n \\rightarrow \\infty} (1 - P(s \\text{ does not occur in first } n \\text{ disjoint sequences})) \\\\ &= 1 - \\lim_{n \\rightarrow \\infty} P(s \\text{ does not occur in first } n \\text{ disjoint sequences}) \\\\ &= 1 - \\lim_{n \\rightarrow \\infty} (1 - \\frac{1}{2^k})^n \\\\ &= 1. \\end{align}\\] Exercise 1.11 An Erdos-Renyi random graph \\(G(n,p)\\) is a model with \\(n\\) nodes, where each pair of nodes is connected with probability \\(p\\). Calculate the probability that there exists a node that is not connected to any other node in \\(G(4,0.6)\\). Show that the upper bound for the probability that there exist 2 nodes that are not connected to any other node for an arbitrary \\(G(n,p)\\) is \\(\\binom{n}{2} (1-p)^{2n - 3}\\). R: Estimate the probability from the first point using simulation. Solution. 
Let \\(A_i\\) be the event that the \\(i\\)-th node is not connected to any other node. Then our goal is to calculate \\(P(\\cup_{i=1}^n A_i)\\). Using the inclusion-exclusion principle, we get \\[\\begin{align} P(\\cup_{i=1}^n A_i) &= \\sum_i A_i - \\sum_{i<j} P(A_i \\cap A_j) + \\sum_{i<j<k} P(A_i \\cap A_j \\cap A_k) - P(A_1 \\cap A_2 \\cap A_3 \\cap A_4) \\\\ &=4 (1 - p)^3 - \\binom{4}{2} (1 - p)^5 + \\binom{4}{3} (1 - p)^6 - (1 - p)^6 \\\\ &\\approx 0.21. \\end{align}\\] Let \\(A_{ij}\\) be the event that nodes \\(i\\) and \\(j\\) are not connected to any other node. We are interested in \\(P(\\cup_{i<j}A_{ij})\\). By using Boole`s inequality, we get \\[\\begin{align} P(\\cup_{i<j}A_{ij}) \\leq \\sum_{i<j} P(A_{ij}). \\end{align}\\] What is the probability of \\(A_{ij}\\)? There need to be no connections to the \\(i\\)-th node to the remaining nodes (excluding \\(j\\)), the same for the \\(j\\)-th node, and there can be no connection between them. Therefore \\[\\begin{align} P(\\cup_{i<j}A_{ij}) &\\leq \\sum_{i<j} (1 - p)^{2(n-2) + 1} \\\\ &= \\binom{n}{2} (1 - p)^{2n - 3}. \\end{align}\\] set.seed(1) n_samp <- 100000 n <- 4 p <- 0.6 conn_samp <- vector(mode = "logical", length = n_samp) for (i in 1:n_samp) { tmp_mat <- matrix(data = 0, nrow = n, ncol = n) samp_conn <- sample(c(0,1), choose(4,2), replace = TRUE, prob = c(1 - p, p)) tmp_mat[lower.tri(tmp_mat)] <- samp_conn tmp_mat[upper.tri(tmp_mat)] <- t(tmp_mat)[upper.tri(t(tmp_mat))] not_conn <- apply(tmp_mat, 1, sum) if (any(not_conn == 0)) { conn_samp[i] <- TRUE } else { conn_samp[i] <- FALSE } } sum(conn_samp) / n_samp ## [1] 0.20565 1.3 Discrete probability spaces Exercise 1.12 Show that the standard measurable space on \\(\\Omega = \\{0,1,...,n\\}\\) equipped with binomial measure is a discrete probability space. Define another probability measure on this measurable space. Show that for \\(n=1\\) the binomial measure is the same as the Bernoulli measure. R: Draw 1000 samples from the binomial distribution \\(p=0.5\\), \\(n=20\\) (rbinom) and compare relative frequencies with theoretical probability measure. Solution. We need to show that the terms of \\(\\sum_{k=0}^n \\binom{n}{k} p^k (1 - p)^{n - k}\\) sum to 1. For that we use the binomial theorem \\(\\sum_{k=0}^n \\binom{n}{k} x^k y^{n-k} = (x + y)^n\\). So \\[\\begin{equation} \\sum_{k=0}^n \\binom{n}{k} p^k (1 - p)^{n - k} = (p + 1 - p)^n = 1. \\end{equation}\\] \\(P(\\{k\\}) = \\frac{1}{n + 1}\\). When \\(n=1\\) then \\(k \\in \\{0,1\\}\\). Inserting \\(n=1\\) into the binomial measure, we get \\(\\binom{1}{k}p^k (1-p)^{1 - k}\\). Now \\(\\binom{1}{1} = \\binom{1}{0} = 1\\), so the measure is \\(p^k (1-p)^{1 - k}\\), which is the Bernoulli measure. set.seed(1) library(ggplot2) library(dplyr) bin_samp <- rbinom(n = 1000, size = 20, prob = 0.5) bin_samp <- data.frame(x = bin_samp) %>% count(x) %>% mutate(n = n / 1000, type = "empirical_frequencies") %>% bind_rows(data.frame(x = 0:20, n = dbinom(0:20, size = 20, prob = 0.5), type = "theoretical_measure")) bin_plot <- ggplot(data = bin_samp, aes(x = x, y = n, fill = type)) + geom_bar(stat="identity", position = "dodge") plot(bin_plot) Exercise 1.13 Show that the standard measurable space on \\(\\Omega = \\{0,1,...,\\infty\\}\\) equipped with geometric measure is a discrete probability space, equipped with Poisson measure is a discrete probability space. Define another probability measure on this measurable space. 
R: Draw 1000 samples from the Poisson distribution \\(\\lambda = 10\\) (rpois) and compare relative frequencies with theoretical probability measure. Solution. \\(\\sum_{k = 0}^{\\infty} p(1 - p)^k = p \\sum_{k = 0}^{\\infty} (1 - p)^k = p \\frac{1}{1 - 1 + p} = 1\\). We used the formula for geometric series. \\(\\sum_{k = 0}^{\\infty} \\frac{\\lambda^k e^{-\\lambda}}{k!} = e^{-\\lambda} \\sum_{k = 0}^{\\infty} \\frac{\\lambda^k}{k!} = e^{-\\lambda} e^{\\lambda} = 1.\\) We used the Taylor expansion of the exponential function. Since we only have to define a probability measure, we could only assign probabilities that sum to one to a finite number of events in \\(\\Omega\\), and probability zero to the other infinite number of events. However to make this solution more educational, we will try to find a measure that assigns a non-zero probability to all events in \\(\\Omega\\). A good start for this would be to find a converging infinite series, as the probabilities will have to sum to one. One simple converging series is the geometric series \\(\\sum_{k=0}^{\\infty} p^k\\) for \\(|p| < 1\\). Let us choose an arbitrary \\(p = 0.5\\). Then \\(\\sum_{k=0}^{\\infty} p^k = \\frac{1}{1 - 0.5} = 2\\). To complete the measure, we have to normalize it, so it sums to one, therefore \\(P(\\{k\\}) = \\frac{0.5^k}{2}\\) is a probability measure on \\(\\Omega\\). We could make it even more difficult by making this measure dependent on some parameter \\(\\alpha\\), but this is out of the scope of this introductory chapter. set.seed(1) pois_samp <- rpois(n = 1000, lambda = 10) pois_samp <- data.frame(x = pois_samp) %>% count(x) %>% mutate(n = n / 1000, type = "empirical_frequencies") %>% bind_rows(data.frame(x = 0:25, n = dpois(0:25, lambda = 10), type = "theoretical_measure")) pois_plot <- ggplot(data = pois_samp, aes(x = x, y = n, fill = type)) + geom_bar(stat="identity", position = "dodge") plot(pois_plot) Exercise 1.14 Define a probability measure on \\((\\Omega = \\mathbb{Z}, 2^{\\mathbb{Z}})\\). Define a probability measure such that \\(P(\\omega) > 0, \\forall \\omega \\in \\Omega\\). R: Implement a random generator that will generate samples with the relative frequency that corresponds to your probability measure. Compare relative frequencies with theoretical probability measure . Solution. \\(P(0) = 1, P(\\omega) = 0, \\forall \\omega \\neq 0\\). \\(P(\\{k\\}) = \\sum_{k = -\\infty}^{\\infty} \\frac{p(1 - p)^{|k|}}{2^{1 - 1_0(k)}}\\), where \\(1_0(k)\\) is the indicator function, which equals to one if \\(k\\) is 0, and equals to zero in every other case. n <- 1000 geom_samps <- rgeom(n, prob = 0.5) sign_samps <- sample(c(FALSE, TRUE), size = n, replace = TRUE) geom_samps[sign_samps] <- -geom_samps[sign_samps] my_pmf <- function (k, p) { indic <- rep(1, length(k)) indic[k == 0] <- 0 return ((p * (1 - p)^(abs(k))) / 2^indic) } geom_samps <- data.frame(x = geom_samps) %>% count(x) %>% mutate(n = n / 1000, type = "empirical_frequencies") %>% bind_rows(data.frame(x = -10:10, n = my_pmf(-10:10, 0.5), type = "theoretical_measure")) geom_plot <- ggplot(data = geom_samps, aes(x = x, y = n, fill = type)) + geom_bar(stat="identity", position = "dodge") plot(geom_plot) Exercise 1.15 Define a probability measure on \\(\\Omega = \\{1,2,3,4,5,6\\}\\) with parameter \\(m \\in \\{1,2,3,4,5,6\\}\\), so that the probability of outcome at distance \\(1\\) from \\(m\\) is half of the probability at distance \\(0\\), at distance \\(2\\) is half of the probability at distance \\(1\\), etc. 
R: Implement a random generator that will generate samples with the relative frequency that corresponds to your probability measure. Compare relative frequencies with theoretical probability measure . Solution. \\(P(\\{k\\}) = \\frac{\\frac{1}{2}^{|m - k|}}{\\sum_{i=1}^6 \\frac{1}{2}^{|m - i|}}\\) n <- 10000 m <- 4 my_pmf <- function (k, m) { denom <- sum(0.5^abs(m - 1:6)) return (0.5^abs(m - k) / denom) } samps <- c() for (i in 1:n) { a <- sample(1:6, 1) a_val <- my_pmf(a, m) prob <- runif(1) if (prob < a_val) { samps <- c(samps, a) } } samps <- data.frame(x = samps) %>% count(x) %>% mutate(n = n / length(samps), type = "empirical_frequencies") %>% bind_rows(data.frame(x = 1:6, n = my_pmf(1:6, m), type = "theoretical_measure")) my_plot <- ggplot(data = samps, aes(x = x, y = n, fill = type)) + geom_bar(stat="identity", position = "dodge") plot(my_plot) "],["uprobspaces.html", "Chapter 2 Uncountable probability spaces 2.1 Borel sets 2.2 Lebesgue measure", " Chapter 2 Uncountable probability spaces This chapter deals with uncountable probability spaces. The students are expected to acquire the following knowledge: Theoretical Understand Borel sets and identify them. Estimate Lebesgue measure for different sets. Know when sets are Borel-measurable. Understanding of countable and uncountable sets. R Uniform sampling. .fold-btn { float: right; margin: 5px 5px 0 0; } .fold { border: 1px solid black; min-height: 40px; } 2.1 Borel sets Exercise 2.1 Prove that the intersection of two sigma algebras on \\(\\Omega\\) is a sigma algebra. Prove that the collection of all open subsets \\((a,b)\\) on \\((0,1]\\) is not a sigma algebra of \\((0,1]\\). Solution. Empty set: \\[\\begin{equation} \\emptyset \\in \\mathcal{A} \\wedge \\emptyset \\in \\mathcal{B} \\Rightarrow \\emptyset \\in \\mathcal{A} \\cap \\mathcal{B} \\end{equation}\\] Complement: \\[\\begin{equation} \\text{Let } A \\in \\mathcal{A} \\cap \\mathcal{B} \\Rightarrow A \\in \\mathcal{A} \\wedge A \\in \\mathcal{B} \\Rightarrow A^c \\in \\mathcal{A} \\wedge A^c \\in \\mathcal{B} \\Rightarrow A^c \\in \\mathcal{A} \\cap \\mathcal{B} \\end{equation}\\] Countable additivity: Let \\(\\{A_i\\}\\) be a countable sequence of subsets in \\(\\mathcal{A} \\cap \\mathcal{B}\\). \\[\\begin{equation} \\forall i: A_i \\in \\mathcal{A} \\cap \\mathcal{B} \\Rightarrow A_i \\in \\mathcal{A} \\wedge A_i \\in \\mathcal{B} \\Rightarrow \\cup A_i \\in \\mathcal{A} \\wedge \\cup A_i \\in \\mathcal{B} \\Rightarrow \\cup A_i \\in \\mathcal{A} \\cap \\mathcal{B} \\end{equation}\\] Let \\(A\\) denote the collection of all open subsets \\((a,b)\\) on \\((0,1]\\). Then \\((0,1) \\in A\\). But \\((0,1)^c = 1 \\notin A\\). Exercise 2.2 Show that \\(\\mathcal{C} = \\sigma(\\mathcal{C})\\) if and only if \\(\\mathcal{C}\\) is a sigma algebra. Solution. “\\(\\Rightarrow\\)” This follows from the definition of a generated sigma algebra. “\\(\\Leftarrow\\)” Let \\(\\mathcal{F} = \\cap_i F_i\\) be the intersection of all sigma algebras that contain \\(\\mathcal{C}\\). Then \\(\\sigma(\\mathcal{C}) = \\mathcal{F}\\). Additionally, \\(\\forall i: \\mathcal{C} \\in F_i\\). So each \\(F_i\\) can be written as \\(F_i = \\mathcal{C} \\cup D\\), where \\(D\\) are the rest of the elements in the sigma algebra. In other words, each sigma algebra in the collection contains at least \\(\\mathcal{C}\\), but can contain other elements. 
Now for some \\(j\\), \\(F_j = \\mathcal{C}\\) as \\(\\{F_i\\}\\) contains all sigma algebras that contain \\(\\mathcal{C}\\) and \\(\\mathcal{C}\\) is such a sigma algebra. Since this is the smallest subset in the intersection it follows that \\(\\sigma(\\mathcal{C}) = \\mathcal{F} = \\mathcal{C}\\). Exercise 2.3 Let \\(\\mathcal{C}\\) and \\(\\mathcal{D}\\) be two collections of subsets on \\(\\Omega\\) such that \\(\\mathcal{C} \\subset \\mathcal{D}\\). Prove that \\(\\sigma(\\mathcal{C}) \\subseteq \\sigma(\\mathcal{D})\\). Solution. \\(\\sigma(\\mathcal{D})\\) is a sigma algebra that contains \\(\\mathcal{D}\\). It follows that \\(\\sigma(\\mathcal{D})\\) is a sigma algebra that contains \\(\\mathcal{C}\\). Let us write \\(\\sigma(\\mathcal{C}) = \\cap_i F_i\\), where \\(\\{F_i\\}\\) is the collection of all sigma algebras that contain \\(\\mathcal{C}\\). Since \\(\\sigma(\\mathcal{D})\\) is such a sigma algebra, there exists an index \\(j\\), so that \\(F_j = \\sigma(\\mathcal{D})\\). Then we can write \\[\\begin{align} \\sigma(\\mathcal{C}) &= (\\cap_{i \\neq j} F_i) \\cap \\sigma(\\mathcal{D}) \\\\ &\\subseteq \\sigma(\\mathcal{D}). \\end{align}\\] Exercise 2.4 Prove that the following subsets of \\((0,1]\\) are Borel-measurable by finding their measure. Any countable set. The set of numbers in (0,1] whose decimal expansion does not contain 7. Solution. This follows directly from the fact that every countable set is a union of singletons, whose measure is 0. Let us first look at numbers which have a 7 as the first decimal numbers. Their measure is 0.1. Then we take all the numbers with a 7 as the second decimal number (excluding those who already have it as the first). These have the measure 0.01, and there are 9 of them, so their total measure is 0.09. We can continue to do so infinitely many times. At each \\(n\\), we have the measure of the intervals which is \\(10^n\\) and the number of those intervals is \\(9^{n-1}\\). Now \\[\\begin{align} \\lambda(A) &= 1 - \\sum_{n = 0}^{\\infty} \\frac{9^n}{10^{n+1}} \\\\ &= 1 - \\frac{1}{10} \\sum_{n = 0}^{\\infty} (\\frac{9}{10})^n \\\\ &= 1 - \\frac{1}{10} \\frac{10}{1} \\\\ &= 0. \\end{align}\\] Since we have shown that the measure of the set is \\(0\\), we have also shown that the set is measurable. Exercise 2.5 Let \\(\\Omega = [0,1]\\), and let \\(\\mathcal{F}_3\\) consist of all countable subsets of \\(\\Omega\\), and all subsets of \\(\\Omega\\) having a countable complement. Show that \\(\\mathcal{F}_3\\) is a sigma algebra. Let us define \\(P(A)=0\\) if \\(A\\) is countable, and \\(P(A) = 1\\) if \\(A\\) has a countable complement. Is \\((\\Omega, \\mathcal{F}_3, P)\\) a legitimate probability space? Solution. The empty set is countable, therefore it is in \\(\\mathcal{F}_3\\). For any \\(A \\in \\mathcal{F}_3\\). If \\(A\\) is countable, then \\(A^c\\) has a countable complement and is in \\(\\mathcal{F}_3\\). If \\(A\\) is uncountable, then it has a countable complement \\(A^c\\) which is therefore also in \\(\\mathcal{F}_3\\). We are left with showing countable additivity. Let \\(\\{A_i\\}\\) be an arbitrary collection of sets in \\(\\mathcal{F}_3\\). We will look at two possibilities. First let all \\(A_i\\) be countable. A countable union of countable sets is countable, and therefore in \\(\\mathcal{F}_3\\). Second, let at least one \\(A_i\\) be uncountable. It follows that it has a countable complement. We can write \\[\\begin{equation} (\\cup_{i=1}^{\\infty} A_i)^c = \\cap_{i=1}^{\\infty} A_i^c. 
\\end{equation}\\] Since at least one \\(A_i^c\\) on the right side is countable, the whole intersection is countable, and therefore the union has a countable complement. It follows that the union is in \\(\\mathcal{F}_3\\). The tuple \\((\\Omega, \\mathcal{F}_3)\\) is a measurable space. Therefore, we only need to check whether \\(P\\) is a probability measure. The measure of the empty set is zero as it is countable. We have to check for countable additivity. Let us look at three situations. Let \\(A_i\\) be disjoint sets. First, let all \\(A_i\\) be countable. \\[\\begin{equation} P(\\cup_{i=1}^{\\infty} A_i) = \\sum_{i=1}^{\\infty}P( A_i)) = 0. \\end{equation}\\] Since the union is countable, the above equation holds. Second, let exactly one \\(A_i\\) be uncountable. W.L.O.G. let that be \\(A_1\\). Then \\[\\begin{equation} P(\\cup_{i=1}^{\\infty} A_i) = 1 + \\sum_{i=2}^{\\infty}P( A_i)) = 1. \\end{equation}\\] Since the union is uncountable, the above equation holds. Third, let at least two \\(A_i\\) be uncountable. We have to check whether it is possible for two uncountable sets in \\(\\mathcal{F}_3\\) to be disjoint. If that is possible, then their measures would sum to more than one and \\(P\\) would not be a probability measure. W.L.O.G. let \\(A_1\\) and \\(A_2\\) be uncountable. Then we have \\[\\begin{equation} A_1 \\cap A_2 = (A_1^c \\cup A_2^c)^c. \\end{equation}\\] Now \\(A_1^c\\) and \\(A_2^c\\) are countable and their union is therefore countable. Let \\(B = A_1^c \\cup A_2^c\\). So the intersection of \\(A_1\\) and \\(A_2\\) equals the complement of \\(B\\), which is countable. For the intersection to be the empty set, \\(B\\) would have to equal to \\(\\Omega\\). But \\(\\Omega\\) is uncountable and therefore \\(B\\) can not equal to \\(\\Omega\\). It follows that two uncountable sets in \\(\\mathcal{F}_3\\) can not have an empty intersection. Therefore the tuple is a legitimate probability space. 2.2 Lebesgue measure Exercise 2.6 Show that the Lebesgue measure of rational numbers on \\([0,1]\\) is 0. R: Implement a random number generator, which generates uniform samples of irrational numbers in \\([0,1]\\) by uniformly sampling from \\([0,1]\\) and rejecting a sample if it is rational. Solution. There are a countable number of rational numbers. Therefore, we can write \\[\\begin{align} \\lambda(\\mathbb{Q}) &= \\lambda(\\cup_{i = 1}^{\\infty} q_i) &\\\\ &= \\sum_{i = 1}^{\\infty} \\lambda(q_i) &\\text{ (countable additivity)} \\\\ &= \\sum_{i = 1}^{\\infty} 0 &\\text{ (Lebesgue measure of a singleton)} \\\\ &= 0. \\end{align}\\] Exercise 2.7 Prove that the Lebesgue measure of \\(\\mathbb{R}\\) is infinity. Paradox. Show that the cardinality of \\(\\mathbb{R}\\) and \\((0,1)\\) is the same, while their Lebesgue measures are infinity and one respectively. Solution. Let \\(a_i\\) be the \\(i\\)-th integer for \\(i \\in \\mathbb{Z}\\). We can write \\(\\mathbb{R} = \\cup_{-\\infty}^{\\infty} (a_i, a_{i + 1}]\\). \\[\\begin{align} \\lambda(\\mathbb{R}) &= \\lambda(\\cup_{i = -\\infty}^{\\infty} (a_i, a_{i + 1}]) \\\\ &= \\lambda(\\lim_{n \\rightarrow \\infty} \\cup_{i = -n}^{n} (a_i, a_{i + 1}]) \\\\ &= \\lim_{n \\rightarrow \\infty} \\lambda(\\cup_{i = -n}^{n} (a_i, a_{i + 1}]) \\\\ &= \\lim_{n \\rightarrow \\infty} \\sum_{i = -n}^{n} \\lambda((a_i, a_{i + 1}]) \\\\ &= \\lim_{n \\rightarrow \\infty} \\sum_{i = -n}^{n} 1 \\\\ &= \\lim_{n \\rightarrow \\infty} 2n \\\\ &= \\infty. \\end{align}\\] We need to find a bijection between \\(\\mathbb{R}\\) and \\((0,1)\\). 
A well-known function that maps from a bounded interval to \\(\\mathbb{R}\\) is the tangent. To make the bijection easier to achieve, we will take the inverse, which maps from \\(\\mathbb{R}\\) to \\((-\\frac{\\pi}{2}, \\frac{\\pi}{2})\\). However, we need to change the function so it maps to \\((0,1)\\). First we add \\(\\frac{\\pi}{2}\\), so that we move the function above zero. Then we only have to divide by the max value, which in this case is \\(\\pi\\). So our bijection is \\[\\begin{equation} f(x) = \\frac{\\tan^{-1}(x) + \\frac{\\pi}{2}}{\\pi}. \\end{equation}\\] Exercise 2.8 Take the measure space \\((\\Omega_1 = (0,1], B_{(0,1]}, \\lambda)\\) (we know that this is a probability space on \\((0,1]\\)). Define a map (function) from \\(\\Omega_1\\) to \\(\\Omega_2 = \\{1,2,3,4,5,6\\}\\) such that the measure space \\((\\Omega_2, 2^{\\Omega_2}, \\lambda(f^{-1}()))\\) will be a discrete probability space with uniform probabilities (\\(P(\\omega) = \\frac{1}{6}, \\forall \\omega \\in \\Omega_2)\\). Is the map that you defined in (a) the only such map? How would you in the same fashion define a map that would result in a probability space that can be interpreted as a coin toss with probability \\(p\\) of heads? R: Use the map in (a) as a basis for a random generator for this fair die. Solution. In other words, we have to assign disjunct intervals of the same size to each element of \\(\\Omega_2\\). Therefore \\[\\begin{equation} f(x) = \\lceil 6x \\rceil. \\end{equation}\\] No, we could for example rearrange the order in which the intervals are mapped to integers. Additionally, we could have several disjoint intervals that mapped to the same integer, as long as the Lebesgue measure of their union would be \\(\\frac{1}{6}\\) and the function would remain injective. We have \\(\\Omega_3 = \\{0,1\\}\\), where zero represents heads and one represents tails. Then \\[\\begin{equation} f(x) = 0^{I_{A}(x)}, \\end{equation}\\] where \\(A = \\{y \\in (0,1] : y < p\\}\\). set.seed(1) unif_s <- runif(1000) die_s <- ceiling(6 * unif_s) summary(as.factor(die_s)) ## 1 2 3 4 5 6 ## 166 154 200 146 166 168 "],["condprob.html", "Chapter 3 Conditional probability 3.1 Calculating conditional probabilities 3.2 Conditional independence 3.3 Monty Hall problem", " Chapter 3 Conditional probability This chapter deals with conditional probability. The students are expected to acquire the following knowledge: Theoretical Identify whether variables are independent. Calculation of conditional probabilities. Understanding of conditional dependence and independence. How to apply Bayes’ theorem to solve difficult probabilistic questions. R Simulating conditional probabilities. cumsum. apply. .fold-btn { float: right; margin: 5px 5px 0 0; } .fold { border: 1px solid black; min-height: 40px; } 3.1 Calculating conditional probabilities Exercise 3.1 A military officer is in charge of identifying enemy aircraft and shooting them down. He is able to positively identify an enemy airplane 95% of the time and positively identify a friendly airplane 90% of the time. Furthermore, 99% of the airplanes are friendly. When the officer identifies an airplane as an enemy airplane, what is the probability that it is not and they will shoot at a friendly airplane? Solution. Let \\(E = 0\\) denote that the observed plane is friendly and \\(E=1\\) that it is an enemy. Let \\(I = 0\\) denote that the officer identified it as friendly and \\(I = 1\\) as enemy. 
Then \\[\\begin{align} P(E = 0 | I = 1) &= \\frac{P(I = 1 | E = 0)P(E = 0)}{P(I = 1)} \\\\ &= \\frac{P(I = 1 | E = 0)P(E = 0)}{P(I = 1 | E = 0)P(E = 0) + P(I = 1 | E = 1)P(E = 1)} \\\\ &= \\frac{0.1 \\times 0.99}{0.1 \\times 0.99 + 0.95 \\times 0.01} \\\\ &= 0.91. \\end{align}\\] Exercise 3.2 R: Consider tossing a fair die. Let \\(A = \\{2,4,6\\}\\) and \\(B = \\{1,2,3,4\\}\\). Then \\(P(A) = \\frac{1}{2}\\), \\(P(B) = \\frac{2}{3}\\) and \\(P(AB) = \\frac{1}{3}\\). Since \\(P(AB) = P(A)P(B)\\), the events \\(A\\) and \\(B\\) are independent. Simulate draws from the sample space and verify that the proportions are the same. Then find two events \\(C\\) and \\(D\\) that are not independent and repeat the simulation. set.seed(1) nsamps <- 10000 tosses <- sample(1:6, nsamps, replace = TRUE) PA <- sum(tosses %in% c(2,4,6)) / nsamps PB <- sum(tosses %in% c(1,2,3,4)) / nsamps PA * PB ## [1] 0.3295095 sum(tosses %in% c(2,4)) / nsamps ## [1] 0.3323 # Let C = {1,2} and D = {2,3} PC <- sum(tosses %in% c(1,2)) / nsamps PD <- sum(tosses %in% c(2,3)) / nsamps PC * PD ## [1] 0.1067492 sum(tosses %in% c(2)) / nsamps ## [1] 0.1622 Exercise 3.3 A machine reports the true value of a thrown 12-sided die 5 out of 6 times. If the machine reports a 1 has been tossed, what is the probability that it is actually a 1? Now let the machine only report whether a 1 has been tossed or not. Does the probability change? R: Use simulation to check your answers to a) and b). Solution. Let \\(T = 1\\) denote that the toss is 1 and \\(M = 1\\) that the machine reports a 1. \\[\\begin{align} P(T = 1 | M = 1) &= \\frac{P(M = 1 | T = 1)P(T = 1)}{P(M = 1)} \\\\ &= \\frac{P(M = 1 | T = 1)P(T = 1)}{\\sum_{k=1}^{12} P(M = 1 | T = k)P(T = k)} \\\\ &= \\frac{\\frac{5}{6}\\frac{1}{12}}{\\frac{5}{6}\\frac{1}{12} + 11 \\frac{1}{6} \\frac{1}{11} \\frac{1}{12}} \\\\ &= \\frac{5}{6}. \\end{align}\\] Yes. \\[\\begin{align} P(T = 1 | M = 1) &= \\frac{P(M = 1 | T = 1)P(T = 1)}{P(M = 1)} \\\\ &= \\frac{P(M = 1 | T = 1)P(T = 1)}{\\sum_{k=1}^{12} P(M = 1 | T = k)P(T = k)} \\\\ &= \\frac{\\frac{5}{6}\\frac{1}{12}}{\\frac{5}{6}\\frac{1}{12} + 11 \\frac{1}{6} \\frac{1}{12}} \\\\ &= \\frac{5}{16}. \\end{align}\\] set.seed(1) nsamps <- 10000 report_a <- vector(mode = "numeric", length = nsamps) report_b <- vector(mode = "logical", length = nsamps) truths <- vector(mode = "logical", length = nsamps) for (i in 1:10000) { toss <- sample(1:12, size = 1) truth <- sample(c(TRUE, FALSE), size = 1, prob = c(5/6, 1/6)) truths[i] <- truth if (truth) { report_a[i] <- toss report_b[i] <- toss == 1 } else { remaining <- (1:12)[1:12 != toss] report_a[i] <- sample(remaining, size = 1) report_b[i] <- toss != 1 } } truth_a1 <- truths[report_a == 1] sum(truth_a1) / length(truth_a1) ## [1] 0.8300733 truth_b1 <- truths[report_b] sum(truth_b1) / length(truth_b1) ## [1] 0.3046209 Exercise 3.4 A coin is tossed independently \\(n\\) times. The probability of heads at each toss is \\(p\\). At each time \\(k\\), \\((k = 2,3,...,n)\\) we get a reward at time \\(k+1\\) if \\(k\\)-th toss was a head and the previous toss was a tail. Let \\(A_k\\) be the event that a reward is obtained at time \\(k\\). Are events \\(A_k\\) and \\(A_{k+1}\\) independent? Are events \\(A_k\\) and \\(A_{k+2}\\) independent? R: simulate 10 tosses 10000 times, where \\(p = 0.7\\). Check your answers to a) and b) by counting the frequencies of the events \\(A_5\\), \\(A_6\\), and \\(A_7\\). Solution. For \\(A_k\\) to happen, we need the tosses \\(k-2\\) and \\(k-1\\) be tails and heads respectively. 
For \\(A_{k+1}\\) to happen, we need tosses \\(k-1\\) and \\(k\\) be tails and heads respectively. As the toss \\(k-1\\) need to be heads for one and tails for the other, these two events can not happen simultaneously. Therefore the probability of their intersection is 0. But the probability of each of them separately is \\(p(1-p) > 0\\). Therefore, they are not independent. For \\(A_k\\) to happen, we need the tosses \\(k-2\\) and \\(k-1\\) be tails and heads respectively. For \\(A_{k+2}\\) to happen, we need tosses \\(k\\) and \\(k+1\\) be tails and heads respectively. So the probability of intersection is \\(p^2(1-p)^2\\). And the probability of each separately is again \\(p(1-p)\\). Therefore, they are independent. set.seed(1) nsamps <- 10000 p <- 0.7 rewardA_5 <- vector(mode = "logical", length = nsamps) rewardA_6 <- vector(mode = "logical", length = nsamps) rewardA_7 <- vector(mode = "logical", length = nsamps) rewardA_56 <- vector(mode = "logical", length = nsamps) rewardA_57 <- vector(mode = "logical", length = nsamps) for (i in 1:nsamps) { samps <- sample(c(0,1), size = 10, replace = TRUE, prob = c(0.7, 0.3)) rewardA_5[i] <- (samps[4] == 0 & samps[3] == 1) rewardA_6[i] <- (samps[5] == 0 & samps[4] == 1) rewardA_7[i] <- (samps[6] == 0 & samps[5] == 1) rewardA_56[i] <- (rewardA_5[i] & rewardA_6[i]) rewardA_57[i] <- (rewardA_5[i] & rewardA_7[i]) } sum(rewardA_5) / nsamps ## [1] 0.2141 sum(rewardA_6) / nsamps ## [1] 0.2122 sum(rewardA_7) / nsamps ## [1] 0.2107 sum(rewardA_56) / nsamps ## [1] 0 sum(rewardA_57) / nsamps ## [1] 0.0454 Exercise 3.5 A drawer contains two coins. One is an unbiased coin, the other is a biased coin, which will turn up heads with probability \\(p\\) and tails with probability \\(1-p\\). One coin is selected uniformly at random. The selected coin is tossed \\(n\\) times. The coin turns up heads \\(k\\) times and tails \\(n-k\\) times. What is the probability that the coin is biased? The selected coin is tossed repeatedly until it turns up heads \\(k\\) times. Given that it is tossed \\(n\\) times in total, what is the probability that the coin is biased? Solution. Let \\(B = 1\\) denote that the coin is biased and let \\(H = k\\) denote that we’ve seen \\(k\\) heads. \\[\\begin{align} P(B = 1 | H = k) &= \\frac{P(H = k | B = 1)P(B = 1)}{P(H = k)} \\\\ &= \\frac{P(H = k | B = 1)P(B = 1)}{P(H = k | B = 1)P(B = 1) + P(H = k | B = 0)P(B = 0)} \\\\ &= \\frac{p^k(1-p)^{n-k} 0.5}{p^k(1-p)^{n-k} 0.5 + 0.5^{n+1}} \\\\ &= \\frac{p^k(1-p)^{n-k}}{p^k(1-p)^{n-k} + 0.5^n}. \\end{align}\\] The same results as in a). The only difference between these two scenarios is that in b) the last throw must be heads. However, this holds for the biased and the unbiased coin and therefore does not affect the probability of the coin being biased. Exercise 3.6 Judy goes around the company for Women’s day and shares flowers. In every office she leaves a flower, if there is at least one woman inside. The probability that there’s a woman in the office is \\(\\frac{3}{5}\\). What is the probability that Judy leaves her first flower in the fourth office? Given that she has given away exactly three flowers in the first four offices, what is the probability that she gives her fourth flower in the eighth office? What is the probability that she leaves the second flower in the fifth office? What is the probability that she leaves the second flower in the fifth office, given that she did not leave the second flower in the second office? 
Judy needs a new supply of flowers immediately after the office, where she gives away her last flower. What is the probability that she visits at least five offices, if she starts with two flowers? R: simulate Judy’s walk 10000 times to check your answers a) - e). Solution. Let \\(X_i = k\\) denote the event that … \\(i\\)-th sample on the \\(k\\)-th run. Since the events are independent, we can multiply their probabilities to get \\[\\begin{equation} P(X_1 = 4) = 0.4^3 \\times 0.6 = 0.0384. \\end{equation}\\] Same as in a) as we have a fresh start after first four offices. For this to be possible, she had to leave the first flower in one of the first four offices. Therefore there are four possibilities, and for each of those the probability is \\(0.4^3 \\times 0.6\\). Additionally, the probability that she leaves a flower in the fifth office is \\(0.6\\). So \\[\\begin{equation} P(X_2 = 5) = \\binom{4}{1} \\times 0.4^3 \\times 0.6^2 = 0.09216. \\end{equation}\\] We use Bayes’ theorem. \\[\\begin{align} P(X_2 = 5 | X_2 \\neq 2) &= \\frac{P(X_2 \\neq 2 | X_2 = 5)P(X_2 = 5)}{P(X_2 \\neq 2)} \\\\ &= \\frac{0.09216}{0.64} \\\\ &= 0.144. \\end{align}\\] The denominator in the second equation can be calculated as follows. One of three things has to happen for the second not to be dealt in the second round. First, both are zero, so \\(0.4^2\\). Second, first is zero, and second is one, so \\(0.4 \\times 0.6\\). Third, the first is one and the second one zero, so \\(0.6 \\times 0.4\\). Summing these values we get \\(0.64\\). We will look at the complement, so the events that she gave away exactly two flowers after two, three and four offices. \\[\\begin{equation} P(X_2 \\geq 5) = 1 - 0.6^2 - 2 \\times 0.4 \\times 0.6^2 - 3 \\times 0.4^2 \\times 0.6^2 = 0.1792. \\end{equation}\\] The multiplying parts represent the possibilities of the first flower. set.seed(1) nsamps <- 100000 Judyswalks <- matrix(data = NA, nrow = nsamps, ncol = 8) for (i in 1:nsamps) { thiswalk <- sample(c(0,1), size = 8, replace = TRUE, prob = c(0.4, 0.6)) Judyswalks[i, ] <- thiswalk } csJudy <- t(apply(Judyswalks, 1, cumsum)) # a sum(csJudy[ ,4] == 1 & csJudy[ ,3] == 0) / nsamps ## [1] 0.03848 # b csJsubset <- csJudy[csJudy[ ,4] == 3 & csJudy[ ,3] == 2, ] sum(csJsubset[ ,8] == 4 & csJsubset[ ,7] == 3) / nrow(csJsubset) ## [1] 0.03665893 # c sum(csJudy[ ,5] == 2 & csJudy[ ,4] == 1) / nsamps ## [1] 0.09117 # d sum(csJudy[ ,5] == 2 & csJudy[ ,4] == 1) / sum(csJudy[ ,2] != 2) ## [1] 0.1422398 # e sum(csJudy[ ,4] < 2) / nsamps ## [1] 0.17818 3.2 Conditional independence Exercise 3.7 Describe: A real-world example of two events \\(A\\) and \\(B\\) that are dependent but become conditionally independent if conditioned on a third event \\(C\\). A real-world example of two events \\(A\\) and \\(B\\) that are independent, but become dependent if conditioned on some third event \\(C\\). Solution. Let \\(A\\) be the height of a person and let \\(B\\) be the person’s knowledge of the Dutch language. These events are dependent since the Dutch are known to be taller than average. However if \\(C\\) is the nationality of the person, then \\(A\\) and \\(B\\) are independent given \\(C\\). Let \\(A\\) be the event that Mary passes the exam and let \\(B\\) be the event that John passes the exam. These events are independent. However, if the event \\(C\\) is that Mary and John studied together, then \\(A\\) and \\(B\\) are conditionally dependent given \\(C\\). Exercise 3.8 We have two coins of identical appearance. 
We know that one is a fair coin and the other flips heads 80% of the time. We choose one of the two coins uniformly at random. We discard the coin that was not chosen. We now flip the chosen coin independently 10 times, producing a sequence \\(Y_1 = y_1\\), \\(Y_2 = y_2\\), …, \\(Y_{10} = y_{10}\\). Intuitively, without doing and computation, are these random variables independent? Compute the probability \\(P(Y_1 = 1)\\). Compute the probabilities \\(P(Y_2 = 1 | Y_1 = 1)\\) and \\(P(Y_{10} = 1 | Y_1 = 1,...,Y_9 = 1)\\). Given your answers to b) and c), would you now change your answer to a)? If so, discuss why your intuition had failed. Solution. \\(P(Y_1 = 1) = 0.5 * 0.8 + 0.5 * 0.5 = 0.65\\). Since we know that \\(Y_1 = 1\\) this should change our view of the probability of the coin being biased or not. Let \\(B = 1\\) denote the event that the coin is biased and let \\(B = 0\\) denote that the coin is unbiased. By using marginal probability, we can write \\[\\begin{align} P(Y_2 = 1 | Y_1 = 1) &= P(Y_2 = 1, B = 1 | Y_1 = 1) + P(Y_2 = 1, B = 0 | Y_1 = 1) \\\\ &= \\sum_{k=1}^2 P(Y_2 = 1 | B = k, Y_1 = 1)P(B = k | Y_1 = 1) \\\\ &= 0.8 \\frac{P(Y_1 = 1 | B = 1)P(B = 1)}{P(Y_1 = 1)} + 0.5 \\frac{P(Y_1 = 1 | B = 0)P(B = 0)}{P(Y_1 = 1)} \\\\ &= 0.8 \\frac{0.8 \\times 0.5}{0.65} + 0.5 \\frac{0.5 \\times 0.5}{0.65} \\\\ &\\approx 0.68. \\end{align}\\] For the other calculation we follow the same procedure. Let \\(X = 1\\) denote that first nine tosses are all heads (equivalent to \\(Y_1 = 1\\),…, \\(Y_9 = 1\\)). \\[\\begin{align} P(Y_{10} = 1 | X = 1) &= P(Y_2 = 1, B = 1 | X = 1) + P(Y_2 = 1, B = 0 | X = 1) \\\\ &= \\sum_{k=1}^2 P(Y_2 = 1 | B = k, X = 1)P(B = k | X = 1) \\\\ &= 0.8 \\frac{P(X = 1 | B = 1)P(B = 1)}{P(X = 1)} + 0.5 \\frac{P(X = 1 | B = 0)P(B = 0)}{P(X = 1)} \\\\ &= 0.8 \\frac{0.8^9 \\times 0.5}{0.5 \\times 0.8^9 + 0.5 \\times 0.5^9} + 0.5 \\frac{0.5^9 \\times 0.5}{0.5 \\times 0.8^9 + 0.5 \\times 0.5^9} \\\\ &\\approx 0.8. \\end{align}\\] 3.3 Monty Hall problem The Monty Hall problem is a famous probability puzzle with non-intuitive outcome. Many established mathematicians and statisticians had problems solving it and many even disregarded the correct solution until they’ve seen the proof by simulation. Here we will show how it can be solved relatively simply with the use of Bayes’ theorem if we select the variables in a smart way. Exercise 3.9 (Monty Hall problem) A prize is placed at random behind one of three doors. You pick a door. Now Monty Hall chooses one of the other two doors, opens it and shows you that it is empty. He then gives you the opportunity to keep your door or switch to the other unopened door. Should you stay or switch? Use Bayes’ theorem to calculate the probability of winning if you switch and if you do not. R: Check your answers in R. Solution. W.L.O.G. assume we always pick the first door. The host can only open door 2 or door 3, as he can not open the door we picked. Let \\(k \\in \\{2,3\\}\\). Let us first look at what happens if we do not change. Then we have \\[\\begin{align} P(\\text{car in 1} | \\text{open $k$}) &= \\frac{P(\\text{open $k$} | \\text{car in 1})P(\\text{car in 1})}{P(\\text{open $k$})} \\\\ &= \\frac{P(\\text{open $k$} | \\text{car in 1})P(\\text{car in 1})}{\\sum_{n=1}^3 P(\\text{open $k$} | \\text{car in $n$})P(\\text{car in $n$)}}. \\end{align}\\] The probability that he opened \\(k\\) if the car is in 1 is \\(\\frac{1}{2}\\), as he can choose between door 2 and 3 as both have a goat behind it. 
Let us look at the normalization constant. When \\(n = 1\\) we get the value in the nominator. When \\(n=k\\), we get 0, as he will not open the door if there’s a prize behind. The remaining option is that we select 1, the car is behind \\(k\\) and he opens the only door left. Since he can’t open 1 due to it being our pick and \\(k\\) due to having the prize, the probability of opening the remaining door is 1, and the prior probability of the car being behind this door is \\(\\frac{1}{3}\\). So we have \\[\\begin{align} P(\\text{car in 1} | \\text{open $k$}) &= \\frac{\\frac{1}{2}\\frac{1}{3}}{\\frac{1}{2}\\frac{1}{3} + \\frac{1}{3}} \\\\ &= \\frac{1}{3}. \\end{align}\\] Now let us look at what happens if we do change. Let \\(k' \\in \\{2,3\\}\\) be the door that is not opened. If we change, we select this door, so we have \\[\\begin{align} P(\\text{car in $k'$} | \\text{open $k$}) &= \\frac{P(\\text{open $k$} | \\text{car in $k'$})P(\\text{car in $k'$})}{P(\\text{open $k$})} \\\\ &= \\frac{P(\\text{open $k$} | \\text{car in $k'$})P(\\text{car in $k'$})}{\\sum_{n=1}^3 P(\\text{open $k$} | \\text{car in $n$})P(\\text{car in $n$)}}. \\end{align}\\] The denominator stays the same, the only thing that is different from before is \\(P(\\text{open $k$} | \\text{car in $k'$})\\). We have a situation where we initially selected door 1 and the car is in door \\(k'\\). The probability that the host will open door \\(k\\) is then 1, as he can not pick any other door. So we have \\[\\begin{align} P(\\text{car in $k'$} | \\text{open $k$}) &= \\frac{\\frac{1}{3}}{\\frac{1}{2}\\frac{1}{3} + \\frac{1}{3}} \\\\ &= \\frac{2}{3}. \\end{align}\\] Therefore it makes sense to change the door. set.seed(1) nsamps <- 1000 ifchange <- vector(mode = "logical", length = nsamps) ifstay <- vector(mode = "logical", length = nsamps) for (i in 1:nsamps) { where_car <- sample(c(1:3), 1) where_player <- sample(c(1:3), 1) open_samp <- (1:3)[where_car != (1:3) & where_player != (1:3)] if (length(open_samp) == 1) { where_open <- open_samp } else { where_open <- sample(open_samp, 1) } ifstay[i] <- where_car == where_player where_ifchange <- (1:3)[where_open != (1:3) & where_player != (1:3)] ifchange[i] <- where_ifchange == where_car } sum(ifstay) / nsamps ## [1] 0.328 sum(ifchange) / nsamps ## [1] 0.672 "],["rvs.html", "Chapter 4 Random variables 4.1 General properties and calculations 4.2 Discrete random variables 4.3 Continuous random variables 4.4 Singular random variables 4.5 Transformations", " Chapter 4 Random variables This chapter deals with random variables and their distributions. The students are expected to acquire the following knowledge: Theoretical Identification of random variables. Convolutions of random variables. Derivation of PDF, PMF, CDF, and quantile function. Definitions and properties of common discrete random variables. Definitions and properties of common continuous random variables. Transforming univariate random variables. R Familiarize with PDF, PMF, CDF, and quantile functions for several distributions. Visual inspection of probability distributions. Analytical and empirical calculation of probabilities based on distributions. New R functions for plotting (for example, facet_wrap). Creating random number generators based on the Uniform distribution. .fold-btn { float: right; margin: 5px 5px 0 0; } .fold { border: 1px solid black; min-height: 40px; } 4.1 General properties and calculations Exercise 4.1 Which of the functions below are valid CDFs? Find their respective densities. 
R: Plot the three functions. \\[\\begin{equation} F(x) = \\begin{cases} 1 - e^{-x^2} & x \\geq 0 \\\\ 0 & x < 0. \\end{cases} \\end{equation}\\] \\[\\begin{equation} F(x) = \\begin{cases} e^{-\\frac{1}{x}} & x > 0 \\\\ 0 & x \\leq 0. \\end{cases} \\end{equation}\\] \\[\\begin{equation} F(x) = \\begin{cases} 0 & x \\leq 0 \\\\ \\frac{1}{3} & 0 < x \\leq \\frac{1}{2} \\\\ 1 & x > \\frac{1}{2}. \\end{cases} \\end{equation}\\] Solution. Yes. First, let us check the limits. \\(\\lim_{x \\rightarrow -\\infty} (0) = 0\\). \\(\\lim_{x \\rightarrow \\infty} (1 - e^{-x^2}) = 1 - \\lim_{x \\rightarrow \\infty} e^{-x^2} = 1 - 0 = 1\\). Second, let us check whether the function is increasing. Let \\(x > y \\geq 0\\). Then \\(1 - e^{-x^2} \\geq 1 - e^{-y^2}\\). We only have to check right continuity for the point zero. \\(F(0) = 0\\) and \\(\\lim_{\\epsilon \\downarrow 0}F (0 + \\epsilon) = \\lim_{\\epsilon \\downarrow 0} 1 - e^{-\\epsilon^2} = 1 - \\lim_{\\epsilon \\downarrow 0} e^{-\\epsilon^2} = 1 - 1 = 0\\). We get the density by differentiating the CDF. \\(p(x) = \\frac{d}{dx} 1 - e^{-x^2} = 2xe^{-x^2}.\\) Students are encouraged to check that this is a proper PDF. Yes. First, let us check the limits. $_{x -} (0) = 0 and \\(\\lim_{x \\rightarrow \\infty} (e^{-\\frac{1}{x}}) = 1\\). Second, let us check whether the function is increasing. Let \\(x > y \\geq 0\\). Then \\(e^{-\\frac{1}{x}} \\geq e^{-\\frac{1}{y}}\\). We only have to check right continuity for the point zero. \\(F(0) = 0\\) and \\(\\lim_{\\epsilon \\downarrow 0}F (0 + \\epsilon) = \\lim_{\\epsilon \\downarrow 0} e^{-\\frac{1}{\\epsilon}} = 0\\). We get the density by differentiating the CDF. \\(p(x) = \\frac{d}{dx} e^{-\\frac{1}{x}} = \\frac{1}{x^2}e^{-\\frac{1}{x}}.\\) Students are encouraged to check that this is a proper PDF. No. The function is not right continuous as \\(F(\\frac{1}{2}) = \\frac{1}{3}\\), but \\(\\lim_{\\epsilon \\downarrow 0} F(\\frac{1}{2} + \\epsilon) = 1\\). f1 <- function (x) { tmp <- 1 - exp(-x^2) tmp[x < 0] <- 0 return(tmp) } f2 <- function (x) { tmp <- exp(-(1 / x)) tmp[x <= 0] <- 0 return(tmp) } f3 <- function (x) { tmp <- x tmp[x == x] <- 1 tmp[x <= 0.5] <- 1/3 tmp[x <= 0] <- 0 return(tmp) } cdf_data <- tibble(x = seq(-1, 20, by = 0.001), f1 = f1(x), f2 = f2(x), f3 = f3(x)) %>% melt(id.vars = "x") cdf_plot <- ggplot(data = cdf_data, aes(x = x, y = value, color = variable)) + geom_hline(yintercept = 1) + geom_line() plot(cdf_plot) Exercise 4.2 Let \\(X\\) be a random variable with CDF \\[\\begin{equation} F(x) = \\begin{cases} 0 & x < 0 \\\\ \\frac{x^2}{2} & 0 \\leq x < 1 \\\\ \\frac{1}{2} + \\frac{p}{2} & 1 \\leq x < 2 \\\\ \\frac{1}{2} + \\frac{p}{2} + \\frac{1 - p}{2} & x \\geq 2 \\end{cases} \\end{equation}\\] R: Plot this CDF for \\(p = 0.3\\). Is it a discrete, continuous, or mixed random varible? Find the probability density/mass of \\(X\\). f1 <- function (x, p) { tmp <- x tmp[x >= 2] <- 0.5 + (p * 0.5) + ((1-p) * 0.5) tmp[x < 2] <- 0.5 + (p * 0.5) tmp[x < 1] <- (x[x < 1])^2 / 2 tmp[x < 0] <- 0 return(tmp) } cdf_data <- tibble(x = seq(-1, 5, by = 0.001), y = f1(x, 0.3)) cdf_plot <- ggplot(data = cdf_data, aes(x = x, y = y)) + geom_hline(yintercept = 1) + geom_line(color = "blue") plot(cdf_plot) ::: {.solution} \\(X\\) is a mixed random variable. Since \\(X\\) is a mixed random variable, we have to find the PDF of the continuous part and the PMF of the discrete part. We get the continuous part by differentiating the corresponding CDF, \\(\\frac{d}{dx}\\frac{x^2}{2} = x\\). 
So the PDF, when \\(0 \\leq x < 1\\), is \\(p(x) = x\\). Let us look at the discrete part now. It has two steps, so this is a discrete distribution with two outcomes – numbers 1 and 2. The first happens with probability \\(\\frac{p}{2}\\), and the second with probability \\(\\frac{1 - p}{2}\\). This reminds us of the Bernoulli distribution. The PMF for the discrete part is \\(P(X = x) = (\\frac{p}{2})^{2 - x} (\\frac{1 - p}{2})^{x - 1}\\). ::: Exercise 4.3 (Convolutions) Convolutions are probability distributions that correspond to sums of independent random variables. Let \\(X\\) and \\(Y\\) be independent discrete variables. Find the PMF of \\(Z = X + Y\\). Hint: Use the law of total probability. Let \\(X\\) and \\(Y\\) be independent continuous variables. Find the PDF of \\(Z = X + Y\\). Hint: Start with the CDF. Solution. \\[\\begin{align} P(Z = z) &= P(X + Y = z) & \\\\ &= \\sum_{k = -\\infty}^\\infty P(X + Y = z | Y = k) P(Y = k) & \\text{ (law of total probability)} \\\\ &= \\sum_{k = -\\infty}^\\infty P(X + k = z | Y = k) P(Y = k) & \\\\ &= \\sum_{k = -\\infty}^\\infty P(X + k = z) P(Y = k) & \\text{ (independence of $X$ and $Y$)} \\\\ &= \\sum_{k = -\\infty}^\\infty P(X = z - k) P(Y = k). & \\end{align}\\] Let \\(f\\) and \\(g\\) be the PDFs of \\(X\\) and \\(Y\\) respectively. \\[\\begin{align} F(z) &= P(Z < z) \\\\ &= P(X + Y < z) \\\\ &= \\int_{-\\infty}^{\\infty} P(X + Y < z | Y = y)P(Y = y)dy \\\\ &= \\int_{-\\infty}^{\\infty} P(X + y < z | Y = y)P(Y = y)dy \\\\ &= \\int_{-\\infty}^{\\infty} P(X + y < z)P(Y = y)dy \\\\ &= \\int_{-\\infty}^{\\infty} P(X < z - y)P(Y = y)dy \\\\ &= \\int_{-\\infty}^{\\infty} (\\int_{-\\infty}^{z - y} f(x) dx) g(y) dy \\end{align}\\] Now \\[\\begin{align} p(z) &= \\frac{d}{dz} F(z) & \\\\ &= \\int_{-\\infty}^{\\infty} (\\frac{d}{dz}\\int_{-\\infty}^{z - y} f(x) dx) g(y) dy & \\\\ &= \\int_{-\\infty}^{\\infty} f(z - y) g(y) dy & \\text{ (fundamental theorem of calculus)}. \\end{align}\\] 4.2 Discrete random variables Exercise 4.4 (Binomial random variable) Let \\(X_k\\), \\(k = 1,...,n\\), be random variables with the Bernoulli measure as the PMF. Let \\(X = \\sum_{k=1}^n X_k\\). We call \\(X_k\\) a Bernoulli random variable with parameter \\(p \\in (0,1)\\). Find the CDF of \\(X_k\\). Find PMF of \\(X\\). This is a Binomial random variable with support in \\(\\{0,1,2,...,n\\}\\) and parameters \\(p \\in (0,1)\\) and \\(n \\in \\mathbb{N}_0\\). We denote \\[\\begin{equation} X | n,p \\sim \\text{binomial}(n,p). \\end{equation}\\] Find CDF of \\(X\\). R: Simulate from the binomial distribution with \\(n = 10\\) and \\(p = 0.5\\), and from \\(n\\) Bernoulli distributions with \\(p = 0.5\\). Visually compare the sum of Bernoullis and the binomial. Hint: there is no standard function like rpois for a Bernoulli random variable. Check exercise 1.12 to find out how to sample from a Bernoulli distribution. Solution. There are two outcomes – zero and one. Zero happens with probability \\(1 - p\\). Therefore \\[\\begin{equation} F(k) = \\begin{cases} 0 & k < 0 \\\\ 1 - p & 0 \\leq k < 1 \\\\ 1 & k \\geq 1. \\end{cases} \\end{equation}\\] For the probability of \\(X\\) to be equal to some \\(k \\leq n\\), exactly \\(k\\) Bernoulli variables need to be one, and the others zero. So \\(p^k(1-p)^{n-k}\\). There are \\(\\binom{n}{k}\\) such possible arrangements. Therefore \\[\\begin{align} P(X = k) = \\binom{n}{k} p^k (1 - p)^{n-k}. 
\\end{align}\\] \\[\\begin{equation} F(k) = \\sum_{i = 0}^{\\lfloor k \\rfloor} \\binom{n}{i} p^i (1 - p)^{n - i} \\end{equation}\\] set.seed(1) nsamps <- 10000 binom_samp <- rbinom(nsamps, size = 10, prob = 0.5) bernoulli_mat <- matrix(data = NA, nrow = nsamps, ncol = 10) for (i in 1:nsamps) { bernoulli_mat[i, ] <- rbinom(10, size = 1, prob = 0.5) } bern_samp <- apply(bernoulli_mat, 1, sum) b_data <- tibble(x = c(binom_samp, bern_samp), type = c(rep("binomial", 10000), rep("Bernoulli_sum", 10000))) b_plot <- ggplot(data = b_data, aes(x = x, fill = type)) + geom_bar(position = "dodge") plot(b_plot) Exercise 4.5 (Geometric random variable) A variable with PMF \\[\\begin{equation} P(k) = p(1-p)^k \\end{equation}\\] is a geometric random variable with support in non-negative integers. It has one parameter \\(p \\in (0,1]\\). We denote \\[\\begin{equation} X | p \\sim \\text{geometric}(p) \\end{equation}\\] Derive the CDF of a geometric random variable. R: Draw 1000 samples from the geometric distribution with \\(p = 0.3\\) and compare their frequencies to theoretical values. Solution. \\[\\begin{align} P(X \\leq k) &= \\sum_{i = 0}^k p(1-p)^i \\\\ &= p \\sum_{i = 0}^k (1-p)^i \\\\ &= p \\frac{1 - (1-p)^{k+1}}{1 - (1 - p)} \\\\ &= 1 - (1-p)^{k + 1} \\end{align}\\] set.seed(1) geo_samp <- rgeom(n = 1000, prob = 0.3) geo_samp <- data.frame(x = geo_samp) %>% count(x) %>% mutate(n = n / 1000, type = "empirical_frequencies") %>% bind_rows(data.frame(x = 0:20, n = dgeom(0:20, prob = 0.3), type = "theoretical_measure")) geo_plot <- ggplot(data = geo_samp, aes(x = x, y = n, fill = type)) + geom_bar(stat="identity", position = "dodge") plot(geo_plot) Exercise 4.6 (Poisson random variable) A variable with PMF \\[\\begin{equation} P(k) = \\frac{\\lambda^k e^{-\\lambda}}{k!} \\end{equation}\\] is a Poisson random variable with support in non-negative integers. It has one positive parameter \\(\\lambda\\), which also represents its mean value and variance (a measure of the deviation of the values from the mean – more on mean and variance in the next chapter). We denote \\[\\begin{equation} X | \\lambda \\sim \\text{Poisson}(\\lambda). \\end{equation}\\] This distribution is usually the default choice for modeling counts. We have already encountered a Poisson random variable in exercise 1.13, where we also sampled from this distribution. The CDF of a Poisson random variable is \\(P(X <= x) = e^{-\\lambda} \\sum_{i=0}^x \\frac{\\lambda^{i}}{i!}\\). R: Draw 1000 samples from the Poisson distribution with \\(\\lambda = 5\\) and compare their empirical cumulative distribution function with the theoretical CDF. set.seed(1) pois_samp <- rpois(n = 1000, lambda = 5) pois_samp <- data.frame(x = pois_samp) pois_plot <- ggplot(data = pois_samp, aes(x = x, colour = "ECDF")) + stat_ecdf(geom = "step") + geom_step(data = tibble(x = 0:17, y = ppois(x, 5)), aes(x = x, y = y, colour = "CDF")) + scale_colour_manual("Lgend title", values = c("black", "red")) plot(pois_plot) Exercise 4.7 (Negative binomial random variable) A variable with PMF \\[\\begin{equation} p(k) = \\binom{k + r - 1}{k}(1-p)^r p^k \\end{equation}\\] is a negative binomial random variable with support in non-negative integers. It has two parameters \\(r > 0\\) and \\(p \\in (0,1)\\). We denote \\[\\begin{equation} X | r,p \\sim \\text{NB}(r,p). \\end{equation}\\] Let us reparameterize the negative binomial distribution with \\(q = 1 - p\\). Find the PMF of \\(X \\sim \\text{NB}(1, q)\\). Do you recognize this distribution? 
Show that the sum of two negative binomial random variables with the same \\(p\\) is also a negative binomial random variable. Hint: Use the fact that the number of ways to place \\(n\\) indistinct balls into \\(k\\) boxes is \\(\\binom{n + k - 1}{n}\\). R: Draw samples from \\(X \\sim \\text{NB}(5, 0.4)\\) and \\(Y \\sim \\text{NB}(3, 0.4)\\). Draw samples from \\(Z = X + Y\\), where you use the parameters calculated in b). Plot both distributions, their sum, and \\(Z\\) using facet_wrap. Be careful, as R uses a different parameterization size=\\(r\\) and prob=\\(1 - p\\). Solution. \\[\\begin{align} P(X = k) &= \\binom{k + 1 - 1}{k}q^1 (1-q)^k \\\\ &= q(1-q)^k. \\end{align}\\] This is the geometric distribution. Let \\(X \\sim \\text{NB}(r_1, p)\\) and \\(Y \\sim \\text{NB}(r_2, p)\\). Let \\(Z = X + Y\\). \\[\\begin{align} P(Z = z) &= \\sum_{k = 0}^{\\infty} P(X = z - k)P(Y = k), \\text{ if k < 0, then the probabilities are 0} \\\\ &= \\sum_{k = 0}^{z} P(X = z - k)P(Y = k), \\text{ if k > z, then the probabilities are 0} \\\\ &= \\sum_{k = 0}^{z} \\binom{z - k + r_1 - 1}{z - k}(1 - p)^{r_1} p^{z - k} \\binom{k + r_2 - 1}{k}(1 - p)^{r_2} p^{k} & \\\\ &= \\sum_{k = 0}^{z} \\binom{z - k + r_1 - 1}{z - k} \\binom{k + r_2 - 1}{k}(1 - p)^{r_1 + r_2} p^{z} & \\\\ &= (1 - p)^{r_1 + r_2} p^{z} \\sum_{k = 0}^{z} \\binom{z - k + r_1 - 1}{z - k} \\binom{k + r_2 - 1}{k}& \\end{align}\\] The part before the sum reminds us of the negative binomial distribution with parameters \\(r_1 + r_2\\) and \\(p\\). To complete this term to the negative binomial PMF we need \\(\\binom{z + r_1 + r_2 -1}{z}\\). So the only thing we need to prove is that the sum equals this term. Both terms in the sum can be interpreted as numbers of ways to place a number of balls into boxes. For the left term it is \\(z-k\\) balls into \\(r_1\\) boxes, and for the right \\(k\\) balls into \\(r_2\\) boxes. For each \\(k\\) we are distributing \\(z\\) balls in total. By summing over all \\(k\\), we actually get all the possible placements of \\(z\\) balls into \\(r_1 + r_2\\) boxes. Therefore \\[\\begin{align} P(Z = z) &= (1 - p)^{r_1 + r_2} p^{z} \\sum_{k = 0}^{z} \\binom{z - k + r_1 - 1}{z - k} \\binom{k + r_2 - 1}{k}& \\\\ &= \\binom{z + r_1 + r_2 -1}{z} (1 - p)^{r_1 + r_2} p^{z}. \\end{align}\\] From this it also follows that the sum of geometric distributions with the same parameter is a negative binomial distribution. \\(Z \\sim \\text{NB}(8, 0.4)\\). set.seed(1) nsamps <- 10000 x <- rnbinom(nsamps, size = 5, prob = 0.6) y <- rnbinom(nsamps, size = 3, prob = 0.6) xpy <- x + y z <- rnbinom(nsamps, size = 8, prob = 0.6) samps <- tibble(x, y, xpy, z) samps <- melt(samps) ggplot(data = samps, aes(x = value)) + geom_bar() + facet_wrap(~ variable) 4.3 Continuous random variables Exercise 4.8 (Exponential random variable) A variable \\(X\\) with PDF \\(\\lambda e^{-\\lambda x}\\) is an exponential random variable with support in non-negative real numbers. It has one positive parameter \\(\\lambda\\). We denote \\[\\begin{equation} X | \\lambda \\sim \\text{Exp}(\\lambda). \\end{equation}\\] Find the CDF of an exponential random variable. Find the quantile function of an exponential random variable. Calculate the probability \\(P(1 \\leq X \\leq 3)\\), where \\(X \\sim \\text{Exp(1.5)}\\). R: Check your answer to c) with a simulation (rexp). Plot the probability in a meaningful way. R: Implement PDF, CDF, and the quantile function and compare their values with corresponding R functions visually. 
Hint: use the size parameter to make one of the curves wider. Solution. \\[\\begin{align} F(x) &= \\int_{0}^{x} \\lambda e^{-\\lambda t} dt \\\\ &= \\lambda \\int_{0}^{x} e^{-\\lambda t} dt \\\\ &= \\lambda (\\frac{1}{-\\lambda}e^{-\\lambda t} |_{0}^{x}) \\\\ &= \\lambda(\\frac{1}{\\lambda} - \\frac{1}{\\lambda} e^{-\\lambda x}) \\\\ &= 1 - e^{-\\lambda x}. \\end{align}\\] \\[\\begin{align} F(F^{-1}(x)) &= x \\\\ 1 - e^{-\\lambda F^{-1}(x)} &= x \\\\ e^{-\\lambda F^{-1}(x)} &= 1 - x \\\\ -\\lambda F^{-1}(x) &= \\ln(1 - x) \\\\ F^{-1}(x) &= - \\frac{ln(1 - x)}{\\lambda}. \\end{align}\\] \\[\\begin{align} P(1 \\leq X \\leq 3) &= P(X \\leq 3) - P(X \\leq 1) \\\\ &= P(X \\leq 3) - P(X \\leq 1) \\\\ &= 1 - e^{-1.5 \\times 3} - 1 + e^{-1.5 \\times 1} \\\\ &\\approx 0.212. \\end{align}\\] set.seed(1) nsamps <- 1000 samps <- rexp(nsamps, rate = 1.5) sum(samps >= 1 & samps <= 3) / nsamps ## [1] 0.212 exp_plot <- ggplot(data.frame(x = seq(0, 5, by = 0.01)), aes(x = x)) + stat_function(fun = dexp, args = list(rate = 1.5)) + stat_function(fun = dexp, args = list(rate = 1.5), xlim = c(1,3), geom = "area", fill = "red") plot(exp_plot) exp_pdf <- function(x, lambda) { return (lambda * exp(-lambda * x)) } exp_cdf <- function(x, lambda) { return (1 - exp(-lambda * x)) } exp_quant <- function(q, lambda) { return (-(log(1 - q) / lambda)) } ggplot(data = data.frame(x = seq(0, 5, by = 0.01)), aes(x = x)) + stat_function(fun = dexp, args = list(rate = 1.5), aes(color = "R"), size = 2.5) + stat_function(fun = exp_pdf, args = list(lambda = 1.5), aes(color = "Mine"), size = 1.2) + scale_color_manual(values = c("red", "black")) ggplot(data = data.frame(x = seq(0, 5, by = 0.01)), aes(x = x)) + stat_function(fun = pexp, args = list(rate = 1.5), aes(color = "R"), size = 2.5) + stat_function(fun = exp_cdf, args = list(lambda = 1.5), aes(color = "Mine"), size = 1.2) + scale_color_manual(values = c("red", "black")) ggplot(data = data.frame(x = seq(0, 1, by = 0.01)), aes(x = x)) + stat_function(fun = qexp, args = list(rate = 1.5), aes(color = "R"), size = 2.5) + stat_function(fun = exp_quant, args = list(lambda = 1.5), aes(color = "Mine"), size = 1.2) + scale_color_manual(values = c("red", "black")) Exercise 4.9 (Uniform random variable) Continuous uniform random variable with parameters \\(a\\) and \\(b\\) has the PDF \\[\\begin{equation} p(x) = \\begin{cases} \\frac{1}{b - a} & x \\in [a,b] \\\\ 0 & \\text{otherwise}. \\end{cases} \\end{equation}\\] Find the CDF of the uniform random variable. Find the quantile function of the uniform random variable. Let \\(X \\sim \\text{Uniform}(a,b)\\). Find the CDF of the variable \\(Y = \\frac{X - a}{b - a}\\). This is the standard uniform random variable. Let \\(X \\sim \\text{Uniform}(-1, 3)\\). Find such \\(z\\) that \\(P(X < z + \\mu_x) = \\frac{1}{5}\\). R: Check your result from d) using simulation. Solution. \\[\\begin{align} F(x) &= \\int_{a}^x \\frac{1}{b - a} dt \\\\ &= \\frac{1}{b - a} \\int_{a}^x dt \\\\ &= \\frac{x - a}{b - a}. \\end{align}\\] \\[\\begin{align} F(F^{-1}(p)) &= p \\\\ \\frac{F^{-1}(p) - a}{b - a} &= p \\\\ F^{-1}(p) &= p(b - a) + a. \\end{align}\\] \\[\\begin{align} F_Y(y) &= P(Y < y) \\\\ &= P(\\frac{X - a}{b - a} < y) \\\\ &= P(X < y(b - a) + a) \\\\ &= F_X(y(b - a) + a) \\\\ &= \\frac{(y(b - a) + a) - a}{b - a} \\\\ &= y. \\end{align}\\] \\[\\begin{align} P(X < z + 1) &= \\frac{1}{5} \\\\ F(z + 1) &= \\frac{1}{5} \\\\ z + 1 &= F^{-1}(\\frac{1}{5}) \\\\ z &= \\frac{1}{5}4 - 1 - 1 \\\\ z &= -1.2. 
\\end{align}\\] set.seed(1) a <- -1 b <- 3 nsamps <- 10000 unif_samp <- runif(nsamps, a, b) mu_x <- mean(unif_samp) new_samp <- unif_samp - mu_x quantile(new_samp, probs = 1/5) ## 20% ## -1.203192 punif(-0.2, -1, 3) ## [1] 0.2 Exercise 4.10 (Beta random variable) A variable \\(X\\) with PDF \\[\\begin{equation} p(x) = \\frac{x^{\\alpha - 1} (1 - x)^{\\beta - 1}}{\\text{B}(\\alpha, \\beta)}, \\end{equation}\\] where \\(\\text{B}(\\alpha, \\beta) = \\frac{\\Gamma(\\alpha) \\Gamma(\\beta)}{\\Gamma(\\alpha + \\beta)}\\) and \\(\\Gamma(x) = \\int_0^{\\infty} x^{z - 1} e^{-x} dx\\) is a Beta random variable with support on \\([0,1]\\). It has two positive parameters \\(\\alpha\\) and \\(\\beta\\). Notation: \\[\\begin{equation} X | \\alpha, \\beta \\sim \\text{Beta}(\\alpha, \\beta) \\end{equation}\\] It is often used in modeling rates. Calculate the PDF for \\(\\alpha = 1\\) and \\(\\beta = 1\\). What do you notice? R: Plot densities of the beta distribution for parameter pairs (2, 2), (4, 1), (1, 4), (2, 5), and (0.1, 0.1). R: Sample from \\(X \\sim \\text{Beta}(2, 5)\\) and compare the histogram with Beta PDF. Solution. \\[\\begin{equation} p(x) = \\frac{x^{1 - 1} (1 - x)^{1 - 1}}{\\text{B}(1, 1)} = 1. \\end{equation}\\] This is the standard uniform distribution. set.seed(1) ggplot(data = data.frame(x = seq(0, 1, by = 0.01)), aes(x = x)) + stat_function(fun = dbeta, args = list(shape1 = 2, shape2 = 2), aes(color = "alpha = 0.5")) + stat_function(fun = dbeta, args = list(shape1 = 4, shape2 = 1), aes(color = "alpha = 4")) + stat_function(fun = dbeta, args = list(shape1 = 1, shape2 = 4), aes(color = "alpha = 1")) + stat_function(fun = dbeta, args = list(shape1 = 2, shape2 = 5), aes(color = "alpha = 25")) + stat_function(fun = dbeta, args = list(shape1 = 0.1, shape2 = 0.1), aes(color = "alpha = 0.1")) set.seed(1) nsamps <- 1000 samps <- rbeta(nsamps, 2, 5) ggplot(data = data.frame(x = samps), aes(x = x)) + geom_histogram(aes(y = ..density..), color = "black") + stat_function(data = data.frame(x = seq(0, 1, by = 0.01)), aes(x = x), fun = dbeta, args = list(shape1 = 2, shape2 = 5), color = "red", size = 1.2) Exercise 4.11 (Gamma random variable) A random variable with PDF \\[\\begin{equation} p(x) = \\frac{\\beta^\\alpha}{\\Gamma(\\alpha)} x^{\\alpha - 1}e^{-\\beta x} \\end{equation}\\] is a Gamma random variable with support on the positive numbers and parameters shape \\(\\alpha > 0\\) and rate \\(\\beta > 0\\). We write \\[\\begin{equation} X | \\alpha, \\beta \\sim \\text{Gamma}(\\alpha, \\beta) \\end{equation}\\] and it’s CDF is \\[\\begin{equation} \\frac{\\gamma(\\alpha, \\beta x)}{\\Gamma(\\alpha)}, \\end{equation}\\] where \\(\\gamma(s, x) = \\int_0^x t^{s-1} e^{-t} dt\\). It is usually used in modeling positive phenomena (for example insurance claims and rainfalls). Let \\(X \\sim \\text{Gamma}(1, \\beta)\\). Find the PDF of \\(X\\). Do you recognize this PDF? Let \\(k = \\alpha\\) and \\(\\theta = \\frac{1}{\\beta}\\). Find the PDF of \\(X | k, \\theta \\sim \\text{Gamma}(k, \\theta)\\). Random variables can be reparameterized, and sometimes a reparameterized distribution is more suitable for certain calculations. The first parameterization is for example usually used in Bayesian statistics, while this parameterization is more common in econometrics and some other applied fields. Note that you also need to pay attention to the parameters in statistical software, so diligently read the help files when using functions like rgamma to see how the function is parameterized. 
R: Plot gamma CDF for random variables with shape and rate parameters (1,1), (10,1), (1,10). Solution. \\[\\begin{align} p(x) &= \\frac{\\beta^1}{\\Gamma(1)} x^{1 - 1}e^{-\\beta x} \\\\ &= \\beta e^{-\\beta x} \\end{align}\\] This is the PDF of the exponential distribution with parameter \\(\\beta\\). \\[\\begin{align} p(x) &= \\frac{1}{\\Gamma(k)\\theta^k} x^{k - 1}e^{-\\frac{x}{\\theta}}. \\end{align}\\] set.seed(1) ggplot(data = data.frame(x = seq(0, 25, by = 0.01)), aes(x = x)) + stat_function(fun = pgamma, args = list(shape = 1, rate = 1), aes(color = "Gamma(1,1)")) + stat_function(fun = pgamma, args = list(shape = 10, rate = 1), aes(color = "Gamma(10,1)")) + stat_function(fun = pgamma, args = list(shape = 1, rate = 10), aes(color = "Gamma(1,10)")) Exercise 4.12 (Normal random variable) A random variable with PDF \\[\\begin{equation} p(x) = \\frac{1}{\\sqrt{2 \\pi \\sigma^2}} e^{-\\frac{(x - \\mu)^2}{2 \\sigma^2}} \\end{equation}\\] is a normal random variable with support on the real axis and parameters \\(\\mu\\) in reals and \\(\\sigma^2 > 0\\). The first is the mean parameter and the second is the variance parameter. Many statistical methods assume a normal distribution. We denote \\[\\begin{equation} X | \\mu, \\sigma \\sim \\text{N}(\\mu, \\sigma^2), \\end{equation}\\] and its CDF is \\[\\begin{equation} F(x) = \\int_{-\\infty}^x \\frac{1}{\\sqrt{2 \\pi \\sigma^2}} e^{-\\frac{(t - \\mu)^2}{2 \\sigma^2}} dt, \\end{equation}\\] which is intractable and is usually approximated. Due to its flexibility it is also one of the most researched distributions. For that reason statisticians often use transformations of variables or approximate distributions with the normal distribution. Show that a variable \\(\\frac{X - \\mu}{\\sigma} \\sim \\text{N}(0,1)\\). This transformation is called standardization, and \\(\\text{N}(0,1)\\) is a standard normal distribution. R: Plot the normal distribution with \\(\\mu = 0\\) and different values for the \\(\\sigma\\) parameter. R: The normal distribution provides a good approximation for the Poisson distribution with a large \\(\\lambda\\). Let \\(X \\sim \\text{Poisson}(50)\\). Approximate \\(X\\) with the normal distribution and compare its density with the Poisson histogram. What are the values of \\(\\mu\\) and \\(\\sigma^2\\) that should provide the best approximation? Note that R function rnorm takes standard deviation (\\(\\sigma\\)) as a parameter and not variance. Solution. \\[\\begin{align} P(\\frac{X - \\mu}{\\sigma} < x) &= P(X < \\sigma x + \\mu) \\\\ &= F(\\sigma x + \\mu) \\\\ &= \\int_{-\\infty}^{\\sigma x + \\mu} \\frac{1}{\\sqrt{2 \\pi \\sigma^2}} e^{-\\frac{(t - \\mu)^2}{2\\sigma^2}} dt \\end{align}\\] Now let \\(s = f(t) = \\frac{t - \\mu}{\\sigma}\\), then \\(ds = \\frac{dt}{\\sigma}\\) and \\(f(\\sigma x + \\mu) = x\\), so \\[\\begin{align} P(\\frac{X - \\mu}{\\sigma} < x) &= \\int_{-\\infty}^{x} \\frac{1}{\\sqrt{2 \\pi}} e^{-\\frac{s^2}{2}} ds. \\end{align}\\] There is no need to evaluate this integral, as we recognize it as the CDF of a normal distribution with \\(\\mu = 0\\) and \\(\\sigma^2 = 1\\).
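As a quick empirical sanity check of the standardization result in a) (this snippet is our own sketch, not part of the original solution, and assumes only base R), we can simulate from an arbitrary normal distribution, standardize the samples, and verify that their summary statistics match the standard normal.
set.seed(1)
mu <- 3
sigma <- 2
z <- (rnorm(100000, mean = mu, sd = sigma) - mu) / sigma
c(mean(z), var(z)) ## should be close to 0 and 1
quantile(z, c(0.025, 0.975)) ## should be close to qnorm(c(0.025, 0.975)), i.e. -1.96 and 1.96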
set.seed(1) # b ggplot(data = data.frame(x = seq(-15, 15, by = 0.01)), aes(x = x)) + stat_function(fun = dnorm, args = list(mean = 0, sd = 1), aes(color = "sd = 1")) + stat_function(fun = dnorm, args = list(mean = 0, sd = 0.4), aes(color = "sd = 0.4")) + stat_function(fun = dnorm, args = list(mean = 0, sd = 2), aes(color = "sd = 2")) + stat_function(fun = dnorm, args = list(mean = 0, sd = 5), aes(color = "sd = 5")) # c mean_par <- 50 nsamps <- 100000 pois_samps <- rpois(nsamps, lambda = mean_par) norm_samps <- rnorm(nsamps, mean = mean_par, sd = sqrt(mean_par)) my_plot <- ggplot() + geom_bar(data = tibble(x = pois_samps), aes(x = x, y = (..count..)/sum(..count..))) + geom_density(data = tibble(x = norm_samps), aes(x = x), color = "red") plot(my_plot) Exercise 4.13 (Logistic random variable) A logistic random variable has CDF \\[\\begin{equation} F(x) = \\frac{1}{1 + e^{-\\frac{x - \\mu}{s}}}, \\end{equation}\\] where \\(\\mu\\) is real and \\(s > 0\\). The support is on the real axis. We denote \\[\\begin{equation} X | \\mu, s \\sim \\text{Logistic}(\\mu, s). \\end{equation}\\] The distribution of the logistic random variable resembles a normal random variable, however it has heavier tails. Find the PDF of a logistic random variable. R: Implement logistic PDF and CDF and visually compare both for \\(X \\sim \\text{N}(0, 4)\\) and \\(Y \\sim \\text{Logistic}(0, \\sqrt{\\frac{12}{\\pi^2}})\\). These distributions have the same mean and variance. Additionally, plot the same plot on the interval [5,10], to better see the difference in the tails. R: For the distributions in b) find the probability \\(P(|X| > 4)\\) and interpret the result. Solution. \\[\\begin{align} p(x) &= \\frac{d}{dx} \\frac{1}{1 + e^{-\\frac{x - \\mu}{s}}} \\\\ &= \\frac{- \\frac{d}{dx} (1 + e^{-\\frac{x - \\mu}{s}})}{(1 + e^{-\\frac{x - \\mu}{s}})^2} \\\\ &= \\frac{e^{-\\frac{x - \\mu}{s}}}{s(1 + e^{-\\frac{x - \\mu}{s}})^2}. \\end{align}\\] # b set.seed(1) logit_pdf <- function (x, mu, s) { return ((exp(-(x - mu)/(s))) / (s * (1 + exp(-(x - mu)/(s)))^2)) } nl_plot <- ggplot(data = data.frame(x = seq(-12, 12, by = 0.01)), aes(x = x)) + stat_function(fun = dnorm, args = list(mean = 0, sd = 2), aes(color = "normal")) + stat_function(fun = logit_pdf, args = list(mu = 0, s = sqrt(12/pi^2)), aes(color = "logit")) plot(nl_plot) nl_plot <- ggplot(data = data.frame(x = seq(5, 10, by = 0.01)), aes(x = x)) + stat_function(fun = dnorm, args = list(mean = 0, sd = 2), aes(color = "normal")) + stat_function(fun = logit_pdf, args = list(mu = 0, s = sqrt(12/pi^2)), aes(color = "logit")) plot(nl_plot) # c logit_cdf <- function (x, mu, s) { return (1 / (1 + exp(-(x - mu) / s))) } p_logistic <- 1 - logit_cdf(4, 0, sqrt(12/pi^2)) + logit_cdf(-4, 0, sqrt(12/pi^2)) p_norm <- 1 - pnorm(4, 0, 2) + pnorm(-4, 0, 2) p_logistic ## [1] 0.05178347 p_norm ## [1] 0.04550026 # Logistic distribution has wider tails, therefore the probability of larger # absolute values is higher. 4.4 Singular random variables Exercise 4.14 (Cantor distribution) The Cantor set is a subset of \\([0,1]\\), which we create by iteratively deleting the middle third of the interval. For example, in the first iteration, we get the sets \\([0,\\frac{1}{3}]\\) and \\([\\frac{2}{3},1]\\). In the second iteration, we get \\([0,\\frac{1}{9}]\\), \\([\\frac{2}{9},\\frac{1}{3}]\\), \\([\\frac{2}{3}, \\frac{7}{9}]\\), and \\([\\frac{8}{9}, 1]\\).
On the \\(n\\)-th iteration, we have \\[\\begin{equation} C_n = \\frac{C_{n-1}}{3} \\cup \\bigg(\\frac{2}{3} + \\frac{C_{n-1}}{3} \\bigg), \\end{equation}\\] where \\(C_0 = [0,1]\\). The Cantor set is then defined as the intersection of these sets \\[\\begin{equation} C = \\cap_{n=1}^{\\infty} C_n. \\end{equation}\\] It has the same cardinality as \\([0,1]\\). Another way to define the Cantor set is the set of all numbers on \\([0,1]\\), that do not have a 1 in the ternary representation \\(x = \\sum_{i=1}^\\infty \\frac{x_i}{3^i}, x_i \\in \\{0,1,2\\}\\). A random variable follows the Cantor distribution, if its CDF is the Cantor function (below). You can find the implementations of random number generator, CDF, and quantile functions for the Cantor distributions at https://github.com/Henrygb/CantorDist.R. Show that the Lebesgue measure of the Cantor set is 0. (Jagannathan) Let us look at an infinite sequence of independent fair-coin tosses. If the outcome is heads, let \\(x_i = 2\\), and let \\(x_i = 0\\) when tails. Then use these to create \\(X = \\sum_{i=1}^\\infty \\frac{x_i}{3^i}\\). This is a random variable with the Cantor distribution. Show that \\(X\\) has a singular distribution. Solution. \\[\\begin{align} \\lambda(C) &= 1 - \\lambda(C^c) \\\\ &= 1 - \\frac{1}{3}\\sum_{k = 0}^\\infty (\\frac{2}{3})^k \\\\ &= 1 - \\frac{\\frac{1}{3}}{1 - \\frac{2}{3}} \\\\ &= 0. \\end{align}\\] First, for every \\(x\\), the probability of observing it is \\(\\lim_{n \\rightarrow \\infty} \\frac{1}{2^n} = 0\\). Second, the probability that we observe one of all the possible sequences is 1. Therefore \\(P(C) = 1\\). So this is a singular variable. The CDF only increments on the elements of the Cantor set. 4.5 Transformations Exercise 4.15 Let \\(X\\) be a random variable that is uniformly distributed on \\(\\{-2, -1, 0, 1, 2\\}\\). Find the PMF of \\(Y = X^2\\). Solution. \\[\\begin{align} P_Y(y) = \\sum_{x \\in \\{-\\sqrt{y}, \\sqrt{y}\\}} P_X(x) = \\begin{cases} 0 & y \\notin \\{0,1,4\\} \\\\ \\frac{1}{5} & y = 0 \\\\ \\frac{2}{5} & y \\in \\{1,4\\} \\end{cases} \\end{align}\\] Exercise 4.16 (Lognormal random variable) A lognormal random variable is a variable whose logarithm is normally distributed. In practice, we often encounter skewed data. Usually using a log transformation on such data makes it more symmetric and therefore more suitable for modeling with the normal distribution (more on why we wish to model data with the normal distribution in the following chapters). Let \\(X \\sim \\text{N}(\\mu,\\sigma)\\). Find the PDF of \\(Y: \\log(Y) = X\\). R: Sample from the lognormal distribution with parameters \\(\\mu = 0.5\\) and \\(\\sigma = 0.4\\). Plot a histogram of the samples. Then log-transform the samples and plot a histogram along with the theoretical normal PDF. Solution. \\[\\begin{align} p_Y(y) &= p_X(\\log(y)) \\frac{d}{dy} \\log(y) \\\\ &= \\frac{1}{\\sqrt{2 \\pi \\sigma^2}} e^{-\\frac{(\\log(y) - \\mu)^2}{2 \\sigma^2}} \\frac{1}{y} \\\\ &= \\frac{1}{y \\sqrt{2 \\pi \\sigma^2}} e^{-\\frac{(\\log(y) - \\mu)^2}{2 \\sigma^2}}.
\\end{align}\\] set.seed(1) nsamps <- 10000 mu <- 0.5 sigma <- 0.4 ln_samps <- rlnorm(nsamps, mu, sigma) ln_plot <- ggplot(data = data.frame(x = ln_samps), aes(x = x)) + geom_histogram(color = "black") plot(ln_plot) norm_samps <- log(ln_samps) n_plot <- ggplot(data = data.frame(x = norm_samps), aes(x = x)) + geom_histogram(aes(y = ..density..), color = "black") + stat_function(fun = dnorm, args = list(mean = mu, sd = sigma), color = "red") plot(n_plot) Exercise 4.17 (Probability integral transform) This exercise is borrowed from Wasserman. Let \\(X\\) have a continuous, strictly increasing CDF \\(F\\). Let \\(Y = F(X)\\). Find the density of \\(Y\\). This is called the probability integral transform. Let \\(U \\sim \\text{Uniform}(0,1)\\) and let \\(X = F^{-1}(U)\\). Show that \\(X \\sim F\\). R: Implement a program that takes Uniform(0,1) random variables and generates random variables from an exponential(\\(\\beta\\)) distribution. Compare your implemented function with function rexp in R. Solution. \\[\\begin{align} F_Y(y) &= P(Y < y) \\\\ &= P(F(X) < y) \\\\ &= P(X < F_X^{-1}(y)) \\\\ &= F_X(F_X^{-1}(y)) \\\\ &= y. \\end{align}\\] From the above it follows that \\(p(y) = 1\\). Note that we need to know the inverse CDF to be able to apply this procedure. \\[\\begin{align} P(X < x) &= P(F^{-1}(U) < x) \\\\ &= P(U < F(x)) \\\\ &= F_U(F(x)) \\\\ &= F(x). \\end{align}\\] set.seed(1) nsamps <- 10000 beta <- 4 generate_exp <- function (n, beta) { tmp <- runif(n) X <- qexp(tmp, beta) return (X) } df <- tibble("R" = rexp(nsamps, beta), "myGenerator" = generate_exp(nsamps, beta)) %>% gather() ggplot(data = df, aes(x = value, fill = key)) + geom_histogram(position = "dodge") "],["mrvs.html", "Chapter 5 Multiple random variables 5.1 General 5.2 Bivariate distribution examples 5.3 Transformations", " Chapter 5 Multiple random variables This chapter deals with multiple random variables and their distributions. The students are expected to acquire the following knowledge: Theoretical Calculation of PDF of transformed multiple random variables. Finding marginal and conditional distributions. R Scatterplots of bivariate random variables. New R functions (for example, expand.grid). .fold-btn { float: right; margin: 5px 5px 0 0; } .fold { border: 1px solid black; min-height: 40px; } 5.1 General Exercise 5.1 Let \\(X \\sim \\text{N}(0,1)\\) and \\(Y \\sim \\text{N}(0,1)\\) be independent random variables. Draw 1000 samples from \\((X,Y)\\) and plot a scatterplot. Now let \\(X \\sim \\text{N}(0,1)\\) and \\(Y | X = x \\sim N(ax, 1)\\). Draw 1000 samples from \\((X,Y)\\) for \\(a = 1\\), \\(a=0\\), and \\(a=-0.5\\). Plot the scatterplots. How would you interpret parameter \\(a\\)? Plot the marginal distribution of \\(Y\\) for cases \\(a=1\\), \\(a=0\\), and \\(a=-0.5\\). Can you guess which distribution it is? set.seed(1) nsamps <- 1000 x <- rnorm(nsamps) y <- rnorm(nsamps) ggplot(data.frame(x, y), aes(x = x, y = y)) + geom_point() y1 <- rnorm(nsamps, mean = 1 * x) y2 <- rnorm(nsamps, mean = 0 * x) y3 <- rnorm(nsamps, mean = -0.5 * x) df <- tibble(x = c(x,x,x), y = c(y1,y2,y3), a = c(rep(1, nsamps), rep(0, nsamps), rep(-0.5, nsamps))) ggplot(df, aes(x = x, y = y)) + geom_point() + facet_wrap(~a) + coord_equal(ratio=1) # Parameter a controls the scale of linear dependency between X and Y. 
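# As a rough numerical check of this interpretation (this check is our own
# addition, not part of the original solution, and assumes dplyr is loaded, as
# in the other solutions), the sample correlation within each panel should be
# close to a / sqrt(1 + a^2), i.e. roughly 0.71, 0, and -0.45.
df %>% group_by(a) %>% summarise(empirical_correlation = cor(x, y))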
ggplot(df, aes(x = y)) + geom_density() + facet_wrap(~a) 5.2 Bivariate distribution examples Exercise 5.2 (Discrete bivariate random variable) Let \\(X\\) represent the event that a die rolls an even number and let \\(Y\\) represent the event that a die rolls one, two, or three. Find the marginal distributions of \\(X\\) and \\(Y\\). Find the PMF of \\((X,Y)\\). Find the CDF of \\((X,Y)\\). Find \\(P(X = 1 | Y = 1)\\). Solution. \\[\\begin{align} P(X = 1) = \\frac{1}{2} \\text{ and } P(X = 0) = \\frac{1}{2} \\\\ P(Y = 1) = \\frac{1}{2} \\text{ and } P(Y = 0) = \\frac{1}{2} \\\\ \\end{align}\\] \\[\\begin{align} P(X = 1, Y = 1) = \\frac{1}{6} \\\\ P(X = 1, Y = 0) = \\frac{2}{6} \\\\ P(X = 0, Y = 1) = \\frac{2}{6} \\\\ P(X = 0, Y = 0) = \\frac{1}{6} \\end{align}\\] \\[\\begin{align} P(X \\leq x, Y \\leq y) = \\begin{cases} \\frac{1}{6} & x = 0, y = 0 \\\\ \\frac{3}{6} & x \\neq y \\\\ 1 & x = 1, y = 1 \\end{cases} \\end{align}\\] \\[\\begin{align} P(X = 1 | Y = 1) = \\frac{1}{3} \\end{align}\\] Exercise 5.3 (Continuous bivariate random variable) Let \\(p(x,y) = 6 (x - y)^2\\) be the PDF of a bivariate random variable \\((X,Y)\\), where both variables range from zero to one. Find CDF. Find marginal distributions. Find conditional distributions. R: Plot a grid of points and colour them by value – this can help us visualize the PDF. R: Implement a random number generator, which will generate numbers from \\((X,Y)\\) and visually check the results. R: Plot the marginal distribution of \\(Y\\) and the conditional distributions of \\(X | Y = y\\), where \\(y \\in \\{0, 0.1, 0.5\\}\\). Solution. \\[\\begin{align} F(x,y) &= \\int_0^{x} \\int_0^{y} 6 (t - s)^2 ds dt\\\\ &= 6 \\int_0^{x} \\int_0^{y} t^2 - 2ts + s^2 ds dt\\\\ &= 6 \\int_0^{x} t^2y - ty^2 + \\frac{y^3}{3} dt \\\\ &= 6 (\\frac{x^3 y}{3} - \\frac{x^2y^2}{2} + \\frac{x y^3}{3}) \\\\ &= 2 x^3 y - 3 x^2y^2 + 2 x y^3 \\end{align}\\] \\[\\begin{align} p(x) &= \\int_0^{1} 6 (x - y)^2 dy\\\\ &= 6 (x^2 - x + \\frac{1}{3}) \\\\ &= 6x^2 - 6x + 2 \\end{align}\\] \\[\\begin{align} p(y) &= \\int_0^{1} 6 (x - y)^2 dx\\\\ &= 6 (y^2 - y + \\frac{1}{3}) \\\\ &= 6y^2 - 6y + 2 \\end{align}\\] \\[\\begin{align} p(x|y) &= \\frac{p(x,y)}{p(y)} \\\\ &= \\frac{6 (x - y)^2}{6 (y^2 - y + \\frac{1}{3})} \\\\ &= \\frac{(x - y)^2}{y^2 - y + \\frac{1}{3}} \\end{align}\\] \\[\\begin{align} p(y|x) &= \\frac{p(x,y)}{p(x)} \\\\ &= \\frac{6 (x - y)^2}{6 (x^2 - x + \\frac{1}{3})} \\\\ &= \\frac{(x - y)^2}{x^2 - x + \\frac{1}{3}} \\end{align}\\] set.seed(1) # d pxy <- function (x, y) { return ((x - y)^2) } x_axis <- seq(0, 1, length.out = 100) y_axis <- seq(0, 1, length.out = 100) df <- expand.grid(x_axis, y_axis) colnames(df) <- c("x", "y") df <- cbind(df, pdf = pxy(df$x, df$y)) ggplot(data = df, aes(x = x, y = y, color = pdf)) + geom_point() # e samps <- NULL for (i in 1:10000) { xt <- runif(1, 0, 1) yt <- runif(1, 0, 1) pdft <- pxy(xt, yt) acc <- runif(1, 0, 6) if (acc <= pdft) { samps <- rbind(samps, c(xt, yt)) } } colnames(samps) <- c("x", "y") ggplot(data = as.data.frame(samps), aes(x = x, y = y)) + geom_point() # f mar_pdf <- function (x) { return (6 * x^2 - 6 * x + 2) } cond_pdf <- function (x, y) { return (((x - y)^2) / (y^2 - y + 1/3)) } df <- tibble(x = x_axis, mar = mar_pdf(x), y0 = cond_pdf(x, 0), y0.1 = cond_pdf(x, 0.1), y0.5 = cond_pdf(x, 0.5)) %>% gather(dist, value, -x) ggplot(df, aes(x = x, y = value, color = dist)) + geom_line() Exercise 5.4 (Mixed bivariate random variable) Let \\(f(x,y) = \\frac{\\beta^{\\alpha}}{\\Gamma(\\alpha)y!} x^{y+ \\alpha -1}
e^{-x(1 + \\beta)}\\) be the PDF of a bivariate random variable, where \\(x \\in (0, \\infty)\\) and \\(y \\in \\mathbb{N}_0\\). Find the marginal distribution of \\(X\\). Do you recognize this distribution? Find the conditional distribution of \\(Y | X\\). Do you recognize this distribution? Calculate the probability \\(P(Y = 2 | X = 2.5)\\) for \\((X,Y)\\). Find the marginal distribution of \\(Y\\). Do you recognize this distribution? R: Take 1000 random samples from \\((X,Y)\\) with parameters \\(\\beta = 1\\) and \\(\\alpha = 1\\). Plot a scatterplot. Plot a bar plot of the marginal distribution of \\(Y\\), and the theoretical PMF calculated from d) on the range from 0 to 10. Hint: Use the gamma function in R.? Solution. \\[\\begin{align} p(x) &= \\sum_{k = 0}^{\\infty} \\frac{\\beta^{\\alpha}}{\\Gamma(\\alpha)k!} x^{k + \\alpha -1} e^{-x(1 + \\beta)} & \\\\ &= \\sum_{k = 0}^{\\infty} \\frac{\\beta^{\\alpha}}{\\Gamma(\\alpha)k!} x^{k} x^{\\alpha -1} e^{-x} e^{-\\beta x} & \\\\ &= \\frac{\\beta^{\\alpha}}{\\Gamma(\\alpha)} x^{\\alpha -1} e^{-\\beta x} \\sum_{k = 0}^{\\infty} \\frac{1}{k!} x^{k} e^{-x} & \\\\ &= \\frac{\\beta^{\\alpha}}{\\Gamma(\\alpha)} x^{\\alpha -1} e^{-\\beta x} & \\text{the last term above sums to one} \\end{align}\\] This is the Gamma PDF. \\[\\begin{align} p(y|x) &= \\frac{p(x,y)}{p(x)} \\\\ &= \\frac{\\frac{\\beta^{\\alpha}}{\\Gamma(\\alpha)y!} x^{y+ \\alpha -1} e^{-x(1 + \\beta)}}{\\frac{\\beta^{\\alpha}}{\\Gamma(\\alpha)} x^{\\alpha -1} e^{-\\beta x}} \\\\ &= \\frac{x^y e^{-x}}{y!}. \\end{align}\\] This is the Poisson PMF. \\[\\begin{align} P(Y = 2 | X = 2.5) = \\frac{2.5^2 e^{-2.5}}{2!} \\approx 0.26. \\end{align}\\] \\[\\begin{align} p(y) &= \\int_{0}^{\\infty} \\frac{\\beta^{\\alpha}}{\\Gamma(\\alpha)y!} x^{y + \\alpha -1} e^{-x(1 + \\beta)} dx & \\\\ &= \\frac{1}{y!} \\int_{0}^{\\infty} \\frac{\\beta^{\\alpha}}{\\Gamma(\\alpha)} x^{(y + \\alpha) -1} e^{-(1 + \\beta)x} dx & \\\\ &= \\frac{1}{y!} \\frac{\\beta^{\\alpha}}{\\Gamma(\\alpha)} \\int_{0}^{\\infty} \\frac{\\Gamma(y + \\alpha)}{(1 + \\beta)^{y + \\alpha}} \\frac{(1 + \\beta)^{y + \\alpha}}{\\Gamma(y + \\alpha)} x^{(y + \\alpha) -1} e^{-(1 + \\beta)x} dx & \\text{complete to Gamma PDF} \\\\ &= \\frac{1}{y!} \\frac{\\beta^{\\alpha}}{\\Gamma(\\alpha)} \\frac{\\Gamma(y + \\alpha)}{(1 + \\beta)^{y + \\alpha}}. \\end{align}\\] We add the terms in the third equality to get a Gamma PDF inside the integral, which then integrates to one. We do not recognize this distribution. set.seed(1) px <- function (x, alpha, beta) { return((1 / factorial(x)) * (beta^alpha / gamma(alpha)) * (gamma(x + alpha) / (1 + beta)^(x + alpha))) } nsamps <- 1000 rx <- rgamma(nsamps, 1, 1) ryx <- rpois(nsamps, rx) ggplot(data = data.frame(x = rx, y = ryx), aes(x = x, y = y)) + geom_point() ggplot(data = data.frame(x = rx, y = ryx), aes(x = y)) + geom_bar(aes(y = (..count..)/sum(..count..))) + stat_function(fun = px, args = list(alpha = 1, beta = 1), color = "red") Exercise 5.5 Let \\(f(x,y) = cx^2y\\) for \\(x^2 \\leq y \\leq 1\\) and zero otherwise. Find such \\(c\\) that \\(f\\) is a PDF of a bivariate random variable. This exercise is borrowed from Wasserman. Solution. \\[\\begin{align} 1 &= \\int_{-1}^{1} \\int_{x^2}^1 cx^2y dy dx \\\\ &= \\int_{-1}^{1} cx^2 (\\frac{1}{2} - \\frac{x^4}{2}) dx \\\\ &= \\frac{c}{2} \\int_{-1}^{1} x^2 - x^6 dx \\\\ &= \\frac{c}{2} (\\frac{1}{3} + \\frac{1}{3} - \\frac{1}{7} - \\frac{1}{7}) \\\\ &= \\frac{c}{2} \\frac{8}{21} \\\\ &= \\frac{4c}{21} \\end{align}\\] It follows \\(c = \\frac{21}{4}\\). 
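As a quick numerical sanity check of this constant, in the spirit of the other R exercises (this snippet is our own sketch and assumes only base R), we can integrate \\(f\\) by Monte Carlo: sample uniformly on the rectangle \\([-1,1] \\times [0,1]\\), which has area 2, evaluate \\(f\\) only where \\(x^2 \\leq y\\), and average.
set.seed(1)
nsamps <- 100000
x <- runif(nsamps, -1, 1)
y <- runif(nsamps, 0, 1)
f <- ifelse(x^2 <= y, (21 / 4) * x^2 * y, 0)
2 * mean(f) ## should be close to 1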
5.3 Transformations Exercise 5.6 Let \\((X,Y,Z)\\) be uniformly distributed on the unit ball \\(\\{(x,y,z) : x^2 + y^2 + z^2 \\leq 1\\}\\). Let \\(R = \\sqrt{X^2 + Y^2 + Z^2}\\). Find the CDF and PDF of \\(R\\). Solution. \\[\\begin{align} P(R < r) &= P(\\sqrt{X^2 + Y^2 + Z^2} < r) \\\\ &= P(X^2 + Y^2 + Z^2 < r^2) \\\\ &= \\frac{\\frac{4}{3} \\pi r^3}{\\frac{4}{3}\\pi} \\\\ &= r^3. \\end{align}\\] The second line shows us that we are looking at the probability that the point falls inside a smaller ball with radius \\(r\\). To get the probability, we divide the volume of this smaller ball by the volume of the whole unit ball. We get the PDF by differentiating the CDF, so \\(p(r) = 3r^2\\). "],["integ.html", "Chapter 6 Integration 6.1 Monte Carlo integration 6.2 Lebesgue integrals", " Chapter 6 Integration This chapter deals with abstract and Monte Carlo integration. The students are expected to acquire the following knowledge: Theoretical How to calculate Lebesgue integrals for non-simple functions. R Monte Carlo integration. 6.1 Monte Carlo integration Exercise 6.1 Let \\(X\\) and \\(Y\\) be continuous random variables on the unit interval and \\(p(x,y) = 6(x - y)^2\\). Use Monte Carlo integration to estimate the probability \\(P(0.2 \\leq X \\leq 0.5, \\: 0.1 \\leq Y \\leq 0.2)\\). Can you find the exact value? set.seed(1) nsamps <- 1000 V <- (0.5 - 0.2) * (0.2 - 0.1) x1 <- runif(nsamps, 0.2, 0.5) x2 <- runif(nsamps, 0.1, 0.2) f_1 <- function (x, y) { return (6 * (x - y)^2) } mcint <- V * (1 / nsamps) * sum(f_1(x1, x2)) sdm <- sqrt((V^2 / nsamps) * var(f_1(x1, x2))) mcint ## [1] 0.008793445 sdm ## [1] 0.0002197686 F_1 <- function (x, y) { return (2 * x^3 * y - 3 * x^2 * y^2 + 2 * x * y^3) } F_1(0.5, 0.2) - F_1(0.2, 0.2) - F_1(0.5, 0.1) + F_1(0.2, 0.1) ## [1] 0.0087 6.2 Lebesgue integrals Exercise 6.2 (borrowed from Jagannathan) Find the Lebesgue integral of the following functions on (\\(\\mathbb{R}\\), \\(\\mathcal{B}(\\mathbb{R})\\), \\(\\lambda\\)). \\[\\begin{align} f(\\omega) = \\begin{cases} \\omega, & \\text{for } \\omega = 0,1,...,n \\\\ 0, & \\text{elsewhere} \\end{cases} \\end{align}\\] \\[\\begin{align} f(\\omega) = \\begin{cases} 1, & \\text{for } \\omega \\in \\mathbb{Q}^c \\cap [0,1] \\\\ 0, & \\text{elsewhere} \\end{cases} \\end{align}\\] \\[\\begin{align} f(\\omega) = \\begin{cases} n, & \\text{for } \\omega \\in \\mathbb{Q}^c \\cap [0,n] \\\\ 0, & \\text{elsewhere} \\end{cases} \\end{align}\\] Solution. \\[\\begin{align} \\int f(\\omega) d\\lambda = \\sum_{\\omega = 0}^n \\omega \\lambda(\\omega) = 0. \\end{align}\\] \\[\\begin{align} \\int f(\\omega) d\\lambda = 1 \\times \\lambda(\\mathbb{Q}^c \\cap [0,1]) = 1. \\end{align}\\] \\[\\begin{align} \\int f(\\omega) d\\lambda = n \\times \\lambda(\\mathbb{Q}^c \\cap [0,n]) = n^2. \\end{align}\\] Exercise 6.3 (borrowed from Jagannathan) Let \\(c \\in \\mathbb{R}\\) be fixed and (\\(\\mathbb{R}\\), \\(\\mathcal{B}(\\mathbb{R})\\)) a measurable space. If for any Borel set \\(A\\), \\(\\delta_c (A) = 1\\) if \\(c \\in A\\), and \\(\\delta_c (A) = 0\\) otherwise, then \\(\\delta_c\\) is called a Dirac measure. Let \\(g\\) be a non-negative, measurable function. Show that \\(\\int g d \\delta_c = g(c)\\). Solution.
\\[\\begin{align} \\int g d \\delta_c &= \\sup_{q \\in S(g)} \\int q d \\delta_c \\\\ &= \\sup_{q \\in S(g)} \\sum_{i = 1}^n a_i \\delta_c(A_i) \\\\ &= \\sup_{q \\in S(g)} \\sum_{i = 1}^n a_i \\text{I}_{A_i}(c) \\\\ &= \\sup_{q \\in S(g)} q(c) \\\\ &= g(c) \\end{align}\\] "],["ev.html", "Chapter 7 Expected value 7.1 Discrete random variables 7.2 Continuous random variables 7.3 Sums, functions, conditional expectations 7.4 Covariance", " Chapter 7 Expected value This chapter deals with expected values of random variables. The students are expected to acquire the following knowledge: Theoretical Calculation of the expected value. Calculation of variance and covariance. Cauchy distribution. R Estimation of expected value. Estimation of variance and covariance. .fold-btn { float: right; margin: 5px 5px 0 0; } .fold { border: 1px solid black; min-height: 40px; } 7.1 Discrete random variables Exercise 7.1 (Bernoulli) Let \\(X \\sim \\text{Bernoulli}(p)\\). Find \\(E[X]\\). Find \\(Var[X]\\). R: Let \\(p = 0.4\\). Check your answers to a) and b) with a simulation. Solution. \\[\\begin{align*} E[X] = \\sum_{k=0}^1 p^k (1-p)^{1-k} k = p. \\end{align*}\\] \\[\\begin{align*} Var[X] = E[X^2] - E[X]^2 = \\sum_{k=0}^1 (p^k (1-p)^{1-k} k^2) - p^2 = p(1-p). \\end{align*}\\] set.seed(1) nsamps <- 1000 x <- rbinom(nsamps, 1, 0.4) mean(x) ## [1] 0.394 var(x) ## [1] 0.239003 0.4 * (1 - 0.4) ## [1] 0.24 Exercise 7.2 (Binomial) Let \\(X \\sim \\text{Binomial}(n,p)\\). Find \\(E[X]\\). Find \\(Var[X]\\). Solution. Let \\(X = \\sum_{i=0}^n X_i\\), where \\(X_i \\sim \\text{Bernoulli}(p)\\). Then, due to linearity of expectation \\[\\begin{align*} E[X] = E[\\sum_{i=0}^n X_i] = \\sum_{i=0}^n E[X_i] = np. \\end{align*}\\] Again let \\(X = \\sum_{i=0}^n X_i\\), where \\(X_i \\sim \\text{Bernoulli}(p)\\). Since the Bernoulli variables \\(X_i\\) are independent we have \\[\\begin{align*} Var[X] = Var[\\sum_{i=0}^n X_i] = \\sum_{i=0}^n Var[X_i] = np(1-p). \\end{align*}\\] Exercise 7.3 (Poisson) Let \\(X \\sim \\text{Poisson}(\\lambda)\\). Find \\(E[X]\\). Find \\(Var[X]\\). Solution. \\[\\begin{align*} E[X] &= \\sum_{k=0}^\\infty \\frac{\\lambda^k e^{-\\lambda}}{k!} k & \\\\ &= \\sum_{k=1}^\\infty \\frac{\\lambda^k e^{-\\lambda}}{k!} k & \\text{term at $k=0$ is 0} \\\\ &= e^{-\\lambda} \\lambda \\sum_{k=1}^\\infty \\frac{\\lambda^{k-1}}{(k - 1)!} & \\\\ &= e^{-\\lambda} \\lambda \\sum_{k=0}^\\infty \\frac{\\lambda^{k}}{k!} & \\\\ &= e^{-\\lambda} \\lambda e^\\lambda & \\\\ &= \\lambda. \\end{align*}\\] \\[\\begin{align*} Var[X] &= E[X^2] - E[X]^2 & \\\\ &= e^{-\\lambda} \\lambda \\sum_{k=1}^\\infty k \\frac{\\lambda^{k-1}}{(k - 1)!} - \\lambda^2 & \\\\ &= e^{-\\lambda} \\lambda \\sum_{k=1}^\\infty (k - 1) + 1) \\frac{\\lambda^{k-1}}{(k - 1)!} - \\lambda^2 & \\\\ &= e^{-\\lambda} \\lambda \\big(\\sum_{k=1}^\\infty (k - 1) \\frac{\\lambda^{k-1}}{(k - 1)!} + \\sum_{k=1}^\\infty \\frac{\\lambda^{k-1}}{(k - 1)!}\\Big) - \\lambda^2 & \\\\ &= e^{-\\lambda} \\lambda \\big(\\lambda\\sum_{k=2}^\\infty \\frac{\\lambda^{k-2}}{(k - 2)!} + e^\\lambda\\Big) - \\lambda^2 & \\\\ &= e^{-\\lambda} \\lambda \\big(\\lambda e^\\lambda + e^\\lambda\\Big) - \\lambda^2 & \\\\ &= \\lambda^2 + \\lambda - \\lambda^2 & \\\\ &= \\lambda. \\end{align*}\\] Exercise 7.4 (Geometric) Let \\(X \\sim \\text{Geometric}(p)\\). Find \\(E[X]\\). Hint: \\(\\frac{d}{dx} x^k = k x^{(k - 1)}\\). Solution. 
\\[\\begin{align*} E[X] &= \\sum_{k=0}^\\infty (1 - p)^k p k & \\\\ &= p (1 - p) \\sum_{k=0}^\\infty (1 - p)^{k-1} k & \\\\ &= p (1 - p) \\sum_{k=0}^\\infty -\\frac{d}{dp}(1 - p)^k & \\\\ &= p (1 - p) \\Big(-\\frac{d}{dp}\\Big) \\sum_{k=0}^\\infty (1 - p)^k & \\\\ &= p (1 - p) \\Big(-\\frac{d}{dp}\\Big) \\frac{1}{1 - (1 - p)} & \\text{geometric series} \\\\ &= \\frac{1 - p}{p} \\end{align*}\\] 7.2 Continuous random variables Exercise 7.5 (Gamma) Let \\(X \\sim \\text{Gamma}(\\alpha, \\beta)\\). Hint: \\(\\Gamma(z) = \\int_0^\\infty t^{z-1}e^{-t} dt\\) and \\(\\Gamma(z + 1) = z \\Gamma(z)\\). Find \\(E[X]\\). Find \\(Var[X]\\). R: Let \\(\\alpha = 10\\) and \\(\\beta = 2\\). Plot the density of \\(X\\). Add a horizontal line at the expected value that touches the density curve (geom_segment). Shade the area within a standard deviation of the expected value. Solution. \\[\\begin{align*} E[X] &= \\int_0^\\infty \\frac{\\beta^\\alpha}{\\Gamma(\\alpha)}x^\\alpha e^{-\\beta x} dx & \\\\ &= \\frac{\\beta^\\alpha}{\\Gamma(\\alpha)} \\int_0^\\infty x^\\alpha e^{-\\beta x} dx & \\text{ (let $t = \\beta x$)} \\\\ &= \\frac{\\beta^\\alpha}{\\Gamma(\\alpha) }\\int_0^\\infty \\frac{t^\\alpha}{\\beta^\\alpha} e^{-t} \\frac{dt}{\\beta} & \\\\ &= \\frac{1}{\\beta \\Gamma(\\alpha) }\\int_0^\\infty t^\\alpha e^{-t} dt & \\\\ &= \\frac{\\Gamma(\\alpha + 1)}{\\beta \\Gamma(\\alpha)} & \\\\ &= \\frac{\\alpha \\Gamma(\\alpha)}{\\beta \\Gamma(\\alpha)} & \\\\ &= \\frac{\\alpha}{\\beta}. & \\end{align*}\\] \\[\\begin{align*} Var[X] &= E[X^2] - E[X]^2 \\\\ &= \\int_0^\\infty \\frac{\\beta^\\alpha}{\\Gamma(\\alpha)}x^{\\alpha+1} e^{-\\beta x} dx - \\frac{\\alpha^2}{\\beta^2} \\\\ &= \\frac{\\Gamma(\\alpha + 2)}{\\beta^2 \\Gamma(\\alpha)} - \\frac{\\alpha^2}{\\beta^2} \\\\ &= \\frac{(\\alpha + 1)\\alpha\\Gamma(\\alpha)}{\\beta^2 \\Gamma(\\alpha)} - \\frac{\\alpha^2}{\\beta^2} \\\\ &= \\frac{\\alpha^2 + \\alpha}{\\beta^2} - \\frac{\\alpha^2}{\\beta^2} \\\\ &= \\frac{\\alpha}{\\beta^2}. \\end{align*}\\] set.seed(1) x <- seq(0, 25, by = 0.01) y <- dgamma(x, shape = 10, rate = 2) df <- data.frame(x = x, y = y) ggplot(df, aes(x = x, y = y)) + geom_line() + geom_segment(aes(x = 5, y = 0, xend = 5, yend = dgamma(5, shape = 10, rate = 2)), color = "red") + stat_function(fun = dgamma, args = list(shape = 10, rate = 2), xlim = c(5 - sqrt(10/4), 5 + sqrt(10/4)), geom = "area", fill = "gray", alpha = 0.4) Exercise 7.6 (Beta) Let \\(X \\sim \\text{Beta}(\\alpha, \\beta)\\). Find \\(E[X]\\). Hint 1: \\(\\text{B}(x,y) = \\int_0^1 t^{x-1} (1 - t)^{y-1} dt\\). Hint 2: \\(\\text{B}(x + 1, y) = \\text{B}(x,y)\\frac{x}{x + y}\\). Find \\(Var[X]\\). Solution. \\[\\begin{align*} E[X] &= \\int_0^1 \\frac{x^{\\alpha - 1} (1 - x)^{\\beta - 1}}{\\text{B}(\\alpha, \\beta)} x dx \\\\ &= \\frac{1}{\\text{B}(\\alpha, \\beta)}\\int_0^1 x^{\\alpha} (1 - x)^{\\beta - 1} dx \\\\ &= \\frac{1}{\\text{B}(\\alpha, \\beta)} \\text{B}(\\alpha + 1, \\beta) \\\\ &= \\frac{1}{\\text{B}(\\alpha, \\beta)} \\text{B}(\\alpha, \\beta) \\frac{\\alpha}{\\alpha + \\beta} \\\\ &= \\frac{\\alpha}{\\alpha + \\beta}. 
\\\\ \\end{align*}\\] \\[\\begin{align*} Var[X] &= E[X^2] - E[X]^2 \\\\ &= \\int_0^1 \\frac{x^{\\alpha - 1} (1 - x)^{\\beta - 1}}{\\text{B}(\\alpha, \\beta)} x^2 dx - \\frac{\\alpha^2}{(\\alpha + \\beta)^2} \\\\ &= \\frac{1}{\\text{B}(\\alpha, \\beta)}\\int_0^1 x^{\\alpha + 1} (1 - x)^{\\beta - 1} dx - \\frac{\\alpha^2}{(\\alpha + \\beta)^2} \\\\ &= \\frac{1}{\\text{B}(\\alpha, \\beta)} \\text{B}(\\alpha + 2, \\beta) - \\frac{\\alpha^2}{(\\alpha + \\beta)^2} \\\\ &= \\frac{1}{\\text{B}(\\alpha, \\beta)} \\text{B}(\\alpha + 1, \\beta) \\frac{\\alpha + 1}{\\alpha + \\beta + 1} - \\frac{\\alpha^2}{(\\alpha + \\beta)^2} \\\\ &= \\frac{\\alpha + 1}{\\alpha + \\beta + 1} \\frac{\\alpha}{\\alpha + \\beta} - \\frac{\\alpha^2}{(\\alpha + \\beta)^2}\\\\ &= \\frac{\\alpha \\beta}{(\\alpha + \\beta)^2(\\alpha + \\beta + 1)}. \\end{align*}\\] Exercise 7.7 (Exponential) Let \\(X \\sim \\text{Exp}(\\lambda)\\). Find \\(E[X]\\). Hint: \\(\\Gamma(z + 1) = z\\Gamma(z)\\) and \\(\\Gamma(1) = 1\\). Find \\(Var[X]\\). Solution. \\[\\begin{align*} E[X] &= \\int_0^\\infty \\lambda e^{-\\lambda x} x dx & \\\\ &= \\lambda \\int_0^\\infty x e^{-\\lambda x} dx & \\\\ &= \\lambda \\int_0^\\infty \\frac{t}{\\lambda} e^{-t} \\frac{dt}{\\lambda} & \\text{$t = \\lambda x$}\\\\ &= \\lambda \\lambda^{-2} \\Gamma(2) & \\text{definition of gamma function} \\\\ &= \\lambda^{-1}. \\end{align*}\\] \\[\\begin{align*} Var[X] &= E[X^2] - E[X]^2 & \\\\ &= \\int_0^\\infty \\lambda e^{-\\lambda x} x^2 dx - \\lambda^{-2} & \\\\ &= \\lambda \\int_0^\\infty \\frac{t^2}{\\lambda^2} e^{-t} \\frac{dt}{\\lambda} - \\lambda^{-2} & \\text{$t = \\lambda x$} \\\\ &= \\lambda \\lambda^{-3} \\Gamma(3) - \\lambda^{-2} & \\text{definition of gamma function} & \\\\ &= \\lambda^{-2} 2 \\Gamma(2) - \\lambda^{-2} & \\\\ &= 2 \\lambda^{-2} - \\lambda^{-2} & \\\\ &= \\lambda^{-2}. & \\\\ \\end{align*}\\] Exercise 7.8 (Normal) Let \\(X \\sim \\text{N}(\\mu, \\sigma)\\). Show that \\(E[X] = \\mu\\). Hint: Use the error function \\(\\text{erf}(x) = \\frac{1}{\\sqrt(\\pi)} \\int_{-x}^x e^{-t^2} dt\\). The statistical interpretation of this function is that if \\(Y \\sim \\text{N}(0, 0.5)\\), then the error function describes the probability of \\(Y\\) falling between \\(-x\\) and \\(x\\). Also, \\(\\text{erf}(\\infty) = 1\\). Show that \\(Var[X] = \\sigma^2\\). Hint: Start with the definition of variance. Solution. \\[\\begin{align*} E[X] &= \\int_{-\\infty}^\\infty \\frac{1}{\\sqrt{2\\pi \\sigma^2}} e^{-\\frac{(x - \\mu)^2}{2\\sigma^2}} x dx & \\\\ &= \\frac{1}{\\sqrt{2\\pi \\sigma^2}} \\int_{-\\infty}^\\infty x e^{-\\frac{(x - \\mu)^2}{2\\sigma^2}} dx & \\\\ &= \\frac{1}{\\sqrt{2\\pi \\sigma^2}} \\int_{-\\infty}^\\infty \\Big(t \\sqrt{2\\sigma^2} + \\mu\\Big)e^{-t^2} \\sqrt{2 \\sigma^2} dt & t = \\frac{x - \\mu}{\\sqrt{2}\\sigma} \\\\ &= \\frac{\\sqrt{2\\sigma^2}}{\\sqrt{\\pi}} \\int_{-\\infty}^\\infty t e^{-t^2} dt + \\frac{1}{\\sqrt{\\pi}} \\int_{-\\infty}^\\infty \\mu e^{-t^2} dt & \\\\ \\end{align*}\\] Let us calculate these integrals separately. \\[\\begin{align*} \\int t e^{-t^2} dt &= -\\frac{1}{2}\\int e^{s} ds & s = -t^2 \\\\ &= -\\frac{e^s}{2} + C \\\\ &= -\\frac{e^{-t^2}}{2} + C & \\text{undoing substitution}. \\end{align*}\\] Inserting the integration limits we get \\[\\begin{align*} \\int_{-\\infty}^\\infty t e^{-t^2} dt &= 0, \\end{align*}\\] due to the integrated function being symmetric. 
Reordering the second integral we get \\[\\begin{align*} \\mu \\frac{1}{\\sqrt{\\pi}} \\int_{-\\infty}^\\infty e^{-t^2} dt &= \\mu \\text{erf}(\\infty) & \\text{definition of error function} \\\\ &= \\mu & \\text{probability of $Y$ falling between $-\\infty$ and $\\infty$}. \\end{align*}\\] Combining all of the above we get \\[\\begin{align*} E[X] &= \\frac{\\sqrt{2\\sigma^2}}{\\sqrt{\\pi}} \\times 0 + \\mu &= \\mu.\\\\ \\end{align*}\\] \\[\\begin{align*} Var[X] &= E[(X - E[X])^2] \\\\ &= E[(X - \\mu)^2] \\\\ &= \\frac{1}{\\sqrt{2\\pi \\sigma^2}} \\int_{-\\infty}^\\infty (x - \\mu)^2 e^{-\\frac{(x - \\mu)^2}{2\\sigma^2}} dx \\\\ &= \\frac{\\sigma^2}{\\sqrt{2\\pi}} \\int_{-\\infty}^\\infty t^2 e^{-\\frac{t^2}{2}} dt \\\\ &= \\frac{\\sigma^2}{\\sqrt{2\\pi}} \\bigg(\\Big(- t e^{-\\frac{t^2}{2}} |_{-\\infty}^\\infty \\Big) + \\int_{-\\infty}^\\infty e^{-\\frac{t^2}{2}} \\bigg) dt & \\text{integration by parts} \\\\ &= \\frac{\\sigma^2}{\\sqrt{2\\pi}} \\sqrt{2 \\pi} \\int_{-\\infty}^\\infty \\frac{1}{\\sqrt(\\pi)}e^{-s^2} \\bigg) & s = \\frac{t}{\\sqrt{2}} \\text{ and evaluating the left expression at the bounds} \\\\ &= \\frac{\\sigma^2}{\\sqrt{2\\pi}} \\sqrt{2 \\pi} \\Big(\\text{erf}(\\infty) & \\text{definition of error function} \\\\ &= \\sigma^2. \\end{align*}\\] 7.3 Sums, functions, conditional expectations Exercise 7.9 (Expectation of transformations) Let \\(X\\) follow a normal distribution with mean \\(\\mu\\) and variance \\(\\sigma^2\\). Find \\(E[2X + 4]\\). Find \\(E[X^2]\\). Find \\(E[\\exp(X)]\\). Hint: Use the error function \\(\\text{erf}(x) = \\frac{1}{\\sqrt(\\pi)} \\int_{-x}^x e^{-t^2} dt\\). Also, \\(\\text{erf}(\\infty) = 1\\). R: Check your results numerically for \\(\\mu = 0.4\\) and \\(\\sigma^2 = 0.25\\) and plot the densities of all four distributions. Solution. \\[\\begin{align} E[2X + 4] &= 2E[X] + 4 & \\text{linearity of expectation} \\\\ &= 2\\mu + 4. \\\\ \\end{align}\\] \\[\\begin{align} E[X^2] &= E[X]^2 + Var[X] & \\text{definition of variance} \\\\ &= \\mu^2 + \\sigma^2. \\end{align}\\] \\[\\begin{align} E[\\exp(X)] &= \\int_{-\\infty}^\\infty \\frac{1}{\\sqrt{2\\pi \\sigma^2}} e^{-\\frac{(x - \\mu)^2}{2\\sigma^2}} e^x dx & \\\\ &= \\frac{1}{\\sqrt{2\\pi \\sigma^2}} \\int_{-\\infty}^\\infty e^{\\frac{2 \\sigma^2 x}{2\\sigma^2} -\\frac{(x - \\mu)^2}{2\\sigma^2}} dx & \\\\ &= \\frac{1}{\\sqrt{2\\pi \\sigma^2}} \\int_{-\\infty}^\\infty e^{-\\frac{x^2 - 2x(\\mu + \\sigma^2) + \\mu^2}{2\\sigma^2}} dx & \\\\ &= \\frac{1}{\\sqrt{2\\pi \\sigma^2}} \\int_{-\\infty}^\\infty e^{-\\frac{(x - (\\mu + \\sigma^2))^2 + \\mu^2 - (\\mu + \\sigma^2)^2}{2\\sigma^2}} dx & \\text{complete the square} \\\\ &= \\frac{1}{\\sqrt{2\\pi \\sigma^2}} e^{\\frac{- \\mu^2 + (\\mu + \\sigma^2)^2}{2\\sigma^2}} \\int_{-\\infty}^\\infty e^{-\\frac{(x - (\\mu + \\sigma^2))^2}{2\\sigma^2}} dx & \\\\ &= \\frac{1}{\\sqrt{2\\pi \\sigma^2}} e^{\\frac{- \\mu^2 + (\\mu + \\sigma^2)^2}{2\\sigma^2}} \\sigma \\sqrt{2 \\pi} \\text{erf}(\\infty) & \\\\ &= e^{\\frac{2\\mu + \\sigma^2}{2}}. \\end{align}\\] set.seed(1) mu <- 0.4 sigma <- 0.5 x <- rnorm(100000, mean = mu, sd = sigma) mean(2*x + 4) ## [1] 4.797756 2 * mu + 4 ## [1] 4.8 mean(x^2) ## [1] 0.4108658 mu^2 + sigma^2 ## [1] 0.41 mean(exp(x)) ## [1] 1.689794 exp((2 * mu + sigma^2) / 2) ## [1] 1.690459 Exercise 7.10 (Sum of independent random variables) Borrowed from Wasserman. Let \\(X_1, X_2,...,X_n\\) be IID random variables with expected value \\(E[X_i] = \\mu\\) and variance \\(Var[X_i] = \\sigma^2\\). 
Find the expected value and variance of \\(\\bar{X} = \\frac{1}{n} \\sum_{i=1}^n X_i\\). \\(\\bar{X}\\) is called a statistic (a function of the values in a sample). It is itself a random variable and its distribution is called a sampling distribution. R: Take \\(n = 5, 10, 100, 1000\\) samples from the N(\\(2\\), \\(6\\)) distribution 10000 times. Plot the theoretical density and the densities of \\(\\bar{X}\\) statistic for each \\(n\\). Intuitively, are the results in correspondence with your calculations? Check them numerically. Solution. Let us start with the expectation of \\(\\bar{X}\\). \\[\\begin{align} E[\\bar{X}] &= E[\\frac{1}{n} \\sum_{i=1}^n X_i] & \\\\ &= \\frac{1}{n} E[\\sum_{i=1}^n X_i] & \\text{ (multiplication with a scalar)} \\\\ &= \\frac{1}{n} \\sum_{i=1}^n E[X_i] & \\text{ (linearity)} \\\\ &= \\frac{1}{n} n \\mu & \\\\ &= \\mu. \\end{align}\\] Now the variance \\[\\begin{align} Var[\\bar{X}] &= Var[\\frac{1}{n} \\sum_{i=1}^n X_i] & \\\\ &= \\frac{1}{n^2} Var[\\sum_{i=1}^n X_i] & \\text{ (multiplication with a scalar)} \\\\ &= \\frac{1}{n^2} \\sum_{i=1}^n Var[X_i] & \\text{ (independence of samples)} \\\\ &= \\frac{1}{n^2} n \\sigma^2 & \\\\ &= \\frac{1}{n} \\sigma^2. \\end{align}\\] set.seed(1) nsamps <- 10000 mu <- 2 sigma <- sqrt(6) N <- c(5, 10, 100, 500) X <- matrix(data = NA, nrow = nsamps, ncol = length(N)) ind <- 1 for (n in N) { for (i in 1:nsamps) { X[i,ind] <- mean(rnorm(n, mu, sigma)) } ind <- ind + 1 } colnames(X) <- N X <- melt(as.data.frame(X)) ggplot(data = X, aes(x = value, colour = variable)) + geom_density() + stat_function(data = data.frame(x = seq(-2, 6, by = 0.01)), aes(x = x), fun = dnorm, args = list(mean = mu, sd = sigma), color = "black") Exercise 7.11 (Conditional expectation) Let \\(X \\in \\mathbb{R}_0^+\\) and \\(Y \\in \\mathbb{N}_0\\) be random variables with joint distribution \\(p_{XY}(X,Y) = \\frac{1}{y + 1} e^{-\\frac{x}{y + 1}} 0.5^{y + 1}\\). Find \\(E[X | Y = y]\\) by first finding \\(p_Y\\) and then \\(p_{X|Y}\\). Find \\(E[X]\\). R: check your answers to a) and b) by drawing 10000 samples from \\(p_Y\\) and \\(p_{X|Y}\\). Solution. \\[\\begin{align} p(y) &= \\int_0^\\infty \\frac{1}{y + 1} e^{-\\frac{x}{y + 1}} 0.5^{y + 1} dx \\\\ &= \\frac{0.5^{y + 1}}{y + 1} \\int_0^\\infty e^{-\\frac{x}{y + 1}} dx \\\\ &= \\frac{0.5^{y + 1}}{y + 1} (y + 1) \\\\ &= 0.5^{y + 1} \\\\ &= 0.5(1 - 0.5)^y. \\end{align}\\] We recognize this as the geometric distribution. \\[\\begin{align} p(x|y) &= \\frac{p(x,y)}{p(y)} \\\\ &= \\frac{1}{y + 1} e^{-\\frac{x}{y + 1}}. \\end{align}\\] We recognize this as the exponential distribution. \\[\\begin{align} E[X | Y = y] &= \\int_0^\\infty x \\frac{1}{y + 1} e^{-\\frac{x}{y + 1}} dx \\\\ &= y + 1 & \\text{expected value of the exponential distribution} \\end{align}\\] Use the law of iterated expectation. \\[\\begin{align} E[X] &= E[E[X | Y]] \\\\ &= E[Y + 1] \\\\ &= E[Y] + 1 \\\\ &= \\frac{1 - 0.5}{0.5} + 1 \\\\ &= 2. \\end{align}\\] set.seed(1) y <- rgeom(100000, 0.5) x <- rexp(100000, rate = 1 / (y + 1)) x2 <- x[y == 3] mean(x2) ## [1] 4.048501 3 + 1 ## [1] 4 mean(x) ## [1] 2.007639 (1 - 0.5) / 0.5 + 1 ## [1] 2 Exercise 7.12 (Cauchy distribution) Let \\(p(x | x_0, \\gamma) = \\frac{1}{\\pi \\gamma \\Big(1 + \\big(\\frac{x - x_0}{\\gamma}\\big)^2\\Big)}\\). A random variable with this PDF follows a Cauchy distribution. This distribution is symmetric and has wider tails than the normal distribution. R: Draw \\(n = 1,...,1000\\) samples from a standard normal and \\(\\text{Cauchy}(0, 1)\\). 
For each \\(n\\) plot the mean and the median of the sample using facets. Interpret the results. To get a mathematical explanation of the results in a), evaluate the integral \\(\\int_0^\\infty \\frac{x}{1 + x^2} dx\\) and consider that \\(E[X] = \\int_{-\\infty}^\\infty \\frac{x}{1 + x^2}dx\\). set.seed(1) n <- 1000 means_n <- vector(mode = "numeric", length = n) means_c <- vector(mode = "numeric", length = n) medians_n <- vector(mode = "numeric", length = n) medians_c <- vector(mode = "numeric", length = n) for (i in 1:n) { tmp_n <- rnorm(i) tmp_c <- rcauchy(i) means_n[i] <- mean(tmp_n) means_c[i] <- mean(tmp_c) medians_n[i] <- median(tmp_n) medians_c[i] <- median(tmp_c) } df <- data.frame("distribution" = c(rep("normal", 2 * n), rep("Cauchy", 2 * n)), "type" = c(rep("mean", n), rep("median", n), rep("mean", n), rep("median", n)), "value" = c(means_n, medians_n, means_c, medians_c), "n" = rep(1:n, times = 4)) ggplot(df, aes(x = n, y = value)) + geom_line(alpha = 0.5) + facet_wrap(~ type + distribution, scales = "free") Solution. \\[\\begin{align} \\int_0^\\infty \\frac{x}{1 + x^2} dx &= \\frac{1}{2} \\int_1^\\infty \\frac{1}{u} du & u = 1 + x^2 \\\\ &= \\frac{1}{2} \\ln(u) \\Big|_1^\\infty. \\end{align}\\] This integral is not finite. The same holds for the negative part. Therefore, the expectation is undefined, as \\(E[|X|] = \\infty\\). Why can we not just claim that \\(f(x) = x / (1 + x^2)\\) is odd and \\(\\int_{-\\infty}^\\infty f(x) = 0\\)? By definition of the Lebesgue integral \\(\\int_{-\\infty}^{\\infty} f= \\int_{-\\infty}^{\\infty} f_+-\\int_{-\\infty}^{\\infty} f_-\\). At least one of the two integrals needs to be finite for \\(\\int_{-\\infty}^{\\infty} f\\) to be well-defined. However \\(\\int_{-\\infty}^{\\infty} f_+=\\int_0^{\\infty} x/(1+x^2)\\) and \\(\\int_{-\\infty}^{\\infty} f_-=\\int_{-\\infty}^{0} |x|/(1+x^2)\\). We have just shown that both of these integrals are infinite, so the difference is of the form \\(\\infty - \\infty\\) and the integral is undefined. 7.4 Covariance Exercise 7.13 Below is a table of values for random variables \\(X\\) and \\(Y\\): \\(X\\): 2.1, -0.5, 1, -2, 4 and \\(Y\\): 8, 11, 10, 12, 9. Find the sample covariance of \\(X\\) and \\(Y\\). Find the sample variances of \\(X\\) and \\(Y\\). Find the sample correlation of \\(X\\) and \\(Y\\). Find the sample variance of \\(Z = 2X - 3Y\\). Solution. \\(\\bar{X} = 0.92\\) and \\(\\bar{Y} = 10\\). \\[\\begin{align} s(X, Y) &= \\frac{1}{n - 1} \\sum_{i=1}^5 (X_i - 0.92) (Y_i - 10) \\\\ &= -3.175. \\end{align}\\] \\[\\begin{align} s(X) &= \\frac{\\sum_{i=1}^5(X_i - 0.92)^2}{5 - 1} \\\\ &= 5.357. \\end{align}\\] \\[\\begin{align} s(Y) &= \\frac{\\sum_{i=1}^5(Y_i - 10)^2}{5 - 1} \\\\ &= 2.5. \\end{align}\\] \\[\\begin{align} r(X,Y) &= \\frac{s(X,Y)}{\\sqrt{s(X)s(Y)}} \\\\ &= \\frac{-3.175}{\\sqrt{5.357 \\times 2.5}} \\\\ &= -0.868. \\end{align}\\] \\[\\begin{align} s(Z) &= 2^2 s(X) + (-3)^2 s(Y) + 2 \\times 2 \\times (-3) s(X, Y) \\\\ &= 4 \\times 5.357 + 9 \\times 2.5 - 12 \\times (-3.175) \\\\ &= 82.028. \\end{align}\\] Exercise 7.14 Let \\(X \\sim \\text{Uniform}(0,1)\\) and \\(Y | X = x \\sim \\text{Uniform}(0,x)\\). Find the covariance of \\(X\\) and \\(Y\\). Find the correlation of \\(X\\) and \\(Y\\). R: check your answers to a) and b) with simulation. Plot \\(X\\) against \\(Y\\) on a scatterplot. Solution. The joint PDF is \\(p(x,y) = p(x)p(y|x) = \\frac{1}{x}\\).
\\[\\begin{align} Cov(X,Y) &= E[XY] - E[X]E[Y] \\\\ \\end{align}\\] Let us first evaluate the first term: \\[\\begin{align} E[XY] &= \\int_0^1 \\int_0^x x y \\frac{1}{x} dy dx \\\\ &= \\int_0^1 \\int_0^x y dy dx \\\\ &= \\int_0^1 \\frac{x^2}{2} dx \\\\ &= \\frac{1}{6}. \\end{align}\\] Now let us find \\(E[Y]\\), \\(E[X]\\) is trivial. \\[\\begin{align} E[Y] = E[E[Y | X]] = E[\\frac{X}{2}] = \\frac{1}{2} \\int_0^1 x dx = \\frac{1}{4}. \\end{align}\\] Combining all: \\[\\begin{align} Cov(X,Y) &= \\frac{1}{6} - \\frac{1}{2} \\frac{1}{4} = \\frac{1}{24}. \\end{align}\\] \\[\\begin{align} \\rho(X,Y) &= \\frac{Cov(X,Y)}{\\sqrt{Var[X]Var[Y]}} \\\\ \\end{align}\\] Let us calculate \\(Var[X]\\). \\[\\begin{align} Var[X] &= E[X^2] - \\frac{1}{4} \\\\ &= \\int_0^1 x^2 - \\frac{1}{4} \\\\ &= \\frac{1}{3} - \\frac{1}{4} \\\\ &= \\frac{1}{12}. \\end{align}\\] Let us calculate \\(E[E[Y^2|X]]\\). \\[\\begin{align} E[E[Y^2|X]] &= E[\\frac{x^2}{3}] \\\\ &= \\frac{1}{9}. \\end{align}\\] Then \\(Var[Y] = \\frac{1}{9} - \\frac{1}{16} = \\frac{7}{144}\\). Combining all \\[\\begin{align} \\rho(X,Y) &= \\frac{\\frac{1}{24}}{\\sqrt{\\frac{1}{12}\\frac{7}{144}}} \\\\ &= 0.65. \\end{align}\\] set.seed(1) nsamps <- 10000 x <- runif(nsamps) y <- runif(nsamps, 0, x) cov(x, y) ## [1] 0.04274061 1/24 ## [1] 0.04166667 cor(x, y) ## [1] 0.6629567 (1 / 24) / (sqrt(7 / (12 * 144))) ## [1] 0.6546537 ggplot(data.frame(x = x, y = y), aes(x = x, y = y)) + geom_point(alpha = 0.2) + geom_smooth(method = "lm") "],["mrv.html", "Chapter 8 Multivariate random variables 8.1 Multinomial random variables 8.2 Multivariate normal random variables 8.3 Transformations", " Chapter 8 Multivariate random variables This chapter deals with multivariate random variables. The students are expected to acquire the following knowledge: Theoretical Multinomial distribution. Multivariate normal distribution. Cholesky decomposition. Eigendecomposition. R Sampling from the multinomial distribution. Sampling from the multivariate normal distribution. Matrix decompositions. .fold-btn { float: right; margin: 5px 5px 0 0; } .fold { border: 1px solid black; min-height: 40px; } 8.1 Multinomial random variables Exercise 8.1 Let \\(X_i\\), \\(i = 1,...,k\\) represent \\(k\\) events, and \\(p_i\\) the probabilities of these events happening in a trial. Let \\(n\\) be the number of trials, and \\(X\\) a multivariate random variable, the collection of \\(X_i\\). Then \\(p(x) = \\frac{n!}{x_1!x_2!...x_k!} p_1^{x_1} p_2^{x_2}...p_k^{x_k}\\) is the PMF of a multinomial distribution, where \\(n = \\sum_{i = 1}^k x_i\\). Show that the marginal distribution of \\(X_i\\) is a binomial distribution. Take 1000 samples from the multinomial distribution with \\(n=4\\) and probabilities \\(p = (0.2, 0.2, 0.5, 0.1)\\). Then take 1000 samples from four binomial distributions with the same parameters. Inspect the results visually. Solution. We will approach this proof from the probabilistic point of view. W.L.O.G. let \\(x_1\\) be the marginal distribution we are interested in. The term \\(p^{x_1}\\) denotes the probability that event 1 happened \\(x_1\\) times. For this event not to happen, one of the other events needs to happen. So for each of the remaining trials, the probability of another event is \\(\\sum_{i=2}^k p_i = 1 - p_1\\), and there were \\(n - x_1\\) such trials. What is left to do is to calculate the number of permutations of event 1 happening and event 1 not happening. We choose \\(x_1\\) trials, from \\(n\\) trials. 
Therefore \\(p(x_1) = \\binom{n}{x_1} p_1^{x_1} (1 - p_1)^{n - x_1}\\), which is the binomial PMF. Interested students are encouraged to prove this mathematically. set.seed(1) nsamps <- 1000 samps_mult <- rmultinom(nsamps, 4, prob = c(0.2, 0.2, 0.5, 0.1)) samps_mult <- as_tibble(t(samps_mult)) %>% gather() samps <- tibble( V1 = rbinom(nsamps, 4, 0.2), V2 = rbinom(nsamps, 4, 0.2), V3 = rbinom(nsamps, 4, 0.5), V4 = rbinom(nsamps, 4, 0.1) ) %>% gather() %>% bind_rows(samps_mult) %>% bind_cols("dist" = c(rep("binomial", 4*nsamps), rep("multinomial", 4*nsamps))) ggplot(samps, aes(x = value, fill = dist)) + geom_bar(position = "dodge") + facet_wrap(~ key) Exercise 8.2 (Multinomial expected value) Find the expected value, variance and covariance of the multinomial distribution. Hint: First find the expected value for \\(n = 1\\) and then use the fact that the trials are independent. Solution. Let us first calculate the expected value of \\(X_1\\), when \\(n = 1\\). \\[\\begin{align} E[X_1] &= \\sum_{n_1 = 0}^1 \\sum_{n_2 = 0}^1 ... \\sum_{n_k = 0}^1 \\frac{1}{n_1!n_2!...n_k!}p_1^{n_1}p_2^{n_2}...p_k^{n_k}n_1 \\\\ &= \\sum_{n_1 = 0}^1 \\frac{p_1^{n_1} n_1}{n_1!} \\sum_{n_2 = 0}^1 ... \\sum_{n_k = 0}^1 \\frac{1}{n_2!...n_k!}p_2^{n_2}...p_k^{n_k} \\end{align}\\] When \\(n_1 = 0\\) then the whole terms is zero, so we do not need to evaluate other sums. When \\(n_1 = 1\\), all other \\(n_i\\) must be zero, as we have \\(1 = \\sum_{i=1}^k n_i\\). Therefore the other sums equal \\(1\\). So \\(E[X_1] = p_1\\) and \\(E[X_i] = p_i\\) for \\(i = 1,...,k\\). Now let \\(Y_j\\), \\(j = 1,...,n\\), have a multinomial distribution with \\(n = 1\\), and let \\(X\\) have a multinomial distribution with an arbitrary \\(n\\). Then we can write \\(X = \\sum_{j=1}^n Y_j\\). And due to independence \\[\\begin{align} E[X] &= E[\\sum_{j=1}^n Y_j] \\\\ &= \\sum_{j=1}^n E[Y_j] \\\\ &= np. \\end{align}\\] For the variance, we need \\(E[X^2]\\). Let us follow the same procedure as above and first calculate \\(E[X_i]\\) for \\(n = 1\\). The only thing that changes is that the term \\(n_i\\) becomes \\(n_i^2\\). Since we only have \\(0\\) and \\(1\\) this does not change the outcome. So \\[\\begin{align} Var[X_i] &= E[X_i^2] - E[X_i]^2\\\\ &= p_i(1 - p_i). \\end{align}\\] Analogous to above for arbitrary \\(n\\) \\[\\begin{align} Var[X] &= E[X^2] - E[X]^2 \\\\ &= \\sum_{j=1}^n E[Y_j^2] - \\sum_{j=1}^n E[Y_j]^2 \\\\ &= \\sum_{j=1}^n E[Y_j^2] - E[Y_j]^2 \\\\ &= \\sum_{j=1}^n p(1-p) \\\\ &= np(1-p). \\end{align}\\] To calculate the covariance, we need \\(E[X_i X_j]\\). Again, let us start with \\(n = 1\\). Without loss of generality, let us assume \\(i = 1\\) and \\(j = 2\\). \\[\\begin{align} E[X_1 X_2] = \\sum_{n_1 = 0}^1 \\sum_{n_2 = 0}^1 \\frac{p_1^{n_1} n_1}{n_1!} \\frac{p_2^{n_2} n_2}{n_2!} \\sum_{n_3 = 0}^1 ... \\sum_{n_k = 0}^1 \\frac{1}{n_3!...n_k!}p_3^{n_3}...p_k^{n_k}. \\end{align}\\] In the above expression, at each iteration we multiply with \\(n_1\\) and \\(n_2\\). Since \\(n = 1\\), one of these always has to be zero. Therefore \\(E[X_1 X_2] = 0\\) and \\[\\begin{align} Cov(X_i, X_j) &= E[X_i X_j] - E[X_i]E[X_j] \\\\ &= - p_i p_j. \\end{align}\\] For arbitrary \\(n\\), let \\(X = \\sum_{t = 1}^n Y_t\\) be the sum of independent multinomial random variables \\(Y_t = [X_{1t}, X_{2t},...,X_{kt}]^T\\) with \\(n=1\\). Then \\(X_1 = \\sum_{t = 1}^n X_{1t}\\) and \\(X_2 = \\sum_{l = 1}^n X_{2l}\\). 
\\[\\begin{align} Cov(X_1, X_2) &= E[X_1 X_2] - E[X_1] E[X_2] \\\\ &= E[\\sum_{t = 1}^n X_{1t} \\sum_{l = 1}^n X_{2l}] - n^2 p_1 p_2 \\\\ &= \\sum_{t = 1}^n \\sum_{l = 1}^n E[X_{1t} X_{2l}] - n^2 p_1 p_2. \\end{align}\\] For \\(X_{1t}\\) and \\(X_{2l}\\) the expected value is zero when \\(t = l\\). When \\(t \\neq l\\) then they are independent, so the expected value is the product \\(p_1 p_2\\). There are \\(n^2\\) total terms, and for \\(n\\) of them \\(t = l\\) holds. So \\(E[X_1 X_2] = (n^2 - n) p_1 p_2\\). Inserting into the above \\[\\begin{align} Cov(X_1, X_2) &= (n^2 - n) p_1 p_2 - n^2 p_1 p_2 \\\\ &= - n p_1 p_2. \\end{align}\\] 8.2 Multivariate normal random variables Exercise 8.3 (Cholesky decomposition) Let \\(X\\) be a random vector of length \\(k\\) with \\(X_i \\sim \\text{N}(0, 1)\\) and \\(LL^*\\) the Cholesky decomposition of a Hermitian positive-definite matrix \\(A\\). Let \\(\\mu\\) be a vector of length \\(k\\). Find the distribution of the random vector \\(Y = \\mu + L X\\). Find the Cholesky decomposition of \\(A = \\begin{bmatrix} 2 & 1.2 \\\\ 1.2 & 1 \\end{bmatrix}\\). R: Use the results from a) and b) to sample from the MVN distribution \\(\\text{N}(\\mu, A)\\), where \\(\\mu = [1.5, -1]^T\\). Plot a scatterplot and compare it to direct samples from the multivariate normal distribution (rmvnorm). Solution. \\(X\\) has an independent normal distribution of dimension \\(k\\). Then \\[\\begin{align} Y = \\mu + L X &\\sim \\text{N}(\\mu, LL^T) \\\\ &\\sim \\text{N}(\\mu, A). \\end{align}\\] Solve \\[\\begin{align} \\begin{bmatrix} a & 0 \\\\ b & c \\end{bmatrix} \\begin{bmatrix} a & b \\\\ 0 & c \\end{bmatrix} = \\begin{bmatrix} 2 & 1.2 \\\\ 1.2 & 1 \\end{bmatrix} \\end{align}\\] # a set.seed(1) nsamps <- 1000 X <- matrix(data = rnorm(nsamps * 2), ncol = 2) mu <- c(1.5, -1) L <- matrix(data = c(sqrt(2), 0, 1.2 / sqrt(2), sqrt(1 - 1.2^2/2)), ncol = 2, byrow = TRUE) Y <- t(mu + L %*% t(X)) plot_df <- data.frame(rbind(X, Y), c(rep("X", nsamps), rep("Y", nsamps))) colnames(plot_df) <- c("D1", "D2", "var") ggplot(data = plot_df, aes(x = D1, y = D2, colour = as.factor(var))) + geom_point() Exercise 8.4 (Eigendecomposition) R: Let \\(\\Sigma = U \\Lambda U^T\\) be the eigendecomposition of covariance matrix \\(\\Sigma\\). Follow the procedure below, to sample from a multivariate normal with \\(\\mu = [-2, 1]^T\\) and \\(\\Sigma = \\begin{bmatrix} 0.3, -0.5 \\\\ -0.5, 1.6 \\end{bmatrix}\\): Sample from two independent standardized normal distributions to get \\(X\\). Find the eigendecomposition of \\(X\\) (eigen). Multiply \\(X\\) by \\(\\Lambda^{\\frac{1}{2}}\\) to get \\(X2\\). Consider how the eigendecomposition for \\(X2\\) changes compared to \\(X\\). Multiply \\(X2\\) by \\(U\\) to get \\(X3\\). Consider how the eigendecomposition for \\(X3\\) changes compared to \\(X2\\). Add \\(\\mu\\) to \\(X3\\). Consider how the eigendecomposition for \\(X4\\) changes compared to \\(X3\\). Plot the data and the eigenvectors (scaled with \\(\\Lambda^{\\frac{1}{2}}\\)) at each step. Hint: Use geom_segment for the eigenvectors. 
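Before the step-by-step plotting solution below, here is a minimal numerical sanity check of the idea behind this exercise (a sketch added for intuition, not part of the original solution): scaling independent standard normals by \\(\\Lambda^{\\frac{1}{2}}\\) and then rotating by \\(U\\) should yield data whose sample covariance is approximately \\(\\Sigma\\), while adding \\(\\mu\\) afterwards changes only the location, not the eigendecomposition.
set.seed(2)
sigma <- matrix(c(0.3, -0.5, -0.5, 1.6), nrow = 2, byrow = TRUE)
ed    <- eigen(sigma)
Z     <- matrix(rnorm(2000 * 2), ncol = 2)               # independent standard normals
Z3    <- Z %*% diag(sqrt(ed$values)) %*% t(ed$vectors)   # scale by Lambda^(1/2), then rotate by U
round(cov(Z3), 2)                                        # approximately sigma
round(eigen(cov(Z3))$values, 2)                          # approximately ed$values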
# a set.seed(1) sigma <- matrix(data = c(0.3, -0.5, -0.5, 1.6), nrow = 2, byrow = TRUE) ed <- eigen(sigma) e_val <- ed$values e_vec <- ed$vectors # b set.seed(1) nsamps <- 1000 X <- matrix(data = rnorm(nsamps * 2), ncol = 2) vec1 <- matrix(c(1,0,0,1), nrow = 2) X2 <- t(sqrt(diag(e_val)) %*% t(X)) vec2 <- sqrt(diag(e_val)) %*% vec1 X3 <- t(e_vec %*% t(X2)) vec3 <- e_vec %*% vec2 X4 <- t(c(-2, 1) + t(X3)) vec4 <- c(-2, 1) + vec3 vec_mat <- data.frame(matrix(c(0,0,0,0,0,0,0,0,0,0,0,0,-2,1,-2,1), ncol = 2, byrow = TRUE), t(cbind(vec1, vec2, vec3, vec4)), c(1,1,2,2,3,3,4,4)) df <- data.frame(rbind(X, X2, X3, X4), c(rep(1, nsamps), rep(2, nsamps), rep(3, nsamps), rep(4, nsamps))) colnames(df) <- c("D1", "D2", "wh") colnames(vec_mat) <- c("D1", "D2", "E1", "E2", "wh") ggplot(data = df, aes(x = D1, y = D2)) + geom_point() + geom_segment(data = vec_mat, aes(xend = E1, yend = E2), color = "red") + facet_wrap(~ wh) + coord_fixed() Exercise 8.5 (Marginal and conditional distributions) Let \\(X \\sim \\text{N}(\\mu, \\Sigma)\\), where \\(\\mu = [2, 0, -1]^T\\) and \\(\\Sigma = \\begin{bmatrix} 1 & -0.2 & 0.5 \\\\ -0.2 & 1.4 & -1.2 \\\\ 0.5 & -1.2 & 2 \\\\ \\end{bmatrix}\\). Let \\(A\\) represent the first two random variables and \\(B\\) the third random variable. R: For the calculation in the following points, you can use R. Find the marginal distribution of \\(B\\). Find the conditional distribution of \\(B | A = [a_1, a_2]^T\\). Find the marginal distribution of \\(A\\). Find the conditional distribution of \\(A | B = b\\). R: Visually compare the distributions of a) and b), and c) and d) at three different conditional values. mu <- c(2, 0, -1) Sigma <- matrix(c(1, -0.2, 0.5, -0.2, 1.4, -1.2, 0.5, -1.2, 2), nrow = 3, byrow = TRUE) mu_A <- c(2, 0) mu_B <- -1 Sigma_A <- Sigma[1:2, 1:2] Sigma_B <- Sigma[3, 3] Sigma_AB <- Sigma[1:2, 3] # b tmp_b <- t(Sigma_AB) %*% solve(Sigma_A) mu_b <- mu_B - tmp_b %*% mu_A Sigma_b <- Sigma_B - t(Sigma_AB) %*% solve(Sigma_A) %*% Sigma_AB mu_b ## [,1] ## [1,] -1.676471 tmp_b ## [,1] [,2] ## [1,] 0.3382353 -0.8088235 Sigma_b ## [,1] ## [1,] 0.8602941 # d tmp_a <- Sigma_AB * (1 / Sigma_B) mu_a <- mu_A - tmp_a * mu_B Sigma_d <- Sigma_A - (Sigma_AB * (1 / Sigma_B)) %*% t(Sigma_AB) mu_a ## [1] 2.25 -0.60 tmp_a ## [1] 0.25 -0.60 Sigma_d ## [,1] [,2] ## [1,] 0.875 0.10 ## [2,] 0.100 0.68 Solution. \\(B \\sim \\text{N}(-1, 2)\\). \\(B | A = a \\sim \\text{N}(-1.68 + [0.34, -0.81] a, 0.86)\\). \\(\\mu_A = [2, 0]^T\\) and \\(\\Sigma_A = \\begin{bmatrix} 1 & -0.2 & \\\\ -0.2 & 1.4 \\\\ \\end{bmatrix}\\). 
\\[\\begin{align} A | B = b &\\sim \\text{N}(\\mu_t, \\Sigma_t), \\\\ \\mu_t &= [2.25, -0.6]^T + [0.25, -0.6]^T b, \\\\ \\Sigma_t &= \\begin{bmatrix} 0.875 & 0.1 \\\\ 0.1 & 0.68 \\\\ \\end{bmatrix} \\end{align}\\] library(mvtnorm) set.seed(1) nsamps <- 1000 # a and b samps <- as.data.frame(matrix(data = NA, nrow = 4 * nsamps, ncol = 2)) samps[1:nsamps,1] <- rnorm(nsamps, mu_B, Sigma_B) samps[1:nsamps,2] <- "marginal" for (i in 1:3) { a <- rmvnorm(1, mu_A, Sigma_A) samps[(i*nsamps + 1):((i + 1) * nsamps), 1] <- rnorm(nsamps, mu_b + tmp_b %*% t(a), Sigma_b) samps[(i*nsamps + 1):((i + 1) * nsamps), 2] <- paste0(# "cond", round(a, digits = 2), collapse = "-") } colnames(samps) <- c("x", "dist") ggplot(samps, aes(x = x)) + geom_density() + facet_wrap(~ dist) # c and d samps <- as.data.frame(matrix(data = NA, nrow = 4 * nsamps, ncol = 3)) samps[1:nsamps,1:2] <- rmvnorm(nsamps, mu_A, Sigma_A) samps[1:nsamps,3] <- "marginal" for (i in 1:3) { b <- rnorm(1, mu_B, Sigma_B) samps[(i*nsamps + 1):((i + 1) * nsamps), 1:2] <- rmvnorm(nsamps, mu_a + tmp_a * b, Sigma_d) samps[(i*nsamps + 1):((i + 1) * nsamps), 3] <- b } colnames(samps) <- c("x", "y", "dist") ggplot(samps, aes(x = x, y = y)) + geom_point() + geom_smooth(method = "lm") + facet_wrap(~ dist) 8.3 Transformations Exercise 8.6 Let \\((U,V)\\) be a random variable with PDF \\(p(u,v) = \\frac{1}{4 \\sqrt{u}}\\), \\(U \\in [0,4]\\) and \\(V \\in [\\sqrt{U}, \\sqrt{U} + 1]\\). Let \\(X = \\sqrt{U}\\) and \\(Y = V - \\sqrt{U}\\). Find PDF of \\((X,Y)\\). What can you tell about distributions of \\(X\\) and \\(Y\\)? This exercise shows how we can simplify a probabilistic problem with a clever use of transformations. R: Take 1000 samples from \\((X,Y)\\) and transform them with inverses of the above functions to get samples from \\((U,V)\\). Plot both sets of samples. Solution. First we need to find the inverse functions. Since \\(x = \\sqrt{u}\\) it follows that \\(u = x^2\\), and that \\(x \\in [0,2]\\). Similarly \\(v = y + x\\) and \\(y \\in [0,1]\\). Let us first find the Jacobian. \\[\\renewcommand\\arraystretch{1.6} J(x,y) = \\begin{bmatrix} \\frac{\\partial u}{\\partial x} & \\frac{\\partial v}{\\partial x} \\\\%[1ex] % <-- 1ex more space between rows of matrix \\frac{\\partial u}{\\partial y} & \\frac{\\partial v}{\\partial y} \\end{bmatrix} = \\begin{bmatrix} 2x & 1 \\\\%[1ex] % <-- 1ex more space between rows of matrix 0 & 1 \\end{bmatrix}, \\] and the determinant is \\(|J(x,y)| = 2x\\). Putting everything together, we get \\[\\begin{align} p_{X,Y}(x,y) = p_{U,V}(x^2, y + x) |J(x,y)| = \\frac{1}{4 \\sqrt{x^2}} 2x = \\frac{1}{2}. \\end{align}\\] This reminds us of the Uniform distribution. Indeed we can see that \\(p_X(x) = \\frac{1}{2}\\) and \\(p_Y(y) = 1\\). So instead of dealing with an awkward PDF of \\((U,V)\\) and the corresponding dynamic bounds, we are now looking at two independent Uniform random variables. In practice, this could make modeling much easier. set.seed(1) nsamps <- 2000 x <- runif(nsamps, min = 0, max = 2) y <- runif(nsamps) orig <- tibble(x = x, y = y, vrs = "original") u <- x^2 v <- y + x transf <- tibble(x = u, y = v, vrs = "transformed") df <- bind_rows(orig, transf) ggplot(df, aes(x = x, y = y, color = vrs)) + geom_point(alpha = 0.3) Exercise 8.7 R: Write a function that will calculate the probability density of an arbitraty multivariate normal distribution, based on independent standardized normal PDFs. Compare with dmvnorm from the mvtnorm package. 
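As a reminder of the identity the solution below relies on (an added note, using the standard change-of-variables formula): if \\(Z \\sim \\text{N}(0, I)\\) and \\(Y = \\mu + LZ\\) with \\(\\Sigma = LL^T\\) for a lower-triangular Cholesky factor \\(L\\), then \\[\\begin{align} p_Y(y) = p_Z\\big(L^{-1}(y - \\mu)\\big) \\, |\\det L^{-1}| = \\frac{\\prod_{i=1}^m \\phi(z_i)}{|\\det L|}, \\quad z = L^{-1}(y - \\mu), \\end{align}\\] where \\(\\phi\\) is the standard normal PDF. In the R solution below the lower-triangular factor corresponds to t(chol(Sigma)), since chol in R returns the upper-triangular factor.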
library(mvtnorm) set.seed(1) mvn_dens <- function (y, mu, Sigma) { L <- chol(Sigma) L_inv <- solve(t(L)) g_inv <- L_inv %*% t(y - mu) J <- L_inv J_det <- det(J) return(prod(dnorm(g_inv)) * J_det) } mu_v <- c(-2, 0, 1) cov_m <- matrix(c(1, -0.2, 0.5, -0.2, 2, 0.3, 0.5, 0.3, 1.6), ncol = 3, byrow = TRUE) n_comp <- 20 for (i in 1:n_comp) { x <- rmvnorm(1, mean = mu_v, sigma = cov_m) print(paste0("My function: ", mvn_dens(x, mu_v, cov_m), ", dmvnorm: ", dmvnorm(x, mu_v, cov_m))) } ## [1] "My function: 0.0229514237156383, dmvnorm: 0.0229514237156383" ## [1] "My function: 0.00763138915406231, dmvnorm: 0.00763138915406231" ## [1] "My function: 0.0230688881105741, dmvnorm: 0.0230688881105741" ## [1] "My function: 0.0113616213114731, dmvnorm: 0.0113616213114731" ## [1] "My function: 0.00151808500121907, dmvnorm: 0.00151808500121907" ## [1] "My function: 0.0257658045974509, dmvnorm: 0.0257658045974509" ## [1] "My function: 0.0157963825730805, dmvnorm: 0.0157963825730805" ## [1] "My function: 0.00408856287529248, dmvnorm: 0.00408856287529248" ## [1] "My function: 0.0327793540101256, dmvnorm: 0.0327793540101256" ## [1] "My function: 0.0111606542967978, dmvnorm: 0.0111606542967978" ## [1] "My function: 0.0147636757585684, dmvnorm: 0.0147636757585684" ## [1] "My function: 0.0142948300412207, dmvnorm: 0.0142948300412207" ## [1] "My function: 0.0203093820657542, dmvnorm: 0.0203093820657542" ## [1] "My function: 0.0287533273357481, dmvnorm: 0.0287533273357481" ## [1] "My function: 0.0213402305128623, dmvnorm: 0.0213402305128623" ## [1] "My function: 0.0218356957993885, dmvnorm: 0.0218356957993885" ## [1] "My function: 0.0250750113961771, dmvnorm: 0.0250750113961771" ## [1] "My function: 0.0166498666348048, dmvnorm: 0.0166498666348048" ## [1] "My function: 0.00189725106874659, dmvnorm: 0.00189725106874659" ## [1] "My function: 0.0196697814975113, dmvnorm: 0.0196697814975113" "],["ard.html", "Chapter 9 Alternative representation of distributions 9.1 Probability generating functions (PGFs) 9.2 Moment generating functions (MGFs)", " Chapter 9 Alternative representation of distributions This chapter deals with alternative representation of distributions. The students are expected to acquire the following knowledge: Theoretical Probability generating functions. Moment generating functions. .fold-btn { float: right; margin: 5px 5px 0 0; } .fold { border: 1px solid black; min-height: 40px; } 9.1 Probability generating functions (PGFs) Exercise 9.1 Show that the sum of independent Poisson random variables is itself a Poisson random variable. R: Let \\(X\\) be a sum of three Poisson distributions with \\(\\lambda_i \\in \\{2, 5.2, 10\\}\\). Take 1000 samples and plot the three distributions and the sum. Then take 1000 samples from the theoretical distribution of \\(X\\) and compare them to the sum. Solution. Let \\(X_i \\sim \\text{Poisson}(\\lambda_i)\\) for \\(i = 1,...,n\\), and let \\(X = \\sum_{i=1}^n X_i\\). 
\\[\\begin{align} \\alpha_X(t) &= \\prod_{i=1}^n \\alpha_{X_i}(t) \\\\ &= \\prod_{i=1}^n \\bigg( \\sum_{j=0}^\\infty t^j \\frac{\\lambda_i^j e^{-\\lambda_i}}{j!} \\bigg) \\\\ &= \\prod_{i=1}^n \\bigg( e^{-\\lambda_i} \\sum_{j=0}^\\infty \\frac{(t\\lambda_i)^j }{j!} \\bigg) \\\\ &= \\prod_{i=1}^n \\bigg( e^{-\\lambda_i} e^{t \\lambda_i} \\bigg) & \\text{power series} \\\\ &= \\prod_{i=1}^n \\bigg( e^{\\lambda_i(t - 1)} \\bigg) \\\\ &= e^{\\sum_{i=1}^n \\lambda_i(t - 1)} \\\\ &= e^{t \\sum_{i=1}^n \\lambda_i - \\sum_{i=1}^n \\lambda_i} \\\\ &= e^{-\\sum_{i=1}^n \\lambda_i} \\sum_{j=0}^\\infty \\frac{(t \\sum_{i=1}^n \\lambda_i)^j}{j!}\\\\ &= \\sum_{j=0}^\\infty \\frac{e^{-\\sum_{i=1}^n \\lambda_i} (t \\sum_{i=1}^n \\lambda_i)^j}{j!}\\\\ \\end{align}\\] The last term is the PGF of a Poisson random variable with parameter \\(\\sum_{i=1}^n \\lambda_i\\). Because the PGF is unique, \\(X\\) is a Poisson random variable. set.seed(1) library(tidyr) nsamps <- 1000 samps <- matrix(data = NA, nrow = nsamps, ncol = 4) samps[ ,1] <- rpois(nsamps, 2) samps[ ,2] <- rpois(nsamps, 5.2) samps[ ,3] <- rpois(nsamps, 10) samps[ ,4] <- samps[ ,1] + samps[ ,2] + samps[ ,3] colnames(samps) <- c(2, 2.5, 10, "sum") gsamps <- as_tibble(samps) gsamps <- gather(gsamps, key = "dist", value = "value") ggplot(gsamps, aes(x = value)) + geom_bar() + facet_wrap(~ dist) samps <- cbind(samps, "theoretical" = rpois(nsamps, 2 + 5.2 + 10)) gsamps <- as_tibble(samps[ ,4:5]) gsamps <- gather(gsamps, key = "dist", value = "value") ggplot(gsamps, aes(x = value, fill = dist)) + geom_bar(position = "dodge") Exercise 9.2 Find the expected value and variance of the negative binomial distribution. Hint: Find the Taylor series of \\((1 - y)^{-r}\\) at point 0. Solution. Let \\(X \\sim \\text{NB}(r, p)\\). \\[\\begin{align} \\alpha_X(t) &= E[t^X] \\\\ &= \\sum_{j=0}^\\infty t^j \\binom{j + r - 1}{j} (1 - p)^r p^j \\\\ &= (1 - p)^r \\sum_{j=0}^\\infty \\binom{j + r - 1}{j} (tp)^j \\\\ &= (1 - p)^r \\sum_{j=0}^\\infty \\frac{(j + r - 1)(j + r - 2)...r}{j!} (tp)^j. \\\\ \\end{align}\\] Let us look at the Taylor series of \\((1 - y)^{-r}\\) at 0 \\[\\begin{align} (1 - y)^{-r} = &1 + \\frac{-r(-1)}{1!}y + \\frac{-r(-r - 1)(-1)^2}{2!}y^2 + \\\\ &\\frac{-r(-r - 1)(-r - 2)(-1)^3}{3!}y^3 + ... \\\\ \\end{align}\\] How does the \\(k\\)-th term look like? We have \\(k\\) derivatives of our function so \\[\\begin{align} \\frac{d^k}{d^k y} (1 - y)^{-r} &= \\frac{-r(-r - 1)...(-r - k + 1)(-1)^k}{k!}y^k \\\\ &= \\frac{r(r + 1)...(r + k - 1)}{k!}y^k. \\end{align}\\] We observe that this equals to the \\(j\\)-th term in the sum of NB PGF. Therefore \\[\\begin{align} \\alpha_X(t) &= (1 - p)^r (1 - tp)^{-r} \\\\ &= \\Big(\\frac{1 - p}{1 - tp}\\Big)^r \\end{align}\\] To find the expected value, we need to differentiate \\[\\begin{align} \\frac{d}{dt} \\Big(\\frac{1 - p}{1 - tp}\\Big)^r &= r \\Big(\\frac{1 - p}{1 - tp}\\Big)^{r-1} \\frac{d}{dt} \\frac{1 - p}{1 - tp} \\\\ &= r \\Big(\\frac{1 - p}{1 - tp}\\Big)^{r-1} \\frac{p(1 - p)}{(1 - tp)^2}. \\\\ \\end{align}\\] Evaluating this at 1, we get: \\[\\begin{align} E[X] = \\frac{rp}{1 - p}. \\end{align}\\] For the variance we need the second derivative. 
\\[\\begin{align} \\frac{d^2}{d^2t} \\Big(\\frac{1 - p}{1 - tp}\\Big)^r &= \\frac{p^2 r (r + 1) (\\frac{1 - p}{1 - tp})^r}{(tp - 1)^2} \\end{align}\\] Evaluating this at 1 and inserting the first derivatives, we get: \\[\\begin{align} Var[X] &= \\frac{d^2}{dt^2} \\alpha_X(1) + \\frac{d}{dt}\\alpha_X(1) - \\Big(\\frac{d}{dt}\\alpha_X(t) \\Big)^2 \\\\ &= \\frac{p^2 r (r + 1)}{(1 - p)^2} + \\frac{rp}{1 - p} - \\frac{r^2p^2}{(1 - p)^2} \\\\ &= \\frac{rp}{(1 - p)^2}. \\end{align}\\] library(tidyr) set.seed(1) nsamps <- 100000 find_p <- function (mu, r) { return (10 / (r + 10)) } r <- c(1,2,10,20) p <- find_p(10, r) sigma <- rep(sqrt(p*r / (1 - p)^2), each = nsamps) samps <- cbind("r=1" = rnbinom(nsamps, size = r[1], prob = 1 - p[1]), "r=2" = rnbinom(nsamps, size = r[2], prob = 1 - p[2]), "r=4" = rnbinom(nsamps, size = r[3], prob = 1 - p[3]), "r=20" = rnbinom(nsamps, size = r[4], prob = 1 - p[4])) gsamps <- gather(as.data.frame(samps)) iw <- (gsamps$value > sigma + 10) | (gsamps$value < sigma - 10) ggplot(gsamps, aes(x = value, fill = iw)) + geom_bar() + # geom_density() + facet_wrap(~ key) 9.2 Moment generating functions (MGFs) Exercise 9.3 Find the variance of the geometric distribution. Solution. Let \\(X \\sim \\text{Geometric}(p)\\). The MGF of the geometric distribution is \\[\\begin{align} M_X(t) &= E[e^{tX}] \\\\ &= \\sum_{k=0}^\\infty p(1 - p)^k e^{tk} \\\\ &= p \\sum_{k=0}^\\infty ((1 - p)e^t)^k. \\end{align}\\] Let us assume that \\((1 - p)e^t < 1\\). Then, by using the geometric series we get \\[\\begin{align} M_X(t) &= \\frac{p}{1 - e^t + pe^t}. \\end{align}\\] The first derivative of the above expression is \\[\\begin{align} \\frac{d}{dt}M_X(t) &= \\frac{-p(-e^t + pe^t)}{(1 - e^t + pe^t)^2}, \\end{align}\\] and evaluating at \\(t = 0\\), we get \\(\\frac{1 - p}{p}\\), which we already recognize as the expected value of the geometric distribution. The second derivative is \\[\\begin{align} \\frac{d^2}{dt^2}M_X(t) &= \\frac{(p-1)pe^t((p-1)e^t - 1)}{((p - 1)e^t + 1)^3}, \\end{align}\\] and evaluating at \\(t = 0\\), we get \\(\\frac{(p - 1)(p - 2)}{p^2}\\). Combining we get the variance \\[\\begin{align} Var(X) &= \\frac{(p - 1)(p - 2)}{p^2} - \\frac{(1 - p)^2}{p^2} \\\\ &= \\frac{(p-1)(p-2) - (1-p)^2}{p^2} \\\\ &= \\frac{1 - p}{p^2}. \\end{align}\\] Exercise 9.4 Find the distribution of sum of two normal random variables \\(X\\) and \\(Y\\), by comparing \\(M_{X+Y}(t)\\) to \\(M_X(t)\\). R: To illustrate the result draw random samples from N\\((-3, 1)\\) and N\\((5, 1.2)\\) and calculate the empirical mean and variance of \\(X+Y\\). Plot all three histograms in one plot. Solution. Let \\(X \\sim \\text{N}(\\mu_X, 1)\\) and \\(Y \\sim \\text{N}(\\mu_Y, 1)\\). The MGF of the sum is \\[\\begin{align} M_{X+Y}(t) &= M_X(t) M_Y(t). \\end{align}\\] Let us calculate \\(M_X(t)\\), the MGF for \\(Y\\) then follows analogously. 
\\[\\begin{align} M_X(t) &= \\int_{-\\infty}^\\infty e^{tx} \\frac{1}{\\sqrt{2 \\pi \\sigma_X^2}} e^{-\\frac{(x - \\mu_X)^2}{2\\sigma_X^2}} dx \\\\ &= \\int_{-\\infty}^\\infty \\frac{1}{\\sqrt{2 \\pi \\sigma_X^2}} e^{-\\frac{(x - \\mu_X)^2 - 2\\sigma_X^2 tx}{2\\sigma_X^2}} dx \\\\ &= \\int_{-\\infty}^\\infty \\frac{1}{\\sqrt{2 \\pi \\sigma_X^2}} e^{-\\frac{x^2 - 2x(\\mu_X + \\sigma_X^2 t) + \\mu_X^2}{2\\sigma_X^2}} dx \\\\ &= \\int_{-\\infty}^\\infty \\frac{1}{\\sqrt{2 \\pi \\sigma_X^2}} e^{-\\frac{(x - (\\mu_X + \\sigma_X^2 t))^2 + \\mu_X^2 - (\\mu_X + \\sigma_X^2 t)^2}{2\\sigma_X^2}} dx & \\text{complete the square}\\\\ &= e^{-\\frac{\\mu_X^2 - (\\mu_X + \\sigma_X^2 t)^2}{2\\sigma_X^2}} \\int_{-\\infty}^\\infty \\frac{1}{\\sqrt{2 \\pi \\sigma_X^2}} e^{-\\frac{(x - (\\mu_X + \\sigma_X^2 t))^2}{2\\sigma_X^2}} dx & \\\\ &= e^{-\\frac{\\mu_X^2 - (\\mu_X + \\sigma_X^2 t)^2}{2\\sigma_X^2}} & \\text{normal PDF integrates to 1} \\\\ &= e^{\\frac{2\\mu_X \\sigma_X^2 t + \\sigma_X^4 t^2}{2\\sigma_X^2}} \\\\ &= e^{\\mu_X t + \\frac{\\sigma_X^2 t^2}{2}}. \\\\ \\end{align}\\] The MGF of the sum is then \\[\\begin{align} M_{X+Y}(t) &= e^{\\mu_X t + \\frac{\\sigma_X^2 t^2}{2}} e^{\\mu_Y t + \\frac{\\sigma_Y^2 t^2}{2}} \\\\ &= e^{(\\mu_X + \\mu_Y) t + \\frac{(\\sigma_X^2 + \\sigma_Y^2) t^2}{2}}. \\end{align}\\] Comparing \\(M_{X+Y}(t)\\) with \\(M_X(t)\\), both have the same form: the coefficient of \\(t\\) is the mean and the coefficient of \\(\\frac{t^2}{2}\\) is the variance. Since MGFs are unique, we conclude that \\(Z = X + Y \\sim \\text{N}(\\mu_X + \\mu_Y, \\sigma_X^2 + \\sigma_Y^2)\\). library(tidyr) library(ggplot2) set.seed(1) nsamps <- 1000 x <- rnorm(nsamps, -3, 1) y <- rnorm(nsamps, 5, 1.2) z <- x + y mean(z) ## [1] 1.968838 var(z) ## [1] 2.645034 df <- data.frame(x = x, y = y, z = z) %>% gather() ggplot(df, aes(x = value, fill = key)) + geom_histogram(position = "dodge") "],["ci.html", "Chapter 10 Concentration inequalities 10.1 Comparison 10.2 Practical", " Chapter 10 Concentration inequalities This chapter deals with concentration inequalities. The students are expected to acquire the following knowledge: Theoretical More assumptions produce closer bounds. R Optimization. Estimating probability inequalities. 10.1 Comparison Exercise 10.1 R: Let \\(X\\) be a geometric random variable with \\(p = 0.7\\). Visually compare the Markov bound, the Chernoff bound, and the theoretical probabilities for \\(x = 1,...,12\\). To get the best-fitting Chernoff bound, you will need to optimize the bound depending on \\(t\\). Use either analytical or numerical optimization.
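If we take the analytical route (an added note; the solution below uses numerical optimization), the Chernoff bound for a geometric random variable with support \\(\\{0, 1, 2, ...\\}\\) is \\(P(X \\geq a) \\leq M_X(t) e^{-at} = \\frac{p}{1 - (1 - p)e^t} e^{-at}\\) for \\(0 < t < \\log \\frac{1}{1 - p}\\). Setting the derivative of its logarithm to zero gives \\[\\begin{align} \\frac{(1 - p)e^t}{1 - (1 - p)e^t} - a = 0 \\implies t^* = \\log \\frac{a}{(1 - p)(1 + a)}, \\end{align}\\] which for \\(p = 0.7\\) matches the numerically optimized values of \\(t\\) printed below (for example \\(t^* = \\log \\frac{1}{0.3 \\cdot 2} \\approx 0.511\\) at \\(a = 1\\)).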
bound_chernoff <- function (t, p, a) { return ((p / (1 - exp(t) + p * exp(t))) / exp(a * t)) } set.seed(1) p <- 0.7 a <- seq(1, 12, by = 1) ci_markov <- (1 - p) / p / a t <- vector(mode = "numeric", length = length(a)) for (i in 1:length(t)) { t[i] <- optimize(bound_chernoff, interval = c(0, log(1 / (1 - p))), p = p, a = a[i])$minimum } t ## [1] 0.5108267 0.7984981 0.9162927 0.9808238 1.0216635 1.0498233 1.0704327 ## [8] 1.0861944 1.0986159 1.1086800 1.1169653 1.1239426 ci_chernoff <- (p / (1 - exp(t) + p * exp(t))) / exp(a * t) actual <- 1 - pgeom(a, 0.7) plot_df <- rbind( data.frame(x = a, y = ci_markov, type = "Markov"), data.frame(x = a, y = ci_chernoff, type = "Chernoff"), data.frame(x = a, y = actual, type = "Actual") ) ggplot(plot_df, aes(x = x, y = y, color = type)) + geom_line() Exercise 10.2 R: Let \\(X\\) be a sum of 100 Beta distributions with random parameters. Take 1000 samples and plot the Chebyshev bound, Hoeffding bound, and the empirical probabilities. set.seed(1) nvars <- 100 nsamps <- 1000 samps <- matrix(data = NA, nrow = nsamps, ncol = nvars) Sn_mean <- 0 Sn_var <- 0 for (i in 1:nvars) { alpha1 <- rgamma(1, 10, 1) beta1 <- rgamma(1, 10, 1) X <- rbeta(nsamps, alpha1, beta1) Sn_mean <- Sn_mean + alpha1 / (alpha1 + beta1) Sn_var <- Sn_var + alpha1 * beta1 / ((alpha1 + beta1)^2 * (alpha1 + beta1 + 1)) samps[ ,i] <- X } mean(apply(samps, 1, sum)) ## [1] 51.12511 Sn_mean ## [1] 51.15723 var(apply(samps, 1, sum)) ## [1] 1.170652 Sn_var ## [1] 1.166183 a <- 1:30 b <- a / sqrt(Sn_var) ci_chebyshev <- 1 / b^2 ci_hoeffding <- 2 * exp(- 2 * a^2 / nvars) empirical <- NULL for (i in 1:length(a)) { empirical[i] <- sum(abs((apply(samps, 1, sum)) - Sn_mean) >= a[i])/ nsamps } plot_df <- rbind( data.frame(x = a, y = ci_chebyshev, type = "Chebyshev"), data.frame(x = a, y = ci_hoeffding, type = "Hoeffding"), data.frame(x = a, y = empirical, type = "Empirical") ) ggplot(plot_df, aes(x = x, y = y, color = type)) + geom_line() ggplot(plot_df, aes(x = x, y = y, color = type)) + geom_line() + coord_cartesian(xlim = c(15, 25), ylim = c(0, 0.05)) 10.2 Practical Exercise 10.3 From Jagannathan. Let \\(X_i\\), \\(i = 1,...n\\), be a random sample of size \\(n\\) of a random variable \\(X\\). Let \\(X\\) have mean \\(\\mu\\) and variance \\(\\sigma^2\\). Find the size of the sample \\(n\\) required so that the probability that the difference between sample mean and true mean is smaller than \\(\\frac{\\sigma}{10}\\) is at least 0.95. Hint: Derive a version of the Chebyshev inequality for \\(P(|X - \\mu| \\geq a)\\) using Markov inequality. Solution. Let \\(\\bar{X} = \\sum_{i=1}^n X_i\\). Then \\(E[\\bar{X}] = \\mu\\) and \\(Var[\\bar{X}] = \\frac{\\sigma^2}{n}\\). Let us first derive another representation of Chebyshev inequality. \\[\\begin{align} P(|X - \\mu| \\geq a) = P(|X - \\mu|^2 \\geq a^2) \\leq \\frac{E[|X - \\mu|^2]}{a^2} = \\frac{Var[X]}{a^2}. \\end{align}\\] Let us use that on our sampling distribution: \\[\\begin{align} P(|\\bar{X} - \\mu| \\geq \\frac{\\sigma}{10}) \\leq \\frac{100 Var[\\bar{X}]}{\\sigma^2} = \\frac{100 Var[X]}{n \\sigma^2} = \\frac{100}{n}. \\end{align}\\] We are interested in the difference being smaller, therefore \\[\\begin{align} P(|\\bar{X} - \\mu| < \\frac{\\sigma}{10}) = 1 - P(|\\bar{X} - \\mu| \\geq \\frac{\\sigma}{10}) \\geq 1 - \\frac{100}{n} \\geq 0.95. \\end{align}\\] It follows that we need a sample size of \\(n \\geq \\frac{100}{0.05} = 2000\\). 
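A quick simulation check of this bound (a sketch; the choice of Exp(1), where \\(\\mu = \\sigma = 1\\), is arbitrary): with \\(n = 2000\\) the empirical probability should be at least 0.95, and in fact it is much higher, because Chebyshev's inequality is conservative. A CLT-based calculation would suggest that roughly \\(n \\approx (1.96 \\cdot 10)^2 \\approx 385\\) already suffices.
set.seed(1)
nrep  <- 10000
n     <- 2000                       # sample size given by the Chebyshev bound
mu    <- 1                          # Exp(1): mean 1, standard deviation 1
sigma <- 1
hits  <- replicate(nrep, abs(mean(rexp(n, rate = 1)) - mu) < sigma / 10)
mean(hits)                          # empirical P(|mean - mu| < sigma / 10); at least 0.95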
"],["crv.html", "Chapter 11 Convergence of random variables", " Chapter 11 Convergence of random variables This chapter deals with convergence of random variables. The students are expected to acquire the following knowledge: Theoretical Finding convergences of random variables. .fold-btn { float: right; margin: 5px 5px 0 0; } .fold { border: 1px solid black; min-height: 40px; } Exercise 11.1 Let \\(X_1\\), \\(X_2\\),…, \\(X_n\\) be a sequence of Bernoulli random variables. Let \\(Y_n = \\frac{X_1 + X_2 + ... + X_n}{n^2}\\). Show that this sequence converges point-wise to the zero random variable. R: Use a simulation to check your answer. Solution. Let \\(\\epsilon\\) be arbitrary. We need to find such \\(n_0\\), that for every \\(n\\) greater than \\(n_0\\) \\(|Y_n| < \\epsilon\\) holds. \\[\\begin{align} |Y_n| &= |\\frac{X_1 + X_2 + ... + X_n}{n^2}| \\\\ &\\leq |\\frac{n}{n^2}| \\\\ &= \\frac{1}{n}. \\end{align}\\] So we need to find such \\(n_0\\), that for every \\(n > n_0\\) we will have \\(\\frac{1}{n} < \\epsilon\\). So \\(n_0 > \\frac{1}{\\epsilon}\\). x <- 1:1000 X <- matrix(data = NA, nrow = length(x), ncol = 100) y <- vector(mode = "numeric", length = length(x)) for (i in 1:length(x)) { X[i, ] <- rbinom(100, size = 1, prob = 0.5) } X <- apply(X, 2, cumsum) tmp_mat <- matrix(data = (1:1000)^2, nrow = 1000, ncol = 100) X <- X / tmp_mat y <- apply(X, 1, mean) ggplot(data.frame(x = x, y = y), aes(x = x, y = y)) + geom_line() Exercise 11.2 Let \\(\\Omega = [0,1]\\) and let \\(X_n\\) be a sequence of random variables, defined as \\[\\begin{align} X_n(\\omega) = \\begin{cases} \\omega^3, &\\omega = \\frac{i}{n}, &0 \\leq i \\leq 1 \\\\ 1, & \\text{otherwise.} \\end{cases} \\end{align}\\] Show that \\(X_n\\) converges almost surely to \\(X \\sim \\text{Uniform}(0,1)\\). Solution. We need to show \\(P(\\{\\omega: X_n(\\omega) \\rightarrow X(\\omega)\\}) = 1\\). Let \\(\\omega \\neq \\frac{i}{n}\\). Then for any \\(\\omega\\), \\(X_n\\) converges pointwise to \\(X\\): \\[\\begin{align} X_n(\\omega) = 1 \\implies |X_n(\\omega) - X(s)| = |1 - 1| < \\epsilon. \\end{align}\\] The above is independent of \\(n\\). Since there are countably infinite number of elements in the complement (\\(\\frac{i}{n}\\)), the probability of this set is 1. Exercise 11.3 Borrowed from Wasserman. Let \\(X_n \\sim \\text{N}(0, \\frac{1}{n})\\) and let \\(X\\) be a random variable with CDF \\[\\begin{align} F_X(x) = \\begin{cases} 0, &x < 0 \\\\ 1, &x \\geq 0. \\end{cases} \\end{align}\\] Does \\(X_n\\) converge to \\(X\\) in distribution? How about in probability? Prove or disprove these statement. R: Plot the CDF of \\(X_n\\) for \\(n = 1, 2, 5, 10, 100, 1000\\). Solution. Let us first check convergence in distribution. \\[\\begin{align} \\lim_{n \\rightarrow \\infty} F_{X_n}(x) &= \\lim_{n \\rightarrow \\infty} \\phi (\\sqrt(n) x). \\end{align}\\] We have two cases, for \\(x < 0\\) and \\(x > 0\\). We do not need to check for \\(x = 0\\), since \\(F_X\\) is not continuous in that point. \\[\\begin{align} \\lim_{n \\rightarrow \\infty} \\phi (\\sqrt(n) x) = \\begin{cases} 0, & x < 0 \\\\ 1, & x > 0. \\end{cases} \\end{align}\\] This is the same as \\(F_X\\). Let us now check convergence in probability. 
Since \\(X\\) is a point-mass distribution at zero, we have \\[\\begin{align} \\lim_{n \\rightarrow \\infty} P(|X_n| > \\epsilon) &= \\lim_{n \\rightarrow \\infty} (P(X_n > \\epsilon) + P(X_n < -\\epsilon)) \\\\ &= \\lim_{n \\rightarrow \\infty} (1 - P(X_n < \\epsilon) + P(X_n < -\\epsilon)) \\\\ &= \\lim_{n \\rightarrow \\infty} (1 - \\phi(\\sqrt{n} \\epsilon) + \\phi(- \\sqrt{n} \\epsilon)) \\\\ &= 0. \\end{align}\\] n <- c(1,2,5,10,100,1000) ggplot(data = data.frame(x = seq(-5, 5, by = 0.01)), aes(x = x)) + stat_function(fun = pnorm, args = list(mean = 0, sd = 1/1), aes(color = "sd = 1/1")) + stat_function(fun = pnorm, args = list(mean = 0, sd = 1/2), aes(color = "sd = 1/2")) + stat_function(fun = pnorm, args = list(mean = 0, sd = 1/5), aes(color = "sd = 1/5")) + stat_function(fun = pnorm, args = list(mean = 0, sd = 1/10), aes(color = "sd = 1/10")) + stat_function(fun = pnorm, args = list(mean = 0, sd = 1/100), aes(color = "sd = 1/100")) + stat_function(fun = pnorm, args = list(mean = 0, sd = 1/1000), aes(color = "sd = 1/1000")) + stat_function(fun = pnorm, args = list(mean = 0, sd = 1/10000), aes(color = "sd = 1/10000")) Exercise 11.4 Let \\(X_i\\) be i.i.d. and \\(\\mu = E(X_1)\\). Let variance of \\(X_1\\) be finite. Show that the mean of \\(X_i\\), \\(\\bar{X}_n = \\frac{1}{n}\\sum_{i=1}^n X_i\\) converges in quadratic mean to \\(\\mu\\). Solution. \\[\\begin{align} \\lim_{n \\rightarrow \\infty} E(|\\bar{X_n} - \\mu|^2) &= \\lim_{n \\rightarrow \\infty} E(\\bar{X_n}^2 - 2 \\bar{X_n} \\mu + \\mu^2) \\\\ &= \\lim_{n \\rightarrow \\infty} (E(\\bar{X_n}^2) - 2 \\mu E(\\frac{\\sum_{i=1}^n X_i}{n}) + \\mu^2) \\\\ &= \\lim_{n \\rightarrow \\infty} E(\\bar{X_n})^2 + \\lim_{n \\rightarrow \\infty} Var(\\bar{X_n}) - 2 \\mu^2 + \\mu^2 \\\\ &= \\lim_{n \\rightarrow \\infty} \\frac{n^2 \\mu^2}{n^2} + \\lim_{n \\rightarrow \\infty} \\frac{\\sigma^2}{n} - \\mu^2 \\\\ &= \\mu^2 - \\mu^2 + \\lim_{n \\rightarrow \\infty} \\frac{\\sigma^2}{n} \\\\ &= 0. \\end{align}\\] "],["lt.html", "Chapter 12 Limit theorems", " Chapter 12 Limit theorems This chapter deals with limit theorems. The students are expected to acquire the following knowledge: Theoretical Monte Carlo integration convergence. Difference between weak and strong law of large numbers. .fold-btn { float: right; margin: 5px 5px 0 0; } .fold { border: 1px solid black; min-height: 40px; } Exercise 12.1 Show that Monte Carlo integration converges almost surely to the true integral of a bounded function. Solution. Let \\(g\\) be a function defined on \\(\\Omega\\). Let \\(X_i\\), \\(i = 1,...,n\\) be i.i.d. (multivariate) uniform random variables with bounds defined on \\(\\Omega\\). Let \\(Y_i\\) = \\(g(X_i)\\). Then it follows that \\(Y_i\\) are also i.i.d. random variables and their expected value is \\(E[g(X)] = \\int_{\\Omega} g(x) f_X(x) dx = \\frac{1}{V_{\\Omega}} \\int_{\\Omega} g(x) dx\\). By the strong law of large numbers, we have \\[\\begin{equation} \\frac{1}{n}\\sum_{i=1}^n Y_i \\xrightarrow{\\text{a.s.}} E[g(X)]. \\end{equation}\\] It follows that \\[\\begin{equation} V_{\\Omega} \\frac{1}{n}\\sum_{i=1}^n Y_i \\xrightarrow{\\text{a.s.}} \\int_{\\Omega} g(x) dx. \\end{equation}\\] Exercise 12.2 Let \\(X\\) be a geometric random variable with probability 0.5 and support in positive integers. Let \\(Y = 2^X (-1)^X X^{-1}\\). Find the expected value of \\(Y\\) by using conditional convergence (this variable does not have an expected value in the conventional sense – the series is not absolutely convergent). 
R: Draw \\(10000\\) samples from a geometric distribution with probability 0.5 and support in positive integers to get \\(X\\). Then calculate \\(Y\\) and plot the means at each iteration (sample). Additionally, plot the expected value calculated in a. Try it with different seeds. What do you notice? Solution. \\[\\begin{align*} E[Y] &= \\sum_{x=1}^{\\infty} \\frac{2^x (-1)^x}{x} 0.5^x \\\\ &= \\sum_{x=1}^{\\infty} \\frac{(-1)^x}{x} \\\\ &= - \\sum_{x=1}^{\\infty} \\frac{(-1)^{x+1}}{x} \\\\ &= - \\ln(2) \\end{align*}\\] set.seed(3) x <- rgeom(100000, prob = 0.5) + 1 y <- 2^x * (-1)^x * x^{-1} y_means <- cumsum(y) / seq_along(y) df <- data.frame(x = 1:length(y_means), y = y_means) ggplot(data = df, aes(x = x, y = y)) + geom_line() + geom_hline(yintercept = -log(2)) "],["eb.html", "Chapter 13 Estimation basics 13.1 ECDF 13.2 Properties of estimators", " Chapter 13 Estimation basics This chapter deals with estimation basics. The students are expected to acquire the following knowledge: Biased and unbiased estimators. Consistent estimators. Empirical cumulative distribution function. .fold-btn { float: right; margin: 5px 5px 0 0; } .fold { border: 1px solid black; min-height: 40px; } 13.1 ECDF Exercise 13.1 (ECDF intuition) Take any univariate continuous distribution that is readily available in R and plot its CDF (\\(F\\)). Draw one sample (\\(n = 1\\)) from the chosen distribution and draw the ECDF (\\(F_n\\)) of that one sample. Use the definition of the ECDF, not an existing function in R. Implementation hint: ECDFs are always piecewise constant - they only jump at the sampled values and by \\(1/n\\). Repeat (b) for \\(n = 5, 10, 100, 1000...\\) Theory says that \\(F_n\\) should converge to \\(F\\). Can you observe that? For \\(n = 100\\) repeat the process \\(m = 20\\) times and plot every \\(F_n^{(m)}\\). Theory says that \\(F_n\\) will converge to \\(F\\) the slowest where \\(F\\) is close to 0.5 (where the variance is largest). Can you observe that? library(ggplot2) set.seed(1) ggplot(data = data.frame(x = seq(-5, 5, by = 0.01))) + # stat_function(aes(x = x), fun = pbeta, args = list(shape1 = 1, shape2 = 2)) stat_function(aes(x = x), fun = pnorm, args = list(mean = 0, sd = 1)) one_samp <- rnorm(1) X <- data.frame(x = c(-5, one_samp, 5), y = c(0,1,1)) ggplot(data = data.frame(x = seq(-5, 5, by = 0.01))) + # stat_function(aes(x = x), fun = pbeta, args = list(shape1 = 1, shape2 = 2)) stat_function(aes(x = x), fun = pnorm, args = list(mean = 0, sd = 1)) + geom_step(data = X, aes(x = x, y = y)) N <- c(5, 10, 100, 1000) X <- NULL for (n in N) { tmp <- rnorm(n) tmp_X <- data.frame(x = c(-5, sort(tmp), 5), y = c(0, seq(1/n, 1, by = 1/n), 1), n = n) X <- rbind(X, tmp_X) } ggplot(data = data.frame(x = seq(-5, 5, by = 0.01))) + # stat_function(aes(x = x), fun = pbeta, args = list(shape1 = 1, shape2 = 2)) stat_function(aes(x = x), fun = pnorm, args = list(mean = 0, sd = 1)) + geom_step(data = X, aes(x = x, y = y, color = as.factor(n))) + labs(color = "N") 13.2 Properties of estimators Exercise 13.2 Show that the sample average is, as an estimator of the mean: unbiased, consistent, asymptotically normal. Solution. \\[\\begin{align*} E[\\frac{1}{n} \\sum_{i=1}^n X_i] &= \\frac{1}{n} \\sum_{i=i}^n E[X_i] \\\\ &= E[X]. 
\\end{align*}\\] \\[\\begin{align*} \\lim_{n \\rightarrow \\infty} P(|\\frac{1}{n} \\sum_{i=1}^n X_i - E[X]| > \\epsilon) &= \\lim_{n \\rightarrow \\infty} P((\\frac{1}{n} \\sum_{i=1}^n X_i - E[X])^2 > \\epsilon^2) \\\\ & \\leq \\lim_{n \\rightarrow \\infty} \\frac{E[(\\frac{1}{n} \\sum_{i=1}^n X_i - E[X])^2]}{\\epsilon^2} & \\text{Markov inequality} \\\\ & = \\lim_{n \\rightarrow \\infty} \\frac{E[(\\frac{1}{n} \\sum_{i=1}^n X_i)^2 - 2 \\frac{1}{n} \\sum_{i=1}^n X_i E[X] + E[X]^2]}{\\epsilon^2} \\\\ & = \\lim_{n \\rightarrow \\infty} \\frac{E[(\\frac{1}{n} \\sum_{i=1}^n X_i)^2] - 2 E[X]^2 + E[X]^2}{\\epsilon^2} \\\\ &= 0 \\end{align*}\\] For the last equality see the solution to ??. Follows directly from the CLT. Exercise 13.3 (Consistent but biased estimator) Show that sample variance (the plug-in estimator of variance) is a biased estimator of variance. Show that sample variance is a consistent estimator of variance. Show that the estimator with (\\(N-1\\)) (Bessel correction) is unbiased. Solution. \\[\\begin{align*} E[\\frac{1}{n} \\sum_{i=1}^n (Y_i - \\bar{Y})^2] &= \\frac{1}{n} \\sum_{i=1}^n E[(Y_i - \\bar{Y})^2] \\\\ &= \\frac{1}{n} \\sum_{i=1}^n E[Y_i^2] - 2 E[Y_i \\bar{Y}] + \\bar{Y}^2)] \\\\ &= \\frac{1}{n} \\sum_{i=1}^n E[Y_i^2 - 2 Y_i \\bar{Y} + \\bar{Y}^2] \\\\ &= \\frac{1}{n} \\sum_{i=1}^n E[Y_i^2 - \\frac{2}{n} Y_i^2 - \\frac{2}{n} \\sum_{i \\neq j} Y_i Y_j + \\frac{1}{n^2}\\sum_j \\sum_{k \\neq j} Y_j Y_k + \\frac{1}{n^2} \\sum_j Y_j^2] \\\\ &= \\frac{1}{n} \\sum_{i=1}^n \\frac{n - 2}{n} (\\sigma^2 + \\mu^2) - \\frac{2}{n} (n - 1) \\mu^2 + \\frac{1}{n^2}n(n-1)\\mu^2 + \\frac{1}{n^2}n(\\sigma^2 + \\mu^2) \\\\ &= \\frac{n-1}{n}\\sigma^2 \\\\ < \\sigma^2. \\end{align*}\\] Let \\(S_n\\) denote the sample variance. Then we can write it as \\[\\begin{align*} S_n &= \\frac{1}{n} \\sum_{i=1}^n (X_i - \\bar{X})^2 = \\frac{1}{n} \\sum_{i=1}^n (X_i - \\mu)^2 + 2(X_i - \\mu)(\\mu - \\bar{X}) + (\\mu - \\bar{X})^2. \\end{align*}\\] Now \\(\\bar{X}\\) converges in probability (by WLLN) to \\(\\mu\\) therefore the right terms converge in probability to zero. The left term converges in probability to \\(\\sigma^2\\), also by WLLN. Therefore the sample variance is a consistent estimatior of the variance. The denominator changes in the second-to-last line of a., therefore the last line is now equality. Exercise 13.4 (Estimating the median) Show that the sample median is an unbiased estimator of the median for N\\((\\mu, \\sigma^2)\\). Show that the sample median is an unbiased estimator of the mean for any distribution with symmetric density. Hint 1: The pdf of an order statistic is \\(f_{X_{(k)}}(x) = \\frac{n!}{(n - k)!(k - 1)!}f_X(x)\\Big(F_X(x)^{k-1} (1 - F_X(x)^{n - k}) \\Big)\\). Hint 2: A distribution is symmetric when \\(X\\) and \\(2a - X\\) have the same distribution for some \\(a\\). Solution. Let \\(Z_i\\), \\(i = 1,...,n\\) be i.i.d. variables with a symmetric distribution and let \\(Z_{k:n}\\) denote the \\(k\\)-th order statistic. We will distinguish two cases, when \\(n\\) is odd and when \\(n\\) is even. Let first \\(n = 2m + 1\\) be odd. Then the sample median is \\(M = Z_{m+1:2m+1}\\). Its PDF is \\[\\begin{align*} f_M(x) = (m+1)\\binom{2m + 1}{m}f_Z(x)\\Big(F_Z(x)^m (1 - F_Z(x)^m) \\Big). \\end{align*}\\] For every symmetric distribution, it holds that \\(F_X(x) = 1 - F(2a - x)\\). Let \\(a = \\mu\\), the population mean. Plugging this into the PDF, we get that \\(f_M(x) = f_M(2\\mu -x)\\). 
It follows that \\[\\begin{align*} E[M] &= E[2\\mu - M] \\\\ 2E[M] &= 2\\mu \\\\ E[M] &= \\mu. \\end{align*}\\] Now let \\(n = 2m\\) be even. Then the sample median is \\(M = \\frac{Z_{m:2m} + Z_{m+1:2m}}{2}\\). It can be shown, that the joint PDF of these terms is also symmetric. Therefore, similar to the above \\[\\begin{align*} E[M] &= E[\\frac{Z_{m:2m} + Z_{m+1:2m}}{2}] \\\\ &= E[\\frac{2\\mu - M + 2\\mu - M}{2}] \\\\ &= E[2\\mu - M]. \\end{align*}\\] The above also proves point a. as the median and the mean are the same in normal distribution. Exercise 13.5 (Matrix trace estimation) The Hutchinson trace estimator [1] is an estimator of the trace of a symmetric positive semidefinite matrix A that relies on Monte Carlo sampling. The estimator is defined as \\[\\begin{align*} \\textrm{tr}(A) \\approx \\frac{1}{n} \\Sigma_{i=1}^n z_i^T A z_i, &\\\\ z_i \\sim_{\\mathrm{IID}} \\textrm{Uniform}(\\{-1, 1\\}^m), & \\end{align*}\\] where \\(A \\in \\mathbb{R}^{m \\times m}\\) is a symmetric positive semidefinite matrix. Elements of each vector \\(z_i\\) are either \\(-1\\) or \\(1\\) with equal probability. This is also called a Rademacher distribution. Data scientists often want the trace of a Hessian to obtain valuable curvature information for a loss function. Per [2], an example is classifying ten digits based on \\((28,28)\\) grayscale images (i.e. MNIST data) using logistic regression. The number of parameters is \\(m = 28^2 \\cdot 10 = 7840\\) and the size of the Hessian is \\(m^2\\), roughly \\(6 \\cdot 10^6\\). The diagonal average is equal to the average eigenvalue, which may be useful for optimization; in MCMC contexts, this would be useful for preconditioners and step size optimization. Computing Hessians (as a means of getting eigenvalue information) is often intractable, but Hessian-vector products can be computed faster by autodifferentiation (with e.g. Tensorflow, Pytorch, Jax). This is one motivation for the use of a stochastic trace estimator as outlined above. References: A stochastic estimator of the trace of the influence matrix for laplacian smoothing splines (Hutchinson, 1990) A Modern Analysis of Hutchinson’s Trace Estimator (Skorski, 2020) Prove that the Hutchinson trace estimator is an unbiased estimator of the trace. Solution. We first simplify our task: \\[\\begin{align} \\mathbb{E}\\left[\\frac{1}{n} \\Sigma_{i=1}^n z_i^T A z_i \\right] &= \\frac{1}{n} \\Sigma_{i=1}^n \\mathbb{E}\\left[z_i^T A z_i \\right] \\\\ &= \\mathbb{E}\\left[z_i^T A z_i \\right], \\end{align}\\] where the second equality is due to having \\(n\\) IID vectors \\(z_i\\). We now only need to show that \\(\\mathbb{E}\\left[z^T A z \\right] = \\mathrm{tr}(A)\\). We omit the index due to all vectors being IID: \\[\\begin{align} \\mathrm{tr}(A) &= \\mathrm{tr}(AI) \\\\ &= \\mathrm{tr}(A\\mathbb{E}[zz^T]) \\\\ &= \\mathbb{E}[\\mathrm{tr}(Azz^T)] \\\\ &= \\mathbb{E}[\\mathrm{tr}(z^TAz)] \\\\ &= \\mathbb{E}[z^TAz]. \\end{align}\\] This concludes the proof. We clarify some equalities below. The second equality assumes that \\(\\mathbb{E}[zz^T] = I\\). By noting that the mean of the Rademacher distribution is 0, we have \\[\\begin{align} \\mathrm{Cov}[z, z] &= \\mathbb{E}[(z - \\mathbb{E}[z])(z - \\mathbb{E}[z])^T] \\\\ &= \\mathbb{E}[zz^T]. \\end{align}\\] Dimensions of \\(z\\) are independent, so \\(\\mathrm{Cov}[z, z]_{ij} = 0\\) for \\(i \\neq j\\). 
The diagonal will contain variances, which are equal to \\(1\\) for all dimensions \\(k = 1 \\dots m\\): \\(\\mathrm{Var}[z^{(k)}] = \\mathbb{E}[z^{(k)}z^{(k)}] - \\mathbb{E}[z^{(k)}]^2 = 1 - 0 = 1\\). It follows that the covariance is an identity matrix. Note that this is a general result for vectors with IID dimensions sampled from a distribution with mean 0 and variance 1. We could therefore use something else instead of the Rademacher, e.g. \\(z ~ N(0, I)\\). The third equality uses the fact that the expectation of a trace equals the trace of an expectation. If \\(X\\) is a random matrix, then \\(\\mathbb{E}[X]_{ij} = \\mathbb{E}[X_{ij}]\\). Therefore: \\[\\begin{align} \\mathrm{tr}(\\mathbb{E}[X]) &= \\Sigma_{i=1}^m(\\mathbb{E}[X]_{ii}) \\\\ &= \\Sigma_{i=1}^m(\\mathbb{E}[X_{ii}]) \\\\ &= \\mathbb{E}[\\Sigma_{i=1}^m(X_{ii})] \\\\ &= \\mathbb{E}[\\mathrm{tr}(X)], \\end{align}\\] where we used the linearity of the expectation in the third step. The fourth equality uses the fact that \\(\\mathrm{tr}(AB) = \\mathrm{tr}(BA)\\) for any matrices \\(A \\in \\mathbb{R}^{n \\times m}, B \\in \\mathbb{R}^{m \\times n}\\). The last inequality uses the fact that the trace of a \\(1 \\times 1\\) matrix is just its element. "],["boot.html", "Chapter 14 Bootstrap", " Chapter 14 Bootstrap This chapter deals with bootstrap. The students are expected to acquire the following knowledge: How to use bootstrap to generate coverage intervals. .fold-btn { float: right; margin: 5px 5px 0 0; } .fold { border: 1px solid black; min-height: 40px; } Exercise 14.1 Ideally, a \\(1-\\alpha\\) CI would have \\(1-\\alpha\\) coverage. That is, say a 95% CI should, in the long run, contain the true value of the parameter 95% of the time. In practice, it is impossible to assess the coverage of our CI method, because we rarely know the true parameter. In simulation, however, we can. Let’s assess the coverage of bootstrap percentile intervals. Pick a univariate distribution with readily available mean and one that you can easily sample from. Draw \\(n = 30\\) random samples from the chosen distribution and use the bootstrap (with large enough m) and percentile CI method to construct 95% CI. Repeat the process many times and count how many times the CI contains the true mean. That is, compute the actual coverage probability (don’t forget to include the standard error of the coverage probability!). What can you observe? Try one or two different distributions. What can you observe? Repeat (b) and (c) using BCa intervals (R package boot). How does the coverage compare to percentile intervals? As (d) but using intervals based on asymptotic normality (+/- 1.96 SE). How do results from (b), (d), and (e) change if we increase the sample size to n = 200? What about n = 5? library(boot) set.seed(0) nit <- 1000 # Repeat the process "many times" alpha <- 0.05 # CI parameter nboot <- 100 # m parameter for bootstrap ("large enough m") # f: change this to 200 or 5. nsample <- 30 # n = 30 random samples from the chosen distribution. Comment out BCa code if it breaks. 
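# The three matrices defined next track, for each of the nit repetitions, whether the
# 95% CI contained the true mean. Each column is one distribution:
# column 1 - standard normal (true mean 0), column 2 - Beta(1, 2) (true mean 1/3),
# column 3 - Poisson(5) (true mean 5). covers uses the percentile bootstrap CI,
# covers_BCa the BCa CI, and covers_asymp_norm the asymptotic-normality (+/- 1.96 SE) CI.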
covers <- matrix(nrow = nit, ncol = 3) covers_BCa <- matrix(nrow = nit, ncol = 3) covers_asymp_norm <- matrix(nrow = nit, ncol = 3) isin <- function (x, lower, upper) { (x > lower) & (x < upper) } for (j in 1:nit) { # Repeating many times # a: pick a univariate distribution - standard normal x1 <- rnorm(nsample) # c: one or two different distributions - beta and poisson x2 <- rbeta(nsample, 1, 2) x3 <- rpois(nsample, 5) X1 <- matrix(data = NA, nrow = nsample, ncol = nboot) X2 <- matrix(data = NA, nrow = nsample, ncol = nboot) X3 <- matrix(data = NA, nrow = nsample, ncol = nboot) for (i in 1:nboot) { X1[ ,i] <- sample(x1, nsample, replace = T) X2[ ,i] <- sample(x2, nsample, T) X3[ ,i] <- sample(x3, nsample, T) } X1_func <- apply(X1, 2, mean) X2_func <- apply(X2, 2, mean) X3_func <- apply(X3, 2, mean) X1_quant <- quantile(X1_func, probs = c(alpha / 2, 1 - alpha / 2)) X2_quant <- quantile(X2_func, probs = c(alpha / 2, 1 - alpha / 2)) X3_quant <- quantile(X3_func, probs = c(alpha / 2, 1 - alpha / 2)) covers[j,1] <- (0 > X1_quant[1]) & (0 < X1_quant[2]) covers[j,2] <- ((1 / 3) > X2_quant[1]) & ((1 / 3) < X2_quant[2]) covers[j,3] <- (5 > X3_quant[1]) & (5 < X3_quant[2]) mf <- function (x, i) return(mean(x[i])) bootX1 <- boot(x1, statistic = mf, R = nboot) bootX2 <- boot(x2, statistic = mf, R = nboot) bootX3 <- boot(x3, statistic = mf, R = nboot) X1_quant_BCa <- boot.ci(bootX1, type = "bca")$bca X2_quant_BCa <- boot.ci(bootX2, type = "bca")$bca X3_quant_BCa <- boot.ci(bootX3, type = "bca")$bca covers_BCa[j,1] <- (0 > X1_quant_BCa[4]) & (0 < X1_quant_BCa[5]) covers_BCa[j,2] <- ((1 / 3) > X2_quant_BCa[4]) & ((1 / 3) < X2_quant_BCa[5]) covers_BCa[j,3] <- (5 > X3_quant_BCa[4]) & (5 < X3_quant_BCa[5]) # e: estimate mean and standard error # sample mean: x1_bar <- mean(x1) x2_bar <- mean(x2) x3_bar <- mean(x3) # standard error (of the sample mean) estimate: sample standard deviation / sqrt(n) x1_bar_SE <- sd(x1) / sqrt(nsample) x2_bar_SE <- sd(x2) / sqrt(nsample) x3_bar_SE <- sd(x3) / sqrt(nsample) covers_asymp_norm[j,1] <- isin(0, x1_bar - 1.96 * x1_bar_SE, x1_bar + 1.96 * x1_bar_SE) covers_asymp_norm[j,2] <- isin(1/3, x2_bar - 1.96 * x2_bar_SE, x2_bar + 1.96 * x2_bar_SE) covers_asymp_norm[j,3] <- isin(5, x3_bar - 1.96 * x3_bar_SE, x3_bar + 1.96 * x3_bar_SE) } apply(covers, 2, mean) ## [1] 0.918 0.925 0.905 apply(covers, 2, sd) / sqrt(nit) ## [1] 0.008680516 0.008333333 0.009276910 apply(covers_BCa, 2, mean) ## [1] 0.927 0.944 0.927 apply(covers_BCa, 2, sd) / sqrt(nit) ## [1] 0.008230355 0.007274401 0.008230355 apply(covers_asymp_norm, 2, mean) ## [1] 0.939 0.937 0.930 apply(covers_asymp_norm, 2, sd) / sqrt(nit) ## [1] 0.007572076 0.007687008 0.008072494 Exercise 14.2 You are given a sample of independent observations from a process of interest: Index 1 2 3 4 5 6 7 8 X 7 2 4 6 4 5 9 10 Compute the plug-in estimate of mean and 95% symmetric CI based on asymptotic normality. Use the plug-in estimate of SE. Same as (a), but use the unbiased estimate of SE. Apply nonparametric bootstrap with 1000 bootstrap replications and estimate the 95% CI for the mean with percentile-based CI. 
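For clarity (an added note using the standard definitions), the two standard-error estimates in (a) and (b) differ only in the variance estimate they plug in: \\[\\begin{align} \\widehat{SE}_{\\text{plug-in}} = \\frac{1}{\\sqrt{n}} \\sqrt{\\frac{1}{n} \\sum_{i=1}^n (x_i - \\bar{x})^2}, \\qquad \\widehat{SE}_{\\text{Bessel-corrected}} = \\frac{1}{\\sqrt{n}} \\sqrt{\\frac{1}{n - 1} \\sum_{i=1}^n (x_i - \\bar{x})^2}. \\end{align}\\] In the code below these correspond to sqrt(mean((x - mu)^2)) / sqrt(n) and sd(x) / sqrt(n), respectively.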
# a x <- c(7, 2, 4, 6, 4, 5, 9, 10) n <- length(x) mu <- mean(x) SE <- sqrt(mean((x - mu)^2)) / sqrt(n) SE ## [1] 0.8915839 z <- qnorm(1 - 0.05 / 2) c(mu - z * SE, mu + z * SE) ## [1] 4.127528 7.622472 # b SE <- sd(x) / sqrt(n) SE ## [1] 0.9531433 c(mu - z * SE, mu + z * SE) ## [1] 4.006873 7.743127 # c set.seed(0) m <- 1000 T_mean <- function(x) {mean(x)} est_boot <- array(NA, m) for (i in 1:m) { x_boot <- x[sample(1:n, n, rep = T)] est_boot[i] <- T_mean(x_boot) } quantile(est_boot, p = c(0.025, 0.975)) ## 2.5% 97.5% ## 4.250 7.625 Exercise 14.3 We are given a sample of 10 independent paired (bivariate) observations: Index 1 2 3 4 5 6 7 8 9 10 X 1.26 -0.33 1.33 1.27 0.41 -1.54 -0.93 -0.29 -0.01 2.40 Y 2.64 0.33 0.48 0.06 -0.88 -2.14 -2.21 0.95 0.83 1.45 Compute Pearson correlation between X and Y. Use the cor.test() from R to estimate a 95% CI for the estimate from (a). Apply nonparametric bootstrap with 1000 bootstrap replications and estimate the 95% CI for the Pearson correlation with percentile-based CI. Compare CI from (b) and (c). Are they similar? How would the bootstrap estimation of CI change if we were interested in Spearman or Kendall correlation instead? x <- c(1.26, -0.33, 1.33, 1.27, 0.41, -1.54, -0.93, -0.29, -0.01, 2.40) y <- c(2.64, 0.33, 0.48, 0.06, -0.88, -2.14, -2.21, 0.95, 0.83, 1.45) # a cor(x, y) ## [1] 0.6991247 # b res <- cor.test(x, y) res$conf.int[1:2] ## [1] 0.1241458 0.9226238 # c set.seed(0) m <- 1000 n <- length(x) T_cor <- function(x, y) {cor(x, y)} est_boot <- array(NA, m) for (i in 1:m) { idx <- sample(1:n, n, rep = T) # !!! important to use same indices to keep dependency between x and y est_boot[i] <- T_cor(x[idx], y[idx]) } quantile(est_boot, p = c(0.025, 0.975)) ## 2.5% 97.5% ## 0.2565537 0.9057664 # d # Yes, but the bootstrap CI is more narrow. # e # We just use the functions for Kendall/Spearman coefficients instead: T_kendall <- function(x, y) {cor(x, y, method = "kendall")} T_spearman <- function(x, y) {cor(x, y, method = "spearman")} # Put this in a function that returns the CI bootstrap_95_ci <- function(x, y, t, m = 1000) { n <- length(x) est_boot <- array(NA, m) for (i in 1:m) { idx <- sample(1:n, n, rep = T) # !!! important to use same indices to keep dependency between x and y est_boot[i] <- t(x[idx], y[idx]) } quantile(est_boot, p = c(0.025, 0.975)) } bootstrap_95_ci(x, y, T_kendall) ## 2.5% 97.5% ## -0.08108108 0.78378378 bootstrap_95_ci(x, y, T_spearman) ## 2.5% 97.5% ## -0.1701115 0.8867925 Exercise 14.4 In this problem we will illustrate the use of the nonparametric bootstrap for estimating CIs of regression model coefficients. Load the longley dataset from base R with data(longley). Use lm() to apply linear regression using “Employed” as the target (dependent) variable and all other variables as the predictors (independent). Using lm() results, print the estimated regression coefficients and standard errors. Estimate 95% CI for the coefficients using +/- 1.96 * SE. Use nonparametric bootstrap with 100 replications to estimate the SE of the coefficients from (b). Compare the SE from (c) with those from (b). # a data(longley) # b res <- lm(Employed ~ . 
, longley) tmp <- data.frame(summary(res)$coefficients[,1:2]) tmp$LB <- tmp[,1] - 1.96 * tmp[,2] tmp$UB <- tmp[,1] + 1.96 * tmp[,2] tmp ## Estimate Std..Error LB UB ## (Intercept) -3.482259e+03 8.904204e+02 -5.227483e+03 -1.737035e+03 ## GNP.deflator 1.506187e-02 8.491493e-02 -1.513714e-01 1.814951e-01 ## GNP -3.581918e-02 3.349101e-02 -1.014616e-01 2.982320e-02 ## Unemployed -2.020230e-02 4.883997e-03 -2.977493e-02 -1.062966e-02 ## Armed.Forces -1.033227e-02 2.142742e-03 -1.453204e-02 -6.132495e-03 ## Population -5.110411e-02 2.260732e-01 -4.942076e-01 3.919994e-01 ## Year 1.829151e+00 4.554785e-01 9.364136e-01 2.721889e+00 # c set.seed(0) m <- 100 n <- nrow(longley) T_coef <- function(x) { lm(Employed ~ . , x)$coefficients } est_boot <- array(NA, c(m, ncol(longley))) for (i in 1:m) { idx <- sample(1:n, n, rep = T) est_boot[i,] <- T_coef(longley[idx,]) } SE <- apply(est_boot, 2, sd) SE ## [1] 1.826011e+03 1.605981e-01 5.693746e-02 8.204892e-03 3.802225e-03 ## [6] 3.907527e-01 9.414436e-01 # Show the standard errors around coefficients library(ggplot2) library(reshape2) df <- data.frame(index = 1:7, bootstrap_SE = SE, lm_SE = tmp$Std..Error) melted_df <- melt(df[2:nrow(df), ], id.vars = "index") # Ignore bias which has a really large magnitude ggplot(melted_df, aes(x = index, y = value, fill = variable)) + geom_bar(stat="identity", position="dodge") + xlab("Coefficient") + ylab("Standard error") # + scale_y_continuous(trans = "log") # If you want to also plot bias Exercise 14.5 This exercise shows a shortcoming of the bootstrap method when using the plug in estimator for the maximum. Compute the 95% bootstrap CI for the maximum of a standard normal distribution. Compute the 95% bootstrap CI for the maximum of a binomial distribution with n = 15 and p = 0.2. Repeat (b) using p = 0.9. Why is the result different? # bootstrap CI for maximum alpha <- 0.05 T_max <- function(x) {max(x)} # Equal to T_max = max bootstrap <- function(x, t, m = 1000) { n <- length(x) values <- rep(0, m) for (i in 1:m) { values[i] <- t(sample(x, n, replace = T)) } quantile(values, probs = c(alpha / 2, 1 - alpha / 2)) } # a # Meaningless, as the normal distribution can yield arbitrarily large values. x <- rnorm(100) bootstrap(x, T_max) ## 2.5% 97.5% ## 1.819425 2.961743 # b x <- rbinom(100, size = 15, prob = 0.2) # min = 0, max = 15 bootstrap(x, T_max) ## 2.5% 97.5% ## 6 7 # c x <- rbinom(100, size = 15, prob = 0.9) # min = 0, max = 15 bootstrap(x, T_max) ## 2.5% 97.5% ## 15 15 # Observation: to estimate the maximum, we need sufficient probability mass near the maximum value the distribution can yield. # Using bootstrap is pointless when there is too little mass near the true maximum. # In general, bootstrap will fail when estimating the CI for the maximum. "],["ml.html", "Chapter 15 Maximum likelihood 15.1 Deriving MLE 15.2 Fisher information 15.3 The German tank problem", " Chapter 15 Maximum likelihood This chapter deals with maximum likelihood estimation. The students are expected to acquire the following knowledge: How to derive MLE. Applying MLE in R. Calculating and interpreting Fisher information. Practical use of MLE. .fold-btn { float: right; margin: 5px 5px 0 0; } .fold { border: 1px solid black; min-height: 40px; } 15.1 Deriving MLE Exercise 15.1 Derive the maximum likelihood estimator of variance for N\\((\\mu, \\sigma^2)\\). Compare with results from 13.3. What does that say about the MLE estimator? Solution. 
The mean is assumed constant, so we have the likelihood \\[\\begin{align} L(\\sigma^2; y) &= \\prod_{i=1}^n \\frac{1}{\\sqrt{2 \\pi \\sigma^2}} e^{-\\frac{(y_i - \\mu)^2}{2 \\sigma^2}} \\\\ &= \\frac{1}{\\sqrt{2 \\pi \\sigma^2}^n} e^{\\frac{-\\sum_{i=1}^n (y_i - \\mu)^2}{2 \\sigma^2}} \\end{align}\\] We need to find the maximum of this function. We first observe that we can replace \\(\\frac{-\\sum_{i=1}^n (y_i - \\mu)^2}{2}\\) with a constant \\(c\\), since none of the terms are dependent on \\(\\sigma^2\\). Additionally, the term \\(\\frac{1}{\\sqrt{2 \\pi}^n}\\) does not affect the calculation of the maximum. So now we have \\[\\begin{align} L(\\sigma^2; y) &= (\\sigma^2)^{-\\frac{n}{2}} e^{\\frac{c}{\\sigma^2}}. \\end{align}\\] Differentiating we get \\[\\begin{align} \\frac{d}{d \\sigma^2} L(\\sigma^2; y) &= (\\sigma^2)^{-\\frac{n}{2}} \\frac{d}{d \\sigma^2} e^{\\frac{c}{\\sigma^2}} + e^{\\frac{c}{\\sigma^2}} \\frac{d}{d \\sigma^2} (\\sigma^2)^{-\\frac{n}{2}} \\\\ &= - (\\sigma^2)^{-\\frac{n}{2}} e^{\\frac{c}{\\sigma^2}} \\frac{c}{(\\sigma^2)^2} - e^{\\frac{c}{\\sigma^2}} \\frac{n}{2} (\\sigma^2)^{-\\frac{n + 2}{2}} \\\\ &= - (\\sigma^2)^{-\\frac{n + 4}{2}} e^{\\frac{c}{\\sigma^2}} c - e^{\\frac{c}{\\sigma^2}} \\frac{n}{2} (\\sigma^2)^{-\\frac{n + 2}{2}} \\\\ &= - e^{\\frac{c}{\\sigma^2}} (\\sigma^2)^{-\\frac{n + 4}{2}} \\Big(c + \\frac{n}{2}\\sigma^2 \\Big). \\end{align}\\] To get the maximum, this has to equal to 0, so \\[\\begin{align} c + \\frac{n}{2}\\sigma^2 &= 0 \\\\ \\sigma^2 &= -\\frac{2c}{n} \\\\ \\sigma^2 &= \\frac{\\sum_{i=1}^n (Y_i - \\mu)^2}{n}. \\end{align}\\] The MLE estimator is biased. Exercise 15.2 (Multivariate normal distribution) Derive the maximum likelihood estimate for the mean and covariance matrix of the multivariate normal. Simulate \\(n = 40\\) samples from a bivariate normal distribution (choose non-trivial parameters, that is, mean \\(\\neq 0\\) and covariance \\(\\neq 0\\)). Compute the MLE for the sample. Overlay the data with an ellipse that is determined by the MLE and an ellipse that is determined by the chosen true parameters. Repeat b. several times and observe how the estimates (ellipses) vary around the true value. Hint: For the derivation of MLE, these identities will be helpful: \\(\\frac{\\partial b^T a}{\\partial a} = \\frac{\\partial a^T b}{\\partial a} = b\\), \\(\\frac{\\partial a^T A a}{\\partial a} = (A + A^T)a\\), \\(\\frac{\\partial \\text{tr}(BA)}{\\partial A} = B^T\\), \\(\\frac{\\partial \\ln |A|}{\\partial A} = (A^{-1})^T\\), \\(a^T A a = \\text{tr}(a^T A a) = \\text{tr}(a a^T A) = \\text{tr}(Aaa^T)\\). Solution. The log likelihood of the MVN distribution is \\[\\begin{align*} l(\\mu, \\Sigma ; x) &= -\\frac{1}{2}\\Big(\\sum_{i=1}^n k\\ln(2\\pi) + |\\Sigma| + (x_i - \\mu)^T \\Sigma^{-1} (x_i - \\mu)\\Big) \\\\ &= -\\frac{n}{2}\\ln|\\Sigma| + -\\frac{1}{2}\\Big(\\sum_{i=1}^n(x_i - \\mu)^T \\Sigma^{-1} (x_i - \\mu)\\Big) + c, \\end{align*}\\] where \\(c\\) is a constant with respect to \\(\\mu\\) and \\(\\Sigma\\). To find the MLE we first need to find partial derivatives. Let us start with \\(\\mu\\). \\[\\begin{align*} \\frac{\\partial}{\\partial \\mu}l(\\mu, \\Sigma ; x) &= \\frac{\\partial}{\\partial \\mu} -\\frac{1}{2}\\Big(\\sum_{i=1}^n x_i^T \\Sigma^{-1} x_i - x_i^T \\Sigma^{-1} \\mu - \\mu^T \\Sigma^{-1} x_i + \\mu^T \\Sigma^{-1} \\mu \\Big) \\\\ &= -\\frac{1}{2}\\Big(\\sum_{i=1}^n - \\Sigma^{-1} x_i - \\Sigma^{-1} x_i + 2 \\Sigma^{-1} \\mu \\Big) \\\\ &= -\\Sigma^{-1}\\Big(\\sum_{i=1}^n - x_i + \\mu \\Big). 
\\end{align*}\\] Equating above with zero, we get \\[\\begin{align*} \\sum_{i=1}^n - x_i + \\mu &= 0 \\\\ \\hat{\\mu} = \\frac{1}{n} \\sum_{i=1}^n x_i, \\end{align*}\\] which is the dimension-wise empirical mean. Now for the covariance matrix \\[\\begin{align*} \\frac{\\partial}{\\partial \\Sigma^{-1}}l(\\mu, \\Sigma ; x) &= \\frac{\\partial}{\\partial \\Sigma^{-1}} -\\frac{n}{2}\\ln|\\Sigma| + -\\frac{1}{2}\\Big(\\sum_{i=1}^n(x_i - \\mu)^T \\Sigma^{-1} (x_i - \\mu)\\Big) \\\\ &= \\frac{\\partial}{\\partial \\Sigma^{-1}} -\\frac{n}{2}\\ln|\\Sigma| + -\\frac{1}{2}\\Big(\\sum_{i=1}^n \\text{tr}((x_i - \\mu)^T \\Sigma^{-1} (x_i - \\mu))\\Big) \\\\ &= \\frac{\\partial}{\\partial \\Sigma^{-1}} -\\frac{n}{2}\\ln|\\Sigma| + -\\frac{1}{2}\\Big(\\sum_{i=1}^n \\text{tr}((\\Sigma^{-1} (x_i - \\mu) (x_i - \\mu)^T )\\Big) \\\\ &= \\frac{n}{2}\\Sigma + -\\frac{1}{2}\\Big(\\sum_{i=1}^n (x_i - \\mu) (x_i - \\mu)^T \\Big). \\end{align*}\\] Equating above with zero, we get \\[\\begin{align*} \\hat{\\Sigma} = \\frac{1}{n}\\sum_{i=1}^n (x_i - \\mu) (x_i - \\mu)^T. \\end{align*}\\] set.seed(1) n <- 40 mu <- c(1, -2) Sigma <- matrix(data = c(2, -1.6, -1.6, 1.8), ncol = 2) X <- mvrnorm(n = n, mu = mu, Sigma = Sigma) colnames(X) <- c("X1", "X2") X <- as.data.frame(X) # plot.new() tru_ellip <- ellipse(mu, Sigma, draw = FALSE) colnames(tru_ellip) <- c("X1", "X2") tru_ellip <- as.data.frame(tru_ellip) mu_est <- apply(X, 2, mean) tmp <- as.matrix(sweep(X, 2, mu_est)) Sigma_est <- (1 / n) * t(tmp) %*% tmp est_ellip <- ellipse(mu_est, Sigma_est, draw = FALSE) colnames(est_ellip) <- c("X1", "X2") est_ellip <- as.data.frame(est_ellip) ggplot(data = X, aes(x = X1, y = X2)) + geom_point() + geom_path(data = tru_ellip, aes(x = X1, y = X2, color = "truth")) + geom_path(data = est_ellip, aes(x = X1, y = X2, color = "estimated")) + labs(color = "type") Exercise 15.3 (Logistic regression) Logistic regression is a popular discriminative model when our target variable is binary (categorical with 2 values). One of the ways of looking at logistic regression is that it is linear regression but instead of using the linear term as the mean of a normal RV, we use it as the mean of a Bernoulli RV. Of course, the mean of a Bernoulli is bounded on \\([0,1]\\), so, to avoid non-sensical values, we squeeze the linear between 0 and 1 with the inverse logit function inv_logit\\((z) = 1 / (1 + e^{-z})\\). This leads to the following model: \\(y_i | \\beta, x_i \\sim \\text{Bernoulli}(\\text{inv_logit}(\\beta x_i))\\). Explicitly write the likelihood function of beta. Implement the likelihood function in R. Use black-box box-constraint optimization (for example, optim() with L-BFGS) to find the maximum likelihood estimate for beta for \\(x\\) and \\(y\\) defined below. Plot the estimated probability as a function of the independent variable. Compare with the truth. Let \\(y2\\) be a response defined below. Will logistic regression work well on this dataset? Why not? How can we still use the model, without changing it? inv_log <- function (z) { return (1 / (1 + exp(-z))) } set.seed(1) x <- rnorm(100) y <- rbinom(100, size = 1, prob = inv_log(1.2 * x)) y2 <- rbinom(100, size = 1, prob = inv_log(1.2 * x + 1.4 * x^2)) Solution. \\[\\begin{align*} l(\\beta; x, y) &= p(y | x, \\beta) \\\\ &= \\ln(\\prod_{i=1}^n \\text{inv_logit}(\\beta x_i)^{y_i} (1 - \\text{inv_logit}(\\beta x_i))^{1 - y_i}) \\\\ &= \\sum_{i=1}^n y_i \\ln(\\text{inv_logit}(\\beta x_i)) + (1 - y_i) \\ln(1 - \\text{inv_logit}(\\beta x_i)). 
\\end{align*}\\] set.seed(1) inv_log <- function (z) { return (1 / (1 + exp(-z))) } x <- rnorm(100) y <- x y <- rbinom(100, size = 1, prob = inv_log(1.2 * x)) l_logistic <- function (beta, X, y) { logl <- -sum(y * log(inv_log(as.vector(beta %*% X))) + (1 - y) * log((1 - inv_log(as.vector(beta %*% X))))) return(logl) } my_optim <- optim(par = 0.5, fn = l_logistic, method = "L-BFGS-B", lower = 0, upper = 10, X = x, y = y) my_optim$par ## [1] 1.166558 truth_p <- data.frame(x = x, prob = inv_log(1.2 * x), type = "truth") est_p <- data.frame(x = x, prob = inv_log(my_optim$par * x), type = "estimated") plot_df <- rbind(truth_p, est_p) ggplot(data = plot_df, aes(x = x, y = prob, color = type)) + geom_point(alpha = 0.3) y2 <- rbinom(2000, size = 1, prob = inv_log(1.2 * x + 1.4 * x^2)) X2 <- cbind(x, x^2) my_optim2 <- optim(par = c(0, 0), fn = l_logistic, method = "L-BFGS-B", lower = c(0, 0), upper = c(2, 2), X = t(X2), y = y2) my_optim2$par ## [1] 1.153656 1.257649 tmp <- sweep(data.frame(x = x, x2 = x^2), 2, my_optim2$par, FUN = "*") tmp <- tmp[ ,1] + tmp[ ,2] truth_p <- data.frame(x = x, prob = inv_log(1.2 * x + 1.4 * x^2), type = "truth") est_p <- data.frame(x = x, prob = inv_log(tmp), type = "estimated") plot_df <- rbind(truth_p, est_p) ggplot(data = plot_df, aes(x = x, y = prob, color = type)) + geom_point(alpha = 0.3) Exercise 15.4 (Linear regression) For the data generated below, do the following: Compute the least squares (MLE) estimate of coefficients beta using the matrix exact solution. Compute the MLE by minimizing the sum of squared residuals using black-box optimization (optim()). Compute the MLE by using the output built-in linear regression (lm() ). Compare (a-c and the true coefficients). Compute 95% CI on the beta coefficients using the output of built-in linear regression. Compute 95% CI on the beta coefficients by using (a or b) and the bootstrap with percentile method for CI. Compare with d. 
set.seed(1) n <- 100 x1 <- rnorm(n) x2 <- rnorm(n) x3 <- rnorm(n) X <- cbind(x1, x2, x3) beta <- c(0.2, 0.6, -1.2) y <- as.vector(t(beta %*% t(X))) + rnorm(n, sd = 0.2) set.seed(1) n <- 100 x1 <- rnorm(n) x2 <- rnorm(n) x3 <- rnorm(n) X <- cbind(x1, x2, x3) beta <- c(0.2, 0.6, -1.2) y <- as.vector(t(beta %*% t(X))) + rnorm(n, sd = 0.2) LS_fun <- function (beta, X, y) { return(sum((y - beta %*% t(X))^2)) } my_optim <- optim(par = c(0, 0, 0), fn = LS_fun, lower = -5, upper = 5, X = X, y = y, method = "L-BFGS-B") my_optim$par ## [1] 0.1898162 0.5885946 -1.1788264 df <- data.frame(y = y, x1 = x1, x2 = x2, x3 = x3) my_lm <- lm(y ~ x1 + x2 + x3 - 1, data = df) my_lm ## ## Call: ## lm(formula = y ~ x1 + x2 + x3 - 1, data = df) ## ## Coefficients: ## x1 x2 x3 ## 0.1898 0.5886 -1.1788 # matrix solution beta_hat <- solve(t(X) %*% X) %*% t(X) %*% y beta_hat ## [,1] ## x1 0.1898162 ## x2 0.5885946 ## x3 -1.1788264 out <- summary(my_lm) out$coefficients[ ,2] ## x1 x2 x3 ## 0.02209328 0.02087542 0.01934506 # bootstrap CI nboot <- 1000 beta_boot <- matrix(data = NA, ncol = length(beta), nrow = nboot) for (i in 1:nboot) { inds <- sample(1:n, n, replace = T) new_df <- df[inds, ] X_tmp <- as.matrix(new_df[ ,-1]) y_tmp <- new_df[ ,1] # print(nrow(new_df)) tmp_beta <- solve(t(X_tmp) %*% X_tmp) %*% t(X_tmp) %*% y_tmp beta_boot[i, ] <- tmp_beta } apply(beta_boot, 2, mean) ## [1] 0.1893281 0.5887068 -1.1800738 apply(beta_boot, 2, quantile, probs = c(0.025, 0.975)) ## [,1] [,2] [,3] ## 2.5% 0.1389441 0.5436911 -1.221560 ## 97.5% 0.2386295 0.6363102 -1.140416 out$coefficients[ ,2] ## x1 x2 x3 ## 0.02209328 0.02087542 0.01934506 Exercise 15.5 (Principal component analysis) Load the olympic data set from package ade4. The data show decathlon results for 33 men in 1988 Olympic Games. This data set serves as a great example of finding the latent structure in the data, as there are certain characteristics of the athletes that make them excel at different events. For example an explosive athlete will do particulary well in sprints and long jumps. Perform PCA (prcomp) on the data set and interpret the first 2 latent dimensions. Hint: Standardize the data first to get meaningful results. Use MLE to estimate the covariance of the standardized multivariate distribution. Decompose the estimated covariance matrix with the eigendecomposition. Compare the eigenvectors to the output of PCA. 
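For part (b) of this exercise it helps to remember that the MLE of the covariance divides by \\(n\\) while R's cov() divides by \\(n-1\\); since the two matrices differ only by the positive factor \\((n-1)/n\\), they share the same eigenvectors, so the comparison with PCA in (c) is unaffected. A minimal sketch on simulated data (Z, Z_c and Sigma_mle are illustrative names, not part of the solution below):

```r
# Minimal sketch (simulated data; Z and Sigma_mle are illustrative names):
# the MLE covariance differs from R's cov() only by the factor (n - 1) / n.
set.seed(123)
n <- 50
Z <- matrix(rnorm(n * 3), ncol = 3)          # 50 observations, 3 variables
Z_c <- sweep(Z, 2, colMeans(Z))              # center each column
Sigma_mle <- t(Z_c) %*% Z_c / n              # MLE: divide by n
max(abs(Sigma_mle - (n - 1) / n * cov(Z)))   # ~ 0 up to floating point error
```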
data(olympic) X <- olympic$tab X_scaled <- scale(X) my_pca <- prcomp(X_scaled) summary(my_pca) ## Importance of components: ## PC1 PC2 PC3 PC4 PC5 PC6 PC7 ## Standard deviation 1.8488 1.6144 0.97123 0.9370 0.74607 0.70088 0.65620 ## Proportion of Variance 0.3418 0.2606 0.09433 0.0878 0.05566 0.04912 0.04306 ## Cumulative Proportion 0.3418 0.6025 0.69679 0.7846 0.84026 0.88938 0.93244 ## PC8 PC9 PC10 ## Standard deviation 0.55389 0.51667 0.31915 ## Proportion of Variance 0.03068 0.02669 0.01019 ## Cumulative Proportion 0.96312 0.98981 1.00000 autoplot(my_pca, data = X, loadings = TRUE, loadings.colour = 'blue', loadings.label = TRUE, loadings.label.size = 3) Sigma_est <- (1 / nrow(X_scaled)) * t(X_scaled) %*% X_scaled Sigma_dec <- eigen(Sigma_est) Sigma_dec$vectors ## [,1] [,2] [,3] [,4] [,5] [,6] ## [1,] 0.4158823 0.1488081 -0.26747198 -0.08833244 -0.442314456 0.03071237 ## [2,] -0.3940515 -0.1520815 -0.16894945 -0.24424963 0.368913901 -0.09378242 ## [3,] -0.2691057 0.4835374 0.09853273 -0.10776276 -0.009754680 0.23002054 ## [4,] -0.2122818 0.0278985 -0.85498656 0.38794393 -0.001876311 0.07454380 ## [5,] 0.3558474 0.3521598 -0.18949642 0.08057457 0.146965351 -0.32692886 ## [6,] 0.4334816 0.0695682 -0.12616012 -0.38229029 -0.088802794 0.21049130 ## [7,] -0.1757923 0.5033347 0.04609969 0.02558404 0.019358607 0.61491241 ## [8,] -0.3840821 0.1495820 0.13687235 0.14396548 -0.716743474 -0.34776037 ## [9,] -0.1799436 0.3719570 -0.19232803 -0.60046566 0.095582043 -0.43744387 ## [10,] 0.1701426 0.4209653 0.22255233 0.48564231 0.339772188 -0.30032419 ## [,7] [,8] [,9] [,10] ## [1,] 0.2543985 0.663712826 -0.10839531 0.10948045 ## [2,] 0.7505343 0.141264141 0.04613910 0.05580431 ## [3,] -0.1106637 0.072505560 0.42247611 0.65073655 ## [4,] -0.1351242 -0.155435871 -0.10206505 0.11941181 ## [5,] 0.1413388 -0.146839303 0.65076229 -0.33681395 ## [6,] 0.2725296 -0.639003579 -0.20723854 0.25971800 ## [7,] 0.1439726 0.009400445 -0.16724055 -0.53450315 ## [8,] 0.2732665 -0.276873049 -0.01766443 -0.06589572 ## [9,] -0.3419099 0.058519366 -0.30619617 -0.13093187 ## [10,] 0.1868704 0.007310045 -0.45688227 0.24311846 my_pca$rotation ## PC1 PC2 PC3 PC4 PC5 PC6 ## 100 -0.4158823 0.1488081 0.26747198 -0.08833244 -0.442314456 0.03071237 ## long 0.3940515 -0.1520815 0.16894945 -0.24424963 0.368913901 -0.09378242 ## poid 0.2691057 0.4835374 -0.09853273 -0.10776276 -0.009754680 0.23002054 ## haut 0.2122818 0.0278985 0.85498656 0.38794393 -0.001876311 0.07454380 ## 400 -0.3558474 0.3521598 0.18949642 0.08057457 0.146965351 -0.32692886 ## 110 -0.4334816 0.0695682 0.12616012 -0.38229029 -0.088802794 0.21049130 ## disq 0.1757923 0.5033347 -0.04609969 0.02558404 0.019358607 0.61491241 ## perc 0.3840821 0.1495820 -0.13687235 0.14396548 -0.716743474 -0.34776037 ## jave 0.1799436 0.3719570 0.19232803 -0.60046566 0.095582043 -0.43744387 ## 1500 -0.1701426 0.4209653 -0.22255233 0.48564231 0.339772188 -0.30032419 ## PC7 PC8 PC9 PC10 ## 100 0.2543985 -0.663712826 0.10839531 -0.10948045 ## long 0.7505343 -0.141264141 -0.04613910 -0.05580431 ## poid -0.1106637 -0.072505560 -0.42247611 -0.65073655 ## haut -0.1351242 0.155435871 0.10206505 -0.11941181 ## 400 0.1413388 0.146839303 -0.65076229 0.33681395 ## 110 0.2725296 0.639003579 0.20723854 -0.25971800 ## disq 0.1439726 -0.009400445 0.16724055 0.53450315 ## perc 0.2732665 0.276873049 0.01766443 0.06589572 ## jave -0.3419099 -0.058519366 0.30619617 0.13093187 ## 1500 0.1868704 -0.007310045 0.45688227 -0.24311846 15.2 Fisher information Exercise 15.6 Let us assume a Poisson likelihood. 
Derive the MLE estimate of the mean. Derive the Fisher information. For the data below compute the MLE and construct confidence intervals. Use bootstrap to construct the CI for the mean. Compare with c) and discuss. x <- c(2, 5, 3, 1, 2, 1, 0, 3, 0, 2) Solution. The log likelihood of the Poisson is \\[\\begin{align*} l(\\lambda; x) = \\sum_{i=1}^n x_i \\ln \\lambda - n \\lambda - \\sum_{i=1}^n \\ln x_i! \\end{align*}\\] Taking the derivative and equating with 0 we get \\[\\begin{align*} \\frac{1}{\\hat{\\lambda}}\\sum_{i=1}^n x_i - n &= 0 \\\\ \\hat{\\lambda} &= \\frac{1}{n} \\sum_{i=1}^n x_i. \\end{align*}\\] Since \\(\\lambda\\) is the mean parameter, this was expected. For the Fischer information, we first need the second derivative, which is \\[\\begin{align*} - \\lambda^{-2} \\sum_{i=1}^n x_i. \\\\ \\end{align*}\\] Now taking the expectation of the negative of the above, we get \\[\\begin{align*} E[\\lambda^{-2} \\sum_{i=1}^n x_i] &= \\lambda^{-2} E[\\sum_{i=1}^n x_i] \\\\ &= \\lambda^{-2} n \\lambda \\\\ &= \\frac{n}{\\lambda}. \\end{align*}\\] set.seed(1) x <- c(2, 5, 3, 1, 2, 1, 0, 3, 0, 2) lambda_hat <- mean(x) finfo <- length(x) / lambda_hat mle_CI <- c(lambda_hat - 1.96 * sqrt(1 / finfo), lambda_hat + 1.96 * sqrt(1 / finfo)) boot_lambda <- c() nboot <- 1000 for (i in 1:nboot) { tmp_x <- sample(x, length(x), replace = T) boot_lambda[i] <- mean(tmp_x) } boot_CI <- c(quantile(boot_lambda, 0.025), quantile(boot_lambda, 0.975)) mle_CI ## [1] 1.045656 2.754344 boot_CI ## 2.5% 97.5% ## 1.0 2.7 Exercise 15.7 Find the Fisher information matrix for the Gamma distribution. Generate 20 samples from a Gamma distribution and plot a confidence ellipse of the inverse of Fisher information matrix around the ML estimates of the parameters. Also plot the theoretical values. Repeat the sampling several times. What do you observe? Discuss what a non-diagonal Fisher matrix implies. Hint: The digamma function is defined as \\(\\psi(x) = \\frac{\\frac{d}{dx} \\Gamma(x)}{\\Gamma(x)}\\). Additionally, you do not need to evaluate \\(\\frac{d}{dx} \\psi(x)\\). To calculate its value in R, use package numDeriv. Solution. The log likelihood of the Gamma is \\[\\begin{equation*} l(\\alpha, \\beta; x) = n \\alpha \\ln \\beta - n \\ln \\Gamma(\\alpha) + (\\alpha - 1) \\sum_{i=1}^n \\ln x_i - \\beta \\sum_{i=1}^n x_i. \\end{equation*}\\] Let us calculate the derivatives. \\[\\begin{align*} \\frac{\\partial}{\\partial \\alpha} l(\\alpha, \\beta; x) &= n \\ln \\beta - n \\psi(\\alpha) + \\sum_{i=1}^n \\ln x_i, \\\\ \\frac{\\partial}{\\partial \\beta} l(\\alpha, \\beta; x) &= \\frac{n \\alpha}{\\beta} - \\sum_{i=1}^n x_i, \\\\ \\frac{\\partial^2}{\\partial \\alpha \\beta} l(\\alpha, \\beta; x) &= \\frac{n}{\\beta}, \\\\ \\frac{\\partial^2}{\\partial \\alpha^2} l(\\alpha, \\beta; x) &= - n \\frac{\\partial}{\\partial \\alpha} \\psi(\\alpha), \\\\ \\frac{\\partial^2}{\\partial \\beta^2} l(\\alpha, \\beta; x) &= - \\frac{n \\alpha}{\\beta^2}. \\end{align*}\\] The Fisher information matrix is then \\[\\begin{align*} I(\\alpha, \\beta) = - E[ \\begin{bmatrix} - n \\psi'(\\alpha) & \\frac{n}{\\beta} \\\\ \\frac{n}{\\beta} & - \\frac{n \\alpha}{\\beta^2} \\end{bmatrix} ] = \\begin{bmatrix} n \\psi'(\\alpha) & - \\frac{n}{\\beta} \\\\ - \\frac{n}{\\beta} & \\frac{n \\alpha}{\\beta^2} \\end{bmatrix} \\end{align*}\\] A non-diagonal Fisher matrix implies that the parameter estimates are linearly dependent. 
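Before the solution code, the analytic matrix can be sanity-checked numerically with the numDeriv package from the hint. Because the second derivatives of the gamma log-likelihood do not involve the data, the numerical Hessian of the negative log-likelihood equals the Fisher information exactly. The sketch below uses its own simulated data and variable names:

```r
# Numerical sanity check of the analytic Fisher information (own data and
# variable names; the hint's numDeriv package provides hessian()).
library(numDeriv)
set.seed(2)
n <- 20
x <- rgamma(n, shape = 5, rate = 2)
# negative log-likelihood of the gamma as a function of pars = c(alpha, beta)
neg_loglik <- function(pars, xs) {
  n <- length(xs)
  -(n * pars[1] * log(pars[2]) - n * lgamma(pars[1]) +
      (pars[1] - 1) * sum(log(xs)) - pars[2] * sum(xs))
}
pars <- c(5, 2)
hessian(neg_loglik, pars, xs = x)              # numerical observed information
rbind(c(n * trigamma(pars[1]), -n / pars[2]),  # analytic I(alpha, beta)
      c(-n / pars[2], n * pars[1] / pars[2]^2))
```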
set.seed(1) n <- 20 pars_theor <- c(5, 2) x <- rgamma(n, 5, 2) # MLE for alpha and beta log_lik <- function (pars, x) { n <- length(x) return (- (n * pars[1] * log(pars[2]) - n * log(gamma(pars[1])) + (pars[1] - 1) * sum(log(x)) - pars[2] * sum(x))) } my_optim <- optim(par = c(1,1), fn = log_lik, method = "L-BFGS-B", lower = c(0.001, 0.001), upper = c(8, 8), x = x) pars_mle <- my_optim$par fish_mat <- matrix(data = NA, nrow = 2, ncol = 2) fish_mat[1,2] <- - n / pars_mle[2] fish_mat[2,1] <- - n / pars_mle[2] fish_mat[2,2] <- (n * pars_mle[1]) / (pars_mle[2]^2) fish_mat[1,1] <- n * grad(digamma, pars_mle[1]) fish_mat_inv <- solve(fish_mat) est_ellip <- ellipse(pars_mle, fish_mat_inv, draw = FALSE) colnames(est_ellip) <- c("X1", "X2") est_ellip <- as.data.frame(est_ellip) ggplot() + geom_point(data = data.frame(x = pars_mle[1], y = pars_mle[2]), aes(x = x, y = y)) + geom_path(data = est_ellip, aes(x = X1, y = X2)) + geom_point(aes(x = pars_theor[1], y = pars_theor[2]), color = "red") + geom_text(aes(x = pars_theor[1], y = pars_theor[2], label = "Theoretical parameters"), color = "red", nudge_y = -0.2) 15.3 The German tank problem Exercise 15.8 (The German tank problem) During WWII the allied intelligence were faced with an important problem of estimating the total production of certain German tanks, such as the Panther. What turned out to be a successful approach was to estimate the maximum from the serial numbers of the small sample of captured or destroyed tanks (describe the statistical model used). What assumptions were made by using the above model? Do you think they are reasonable assumptions in practice? Show that the plug-in estimate for the maximum (i.e. the maximum of the sample) is a biased estimator. Derive the maximum likelihood estimate of the maximum. Check that the following estimator is not biased: \\(\\hat{n} = \\frac{k + 1}{k}m - 1\\). Solution. The data are the serial numbers of the tanks. The parameter is \\(n\\), the total production of the tank. The distribution of the serial numbers is a discrete uniform distribution over all serial numbers. One of the assumptions is that we have i.i.d samples, however in practice this might not be true, as some tanks produced later could be sent to the field later, therefore already in theory we would not be able to recover some values from the population. To find the expected value we first need to find the distribution of \\(m\\). Let us start with the CDF. \\[\\begin{align*} F_m(x) = P(Y_1 < x,...,Y_k < x). \\end{align*}\\] If \\(x < k\\) then \\(F_m(x) = 0\\) and if \\(x \\geq 1\\) then \\(F_m(x) = 1\\). What about between those values. So the probability that the maximum value is less than or equal to \\(m\\) is just the number of possible draws from \\(Y\\) that are all smaller than \\(m\\), divided by all possible draws. This is \\(\\frac{{x}\\choose{k}}{{n}\\choose{k}}\\). The PDF on the suitable bounds is then \\[\\begin{align*} P(m = x) = F_m(x) - F_m(x - 1) = \\frac{\\binom{x}{k} - \\binom{x - 1}{k}}{\\binom{n}{k}} = \\frac{\\binom{x - 1}{k - 1}}{\\binom{n}{k}}. \\end{align*}\\] Now we can calculate the expected value of \\(m\\) using some combinatorial identities. \\[\\begin{align*} E[m] &= \\sum_{i = k}^n i \\frac{{i - 1}\\choose{k - 1}}{{n}\\choose{k}} \\\\ &= \\sum_{i = k}^n i \\frac{\\frac{(i - 1)!}{(k - 1)!(i - k)!}}{{n}\\choose{k}} \\\\ &= \\frac{k}{\\binom{n}{k}}\\sum_{i = k}^n \\binom{i}{k} \\\\ &= \\frac{k}{\\binom{n}{k}} \\binom{n + 1}{k + 1} \\\\ &= \\frac{k(n + 1)}{k + 1}. 
\\end{align*}\\] The bias of this estimator is then \\[\\begin{align*} E[m] - n = \\frac{k(n + 1)}{k + 1} - n = \\frac{k - n}{k + 1}. \\end{align*}\\] The probability that we observed our sample \\(Y = {Y_1, Y_2,...,,Y_k}\\) given \\(n\\) is \\(\\frac{1}{{n}\\choose{k}}\\). We need to find such \\(n^*\\) that this function is maximized. Additionally, we have a constraint that \\(n^* \\geq m = \\max{(Y)}\\). Let us plot this function for \\(m = 10\\) and \\(k = 4\\). library(ggplot2) my_fun <- function (x, m, k) { tmp <- 1 / (choose(x, k)) tmp[x < m] <- 0 return (tmp) } x <- 1:20 y <- my_fun(x, 10, 4) df <- data.frame(x = x, y = y) ggplot(data = df, aes(x = x, y = y)) + geom_line() ::: {.solution} (continued) We observe that the maximum of this function lies at the maximum value of the sample. Therefore \\(n^* = m\\) and ML estimate equals the plug-in estimate. \\[\\begin{align*} E[\\hat{n}] &= \\frac{k + 1}{k} E[m] - 1 \\\\ &= \\frac{k + 1}{k} \\frac{k(n + 1)}{k + 1} - 1 \\\\ &= n. \\end{align*}\\] ::: "],["nhst.html", "Chapter 16 Null hypothesis significance testing", " Chapter 16 Null hypothesis significance testing This chapter deals with null hypothesis significance testing. The students are expected to acquire the following knowledge: Binomial test. t-test. Chi-squared test. .fold-btn { float: right; margin: 5px 5px 0 0; } .fold { border: 1px solid black; min-height: 40px; } Exercise 16.1 (Binomial test) We assume \\(y_i \\in \\{0,1\\}\\), \\(i = 1,...,n\\) and \\(y_i | \\theta = 0.5 \\sim i.i.d.\\) Bernoulli\\((\\theta)\\). The test statistic is \\(X = \\sum_{i=1}^n\\) and the rejection region R is defined as the region where the probability of obtaining such or more extreme \\(X\\) given \\(\\theta = 0.5\\) is less than 0.05. Derive and plot the power function of the test for \\(n=100\\). What is the significance level of this test if \\(H0: \\theta = 0.5\\)? At which values of X will we reject the null hypothesis? # a # First we need the rejection region, so we need to find X_min and X_max n <- 100 qbinom(0.025, n, 0.5) ## [1] 40 qbinom(0.975, n, 0.5) ## [1] 60 pbinom(40, n, 0.5) ## [1] 0.02844397 pbinom(60, n, 0.5) ## [1] 0.9823999 X_min <- 39 X_max <- 60 thetas <- seq(0, 1, by = 0.01) beta_t <- 1 - pbinom(X_max, size = n, prob = thetas) + pbinom(X_min, size = n, prob = thetas) plot(beta_t) # b # The significance level is beta_t[51] ## [1] 0.0352002 # We will reject the null hypothesis at X values below X_min and above X_max. Exercise 16.2 (Long-run guarantees of the t-test) Generate a sample of size \\(n = 10\\) from the standard normal. Use the two-sided t-test with \\(H0: \\mu = 0\\) and record the p-value. Can you reject H0 at 0.05 significance level? (before simulating) If we repeated (b) many times, what would be the relative frequency of false positives/Type I errors (rejecting the null that is true)? What would be the relative frequency of false negatives /Type II errors (retaining the null when the null is false)? (now simulate b and check if the simulation results match your answer in b) Similar to (a-c) but now we generate data from N(-0.5, 1). Similar to (a-c) but now we generate data from N(\\(\\mu\\), 1) where we every time pick a different \\(\\mu < 0\\) and use a one-sided test \\(H0: \\mu <= 0\\). 
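As a warm-up for (a), the two-sided p-value reported by t.test() can be reproduced by hand from the t statistic. The sketch below uses its own arbitrary seed and is separate from the solution code that follows:

```r
# Sketch: reproduce the two-sided p-value from (a) by hand
# (arbitrary seed, separate from the solution code below).
set.seed(42)
x <- rnorm(10)
t_stat <- mean(x) / (sd(x) / sqrt(length(x)))
p_manual <- 2 * pt(-abs(t_stat), df = length(x) - 1)
c(manual = p_manual, via_t.test = t.test(x, mu = 0)$p.value)  # identical
```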
set.seed(2) # a x <- rnorm(10) my_test <- t.test(x, alternative = "two.sided", mu = 0) my_test ## ## One Sample t-test ## ## data: x ## t = 0.6779, df = 9, p-value = 0.5149 ## alternative hypothesis: true mean is not equal to 0 ## 95 percent confidence interval: ## -0.4934661 0.9157694 ## sample estimates: ## mean of x ## 0.2111516 # we can not reject the null hypothesis # b # The expected value of false positives would be 0.05. The expected value of # true negatives would be 0, as there are no negatives (the null hypothesis is # always the truth). nit <- 1000 typeIerr <- vector(mode = "logical", length = nit) typeIIerr <- vector(mode = "logical", length = nit) for (i in 1:nit) { x <- rnorm(10) my_test <- t.test(x, alternative = "two.sided", mu = 0) if (my_test$p.value < 0.05) { typeIerr[i] <- T } else { typeIerr[i] <- F } } mean(typeIerr) ## [1] 0.052 sd(typeIerr) / sqrt(nit) ## [1] 0.007024624 # d # We can not estimate the percentage of true negatives, but it will probably be # higher than 0.05. There will be no false positives as the null hypothesis is # always false. typeIIerr <- vector(mode = "logical", length = nit) for (i in 1:nit) { x <- rnorm(10, -0.5) my_test <- t.test(x, alternative = "two.sided", mu = 0) if (my_test$p.value < 0.05) { typeIIerr[i] <- F } else { typeIIerr[i] <- T } } mean(typeIIerr) ## [1] 0.719 sd(typeIIerr) / sqrt(nit) ## [1] 0.01422115 # e # The expected value of false positives would be lower than 0.05. The expected # value of true negatives would be 0, as there are no negatives (the null # hypothesis is always the truth). typeIerr <- vector(mode = "logical", length = nit) for (i in 1:nit) { u <- runif(1, -1, 0) x <- rnorm(10, u) my_test <- t.test(x, alternative = "greater", mu = 0) if (my_test$p.value < 0.05) { typeIerr[i] <- T } else { typeIerr[i] <- F } } mean(typeIerr) ## [1] 0.012 sd(typeIerr) / sqrt(nit) ## [1] 0.003444977 Exercise 16.3 (T-test, confidence intervals, and bootstrap) Sample \\(n=20\\) from a standard normal distribution and calculate the p-value using t-test, confidence intervals based on normal distribution, and bootstrap. Repeat this several times and check how many times we rejected the null hypothesis (made a type I error). Hint: For the confidence intervals you can use function CI from the Rmisc package. set.seed(1) library(Rmisc) nit <- 1000 n_boot <- 100 t_logic <- rep(F, nit) boot_logic <- rep(F, nit) norm_logic <- rep(F, nit) for (i in 1:nit) { x <- rnorm(20) my_test <- t.test(x) my_CI <- CI(x) if (my_test$p.value <= 0.05) t_logic[i] <- T boot_tmp <- vector(mode = "numeric", length = n_boot) for (j in 1:n_boot) { tmp_samp <- sample(x, size = 20, replace = T) boot_tmp[j] <- mean(tmp_samp) } if ((quantile(boot_tmp, 0.025) >= 0) | (quantile(boot_tmp, 0.975) <= 0)) { boot_logic[i] <- T } if ((my_CI[3] >= 0) | (my_CI[1] <= 0)) { norm_logic[i] <- T } } mean(t_logic) ## [1] 0.053 sd(t_logic) / sqrt(nit) ## [1] 0.007088106 mean(boot_logic) ## [1] 0.093 sd(boot_logic) / sqrt(nit) ## [1] 0.009188876 mean(norm_logic) ## [1] 0.053 sd(norm_logic) / sqrt(nit) ## [1] 0.007088106 Exercise 16.4 (Chi-squared test) Show that the \\(\\chi^2 = \\sum_{i=1}^k \\frac{(O_i - E_i)^2}{E_i}\\) test statistic is approximately \\(\\chi^2\\) distributed when we have two categories. Let us look at the US voting data here. Compare the number of voters who voted for Trump or Hillary depending on their income (less or more than 100.000 dollars per year). Manually calculate the chi-squared statistic, compare to the chisq.test in R, and discuss the results. 
Visualize the test. Solution. Let \\(X_i\\) be binary variables, \\(i = 1,...,n\\). We can then express the test statistic as \\[\\begin{align} \\chi^2 = &\\frac{(O_i - np)^2}{np} + \\frac{(n - O_i - n(1 - p))^2}{n(1 - p)} \\\\ &= \\frac{(O_i - np)^2}{np(1 - p)} \\\\ &= (\\frac{O_i - np}{\\sqrt{np(1 - p)}})^2. \\end{align}\\] When \\(n\\) is large, this distrbution is approximately normal with \\(\\mu = np\\) and \\(\\sigma^2 = np(1 - p)\\) (binomial converges in distribution to standard normal). By definition, the chi-squared distribution with \\(k\\) degrees of freedom is a sum of squares of \\(k\\) independent standard normal random variables. n <- 24588 less100 <- round(0.66 * n * c(0.49, 0.45, 0.06)) # some rounding, but it should not affect results more100 <- round(0.34 * n * c(0.47, 0.47, 0.06)) x <- rbind(less100, more100) colnames(x) <- c("Clinton", "Trump", "other/no answer") print(x) ## Clinton Trump other/no answer ## less100 7952 7303 974 ## more100 3929 3929 502 chisq.test(x) ## ## Pearson's Chi-squared test ## ## data: x ## X-squared = 9.3945, df = 2, p-value = 0.00912 x ## Clinton Trump other/no answer ## less100 7952 7303 974 ## more100 3929 3929 502 csum <- apply(x, 2, sum) rsum <- apply(x, 1, sum) chi2 <- (x[1,1] - csum[1] * rsum[1] / sum(x))^2 / (csum[1] * rsum[1] / sum(x)) + (x[1,2] - csum[2] * rsum[1] / sum(x))^2 / (csum[2] * rsum[1] / sum(x)) + (x[1,3] - csum[3] * rsum[1] / sum(x))^2 / (csum[3] * rsum[1] / sum(x)) + (x[2,1] - csum[1] * rsum[2] / sum(x))^2 / (csum[1] * rsum[2] / sum(x)) + (x[2,2] - csum[2] * rsum[2] / sum(x))^2 / (csum[2] * rsum[2] / sum(x)) + (x[2,3] - csum[3] * rsum[2] / sum(x))^2 / (csum[3] * rsum[2] / sum(x)) chi2 ## Clinton ## 9.394536 1 - pchisq(chi2, df = 2) ## Clinton ## 0.009120161 x <- seq(0, 15, by = 0.01) df <- data.frame(x = x) ggplot(data = df, aes(x = x)) + stat_function(fun = dchisq, args = list(df = 2)) + geom_segment(aes(x = chi2, y = 0, xend = chi2, yend = dchisq(chi2, df = 2))) + stat_function(fun = dchisq, args = list(df = 2), xlim = c(chi2, 15), geom = "area", fill = "red") "],["bi.html", "Chapter 17 Bayesian inference 17.1 Conjugate priors 17.2 Posterior sampling", " Chapter 17 Bayesian inference This chapter deals with Bayesian inference. The students are expected to acquire the following knowledge: How to set prior distribution. Compute posterior distribution. Compute posterior predictive distribution. Use sampling for inference. .fold-btn { float: right; margin: 5px 5px 0 0; } .fold { border: 1px solid black; min-height: 40px; } 17.1 Conjugate priors Exercise 17.1 (Poisson-gamma model) Let us assume a Poisson likelihood and a gamma prior on the Poisson mean parameter (this is a conjugate prior). Derive posterior Below we have some data, which represents number of goals in a football match. Choose sensible prior for this data (draw the gamma density if necessary), justify it. Compute the posterior. Compute an interval such that the probability that the true mean is in there is 95%. What is the probability that the true mean is greater than 2.5? Back to theory: Compute prior predictive and posterior predictive. Discuss why the posterior predictive is overdispersed and not Poisson? Draw a histogram of the prior predictive and posterior predictive for the data from (b). Discuss. Generate 10 and 100 random samples from a Poisson distribution and compare the posteriors with a flat prior, and a prior concentrated away from the truth. x <- c(3, 2, 1, 1, 5, 4, 0, 0, 4, 3) Solution. 
\\[\\begin{align*} p(\\lambda | X) &= \\frac{p(X | \\lambda) p(\\lambda)}{\\int_0^\\infty p(X | \\lambda) p(\\lambda) d\\lambda} \\\\ &\\propto p(X | \\lambda) p(\\lambda) \\\\ &= \\Big(\\prod_{i=1}^n \\frac{1}{x_i!} \\lambda^{x_i} e^{-\\lambda}\\Big) \\frac{\\beta^\\alpha}{\\Gamma(\\alpha)} \\lambda^{\\alpha - 1} e^{-\\beta \\lambda} \\\\ &\\propto \\lambda^{\\sum_{i=1}^n x_i + \\alpha - 1} e^{- \\lambda (n + \\beta)} \\\\ \\end{align*}\\] We recognize this as the shape of a gamma distribution, therefore \\[\\begin{align*} \\lambda | X \\sim \\text{gamma}(\\alpha + \\sum_{i=1}^n x_i, \\beta + n) \\end{align*}\\] For the prior predictive, we have \\[\\begin{align*} p(x^*) &= \\int_0^\\infty p(x^*, \\lambda) d\\lambda \\\\ &= \\int_0^\\infty p(x^* | \\lambda) p(\\lambda) d\\lambda \\\\ &= \\int_0^\\infty \\frac{1}{x^*!} \\lambda^{x^*} e^{-\\lambda} \\frac{\\beta^\\alpha}{\\Gamma(\\alpha)} \\lambda^{\\alpha - 1} e^{-\\beta \\lambda} d\\lambda \\\\ &= \\frac{\\beta^\\alpha}{\\Gamma(x^* + 1)\\Gamma(\\alpha)} \\int_0^\\infty \\lambda^{x^* + \\alpha - 1} e^{-\\lambda (1 + \\beta)} d\\lambda \\\\ &= \\frac{\\beta^\\alpha}{\\Gamma(x^* + 1)\\Gamma(\\alpha)} \\frac{\\Gamma(x^* + \\alpha)}{(1 + \\beta)^{x^* + \\alpha}} \\int_0^\\infty \\frac{(1 + \\beta)^{x^* + \\alpha}}{\\Gamma(x^* + \\alpha)} \\lambda^{x^* + \\alpha - 1} e^{-\\lambda (1 + \\beta)} d\\lambda \\\\ &= \\frac{\\beta^\\alpha}{\\Gamma(x^* + 1)\\Gamma(\\alpha)} \\frac{\\Gamma(x^* + \\alpha)}{(1 + \\beta)^{x^* + \\alpha}} \\\\ &= \\frac{\\Gamma(x^* + \\alpha)}{\\Gamma(x^* + 1)\\Gamma(\\alpha)} (\\frac{\\beta}{1 + \\beta})^\\alpha (\\frac{1}{1 + \\beta})^{x^*}, \\end{align*}\\] which we recognize as the negative binomial distribution with \\(r = \\alpha\\) and \\(p = \\frac{1}{\\beta + 1}\\). For the posterior predictive, the calculation is the same, only now the parameters are \\(r = \\alpha + \\sum_{i=1}^n x_i\\) and \\(p = \\frac{1}{\\beta + n + 1}\\). There are two sources of uncertainty in the predictive distribution. First is the uncertainty about the population. Second is the variability in sampling from the population. When \\(n\\) is large, the latter is going to be very small. But when \\(n\\) is small, the latter is going to be higher, resulting in an overdispersed predictive distribution. 
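This overdispersion can be made concrete by comparing the mean and variance of the negative binomial posterior predictive. The sketch below assumes the \\(\\alpha = \\beta = 1\\) prior that is also used in the solution code further down:

```r
# Sketch: mean and variance of the posterior predictive for the data below,
# assuming the alpha = beta = 1 prior used in the solution code.
x <- c(3, 2, 1, 1, 5, 4, 0, 0, 4, 3)
palpha <- 1; pbeta <- 1; n <- length(x)
size <- palpha + sum(x)                  # r of the negative binomial
prob <- (pbeta + n) / (pbeta + n + 1)    # rnbinom's prob, i.e. 1 - 1/(beta + n + 1)
pred_mean <- size * (1 - prob) / prob    # equals the posterior mean of lambda
pred_var  <- size * (1 - prob) / prob^2  # larger than the mean: overdispersion
c(mean = pred_mean, var = pred_var)
```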
x <- c(3, 2, 1, 1, 5, 4, 0, 0, 4, 3) # b # quick visual check of the prior ggplot(data = data.frame(x = seq(0, 5, by = 0.01)), aes(x = x)) + stat_function(fun = dgamma, args = list(shape = 1, rate = 1)) palpha <- 1 pbeta <- 1 alpha_post <- palpha + sum(x) beta_post <- pbeta + length(x) ggplot(data = data.frame(x = seq(0, 5, by = 0.01)), aes(x = x)) + stat_function(fun = dgamma, args = list(shape = alpha_post, rate = beta_post)) # probability of being higher than 2.5 1 - pgamma(2.5, alpha_post, beta_post) ## [1] 0.2267148 # interval qgamma(c(0.025, 0.975), alpha_post, beta_post) ## [1] 1.397932 3.137390 # d prior_pred <- rnbinom(1000, size = palpha, prob = 1 - 1 / (pbeta + 1)) post_pred <- rnbinom(1000, size = palpha + sum(x), prob = 1 - 1 / (pbeta + 10 + 1)) df <- data.frame(prior = prior_pred, posterior = post_pred) df <- gather(df) ggplot(df, aes(x = value, fill = key)) + geom_histogram(position = "dodge") # e set.seed(1) x1 <- rpois(10, 2.5) x2 <- rpois(100, 2.5) alpha_flat <- 1 beta_flat <- 0.1 alpha_conc <- 50 beta_conc <- 10 n <- 10000 df_flat <- data.frame(x1 = rgamma(n, alpha_flat + sum(x1), beta_flat + 10), x2 = rgamma(n, alpha_flat + sum(x2), beta_flat + 100), type = "flat") df_flat <- tidyr::gather(df_flat, key = "key", value = "value", - type) df_conc <- data.frame(x1 = rgamma(n, alpha_conc + sum(x1), beta_conc + 10), x2 = rgamma(n, alpha_conc + sum(x2), beta_conc + 100), type = "conc") df_conc <- tidyr::gather(df_conc, key = "key", value = "value", - type) df <- rbind(df_flat, df_conc) ggplot(data = df, aes(x = value, color = type)) + facet_wrap(~ key) + geom_density() 17.2 Posterior sampling Exercise 17.2 (Bayesian logistic regression) In Chapter 15 we implemented a MLE for logistic regression (see the code below). For this model, conjugate priors do not exist, which complicates the calculation of the posterior. However, we can use sampling from the numerator of the posterior, using rejection sampling. Set a sensible prior distribution on \\(\\beta\\) and use rejection sampling to find the posterior distribution. In a) you will get a distribution of parameter \\(\\beta\\). Plot the probabilities (as in exercise 15.3) for each sample of \\(\\beta\\) and compare to the truth. Hint: We can use rejection sampling even for functions which are not PDFs – they do not have to sum/integrate to 1. We just need to use a suitable envelope that we know how to sample from. For example, here we could use a uniform distribution and scale it suitably. set.seed(1) inv_log <- function (z) { return (1 / (1 + exp(-z))) } x <- rnorm(100) y <- x y <- rbinom(100, size = 1, prob = inv_log(1.2 * x)) l_logistic <- function (beta, X, y) { logl <- -sum(y * log(inv_log(as.vector(beta %*% X))) + (1 - y) * log((1 - inv_log(as.vector(beta %*% X))))) return(logl) } my_optim <- optim(par = 0.5, fn = l_logistic, method = "L-BFGS-B", lower = 0, upper = 10, X = x, y = y) my_optim$par # Let's say we believe that the mean of beta is 0.5. Since we are not very sure # about this, we will give it a relatively high variance. So a normal prior with # mean 0.5 and standard deviation 5. But there is no right solution to this, # this is basically us expressing our prior belief in the parameter values. 
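A quick plot of this prior density can help judge whether it is sensible; a small sketch using the same stat_function pattern as elsewhere in this chapter:

```r
# Sketch: density of the N(0.5, sd = 5) prior just described.
library(ggplot2)
ggplot(data.frame(beta = seq(-15, 15, by = 0.1)), aes(x = beta)) +
  stat_function(fun = dnorm, args = list(mean = 0.5, sd = 5)) +
  labs(y = "prior density")
```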
set.seed(1) inv_log <- function (z) { return (1 / (1 + exp(-z))) } x <- rnorm(100) y <- x y <- rbinom(100, size = 1, prob = inv_log(1.2 * x)) l_logistic <- function (beta, X, y) { logl <- -sum(y * log(inv_log(as.vector(beta %*% X))) + (1 - y) * log((1 - inv_log(as.vector(beta %*% X))))) if (is.nan(logl)) logl <- Inf return(logl) } my_optim <- optim(par = 0.5, fn = l_logistic, method = "L-BFGS-B", lower = 0, upper = 10, X = x, y = y) my_optim$par ## [1] 1.166558 f_logistic <- function (beta, X, y) { logl <- prod(inv_log(as.vector(beta %*% X))^y * (1 - inv_log(as.vector(beta %*% X)))^(1 - y)) return(logl) } a <- seq(0, 3, by = 0.01) my_l <- c() for (i in a) { my_l <- c(my_l, f_logistic(i, x, y) * dnorm(i, 0.5, 5)) } plot(my_l) envlp <- 10^(-25.8) * dunif(a, -5, 5) # found by trial and error tmp <- data.frame(envel = envlp, l = my_l, t = a) tmp <- gather(tmp, key = "key", value = "value", - t) ggplot(tmp, aes(x = t, y = value, color = key)) + geom_line() # envelope OK set.seed(1) nsamps <- 1000 samps <- c() for (i in 1:nsamps) { tmp <- runif(1, -5, 5) u <- runif(1, 0, 1) if (u < (f_logistic(tmp, x, y) * dnorm(tmp, 0.5, 5)) / (10^(-25.8) * dunif(tmp, -5, 5))) { samps <- c(samps, tmp) } } plot(density(samps)) mean(samps) ## [1] 1.211578 median(samps) ## [1] 1.204279 truth_p <- data.frame(x = x, prob = inv_log(1.2 * x), type = "truth") preds <- inv_log(x %*% t(samps)) preds <- gather(cbind(as.data.frame(preds), x = x), key = "key", "value" = value, - x) ggplot(preds, aes(x = x, y = value)) + geom_line(aes(group = key), color = "gray", alpha = 0.7) + geom_point(data = truth_p, aes(y = prob), color = "red", alpha = 0.7) + theme_bw() "],["distributions-intutition.html", "Chapter 18 Distributions intutition 18.1 Discrete distributions 18.2 Continuous distributions", " Chapter 18 Distributions intutition This chapter is intended to help you familiarize yourself with the different probability distributions you will encounter in this course. Use Appendix B as a reference for the basic properties of distributions. .fold-btn { float: right; margin: 5px 5px 0 0; } .fold { border: 1px solid black; min-height: 40px; } 18.1 Discrete distributions Exercise 18.1 (Bernoulli intuition 1) Arguably the simplest distribution you will enocounter is the Bernoulli distribution. It is a discrete probability distribution used to represent the outcome of a yes/no question. It has one parameter \\(p\\) which is the probability of success. The probability of failure is \\(1-p\\), sometimes denoted as \\(q\\). A classic way to think about a Bernoulli trial (a yes/no experiment) is a coin flip. Real coins are fair, meaning the probability of either heads (1) or tails (0) are the same i.e. \\(p=0.5\\), shown below in figure a. Alternatively we may want to represent a process that doesn’t have equal probabilities of outcomes like “Will a throw of a fair die result in a 6?”. In this case \\(p=\\frac{1}{6}\\), shown in figure b. Using your knowledge of the Bernoulli distribution use the throw of a fair die to think of events, such that: \\(p = 0.5\\) \\(p = \\frac{5}{6}\\) \\(q = \\frac{2}{3}\\) Solution. An event that is equally likely to happen or not happen i.e. \\(p = 0.5\\) would be throwing an even number. More formally we can name this event \\(A\\) and write: \\(A = \\{2,4,6\\}\\), its probability being \\(P(A) = 0.5\\) An example of an event with \\(p = \\frac{5}{6}\\) would be throwing a number greater than 1. Defined as \\(B = \\{2,3,4,5,6\\}\\). We need an event that fails \\(\\frac{2}{3}\\) of the time. 
Alternatively we can reverse the problem and find an event that succeeds \\(\\frac{1}{3}\\) of the time, since: \\(q = 1 - p \\implies p = 1 - q = \\frac{1}{3}\\). The event that our outcome is divisible by 3: \\(C = \\{3, 6\\}\\) satisfies this condition. Exercise 18.2 (Binomial intuition 1) The binomial distribution is a generalization of the Bernoulli distribution. Instead of considering a single Bernoulli trial, we now consider a sequence of \\(n\\) trials, which are independent and have the same parameter \\(p\\). So the binomial distribution has two parameters \\(n\\) - the number of trials and \\(p\\) - the probability of success for each trial. If we return to our coin flip representation, we now flip a coin several times. The binomial distribution will give us the probabilities of all possible outcomes. Below we show the distribution for a series of 10 coin flips with a fair coin (left) and a biased coin (right). The numbers on the x axis represent the number of times the coin landed heads. Using your knowledge of the binomial distribution: Take the pmf of the binomial distribution and plug in \\(n=1\\), check that it is in fact equivalent to a Bernoulli distribution. In our examples we show the graph of a binomial distribution over 10 trials with \\(p=0.8\\). If we take a look at the graph, it appears as though the probabilities of getting 0,1,2 or 3 heads in 10 flips are zero. Is it actually zero? Check by plugging in the values into the pmf. Solution. The pmf of a binomial distribution is \\(\\binom{n}{k} p^k (1 - p)^{n - k}\\), now we insert \\(n=1\\) to get: \\[\\binom{1}{k} p^k (1 - p)^{1 - k}\\]. Not quite equivalent to a Bernoulli, however note that the support of the binomial distribution is defined as \\(k \\in \\{0,1,\\dots,n\\}\\), so in our case \\(k = \\{0,1\\}\\), then: \\[\\binom{1}{0} = \\binom{1}{1} = 1\\] we get: \\(p^k (1 - p)^{1 - k}\\) ,the Bernoulli distribution. As we already know \\(p=0.8, n=10\\), so: \\[\\binom{10}{0} 0.8^0 (1 - 0.8)^{10 - 0} = 1.024 \\cdot 10^{-7}\\] \\[\\binom{10}{1} 0.8^1 (1 - 0.8)^{10 - 1} = 4.096 \\cdot 10^{-6}\\] \\[\\binom{10}{2} 0.8^2 (1 - 0.8)^{10 - 2} = 7.3728 \\cdot 10^{-5}\\] \\[\\binom{10}{3} 0.8^3 (1 - 0.8)^{10 - 3} = 7.86432\\cdot 10^{-4}\\] So the probabilities are not zero, just very small. Exercise 18.3 (Poisson intuition 1) Below are shown 3 different graphs of the Poisson distribution. Your task is to replicate them on your own in R by varying the \\(\\lambda\\) parameter. Hint: You can use dpois() to get the probabilities. library(ggplot2) library(gridExtra) x = 0:15 # Create Poisson data data1 <- data.frame(x = x, y = dpois(x, lambda = 0.1)) data2 <- data.frame(x = x, y = dpois(x, lambda = 1)) data3 <- data.frame(x = x, y = dpois(x, lambda = 7.5)) # Create individual ggplot objects plot1 <- ggplot(data1, aes(x, y)) + geom_col() + xlab("x") + ylab("Probability") + ylim(0,1) plot2 <- ggplot(data2, aes(x, y)) + geom_col() + xlab("x") + ylab(NULL) + ylim(0,1) plot3 <- ggplot(data3, aes(x, y)) + geom_col() + xlab("x") + ylab(NULL) + ylim(0,1) # Combine the plots grid.arrange(plot1, plot2, plot3, ncol = 3) Exercise 18.4 (Poisson intuition 2) The Poisson distribution is a discrete probability distribution that models processes where events occur at a constant mean rate and are independent of each other. It has a single parameter \\(\\lambda\\), which represents the constant mean rate. 
A classic example of a scenario that can be modeled using the Poisson distribution is the number of calls received by a call center in a day (or in fact any other time interval). Suppose you work in a call center and have some understanding of probability distributions. You overhear your supervisor mentioning that the call center receives an average of 2.5 calls per day. Using your knowledge of the Poisson distribution, calculate: The probability you will get no calls today. The probability you will get more than 5 calls today. Solution. First recall the Poisson pmf: \\[p(k) = \\frac{\\lambda^k e^{-\\lambda}}{k!}\\] as stated previously our parameter \\(\\lambda = 2.5\\) To get the probability of no calls we simply plug in \\(k = 0\\), so: \\[p(0) = \\frac{2.5^0 e^{-2.5}}{0!} = e^{-2.5} \\approx 0.082\\] The support of the Poisson distribution is non-negative integers. So if we wanted to calculate the probability of getting more than 5 calls we would need to add up the probabilities of getting 6 calls and 7 calls and so on up to infinity. Let us instead remember that the sum of all probabilties will be 1, we will reverse the problem and instead ask “What is the probability we get 5 calls or less?”. We can subtract the probability of the opposite outcome (the complement) from 1 to get the probability of our original question. \\[P(k > 5) = 1 - P(k \\leq 5)\\] \\[P(k \\leq 5) = \\sum_{i=0}^{5} p(i) = p(0) + p(1) + p(2) + p(3) + p(4) + p(5) =\\] \\[= \\frac{2.5^0 e^{-2.5}}{0!} + \\frac{2.5^1 e^{-2.5}}{1!} + \\dots =\\] \\[=0.957979\\] So the probability of geting more than 5 calls will be \\(1 - 0.957979 = 0.042021\\) Exercise 18.5 (Geometric intuition 1) The geometric distribution is a discrete distribution that models the number of failures before the first success in a sequence of independent Bernoulli trials. It has a single parameter \\(p\\), representing the probability of success. Disclaimer: There are two forms of this distribution, the one we just described and another version that models the number of trials before the first success. The difference is subtle yet significant and you are likely to encounter both forms, though here we will limit ourselves to the former. In the graph below we show the pmf of a geometric distribution with \\(p=0.5\\). This can be thought of as the number of successive failures (tails) in the flip of a fair coin. You can see that there’s a 50% chance you will have zero failures i.e. you will flip a heads on your very first attempt. But there is some smaller chance that you will flip a sequence of tails in a row, with longer sequences having ever lower probability. Create an equivalent graph that represents the probability of rolling a 6 with a fair 6-sided die. Use the formula for the mean of the geometric distribution and determine the average number of failures before you roll a 6. Look up the alternative form of the geometric distribtuion and again use the formula for the mean to determine the average number of trials before you roll a 6. Solution. Parameter p (the probability of success) for rolling a 6 is \\(p=\\frac{1}{6}\\). 
library(ggplot2) # Parameters p <- 1/6 x_vals <- 0:9 # Starting from 0 probs <- dgeom(x_vals, p) # Data data <- data.frame(x_vals, probs) # Plot ggplot(data, aes(x=x_vals, y=probs)) + geom_segment(aes(xend=x_vals, yend=0), color="black", size=1) + geom_point(color="red", size=2) + labs(x = "Number of trials", y = "Probability") + theme_minimal() + scale_x_continuous(breaks = x_vals) # This line ensures integer x-axis labels ::: {.solution} b) \\[E[X] = \\frac{1-p}{p}= \\frac{1- \\frac{1}{6}}{\\frac{1}{6}} = \\frac{5}{6}\\cdot 6 = 5\\] On average we will fail 5 times before we roll our first 6. The alternative form of this distribution (with support on all positive integers) has a slightly different formula for the mean. This change reflects the difference in the way we posed our question: \\[E[X] = \\frac{1}{p} = \\frac{1}{\\frac{1}{6}} = 6\\] On average we will have to throw the die 6 times before we roll a 6. ::: 18.2 Continuous distributions "],["A1.html", "A R programming language A.1 Basic characteristics A.2 Why R? A.3 Setting up A.4 R basics A.5 Functions A.6 Other tips A.7 Further reading and references", " A R programming language A.1 Basic characteristics R is free software for statistical computing and graphics. It is widely used by statisticians, scientists, and other professionals for software development and data analysis. It is an interpreted language and therefore the programs do not need compilation. A.2 Why R? R is one of the main two languages used for statistics and machine learning (the other being Python). Pros Libraries. Comprehensive collection of statistical and machine learning packages. Easy to code. Open source. Anyone can access R and develop new methods. Additionally, it is relatively simple to get source code of established methods. Large community. The use of R has been rising for some time, in industry and academia. Therefore a large collection of blogs and tutorials exists, along with people offering help on pages like StackExchange and CrossValidated. Integration with other languages and LaTeX. New methods. Many researchers develop R packages based on their research, therefore new methods are available soon after development. Cons Slow. Programs run slower than in other programming languages, however this can be somewhat ammended by effective coding or integration with other languages. Memory intensive. This can become a problem with large data sets, as they need to be stored in the memory, along with all the information the models produce. Some packages are not as good as they should be, or have poor documentation. Object oriented programming in R can be very confusing and complex. A.3 Setting up https://www.r-project.org/. A.3.1 RStudio RStudio is the most widely used IDE for R. It is free, you can download it from https://rstudio.com/. While console R is sufficient for the requirements of this course, we recommend the students install RStudio for its better user interface. A.3.2 Libraries for data science Listed below are some of the more useful libraries (packages) for data science. Students are also encouraged to find other useful packages. dplyr Efficient data manipulation. Part of the wider package collection called tidyverse. ggplot2 Plotting based on grammar of graphics. stats Several statistical models. rstan Bayesian inference using Hamiltonian Monte Carlo. Very flexible model building. MCMCpack Bayesian inference. rmarkdown, knitr, and bookdown Dynamic reports (for example such as this one). devtools Package development. 
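A minimal example of installing and loading such packages, using dplyr and ggplot2 from the list above:

```r
# Install once, then load in every session (dplyr and ggplot2 as examples).
install.packages(c("dplyr", "ggplot2"))
library(dplyr)
library(ggplot2)
```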
A.4 R basics A.4.1 Variables and types Important information and tips: no type declaration define variables with <- instead of = (although both work, there is a slight difference, additionally most of the packages use the arrow) for strings use \"\" for comments use # change types with as.type() functions no special type for single character like C++ for example n <- 20 x <- 2.7 m <- n # m gets value 20 my_flag <- TRUE student_name <- "Luka" typeof(n) ## [1] "double" typeof(student_name) ## [1] "character" typeof(my_flag) ## [1] "logical" typeof(as.integer(n)) ## [1] "integer" typeof(as.character(n)) ## [1] "character" A.4.2 Basic operations n + x ## [1] 22.7 n - x ## [1] 17.3 diff <- n - x # variable diff gets the difference between n and x diff ## [1] 17.3 n * x ## [1] 54 n / x ## [1] 7.407407 x^2 ## [1] 7.29 sqrt(x) ## [1] 1.643168 n > 2 * n ## [1] FALSE n == n ## [1] TRUE n == 2 * n ## [1] FALSE n != n ## [1] FALSE paste(student_name, "is", n, "years old") ## [1] "Luka is 20 years old" A.4.3 Vectors use c() to combine elements into vectors can only contain one type of variable if different types are provided, all are transformed to the most basic type in the vector access elements by indexes or logical vectors of the same length a scalar value is regarded as a vector of length 1 1:4 # creates a vector of integers from 1 to 4 ## [1] 1 2 3 4 student_ages <- c(20, 23, 21) student_names <- c("Luke", "Jen", "Mike") passed <- c(TRUE, TRUE, FALSE) length(student_ages) ## [1] 3 # access by index student_ages[2] ## [1] 23 student_ages[1:2] ## [1] 20 23 student_ages[2] <- 24 # change values # access by logical vectors student_ages[passed == TRUE] # same as student_ages[passed] ## [1] 20 24 student_ages[student_names %in% c("Luke", "Mike")] ## [1] 20 21 student_names[student_ages > 20] ## [1] "Jen" "Mike" A.4.3.1 Operations with vectors most operations are element-wise if we operate on vectors of different lengths, the shorter vector periodically repeats its elements until it reaches the length of the longer one a <- c(1, 3, 5) b <- c(2, 2, 1) d <- c(6, 7) a + b ## [1] 3 5 6 a * b ## [1] 2 6 5 a + d ## Warning in a + d: longer object length is not a multiple of shorter object ## length ## [1] 7 10 11 a + 2 * b ## [1] 5 7 7 a > b ## [1] FALSE TRUE TRUE b == a ## [1] FALSE FALSE FALSE a %*% b # vector multiplication, not element-wise ## [,1] ## [1,] 13 A.4.4 Factors vectors of finite predetermined classes suitable for categorical variables ordinal (ordered) or nominal (unordered) car_brand <- factor(c("Audi", "BMW", "Mercedes", "BMW"), ordered = FALSE) car_brand ## [1] Audi BMW Mercedes BMW ## Levels: Audi BMW Mercedes freq <- factor(x = NA, levels = c("never","rarely","sometimes","often","always"), ordered = TRUE) freq[1:3] <- c("rarely", "sometimes", "rarely") freq ## [1] rarely sometimes rarely ## Levels: never < rarely < sometimes < often < always freq[4] <- "quite_often" # non-existing level, returns NA ## Warning in `[<-.factor`(`*tmp*`, 4, value = "quite_often"): invalid factor ## level, NA generated freq ## [1] rarely sometimes rarely <NA> ## Levels: never < rarely < sometimes < often < always A.4.5 Matrices two-dimensional generalizations of vectors my_matrix <- matrix(c(1, 2, 1, 5, 4, 2), nrow = 2, byrow = TRUE) my_matrix ## [,1] [,2] [,3] ## [1,] 1 2 1 ## [2,] 5 4 2 my_square_matrix <- matrix(c(1, 3, 2, 3), nrow = 2) my_square_matrix ## [,1] [,2] ## [1,] 1 2 ## [2,] 3 3 my_matrix[1,2] # first row, second column ## [1] 2 my_matrix[2, ] # second row ## [1] 5 4 2 my_matrix[ ,3] # third 
column ## [1] 1 2 A.4.5.1 Matrix functions and operations most operation element-wise mind the dimensions when using matrix multiplication %*% nrow(my_matrix) # number of matrix rows ## [1] 2 ncol(my_matrix) # number of matrix columns ## [1] 3 dim(my_matrix) # matrix dimension ## [1] 2 3 t(my_matrix) # transpose ## [,1] [,2] ## [1,] 1 5 ## [2,] 2 4 ## [3,] 1 2 diag(my_matrix) # the diagonal of the matrix as vector ## [1] 1 4 diag(1, nrow = 3) # creates a diagonal matrix ## [,1] [,2] [,3] ## [1,] 1 0 0 ## [2,] 0 1 0 ## [3,] 0 0 1 det(my_square_matrix) # matrix determinant ## [1] -3 my_matrix + 2 * my_matrix ## [,1] [,2] [,3] ## [1,] 3 6 3 ## [2,] 15 12 6 my_matrix * my_matrix # element-wise multiplication ## [,1] [,2] [,3] ## [1,] 1 4 1 ## [2,] 25 16 4 my_matrix %*% t(my_matrix) # matrix multiplication ## [,1] [,2] ## [1,] 6 15 ## [2,] 15 45 my_vec <- as.vector(my_matrix) # transform to vector my_vec ## [1] 1 5 2 4 1 2 A.4.6 Arrays multi-dimensional generalizations of matrices my_array <- array(c(1, 2, 3, 4, 5, 6, 7, 8), dim = c(2, 2, 2)) my_array[1, 1, 1] ## [1] 1 my_array[2, 2, 1] ## [1] 4 my_array[1, , ] ## [,1] [,2] ## [1,] 1 5 ## [2,] 3 7 dim(my_array) ## [1] 2 2 2 A.4.7 Data frames basic data structure for analysis differ from matrices as columns can be of different types student_data <- data.frame("Name" = student_names, "Age" = student_ages, "Pass" = passed) student_data ## Name Age Pass ## 1 Luke 20 TRUE ## 2 Jen 24 TRUE ## 3 Mike 21 FALSE colnames(student_data) <- c("name", "age", "pass") # change column names student_data[1, ] ## name age pass ## 1 Luke 20 TRUE student_data[ ,colnames(student_data) %in% c("name", "pass")] ## name pass ## 1 Luke TRUE ## 2 Jen TRUE ## 3 Mike FALSE student_data$pass # access column by name ## [1] TRUE TRUE FALSE student_data[student_data$pass == TRUE, ] ## name age pass ## 1 Luke 20 TRUE ## 2 Jen 24 TRUE A.4.8 Lists useful for storing different data structures access elements with double square brackets elements can be named first_list <- list(student_ages, my_matrix, student_data) second_list <- list(student_ages, my_matrix, student_data, first_list) first_list[[1]] ## [1] 20 24 21 second_list[[4]] ## [[1]] ## [1] 20 24 21 ## ## [[2]] ## [,1] [,2] [,3] ## [1,] 1 2 1 ## [2,] 5 4 2 ## ## [[3]] ## name age pass ## 1 Luke 20 TRUE ## 2 Jen 24 TRUE ## 3 Mike 21 FALSE second_list[[4]][[1]] # first element of the fourth element of second_list ## [1] 20 24 21 length(second_list) ## [1] 4 second_list[[length(second_list) + 1]] <- "add_me" # append an element names(first_list) <- c("Age", "Matrix", "Data") first_list$Age ## [1] 20 24 21 A.4.9 Loops mostly for loop for loop can iterate over an arbitrary vector # iterate over consecutive natural numbers my_sum <- 0 for (i in 1:10) { my_sum <- my_sum + i } my_sum ## [1] 55 # iterate over an arbirary vector my_sum <- 0 some_numbers <- c(2, 3.5, 6, 100) for (i in some_numbers) { my_sum <- my_sum + i } my_sum ## [1] 111.5 A.5 Functions for help use ?function_name A.5.1 Writing functions We can write our own functions with function(). In the brackets, we define the parameters the function gets, and in curly brackets we define what the function does. We use return() to return values. sum_first_n_elements <- function (n) { my_sum <- 0 for (i in 1:n) { my_sum <- my_sum + i } return (my_sum) } sum_first_n_elements(10) ## [1] 55 A.6 Other tips Use set.seed(arbitrary_number) at the beginning of a script to set the seed and ensure replication. 
To dynamically set the working directory in R Studio to the parent folder of a R script use setwd(dirname(rstudioapi::getSourceEditorContext()$path)). To avoid slow R loops use the apply family of functions. See ?apply and ?lapply. To make your data manipulation (and therefore your life) a whole lot easier, use the dplyr package. Use getAnywhere(function_name) to get the source code of any function. Use browser for debugging. See ?browser. A.7 Further reading and references Getting started with R Studio: https://www.youtube.com/watch?v=lVKMsaWju8w Official R manuals: https://cran.r-project.org/manuals.html Cheatsheets: https://www.rstudio.com/resources/cheatsheets/ Workshop on R, dplyr, ggplot2, and R Markdown: https://github.com/bstatcomp/Rworkshop "],["distributions.html", "B Probability distributions", " B Probability distributions Name parameters support pdf/pmf mean variance Bernoulli \\(p \\in [0,1]\\) \\(k \\in \\{0,1\\}\\) \\(p^k (1 - p)^{1 - k}\\) 1.12 \\(p\\) 7.1 \\(p(1-p)\\) 7.1 binomial \\(n \\in \\mathbb{N}\\), \\(p \\in [0,1]\\) \\(k \\in \\{0,1,\\dots,n\\}\\) \\(\\binom{n}{k} p^k (1 - p)^{n - k}\\) 4.4 \\(np\\) 7.2 \\(np(1-p)\\) 7.2 Poisson \\(\\lambda > 0\\) \\(k \\in \\mathbb{N}_0\\) \\(\\frac{\\lambda^k e^{-\\lambda}}{k!}\\) 4.6 \\(\\lambda\\) 7.3 \\(\\lambda\\) 7.3 geometric \\(p \\in (0,1]\\) \\(k \\in \\mathbb{N}_0\\) \\(p(1-p)^k\\) 4.5 \\(\\frac{1 - p}{p}\\) 7.4 \\(\\frac{1 - p}{p^2}\\) 9.3 normal \\(\\mu \\in \\mathbb{R}\\), \\(\\sigma^2 > 0\\) \\(x \\in \\mathbb{R}\\) \\(\\frac{1}{\\sqrt{2 \\pi \\sigma^2}} e^{-\\frac{(x - \\mu)^2}{2 \\sigma^2}}\\) 4.12 \\(\\mu\\) 7.8 \\(\\sigma^2\\) 7.8 uniform \\(a,b \\in \\mathbb{R}\\), \\(a < b\\) \\(x \\in [a,b]\\) \\(\\frac{1}{b-a}\\) 4.9 \\(\\frac{a+b}{2}\\) \\(\\frac{(b-a)^2}{12}\\) beta \\(\\alpha,\\beta > 0\\) \\(x \\in [0,1]\\) \\(\\frac{x^{\\alpha - 1} (1 - x)^{\\beta - 1}}{\\text{B}(\\alpha, \\beta)}\\) 4.10 \\(\\frac{\\alpha}{\\alpha + \\beta}\\) 7.6 \\(\\frac{\\alpha \\beta}{(\\alpha + \\beta)^2(\\alpha + \\beta + 1)}\\) 7.6 gamma \\(\\alpha,\\beta > 0\\) \\(x \\in (0, \\infty)\\) \\(\\frac{\\beta^\\alpha}{\\Gamma(\\alpha)} x^{\\alpha - 1}e^{-\\beta x}\\) 4.11 \\(\\frac{\\alpha}{\\beta}\\) 7.5 \\(\\frac{\\alpha}{\\beta^2}\\) 7.5 exponential \\(\\lambda > 0\\) \\(x \\in [0, \\infty)\\) \\(\\lambda e^{-\\lambda x}\\) 4.8 \\(\\frac{1}{\\lambda}\\) 7.7 \\(\\frac{1}{\\lambda^2}\\) 7.7 logistic \\(\\mu \\in \\mathbb{R}\\), \\(s > 0\\) \\(x \\in \\mathbb{R}\\) \\(\\frac{e^{-\\frac{x - \\mu}{s}}}{s(1 + e^{-\\frac{x - \\mu}{s}})^2}\\) 4.13 \\(\\mu\\) \\(\\frac{s^2 \\pi^2}{3}\\) negative binomial \\(r \\in \\mathbb{N}\\), \\(p \\in [0,1]\\) \\(k \\in \\mathbb{N}_0\\) \\(\\binom{k + r - 1}{k}(1-p)^r p^k\\) 4.7 \\(\\frac{rp}{1 - p}\\) 9.2 \\(\\frac{rp}{(1 - p)^2}\\) 9.2 multinomial \\(n \\in \\mathbb{N}\\), \\(k \\in \\mathbb{N}\\) \\(p_i \\in [0,1]\\), \\(\\sum p_i = 1\\) \\(x_i \\in \\{0,..., n\\}\\), \\(i \\in \\{1,...,k\\}\\), \\(\\sum{x_i} = n\\) \\(\\frac{n!}{x_1!x_2!...x_k!} p_1^{x_1} p_2^{x_2}...p_k^{x_k}\\) 8.1 \\(np_i\\) \\(np_i(1-p_i)\\) "],["references.html", "References", " References "],["404.html", "Page not found", " Page not found The page you requested cannot be found (perhaps it was moved or renamed). You may want to try searching to find the page's new location, or use the table of contents to find the page you are looking for. 
"]] diff --git a/docs/uprobspaces.html b/docs/uprobspaces.html index 62d83ea..31c6cd1 100644 --- a/docs/uprobspaces.html +++ b/docs/uprobspaces.html @@ -23,7 +23,7 @@ - + @@ -242,6 +242,11 @@
  • 17.1 Conjugate priors
  • 17.2 Posterior sampling
  • +
  • 18 Distributions intutition +
  • Appendix
  • A R programming language
  • -
  • B Probability distributions
  • +
  • B Probability distributions
  • References
  • Published with bookdown
  • From 09cafa288171c790213a284b2e5bcdb0908ae803 Mon Sep 17 00:00:00 2001 From: LeonHvastja Date: Thu, 28 Sep 2023 19:26:56 +0200 Subject: [PATCH 3/7] finished beta and uniform --- .Rproj.user/shared/notebooks/paths | 7 +- 18-distributions_intuition.Rmd | 131 ++++++++- docs/404.html | 2 +- docs/A1.html | 250 +++++++++--------- docs/ard.html | 2 +- docs/bi.html | 2 +- .../figure-html/unnamed-chunk-19-1.png | Bin 21107 -> 33865 bytes .../figure-html/unnamed-chunk-294-1.png | Bin 0 -> 33865 bytes docs/boot.html | 2 +- docs/ci.html | 2 +- docs/condprob.html | 2 +- docs/crv.html | 2 +- docs/distributions-intutition.html | 102 ++++++- docs/distributions.html | 2 +- docs/eb.html | 2 +- docs/ev.html | 2 +- docs/index.html | 4 +- docs/integ.html | 2 +- docs/introduction.html | 2 +- docs/lt.html | 2 +- docs/ml.html | 2 +- docs/mrv.html | 2 +- docs/mrvs.html | 2 +- docs/nhst.html | 2 +- docs/reference-keys.txt | 5 + docs/references.html | 2 +- docs/rvs.html | 2 +- docs/search_index.json | 2 +- docs/uprobspaces.html | 2 +- 29 files changed, 385 insertions(+), 156 deletions(-) create mode 100644 docs/bookdown-pou_files/figure-html/unnamed-chunk-294-1.png diff --git a/.Rproj.user/shared/notebooks/paths b/.Rproj.user/shared/notebooks/paths index 26eb1da..b5ac38b 100644 --- a/.Rproj.user/shared/notebooks/paths +++ b/.Rproj.user/shared/notebooks/paths @@ -1,5 +1,2 @@ -C:/Users/Leon/work/course_pou/01-Introduction.Rmd="FFB61D49" -C:/Users/Leon/work/course_pou/04-random_variables.Rmd="AFFDF795" -C:/Users/Leon/work/course_pou/A2-probability_distributions.Rmd="E11827A3" -C:/Users/Leon/work/course_pou/A2-probability_distributions2.Rmd="B44B9051" -C:/Users/Leon/work/course_pou/README.md="E30F7E0B" +/Users/leonhvastja/work/course_pou-leon/18-distributions_intuition.Rmd="D5F01375" +/Users/leonhvastja/work/course_pou-leon/A2-probability_distributions2.Rmd="47392273" diff --git a/18-distributions_intuition.Rmd b/18-distributions_intuition.Rmd index a74f261..1deb49e 100644 --- a/18-distributions_intuition.Rmd +++ b/18-distributions_intuition.Rmd @@ -349,7 +349,7 @@ ggplot(data, aes(x=x_vals, y=probs)) + ``` ```{solution, echo = togs} -b) +b) The expected value of a random variable (the mean) is denoted as $E[X]$. $$E[X] = \frac{1-p}{p}= \frac{1- \frac{1}{6}}{\frac{1}{6}} = \frac{5}{6}\cdot 6 = 5$$ On average we will fail 5 times before we roll our first 6. @@ -363,3 +363,132 @@ c) The alternative form of this distribution (with support on all positive integ ## Continuous distributions +```{exercise, name = "Uniform intuition 1"} +The need for a randomness is a common problem. A practical solution are so-called random number generators (RNGs). The simplest RNG one would think of is choosing a set of numbers and having the generator return a number at random, where the probability of returning any number from this set is the same. If this set is an interval of real numbers, then we've basically described the continuous uniform distribution. + +It has two parameters $a$ and $b$, which define the beginning and end of its support respectively. + +a) Let's think of the mean intuitively. The expected value or mean of a distribution is the pivot point on our x-axis, which "balances" the graph. Given parameters $a$ and $b$ what is your intuitive guess of the mean for this distribution? +b) A special case of the uniform distribution is the **standard uniform distribution** with $a=0$,$b=1$. Write the pdf of this particular case. 
+``` +```{r, fig.width=5, fig.height=3, echo=FALSE, warning=FALSE, message=FALSE} +# Load required libraries +library(ggplot2) + +# Create data for a uniform distribution +df <- data.frame(x = c("a", "b"), y = c(1, 1)) + +# Plot +p <- ggplot(df, aes(x, y)) + + geom_line(aes(group = 1), color = "blue") + + geom_point(size = 2) + + geom_segment(aes(x = "a", xend = "a", y = 0, yend = 1), linetype = "dotted") + + geom_segment(aes(x = "b", xend = "b", y = 0, yend = 1), linetype = "dotted") + + labs(x = "Value", y = "Density", title = "Uniform Distribution") + + theme_minimal() + + theme(axis.text.y = element_blank(), + axis.ticks.y = element_blank()) + + annotate("text", x = "a", y = 1, label = "1/(b-a)", hjust = 1.2, vjust = 0.5) + + annotate("text", x = "a", y = 0, label = "0", hjust = 1.5, vjust = 0.5) + +print(p) +``` +
+```{solution, echo = togs} +a. It's the midpoint between $a$ and $b$: $a + \frac{b-a}{2} = \frac{a+b}{2}$. +b. Inserting the parameter values we get: $$f(x) = +\begin{cases} +1 & \text{if } 0 \leq x \leq 1 \\ +0 & \text{otherwise} +\end{cases} +$$ +Notice how the pdf is just a constant $1$ across all values of $x \in [0,1]$. Here it is important to distinguish between probability and **probability density**. The density may be 1, but the probability is not; while discrete distributions never exceed 1 on the y-axis, continuous densities can go as high as you like. +```
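+A quick numerical check of this special case; a minimal sketch using base R's dunif and integrate:
+
+```
+# standard uniform: the density is the constant 1 on [0, 1] ...
+dunif(c(0.1, 0.5, 0.9))                       # 1 1 1
+# ... yet the total probability still integrates to 1
+integrate(dunif, lower = 0, upper = 1)$value  # 1
+```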
    + + +```{exercise, name = "Normal intuition 1"} +a +``` + +```{exercise, name = "Beta intuition 1"} +The beta distribution is a continuous distribution defined on the unit interval $[0,1]$. It has two strictly positive paramters $\alpha$ and $\beta$, which determine its shape. Its support makes it especially suitable to model distribtuions of percentages and proportions. + +Below you've been provided with some code that you can copy into Rstudio. Once you run the code, an interactive Shiny app will appear and you will be able to manipulate the graph of the beta distribution. + +Play around with the parameters to get: + +a) A straight line from (0,0) to (1,2) +b) A straight line from (0,2) to (1,0) +c) A symmetric bell curve +d) A bowl-shaped curve +e) The standard uniform distribution is actually a special case of the beta distribution. Find the exact parameters $\alpha$ and $\beta$. Once you do, prove the equality by inserting the values into our pdf. + +*Hint*: The beta function is evaluated as $\text{B}(a,b) = \frac{\Gamma(a)\Gamma(b)}{\Gamma(a+b)}$, + the gamma function for **positive integers** $n$ is evaluated as $\Gamma(n)= (n-1)!$ +``` + +``` +# Install and load necessary packages +install.packages(c("shiny", "ggplot2")) +library(shiny) +library(ggplot2) + +# The Shiny App +ui <- fluidPage( + titlePanel("Beta Distribution Viewer"), + + sidebarLayout( + sidebarPanel( + sliderInput("alpha", "Alpha:", min = 0.1, max = 10, value = 2, step = 0.1), + sliderInput("beta", "Beta:", min = 0.1, max = 10, value = 2, step = 0.1) + ), + + mainPanel( + plotOutput("betaPlot") + ) + ) +) + +server <- function(input, output) { + output$betaPlot <- renderPlot({ + x <- seq(0, 1, by = 0.01) + y <- dbeta(x, shape1 = input$alpha, shape2 = input$beta) + + ggplot(data.frame(x = x, y = y), aes(x = x, y = y)) + + geom_line() + + labs(x = "Value", y = "Density") + + theme_minimal() + }) +} + +shinyApp(ui = ui, server = server) + +``` +
    +```{solution, echo = togs} + a) $\alpha = 2, \beta=1$ + + b) $\alpha = 1, \beta=2$ + + c) Possible solution $\alpha = \beta= 5$ + + d) Possible solution $\alpha = \beta= 0.5$ + + e) The correct parameters are $\alpha = 1, \beta=1$, to prove the equality we insert them into the beta pdf: + $$\frac{x^{\alpha - 1} (1 - x)^{\beta - 1}}{\text{B}(\alpha, \beta)} = + \frac{x^{1 - 1} (1 - x)^{1 - 1}}{\text{B}(1, 1)} = + \frac{1}{\frac{\Gamma(1)\Gamma(1)}{\Gamma(1+1)}}= + \frac{1}{\frac{(1-1)!(1-1)!}{(2-1)!}} = 1$$ +``` +
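+The equality in e) can also be checked numerically; a minimal sketch comparing dbeta with $\alpha = \beta = 1$ to the standard uniform density:
+
+```
+x <- seq(0, 1, by = 0.25)
+dbeta(x, shape1 = 1, shape2 = 1)      # 1 1 1 1 1
+all.equal(dbeta(x, 1, 1), dunif(x))   # TRUE
+```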
    + + +```{exercise, name = "Gamma intuition 1"} +a +``` + +```{exercise, name = "Exponential intuition 1"} +a +``` + diff --git a/docs/404.html b/docs/404.html index 1e124f5..9efe05c 100644 --- a/docs/404.html +++ b/docs/404.html @@ -23,7 +23,7 @@ - + diff --git a/docs/A1.html b/docs/A1.html index 0e8faae..9ee3176 100644 --- a/docs/A1.html +++ b/docs/A1.html @@ -23,7 +23,7 @@ - + @@ -357,48 +357,48 @@

    A.4.1 Variables and typeschange types with as.type() functions
  • no special type for single character like C++ for example
  • -
    n            <- 20
    -x            <- 2.7
    -m            <- n # m gets value 20
    -my_flag      <- TRUE
    -student_name <- "Luka"
    -typeof(n)
    +
    n            <- 20
    +x            <- 2.7
    +m            <- n # m gets value 20
    +my_flag      <- TRUE
    +student_name <- "Luka"
    +typeof(n)
    ## [1] "double"
    -
    typeof(student_name)
    +
    typeof(student_name)
    ## [1] "character"
    -
    typeof(my_flag)
    +
    typeof(my_flag)
    ## [1] "logical"
    -
    typeof(as.integer(n))
    +
    typeof(as.integer(n))
    ## [1] "integer"
    -
    typeof(as.character(n))
    +
    typeof(as.character(n))
    ## [1] "character"

    A.4.2 Basic operations

    -
    n + x
    +
    n + x
    ## [1] 22.7
    -
    n - x
    +
    n - x
    ## [1] 17.3
    -
    diff <- n - x # variable diff gets the difference between n and x
    -diff
    +
    diff <- n - x # variable diff gets the difference between n and x
    +diff
    ## [1] 17.3
    -
    n * x
    +
    n * x
    ## [1] 54
    -
    n / x
    +
    n / x
    ## [1] 7.407407
    -
    x^2
    +
    x^2
    ## [1] 7.29
    -
    sqrt(x)
    +
    sqrt(x)
    ## [1] 1.643168
    -
    n > 2 * n
    +
    n > 2 * n
    ## [1] FALSE
    -
    n == n
    +
    n == n
    ## [1] TRUE
    -
    n == 2 * n
    +
    n == 2 * n
    ## [1] FALSE
    -
    n != n
    +
    n != n
    ## [1] FALSE
    -
    paste(student_name, "is", n, "years old")
    +
    paste(student_name, "is", n, "years old")
    ## [1] "Luka is 20 years old"
    @@ -410,26 +410,26 @@

    A.4.3 Vectors
    1:4 # creates a vector of integers from 1 to 4

    +
    1:4 # creates a vector of integers from 1 to 4
    ## [1] 1 2 3 4
    -
    student_ages  <- c(20, 23, 21)
    -student_names <- c("Luke", "Jen", "Mike")
    -passed        <- c(TRUE, TRUE, FALSE)
    -length(student_ages)
    +
    student_ages  <- c(20, 23, 21)
    +student_names <- c("Luke", "Jen", "Mike")
    +passed        <- c(TRUE, TRUE, FALSE)
    +length(student_ages)
    ## [1] 3
    -
    # access by index
    -student_ages[2] 
    +
    # access by index
    +student_ages[2] 
    ## [1] 23
    -
    student_ages[1:2]
    +
    student_ages[1:2]
    ## [1] 20 23
    -
    student_ages[2] <- 24 # change values
    -
    -# access by logical vectors
    -student_ages[passed == TRUE] # same as student_ages[passed]
    +
    student_ages[2] <- 24 # change values
    +
    +# access by logical vectors
    +student_ages[passed == TRUE] # same as student_ages[passed]
    ## [1] 20 24
    -
    student_ages[student_names %in% c("Luke", "Mike")]
    +
    student_ages[student_names %in% c("Luke", "Mike")]
    ## [1] 20 21
    -
    student_names[student_ages > 20]
    +
    student_names[student_ages > 20]
    ## [1] "Jen"  "Mike"

    A.4.3.1 Operations with vectors

    @@ -437,24 +437,24 @@

    A.4.3.1 Operations with vectorsmost operations are element-wise
  • if we operate on vectors of different lengths, the shorter vector periodically repeats its elements until it reaches the length of the longer one
  • -
    a <- c(1, 3, 5)
    -b <- c(2, 2, 1)
    -d <- c(6, 7)
    -a + b
    +
    a <- c(1, 3, 5)
    +b <- c(2, 2, 1)
    +d <- c(6, 7)
    +a + b
    ## [1] 3 5 6
    -
    a * b
    +
    a * b
    ## [1] 2 6 5
    -
    a + d
    +
    a + d
    ## Warning in a + d: longer object length is not a multiple of shorter object
     ## length
    ## [1]  7 10 11
    -
    a + 2 * b
    +
    a + 2 * b
    ## [1] 5 7 7
    -
    a > b
    +
    a > b
    ## [1] FALSE  TRUE  TRUE
    -
    b == a
    +
    b == a
    ## [1] FALSE FALSE FALSE
    -
    a %*% b # vector multiplication, not element-wise
    +
    a %*% b # vector multiplication, not element-wise
    ##      [,1]
     ## [1,]   13

    @@ -466,21 +466,21 @@

    A.4.4 Factors
    car_brand <- factor(c("Audi", "BMW", "Mercedes", "BMW"), ordered = FALSE)
    -car_brand
    +
    car_brand <- factor(c("Audi", "BMW", "Mercedes", "BMW"), ordered = FALSE)
    +car_brand
    ## [1] Audi     BMW      Mercedes BMW     
     ## Levels: Audi BMW Mercedes
    -
    freq      <- factor(x       = NA,
    -                    levels  = c("never","rarely","sometimes","often","always"),
    -                    ordered = TRUE)
    -freq[1:3] <- c("rarely", "sometimes", "rarely")
    -freq
    +
    freq      <- factor(x       = NA,
    +                    levels  = c("never","rarely","sometimes","often","always"),
    +                    ordered = TRUE)
    +freq[1:3] <- c("rarely", "sometimes", "rarely")
    +freq
    ## [1] rarely    sometimes rarely   
     ## Levels: never < rarely < sometimes < often < always
    -
    freq[4]   <- "quite_often" # non-existing level, returns NA
    +
    freq[4]   <- "quite_often" # non-existing level, returns NA
    ## Warning in `[<-.factor`(`*tmp*`, 4, value = "quite_often"): invalid factor
     ## level, NA generated
    -
    freq
    +
    freq
    ## [1] rarely    sometimes rarely    <NA>     
     ## Levels: never < rarely < sometimes < often < always
    @@ -489,26 +489,26 @@

    A.4.5 Matrices
    my_matrix <- matrix(c(1, 2, 1,
    -                      5, 4, 2),
    -                    nrow  = 2,
    -                    byrow = TRUE)
    -my_matrix
    +
    my_matrix <- matrix(c(1, 2, 1,
    +                      5, 4, 2),
    +                    nrow  = 2,
    +                    byrow = TRUE)
    +my_matrix
    ##      [,1] [,2] [,3]
     ## [1,]    1    2    1
     ## [2,]    5    4    2
    -
    my_square_matrix <- matrix(c(1, 3,
    -                             2, 3),
    -                           nrow  = 2)
    -my_square_matrix
    +
    my_square_matrix <- matrix(c(1, 3,
    +                             2, 3),
    +                           nrow  = 2)
    +my_square_matrix
    ##      [,1] [,2]
     ## [1,]    1    2
     ## [2,]    3    3
    -
    my_matrix[1,2] # first row, second column
    +
    my_matrix[1,2] # first row, second column
    ## [1] 2
    -
    my_matrix[2, ] # second row
    +
    my_matrix[2, ] # second row
    ## [1] 5 4 2
    -
    my_matrix[ ,3] # third column
    +
    my_matrix[ ,3] # third column
    ## [1] 1 2

    A.4.5.1 Matrix functions and operations

    @@ -516,40 +516,40 @@

    A.4.5.1 Matrix functions and oper
  • most operation element-wise
  • mind the dimensions when using matrix multiplication %*%
  • -
    nrow(my_matrix) # number of matrix rows
    +
    nrow(my_matrix) # number of matrix rows
    ## [1] 2
    -
    ncol(my_matrix) # number of matrix columns
    +
    ncol(my_matrix) # number of matrix columns
    ## [1] 3
    -
    dim(my_matrix) # matrix dimension
    +
    dim(my_matrix) # matrix dimension
    ## [1] 2 3
    -
    t(my_matrix) # transpose
    +
    t(my_matrix) # transpose
    ##      [,1] [,2]
     ## [1,]    1    5
     ## [2,]    2    4
     ## [3,]    1    2
    -
    diag(my_matrix) # the diagonal of the matrix as vector
    +
    diag(my_matrix) # the diagonal of the matrix as vector
    ## [1] 1 4
    -
    diag(1, nrow = 3) # creates a diagonal matrix
    +
    diag(1, nrow = 3) # creates a diagonal matrix
    ##      [,1] [,2] [,3]
     ## [1,]    1    0    0
     ## [2,]    0    1    0
     ## [3,]    0    0    1
    -
    det(my_square_matrix) # matrix determinant
    +
    det(my_square_matrix) # matrix determinant
    ## [1] -3
    -
    my_matrix + 2 * my_matrix
    +
    my_matrix + 2 * my_matrix
    ##      [,1] [,2] [,3]
     ## [1,]    3    6    3
     ## [2,]   15   12    6
    -
    my_matrix * my_matrix # element-wise multiplication
    +
    my_matrix * my_matrix # element-wise multiplication
    ##      [,1] [,2] [,3]
     ## [1,]    1    4    1
     ## [2,]   25   16    4
    -
    my_matrix %*% t(my_matrix) # matrix multiplication
    +
    my_matrix %*% t(my_matrix) # matrix multiplication
    ##      [,1] [,2]
     ## [1,]    6   15
     ## [2,]   15   45
    -
    my_vec <- as.vector(my_matrix) # transform to vector
    -my_vec
    +
    my_vec <- as.vector(my_matrix) # transform to vector
    +my_vec
    ## [1] 1 5 2 4 1 2

    @@ -558,16 +558,16 @@

    A.4.6 Arrays
    my_array <- array(c(1, 2, 3, 4, 5, 6, 7, 8), dim = c(2, 2, 2))
    -my_array[1, 1, 1]
    +
    my_array <- array(c(1, 2, 3, 4, 5, 6, 7, 8), dim = c(2, 2, 2))
    +my_array[1, 1, 1]
    ## [1] 1
    -
    my_array[2, 2, 1]
    +
    my_array[2, 2, 1]
    ## [1] 4
    -
    my_array[1, , ]
    +
    my_array[1, , ]
    ##      [,1] [,2]
     ## [1,]    1    5
     ## [2,]    3    7
    -
    dim(my_array)
    +
    dim(my_array)
    ## [1] 2 2 2
    @@ -576,26 +576,26 @@

    A.4.7 Data frames
    student_data <- data.frame("Name" = student_names, 
    -                           "Age"  = student_ages, 
    -                           "Pass" = passed)
    -student_data

    +
    student_data <- data.frame("Name" = student_names, 
    +                           "Age"  = student_ages, 
    +                           "Pass" = passed)
    +student_data
    ##   Name Age  Pass
     ## 1 Luke  20  TRUE
     ## 2  Jen  24  TRUE
     ## 3 Mike  21 FALSE
    -
    colnames(student_data) <- c("name", "age", "pass") # change column names
    -student_data[1, ]
    +
    colnames(student_data) <- c("name", "age", "pass") # change column names
    +student_data[1, ]
    ##   name age pass
     ## 1 Luke  20 TRUE
    -
    student_data[ ,colnames(student_data) %in% c("name", "pass")]
    +
    student_data[ ,colnames(student_data) %in% c("name", "pass")]
    ##   name  pass
     ## 1 Luke  TRUE
     ## 2  Jen  TRUE
     ## 3 Mike FALSE
    -
    student_data$pass # access column by name
    +
    student_data$pass # access column by name
    ## [1]  TRUE  TRUE FALSE
    -
    student_data[student_data$pass == TRUE, ]
    +
    student_data[student_data$pass == TRUE, ]
    ##   name age pass
     ## 1 Luke  20 TRUE
     ## 2  Jen  24 TRUE
    @@ -607,11 +607,11 @@

    A.4.8 Lists
    first_list  <- list(student_ages, my_matrix, student_data)
    -second_list <- list(student_ages, my_matrix, student_data, first_list)
    -first_list[[1]]
    +
    first_list  <- list(student_ages, my_matrix, student_data)
    +second_list <- list(student_ages, my_matrix, student_data, first_list)
    +first_list[[1]]
    ## [1] 20 24 21
    -
    second_list[[4]]
    +
    second_list[[4]]
    ## [[1]]
     ## [1] 20 24 21
     ## 
    @@ -625,13 +625,13 @@ 

    A.4.8 Lists
    second_list[[4]][[1]] # first element of the fourth element of second_list
    +
    second_list[[4]][[1]] # first element of the fourth element of second_list
    ## [1] 20 24 21
    -
    length(second_list)
    +
    length(second_list)
    ## [1] 4
    -
    second_list[[length(second_list) + 1]] <- "add_me" # append an element
    -names(first_list) <- c("Age", "Matrix", "Data")
    -first_list$Age
    +
    second_list[[length(second_list) + 1]] <- "add_me" # append an element
    +names(first_list) <- c("Age", "Matrix", "Data")
    +first_list$Age
    ## [1] 20 24 21
    @@ -640,20 +640,20 @@

    A.4.9 Loops
    # iterate over consecutive natural numbers
    -my_sum <- 0
    -for (i in 1:10) {
    -  my_sum <- my_sum + i
    -}
    -my_sum

    +
    # iterate over consecutive natural numbers
    +my_sum <- 0
    +for (i in 1:10) {
    +  my_sum <- my_sum + i
    +}
    +my_sum
    ## [1] 55
    -
    # iterate over an arbirary vector
    -my_sum       <- 0
    -some_numbers <- c(2, 3.5, 6, 100)
    -for (i in some_numbers) {
    -  my_sum <- my_sum + i
    -}
    -my_sum
    +
    # iterate over an arbirary vector
    +my_sum       <- 0
    +some_numbers <- c(2, 3.5, 6, 100)
    +for (i in some_numbers) {
    +  my_sum <- my_sum + i
    +}
    +my_sum
    ## [1] 111.5
    @@ -667,14 +667,14 @@

    A.5.1 Writing functions
    sum_first_n_elements <- function (n) {
    -  my_sum <- 0
    -  for (i in 1:n) {
    -    my_sum <- my_sum + i
    -  }
    -  return (my_sum)
    -}
    -sum_first_n_elements(10)
    +
    sum_first_n_elements <- function (n) {
    +  my_sum <- 0
    +  for (i in 1:n) {
    +    my_sum <- my_sum + i
    +  }
    +  return (my_sum)
    +}
    +sum_first_n_elements(10)
    ## [1] 55
diff --git a/docs/ard.html index 0a0dd4a..385143d 100644 --- a/docs/ard.html +++ b/docs/ard.html @@ -23,7 +23,7 @@ - + diff --git a/docs/bi.html index e40ff02..a1b79a0 100644 --- a/docs/bi.html +++ b/docs/bi.html @@ -23,7 +23,7 @@ - + diff --git a/docs/bookdown-pou_files/figure-html/unnamed-chunk-19-1.png b/docs/bookdown-pou_files/figure-html/unnamed-chunk-19-1.png index 4f341a2503bbdf107eed6e1172f9cd8a55c5618e..a6468f84837319aba2d66f9a046bec806f51762f 100644 GIT binary patch (binary image data omitted) diff --git a/docs/bookdown-pou_files/figure-html/unnamed-chunk-294-1.png b/docs/bookdown-pou_files/figure-html/unnamed-chunk-294-1.png new file mode 100644 index 0000000000000000000000000000000000000000..a6468f84837319aba2d66f9a046bec806f51762f GIT
binary patch literal 33865 zcmeFZby$^a@HV=B|ktCRF}PkLJ2*6s%2zJt#wQy zwxWoGsr_zmMn)*f9)EbTt$NqJ!Kpp7BD-PEFx%pbAx(3iaOeA`L0Vv{&_k?i%0 z(Xi2dgV(HQth47IbEOokrQqH@`=fJ9Ho2RMaJJ$?QB_#EO-;-2=0x10?mB%&S^R?*1#M`$bF0Xw$kN9+QaRf!MN3dX>{?<@8IKN2$9@D>Gwy|$9 zo2Mp=_26^BryQYXhOcI*J<4~5M*g@Lw;KujZ~P#9D{31@|Mlpb5|qw~1k$C#j_7B+ z4@@p|o_+ab@9oe7{7J@jeO!F!rJ*1({ujAy3@w+N1hQjP7<+2&Zu^y9{%rGh=&ovT z(K(-EBRJ*jQS45UUjoSzwZ?{B9^`Xd6}= zsg9+-7@*d$I-cw2DZ&T zRGFsrUgPXk*SSj>`;1GSFB=vuwHT`p`+8VryLJ+qJ2Lj3m*NkXmrLVdZ@g2-euK$me}T!Qp#I@Kd|w>D*&*#R z{#m|OZ6cpa2ZOh=E8;4q66su8w%3bNQ}pzQPrUj1rDa!J~{qWSB^fcE2G%q zcLEevpcM)ie#3%)r{EvNktFPYU%{zQ!uj`iY$@cA?gstDD3laRUgn~vJJ#H1!nkAa zHkZCzT)(bIVsxAwOBhd6>R6XQ{+$#89wK)78z|DZIPc^d$QcvdnZ5^|VpQmTVX1le zD5V;inhe(E%SJM4WDG9|3Kt58=R{R^d*U&lqX(h~5~33pOnfBXw60z?TFW&uPt2Nm znbWOy^e8p~y+7)=9}-kACRVntisO<=p|F1Y@dSl+PLklSH>1huQ6w@KG~|DKBl0=y zYrcp6`aJd-DHQbs^Ni5HE`eLWbNJ6YAi%xkj}jtlQ(`9n^A-sFH&6cg3K_vY6qX}> zOtb>SAB!L$L#f|ruDw^)Q?DG^_Tl8w&Nt?|eEIT3=lvt66^hK;ZU}RIx!IRz#38xA zQzUGXpWgo@T)_4dJx|?Eq4)NDS)oM-t<_+07M}yg5I%BzI=tRte>}>>v@M4JDh*to zrNsJU=`T1gQ18k`KK6&G1p`6m8lXQiVoe`EA6 z=@l`b!D6fQkJ&n9!6h>T8OpH_CJR!FUHWqLC|fzJ96R*%^wI-qJuRo1qYTxz`V1W2 z$(*n`{XSgp34TB=!Nc#3Q-NCA>qQ+BL+i1;W(upV-Q|KYvSzMIH(yoo{cMGWY3KgMa`NH`&&1*#6F>yIaBy>U7U?a|=k1NT(iZ_U<}&d%Ha+y1U%r9`mDz5Fj!M{F#7w&@?O z$002?Sz++N7E#(t@LG7~HMypc{IK!u!0E*5?63>JwiZ6k#4bhZnBsK3eH;90JTuXk z*Gon0yG)i|QHDpZBE>Y>z{ZH*2xpDU+^Y|d*86C0y(Pcyv^m0Zyvw{oz*&Rn^`+dwa64lVj zUA!Jh!{XVtJiWbG7X^pEdr!D)cc$vvYwT^c?06Sxv3V-hf!xQ_yHoo3yL$c18LPf7 zHL6?BZw5V!i=&<`DCpR7F18yIC{mrxX;HbXFtgh(y7;N8gjJ{R`GbB&jEMI5#)r+S z(f9kLZsQ*4fA^4{tV-&rdE~ljWR~us>FSmCqNsq)g{nmbX{_s3Qy3#JgBM!O)wH~( za#ofE&NBIOip7Vig$k4;##cUWB%@v_)h2K2C_0pXx!>IxQ%c?;zEJ*6LHf08^h$Hx z^iWy8kMCJB?jy>TW_4xF&!VDqP7#f^W~orD{g{Y#P>@C&1vsw`canYURvR@n;dM#0 zvpPFP>SY^9LZcAHsds}ew(;?D3w=PJdywd93xj)oMtP3DX#y^^FI-9RWUkLzMc=P& zJsU6XRl35rB694Wthj0isSn%DL@L#7h%)d|pY;KY^j!TCB7=m^DC+e5y1h!HAV;3K zmM7x-J1bvEUX@LE6q%)VXb-)cy{bo(OI-PV``eC@UAUlr|a>iQnw%EUDrySOi(-XJANk3w~xw z-cOjDk6vOpStn_{wSR_U(4Y5LcC;rXLsc>Banl{qUM01lsoWWM?ubrO4YaWC%%)(| zlMn7nF8-cYw=>RT{wbTp$F*I52crGJ(Ee6`T@0tI(A;GzBUAiPk;7VxCLifF@5<6x zZ^Nd?oO$WlkY`k7I$VB>*Lk8<;W3}voUZee$TPiR=Gek*Cizs{R;V|PHEEU zGli%4W(uxn(^j_U4Ni+xZN2j+$?QwcGbpt^|Hk^mlan_sa^JjhotSlDp?Nv0B%&mm zaBPM$9Fm^1f~AcA$(TG7^(i$2A799&m+RCz)9m#2hy&F~i>;g|LoDoYb_WV**Jk?# zC!_XRG3y+DwJW=;(>-nKSGQL3BvtiUeK<{O4KLNSPpBR7W>rnFU5Ii^n4Oq?>3)wx zw$RrvbaP+2+_T`wzkyzAv<~~#?c)~f&$4K-X})sJL(H{0o0Nm1BC=JGu6eUg)-GT- zs3~+s@GY;{1(%MTc-1LP7t!XhYroO@UZ;3Oda{37!}-cHE{o7myoJf{&%iuYvQQET>u4+-Au^^!FqpJAVk&+XE4 z{mJPGb2rzv90OiZC zHzi*HWX#>XG>bCG)OHWb+)arvy5L5F?sE!Q_+A=VE6S6zR=Km>GQS*fXRN!E_j0>* zs|{Sk& z3&Yc^@$VB5OmMxE)hS--rY*9=cg%8NF`nnr#v#SC?iVQ3r`_VqIaWW^t%MNem(Z$8 z?iSy}=JrY?@%mWZkUm&q6WdgngE5{w|K|P?jtf{(Dx+p=Y9hfs{grOn5iSicTv&1+ z_q8bWFRy{ohMXiixMyJZW6X z?5C>PZhvp1u1s*&C%!B6P*y~n>AaYU1-kD{ z;g`@uZ6b^%5iXVm?K7vj`EB}IF*(wyt`lEeSS+L*JU*z?PImZ{obHP*W=#J;>!mc& z9?!9rvk_VK>?@q~eri6;kZ5_a?(He|*I(s3U_+K}b z%K$JO>H>%8B>&_oe(UgQ;is&Qk78ck%v2FEV?H90YZjej_a#A9be_*^1p%@BUySeF zw+|emzGaYxv6#i3ta+I{a4)xwG9f;UV_IcWU5%1N<@F(pMIW>I1D@&q{BWMoW%Q z4u$ViI!VlG#!K}2;Qp-1KKli)+e#-tII308Q*pKE`qsvYZ-~rEx@2A{=Ip8E`Dx2D z^YzF6y+bm&yHeM%N0&EP?z8YbiZ!sfRT@e%{Uq`$;d;urMud+DaY4ilLbcUTou0Yt zPg=Jm2E|KX?U3ht=w=z$c!Fal)*KtV!eM*y;>GCC-w)Lfe%r%%Ee6minmu2)&?T~6wA*x&_1fRtUL2}&{bXI7V6$EC z#N!gpIB(~@@%i@UD39*suJN$FYKf(K9H(Zs%ukB?5t1yJ4dmGD#Mg{I3v$L}=l0uo zCS6RE=J=fO2=mQ!e@|0ip5(uB%+ZMc9G2rIEtQdjKtkGvhseG^51rbFFrAk{t|YgN z%IuAlqCOhSPKuW#r-$7T;dy~yv)s(#_Hm7N@+I!3NJT{6_nZymz(_gXBBzbTk-FBJ zuZoeJ&j=}4KJ~9QKh`y`>x+Ju+gt77+{JU4fqAmyTKUbRo7fzN5?!U=w<(kZeSQ@K 
z=0vzGs6NS3K#Pu55yyFqZR>)@1KJ5@l_Hd<#>51S@+mfPFelI2oa-;R{>FuJWTpRO z;=|3A$)WL`&x^hmb$K(yHB{Tp9QoloCO%rLEitd};t%Q7LWWPM^3$#p%HkBp8Fi#^ z6!uEvI6Xt7SZ}1~!9LWnsA_g;?w6DEs)9iZ1lp!E6Ya}F)`x`yw+2NSsykhka?{=< ziUM}6S(__bA$30zbA#|yg3r2Ml>}>odx}sXdQjB;I!kTkTw#kX;gP9BB-EC5!Q%mW zc)93YZrRoKAqQEp%d)4%pxAI*Tn4y2{+OrsfxEYeTWg$fQFn_3fS~pB?L4VDPutg0 zIkxq7p~lPU30-epRPDBGI`J!B#7ZS$k4pG#ed-+SnKoSM!;nt|tRJuY{{HlO0#eM8 zT$y`H>wW9}Ns9+0{2$Z8e0RSG`#sTDh+mp%VofX#zO^ngc-rIoebz+hcH!kNS-OE! zBF;rc_&pf%@c@;=)#VnR*lX*6>7O{IJ!n6=P`y~PBR*A8QL(=HYB<=2Zm#>WUXghd zKrF_I^)G>R&2{^`4n>lpq6T>t6EQ>|Xq_6*tP31@5x-V)`<^l2do2L55|bH3`Jb0Y zH!`s~PX9|Xpr23?XrjCU0?BtIh|gc~T6&Fd2JZSbZvG?Yq~}#bLrw}-6;9`%yo+8? zn2C_C%-I~ljzc#{lZrD1i?8c`TAmTlw7&k?YoORlgwXdkMfU>Mo$aA|61YuoTTMrz06&~+{YN5m*~ht*DR&jzgM-v46G+;^>eOP${ER^Z{23?1SdhQl>=G5uMY3GAmq+{Gu2!*$F5U7eMV_i0U_X<(ho| z^mN zG@VmB>n^L|fuilfLvbcw;?oA?tIq5I%yv50ZP= zY5UH%Pu?XFku&z3PJdvRTGQD#g?mKE#BbNtdFv4+M<^}IMts#m$;*phid1mexX!UF z8S|yqvh$9$qoi5$F$yW0&5gN%o7jFFKgu{=a?Ot1b3NgjhSCjoZMi!_r7$W?J9Zcy09c$n8^hVq{%sp@MsIiobcdrP93Z)JZ{mL){kl(PAo%Mdm(! z-xgAv(&a8pI~LBNMyO{Q4%K<)?8RA=>DQJCc9Z#W_z5Nez_9Z4TC>PBZ;LT2?0tq? zS^Fw4n%aKdV)FHt==%2$bg@^7$D`!Cy^EuCy&_zTub4F<+r=JcF8MEG5F(7Nm;BIN z;X-D46I&K$ByH<;eUk33&x;N*Z3Wh>_inOH`-F4oS$1Xlx{Q9D zI1@+c+LhcV^2B|2Yi%MKz47V;EeG&hK~+CQijC-iLQ~vS*WOx5e_hG_v^QOm>Q;LO zwd&C$`v(G>9_&P3u|C^)&!hcWj&qP`82gKr@jfVBT zUnu(E`eavt@jOfv`~P4353~QDGRweT_EG*zC@e|B@W{x>EER2S?dXl_iw9tj{y3I| z>W!;kPvGtVoj12Hka3^Dw74F~8U|)&<_wa}&CSrTu&}PR8B_}2&Tj;ro<`0d?X?}!)e$7t#6vm2Dz^9KW)`e~@< z$Bn(6&5WJd{``#_-_KI~cJWdaR)O+sV%!vBLyz$Yr?F?U%@nF)rTaPABc4d+j{3w=A>)ZENobzyaZZvRK%)OjjHN3_fQ zU`fNC(x*;1{*cYft9RFjZutm@5W1#5`Ok`bV8C@9FZEa^0;PBntt(GYMX0qyk{6g< zo9@YM*hJ{xCp4nk*1a#z+#bFq8?YN)`~~1wR1CjObNQ`rrr+K_%{#suW#an`xbw#( zuP`-UW!WBKClTXFIpUT&jBwlZsrFczK+3Y|zT8-9@1@rn${4@>U3QbZhY!Na4o{M^ zb@h+0j}e3~7sjI8y7G`U;q22>{6}}428H+U@AJPngH7-P>GR+d+#~Na9wDI-Y0S>A zwO*g8sr#OB8{;ACf}MRcSX}#*fUWL?%lvN1eNEQEDnk}&(&|?;=j{gh5SH;McNZ|d zJgkY{g-@u3)g!r$^jYY3)h`ao?7MeioR5Ud|J;Fl6rW1%VhNoPq_J4Loo9K@HeK9n z{kcKN>hbi)+?8@}^MhvdUr%D;dUJ=&AM~WY9`E#WD(TnJfcD7P)@;6DO>7{Io35dU z-p2m@(|bDzw?t|%|9R(QzAvRl?J3O}{2zgIzGs0Ocp(XGU9Y?%~YiJ~w!Iid4I6m`D##L^)(Ue=Nep?#7-`>pk(wffC??Uou znCC8+t6ohOe6qAn%L`c=dt*_pG{Zao>tRP$rqY&@$R8_tJk+zdsJH} zbque5Ic*K7!gtfGr)94CN_GAk?GONdT_|j**Lwjxw>bE&oQ@XrDC`B+dG37W)mdjK z>td5ukygk-*Fj)7$R@u=AY-&&#`gOBIMf?qmmeRGK2tx4vjZK3mWs&)o1oy0#C{V$qeG;$w)bspgn=e;r8mhXnc80Iq#{%F6q;o={q2?aJodgm=1J4ZH(vf0EZGdi zFtfe9X7=!e+xsf`*O=*VO{w?8In(y^aqQ%~HKAQBh)a3gtR0pI$zshan^OxOfdHjYUdvYPIDX3jozTRqgj?0G(>EP-7W^HN#Q!~^$wFB>X zi0QClrpchr67zh5xt%T_tbtFXPovn)?@OWF7CRZ}nZLp0DKdGG@qWmSZZDp`H!&iY zbIG&iYH>&TTc>eOTrx)k`mQe>`XnRx^$`7z`h<|Zc;ru#UKZMJYv^laVv{H_TW~|! zBHmRy;M=ON5yx~*>9gfqD)q@>VwXi!R=TAO!boEKR+d**#!3BtNcWDTi$82M5cBX& z>g`X3n8hzCEM^%&n>ELvjN6tpPi2lWSFGLZ0Y5>ITaK%pPv3UBD{sC`cPKM!%bCDl zxiQW51KhC`UEdo0Svu?-CQL2Yb7O&KWm5YR)hKMIT$TSK&S){QhJQVw?(BY~$)4Pq zM~@%d1?zfU;ZJ%RxxBnQ_gO?tFSy?#US34KdyPCdZ37j2;ylBgQnROO0ZCmS{raNc zewC|2uZ2ZT+}cx5b<5258$O?1`=bkW+;W`vp+`(km4V{{p zQ-YgA+qv!V>LA1?0yq(vV{50cE_|39w^qn9MHFlG{3%4t8cw6iUZqw_^Vu?$N1E%vm)lcJwawpYP_MNKxvanx5l%z;%g*1Ij z*k+2_*>j~s-lQAYS6p@}H9vCP^@n416Op-V=>!Ca`{KKa`#YIpE3FsSYsI}%I!bLn zay;`l}EmwRC?>tucL|c2+8H zx%|1W#72h=GOgkTD^%`AkNyU^jp_nNel-Iyk|mHBM%q~iYg#;mO4?ub6|u`7;V0{> z)VCp>rQ;SB7S?KtbUxrkuvt1^YH?HmkJ3u@u5awg);sHRg9z*^(yrtLY0$kv+q$vG zAI7rzQN*20#uX2J)%6A7V>*0krr!bbI0Nm9V@I4^Un~PTXK(IV%97qKpM5mGT)J)- z$Y6BEDr9Y&zNL>|)}n56XGN|@85`tpzNJw8WRi<(*OWodDej8^$C}187zaS0MidM$0HxqKdi&U5)Nl@b5 zag4j~-vN@E`JzMVBGG4(C>*Akv}0EH^d*^$>@S6JUq5CO?8dpaFi?~^t=ESc$EWi? 
z7pWn1=w}T|gpGqwQY@(oOq3#N=gyoXM7=P#dW2Z1F7RLqwLUOD6kUI!Fm<`l;ahNu9f3v#a-^f`E_~%BGP~ zZtS~&@SwtDgU1l!|80e?~eqeyi;lrr{9VDnS#HYyJyIaLAT0H7s)I~i(%>qru>qHUD~gfwxEy+vbERiD&-nXiV+#&D-d&yf*djL38vV^0V|pLC z6EbRa)ne_?o|wl%#pInoD^P%mEa&|8yo*!{^CGHmyd%n|3|>CCKhp>GqKnlx{SqLL z@JTBupNC4o4(_<@auewv_bXK&hCPtLNqMl2Jr2hfuCm7E@s_qKC@@H=ey;53(dXx` z--&Zup2*f}_JA0v|u3Fxok zO6>i(sRu~P<1@(HtOg31dH*>fTxx1+*YY_=ph&kGsr4E5aKwg#Qj3~JS@@n-$HDdH z>P(i#?~>H#Q--|4q8u|eyQMCSECfA9usS!Oah%)Gl9vuTyO1N9g{c^Hz%zKE zLjHiMo@`2v7wOfi5?C%7sH@*-4oyttfvl2nmMY-D3XPoqZb?A<&n;;D|Jlj7l30`doanLQ`nbKb`S960;uN_bE{$rz1*m&4JH7EAD)$?~^R^EoiJ z3|?OOSEB8eu^`a}Ilj~F(kS#{s>b%EWgYz;s6!e-=fW-?gah`A#UAn5h@Ts-=s9K& zRFJx%;auAtb@qdAR3#RYR0UPY+sTKS29F!gkQpTbpP`1wsu+e0l z4%2>nn-&`VhzJz83+$^7Kjo^UdG!D~F4M)t2-vaX)4C^BQPB(Hk~I=wjclMSWH{91 zaLFE=O_Dau36F2{(~HFW8Na_1S_WSQLcL8M{Nc%>Y0woE)(8QUfa9Rx$4rL5Y<>}T zjzmFGL-h~i(vQMf!rI45+WO<|Qfjahce}fLDgN+Kf^tX@^?s-q5`UaXvSScYlZyG) ze(5j{+yk4tK8E4vwcU+I5y}!F3vQ^y30wO;6uyLb`ZlnM+dZoO9bm?YKzPlk|WY)C_|s<#)aGFHJwT08aC zY!7_Nh4?Uyh$e~9C`#i=QM(Ao=XYPgI3X-(&|WL_lvHrS+SRh8u3t|M4-bF-4T>{n zR@TgJXQ7{20lkb3P6B>qD=;}eE^lvdzbWk=jKanuyGG!GIu*_SOW5_(;txRgtnpiP zJ4gWk-*wMYsMe1EV&lPYQuMgG_&-J3jfPl8g*O1FWMBi)X?8NL)lNYHA;QzNq@sGa zq5rLP5*+v)ubh?L5$+(VE!-(Al|*r!D=72{T&NhizY_YB~JG{sub8%dT=6~u2#*Xe>jnpKk9iRys}g*)svyzcCG|8(^j7fk~3Qrz8q{) zHOJ_Vm{sqKXSazJnqY^su=UcD!+SZP{{)uDLxZHOYrB+hd~(v6)RB@?m=Z2VP$+fv z>a(fpqcKpTmg5UGN*?YN@5Ek3;a6^EiT&xV)Zne7#r*sb{eC+O*pQZ~rzSF&|2zne z$U*4o>ph6Rzdk@v433!;o7ntkH{dmL##9V6G=94-ET9w47~0;K9lYSTi$mps5)!gg zRwi$hLwBXhDn@Y|wcdRH`d?a3g&-tQH+n2uv902;;ZWDTFHDFtjH>$K~Rh)%C zlcm-JVaota2CrOLnU0qpnR2ElLcib>bs+z*Y9|8&hf^5U|#f%>UGJRXH+ zB4j!)z?yZgsZqP`P%jrY@*v+3M>@$fQkW4zO>7vi?)(y zf8?)<2jp3LadB_2O8*+m&yR#W@LyRm1^WD<>1$zFIS!Vmzfb1_O)M-);@+29EqSyT zD*r7(A81HLbr*%Ml$1OLT>-&;DeV^)b)t_Mjc{`Tl%XbyD zf1WETeF%m<4egf^!~eG7>WJ#n`c~dw^-{=j2(-lo&R2hXD{>?>0nTOW$IbrbRB2GR z331ICLZrPxS^h0ZG|g{+Z{j=j?ml>)`tXmvuZ#k%Q7WmLp_*uHPUiqeLI`#V;cp&E zT_LL8+nPmqkC+d#bUxwFT$4Hv);mr09K_eV9?Q_uOskU$hnw&IhtYuE-tQ&}u}05n znuHwO+INB{bdtu<3bR(3?$0+7%XC6AKe$e5{qyrT)8M2{?BY=wUUQhlg8-{YpUoG3 zN6X9_uwgWlny&6yq!r!xPb*sfU1rX~XKBgLva^SN{dz+qrn*|JsjY2IuIcO{$h+!5 zLuK_&=>L8<@Lg6EutBHAsQ}siZbi9t*dX?jisOIWy}#5TvO!lm0{-40HrOBqyS%^Z z+zFnBZKq-UPjPs|P5kqkwt%5RcD_n1O)4!2YucQZ%83 zK;V=Ped8nu^(Z_58U9aIBFGz=(9n%UARF8ASW}o;cba^(rg!s!$w$f2Fxb@G5J0SK z1wCA5jSQvl|2sC zwGEDD5U^sQRQo?+QE++nSew6HUKw28Zq@LYyzl3^qK^Qd{(t2RZ@JFdemW<-v-GCz z+ki#l(`|2+pO=)v17HE%0FsQ5U6R5p#L;(n2{7X%!%~~zh#a6Nwt1K$<4x;5b^!Y8 zZLk3!7C}K|`)Rb~>s4AX#xsZ_Yiqz;TR~t zl`1@nsN7*5NLJ#x_|GkpA=8B}EizAMzW7X*mJRU!BS)rK;08JnZoqj~T@crsKFKc$ zQ5y87LHQ-|f?Vrh4RV3?7x8JBppn8GB>}2~?yIT$rfw3fJ!4m|k3A0ggYBdT|xalp;S#wK6Oqyy(5jbLt|NOpaG$YMF~ zYU+JW;Jw)T&-?^;e#CR1t8`YAA9hgqmGPIX)cIpRBuvXIx-KF$YOMhdIn%YS=qmKA!g zXru_?uAE{(=F@N*6(YbyTnO~_IwtGuTbo&H=$sui_LEuk{xCxbNIA?(kVfE+k>D*PxiPTh5$GpQ}GA{8hg!`bmtnB>8gA?@iQJY z6Nv4u_V8M?Qx_?VTD1cm&%5mb-}svBg|Fh-3cZh|c0-Ry&Tm3f|MFtp=W36lYG?^8 z!~EIvgzGhvDax?|q4D!uQ#OLx7mXp)k)sv4D`X}*5_td$RTSF}sH+H%9}5nDa^qm@ zOhfBo_JTFPW<6$8tP3*&Jhf=pBUL?EI5Sqi1$qwz0?evn-|u%u_-EAU6e28K2$q!~3Ze zL%ZXFA&?QX-BM8?L91xQU0rHP^3)UC+GpgFBq z=Ud}k#Ao|SEw6Hh{iLW{&J?E)l*$BokP-g-EYL1l^2P+Fc~Z?dqrnnea(kN{e$7u# zUN8mNWi|%G8WyE`v&GYMXHTKl^r~gVW?s%bn zIMuGR+8)#CQA(ZLz`aMOKg($t9~)~Fo{VFxtbT~uFv!(^*Yv{m{nY-({$8T+Ws3+} z=oh$(0PjIY4}b}$_md`4_Q|C{M3R5YY=g@ySHDI(TH!p|%or@j;86YjezaAG)ve)M zAEJ|P14dJ?^Yxj`Bg%K(WD=j4O!SKcy-Omh{0`YKJ@P*Z_A9xgII0$u_sUB)lwU!G z9SKl3x?U&h@*~!1b%hv^Em$RN{w4Q9p=Iz4N&QRc2L=YZ^GzfvR%eQ8g9TymSd~RJ zfwL`EFbXsS5||!g+CTC*Y&tJ?-UyBu9J1-*iXzJn9>l3$yz~#Ee${ul4kS?H_fOlu 
zu#Sv7C?3hAex=Pe*`H~XFbP`=%JN4=s#;xRLk@KjKmu#HOibv8t*<7ye|_i#x~+Gf zL;#?qq;PZ0x{52?`?0blKK{qmI9{z^L;d)uV-f)cy&8-_pLRZvD4JL*AF)~n2|JU> zpwLf_3r!{6m6^oplr=UszS1Aia8UWCsn&0L0N0pUd*>iqW=rh|PFCs~)^ywkKU)SK zA;!?AKX3lVf+=7RU$W=I&KU@6@^T6PF`a(62y*bJ$=zY_GaZUU!#Ezzo}NR7lz{3r z*-(9idBBS4WYBu8&z)ZcVe@do?6wGL-%Xe`aB%203zhr(h%-qiDPS$tY+w*&&{2*l z*W>4g1t#~F^IcP7LGGTpV~+W=V27ymNGbw*6xs-vXnj_%>VYQ3=s6;Ahpd(EFj8wy z$mk{eoXS(x1g@9{5XT8bDT=_ky`g@0pi9B(wz6qPp(HbqX94`?>MyxRw|M15IJce( zVnX_jz1ni{4A&S!aR4gn0SYN-6|V<=_Tog+!oea4iJ**$sK6Hz6cl_NCj?_`yn0Cr z;`LwHC7>QHO6!6%N^37T1X<$`2ag)L8w)puu)LPUvH0<;zZw82 zvvESHn&H7 z%3$cUbMW0iBBb-gNqTT zy+EkVAR|(71j~`ZDekVnA09}eIKf5uDUo?`KMsYqgC&Nt7eE*XLKk{$@-rokQ2thP z)RS2*mL6gLtIy(imHx|_PGrIt%FO5#%MLpTKy9<2fNX?Hn1Vg!LgGDMMBK0x@6VkL zW2lMIp3_T{!1Mmd3(_P&%t^nAO?;e_Y#y(#7nlz?h|bGA&6L0vJ7W{#*C%8X;z7&D z1W<7SPi%w&0^-=HuF?EzB9U4CFS|nYU*rp3s)vA1P6fICb8})r z{eQcEIa^3L>IbCbC_s5S=IwEM=E|esd&>4z^QFAwAWf=6dg@NIxn+?9?SBPkEZpH^ z5c!PD9g~M)qcfgBC{wR;v#Y8`wCdB-xyCh>h_F&(Yl;yjJ#1Qik33lu|4S^5LYoql z`;sm}=kT(ZE7D-@$yAL36c_E}2bgcYcLgRn5@?v{K-*9V9ldBlhXhXjVmX+C4}Ibp zdO%J5SCL88BSDpzi0JV2!3=9%<;DKx;?TVDX{EdPqykt=(6x$>^+33?=REI`$pOam zmYqqw`}y?;z{p=J2OE-h7^%dMdTI|=f!BVMa+Ce5Fl_P{{Op{&PXjSmo$ zN`_MO(c83f;1@b&>ozhY05a6`;;(`1;T!LD7y>4tZFX7wP36_a{g22@)k>=*IDkL1V>d{v zK&sj{1evdUu=Ss_d5aJjA}vJQ*N!-};iMB(PCsPnIrwZVdfR3W(@7X2nY?S?6dn>F zT-3hEmKk@WV=3ogY{r2*%dZ!j*?tkXzP*ngNOr}}1rbPZ+t(22#hI4>%k$}vO-?GR zsHoiUfAIBZK9Ej&@uG1dr0Z%Rj0{bmJp`kfXg=a{TGXtJcAru^zjg09za0IVI?!2? z&l-;cTu?=ZbC7D`jAdU=^t5muMBfiqZ8z`q4M4})VH zp&TXEZG6pW%Mu#N-9;9vfH;vTB_b0f^pyZ%pQl3QL|U>jqDp_Ydl5N89w|DgB*}C3 zmBdRe=Nbx;J_bi>FpNJGix-lfon?Tp48ekerrrNa$^(KlDS9^X`d8ti92|__`CXyKz4~fX$V~=11lc@Xspop%Uu=vZC>wZxS%&ir{UJ0T2+bRlQ(aDe&!W! zt#&d<*~}}(BMoxO^2BCk4Es){K=!z;D_Z0)C*PjTayumC_tAy`AEaFiNu z0*$1+l*^9~D^9k@n}P5I0U+8eMg~GWJUqE(qhDd!<^Ilv)%qpY(4iAzwm=9!^zj!E z!HoR&B#PPG(Lfe4fdnkV)%7I8P8sfYh?Q+VqMtbUx)^Ecl&dQ8T%LVgMX!It)_FY*8QzUkwR zu17g@TVZ|)k+<_&_nrl`zsCrJTjAVBm2!`dbG6P|QublKfoPR~@C;~c(Efx_Yt)~L zn?YePP^!Q}hTN+*5e zLVfrfSA`H^^#@8(QBk!VBw;TQ!^_Y?5|*kzS`L2N8i&2LhiILlz|yO7|4DDro1@{O z8z#m+Yg*1b$Or-ydBK~8;GS6f1mzOAmG+-2_qvi1w{}BE8GFh@e?v)F1q8-GW0eu$ zA?JfvL+b7bNR46|K(rf#61oukOr2593>{8_#sC*?NqYW&q%2M1atzK~j=zfkTBhW# zJa{EbwoZN|l&?*ot2Q{OX&^}O!ZRN>+Z=pvMk>G5=1PB6Ui7>g?wgc&>RidY$5gRL z*B3qyyw4kd-By?u*;2N_> z4tVt|lO0q>>vBCvaj7c477myzwe{7($cyfwhe*c^)+dMylA3&VQwOdyRbimc6r*cf1|o3wO{w9F-eCjApNI%*(@9|~*=&|+zT%zF%Z zKGAqfRBV6yokSQOH8+7U)Mh@9Wn5w%u|j72_wJ6yz!@x1IbR2SNzZI{QJH`PM(oCQ zL~8^S9?y1myx}oHi7=qS|Mk_}Utcbj#EzYZU!Ul`{E#hp9eZ+OLR()z{EF8j{%ILK z#6$kL*U&Sg)~KACa=?xOY!=WWyTR>{DB;uk3M9M1o79MJ{P7WbS}Yx$QBc?`ey((F zTXUW0(+%T}O3Q+=j8_Q_xBfn3Z<&$H>j0=pRmOs9gUb2hs>C&A0FO@&E{?uNFtzms zg#76q$Ec|AQ@eW)nwj;kSo4mrBvhMudZxGDHIiZBigAp3qJ}{Ms4*~PU!@F>VbZXA z2Wym%=C_Fy=1#cY6Jg$p1<05fjmE)`;eX@$c?n4BUMkm=qX##YMOBVWsCI4zKVMmZcT%%Uat3 z-iq}>oAJA(5)Wd0pfNd?97r1q*o1fH)*NJ-YeZ92RtP!1TxsqO;Njev7eAH25x6cc zDg06Q+&;Z6G_;7|Zft@tmq(}J)$1EP%|(VO^s3!o>3!4b-~5QkzruiaX-39e;mJN5 zOotIUlLK^kyi#e-A~`=~8koLl%$;yLtlPMur%+WIGETt4(L?tEzhW&5k+O~^fJDnH za_pWI3$bI59ZMOl*`Im=vIlzfq7#fa#T zH(@z1?iP$>MnR48JUxh=*tcXDVP9WdOLdKU1 zaG&G0gpR@x4;gY+mF{R!inm{`m&Dd0@)ra=>FIlW5o0HlSG6D#&1V(LXW2>rQVm#s zq&mPLvk7sBUbjT?M5P9au+MlPTRuv7m{v>|m_G-0Uob}#R4{iA8o+RPUvfH*N9e?i z3-hOb|MsEf?L&&}IM$QEMNJTDS`n-!?ya)Ex-)`2W@Djle^24A3A6P`ZB1+R1zB3( zAJ-W(YF_uvwlb!@g$-B2+HJ4}z(4WIcX!Lw_)+x%weQWj1XdqRWEEk52I=edF*zaB z9$43GrZ?+KF7%k3U{^kH33;ws3V*pr(dyA5v2B|kX7SDk>mTy`zYqftl_BiFu7T2Y zpu#z`Y2pz_o*V%m2S>}Wq&H~w6yT|4IffPLm2UGfYQ}N!ZF|Su5#1wrYlaijIF{s) z&pn86po~sEq65wLJZEU`v+Q~r!nahjg*c+-s2}M@?8|_XNyy+zl=^`tT?PGyMD&2T 
z6#&_T=^@vEpwG%=g#_3#s79#@a;vf9Ml(F>_?R20#(O*NtnIoJf($FU#e z`JU+`{NS_F;xga#f^@Z`(i@R)3`cI#vKDOw!nicKwwsIMT@kk~M(rpYmHxAz;rvbr zI?FJPg-1`n#qB{7!I1rA*j&R%Wwfl=%&}onXkX&+2hiOZ&AfG zX-O=fNhXVVk3xc;t@Tls=2~~}sVBN4muV)hkQ>`Qs@OKOEBUYb{7v^J5euPGFkVEA z;RMG*UExDiQ~JyuTY6YMd<0GfjojZl{Gy`OXtC$4;GhBO3^h089FKo)mfmx2;IFyo z3^xF-kof4zi|}c!6JY4eXS=pwYR}+A++V~DW7W99f0A?wD(;cl2z*tWmc;7VYw;bX zGpVW8CVuMcQTs=vCP3asT$Ny+U0ekS;-zViDx+4plSyZIk%Sei?q0Pt^if^oOIpH8 z@lRD+{w^&KIEUvBL2ERok5vufZ?b>D`7%=0(${CWa1=L>fv45np~AWO%JopKG~K5D z;8!~%)>jEYE8_9mnS|ddBAQTpCE~1a`Gm}EQg4j|tJ4m1#TYmZyVw{XwOSN79ieZC zUJ0QOf15Vbrbd0%n8;C*P|>>ru1khaZ6$Vy?`wcqW99Z41dx(ng(A?kDT^|pxRGVQ zhNL8Lr4(pPmic<%ymT(5+ia0_2~bZ4*z(o-N5%0o8;QA>X`P5#2=Tc^#wx+P53?)5 zW^bRCV_O%?Bh8h0{jX+@IMKrzd&t^ch2xOvNyQOnyY`4D$CfLQ2azc+bDrblz2sZc z6qn!N*oh7ZulIzXt>pN)n0yPHk^MPs{J z05)TWsd|m1`5-FwX3W($8s`zu^{@2y3+> z9-lo#)0)7&o(6n5RM~x*6t-goC)oQ-I&g4W_?l4MZErd{oDr83NkrKR7j%dUgI@J3m;y0l2k%4%=>;C3T ztNs`XxSIU)sV~_qye*Hm&&PR)Zd85YsMS3!W)s)|HXxWenZB;%rB!6pA-3k|JX6-$99ZmbNqbPUpfExj~ac}SdD%<&y_IDODfVX{|)#!mAP*F^amVF z1s;YwD=mf%^8Z>SjRZm;+tm{D;V}_OtnXr2t7P_4I(?c6_09%V=n?HtXub(sbov|R z#rgF-?)a8~DbJB1Bh<3hE}U@z8S>sNJ79PjFwuL+r48SPvy8_)`sA#LwMTy!;**Cc zkS3sCMzetdMiLn?qJAV4nZw#VcPa1oC%eA!IkV^;0V;Rb=R5-;0L7-07WXj4t)jJ& zl+hydGp-5hRiIxu3HC6ubG|88{HKwXT%$Ws27PPdmX6zP4#)DHcEmx{Y?qLF{>rr$ z956RSmQp&0QbOwNN=hNin2-YWWHaH1x}Z>=bnN9ghpRIhUiscAk)4U{)M(8>cILRQ znsw&7H;aqc*pT6_<^kCxG$Qt8c<-{|Q#%l;{zS5G&L6AErt)bXY)Wmp`whdmkCAG+ zN#EAE{Z+66C;GLNIj}tL^DD@yoM-sm07q!G8W~JORERftRyQe`I#}qV%{*K)ihP#; z(4Oj}N#iDI(t*Z`u9}eg4+Ifyio4K5XM*(|e2=I>cOM&5(;VGhJKFix|54?h1k&-A zUdRJwj1RwuKehPp#v!ifGb_X3Y72>eMN1oLCF8Uh=Pj=6fKgD z3@(*fzjdzhA-XXfgJi!@o@5NO6gQt^fw3%6|#Jc@Sh-s{uffj ziAf9^q71zR;T+7ca<~X?)lg=RDq%Q~I0e=7b9f%N%E*Z||_!ti(=JZ%Y zkVg+v%R=Y7@Pa4a;^v*jR583cvWxJs@B}G^3H6Y(J3q>FELvD0E;aA|aiW>hVV-{z z5H_2;fxjQQTULi#6qn1oJ4|?3&l)N*-??01sEXcYLq`+>zGq6#1zTW}y7zan6m zLlX0C{%fOPI#Ek}*mbj+__n5^a5aQ3*euE5H84@AVf@eKfh}s zT(ETdp8uSSIUqq0bfWA*gRrwH9dLoit#*}X{pH3|!cK1tj1>_PDG22DM38z?AdXj4 zBI@`uj`e6|P0+s7s0pnk3wB?~F3X9Lz9#R1bHkODb3jS!XDjKKvK~cuid~^sN?Y<= zGeYSSY1+L!mYa`4hGytruV590e|R1LV--5tj`t4}`413ZA~{}{zYC(inroZxNje_i z###j5=Vc*`vd?uplk%1j9Z1CW?elj z(c@NWCA0cg(GReQX_Bysc>}tOy1PjYfut89rXTbhuOdZ_%pc_}z7|$FD-~Ev8a7JbG@HY9?*W^W<@4%DD2Fp%fIZSVSK6NA`u(v??6)j<)#dPuR{xupiXaudVosK+%Q%i1I?1ar{ynM2?XnYtI?XZ2ZJR}>oV@# zYUvM;P{1x`QXwzwoY9d_NGCEVLV_LvM>LnL6}lh18*64akKYoi_U`B-p>R$4+H}6} zQ!Y!wDb#do%JQ3}O3FpqMdH=tA@Vh05fL(|P3~%8Ho?E(1xz$tb>ayJB^YM`b7Xvc zOw#jn%`qqA>e$z|V4^E%g11|mfMH!UaA2&v= z@&ZG20?%aKPdI~6r5->56=)4t`3}6F>EP!AmWd5o!h@vA_P|)Q8(%b%E}6HC3lst^ zNI~i?*B|E++HldFhb{?`VfwDnQ`9j0$Zu1odgrsui@y;Y^HGf)ZK%J%v|0U~d)8>4 zh;g%f#cDYXkwXVO4^9&~!FyD1sgR)l(vZDQXOfo+3QI`J*6fv_(N6G$1#O(WS8A5@ z?6}KLa{7-B#58B+iahw%su9?bbFcJE-u0sHt|4=a=lQBds(tp?hP^ktCSVaIg$PO* zSh+jC1|34N74LlTRHWcxybJOw?Gz+6Ln9(wr)}J{NlHaU<%v-n+a73Pw4YxT@8U=R_n2TaDogWm>``hFMmKB8(wBc~dc+$eFo)|+ zMK~+~Z4cIyQ{aKzpBE9ZmjqK8E2Jz^mLvNm&`IeWH>-2y=`RwEo(u>+Nut(F8BK^3 zQPcDCJ@tV_=F?qhogmEcdX-m;ygQ^7*W@OjbT0r!K`r1K;^O@}?y@iKP$3;gLP|&H z&3w~~x{mAiHXU9ib7_>dY?=ky&{%NW^Q@Z=S`c@G##pEI@s<<@u#;lSnDj-)aERkr z|1RXjm&<;?D^ww2;bxTSk_J&YS@iU;c9x2O=E*>~XS`-j)Cc`#%a)xUIMRY0?8_5L zaa^aBfv9CIH9IFqN?G~w6VaJ770#SFbNz`4{v`I!N1<|LsvrleO=)X$JAUGXr!DV? 
ziEuNem?Ccf%TaVn%Q4YL(qD%OBbwWp*_DWs5rTP;o$Iu8rb|lHe?w*p5cv` z1vng@EK(>K5>6QYZ7Y(e3_w5jp}Ts2|M@KpCumyPBj#~~NpPLQjttfHDJdyR-HLBm z!CLY>#{2iq-2-Pqz&%JkG6ZWG#IHx)JLHDhJXF905HBesb5C}buAk-|^)@);SeL#$ zh}m0*13Y1Hd+U#Mz}l7bOyi!^%WK%jIXA+zB2V>iwqX|d#h_uz^y$+*-~^<|}c`ugV;F2~1Lct}AS!ZgF$*;8Aa zpiHA4V*&+mLBk;nq|H<}ro~@Be*Cx!mf8!z2SK@^n}0dyhoZPFL#xX@;1z0&#h{a6O&8$yG@#nS$)9J_P)~ajQV1?>bWV7 zMb{yFGP1A;-PTb0=EZ+3NO_2D>ct@T0FKuH2?s#RiQpj=d^Hx{44I8G=*2jFK;F zrt9p>MGqcx#D!8gT;PZe93og0kTHT&Fi7brj~<2-!m~di0;??yNjE>6cQ0UhV7Yv_?wKPi11!O+b&pYEyW)d0HTd< zbkW40`Z(3YpUdw{r+vMu*J$9q{`p)~uS8d~!mEdNtbPBYusn53;e|6v5-UO?mD5yW zciUz-e3})tzg>OI4R|2$@u5blTM8+o1Np;b1F)$s=RQEqTndN*Wb|5^4_(0=kY9d2O(KwRQHdAtJGn?)$E z{}S?iFtu7(ykn7FY_;Qh7wLBJ89nGywZPtTFG21_RJF*b3Gs!AQY+7cB8spH`@KpdZwlArmh25d$|c1{`{%|D zJi5Tsj!(OHYL3RY!I^u1cM$hr-M^AD)1ehO#F5kXXrbX#ln<>p?j!(iFI3o$8?{t^ zTjR-wWX(N!zm9e&i{uoGOBx^wz?!>>QcX#>(`U#j+!kK^{WsKzpB3NyAKIaFK|ACy zP7`ZCAoujO)9+?63gIvQeNhKxT+i75#eGe-^Ko>U&WpT5LJ%pbyD?;)lU|Z`)6IWb z7x7j0I26LAD*Od$OGy>U1bp(txpqcIM%hp?Rr?%J-`-H_hz`x0>L=F?Ln`s^{)?H&nu6mzkKgsTDIrIrgERt@Y+hrZ-wX=*JRK+K zRYy%o+*FhBa9>*q*Y#Zj?%ggnYQd@cMqiu$oqIvl~G>{2X!>a1*zcX};>ca3#5 zTFLRbF$}g6_fd3l`@>lnAo-(K`x?%)rVyM66XxGPFLq%65|e01|8G9}h~QUryd-zJ zkn5>fZ8|bB91#b`{d$jlT8!qdWkmjq2+!Ljq7B7DL?qlEJmQ{X{j+|&GlNMM?RQr_ zevTC)PZz&T`FK+iKgci^)C~Au;JkVB1N7WaM0U8QVArNCfH|!Y*a9L~a@CI~Y8r9g$MZVqXi=^$3wk_O5pZgw{Oa_;><)UikRjdZJ@u<{|HzInaMK zzg~XRy^!$!elMc;fm!qI(`3C_7VTWoMdgy^;%!tRAt91c+`J}gq^{KI#6SOpLv0yp zsl9B>d5BYEpdC#)Tm(MR;E<4WKXm%D1jxgb^oLHINMn zy;9l8VA=#f<#okv+xJ(m-mr5#FWXw!(Hx*Lo8qcZ&RSppk`^2X^NzTr+%clQmcb@0 zbu~UEKyzmRzKZn8>pyzt*fH9TkA)oxKtF5H6Ro=8EfK`@;u0@j?>SpoT^)k_DTREw z)`cET2ehuP?YNeFIb5Jsz~c(m{Gy8rCCkL!co3Ct2oTPEWl6e>+zpVe{!kU=H%u4K z6}y5A0KAj%xgPCLZzU;8B8}^dST2va9gq1%T}2{9dHC$9 zR`7_yx)q1fa?q*Qmsiv7qDmJjVRwpFIgU%O!J+%`1JTXB~#V7>V^X_7;Det$}N^tczdEoIiT}qB5lIbIq!4evrbcbdU{;@%zs#M znTG*Af9V(>0B0XQq73>Q1YXhxElM`4>cn^(lw0*k82#Sxp%vK(b~Jc%2KDGg z6mX&x?m3j9O5B3ur%p}Is;h<^A*ouhK}{GA{geb&?LxL z*(MniMgFw{N153Fqd)zOMr5A1Zr!SUh8u$(yv{_h5ViI8?b|K@LR?Dg+YeM$n(mB7 z@jLy52fOP5<1{h!-$tpp)f59Ho5loHS`v|vl(g^EuokiJq+l!3g`3VKB_+S^p6)U> zoN*M4uL9SZwJeY0&^bE7%0x_9QY|4ion2jrCT0;L5F1o($&S~WWts8fW8TAeAJ~r3 z9pz%IuUcjrXB;lC0t&0P1F1WEG>UAX5Iv^cOkrUO?eM4{Ig&5LEp;_BW34tNp=qj{ zPx`jn6y;*omF+{Xcy!?#Op{xLt15A?PvQD!Kl1?c2I7}D?zrv9esY>31EjApBEEFT zA{*!=!~=&vN2u+1U7$yO+?F{-e#>CriyYTx6w-)Shk*6ygfRHLii6>ik+Ow*ZR~51 zl#v3`@AbYA0w!H$5m>tXuIRteX^Bm32zdkT(H@NDb)Ri_cE8V=THj zTnXM=mT?nVZ?aS_E_lNhF28TRF7~J*n8q%C;kpF8;d?PTG^H(?Fyks;8BT{@(gkV9 zjtuMTtmb0#?8I>~LRMwIwidw)0lg%kSg6~b_G}`88a2DwXqV-4DdWIEk!SOXVqlpn zP-#}DH0alos}M=wbZ)Qe#k%a!LV!kb84~pr&y8Hj_8$erDX^KR+KP&!jHAa7LnmI| zbo;28eOmSUiAw4{e*M)e%udgkGe>z}pki8U$I~>|j5nxLqoJJ%ViOnLSD%lD7&q*B zwOv~CS)+!Mk~yrok}vLgd6=AgeR<%7Dr5mv-M8BoCA0OEjBE@^gnn=~YeZ-w0#!bH zy`NZ~HJ>vAyo2xk6y1K)z`E7Xk@gdLR6O{hLb*(BebfSQ9t1ZgwT`1#LuAB{Usu}IP_`ad1}Nmny=K8$ zTntZSeQ#)^hB^V-=#j z4n57OI+y8lD0rQv;6?9#ZRFTq8#l;&37@^Uky}-LO`ncLedzC3DnH*D0qZC@6M_bl{%83S=ux z&pj@M(CHV+^j-j(>Y2$zsMGGxpggizC@XtEphD3*V`r zQr_=?hN4&3KY8+G8F$tW*M3ZK)McJ4Ci^oK2lK}lXbYux3wIRd6;2EtA7(Dybx``v z+!Ro%0qXh+&c@YqQ!5;qUFUnP{)yyNK)aYW*AL9kcJbm=(JB z6CicpbW%fwO-#0ei>DAC5zX2xhfV~Ad~Ba?tAUpHBU4SG9d(Rz)g}Cg1dRo;X2Otw2EQvD_wGOTdLf{`%5-WJn#yV9~XN+ zlc^ua5(ff?2p$;yYHs{wdC;r)`nz7);(T*xu zKBr#(#5!!Iy1TRe2-0^g-xEzD=Jj1QKXP<5@YXY+Rq5FewJv0IZr!;-%zp&x+{*nW zh1z+g{9`Vo_%x(x#ZIpLXCdnysWCyN9oi6`*DAfKwOwK9t(}_$x~LrPse8!Pkt+I& zoPGTOyWgsQeFJSpjIRI7*O%u7JE8xXZ)P=5UV05IET7pMCA%cJaP=L}usi0bRb302 zFC!&&-X{)t{QYIV#AU~kEG~e9WL%S6)p*wVY)d;iTj%Ar4PGsk)2qe5f?QW0u3;7= 
z+>L{S)ZJC4G}|Ryt;ZjNonm{be_HnO5ATjs(AXbZy4FE$OV$WBx~NXqaC&VKO!Ic# zB>-(M59b&c9zL64b(OoMwC~3oSCPNkb-aq@A28xXkDWRd0a1SBmzB6(V$vMlZIrjb zYYOh@dZT;(O!2J(H#XcJ|H*V$-D>MM8@J%PDmX@Ed%NFlEzHiRGY*oCU6kJgv7BU~ zMcz6Hla%oQqW(+luV}6?Fqnc&>g|F8&FT&{m(81vajmm6dZ5));8>m^_m+8mQZg%) zwe$4M*0?wpu~GDce1gU`8&ypj8<#e>6r}^3j}CTMyfcwx;j5b&t0Cy5f} z__1RG{uiI+^n|x;tt!D~w-v8+aOYiUY^)M=DH?&$*jaAexUurqI-Y^=Prqy#1wrKQ zJI9<|D{o9dgl^t$sX-QMPzM#g|L19MY&Vip3}}{%4SCvlU)<5mOM>~-n zstjvCwPeJO*rwg*&7;@d@N~JkHf--=R26gbEFiHq)AC0n`Sp%XzEHKOAi35pd+q5z zYEH2bjrt9LepXZJzrXr$o|g#ruT6@DzCp+=IrzlUuf1x`Z1Ni+$RpMu1ks?Q*Siz% zuEs%rL6VVZ_es!Uo6!0qpUfA1wg(@Ar?5&6G2c{I?clhp5ElhB7y7JQ7u$B(0ne!t zU!r}&5t?YvH^ISd-{Id~P`Rp*UQ5!aj`fGw{|j=&KHg(wIho_*czpc-fBMg%`ae4+ aqvM|L|F+qt!r+I`ChaZB(cL literal 0 HcmV?d00001 diff --git a/docs/boot.html b/docs/boot.html index 343d485..aaeef60 100644 --- a/docs/boot.html +++ b/docs/boot.html @@ -23,7 +23,7 @@ - + diff --git a/docs/ci.html b/docs/ci.html index c30dbab..98f7579 100644 --- a/docs/ci.html +++ b/docs/ci.html @@ -23,7 +23,7 @@ - + diff --git a/docs/condprob.html b/docs/condprob.html index cb9646f..a654765 100644 --- a/docs/condprob.html +++ b/docs/condprob.html @@ -23,7 +23,7 @@ - + diff --git a/docs/crv.html b/docs/crv.html index 9734f8a..2341547 100644 --- a/docs/crv.html +++ b/docs/crv.html @@ -23,7 +23,7 @@ - + diff --git a/docs/distributions-intutition.html b/docs/distributions-intutition.html index 443659b..424bfb4 100644 --- a/docs/distributions-intutition.html +++ b/docs/distributions-intutition.html @@ -23,7 +23,7 @@ - + @@ -519,7 +519,7 @@

    18.1 Discrete distributions scale_x_continuous(breaks = x_vals) # This line ensures integer x-axis labels

    ::: {.solution} -b)

    +b) The expected value of a random variable (the mean) is denoted as \(E[X]\).

    \[E[X] = \frac{1-p}{p}= \frac{1- \frac{1}{6}}{\frac{1}{6}} = \frac{5}{6}\cdot 6 = 5\] On average we will fail 5 times before we roll our first 6.
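This agrees with a quick simulation. A minimal sketch in base R, using `rgeom()` (which counts the number of failures before the first success):

```r
set.seed(1)
# number of failed rolls before the first 6, repeated many times
failures <- rgeom(100000, prob = 1/6)
mean(failures)  # close to the theoretical mean of 5
```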

      @@ -532,6 +532,104 @@

      18.1 Discrete distributions

      18.2 Continuous distributions

      +
      +

      Exercise 18.6 (Uniform intuition 1) The need for randomness is a common problem. A practical solution is provided by so-called random number generators (RNGs). The simplest RNG one can think of chooses from a fixed set of numbers and returns one of them at random, with every number in the set equally likely. If this set is an interval of real numbers, then we’ve essentially described the continuous uniform distribution.

      +

      It has two parameters \(a\) and \(b\), which define the beginning and end of its support respectively.
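For example, base R’s `runif()` is exactly such a generator: it draws values from the interval \([a, b]\) with constant density. A small sketch, using arbitrarily chosen endpoints \(a = 2\) and \(b = 5\):

```r
set.seed(1)
a <- 2
b <- 5
runif(5, min = a, max = b)  # five draws, uniformly spread over [2, 5]
```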

      +
        +
      1. Let’s think of the mean intuitively. The expected value or mean of a distribution is the pivot point on the x-axis which “balances” the graph. Given the parameters \(a\) and \(b\), what is your intuitive guess of the mean of this distribution?
      2. +
      3. A special case of the uniform distribution is the standard uniform distribution, with \(a=0\) and \(b=1\). Write down the pdf for this particular case.
      4. +
      +
      + +
      +
      +

      Solution.

      +
        +
      1. It’s the midpoint between \(a\) and \(b\), so \(\frac{a+b}{2}\).
      2. +
      3. Inserting the parameter values we get:\[f(x) =
+\begin{cases}
+1 & \text{if } 0 \leq x \leq 1 \\
+0 & \text{otherwise}
+\end{cases}
+\]
+Notice how the pdf is just a constant \(1\) across all values of \(x \in [0,1]\). Here it is important to distinguish between probability and probability density. The density here equals \(1\), but it is not a probability: while the pmf of a discrete distribution never exceeds 1, the density of a continuous distribution can be arbitrarily large, as long as it integrates to 1 over its support (see the short R check below).
      4. +
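A short R check of the last point, assuming the standard uniform distribution: the density is constant at 1, every probability still lies between 0 and 1, and a narrower uniform shows the density itself can exceed 1.

```r
dunif(0.3)                       # density of Unif(0, 1) is 1 everywhere on [0, 1]
punif(0.4) - punif(0.1)          # P(0.1 <= X <= 0.4) = 0.3, a genuine probability
integrate(dunif, 0, 1)$value     # the density integrates to 1
dunif(0.05, min = 0, max = 0.1)  # Unif(0, 0.1): the density equals 10
```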
      +
      +
      +
      +

      Exercise 18.7 (Normal intuition 1) a

      +
      +
      +

      Exercise 18.8 (Beta intuition 1) The beta distribution is a continuous distribution defined on the unit interval \([0,1]\). It has two strictly positive parameters \(\alpha\) and \(\beta\), which determine its shape. Its support makes it especially suitable for modelling distributions of percentages and proportions.

      +

      Below you’ve been provided with some code that you can copy into RStudio. Once you run it, an interactive Shiny app will appear, letting you manipulate the graph of the beta distribution.

      +

      Play around with the parameters to get:

      +
        +
      1. A straight line from (0,0) to (1,2)
      2. +
      3. A straight line from (0,2) to (1,0)
      4. +
      5. A symmetric bell curve
      6. +
      7. A bowl-shaped curve
      8. +
      9. The standard uniform distribution is actually a special case of the beta distribution. Find the exact parameters \(\alpha\) and \(\beta\). Once you do, prove the equality by inserting the values into our pdf.
      10. +
      +

      Hint: The beta function is evaluated as \(\text{B}(a,b) = \frac{\Gamma(a)\Gamma(b)}{\Gamma(a+b)}\),
+and the gamma function for positive integers \(n\) is evaluated as \(\Gamma(n)= (n-1)!\).
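As a quick sanity check of the hint, base R’s `beta()`, `gamma()` and `factorial()` functions follow these same definitions (example values \(a = 3\), \(b = 4\)):

```r
a <- 3; b <- 4
beta(a, b)                          # B(3, 4)
gamma(a) * gamma(b) / gamma(a + b)  # the same value via the gamma function
c(gamma(a), factorial(a - 1))       # gamma(n) = (n - 1)! for positive integers
```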

      +
      +
      # Install and load necessary packages
      +install.packages(c("shiny", "ggplot2"))
      +library(shiny)
      +library(ggplot2)
      +
      +# The Shiny App
      +ui <- fluidPage(
      +  titlePanel("Beta Distribution Viewer"),
      +  
      +  sidebarLayout(
      +    sidebarPanel(
      +      sliderInput("alpha", "Alpha:", min = 0.1, max = 10, value = 2, step = 0.1),
      +      sliderInput("beta", "Beta:", min = 0.1, max = 10, value = 2, step = 0.1)
      +    ),
      +    
      +    mainPanel(
      +      plotOutput("betaPlot")
      +    )
      +  )
      +)
      +
      +server <- function(input, output) {
      +  output$betaPlot <- renderPlot({
      +    x <- seq(0, 1, by = 0.01)
      +    y <- dbeta(x, shape1 = input$alpha, shape2 = input$beta)
      +    
      +    ggplot(data.frame(x = x, y = y), aes(x = x, y = y)) +
      +      geom_line() +
      +      labs(x = "Value", y = "Density") +
      +      theme_minimal()
      +  })
      +}
      +
      +shinyApp(ui = ui, server = server)
      +
      +
      +
      +

      Solution.

      +
        +
      1. \(\alpha = 2, \beta=1\)

      2. +
      3. \(\alpha = 1, \beta=2\)

      4. +
      5. Possible solution \(\alpha = \beta= 5\)

      6. +
      7. Possible solution \(\alpha = \beta= 0.5\)

      8. +
      9. The correct parameters are \(\alpha = 1, \beta=1\). To prove the equality, we insert them into the beta pdf (a numerical check in R follows below):
+\[\frac{x^{\alpha - 1} (1 - x)^{\beta - 1}}{\text{B}(\alpha, \beta)} =
+\frac{x^{1 - 1} (1 - x)^{1 - 1}}{\text{B}(1, 1)} =
+ \frac{1}{\frac{\Gamma(1)\Gamma(1)}{\Gamma(1+1)}}= 
+ \frac{1}{\frac{(1-1)!(1-1)!}{(2-1)!}} = 1\]

      10. +
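The claimed equality can also be confirmed numerically, for example by comparing the Beta(1, 1) density to the standard uniform density on a grid (a small sketch in base R):

```r
x <- seq(0, 1, by = 0.01)
all.equal(dbeta(x, shape1 = 1, shape2 = 1),
          dunif(x, min = 0, max = 1))               # TRUE: the densities coincide
# The first answer can be checked the same way: Beta(2, 1) has density 2x
all.equal(dbeta(x, shape1 = 2, shape2 = 1), 2 * x)  # TRUE
```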
      +
      +
      +
      +

      Exercise 18.9 (Gamma intuition 1) a

      +
      +
      +

      Exercise 18.10 (Exponential intuition 1) a

      +
      diff --git a/docs/distributions.html b/docs/distributions.html index d3923e9..5c0ce22 100644 --- a/docs/distributions.html +++ b/docs/distributions.html @@ -23,7 +23,7 @@ - + diff --git a/docs/eb.html b/docs/eb.html index 44adc0e..268dc13 100644 --- a/docs/eb.html +++ b/docs/eb.html @@ -23,7 +23,7 @@ - + diff --git a/docs/ev.html b/docs/ev.html index 5ec2b00..7da2fcf 100644 --- a/docs/ev.html +++ b/docs/ev.html @@ -23,7 +23,7 @@ - + diff --git a/docs/index.html b/docs/index.html index 71bb37d..63ba5a9 100644 --- a/docs/index.html +++ b/docs/index.html @@ -23,7 +23,7 @@ - + @@ -300,7 +300,7 @@

      Preface

      diff --git a/docs/integ.html b/docs/integ.html index a068607..d0232ef 100644 --- a/docs/integ.html +++ b/docs/integ.html @@ -23,7 +23,7 @@ - + diff --git a/docs/introduction.html b/docs/introduction.html index 3ca7b69..b738b9a 100644 --- a/docs/introduction.html +++ b/docs/introduction.html @@ -23,7 +23,7 @@ - + diff --git a/docs/lt.html b/docs/lt.html index 095b208..fa2d250 100644 --- a/docs/lt.html +++ b/docs/lt.html @@ -23,7 +23,7 @@ - + diff --git a/docs/ml.html b/docs/ml.html index 7d2bd1f..0275d52 100644 --- a/docs/ml.html +++ b/docs/ml.html @@ -23,7 +23,7 @@ - + diff --git a/docs/mrv.html b/docs/mrv.html index d2f18d7..c48af5f 100644 --- a/docs/mrv.html +++ b/docs/mrv.html @@ -23,7 +23,7 @@ - + diff --git a/docs/mrvs.html b/docs/mrvs.html index 7aa70c5..e68b63f 100644 --- a/docs/mrvs.html +++ b/docs/mrvs.html @@ -23,7 +23,7 @@ - + diff --git a/docs/nhst.html b/docs/nhst.html index cd8320a..a5ce27f 100644 --- a/docs/nhst.html +++ b/docs/nhst.html @@ -23,7 +23,7 @@ - + diff --git a/docs/reference-keys.txt b/docs/reference-keys.txt index 68bb213..3d75cdf 100644 --- a/docs/reference-keys.txt +++ b/docs/reference-keys.txt @@ -119,6 +119,11 @@ exr:unnamed-chunk-280 exr:unnamed-chunk-283 exr:unnamed-chunk-286 exr:unnamed-chunk-288 +exr:unnamed-chunk-293 +exr:unnamed-chunk-296 +exr:unnamed-chunk-297 +exr:unnamed-chunk-299 +exr:unnamed-chunk-300 introduction measure-and-probability-spaces properties-of-probability-measures diff --git a/docs/references.html b/docs/references.html index 20f280c..01a85c5 100644 --- a/docs/references.html +++ b/docs/references.html @@ -23,7 +23,7 @@ - + diff --git a/docs/rvs.html b/docs/rvs.html index 6286024..875aeda 100644 --- a/docs/rvs.html +++ b/docs/rvs.html @@ -23,7 +23,7 @@ - + diff --git a/docs/search_index.json b/docs/search_index.json index 6ddc581..41e37ec 100644 --- a/docs/search_index.json +++ b/docs/search_index.json @@ -1 +1 @@ -[["index.html", "Principles of Uncertainty – exercises Preface", " Principles of Uncertainty – exercises Gregor Pirš and Erik Štrumbelj 2023-09-25 Preface These are the exercises for the Principles of Uncertainty course of the Data Science Master’s at University of Ljubljana, Faculty of Computer and Information Science. This document will be extended each week as the course progresses. At the end of each exercise session, we will post the solutions to the exercises worked in class and select exercises for homework. Students are also encouraged to solve the remaining exercises to further extend their knowledge. Some exercises require the use of R. Those exercises (or parts of) are coloured blue. Students that are not familiar with R programming language should study A to learn the basics. As the course progresses, we will cover more relevant uses of R for data science. "],["introduction.html", "Chapter 1 Probability spaces 1.1 Measure and probability spaces 1.2 Properties of probability measures 1.3 Discrete probability spaces", " Chapter 1 Probability spaces This chapter deals with measures and probability spaces. At the end of the chapter, we look more closely at discrete probability spaces. The students are expected to acquire the following knowledge: Theoretical Use properties of probability to calculate probabilities. Combinatorics. Understanding of continuity of probability. R Vectors and vector operations. For loop. Estimating probability with simulation. sample function. Matrices and matrix operations. 
.fold-btn { float: right; margin: 5px 5px 0 0; } .fold { border: 1px solid black; min-height: 40px; } 1.1 Measure and probability spaces Exercise 1.1 (Completing a set to a sigma algebra) Let \\(\\Omega = \\{1,2,...,10\\}\\) and let \\(A = \\{\\emptyset, \\{1\\}, \\{2\\}, \\Omega \\}\\). Show that \\(A\\) is not a sigma algebra of \\(\\Omega\\). Find the minimum number of elements to complete A to a sigma algebra of \\(\\Omega\\). Solution. \\(1^c = \\{2,3,...,10\\} \\notin A \\implies\\) \\(A\\) is not sigma algebra. First we need the complements of all elements, so we need to add sets \\(\\{2,3,...,10\\}\\) and \\(\\{1,3,4,...,10\\}\\). Next we need unions of all sets – we add the set \\(\\{1,2\\}\\). Again we need the complement of this set, so we add \\(\\{3,4,...,10\\}\\). So the minimum number of elements we need to add is 4. Exercise 1.2 (Diversity of sigma algebras) Let \\(\\Omega\\) be a set. Find the smallest sigma algebra of \\(\\Omega\\). Find the largest sigma algebra of \\(\\Omega\\). Solution. \\(A = \\{\\emptyset, \\Omega\\}\\) \\(2^{\\Omega}\\) Exercise 1.3 Find all sigma algebras for \\(\\Omega = \\{0, 1, 2\\}\\). Solution. \\(A = \\{\\emptyset, \\Omega\\}\\) \\(A = 2^{\\Omega}\\) \\(A = \\{\\emptyset, \\{0\\}, \\{1,2\\}, \\Omega\\}\\) \\(A = \\{\\emptyset, \\{1\\}, \\{0,2\\}, \\Omega\\}\\) \\(A = \\{\\emptyset, \\{2\\}, \\{0,1\\}, \\Omega\\}\\) Exercise 1.4 (Difference between algebra and sigma algebra) Let \\(\\Omega = \\mathbb{N}\\) and \\(\\mathcal{A} = \\{A \\subseteq \\mathbb{N}: A \\text{ is finite or } A^c \\text{ is finite.} \\}\\). Show that \\(\\mathcal{A}\\) is an algebra but not a sigma algebra. Solution. \\(\\emptyset\\) is finite so \\(\\emptyset \\in \\mathcal{A}\\). Let \\(A \\in \\mathcal{A}\\) and \\(B \\in \\mathcal{A}\\). If both are finite, then their union is also finite and therefore in \\(\\mathcal{A}\\). Let at least one of them not be finite. Then their union is not finite. But \\((A \\cup B)^c = A^c \\cap B^c\\). And since at least one is infinite, then its complement is finite and the intersection is too. So finite unions are in \\(\\mathcal{A}\\). Let us look at numbers \\(2n\\). For any \\(n\\), \\(2n \\in \\mathcal{A}\\) as it is finite. But \\(\\bigcup_{k = 1}^{\\infty} 2n \\notin \\mathcal{A}\\). Exercise 1.5 We define \\(\\sigma(X) = \\cap_{\\lambda \\in I} S_\\lambda\\) to be a sigma algebra, generated by the set \\(X\\), where \\(S_\\lambda\\) are all sigma algebras such that \\(X \\subseteq S_\\lambda\\). \\(S_\\lambda\\) are indexed by \\(\\lambda \\in I\\). Let \\(A, B \\subseteq 2^{\\Omega}\\). Prove that \\(\\sigma(A) = \\sigma(B) \\iff A \\subseteq \\sigma(B) \\land B \\subseteq \\sigma(A)\\). Solution. To prove the equivalence, we need to prove that the left hand side implies the right hand side and vice versa. Proving \\(\\sigma(A) = \\sigma(B) \\Rightarrow A \\subseteq \\sigma(B) \\land B \\subseteq \\sigma(A)\\): we know \\(A \\subseteq \\sigma(A)\\) is always true, so by substituting in \\(\\sigma(B)\\) from the left hand side equality we obtain \\(A \\subseteq \\sigma(B)\\). We obtain \\(B \\subseteq \\sigma(A)\\) by symmetry. This proves the implication. Proving \\(A \\subseteq \\sigma(B) \\land B \\subseteq \\sigma(A) \\Rightarrow \\sigma(A) = \\sigma(B)\\): by definition of a sigma algebra, generated by a set, we have \\(\\sigma(B) = \\cap_{\\lambda \\in I} S_\\lambda\\) where \\(S_\\lambda\\) are all sigma algebras where \\(B \\subseteq S_\\lambda\\). 
But \\(\\sigma(A)\\) is one of \\(S_\\lambda\\), so we can write \\(\\sigma(B) = \\sigma(A) \\cap \\left(\\cap_{\\lambda \\in I} S_\\lambda \\right)\\), which implies \\(\\sigma(B) \\subseteq \\sigma(A)\\). By symmetry, we have \\(\\sigma(A) \\subseteq \\sigma(B)\\). Since \\(\\sigma(A) \\subseteq \\sigma(B)\\) and \\(\\sigma(B) \\subseteq \\sigma(A)\\), we obtain \\(\\sigma(A) = \\sigma(B)\\), which proves the implication and completes the equivalence proof. Exercise 1.6 (Intro to measure) Take the measurable space \\(\\Omega = \\{1,2\\}\\), \\(F = 2^{\\Omega}\\). Which of the following is a measure? Which is a probability measure? \\(\\mu(\\emptyset) = 0\\), \\(\\mu(\\{1\\}) = 5\\), \\(\\mu(\\{2\\}) = 6\\), \\(\\mu(\\{1,2\\}) = 11\\) \\(\\mu(\\emptyset) = 0\\), \\(\\mu(\\{1\\}) = 0\\), \\(\\mu(\\{2\\}) = 0\\), \\(\\mu(\\{1,2\\}) = 1\\) \\(\\mu(\\emptyset) = 0\\), \\(\\mu(\\{1\\}) = 0\\), \\(\\mu(\\{2\\}) = 0\\), \\(\\mu(\\{1,2\\}) = 0\\) \\(\\mu(\\emptyset) = 0\\), \\(\\mu(\\{1\\}) = 0\\), \\(\\mu(\\{2\\}) = 1\\), \\(\\mu(\\{1,2\\}) = 1\\) \\(\\mu(\\emptyset)=0\\), \\(\\mu(\\{1\\})=0\\), \\(\\mu(\\{2\\})=\\infty\\), \\(\\mu(\\{1,2\\})=\\infty\\) Solution. Measure. Not probability measure since \\(\\mu(\\Omega) > 1\\). Neither due to countable additivity. Measure. Not probability measure since \\(\\mu(\\Omega) = 0\\). Probability measure. Measure. Not probability measure since \\(\\mu(\\Omega) > 1\\). Exercise 1.7 Define a probability space that could be used to model the outcome of throwing two fair 6-sided dice. Solution. \\(\\Omega = \\{\\{i,j\\}, i = 1,...,6, j = 1,...,6\\}\\) \\(F = 2^{\\Omega}\\) \\(\\forall \\omega \\in \\Omega\\), \\(P(\\omega) = \\frac{1}{6} \\times \\frac{1}{6} = \\frac{1}{36}\\) 1.2 Properties of probability measures Exercise 1.8 A standard deck (52 cards) is distributed to two persons: 26 cards to each person. All partitions are equally likely. Find the probability that: The first person gets 4 Queens. The first person gets at least 2 Queens. R: Use simulation (sample) to check the above answers. Solution. \\(\\frac{\\binom{48}{22}}{\\binom{52}{26}}\\) 1 - \\(\\frac{\\binom{48}{26} + 4 \\times \\binom{48}{25}}{\\binom{52}{26}}\\) For the simulation, let us represent cards with numbers from 1 to 52, and let 1 through 4 represent Queens. set.seed(1) cards <- 1:52 n <- 10000 q4 <- vector(mode = "logical", length = n) q2 <- vector(mode = "logical", length = n) tmp <- vector(mode = "logical", length = n) for (i in 1:n) { p1 <- sample(1:52, 26) q4[i] <- sum(1:4 %in% p1) == 4 q2[i] <- sum(1:4 %in% p1) >= 2 } sum(q4) / n ## [1] 0.0572 sum(q2) / n ## [1] 0.6894 Exercise 1.9 Let \\(A\\) and \\(B\\) be events with probabilities \\(P(A) = \\frac{2}{3}\\) and \\(P(B) = \\frac{1}{2}\\). Show that \\(\\frac{1}{6} \\leq P(A\\cap B) \\leq \\frac{1}{2}\\), and give examples to show that both extremes are possible. Find corresponding bounds for \\(P(A\\cup B)\\). R: Draw samples from the examples and show the probability bounds of \\(P(A \\cap B)\\) . Solution. From the properties of probability we have \\[\\begin{equation} P(A \\cup B) = P(A) + P(B) - P(A \\cap B) \\leq 1. \\end{equation}\\] From this follows \\[\\begin{align} P(A \\cap B) &\\geq P(A) + P(B) - 1 \\\\ &= \\frac{2}{3} + \\frac{1}{2} - 1 \\\\ &= \\frac{1}{6}, \\end{align}\\] which is the lower bound for the intersection. Conversely, we have \\[\\begin{equation} P(A \\cup B) = P(A) + P(B) - P(A \\cap B) \\geq P(A). 
\\end{equation}\\] From this follows \\[\\begin{align} P(A \\cap B) &\\leq P(B) \\\\ &= \\frac{1}{2}, \\end{align}\\] which is the upper bound for the intersection. For an example take a fair die. To achieve the lower bound let \\(A = \\{3,4,5,6\\}\\) and \\(B = \\{1,2,3\\}\\), then their intersection is \\(A \\cap B = \\{3\\}\\). To achieve the upper bound take \\(A = \\{1,2,3,4\\}\\) and $B = {1,2,3} $. For the bounds of the union we will use the results from the first part. Again from the properties of probability we have \\[\\begin{align} P(A \\cup B) &= P(A) + P(B) - P(A \\cap B) \\\\ &\\geq P(A) + P(B) - \\frac{1}{2} \\\\ &= \\frac{2}{3}. \\end{align}\\] Conversely \\[\\begin{align} P(A \\cup B) &= P(A) + P(B) - P(A \\cap B) \\\\ &\\leq P(A) + P(B) - \\frac{1}{6} \\\\ &= 1. \\end{align}\\] Therefore \\(\\frac{2}{3} \\leq P(A \\cup B) \\leq 1\\). We use sample in R: set.seed(1) n <- 10000 samps <- sample(1:6, n, replace = TRUE) # lower bound lb <- vector(mode = "logical", length = n) A <- c(1,2,3) B <- c(3,4,5,6) for (i in 1:n) { lb[i] <- samps[i] %in% A & samps[i] %in% B } sum(lb) / n ## [1] 0.1605 # upper bound ub <- vector(mode = "logical", length = n) A <- c(1,2,3) B <- c(1,2,3,4) for (i in 1:n) { ub[i] <- samps[i] %in% A & samps[i] %in% B } sum(ub) / n ## [1] 0.4913 Exercise 1.10 A fair coin is tossed repeatedly. Show that, with probability one, a head turns up sooner or later. Show similarly that any given finite sequence of heads and tails occurs eventually with probability one. Solution. \\[\\begin{align} P(\\text{no heads}) &= \\lim_{n \\rightarrow \\infty} P(\\text{no heads in first }n \\text{ tosses}) \\\\ &= \\lim_{n \\rightarrow \\infty} \\frac{1}{2^n} \\\\ &= 0. \\end{align}\\] For the second part, let us fix the given sequence of heads and tails of length \\(k\\) as \\(s\\). A probability that this happens in \\(k\\) tosses is \\(\\frac{1}{2^k}\\). \\[\\begin{align} P(s \\text{ occurs}) &= \\lim_{n \\rightarrow \\infty} P(s \\text{ occurs in first } nk \\text{ tosses}) \\end{align}\\] The right part of the upper equation is greater than if \\(s\\) occurs either in the first \\(k\\) tosses, second \\(k\\) tosses,…, \\(n\\)-th \\(k\\) tosses. Therefore \\[\\begin{align} P(s \\text{ occurs}) &\\geq \\lim_{n \\rightarrow \\infty} P(s \\text{ occurs in first } n \\text{ disjoint sequences of length } k) \\\\ &= \\lim_{n \\rightarrow \\infty} (1 - P(s \\text{ does not occur in first } n \\text{ disjoint sequences})) \\\\ &= 1 - \\lim_{n \\rightarrow \\infty} P(s \\text{ does not occur in first } n \\text{ disjoint sequences}) \\\\ &= 1 - \\lim_{n \\rightarrow \\infty} (1 - \\frac{1}{2^k})^n \\\\ &= 1. \\end{align}\\] Exercise 1.11 An Erdos-Renyi random graph \\(G(n,p)\\) is a model with \\(n\\) nodes, where each pair of nodes is connected with probability \\(p\\). Calculate the probability that there exists a node that is not connected to any other node in \\(G(4,0.6)\\). Show that the upper bound for the probability that there exist 2 nodes that are not connected to any other node for an arbitrary \\(G(n,p)\\) is \\(\\binom{n}{2} (1-p)^{2n - 3}\\). R: Estimate the probability from the first point using simulation. Solution. Let \\(A_i\\) be the event that the \\(i\\)-th node is not connected to any other node. Then our goal is to calculate \\(P(\\cup_{i=1}^n A_i)\\). 
Using the inclusion-exclusion principle, we get \\[\\begin{align} P(\\cup_{i=1}^n A_i) &= \\sum_i A_i - \\sum_{i<j} P(A_i \\cap A_j) + \\sum_{i<j<k} P(A_i \\cap A_j \\cap A_k) - P(A_1 \\cap A_2 \\cap A_3 \\cap A_4) \\\\ &=4 (1 - p)^3 - \\binom{4}{2} (1 - p)^5 + \\binom{4}{3} (1 - p)^6 - (1 - p)^6 \\\\ &\\approx 0.21. \\end{align}\\] Let \\(A_{ij}\\) be the event that nodes \\(i\\) and \\(j\\) are not connected to any other node. We are interested in \\(P(\\cup_{i<j}A_{ij})\\). By using Boole`s inequality, we get \\[\\begin{align} P(\\cup_{i<j}A_{ij}) \\leq \\sum_{i<j} P(A_{ij}). \\end{align}\\] What is the probability of \\(A_{ij}\\)? There need to be no connections to the \\(i\\)-th node to the remaining nodes (excluding \\(j\\)), the same for the \\(j\\)-th node, and there can be no connection between them. Therefore \\[\\begin{align} P(\\cup_{i<j}A_{ij}) &\\leq \\sum_{i<j} (1 - p)^{2(n-2) + 1} \\\\ &= \\binom{n}{2} (1 - p)^{2n - 3}. \\end{align}\\] set.seed(1) n_samp <- 100000 n <- 4 p <- 0.6 conn_samp <- vector(mode = "logical", length = n_samp) for (i in 1:n_samp) { tmp_mat <- matrix(data = 0, nrow = n, ncol = n) samp_conn <- sample(c(0,1), choose(4,2), replace = TRUE, prob = c(1 - p, p)) tmp_mat[lower.tri(tmp_mat)] <- samp_conn tmp_mat[upper.tri(tmp_mat)] <- t(tmp_mat)[upper.tri(t(tmp_mat))] not_conn <- apply(tmp_mat, 1, sum) if (any(not_conn == 0)) { conn_samp[i] <- TRUE } else { conn_samp[i] <- FALSE } } sum(conn_samp) / n_samp ## [1] 0.20565 1.3 Discrete probability spaces Exercise 1.12 Show that the standard measurable space on \\(\\Omega = \\{0,1,...,n\\}\\) equipped with binomial measure is a discrete probability space. Define another probability measure on this measurable space. Show that for \\(n=1\\) the binomial measure is the same as the Bernoulli measure. R: Draw 1000 samples from the binomial distribution \\(p=0.5\\), \\(n=20\\) (rbinom) and compare relative frequencies with theoretical probability measure. Solution. We need to show that the terms of \\(\\sum_{k=0}^n \\binom{n}{k} p^k (1 - p)^{n - k}\\) sum to 1. For that we use the binomial theorem \\(\\sum_{k=0}^n \\binom{n}{k} x^k y^{n-k} = (x + y)^n\\). So \\[\\begin{equation} \\sum_{k=0}^n \\binom{n}{k} p^k (1 - p)^{n - k} = (p + 1 - p)^n = 1. \\end{equation}\\] \\(P(\\{k\\}) = \\frac{1}{n + 1}\\). When \\(n=1\\) then \\(k \\in \\{0,1\\}\\). Inserting \\(n=1\\) into the binomial measure, we get \\(\\binom{1}{k}p^k (1-p)^{1 - k}\\). Now \\(\\binom{1}{1} = \\binom{1}{0} = 1\\), so the measure is \\(p^k (1-p)^{1 - k}\\), which is the Bernoulli measure. set.seed(1) library(ggplot2) library(dplyr) bin_samp <- rbinom(n = 1000, size = 20, prob = 0.5) bin_samp <- data.frame(x = bin_samp) %>% count(x) %>% mutate(n = n / 1000, type = "empirical_frequencies") %>% bind_rows(data.frame(x = 0:20, n = dbinom(0:20, size = 20, prob = 0.5), type = "theoretical_measure")) bin_plot <- ggplot(data = bin_samp, aes(x = x, y = n, fill = type)) + geom_bar(stat="identity", position = "dodge") plot(bin_plot) Exercise 1.13 Show that the standard measurable space on \\(\\Omega = \\{0,1,...,\\infty\\}\\) equipped with geometric measure is a discrete probability space, equipped with Poisson measure is a discrete probability space. Define another probability measure on this measurable space. R: Draw 1000 samples from the Poisson distribution \\(\\lambda = 10\\) (rpois) and compare relative frequencies with theoretical probability measure. Solution. 
\\(\\sum_{k = 0}^{\\infty} p(1 - p)^k = p \\sum_{k = 0}^{\\infty} (1 - p)^k = p \\frac{1}{1 - 1 + p} = 1\\). We used the formula for geometric series. \\(\\sum_{k = 0}^{\\infty} \\frac{\\lambda^k e^{-\\lambda}}{k!} = e^{-\\lambda} \\sum_{k = 0}^{\\infty} \\frac{\\lambda^k}{k!} = e^{-\\lambda} e^{\\lambda} = 1.\\) We used the Taylor expansion of the exponential function. Since we only have to define a probability measure, we could only assign probabilities that sum to one to a finite number of events in \\(\\Omega\\), and probability zero to the other infinite number of events. However to make this solution more educational, we will try to find a measure that assigns a non-zero probability to all events in \\(\\Omega\\). A good start for this would be to find a converging infinite series, as the probabilities will have to sum to one. One simple converging series is the geometric series \\(\\sum_{k=0}^{\\infty} p^k\\) for \\(|p| < 1\\). Let us choose an arbitrary \\(p = 0.5\\). Then \\(\\sum_{k=0}^{\\infty} p^k = \\frac{1}{1 - 0.5} = 2\\). To complete the measure, we have to normalize it, so it sums to one, therefore \\(P(\\{k\\}) = \\frac{0.5^k}{2}\\) is a probability measure on \\(\\Omega\\). We could make it even more difficult by making this measure dependent on some parameter \\(\\alpha\\), but this is out of the scope of this introductory chapter. set.seed(1) pois_samp <- rpois(n = 1000, lambda = 10) pois_samp <- data.frame(x = pois_samp) %>% count(x) %>% mutate(n = n / 1000, type = "empirical_frequencies") %>% bind_rows(data.frame(x = 0:25, n = dpois(0:25, lambda = 10), type = "theoretical_measure")) pois_plot <- ggplot(data = pois_samp, aes(x = x, y = n, fill = type)) + geom_bar(stat="identity", position = "dodge") plot(pois_plot) Exercise 1.14 Define a probability measure on \\((\\Omega = \\mathbb{Z}, 2^{\\mathbb{Z}})\\). Define a probability measure such that \\(P(\\omega) > 0, \\forall \\omega \\in \\Omega\\). R: Implement a random generator that will generate samples with the relative frequency that corresponds to your probability measure. Compare relative frequencies with theoretical probability measure . Solution. \\(P(0) = 1, P(\\omega) = 0, \\forall \\omega \\neq 0\\). \\(P(\\{k\\}) = \\sum_{k = -\\infty}^{\\infty} \\frac{p(1 - p)^{|k|}}{2^{1 - 1_0(k)}}\\), where \\(1_0(k)\\) is the indicator function, which equals to one if \\(k\\) is 0, and equals to zero in every other case. n <- 1000 geom_samps <- rgeom(n, prob = 0.5) sign_samps <- sample(c(FALSE, TRUE), size = n, replace = TRUE) geom_samps[sign_samps] <- -geom_samps[sign_samps] my_pmf <- function (k, p) { indic <- rep(1, length(k)) indic[k == 0] <- 0 return ((p * (1 - p)^(abs(k))) / 2^indic) } geom_samps <- data.frame(x = geom_samps) %>% count(x) %>% mutate(n = n / 1000, type = "empirical_frequencies") %>% bind_rows(data.frame(x = -10:10, n = my_pmf(-10:10, 0.5), type = "theoretical_measure")) geom_plot <- ggplot(data = geom_samps, aes(x = x, y = n, fill = type)) + geom_bar(stat="identity", position = "dodge") plot(geom_plot) Exercise 1.15 Define a probability measure on \\(\\Omega = \\{1,2,3,4,5,6\\}\\) with parameter \\(m \\in \\{1,2,3,4,5,6\\}\\), so that the probability of outcome at distance \\(1\\) from \\(m\\) is half of the probability at distance \\(0\\), at distance \\(2\\) is half of the probability at distance \\(1\\), etc. R: Implement a random generator that will generate samples with the relative frequency that corresponds to your probability measure. 
Compare relative frequencies with theoretical probability measure . Solution. \\(P(\\{k\\}) = \\frac{\\frac{1}{2}^{|m - k|}}{\\sum_{i=1}^6 \\frac{1}{2}^{|m - i|}}\\) n <- 10000 m <- 4 my_pmf <- function (k, m) { denom <- sum(0.5^abs(m - 1:6)) return (0.5^abs(m - k) / denom) } samps <- c() for (i in 1:n) { a <- sample(1:6, 1) a_val <- my_pmf(a, m) prob <- runif(1) if (prob < a_val) { samps <- c(samps, a) } } samps <- data.frame(x = samps) %>% count(x) %>% mutate(n = n / length(samps), type = "empirical_frequencies") %>% bind_rows(data.frame(x = 1:6, n = my_pmf(1:6, m), type = "theoretical_measure")) my_plot <- ggplot(data = samps, aes(x = x, y = n, fill = type)) + geom_bar(stat="identity", position = "dodge") plot(my_plot) "],["uprobspaces.html", "Chapter 2 Uncountable probability spaces 2.1 Borel sets 2.2 Lebesgue measure", " Chapter 2 Uncountable probability spaces This chapter deals with uncountable probability spaces. The students are expected to acquire the following knowledge: Theoretical Understand Borel sets and identify them. Estimate Lebesgue measure for different sets. Know when sets are Borel-measurable. Understanding of countable and uncountable sets. R Uniform sampling. .fold-btn { float: right; margin: 5px 5px 0 0; } .fold { border: 1px solid black; min-height: 40px; } 2.1 Borel sets Exercise 2.1 Prove that the intersection of two sigma algebras on \\(\\Omega\\) is a sigma algebra. Prove that the collection of all open subsets \\((a,b)\\) on \\((0,1]\\) is not a sigma algebra of \\((0,1]\\). Solution. Empty set: \\[\\begin{equation} \\emptyset \\in \\mathcal{A} \\wedge \\emptyset \\in \\mathcal{B} \\Rightarrow \\emptyset \\in \\mathcal{A} \\cap \\mathcal{B} \\end{equation}\\] Complement: \\[\\begin{equation} \\text{Let } A \\in \\mathcal{A} \\cap \\mathcal{B} \\Rightarrow A \\in \\mathcal{A} \\wedge A \\in \\mathcal{B} \\Rightarrow A^c \\in \\mathcal{A} \\wedge A^c \\in \\mathcal{B} \\Rightarrow A^c \\in \\mathcal{A} \\cap \\mathcal{B} \\end{equation}\\] Countable additivity: Let \\(\\{A_i\\}\\) be a countable sequence of subsets in \\(\\mathcal{A} \\cap \\mathcal{B}\\). \\[\\begin{equation} \\forall i: A_i \\in \\mathcal{A} \\cap \\mathcal{B} \\Rightarrow A_i \\in \\mathcal{A} \\wedge A_i \\in \\mathcal{B} \\Rightarrow \\cup A_i \\in \\mathcal{A} \\wedge \\cup A_i \\in \\mathcal{B} \\Rightarrow \\cup A_i \\in \\mathcal{A} \\cap \\mathcal{B} \\end{equation}\\] Let \\(A\\) denote the collection of all open subsets \\((a,b)\\) on \\((0,1]\\). Then \\((0,1) \\in A\\). But \\((0,1)^c = 1 \\notin A\\). Exercise 2.2 Show that \\(\\mathcal{C} = \\sigma(\\mathcal{C})\\) if and only if \\(\\mathcal{C}\\) is a sigma algebra. Solution. “\\(\\Rightarrow\\)” This follows from the definition of a generated sigma algebra. “\\(\\Leftarrow\\)” Let \\(\\mathcal{F} = \\cap_i F_i\\) be the intersection of all sigma algebras that contain \\(\\mathcal{C}\\). Then \\(\\sigma(\\mathcal{C}) = \\mathcal{F}\\). Additionally, \\(\\forall i: \\mathcal{C} \\in F_i\\). So each \\(F_i\\) can be written as \\(F_i = \\mathcal{C} \\cup D\\), where \\(D\\) are the rest of the elements in the sigma algebra. In other words, each sigma algebra in the collection contains at least \\(\\mathcal{C}\\), but can contain other elements. Now for some \\(j\\), \\(F_j = \\mathcal{C}\\) as \\(\\{F_i\\}\\) contains all sigma algebras that contain \\(\\mathcal{C}\\) and \\(\\mathcal{C}\\) is such a sigma algebra. 
Since this is the smallest subset in the intersection it follows that \\(\\sigma(\\mathcal{C}) = \\mathcal{F} = \\mathcal{C}\\). Exercise 2.3 Let \\(\\mathcal{C}\\) and \\(\\mathcal{D}\\) be two collections of subsets on \\(\\Omega\\) such that \\(\\mathcal{C} \\subset \\mathcal{D}\\). Prove that \\(\\sigma(\\mathcal{C}) \\subseteq \\sigma(\\mathcal{D})\\). Solution. \\(\\sigma(\\mathcal{D})\\) is a sigma algebra that contains \\(\\mathcal{D}\\). It follows that \\(\\sigma(\\mathcal{D})\\) is a sigma algebra that contains \\(\\mathcal{C}\\). Let us write \\(\\sigma(\\mathcal{C}) = \\cap_i F_i\\), where \\(\\{F_i\\}\\) is the collection of all sigma algebras that contain \\(\\mathcal{C}\\). Since \\(\\sigma(\\mathcal{D})\\) is such a sigma algebra, there exists an index \\(j\\), so that \\(F_j = \\sigma(\\mathcal{D})\\). Then we can write \\[\\begin{align} \\sigma(\\mathcal{C}) &= (\\cap_{i \\neq j} F_i) \\cap \\sigma(\\mathcal{D}) \\\\ &\\subseteq \\sigma(\\mathcal{D}). \\end{align}\\] Exercise 2.4 Prove that the following subsets of \\((0,1]\\) are Borel-measurable by finding their measure. Any countable set. The set of numbers in (0,1] whose decimal expansion does not contain 7. Solution. This follows directly from the fact that every countable set is a union of singletons, whose measure is 0. Let us first look at numbers which have a 7 as the first decimal numbers. Their measure is 0.1. Then we take all the numbers with a 7 as the second decimal number (excluding those who already have it as the first). These have the measure 0.01, and there are 9 of them, so their total measure is 0.09. We can continue to do so infinitely many times. At each \\(n\\), we have the measure of the intervals which is \\(10^n\\) and the number of those intervals is \\(9^{n-1}\\). Now \\[\\begin{align} \\lambda(A) &= 1 - \\sum_{n = 0}^{\\infty} \\frac{9^n}{10^{n+1}} \\\\ &= 1 - \\frac{1}{10} \\sum_{n = 0}^{\\infty} (\\frac{9}{10})^n \\\\ &= 1 - \\frac{1}{10} \\frac{10}{1} \\\\ &= 0. \\end{align}\\] Since we have shown that the measure of the set is \\(0\\), we have also shown that the set is measurable. Exercise 2.5 Let \\(\\Omega = [0,1]\\), and let \\(\\mathcal{F}_3\\) consist of all countable subsets of \\(\\Omega\\), and all subsets of \\(\\Omega\\) having a countable complement. Show that \\(\\mathcal{F}_3\\) is a sigma algebra. Let us define \\(P(A)=0\\) if \\(A\\) is countable, and \\(P(A) = 1\\) if \\(A\\) has a countable complement. Is \\((\\Omega, \\mathcal{F}_3, P)\\) a legitimate probability space? Solution. The empty set is countable, therefore it is in \\(\\mathcal{F}_3\\). For any \\(A \\in \\mathcal{F}_3\\). If \\(A\\) is countable, then \\(A^c\\) has a countable complement and is in \\(\\mathcal{F}_3\\). If \\(A\\) is uncountable, then it has a countable complement \\(A^c\\) which is therefore also in \\(\\mathcal{F}_3\\). We are left with showing countable additivity. Let \\(\\{A_i\\}\\) be an arbitrary collection of sets in \\(\\mathcal{F}_3\\). We will look at two possibilities. First let all \\(A_i\\) be countable. A countable union of countable sets is countable, and therefore in \\(\\mathcal{F}_3\\). Second, let at least one \\(A_i\\) be uncountable. It follows that it has a countable complement. We can write \\[\\begin{equation} (\\cup_{i=1}^{\\infty} A_i)^c = \\cap_{i=1}^{\\infty} A_i^c. \\end{equation}\\] Since at least one \\(A_i^c\\) on the right side is countable, the whole intersection is countable, and therefore the union has a countable complement. 
It follows that the union is in \\(\\mathcal{F}_3\\). The tuple \\((\\Omega, \\mathcal{F}_3)\\) is a measurable space. Therefore, we only need to check whether \\(P\\) is a probability measure. The measure of the empty set is zero as it is countable. We have to check for countable additivity. Let us look at three situations. Let \\(A_i\\) be disjoint sets. First, let all \\(A_i\\) be countable. \\[\\begin{equation} P(\\cup_{i=1}^{\\infty} A_i) = \\sum_{i=1}^{\\infty}P( A_i)) = 0. \\end{equation}\\] Since the union is countable, the above equation holds. Second, let exactly one \\(A_i\\) be uncountable. W.L.O.G. let that be \\(A_1\\). Then \\[\\begin{equation} P(\\cup_{i=1}^{\\infty} A_i) = 1 + \\sum_{i=2}^{\\infty}P( A_i)) = 1. \\end{equation}\\] Since the union is uncountable, the above equation holds. Third, let at least two \\(A_i\\) be uncountable. We have to check whether it is possible for two uncountable sets in \\(\\mathcal{F}_3\\) to be disjoint. If that is possible, then their measures would sum to more than one and \\(P\\) would not be a probability measure. W.L.O.G. let \\(A_1\\) and \\(A_2\\) be uncountable. Then we have \\[\\begin{equation} A_1 \\cap A_2 = (A_1^c \\cup A_2^c)^c. \\end{equation}\\] Now \\(A_1^c\\) and \\(A_2^c\\) are countable and their union is therefore countable. Let \\(B = A_1^c \\cup A_2^c\\). So the intersection of \\(A_1\\) and \\(A_2\\) equals the complement of \\(B\\), which is countable. For the intersection to be the empty set, \\(B\\) would have to equal to \\(\\Omega\\). But \\(\\Omega\\) is uncountable and therefore \\(B\\) can not equal to \\(\\Omega\\). It follows that two uncountable sets in \\(\\mathcal{F}_3\\) can not have an empty intersection. Therefore the tuple is a legitimate probability space. 2.2 Lebesgue measure Exercise 2.6 Show that the Lebesgue measure of rational numbers on \\([0,1]\\) is 0. R: Implement a random number generator, which generates uniform samples of irrational numbers in \\([0,1]\\) by uniformly sampling from \\([0,1]\\) and rejecting a sample if it is rational. Solution. There are a countable number of rational numbers. Therefore, we can write \\[\\begin{align} \\lambda(\\mathbb{Q}) &= \\lambda(\\cup_{i = 1}^{\\infty} q_i) &\\\\ &= \\sum_{i = 1}^{\\infty} \\lambda(q_i) &\\text{ (countable additivity)} \\\\ &= \\sum_{i = 1}^{\\infty} 0 &\\text{ (Lebesgue measure of a singleton)} \\\\ &= 0. \\end{align}\\] Exercise 2.7 Prove that the Lebesgue measure of \\(\\mathbb{R}\\) is infinity. Paradox. Show that the cardinality of \\(\\mathbb{R}\\) and \\((0,1)\\) is the same, while their Lebesgue measures are infinity and one respectively. Solution. Let \\(a_i\\) be the \\(i\\)-th integer for \\(i \\in \\mathbb{Z}\\). We can write \\(\\mathbb{R} = \\cup_{-\\infty}^{\\infty} (a_i, a_{i + 1}]\\). \\[\\begin{align} \\lambda(\\mathbb{R}) &= \\lambda(\\cup_{i = -\\infty}^{\\infty} (a_i, a_{i + 1}]) \\\\ &= \\lambda(\\lim_{n \\rightarrow \\infty} \\cup_{i = -n}^{n} (a_i, a_{i + 1}]) \\\\ &= \\lim_{n \\rightarrow \\infty} \\lambda(\\cup_{i = -n}^{n} (a_i, a_{i + 1}]) \\\\ &= \\lim_{n \\rightarrow \\infty} \\sum_{i = -n}^{n} \\lambda((a_i, a_{i + 1}]) \\\\ &= \\lim_{n \\rightarrow \\infty} \\sum_{i = -n}^{n} 1 \\\\ &= \\lim_{n \\rightarrow \\infty} 2n \\\\ &= \\infty. \\end{align}\\] We need to find a bijection between \\(\\mathbb{R}\\) and \\((0,1)\\). A well-known function that maps from a bounded interval to \\(\\mathbb{R}\\) is the tangent. 
To make the bijection easier to achieve, we will take the inverse, which maps from \\(\\mathbb{R}\\) to \\((-\\frac{\\pi}{2}, \\frac{\\pi}{2})\\). However, we need to change the function so it maps to \\((0,1)\\). First we add \\(\\frac{\\pi}{2}\\), so that we move the function above zero. Then we only have to divide by the max value, which in this case is \\(\\pi\\). So our bijection is \\[\\begin{equation} f(x) = \\frac{\\tan^{-1}(x) + \\frac{\\pi}{2}}{\\pi}. \\end{equation}\\] Exercise 2.8 Take the measure space \\((\\Omega_1 = (0,1], B_{(0,1]}, \\lambda)\\) (we know that this is a probability space on \\((0,1]\\)). Define a map (function) from \\(\\Omega_1\\) to \\(\\Omega_2 = \\{1,2,3,4,5,6\\}\\) such that the measure space \\((\\Omega_2, 2^{\\Omega_2}, \\lambda(f^{-1}()))\\) will be a discrete probability space with uniform probabilities (\\(P(\\omega) = \\frac{1}{6}, \\forall \\omega \\in \\Omega_2)\\). Is the map that you defined in (a) the only such map? How would you in the same fashion define a map that would result in a probability space that can be interpreted as a coin toss with probability \\(p\\) of heads? R: Use the map in (a) as a basis for a random generator for this fair die. Solution. In other words, we have to assign disjunct intervals of the same size to each element of \\(\\Omega_2\\). Therefore \\[\\begin{equation} f(x) = \\lceil 6x \\rceil. \\end{equation}\\] No, we could for example rearrange the order in which the intervals are mapped to integers. Additionally, we could have several disjoint intervals that mapped to the same integer, as long as the Lebesgue measure of their union would be \\(\\frac{1}{6}\\) and the function would remain injective. We have \\(\\Omega_3 = \\{0,1\\}\\), where zero represents heads and one represents tails. Then \\[\\begin{equation} f(x) = 0^{I_{A}(x)}, \\end{equation}\\] where \\(A = \\{y \\in (0,1] : y < p\\}\\). set.seed(1) unif_s <- runif(1000) die_s <- ceiling(6 * unif_s) summary(as.factor(die_s)) ## 1 2 3 4 5 6 ## 166 154 200 146 166 168 "],["condprob.html", "Chapter 3 Conditional probability 3.1 Calculating conditional probabilities 3.2 Conditional independence 3.3 Monty Hall problem", " Chapter 3 Conditional probability This chapter deals with conditional probability. The students are expected to acquire the following knowledge: Theoretical Identify whether variables are independent. Calculation of conditional probabilities. Understanding of conditional dependence and independence. How to apply Bayes’ theorem to solve difficult probabilistic questions. R Simulating conditional probabilities. cumsum. apply. .fold-btn { float: right; margin: 5px 5px 0 0; } .fold { border: 1px solid black; min-height: 40px; } 3.1 Calculating conditional probabilities Exercise 3.1 A military officer is in charge of identifying enemy aircraft and shooting them down. He is able to positively identify an enemy airplane 95% of the time and positively identify a friendly airplane 90% of the time. Furthermore, 99% of the airplanes are friendly. When the officer identifies an airplane as an enemy airplane, what is the probability that it is not and they will shoot at a friendly airplane? Solution. Let \\(E = 0\\) denote that the observed plane is friendly and \\(E=1\\) that it is an enemy. Let \\(I = 0\\) denote that the officer identified it as friendly and \\(I = 1\\) as enemy. 
Then \\[\\begin{align} P(E = 0 | I = 1) &= \\frac{P(I = 1 | E = 0)P(E = 0)}{P(I = 1)} \\\\ &= \\frac{P(I = 1 | E = 0)P(E = 0)}{P(I = 1 | E = 0)P(E = 0) + P(I = 1 | E = 1)P(E = 1)} \\\\ &= \\frac{0.1 \\times 0.99}{0.1 \\times 0.99 + 0.95 \\times 0.01} \\\\ &= 0.91. \\end{align}\\] Exercise 3.2 R: Consider tossing a fair die. Let \\(A = \\{2,4,6\\}\\) and \\(B = \\{1,2,3,4\\}\\). Then \\(P(A) = \\frac{1}{2}\\), \\(P(B) = \\frac{2}{3}\\) and \\(P(AB) = \\frac{1}{3}\\). Since \\(P(AB) = P(A)P(B)\\), the events \\(A\\) and \\(B\\) are independent. Simulate draws from the sample space and verify that the proportions are the same. Then find two events \\(C\\) and \\(D\\) that are not independent and repeat the simulation. set.seed(1) nsamps <- 10000 tosses <- sample(1:6, nsamps, replace = TRUE) PA <- sum(tosses %in% c(2,4,6)) / nsamps PB <- sum(tosses %in% c(1,2,3,4)) / nsamps PA * PB ## [1] 0.3295095 sum(tosses %in% c(2,4)) / nsamps ## [1] 0.3323 # Let C = {1,2} and D = {2,3} PC <- sum(tosses %in% c(1,2)) / nsamps PD <- sum(tosses %in% c(2,3)) / nsamps PC * PD ## [1] 0.1067492 sum(tosses %in% c(2)) / nsamps ## [1] 0.1622 Exercise 3.3 A machine reports the true value of a thrown 12-sided die 5 out of 6 times. If the machine reports a 1 has been tossed, what is the probability that it is actually a 1? Now let the machine only report whether a 1 has been tossed or not. Does the probability change? R: Use simulation to check your answers to a) and b). Solution. Let \\(T = 1\\) denote that the toss is 1 and \\(M = 1\\) that the machine reports a 1. \\[\\begin{align} P(T = 1 | M = 1) &= \\frac{P(M = 1 | T = 1)P(T = 1)}{P(M = 1)} \\\\ &= \\frac{P(M = 1 | T = 1)P(T = 1)}{\\sum_{k=1}^{12} P(M = 1 | T = k)P(T = k)} \\\\ &= \\frac{\\frac{5}{6}\\frac{1}{12}}{\\frac{5}{6}\\frac{1}{12} + 11 \\frac{1}{6} \\frac{1}{11} \\frac{1}{12}} \\\\ &= \\frac{5}{6}. \\end{align}\\] Yes. \\[\\begin{align} P(T = 1 | M = 1) &= \\frac{P(M = 1 | T = 1)P(T = 1)}{P(M = 1)} \\\\ &= \\frac{P(M = 1 | T = 1)P(T = 1)}{\\sum_{k=1}^{12} P(M = 1 | T = k)P(T = k)} \\\\ &= \\frac{\\frac{5}{6}\\frac{1}{12}}{\\frac{5}{6}\\frac{1}{12} + 11 \\frac{1}{6} \\frac{1}{12}} \\\\ &= \\frac{5}{16}. \\end{align}\\] set.seed(1) nsamps <- 10000 report_a <- vector(mode = "numeric", length = nsamps) report_b <- vector(mode = "logical", length = nsamps) truths <- vector(mode = "logical", length = nsamps) for (i in 1:10000) { toss <- sample(1:12, size = 1) truth <- sample(c(TRUE, FALSE), size = 1, prob = c(5/6, 1/6)) truths[i] <- truth if (truth) { report_a[i] <- toss report_b[i] <- toss == 1 } else { remaining <- (1:12)[1:12 != toss] report_a[i] <- sample(remaining, size = 1) report_b[i] <- toss != 1 } } truth_a1 <- truths[report_a == 1] sum(truth_a1) / length(truth_a1) ## [1] 0.8300733 truth_b1 <- truths[report_b] sum(truth_b1) / length(truth_b1) ## [1] 0.3046209 Exercise 3.4 A coin is tossed independently \\(n\\) times. The probability of heads at each toss is \\(p\\). At each time \\(k\\), \\((k = 2,3,...,n)\\) we get a reward at time \\(k+1\\) if \\(k\\)-th toss was a head and the previous toss was a tail. Let \\(A_k\\) be the event that a reward is obtained at time \\(k\\). Are events \\(A_k\\) and \\(A_{k+1}\\) independent? Are events \\(A_k\\) and \\(A_{k+2}\\) independent? R: simulate 10 tosses 10000 times, where \\(p = 0.7\\). Check your answers to a) and b) by counting the frequencies of the events \\(A_5\\), \\(A_6\\), and \\(A_7\\). Solution. For \\(A_k\\) to happen, we need the tosses \\(k-2\\) and \\(k-1\\) be tails and heads respectively. 
For \\(A_{k+1}\\) to happen, we need tosses \\(k-1\\) and \\(k\\) be tails and heads respectively. As the toss \\(k-1\\) need to be heads for one and tails for the other, these two events can not happen simultaneously. Therefore the probability of their intersection is 0. But the probability of each of them separately is \\(p(1-p) > 0\\). Therefore, they are not independent. For \\(A_k\\) to happen, we need the tosses \\(k-2\\) and \\(k-1\\) be tails and heads respectively. For \\(A_{k+2}\\) to happen, we need tosses \\(k\\) and \\(k+1\\) be tails and heads respectively. So the probability of intersection is \\(p^2(1-p)^2\\). And the probability of each separately is again \\(p(1-p)\\). Therefore, they are independent. set.seed(1) nsamps <- 10000 p <- 0.7 rewardA_5 <- vector(mode = "logical", length = nsamps) rewardA_6 <- vector(mode = "logical", length = nsamps) rewardA_7 <- vector(mode = "logical", length = nsamps) rewardA_56 <- vector(mode = "logical", length = nsamps) rewardA_57 <- vector(mode = "logical", length = nsamps) for (i in 1:nsamps) { samps <- sample(c(0,1), size = 10, replace = TRUE, prob = c(0.7, 0.3)) rewardA_5[i] <- (samps[4] == 0 & samps[3] == 1) rewardA_6[i] <- (samps[5] == 0 & samps[4] == 1) rewardA_7[i] <- (samps[6] == 0 & samps[5] == 1) rewardA_56[i] <- (rewardA_5[i] & rewardA_6[i]) rewardA_57[i] <- (rewardA_5[i] & rewardA_7[i]) } sum(rewardA_5) / nsamps ## [1] 0.2141 sum(rewardA_6) / nsamps ## [1] 0.2122 sum(rewardA_7) / nsamps ## [1] 0.2107 sum(rewardA_56) / nsamps ## [1] 0 sum(rewardA_57) / nsamps ## [1] 0.0454 Exercise 3.5 A drawer contains two coins. One is an unbiased coin, the other is a biased coin, which will turn up heads with probability \\(p\\) and tails with probability \\(1-p\\). One coin is selected uniformly at random. The selected coin is tossed \\(n\\) times. The coin turns up heads \\(k\\) times and tails \\(n-k\\) times. What is the probability that the coin is biased? The selected coin is tossed repeatedly until it turns up heads \\(k\\) times. Given that it is tossed \\(n\\) times in total, what is the probability that the coin is biased? Solution. Let \\(B = 1\\) denote that the coin is biased and let \\(H = k\\) denote that we’ve seen \\(k\\) heads. \\[\\begin{align} P(B = 1 | H = k) &= \\frac{P(H = k | B = 1)P(B = 1)}{P(H = k)} \\\\ &= \\frac{P(H = k | B = 1)P(B = 1)}{P(H = k | B = 1)P(B = 1) + P(H = k | B = 0)P(B = 0)} \\\\ &= \\frac{p^k(1-p)^{n-k} 0.5}{p^k(1-p)^{n-k} 0.5 + 0.5^{n+1}} \\\\ &= \\frac{p^k(1-p)^{n-k}}{p^k(1-p)^{n-k} + 0.5^n}. \\end{align}\\] The same results as in a). The only difference between these two scenarios is that in b) the last throw must be heads. However, this holds for the biased and the unbiased coin and therefore does not affect the probability of the coin being biased. Exercise 3.6 Judy goes around the company for Women’s day and shares flowers. In every office she leaves a flower, if there is at least one woman inside. The probability that there’s a woman in the office is \\(\\frac{3}{5}\\). What is the probability that Judy leaves her first flower in the fourth office? Given that she has given away exactly three flowers in the first four offices, what is the probability that she gives her fourth flower in the eighth office? What is the probability that she leaves the second flower in the fifth office? What is the probability that she leaves the second flower in the fifth office, given that she did not leave the second flower in the second office? 
Judy needs a new supply of flowers immediately after the office, where she gives away her last flower. What is the probability that she visits at least five offices, if she starts with two flowers? R: simulate Judy’s walk 10000 times to check your answers a) - e). Solution. Let \\(X_i = k\\) denote the event that … \\(i\\)-th sample on the \\(k\\)-th run. Since the events are independent, we can multiply their probabilities to get \\[\\begin{equation} P(X_1 = 4) = 0.4^3 \\times 0.6 = 0.0384. \\end{equation}\\] Same as in a) as we have a fresh start after first four offices. For this to be possible, she had to leave the first flower in one of the first four offices. Therefore there are four possibilities, and for each of those the probability is \\(0.4^3 \\times 0.6\\). Additionally, the probability that she leaves a flower in the fifth office is \\(0.6\\). So \\[\\begin{equation} P(X_2 = 5) = \\binom{4}{1} \\times 0.4^3 \\times 0.6^2 = 0.09216. \\end{equation}\\] We use Bayes’ theorem. \\[\\begin{align} P(X_2 = 5 | X_2 \\neq 2) &= \\frac{P(X_2 \\neq 2 | X_2 = 5)P(X_2 = 5)}{P(X_2 \\neq 2)} \\\\ &= \\frac{0.09216}{0.64} \\\\ &= 0.144. \\end{align}\\] The denominator in the second equation can be calculated as follows. One of three things has to happen for the second not to be dealt in the second round. First, both are zero, so \\(0.4^2\\). Second, first is zero, and second is one, so \\(0.4 \\times 0.6\\). Third, the first is one and the second one zero, so \\(0.6 \\times 0.4\\). Summing these values we get \\(0.64\\). We will look at the complement, so the events that she gave away exactly two flowers after two, three and four offices. \\[\\begin{equation} P(X_2 \\geq 5) = 1 - 0.6^2 - 2 \\times 0.4 \\times 0.6^2 - 3 \\times 0.4^2 \\times 0.6^2 = 0.1792. \\end{equation}\\] The multiplying parts represent the possibilities of the first flower. set.seed(1) nsamps <- 100000 Judyswalks <- matrix(data = NA, nrow = nsamps, ncol = 8) for (i in 1:nsamps) { thiswalk <- sample(c(0,1), size = 8, replace = TRUE, prob = c(0.4, 0.6)) Judyswalks[i, ] <- thiswalk } csJudy <- t(apply(Judyswalks, 1, cumsum)) # a sum(csJudy[ ,4] == 1 & csJudy[ ,3] == 0) / nsamps ## [1] 0.03848 # b csJsubset <- csJudy[csJudy[ ,4] == 3 & csJudy[ ,3] == 2, ] sum(csJsubset[ ,8] == 4 & csJsubset[ ,7] == 3) / nrow(csJsubset) ## [1] 0.03665893 # c sum(csJudy[ ,5] == 2 & csJudy[ ,4] == 1) / nsamps ## [1] 0.09117 # d sum(csJudy[ ,5] == 2 & csJudy[ ,4] == 1) / sum(csJudy[ ,2] != 2) ## [1] 0.1422398 # e sum(csJudy[ ,4] < 2) / nsamps ## [1] 0.17818 3.2 Conditional independence Exercise 3.7 Describe: A real-world example of two events \\(A\\) and \\(B\\) that are dependent but become conditionally independent if conditioned on a third event \\(C\\). A real-world example of two events \\(A\\) and \\(B\\) that are independent, but become dependent if conditioned on some third event \\(C\\). Solution. Let \\(A\\) be the height of a person and let \\(B\\) be the person’s knowledge of the Dutch language. These events are dependent since the Dutch are known to be taller than average. However if \\(C\\) is the nationality of the person, then \\(A\\) and \\(B\\) are independent given \\(C\\). Let \\(A\\) be the event that Mary passes the exam and let \\(B\\) be the event that John passes the exam. These events are independent. However, if the event \\(C\\) is that Mary and John studied together, then \\(A\\) and \\(B\\) are conditionally dependent given \\(C\\). Exercise 3.8 We have two coins of identical appearance. 
We know that one is a fair coin and the other flips heads 80% of the time. We choose one of the two coins uniformly at random. We discard the coin that was not chosen. We now flip the chosen coin independently 10 times, producing a sequence \\(Y_1 = y_1\\), \\(Y_2 = y_2\\), …, \\(Y_{10} = y_{10}\\). Intuitively, without doing and computation, are these random variables independent? Compute the probability \\(P(Y_1 = 1)\\). Compute the probabilities \\(P(Y_2 = 1 | Y_1 = 1)\\) and \\(P(Y_{10} = 1 | Y_1 = 1,...,Y_9 = 1)\\). Given your answers to b) and c), would you now change your answer to a)? If so, discuss why your intuition had failed. Solution. \\(P(Y_1 = 1) = 0.5 * 0.8 + 0.5 * 0.5 = 0.65\\). Since we know that \\(Y_1 = 1\\) this should change our view of the probability of the coin being biased or not. Let \\(B = 1\\) denote the event that the coin is biased and let \\(B = 0\\) denote that the coin is unbiased. By using marginal probability, we can write \\[\\begin{align} P(Y_2 = 1 | Y_1 = 1) &= P(Y_2 = 1, B = 1 | Y_1 = 1) + P(Y_2 = 1, B = 0 | Y_1 = 1) \\\\ &= \\sum_{k=1}^2 P(Y_2 = 1 | B = k, Y_1 = 1)P(B = k | Y_1 = 1) \\\\ &= 0.8 \\frac{P(Y_1 = 1 | B = 1)P(B = 1)}{P(Y_1 = 1)} + 0.5 \\frac{P(Y_1 = 1 | B = 0)P(B = 0)}{P(Y_1 = 1)} \\\\ &= 0.8 \\frac{0.8 \\times 0.5}{0.65} + 0.5 \\frac{0.5 \\times 0.5}{0.65} \\\\ &\\approx 0.68. \\end{align}\\] For the other calculation we follow the same procedure. Let \\(X = 1\\) denote that first nine tosses are all heads (equivalent to \\(Y_1 = 1\\),…, \\(Y_9 = 1\\)). \\[\\begin{align} P(Y_{10} = 1 | X = 1) &= P(Y_2 = 1, B = 1 | X = 1) + P(Y_2 = 1, B = 0 | X = 1) \\\\ &= \\sum_{k=1}^2 P(Y_2 = 1 | B = k, X = 1)P(B = k | X = 1) \\\\ &= 0.8 \\frac{P(X = 1 | B = 1)P(B = 1)}{P(X = 1)} + 0.5 \\frac{P(X = 1 | B = 0)P(B = 0)}{P(X = 1)} \\\\ &= 0.8 \\frac{0.8^9 \\times 0.5}{0.5 \\times 0.8^9 + 0.5 \\times 0.5^9} + 0.5 \\frac{0.5^9 \\times 0.5}{0.5 \\times 0.8^9 + 0.5 \\times 0.5^9} \\\\ &\\approx 0.8. \\end{align}\\] 3.3 Monty Hall problem The Monty Hall problem is a famous probability puzzle with non-intuitive outcome. Many established mathematicians and statisticians had problems solving it and many even disregarded the correct solution until they’ve seen the proof by simulation. Here we will show how it can be solved relatively simply with the use of Bayes’ theorem if we select the variables in a smart way. Exercise 3.9 (Monty Hall problem) A prize is placed at random behind one of three doors. You pick a door. Now Monty Hall chooses one of the other two doors, opens it and shows you that it is empty. He then gives you the opportunity to keep your door or switch to the other unopened door. Should you stay or switch? Use Bayes’ theorem to calculate the probability of winning if you switch and if you do not. R: Check your answers in R. Solution. W.L.O.G. assume we always pick the first door. The host can only open door 2 or door 3, as he can not open the door we picked. Let \\(k \\in \\{2,3\\}\\). Let us first look at what happens if we do not change. Then we have \\[\\begin{align} P(\\text{car in 1} | \\text{open $k$}) &= \\frac{P(\\text{open $k$} | \\text{car in 1})P(\\text{car in 1})}{P(\\text{open $k$})} \\\\ &= \\frac{P(\\text{open $k$} | \\text{car in 1})P(\\text{car in 1})}{\\sum_{n=1}^3 P(\\text{open $k$} | \\text{car in $n$})P(\\text{car in $n$)}}. \\end{align}\\] The probability that he opened \\(k\\) if the car is in 1 is \\(\\frac{1}{2}\\), as he can choose between door 2 and 3 as both have a goat behind it. 
Let us look at the normalization constant. When \\(n = 1\\) we get the value in the nominator. When \\(n=k\\), we get 0, as he will not open the door if there’s a prize behind. The remaining option is that we select 1, the car is behind \\(k\\) and he opens the only door left. Since he can’t open 1 due to it being our pick and \\(k\\) due to having the prize, the probability of opening the remaining door is 1, and the prior probability of the car being behind this door is \\(\\frac{1}{3}\\). So we have \\[\\begin{align} P(\\text{car in 1} | \\text{open $k$}) &= \\frac{\\frac{1}{2}\\frac{1}{3}}{\\frac{1}{2}\\frac{1}{3} + \\frac{1}{3}} \\\\ &= \\frac{1}{3}. \\end{align}\\] Now let us look at what happens if we do change. Let \\(k' \\in \\{2,3\\}\\) be the door that is not opened. If we change, we select this door, so we have \\[\\begin{align} P(\\text{car in $k'$} | \\text{open $k$}) &= \\frac{P(\\text{open $k$} | \\text{car in $k'$})P(\\text{car in $k'$})}{P(\\text{open $k$})} \\\\ &= \\frac{P(\\text{open $k$} | \\text{car in $k'$})P(\\text{car in $k'$})}{\\sum_{n=1}^3 P(\\text{open $k$} | \\text{car in $n$})P(\\text{car in $n$)}}. \\end{align}\\] The denominator stays the same, the only thing that is different from before is \\(P(\\text{open $k$} | \\text{car in $k'$})\\). We have a situation where we initially selected door 1 and the car is in door \\(k'\\). The probability that the host will open door \\(k\\) is then 1, as he can not pick any other door. So we have \\[\\begin{align} P(\\text{car in $k'$} | \\text{open $k$}) &= \\frac{\\frac{1}{3}}{\\frac{1}{2}\\frac{1}{3} + \\frac{1}{3}} \\\\ &= \\frac{2}{3}. \\end{align}\\] Therefore it makes sense to change the door. set.seed(1) nsamps <- 1000 ifchange <- vector(mode = "logical", length = nsamps) ifstay <- vector(mode = "logical", length = nsamps) for (i in 1:nsamps) { where_car <- sample(c(1:3), 1) where_player <- sample(c(1:3), 1) open_samp <- (1:3)[where_car != (1:3) & where_player != (1:3)] if (length(open_samp) == 1) { where_open <- open_samp } else { where_open <- sample(open_samp, 1) } ifstay[i] <- where_car == where_player where_ifchange <- (1:3)[where_open != (1:3) & where_player != (1:3)] ifchange[i] <- where_ifchange == where_car } sum(ifstay) / nsamps ## [1] 0.328 sum(ifchange) / nsamps ## [1] 0.672 "],["rvs.html", "Chapter 4 Random variables 4.1 General properties and calculations 4.2 Discrete random variables 4.3 Continuous random variables 4.4 Singular random variables 4.5 Transformations", " Chapter 4 Random variables This chapter deals with random variables and their distributions. The students are expected to acquire the following knowledge: Theoretical Identification of random variables. Convolutions of random variables. Derivation of PDF, PMF, CDF, and quantile function. Definitions and properties of common discrete random variables. Definitions and properties of common continuous random variables. Transforming univariate random variables. R Familiarize with PDF, PMF, CDF, and quantile functions for several distributions. Visual inspection of probability distributions. Analytical and empirical calculation of probabilities based on distributions. New R functions for plotting (for example, facet_wrap). Creating random number generators based on the Uniform distribution. .fold-btn { float: right; margin: 5px 5px 0 0; } .fold { border: 1px solid black; min-height: 40px; } 4.1 General properties and calculations Exercise 4.1 Which of the functions below are valid CDFs? Find their respective densities. 
R: Plot the three functions. \\[\\begin{equation} F(x) = \\begin{cases} 1 - e^{-x^2} & x \\geq 0 \\\\ 0 & x < 0. \\end{cases} \\end{equation}\\] \\[\\begin{equation} F(x) = \\begin{cases} e^{-\\frac{1}{x}} & x > 0 \\\\ 0 & x \\leq 0. \\end{cases} \\end{equation}\\] \\[\\begin{equation} F(x) = \\begin{cases} 0 & x \\leq 0 \\\\ \\frac{1}{3} & 0 < x \\leq \\frac{1}{2} \\\\ 1 & x > \\frac{1}{2}. \\end{cases} \\end{equation}\\] Solution. Yes. First, let us check the limits. \\(\\lim_{x \\rightarrow -\\infty} (0) = 0\\). \\(\\lim_{x \\rightarrow \\infty} (1 - e^{-x^2}) = 1 - \\lim_{x \\rightarrow \\infty} e^{-x^2} = 1 - 0 = 1\\). Second, let us check whether the function is increasing. Let \\(x > y \\geq 0\\). Then \\(1 - e^{-x^2} \\geq 1 - e^{-y^2}\\). We only have to check right continuity for the point zero. \\(F(0) = 0\\) and \\(\\lim_{\\epsilon \\downarrow 0}F (0 + \\epsilon) = \\lim_{\\epsilon \\downarrow 0} 1 - e^{-\\epsilon^2} = 1 - \\lim_{\\epsilon \\downarrow 0} e^{-\\epsilon^2} = 1 - 1 = 0\\). We get the density by differentiating the CDF. \\(p(x) = \\frac{d}{dx} 1 - e^{-x^2} = 2xe^{-x^2}.\\) Students are encouraged to check that this is a proper PDF. Yes. First, let us check the limits. \\(\\lim_{x \\rightarrow -\\infty} (0) = 0\\) and \\(\\lim_{x \\rightarrow \\infty} (e^{-\\frac{1}{x}}) = 1\\). Second, let us check whether the function is increasing. Let \\(x > y \\geq 0\\). Then \\(e^{-\\frac{1}{x}} \\geq e^{-\\frac{1}{y}}\\). We only have to check right continuity for the point zero. \\(F(0) = 0\\) and \\(\\lim_{\\epsilon \\downarrow 0}F (0 + \\epsilon) = \\lim_{\\epsilon \\downarrow 0} e^{-\\frac{1}{\\epsilon}} = 0\\). We get the density by differentiating the CDF. \\(p(x) = \\frac{d}{dx} e^{-\\frac{1}{x}} = \\frac{1}{x^2}e^{-\\frac{1}{x}}.\\) Students are encouraged to check that this is a proper PDF. No. The function is not right continuous as \\(F(\\frac{1}{2}) = \\frac{1}{3}\\), but \\(\\lim_{\\epsilon \\downarrow 0} F(\\frac{1}{2} + \\epsilon) = 1\\). f1 <- function (x) { tmp <- 1 - exp(-x^2) tmp[x < 0] <- 0 return(tmp) } f2 <- function (x) { tmp <- exp(-(1 / x)) tmp[x <= 0] <- 0 return(tmp) } f3 <- function (x) { tmp <- x tmp[x == x] <- 1 tmp[x <= 0.5] <- 1/3 tmp[x <= 0] <- 0 return(tmp) } cdf_data <- tibble(x = seq(-1, 20, by = 0.001), f1 = f1(x), f2 = f2(x), f3 = f3(x)) %>% melt(id.vars = "x") cdf_plot <- ggplot(data = cdf_data, aes(x = x, y = value, color = variable)) + geom_hline(yintercept = 1) + geom_line() plot(cdf_plot) Exercise 4.2 Let \\(X\\) be a random variable with CDF \\[\\begin{equation} F(x) = \\begin{cases} 0 & x < 0 \\\\ \\frac{x^2}{2} & 0 \\leq x < 1 \\\\ \\frac{1}{2} + \\frac{p}{2} & 1 \\leq x < 2 \\\\ \\frac{1}{2} + \\frac{p}{2} + \\frac{1 - p}{2} & x \\geq 2 \\end{cases} \\end{equation}\\] R: Plot this CDF for \\(p = 0.3\\). Is it a discrete, continuous, or mixed random variable? Find the probability density/mass of \\(X\\). f1 <- function (x, p) { tmp <- x tmp[x >= 2] <- 0.5 + (p * 0.5) + ((1-p) * 0.5) tmp[x < 2] <- 0.5 + (p * 0.5) tmp[x < 1] <- (x[x < 1])^2 / 2 tmp[x < 0] <- 0 return(tmp) } cdf_data <- tibble(x = seq(-1, 5, by = 0.001), y = f1(x, 0.3)) cdf_plot <- ggplot(data = cdf_data, aes(x = x, y = y)) + geom_hline(yintercept = 1) + geom_line(color = "blue") plot(cdf_plot) ::: {.solution} \\(X\\) is a mixed random variable. Since \\(X\\) is a mixed random variable, we have to find the PDF of the continuous part and the PMF of the discrete part. We get the continuous part by differentiating the corresponding CDF, \\(\\frac{d}{dx}\\frac{x^2}{2} = x\\).
So the PDF, when \\(0 \\leq x < 1\\), is \\(p(x) = x\\). Let us look at the discrete part now. It has two steps, so this is a discrete distribution with two outcomes – numbers 1 and 2. The first happens with probability \\(\\frac{p}{2}\\), and the second with probability \\(\\frac{1 - p}{2}\\). This reminds us of the Bernoulli distribution. The PMF for the discrete part is \\(P(X = x) = (\\frac{p}{2})^{2 - x} (\\frac{1 - p}{2})^{x - 1}\\). ::: Exercise 4.3 (Convolutions) Convolutions are probability distributions that correspond to sums of independent random variables. Let \\(X\\) and \\(Y\\) be independent discrete variables. Find the PMF of \\(Z = X + Y\\). Hint: Use the law of total probability. Let \\(X\\) and \\(Y\\) be independent continuous variables. Find the PDF of \\(Z = X + Y\\). Hint: Start with the CDF. Solution. \\[\\begin{align} P(Z = z) &= P(X + Y = z) & \\\\ &= \\sum_{k = -\\infty}^\\infty P(X + Y = z | Y = k) P(Y = k) & \\text{ (law of total probability)} \\\\ &= \\sum_{k = -\\infty}^\\infty P(X + k = z | Y = k) P(Y = k) & \\\\ &= \\sum_{k = -\\infty}^\\infty P(X + k = z) P(Y = k) & \\text{ (independence of $X$ and $Y$)} \\\\ &= \\sum_{k = -\\infty}^\\infty P(X = z - k) P(Y = k). & \\end{align}\\] Let \\(f\\) and \\(g\\) be the PDFs of \\(X\\) and \\(Y\\) respectively. \\[\\begin{align} F(z) &= P(Z < z) \\\\ &= P(X + Y < z) \\\\ &= \\int_{-\\infty}^{\\infty} P(X + Y < z | Y = y)P(Y = y)dy \\\\ &= \\int_{-\\infty}^{\\infty} P(X + y < z | Y = y)P(Y = y)dy \\\\ &= \\int_{-\\infty}^{\\infty} P(X + y < z)P(Y = y)dy \\\\ &= \\int_{-\\infty}^{\\infty} P(X < z - y)P(Y = y)dy \\\\ &= \\int_{-\\infty}^{\\infty} (\\int_{-\\infty}^{z - y} f(x) dx) g(y) dy \\end{align}\\] Now \\[\\begin{align} p(z) &= \\frac{d}{dz} F(z) & \\\\ &= \\int_{-\\infty}^{\\infty} (\\frac{d}{dz}\\int_{-\\infty}^{z - y} f(x) dx) g(y) dy & \\\\ &= \\int_{-\\infty}^{\\infty} f(z - y) g(y) dy & \\text{ (fundamental theorem of calculus)}. \\end{align}\\] 4.2 Discrete random variables Exercise 4.4 (Binomial random variable) Let \\(X_k\\), \\(k = 1,...,n\\), be random variables with the Bernoulli measure as the PMF. Let \\(X = \\sum_{k=1}^n X_k\\). We call \\(X_k\\) a Bernoulli random variable with parameter \\(p \\in (0,1)\\). Find the CDF of \\(X_k\\). Find PMF of \\(X\\). This is a Binomial random variable with support in \\(\\{0,1,2,...,n\\}\\) and parameters \\(p \\in (0,1)\\) and \\(n \\in \\mathbb{N}_0\\). We denote \\[\\begin{equation} X | n,p \\sim \\text{binomial}(n,p). \\end{equation}\\] Find CDF of \\(X\\). R: Simulate from the binomial distribution with \\(n = 10\\) and \\(p = 0.5\\), and from \\(n\\) Bernoulli distributions with \\(p = 0.5\\). Visually compare the sum of Bernoullis and the binomial. Hint: there is no standard function like rpois for a Bernoulli random variable. Check exercise 1.12 to find out how to sample from a Bernoulli distribution. Solution. There are two outcomes – zero and one. Zero happens with probability \\(1 - p\\). Therefore \\[\\begin{equation} F(k) = \\begin{cases} 0 & k < 0 \\\\ 1 - p & 0 \\leq k < 1 \\\\ 1 & k \\geq 1. \\end{cases} \\end{equation}\\] For the probability of \\(X\\) to be equal to some \\(k \\leq n\\), exactly \\(k\\) Bernoulli variables need to be one, and the others zero. So \\(p^k(1-p)^{n-k}\\). There are \\(\\binom{n}{k}\\) such possible arrangements. Therefore \\[\\begin{align} P(X = k) = \\binom{n}{k} p^k (1 - p)^{n-k}. 
\\end{align}\\] \\[\\begin{equation} F(k) = \\sum_{i = 0}^{\\lfloor k \\rfloor} \\binom{n}{i} p^i (1 - p)^{n - i} \\end{equation}\\] set.seed(1) nsamps <- 10000 binom_samp <- rbinom(nsamps, size = 10, prob = 0.5) bernoulli_mat <- matrix(data = NA, nrow = nsamps, ncol = 10) for (i in 1:nsamps) { bernoulli_mat[i, ] <- rbinom(10, size = 1, prob = 0.5) } bern_samp <- apply(bernoulli_mat, 1, sum) b_data <- tibble(x = c(binom_samp, bern_samp), type = c(rep("binomial", 10000), rep("Bernoulli_sum", 10000))) b_plot <- ggplot(data = b_data, aes(x = x, fill = type)) + geom_bar(position = "dodge") plot(b_plot) Exercise 4.5 (Geometric random variable) A variable with PMF \\[\\begin{equation} P(k) = p(1-p)^k \\end{equation}\\] is a geometric random variable with support in non-negative integers. It has one parameter \\(p \\in (0,1]\\). We denote \\[\\begin{equation} X | p \\sim \\text{geometric}(p) \\end{equation}\\] Derive the CDF of a geometric random variable. R: Draw 1000 samples from the geometric distribution with \\(p = 0.3\\) and compare their frequencies to theoretical values. Solution. \\[\\begin{align} P(X \\leq k) &= \\sum_{i = 0}^k p(1-p)^i \\\\ &= p \\sum_{i = 0}^k (1-p)^i \\\\ &= p \\frac{1 - (1-p)^{k+1}}{1 - (1 - p)} \\\\ &= 1 - (1-p)^{k + 1} \\end{align}\\] set.seed(1) geo_samp <- rgeom(n = 1000, prob = 0.3) geo_samp <- data.frame(x = geo_samp) %>% count(x) %>% mutate(n = n / 1000, type = "empirical_frequencies") %>% bind_rows(data.frame(x = 0:20, n = dgeom(0:20, prob = 0.3), type = "theoretical_measure")) geo_plot <- ggplot(data = geo_samp, aes(x = x, y = n, fill = type)) + geom_bar(stat="identity", position = "dodge") plot(geo_plot) Exercise 4.6 (Poisson random variable) A variable with PMF \\[\\begin{equation} P(k) = \\frac{\\lambda^k e^{-\\lambda}}{k!} \\end{equation}\\] is a Poisson random variable with support in non-negative integers. It has one positive parameter \\(\\lambda\\), which also represents its mean value and variance (a measure of the deviation of the values from the mean – more on mean and variance in the next chapter). We denote \\[\\begin{equation} X | \\lambda \\sim \\text{Poisson}(\\lambda). \\end{equation}\\] This distribution is usually the default choice for modeling counts. We have already encountered a Poisson random variable in exercise 1.13, where we also sampled from this distribution. The CDF of a Poisson random variable is \\(P(X <= x) = e^{-\\lambda} \\sum_{i=0}^x \\frac{\\lambda^{i}}{i!}\\). R: Draw 1000 samples from the Poisson distribution with \\(\\lambda = 5\\) and compare their empirical cumulative distribution function with the theoretical CDF. set.seed(1) pois_samp <- rpois(n = 1000, lambda = 5) pois_samp <- data.frame(x = pois_samp) pois_plot <- ggplot(data = pois_samp, aes(x = x, colour = "ECDF")) + stat_ecdf(geom = "step") + geom_step(data = tibble(x = 0:17, y = ppois(x, 5)), aes(x = x, y = y, colour = "CDF")) + scale_colour_manual("Lgend title", values = c("black", "red")) plot(pois_plot) Exercise 4.7 (Negative binomial random variable) A variable with PMF \\[\\begin{equation} p(k) = \\binom{k + r - 1}{k}(1-p)^r p^k \\end{equation}\\] is a negative binomial random variable with support in non-negative integers. It has two parameters \\(r > 0\\) and \\(p \\in (0,1)\\). We denote \\[\\begin{equation} X | r,p \\sim \\text{NB}(r,p). \\end{equation}\\] Let us reparameterize the negative binomial distribution with \\(q = 1 - p\\). Find the PMF of \\(X \\sim \\text{NB}(1, q)\\). Do you recognize this distribution? 
Show that the sum of two negative binomial random variables with the same \\(p\\) is also a negative binomial random variable. Hint: Use the fact that the number of ways to place \\(n\\) indistinct balls into \\(k\\) boxes is \\(\\binom{n + k - 1}{n}\\). R: Draw samples from \\(X \\sim \\text{NB}(5, 0.4)\\) and \\(Y \\sim \\text{NB}(3, 0.4)\\). Draw samples from \\(Z = X + Y\\), where you use the parameters calculated in b). Plot both distributions, their sum, and \\(Z\\) using facet_wrap. Be careful, as R uses a different parameterization size=\\(r\\) and prob=\\(1 - p\\). Solution. \\[\\begin{align} P(X = k) &= \\binom{k + 1 - 1}{k}q^1 (1-q)^k \\\\ &= q(1-q)^k. \\end{align}\\] This is the geometric distribution. Let \\(X \\sim \\text{NB}(r_1, p)\\) and \\(Y \\sim \\text{NB}(r_2, p)\\). Let \\(Z = X + Y\\). \\[\\begin{align} P(Z = z) &= \\sum_{k = 0}^{\\infty} P(X = z - k)P(Y = k), \\text{ if k < 0, then the probabilities are 0} \\\\ &= \\sum_{k = 0}^{z} P(X = z - k)P(Y = k), \\text{ if k > z, then the probabilities are 0} \\\\ &= \\sum_{k = 0}^{z} \\binom{z - k + r_1 - 1}{z - k}(1 - p)^{r_1} p^{z - k} \\binom{k + r_2 - 1}{k}(1 - p)^{r_2} p^{k} & \\\\ &= \\sum_{k = 0}^{z} \\binom{z - k + r_1 - 1}{z - k} \\binom{k + r_2 - 1}{k}(1 - p)^{r_1 + r_2} p^{z} & \\\\ &= (1 - p)^{r_1 + r_2} p^{z} \\sum_{k = 0}^{z} \\binom{z - k + r_1 - 1}{z - k} \\binom{k + r_2 - 1}{k}& \\end{align}\\] The part before the sum reminds us of the negative binomial distribution with parameters \\(r_1 + r_2\\) and \\(p\\). To complete this term to the negative binomial PMF we need \\(\\binom{z + r_1 + r_2 -1}{z}\\). So the only thing we need to prove is that the sum equals this term. Both terms in the sum can be interpreted as numbers of ways to place a number of balls into boxes. For the left term it is \\(z-k\\) balls into \\(r_1\\) boxes, and for the right \\(k\\) balls into \\(r_2\\) boxes. For each \\(k\\) we are distributing \\(z\\) balls in total. By summing over all \\(k\\), we actually get all the possible placements of \\(z\\) balls into \\(r_1 + r_2\\) boxes. Therefore \\[\\begin{align} P(Z = z) &= (1 - p)^{r_1 + r_2} p^{z} \\sum_{k = 0}^{z} \\binom{z - k + r_1 - 1}{z - k} \\binom{k + r_2 - 1}{k}& \\\\ &= \\binom{z + r_1 + r_2 -1}{z} (1 - p)^{r_1 + r_2} p^{z}. \\end{align}\\] From this it also follows that the sum of geometric distributions with the same parameter is a negative binomial distribution. \\(Z \\sim \\text{NB}(8, 0.4)\\). set.seed(1) nsamps <- 10000 x <- rnbinom(nsamps, size = 5, prob = 0.6) y <- rnbinom(nsamps, size = 3, prob = 0.6) xpy <- x + y z <- rnbinom(nsamps, size = 8, prob = 0.6) samps <- tibble(x, y, xpy, z) samps <- melt(samps) ggplot(data = samps, aes(x = value)) + geom_bar() + facet_wrap(~ variable) 4.3 Continuous random variables Exercise 4.8 (Exponential random variable) A variable \\(X\\) with PDF \\(\\lambda e^{-\\lambda x}\\) is an exponential random variable with support in non-negative real numbers. It has one positive parameter \\(\\lambda\\). We denote \\[\\begin{equation} X | \\lambda \\sim \\text{Exp}(\\lambda). \\end{equation}\\] Find the CDF of an exponential random variable. Find the quantile function of an exponential random variable. Calculate the probability \\(P(1 \\leq X \\leq 3)\\), where \\(X \\sim \\text{Exp(1.5)}\\). R: Check your answer to c) with a simulation (rexp). Plot the probability in a meaningful way. R: Implement PDF, CDF, and the quantile function and compare their values with corresponding R functions visually. 
Hint: use the size parameter to make one of the curves wider. Solution. \\[\\begin{align} F(x) &= \\int_{0}^{x} \\lambda e^{-\\lambda t} dt \\\\ &= \\lambda \\int_{0}^{x} e^{-\\lambda t} dt \\\\ &= \\lambda (\\frac{1}{-\\lambda}e^{-\\lambda t} |_{0}^{x}) \\\\ &= \\lambda(\\frac{1}{\\lambda} - \\frac{1}{\\lambda} e^{-\\lambda x}) \\\\ &= 1 - e^{-\\lambda x}. \\end{align}\\] \\[\\begin{align} F(F^{-1}(x)) &= x \\\\ 1 - e^{-\\lambda F^{-1}(x)} &= x \\\\ e^{-\\lambda F^{-1}(x)} &= 1 - x \\\\ -\\lambda F^{-1}(x) &= \\ln(1 - x) \\\\ F^{-1}(x) &= - \\frac{ln(1 - x)}{\\lambda}. \\end{align}\\] \\[\\begin{align} P(1 \\leq X \\leq 3) &= P(X \\leq 3) - P(X \\leq 1) \\\\ &= P(X \\leq 3) - P(X \\leq 1) \\\\ &= 1 - e^{-1.5 \\times 3} - 1 + e^{-1.5 \\times 1} \\\\ &\\approx 0.212. \\end{align}\\] set.seed(1) nsamps <- 1000 samps <- rexp(nsamps, rate = 1.5) sum(samps >= 1 & samps <= 3) / nsamps ## [1] 0.212 exp_plot <- ggplot(data.frame(x = seq(0, 5, by = 0.01)), aes(x = x)) + stat_function(fun = dexp, args = list(rate = 1.5)) + stat_function(fun = dexp, args = list(rate = 1.5), xlim = c(1,3), geom = "area", fill = "red") plot(exp_plot) exp_pdf <- function(x, lambda) { return (lambda * exp(-lambda * x)) } exp_cdf <- function(x, lambda) { return (1 - exp(-lambda * x)) } exp_quant <- function(q, lambda) { return (-(log(1 - q) / lambda)) } ggplot(data = data.frame(x = seq(0, 5, by = 0.01)), aes(x = x)) + stat_function(fun = dexp, args = list(rate = 1.5), aes(color = "R"), size = 2.5) + stat_function(fun = exp_pdf, args = list(lambda = 1.5), aes(color = "Mine"), size = 1.2) + scale_color_manual(values = c("red", "black")) ggplot(data = data.frame(x = seq(0, 5, by = 0.01)), aes(x = x)) + stat_function(fun = pexp, args = list(rate = 1.5), aes(color = "R"), size = 2.5) + stat_function(fun = exp_cdf, args = list(lambda = 1.5), aes(color = "Mine"), size = 1.2) + scale_color_manual(values = c("red", "black")) ggplot(data = data.frame(x = seq(0, 1, by = 0.01)), aes(x = x)) + stat_function(fun = qexp, args = list(rate = 1.5), aes(color = "R"), size = 2.5) + stat_function(fun = exp_quant, args = list(lambda = 1.5), aes(color = "Mine"), size = 1.2) + scale_color_manual(values = c("red", "black")) Exercise 4.9 (Uniform random variable) Continuous uniform random variable with parameters \\(a\\) and \\(b\\) has the PDF \\[\\begin{equation} p(x) = \\begin{cases} \\frac{1}{b - a} & x \\in [a,b] \\\\ 0 & \\text{otherwise}. \\end{cases} \\end{equation}\\] Find the CDF of the uniform random variable. Find the quantile function of the uniform random variable. Let \\(X \\sim \\text{Uniform}(a,b)\\). Find the CDF of the variable \\(Y = \\frac{X - a}{b - a}\\). This is the standard uniform random variable. Let \\(X \\sim \\text{Uniform}(-1, 3)\\). Find such \\(z\\) that \\(P(X < z + \\mu_x) = \\frac{1}{5}\\). R: Check your result from d) using simulation. Solution. \\[\\begin{align} F(x) &= \\int_{a}^x \\frac{1}{b - a} dt \\\\ &= \\frac{1}{b - a} \\int_{a}^x dt \\\\ &= \\frac{x - a}{b - a}. \\end{align}\\] \\[\\begin{align} F(F^{-1}(p)) &= p \\\\ \\frac{F^{-1}(p) - a}{b - a} &= p \\\\ F^{-1}(p) &= p(b - a) + a. \\end{align}\\] \\[\\begin{align} F_Y(y) &= P(Y < y) \\\\ &= P(\\frac{X - a}{b - a} < y) \\\\ &= P(X < y(b - a) + a) \\\\ &= F_X(y(b - a) + a) \\\\ &= \\frac{(y(b - a) + a) - a}{b - a} \\\\ &= y. \\end{align}\\] \\[\\begin{align} P(X < z + 1) &= \\frac{1}{5} \\\\ F(z + 1) &= \\frac{1}{5} \\\\ z + 1 &= F^{-1}(\\frac{1}{5}) \\\\ z &= \\frac{1}{5}4 - 1 - 1 \\\\ z &= -1.2. 
\\end{align}\\] set.seed(1) a <- -1 b <- 3 nsamps <- 10000 unif_samp <- runif(nsamps, a, b) mu_x <- mean(unif_samp) new_samp <- unif_samp - mu_x quantile(new_samp, probs = 1/5) ## 20% ## -1.203192 punif(-0.2, -1, 3) ## [1] 0.2 Exercise 4.10 (Beta random variable) A variable \\(X\\) with PDF \\[\\begin{equation} p(x) = \\frac{x^{\\alpha - 1} (1 - x)^{\\beta - 1}}{\\text{B}(\\alpha, \\beta)}, \\end{equation}\\] where \\(\\text{B}(\\alpha, \\beta) = \\frac{\\Gamma(\\alpha) \\Gamma(\\beta)}{\\Gamma(\\alpha + \\beta)}\\) and \\(\\Gamma(x) = \\int_0^{\\infty} x^{z - 1} e^{-x} dx\\) is a Beta random variable with support on \\([0,1]\\). It has two positive parameters \\(\\alpha\\) and \\(\\beta\\). Notation: \\[\\begin{equation} X | \\alpha, \\beta \\sim \\text{Beta}(\\alpha, \\beta) \\end{equation}\\] It is often used in modeling rates. Calculate the PDF for \\(\\alpha = 1\\) and \\(\\beta = 1\\). What do you notice? R: Plot densities of the beta distribution for parameter pairs (2, 2), (4, 1), (1, 4), (2, 5), and (0.1, 0.1). R: Sample from \\(X \\sim \\text{Beta}(2, 5)\\) and compare the histogram with Beta PDF. Solution. \\[\\begin{equation} p(x) = \\frac{x^{1 - 1} (1 - x)^{1 - 1}}{\\text{B}(1, 1)} = 1. \\end{equation}\\] This is the standard uniform distribution. set.seed(1) ggplot(data = data.frame(x = seq(0, 1, by = 0.01)), aes(x = x)) + stat_function(fun = dbeta, args = list(shape1 = 2, shape2 = 2), aes(color = "alpha = 0.5")) + stat_function(fun = dbeta, args = list(shape1 = 4, shape2 = 1), aes(color = "alpha = 4")) + stat_function(fun = dbeta, args = list(shape1 = 1, shape2 = 4), aes(color = "alpha = 1")) + stat_function(fun = dbeta, args = list(shape1 = 2, shape2 = 5), aes(color = "alpha = 25")) + stat_function(fun = dbeta, args = list(shape1 = 0.1, shape2 = 0.1), aes(color = "alpha = 0.1")) set.seed(1) nsamps <- 1000 samps <- rbeta(nsamps, 2, 5) ggplot(data = data.frame(x = samps), aes(x = x)) + geom_histogram(aes(y = ..density..), color = "black") + stat_function(data = data.frame(x = seq(0, 1, by = 0.01)), aes(x = x), fun = dbeta, args = list(shape1 = 2, shape2 = 5), color = "red", size = 1.2) Exercise 4.11 (Gamma random variable) A random variable with PDF \\[\\begin{equation} p(x) = \\frac{\\beta^\\alpha}{\\Gamma(\\alpha)} x^{\\alpha - 1}e^{-\\beta x} \\end{equation}\\] is a Gamma random variable with support on the positive numbers and parameters shape \\(\\alpha > 0\\) and rate \\(\\beta > 0\\). We write \\[\\begin{equation} X | \\alpha, \\beta \\sim \\text{Gamma}(\\alpha, \\beta) \\end{equation}\\] and it’s CDF is \\[\\begin{equation} \\frac{\\gamma(\\alpha, \\beta x)}{\\Gamma(\\alpha)}, \\end{equation}\\] where \\(\\gamma(s, x) = \\int_0^x t^{s-1} e^{-t} dt\\). It is usually used in modeling positive phenomena (for example insurance claims and rainfalls). Let \\(X \\sim \\text{Gamma}(1, \\beta)\\). Find the PDF of \\(X\\). Do you recognize this PDF? Let \\(k = \\alpha\\) and \\(\\theta = \\frac{1}{\\beta}\\). Find the PDF of \\(X | k, \\theta \\sim \\text{Gamma}(k, \\theta)\\). Random variables can be reparameterized, and sometimes a reparameterized distribution is more suitable for certain calculations. The first parameterization is for example usually used in Bayesian statistics, while this parameterization is more common in econometrics and some other applied fields. Note that you also need to pay attention to the parameters in statistical software, so diligently read the help files when using functions like rgamma to see how the function is parameterized. 
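As an added illustration of the parameterization note above (a minimal sketch; the shape and rate values below are arbitrary), we can check in R that the rate and the scale parameterizations describe the same density, since \\(\\theta = \\frac{1}{\\beta}\\):
x <- seq(0.1, 5, by = 0.1)
alpha <- 2 # shape
beta <- 3 # rate
all.equal(dgamma(x, shape = alpha, rate = beta),
          dgamma(x, shape = alpha, scale = 1 / beta))
## [1] TRUE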
R: Plot gamma CDF for random variables with shape and rate parameters (1,1), (10,1), (1,10). Solution. \\[\\begin{align} p(x) &= \\frac{\\beta^1}{\\Gamma(1)} x^{1 - 1}e^{-\\beta x} \\\\ &= \\beta e^{-\\beta x} \\end{align}\\] This is the PDF of the exponential distribution with parameter \\(\\beta\\). \\[\\begin{align} p(x) &= \\frac{1}{\\Gamma(k)\\beta^k} x^{k - 1}e^{-\\frac{x}{\\theta}}. \\end{align}\\] set.seed(1) ggplot(data = data.frame(x = seq(0, 25, by = 0.01)), aes(x = x)) + stat_function(fun = pgamma, args = list(shape = 1, rate = 1), aes(color = "Gamma(1,1)")) + stat_function(fun = pgamma, args = list(shape = 10, rate = 1), aes(color = "Gamma(10,1)")) + stat_function(fun = pgamma, args = list(shape = 1, rate = 10), aes(color = "Gamma(1,10)")) Exercise 4.12 (Normal random variable) A random variable with PDF \\[\\begin{equation} p(x) = \\frac{1}{\\sqrt{2 \\pi \\sigma^2}} e^{-\\frac{(x - \\mu)^2}{2 \\sigma^2}} \\end{equation}\\] is a normal random variable with support on the real axis and parameters \\(\\mu\\) in reals and \\(\\sigma^2 > 0\\). The first is the mean parameter and the second is the variance parameter. Many statistical methods assume a normal distribution. We denote \\[\\begin{equation} X | \\mu, \\sigma \\sim \\text{N}(\\mu, \\sigma^2), \\end{equation}\\] and it’s CDF is \\[\\begin{equation} F(x) = \\int_{-\\infty}^x \\frac{1}{\\sqrt{2 \\pi \\sigma^2}} e^{-\\frac{(t - \\mu)^2}{2 \\sigma^2}} dt, \\end{equation}\\] which is intractable and is usually approximated. Due to its flexibility it is also one of the most researched distributions. For that reason statisticians often use transformations of variables or approximate distributions with the normal distribution. Show that a variable \\(\\frac{X - \\mu}{\\sigma} \\sim \\text{N}(0,1)\\). This transformation is called standardization, and \\(\\text{N}(0,1)\\) is a standard normal distribution. R: Plot the normal distribution with \\(\\mu = 0\\) and different values for the \\(\\sigma\\) parameter. R: The normal distribution provides a good approximation for the Poisson distribution with a large \\(\\lambda\\). Let \\(X \\sim \\text{Poisson}(50)\\). Approximate \\(X\\) with the normal distribution and compare its density with the Poisson histogram. What are the values of \\(\\mu\\) and \\(\\sigma^2\\) that should provide the best approximation? Note that R function rnorm takes standard deviation (\\(\\sigma\\)) as a parameter and not variance. Solution. \\[\\begin{align} P(\\frac{X - \\mu}{\\sigma} < x) &= P(X < \\sigma x + \\mu) \\\\ &= F(\\sigma x + \\mu) \\\\ &= \\int_{-\\infty}^{\\sigma x + \\mu} \\frac{1}{\\sqrt{2 \\pi \\sigma^2}} e^{-\\frac{(t - \\mu)^2}{2\\sigma^2}} dt \\end{align}\\] Now let \\(s = f(t) = \\frac{t - \\mu}{\\sigma}\\), then \\(ds = \\frac{dt}{\\sigma}\\) and \\(f(\\sigma x + \\mu) = x\\), so \\[\\begin{align} P(\\frac{X - \\mu}{\\sigma} < x) &= \\int_{-\\infty}^{x} \\frac{1}{\\sqrt{2 \\pi}} e^{-\\frac{s^2}{2}} ds. \\end{align}\\] There is no need to evaluate this integral, as we recognize it as the CDF of a normal distribution with \\(\\mu = 0\\) and \\(\\sigma^2 = 1\\). 
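Before the plots for b) and c), here is a small added sanity check of the standardization result from a) (a sketch with arbitrarily chosen \\(\\mu\\) and \\(\\sigma\\)):
set.seed(2)
mu <- 3
sigma <- 2
x <- rnorm(10000, mean = mu, sd = sigma)
z <- (x - mu) / sigma # standardized samples
c(mean(z), sd(z)) # should be close to 0 and 1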
set.seed(1) # b ggplot(data = data.frame(x = seq(-15, 15, by = 0.01)), aes(x = x)) + stat_function(fun = dnorm, args = list(mean = 0, sd = 1), aes(color = "sd = 1")) + stat_function(fun = dnorm, args = list(mean = 0, sd = 0.4), aes(color = "sd = 0.1")) + stat_function(fun = dnorm, args = list(mean = 0, sd = 2), aes(color = "sd = 2")) + stat_function(fun = dnorm, args = list(mean = 0, sd = 5), aes(color = "sd = 5")) # c mean_par <- 50 nsamps <- 100000 pois_samps <- rpois(nsamps, lambda = mean_par) norm_samps <- rnorm(nsamps, mean = mean_par, sd = sqrt(mean_par)) my_plot <- ggplot() + geom_bar(data = tibble(x = pois_samps), aes(x = x, y = (..count..)/sum(..count..))) + geom_density(data = tibble(x = norm_samps), aes(x = x), color = "red") plot(my_plot) Exercise 4.13 (Logistic random variable) A logistic random variable has CDF \\[\\begin{equation} F(x) = \\frac{1}{1 + e^{-\\frac{x - \\mu}{s}}}, \\end{equation}\\] where \\(\\mu\\) is real and \\(s > 0\\). The support is on the real axis. We denote \\[\\begin{equation} X | \\mu, s \\sim \\text{Logistic}(\\mu, s). \\end{equation}\\] The distribution of the logistic random variable resembles a normal random variable, however it has heavier tails. Find the PDF of a logistic random variable. R: Implement logistic PDF and CDF and visually compare both for \\(X \\sim \\text{N}(0, 1)\\) and \\(Y \\sim \\text{logit}(0, \\sqrt{\\frac{3}{\\pi^2}})\\). These distributions have the same mean and variance. Additionally, plot the same plot on the interval [5,10], to better see the difference in the tails. R: For the distributions in b) find the probability \\(P(|X| > 4)\\) and interpret the result. Solution. \\[\\begin{align} p(x) &= \\frac{d}{dx} \\frac{1}{1 + e^{-\\frac{x - \\mu}{s}}} \\\\ &= \\frac{- \\frac{d}{dx} (1 + e^{-\\frac{x - \\mu}{s}})}{(1 + e{-\\frac{x - \\mu}{s}})^2} \\\\ &= \\frac{e^{-\\frac{x - \\mu}{s}}}{s(1 + e{-\\frac{x - \\mu}{s}})^2}. \\end{align}\\] # b set.seed(1) logit_pdf <- function (x, mu, s) { return ((exp(-(x - mu)/(s))) / (s * (1 + exp(-(x - mu)/(s)))^2)) } nl_plot <- ggplot(data = data.frame(x = seq(-12, 12, by = 0.01)), aes(x = x)) + stat_function(fun = dnorm, args = list(mean = 0, sd = 2), aes(color = "normal")) + stat_function(fun = logit_pdf, args = list(mu = 0, s = sqrt(12/pi^2)), aes(color = "logit")) plot(nl_plot) nl_plot <- ggplot(data = data.frame(x = seq(5, 10, by = 0.01)), aes(x = x)) + stat_function(fun = dnorm, args = list(mean = 0, sd = 2), aes(color = "normal")) + stat_function(fun = logit_pdf, args = list(mu = 0, s = sqrt(12/pi^2)), aes(color = "logit")) plot(nl_plot) # c logit_cdf <- function (x, mu, s) { return (1 / (1 + exp(-(x - mu) / s))) } p_logistic <- 1 - logit_cdf(4, 0, sqrt(12/pi^2)) + logit_cdf(-4, 0, sqrt(12/pi^2)) p_norm <- 1 - pnorm(4, 0, 2) + pnorm(-4, 0, 2) p_logistic ## [1] 0.05178347 p_norm ## [1] 0.04550026 # Logistic distribution has wider tails, therefore the probability of larger # absolute values is higher. 4.4 Singular random variables Exercise 4.14 (Cantor distribution) The Cantor set is a subset of \\([0,1]\\), which we create by iteratively deleting the middle third of the interval. For example, in the first iteration, we get the sets \\([0,\\frac{1}{3}]\\) and \\([\\frac{2}{3},1]\\). In the second iteration, we get \\([0,\\frac{1}{9}]\\), \\([\\frac{2}{9},\\frac{1}{3}]\\), \\([\\frac{2}{3}, \\frac{7}{9}]\\), and \\([\\frac{8}{9}, 1]\\). 
On the \\(n\\)-th iteration, we have \\[\\begin{equation} C_n = \\frac{C_{n-1}}{3} \\cup \\bigg(\\frac{2}{3} + \\frac{C_{n-1}}{3} \\bigg), \\end{equation}\\] where \\(C_0 = [0,1]\\). The Cantor set is then defined as the intersection of these sets \\[\\begin{equation} C = \\cap_{n=1}^{\\infty} C_n. \\end{equation}\\] It has the same cardinality as \\([0,1]\\). Another way to define the Cantor set is the set of all numbers on \\([0,1]\\), that do not have a 1 in the ternary representation \\(x = \\sum_{n=1}^\\infty \\frac{x_i}{3^i}, x_i \\in \\{0,1,2\\}\\). A random variable follows the Cantor distribution, if its CDF is the Cantor function (below). You can find the implementations of random number generator, CDF, and quantile functions for the Cantor distributions at https://github.com/Henrygb/CantorDist.R. Show that the Lebesgue measure of the Cantor set is 0. (Jagannathan) Let us look at an infinite sequence of independent fair-coin tosses. If the outcome is heads, let \\(x_i = 2\\) and \\(x_i = 0\\), when tails. Then use these to create \\(x = \\sum_{n=1}^\\infty \\frac{x_i}{3^i}\\). This is a random variable with the Cantor distribution. Show that \\(X\\) has a singular distribution. Solution. \\[\\begin{align} \\lambda(C) &= 1 - \\lambda(C^c) \\\\ &= 1 - \\frac{1}{3}\\sum_{k = 0}^\\infty (\\frac{2}{3})^k \\\\ &= 1 - \\frac{\\frac{1}{3}}{1 - \\frac{2}{3}} \\\\ &= 0. \\end{align}\\] First, for every \\(x\\), the probability of observing it is \\(\\lim_{n \\rightarrow \\infty} \\frac{1}{2^n} = 0\\). Second, the probability that we observe one of all the possible sequences is 1. Therefore \\(P(C) = 1\\). So this is a singular variable. The CDF only increments on the elements of the Cantor set. 4.5 Transformations Exercise 4.15 Let \\(X\\) be a random variable that is uniformly distributed on \\(\\{-2, -1, 0, 1, 2\\}\\). Find the PMF of \\(Y = X^2\\). Solution. \\[\\begin{align} P_Y(y) = \\sum_{x \\in \\sqrt(y)} P_X(x) = \\begin{cases} 0 & y \\notin \\{0,1,4\\} \\\\ \\frac{1}{5} & y = 0 \\\\ \\frac{2}{5} & y \\in \\{1,4\\} \\end{cases} \\end{align}\\] Exercise 4.16 (Lognormal random variable) A lognormal random variable is a variable whose logarithm is normally distributed. In practice, we often encounter skewed data. Usually using a log transformation on such data makes it more symmetric and therefore more suitable for modeling with the normal distribution (more on why we wish to model data with the normal distribution in the following chapters). Let \\(X \\sim \\text{N}(\\mu,\\sigma)\\). Find the PDF of \\(Y: \\log(Y) = X\\). R: Sample from the lognormal distribution with parameters \\(\\mu = 5\\) and \\(\\sigma = 2\\). Plot a histogram of the samples. Then log-transform the samples and plot a histogram along with the theoretical normal PDF. Solution. \\[\\begin{align} p_Y(y) &= p_X(\\log(y)) \\frac{d}{dy} \\log(y) \\\\ &= \\frac{1}{\\sqrt{2 \\pi \\sigma^2}} e^{\\frac{(\\log(y) - \\mu)^2}{2 \\sigma^2}} \\frac{1}{y} \\\\ &= \\frac{1}{y \\sqrt{2 \\pi \\sigma^2}} e^{\\frac{(\\log(y) - \\mu)^2}{2 \\sigma^2}}. 
\\end{align}\\] set.seed(1) nsamps <- 10000 mu <- 0.5 sigma <- 0.4 ln_samps <- rlnorm(nsamps, mu, sigma) ln_plot <- ggplot(data = data.frame(x = ln_samps), aes(x = x)) + geom_histogram(color = "black") plot(ln_plot) norm_samps <- log(ln_samps) n_plot <- ggplot(data = data.frame(x = norm_samps), aes(x = x)) + geom_histogram(aes(y = ..density..), color = "black") + stat_function(fun = dnorm, args = list(mean = mu, sd = sigma), color = "red") plot(n_plot) Exercise 4.17 (Probability integral transform) This exercise is borrowed from Wasserman. Let \\(X\\) have a continuous, strictly increasing CDF \\(F\\). Let \\(Y = F(X)\\). Find the density of \\(Y\\). This is called the probability integral transform. Let \\(U \\sim \\text{Uniform}(0,1)\\) and let \\(X = F^{-1}(U)\\). Show that \\(X \\sim F\\). R: Implement a program that takes Uniform(0,1) random variables and generates random variables from an exponential(\\(\\beta\\)) distribution. Compare your implemented function with function rexp in R. Solution. \\[\\begin{align} F_Y(y) &= P(Y < y) \\\\ &= P(F(X) < y) \\\\ &= P(X < F_X^{-1}(y)) \\\\ &= F_X(F_X^{-1}(y)) \\\\ &= y. \\end{align}\\] From the above it follows that \\(p(y) = 1\\). Note that we need to know the inverse CDF to be able to apply this procedure. \\[\\begin{align} P(X < x) &= P(F^{-1}(U) < x) \\\\ &= P(U < F(x)) \\\\ &= F_U(F(x)) \\\\ &= F(x). \\end{align}\\] set.seed(1) nsamps <- 10000 beta <- 4 generate_exp <- function (n, beta) { tmp <- runif(n) X <- qexp(tmp, beta) return (X) } df <- tibble("R" = rexp(nsamps, beta), "myGenerator" = generate_exp(nsamps, beta)) %>% gather() ggplot(data = df, aes(x = value, fill = key)) + geom_histogram(position = "dodge") "],["mrvs.html", "Chapter 5 Multiple random variables 5.1 General 5.2 Bivariate distribution examples 5.3 Transformations", " Chapter 5 Multiple random variables This chapter deals with multiple random variables and their distributions. The students are expected to acquire the following knowledge: Theoretical Calculation of PDF of transformed multiple random variables. Finding marginal and conditional distributions. R Scatterplots of bivariate random variables. New R functions (for example, expand.grid). .fold-btn { float: right; margin: 5px 5px 0 0; } .fold { border: 1px solid black; min-height: 40px; } 5.1 General Exercise 5.1 Let \\(X \\sim \\text{N}(0,1)\\) and \\(Y \\sim \\text{N}(0,1)\\) be independent random variables. Draw 1000 samples from \\((X,Y)\\) and plot a scatterplot. Now let \\(X \\sim \\text{N}(0,1)\\) and \\(Y | X = x \\sim N(ax, 1)\\). Draw 1000 samples from \\((X,Y)\\) for \\(a = 1\\), \\(a=0\\), and \\(a=-0.5\\). Plot the scatterplots. How would you interpret parameter \\(a\\)? Plot the marginal distribution of \\(Y\\) for cases \\(a=1\\), \\(a=0\\), and \\(a=-0.5\\). Can you guess which distribution it is? set.seed(1) nsamps <- 1000 x <- rnorm(nsamps) y <- rnorm(nsamps) ggplot(data.frame(x, y), aes(x = x, y = y)) + geom_point() y1 <- rnorm(nsamps, mean = 1 * x) y2 <- rnorm(nsamps, mean = 0 * x) y3 <- rnorm(nsamps, mean = -0.5 * x) df <- tibble(x = c(x,x,x), y = c(y1,y2,y3), a = c(rep(1, nsamps), rep(0, nsamps), rep(-0.5, nsamps))) ggplot(df, aes(x = x, y = y)) + geom_point() + facet_wrap(~a) + coord_equal(ratio=1) # Parameter a controls the scale of linear dependency between X and Y. 
ggplot(df, aes(x = y)) + geom_density() + facet_wrap(~a) 5.2 Bivariate distribution examples Exercise 5.2 (Discrete bivariate random variable) Let \\(X\\) represent the event that a die rolls an even number and let \\(Y\\) represent the event that a die rolls one, two, or a three. Find the marginal distributions of \\(X\\) and \\(Y\\). Find the PMF of \\((X,Y)\\). Find the CDF of \\((X,Y)\\). Find \\(P(X = 1 | Y = 1)\\). Solution. \\[\\begin{align} P(X = 1) = \\frac{1}{2} \\text{ and } P(X = 0) = \\frac{1}{2} \\\\ P(Y = 1) = \\frac{1}{2} \\text{ and } P(Y = 0) = \\frac{1}{2} \\\\ \\end{align}\\] \\[\\begin{align} P(X = 1, Y = 1) = \\frac{1}{6} \\\\ P(X = 1, Y = 0) = \\frac{2}{6} \\\\ P(X = 0, Y = 1) = \\frac{2}{6} \\\\ P(X = 0, Y = 0) = \\frac{1}{6} \\end{align}\\] \\[\\begin{align} P(X \\leq x, Y \\leq y) = \\begin{cases} \\frac{1}{6} & x = 0, y = 0 \\\\ \\frac{3}{6} & x \\neq y \\\\ 1 & x = 1, y = 1 \\end{cases} \\end{align}\\] \\[\\begin{align} P(X = 1 | Y = 1) = \\frac{1}{3} \\end{align}\\] Exercise 5.3 (Continuous bivariate random variable) Let \\(p(x,y) = 6 (x - y)^2\\) be the PDF of a bivariate random variable \\((X,Y)\\), where both variables range from zero to one. Find CDF. Find marginal distributions. Find conditional distributions. R: Plot a grid of points and colour them by value – this can help us visualize the PDF. R: Implement a random number generator, which will generate numbers from \\((X,Y)\\) and visually check the results. R: Plot the marginal distribution of \\(Y\\) and the conditional distributions of \\(X | Y = y\\), where \\(y \\in \\{0, 0.1, 0.5\\}\\). Solution. \\[\\begin{align} F(x,y) &= \\int_0^{x} \\int_0^{y} 6 (t - s)^2 ds dt\\\\ &= 6 \\int_0^{x} \\int_0^{y} t^2 - 2ts + s^2 ds dt\\\\ &= 6 \\int_0^{x} t^2y - ty^2 + \\frac{y^3}{3} dt \\\\ &= 6 (\\frac{x^3 y}{3} - \\frac{x^2y^2}{2} + \\frac{x y^3}{3}) \\\\ &= 2 x^3 y - 3 t^2y^2 + 2 x y^3 \\end{align}\\] \\[\\begin{align} p(x) &= \\int_0^{1} 6 (x - y)^2 dy\\\\ &= 6 (x^2 - x + \\frac{1}{3}) \\\\ &= 6x^2 - 6x + 2 \\end{align}\\] \\[\\begin{align} p(y) &= \\int_0^{1} 6 (x - y)^2 dx\\\\ &= 6 (y^2 - y + \\frac{1}{3}) \\\\ &= 6y^2 - 6y + 2 \\end{align}\\] \\[\\begin{align} p(x|y) &= \\frac{p(xy)}{p(y)} \\\\ &= \\frac{6 (x - y)^2}{6 (y^2 - y + \\frac{1}{3})} \\\\ &= \\frac{(x - y)^2}{y^2 - y + \\frac{1}{3}} \\end{align}\\] \\[\\begin{align} p(y|x) &= \\frac{p(xy)}{p(x)} \\\\ &= \\frac{6 (x - y)^2}{6 (x^2 - x + \\frac{1}{3})} \\\\ &= \\frac{(x - y)^2}{x^2 - x + \\frac{1}{3}} \\end{align}\\] set.seed(1) # d pxy <- function (x, y) { return ((x - y)^2) } x_axis <- seq(0, 1, length.out = 100) y_axis <- seq(0, 1, length.out = 100) df <- expand.grid(x_axis, y_axis) colnames(df) <- c("x", "y") df <- cbind(df, pdf = pxy(df$x, df$y)) ggplot(data = df, aes(x = x, y = y, color = pdf)) + geom_point() # e samps <- NULL for (i in 1:10000) { xt <- runif(1, 0, 1) yt <- runif(1, 0, 1) pdft <- pxy(xt, yt) acc <- runif(1, 0, 6) if (acc <= pdft) { samps <- rbind(samps, c(xt, yt)) } } colnames(samps) <- c("x", "y") ggplot(data = as.data.frame(samps), aes(x = x, y = y)) + geom_point() # f mar_pdf <- function (x) { return (6 * x^2 - 6 * x + 2) } cond_pdf <- function (x, y) { return (((x - y)^2) / (y^2 - y + 1/3)) } df <- tibble(x = x_axis, mar = mar_pdf(x), y0 = cond_pdf(x, 0), y0.1 = cond_pdf(x, 0.1), y0.5 = cond_pdf(x, 0.5)) %>% gather(dist, value, -x) ggplot(df, aes(x = x, y = value, color = dist)) + geom_line() Exercise 5.4 (Mixed bivariate random variable) Let \\(f(x,y) = \\frac{\\beta^{\\alpha}}{\\Gamma(\\alpha)y!} x^{y+ \\alpha -1} 
e^{-x(1 + \\beta)}\\) be the PDF of a bivariate random variable, where \\(x \\in (0, \\infty)\\) and \\(y \\in \\mathbb{N}_0\\). Find the marginal distribution of \\(X\\). Do you recognize this distribution? Find the conditional distribution of \\(Y | X\\). Do you recognize this distribution? Calculate the probability \\(P(Y = 2 | X = 2.5)\\) for \\((X,Y)\\). Find the marginal distribution of \\(Y\\). Do you recognize this distribution? R: Take 1000 random samples from \\((X,Y)\\) with parameters \\(\\beta = 1\\) and \\(\\alpha = 1\\). Plot a scatterplot. Plot a bar plot of the marginal distribution of \\(Y\\), and the theoretical PMF calculated from d) on the range from 0 to 10. Hint: Use the gamma function in R.? Solution. \\[\\begin{align} p(x) &= \\sum_{k = 0}^{\\infty} \\frac{\\beta^{\\alpha}}{\\Gamma(\\alpha)k!} x^{k + \\alpha -1} e^{-x(1 + \\beta)} & \\\\ &= \\sum_{k = 0}^{\\infty} \\frac{\\beta^{\\alpha}}{\\Gamma(\\alpha)k!} x^{k} x^{\\alpha -1} e^{-x} e^{-\\beta x} & \\\\ &= \\frac{\\beta^{\\alpha}}{\\Gamma(\\alpha)} x^{\\alpha -1} e^{-\\beta x} \\sum_{k = 0}^{\\infty} \\frac{1}{k!} x^{k} e^{-x} & \\\\ &= \\frac{\\beta^{\\alpha}}{\\Gamma(\\alpha)} x^{\\alpha -1} e^{-\\beta x} & \\text{the last term above sums to one} \\end{align}\\] This is the Gamma PDF. \\[\\begin{align} p(y|x) &= \\frac{p(x,y)}{p(x)} \\\\ &= \\frac{\\frac{\\beta^{\\alpha}}{\\Gamma(\\alpha)y!} x^{y+ \\alpha -1} e^{-x(1 + \\beta)}}{\\frac{\\beta^{\\alpha}}{\\Gamma(\\alpha)} x^{\\alpha -1} e^{-\\beta x}} \\\\ &= \\frac{x^y e^{-x}}{y!}. \\end{align}\\] This is the Poisson PMF. \\[\\begin{align} P(Y = 2 | X = 2.5) = \\frac{2.5^2 e^{-2.5}}{2!} \\approx 0.26. \\end{align}\\] \\[\\begin{align} p(y) &= \\int_{0}^{\\infty} \\frac{\\beta^{\\alpha}}{\\Gamma(\\alpha)y!} x^{y + \\alpha -1} e^{-x(1 + \\beta)} dx & \\\\ &= \\frac{1}{y!} \\int_{0}^{\\infty} \\frac{\\beta^{\\alpha}}{\\Gamma(\\alpha)} x^{(y + \\alpha) -1} e^{-(1 + \\beta)x} dx & \\\\ &= \\frac{1}{y!} \\frac{\\beta^{\\alpha}}{\\Gamma(\\alpha)} \\int_{0}^{\\infty} \\frac{\\Gamma(y + \\alpha)}{(1 + \\beta)^{y + \\alpha}} \\frac{(1 + \\beta)^{y + \\alpha}}{\\Gamma(y + \\alpha)} x^{(y + \\alpha) -1} e^{-(1 + \\beta)x} dx & \\text{complete to Gamma PDF} \\\\ &= \\frac{1}{y!} \\frac{\\beta^{\\alpha}}{\\Gamma(\\alpha)} \\frac{\\Gamma(y + \\alpha)}{(1 + \\beta)^{y + \\alpha}}. \\end{align}\\] We add the terms in the third equality to get a Gamma PDF inside the integral, which then integrates to one. We do not recognize this distribution. set.seed(1) px <- function (x, alpha, beta) { return((1 / factorial(x)) * (beta^alpha / gamma(alpha)) * (gamma(x + alpha) / (1 + beta)^(x + alpha))) } nsamps <- 1000 rx <- rgamma(nsamps, 1, 1) ryx <- rpois(nsamps, rx) ggplot(data = data.frame(x = rx, y = ryx), aes(x = x, y = y)) + geom_point() ggplot(data = data.frame(x = rx, y = ryx), aes(x = y)) + geom_bar(aes(y = (..count..)/sum(..count..))) + stat_function(fun = px, args = list(alpha = 1, beta = 1), color = "red") Exercise 5.5 Let \\(f(x,y) = cx^2y\\) for \\(x^2 \\leq y \\leq 1\\) and zero otherwise. Find such \\(c\\) that \\(f\\) is a PDF of a bivariate random variable. This exercise is borrowed from Wasserman. Solution. \\[\\begin{align} 1 &= \\int_{-1}^{1} \\int_{x^2}^1 cx^2y dy dx \\\\ &= \\int_{-1}^{1} cx^2 (\\frac{1}{2} - \\frac{x^4}{2}) dx \\\\ &= \\frac{c}{2} \\int_{-1}^{1} x^2 - x^6 dx \\\\ &= \\frac{c}{2} (\\frac{1}{3} + \\frac{1}{3} - \\frac{1}{7} - \\frac{1}{7}) \\\\ &= \\frac{c}{2} \\frac{8}{21} \\\\ &= \\frac{4c}{21} \\end{align}\\] It follows \\(c = \\frac{21}{4}\\). 
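R (an optional check, not part of the original exercise): we can verify the normalizing constant with a quick Monte Carlo estimate by sampling uniformly on the bounding box \\([-1,1] \\times [0,1]\\), which has area 2, and averaging the candidate PDF over it; with \\(c = \\frac{21}{4}\\) the estimate should be close to 1.
set.seed(1)
nsamps <- 100000
x <- runif(nsamps, -1, 1)
y <- runif(nsamps, 0, 1)
f <- ifelse(y >= x^2, (21 / 4) * x^2 * y, 0) # the PDF is zero outside x^2 <= y <= 1
2 * mean(f) # box area times the average of f; should be close to 1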
5.3 Transformations Exercise 5.6 Let \\((X,Y)\\) be uniformly distributed on the unit ball \\(\\{(x,y,z) : x^2 + y^2 + z^2 \\leq 1\\}\\). Let \\(R = \\sqrt{X^2 + Y^2 + Z^2}\\). Find the CDF and PDF of \\(R\\). Solution. \\[\\begin{align} P(R < r) &= P(\\sqrt{X^2 + Y^2 + Z^2} < r) \\\\ &= P(X^2 + Y^2 + Z^2 < r^2) \\\\ &= \\frac{\\frac{4}{3} \\pi r^3}{\\frac{4}{3}\\pi} \\\\ &= r^3. \\end{align}\\] The second line shows us that we are looking at the probability which is represented by a smaller ball with radius \\(r\\). To get the probability, we divide it by the radius of the whole ball. We get the PDF by differentiating the CDF, so \\(p(r) = 3r^2\\). "],["integ.html", "Chapter 6 Integration 6.1 Monte Carlo integration 6.2 Lebesgue integrals", " Chapter 6 Integration This chapter deals with abstract and Monte Carlo integration. The students are expected to acquire the following knowledge: Theoretical How to calculate Lebesgue integrals for non-simple functions. R Monte Carlo integration. .fold-btn { float: right; margin: 5px 5px 0 0; } .fold { border: 1px solid black; min-height: 40px; } 6.1 Monte Carlo integration Exercise 6.1 Let \\(X\\) and \\(Y\\) be continuous random variables on the unit interval and \\(p(x,y) = 6(x - y)^2\\). Use Monte Carlo integration to estimate the probability \\(P(0.2 \\leq X \\leq 0.5, \\: 0.1 \\leq Y \\leq 0.2)\\). Can you find the exact value? set.seed(1) nsamps <- 1000 V <- (0.5 - 0.2) * (0.2 - 0.1) x1 <- runif(nsamps, 0.2, 0.5) x2 <- runif(nsamps, 0.1, 0.2) f_1 <- function (x, y) { return (6 * (x - y)^2) } mcint <- V * (1 / nsamps) * sum(f_1(x1, x2)) sdm <- sqrt((V^2 / nsamps) * var(f_1(x1, x2))) mcint ## [1] 0.008793445 sdm ## [1] 0.0002197686 F_1 <- function (x, y) { return (2 * x^3 * y - 3 * x^2 * y^2 + 2 * x * y^3) } F_1(0.5, 0.2) - F_1(0.2, 0.2) - F_1(0.5, 0.1) + F_1(0.2, 0.1) ## [1] 0.0087 6.2 Lebesgue integrals Exercise 6.2 (borrowed from Jagannathan) Find the Lebesgue integral of the following functions on (\\(\\mathbb{R}\\), \\(\\mathcal{B}(\\mathbb{R})\\), \\(\\lambda\\)). \\[\\begin{align} f(\\omega) = \\begin{cases} \\omega, & \\text{for } \\omega = 0,1,...,n \\\\ 0, & \\text{elsewhere} \\end{cases} \\end{align}\\] \\[\\begin{align} f(\\omega) = \\begin{cases} 1, & \\text{for } \\omega = \\mathbb{Q}^c \\cap [0,1] \\\\ 0, & \\text{elsewhere} \\end{cases} \\end{align}\\] \\[\\begin{align} f(\\omega) = \\begin{cases} n, & \\text{for } \\omega = \\mathbb{Q}^c \\cap [0,n] \\\\ 0, & \\text{elsewhere} \\end{cases} \\end{align}\\] Solution. \\[\\begin{align} \\int f(\\omega) d\\lambda = \\sum_{\\omega = 0}^n \\omega \\lambda(\\omega) = 0. \\end{align}\\] \\[\\begin{align} \\int f(\\omega) d\\lambda = 1 \\times \\lambda(\\mathbb{Q}^c \\cap [0,1]) = 1. \\end{align}\\] \\[\\begin{align} \\int f(\\omega) d\\lambda = n \\times \\lambda(\\mathbb{Q}^c \\cap [0,n]) = n^2. \\end{align}\\] Exercise 6.3 (borrowed from Jagannathan) Let \\(c \\in \\mathbb{R}\\) be fixed and (\\(\\mathbb{R}\\), \\(\\mathcal{B}(\\mathbb{R})\\)) a measurable space. If for any Borel set \\(A\\), \\(\\delta_c (A) = 1\\) if \\(c \\in A\\), and \\(\\delta_c (A) = 0\\) otherwise, then \\(\\delta_c\\) is called a Dirac measure. Let \\(g\\) be a non-negative, measurable function. Show that \\(\\int g d \\delta_c = g(c)\\). Solution. 
\\[\\begin{align} \\int g d \\delta_c &= \\sup_{q \\in S(g)} \\int q d \\delta_c \\\\ &= \\sup_{q \\in S(g)} \\sum_{i = 1}^n a_i \\delta_c(A_i) \\\\ &= \\sup_{q \\in S(g)} \\sum_{i = 1}^n a_i \\text{I}_{A_i}(c) \\\\ &= \\sup_{q \\in S(g)} q(c) \\\\ &= g(c) \\end{align}\\] "],["ev.html", "Chapter 7 Expected value 7.1 Discrete random variables 7.2 Continuous random variables 7.3 Sums, functions, conditional expectations 7.4 Covariance", " Chapter 7 Expected value This chapter deals with expected values of random variables. The students are expected to acquire the following knowledge: Theoretical Calculation of the expected value. Calculation of variance and covariance. Cauchy distribution. R Estimation of expected value. Estimation of variance and covariance. .fold-btn { float: right; margin: 5px 5px 0 0; } .fold { border: 1px solid black; min-height: 40px; } 7.1 Discrete random variables Exercise 7.1 (Bernoulli) Let \\(X \\sim \\text{Bernoulli}(p)\\). Find \\(E[X]\\). Find \\(Var[X]\\). R: Let \\(p = 0.4\\). Check your answers to a) and b) with a simulation. Solution. \\[\\begin{align*} E[X] = \\sum_{k=0}^1 p^k (1-p)^{1-k} k = p. \\end{align*}\\] \\[\\begin{align*} Var[X] = E[X^2] - E[X]^2 = \\sum_{k=0}^1 (p^k (1-p)^{1-k} k^2) - p^2 = p(1-p). \\end{align*}\\] set.seed(1) nsamps <- 1000 x <- rbinom(nsamps, 1, 0.4) mean(x) ## [1] 0.394 var(x) ## [1] 0.239003 0.4 * (1 - 0.4) ## [1] 0.24 Exercise 7.2 (Binomial) Let \\(X \\sim \\text{Binomial}(n,p)\\). Find \\(E[X]\\). Find \\(Var[X]\\). Solution. Let \\(X = \\sum_{i=0}^n X_i\\), where \\(X_i \\sim \\text{Bernoulli}(p)\\). Then, due to linearity of expectation \\[\\begin{align*} E[X] = E[\\sum_{i=0}^n X_i] = \\sum_{i=0}^n E[X_i] = np. \\end{align*}\\] Again let \\(X = \\sum_{i=0}^n X_i\\), where \\(X_i \\sim \\text{Bernoulli}(p)\\). Since the Bernoulli variables \\(X_i\\) are independent we have \\[\\begin{align*} Var[X] = Var[\\sum_{i=0}^n X_i] = \\sum_{i=0}^n Var[X_i] = np(1-p). \\end{align*}\\] Exercise 7.3 (Poisson) Let \\(X \\sim \\text{Poisson}(\\lambda)\\). Find \\(E[X]\\). Find \\(Var[X]\\). Solution. \\[\\begin{align*} E[X] &= \\sum_{k=0}^\\infty \\frac{\\lambda^k e^{-\\lambda}}{k!} k & \\\\ &= \\sum_{k=1}^\\infty \\frac{\\lambda^k e^{-\\lambda}}{k!} k & \\text{term at $k=0$ is 0} \\\\ &= e^{-\\lambda} \\lambda \\sum_{k=1}^\\infty \\frac{\\lambda^{k-1}}{(k - 1)!} & \\\\ &= e^{-\\lambda} \\lambda \\sum_{k=0}^\\infty \\frac{\\lambda^{k}}{k!} & \\\\ &= e^{-\\lambda} \\lambda e^\\lambda & \\\\ &= \\lambda. \\end{align*}\\] \\[\\begin{align*} Var[X] &= E[X^2] - E[X]^2 & \\\\ &= e^{-\\lambda} \\lambda \\sum_{k=1}^\\infty k \\frac{\\lambda^{k-1}}{(k - 1)!} - \\lambda^2 & \\\\ &= e^{-\\lambda} \\lambda \\sum_{k=1}^\\infty (k - 1) + 1) \\frac{\\lambda^{k-1}}{(k - 1)!} - \\lambda^2 & \\\\ &= e^{-\\lambda} \\lambda \\big(\\sum_{k=1}^\\infty (k - 1) \\frac{\\lambda^{k-1}}{(k - 1)!} + \\sum_{k=1}^\\infty \\frac{\\lambda^{k-1}}{(k - 1)!}\\Big) - \\lambda^2 & \\\\ &= e^{-\\lambda} \\lambda \\big(\\lambda\\sum_{k=2}^\\infty \\frac{\\lambda^{k-2}}{(k - 2)!} + e^\\lambda\\Big) - \\lambda^2 & \\\\ &= e^{-\\lambda} \\lambda \\big(\\lambda e^\\lambda + e^\\lambda\\Big) - \\lambda^2 & \\\\ &= \\lambda^2 + \\lambda - \\lambda^2 & \\\\ &= \\lambda. \\end{align*}\\] Exercise 7.4 (Geometric) Let \\(X \\sim \\text{Geometric}(p)\\). Find \\(E[X]\\). Hint: \\(\\frac{d}{dx} x^k = k x^{(k - 1)}\\). Solution. 
\\[\\begin{align*} E[X] &= \\sum_{k=0}^\\infty (1 - p)^k p k & \\\\ &= p (1 - p) \\sum_{k=0}^\\infty (1 - p)^{k-1} k & \\\\ &= p (1 - p) \\sum_{k=0}^\\infty -\\frac{d}{dp}(1 - p)^k & \\\\ &= p (1 - p) \\Big(-\\frac{d}{dp}\\Big) \\sum_{k=0}^\\infty (1 - p)^k & \\\\ &= p (1 - p) \\Big(-\\frac{d}{dp}\\Big) \\frac{1}{1 - (1 - p)} & \\text{geometric series} \\\\ &= \\frac{1 - p}{p} \\end{align*}\\] 7.2 Continuous random variables Exercise 7.5 (Gamma) Let \\(X \\sim \\text{Gamma}(\\alpha, \\beta)\\). Hint: \\(\\Gamma(z) = \\int_0^\\infty t^{z-1}e^{-t} dt\\) and \\(\\Gamma(z + 1) = z \\Gamma(z)\\). Find \\(E[X]\\). Find \\(Var[X]\\). R: Let \\(\\alpha = 10\\) and \\(\\beta = 2\\). Plot the density of \\(X\\). Add a horizontal line at the expected value that touches the density curve (geom_segment). Shade the area within a standard deviation of the expected value. Solution. \\[\\begin{align*} E[X] &= \\int_0^\\infty \\frac{\\beta^\\alpha}{\\Gamma(\\alpha)}x^\\alpha e^{-\\beta x} dx & \\\\ &= \\frac{\\beta^\\alpha}{\\Gamma(\\alpha)} \\int_0^\\infty x^\\alpha e^{-\\beta x} dx & \\text{ (let $t = \\beta x$)} \\\\ &= \\frac{\\beta^\\alpha}{\\Gamma(\\alpha) }\\int_0^\\infty \\frac{t^\\alpha}{\\beta^\\alpha} e^{-t} \\frac{dt}{\\beta} & \\\\ &= \\frac{1}{\\beta \\Gamma(\\alpha) }\\int_0^\\infty t^\\alpha e^{-t} dt & \\\\ &= \\frac{\\Gamma(\\alpha + 1)}{\\beta \\Gamma(\\alpha)} & \\\\ &= \\frac{\\alpha \\Gamma(\\alpha)}{\\beta \\Gamma(\\alpha)} & \\\\ &= \\frac{\\alpha}{\\beta}. & \\end{align*}\\] \\[\\begin{align*} Var[X] &= E[X^2] - E[X]^2 \\\\ &= \\int_0^\\infty \\frac{\\beta^\\alpha}{\\Gamma(\\alpha)}x^{\\alpha+1} e^{-\\beta x} dx - \\frac{\\alpha^2}{\\beta^2} \\\\ &= \\frac{\\Gamma(\\alpha + 2)}{\\beta^2 \\Gamma(\\alpha)} - \\frac{\\alpha^2}{\\beta^2} \\\\ &= \\frac{(\\alpha + 1)\\alpha\\Gamma(\\alpha)}{\\beta^2 \\Gamma(\\alpha)} - \\frac{\\alpha^2}{\\beta^2} \\\\ &= \\frac{\\alpha^2 + \\alpha}{\\beta^2} - \\frac{\\alpha^2}{\\beta^2} \\\\ &= \\frac{\\alpha}{\\beta^2}. \\end{align*}\\] set.seed(1) x <- seq(0, 25, by = 0.01) y <- dgamma(x, shape = 10, rate = 2) df <- data.frame(x = x, y = y) ggplot(df, aes(x = x, y = y)) + geom_line() + geom_segment(aes(x = 5, y = 0, xend = 5, yend = dgamma(5, shape = 10, rate = 2)), color = "red") + stat_function(fun = dgamma, args = list(shape = 10, rate = 2), xlim = c(5 - sqrt(10/4), 5 + sqrt(10/4)), geom = "area", fill = "gray", alpha = 0.4) Exercise 7.6 (Beta) Let \\(X \\sim \\text{Beta}(\\alpha, \\beta)\\). Find \\(E[X]\\). Hint 1: \\(\\text{B}(x,y) = \\int_0^1 t^{x-1} (1 - t)^{y-1} dt\\). Hint 2: \\(\\text{B}(x + 1, y) = \\text{B}(x,y)\\frac{x}{x + y}\\). Find \\(Var[X]\\). Solution. \\[\\begin{align*} E[X] &= \\int_0^1 \\frac{x^{\\alpha - 1} (1 - x)^{\\beta - 1}}{\\text{B}(\\alpha, \\beta)} x dx \\\\ &= \\frac{1}{\\text{B}(\\alpha, \\beta)}\\int_0^1 x^{\\alpha} (1 - x)^{\\beta - 1} dx \\\\ &= \\frac{1}{\\text{B}(\\alpha, \\beta)} \\text{B}(\\alpha + 1, \\beta) \\\\ &= \\frac{1}{\\text{B}(\\alpha, \\beta)} \\text{B}(\\alpha, \\beta) \\frac{\\alpha}{\\alpha + \\beta} \\\\ &= \\frac{\\alpha}{\\alpha + \\beta}. 
\\\\ \\end{align*}\\] \\[\\begin{align*} Var[X] &= E[X^2] - E[X]^2 \\\\ &= \\int_0^1 \\frac{x^{\\alpha - 1} (1 - x)^{\\beta - 1}}{\\text{B}(\\alpha, \\beta)} x^2 dx - \\frac{\\alpha^2}{(\\alpha + \\beta)^2} \\\\ &= \\frac{1}{\\text{B}(\\alpha, \\beta)}\\int_0^1 x^{\\alpha + 1} (1 - x)^{\\beta - 1} dx - \\frac{\\alpha^2}{(\\alpha + \\beta)^2} \\\\ &= \\frac{1}{\\text{B}(\\alpha, \\beta)} \\text{B}(\\alpha + 2, \\beta) - \\frac{\\alpha^2}{(\\alpha + \\beta)^2} \\\\ &= \\frac{1}{\\text{B}(\\alpha, \\beta)} \\text{B}(\\alpha + 1, \\beta) \\frac{\\alpha + 1}{\\alpha + \\beta + 1} - \\frac{\\alpha^2}{(\\alpha + \\beta)^2} \\\\ &= \\frac{\\alpha + 1}{\\alpha + \\beta + 1} \\frac{\\alpha}{\\alpha + \\beta} - \\frac{\\alpha^2}{(\\alpha + \\beta)^2}\\\\ &= \\frac{\\alpha \\beta}{(\\alpha + \\beta)^2(\\alpha + \\beta + 1)}. \\end{align*}\\] Exercise 7.7 (Exponential) Let \\(X \\sim \\text{Exp}(\\lambda)\\). Find \\(E[X]\\). Hint: \\(\\Gamma(z + 1) = z\\Gamma(z)\\) and \\(\\Gamma(1) = 1\\). Find \\(Var[X]\\). Solution. \\[\\begin{align*} E[X] &= \\int_0^\\infty \\lambda e^{-\\lambda x} x dx & \\\\ &= \\lambda \\int_0^\\infty x e^{-\\lambda x} dx & \\\\ &= \\lambda \\int_0^\\infty \\frac{t}{\\lambda} e^{-t} \\frac{dt}{\\lambda} & \\text{$t = \\lambda x$}\\\\ &= \\lambda \\lambda^{-2} \\Gamma(2) & \\text{definition of gamma function} \\\\ &= \\lambda^{-1}. \\end{align*}\\] \\[\\begin{align*} Var[X] &= E[X^2] - E[X]^2 & \\\\ &= \\int_0^\\infty \\lambda e^{-\\lambda x} x^2 dx - \\lambda^{-2} & \\\\ &= \\lambda \\int_0^\\infty \\frac{t^2}{\\lambda^2} e^{-t} \\frac{dt}{\\lambda} - \\lambda^{-2} & \\text{$t = \\lambda x$} \\\\ &= \\lambda \\lambda^{-3} \\Gamma(3) - \\lambda^{-2} & \\text{definition of gamma function} & \\\\ &= \\lambda^{-2} 2 \\Gamma(2) - \\lambda^{-2} & \\\\ &= 2 \\lambda^{-2} - \\lambda^{-2} & \\\\ &= \\lambda^{-2}. & \\\\ \\end{align*}\\] Exercise 7.8 (Normal) Let \\(X \\sim \\text{N}(\\mu, \\sigma)\\). Show that \\(E[X] = \\mu\\). Hint: Use the error function \\(\\text{erf}(x) = \\frac{1}{\\sqrt(\\pi)} \\int_{-x}^x e^{-t^2} dt\\). The statistical interpretation of this function is that if \\(Y \\sim \\text{N}(0, 0.5)\\), then the error function describes the probability of \\(Y\\) falling between \\(-x\\) and \\(x\\). Also, \\(\\text{erf}(\\infty) = 1\\). Show that \\(Var[X] = \\sigma^2\\). Hint: Start with the definition of variance. Solution. \\[\\begin{align*} E[X] &= \\int_{-\\infty}^\\infty \\frac{1}{\\sqrt{2\\pi \\sigma^2}} e^{-\\frac{(x - \\mu)^2}{2\\sigma^2}} x dx & \\\\ &= \\frac{1}{\\sqrt{2\\pi \\sigma^2}} \\int_{-\\infty}^\\infty x e^{-\\frac{(x - \\mu)^2}{2\\sigma^2}} dx & \\\\ &= \\frac{1}{\\sqrt{2\\pi \\sigma^2}} \\int_{-\\infty}^\\infty \\Big(t \\sqrt{2\\sigma^2} + \\mu\\Big)e^{-t^2} \\sqrt{2 \\sigma^2} dt & t = \\frac{x - \\mu}{\\sqrt{2}\\sigma} \\\\ &= \\frac{\\sqrt{2\\sigma^2}}{\\sqrt{\\pi}} \\int_{-\\infty}^\\infty t e^{-t^2} dt + \\frac{1}{\\sqrt{\\pi}} \\int_{-\\infty}^\\infty \\mu e^{-t^2} dt & \\\\ \\end{align*}\\] Let us calculate these integrals separately. \\[\\begin{align*} \\int t e^{-t^2} dt &= -\\frac{1}{2}\\int e^{s} ds & s = -t^2 \\\\ &= -\\frac{e^s}{2} + C \\\\ &= -\\frac{e^{-t^2}}{2} + C & \\text{undoing substitution}. \\end{align*}\\] Inserting the integration limits we get \\[\\begin{align*} \\int_{-\\infty}^\\infty t e^{-t^2} dt &= 0, \\end{align*}\\] due to the integrated function being symmetric. 
Reordering the second integral we get \\[\\begin{align*} \\mu \\frac{1}{\\sqrt{\\pi}} \\int_{-\\infty}^\\infty e^{-t^2} dt &= \\mu \\text{erf}(\\infty) & \\text{definition of error function} \\\\ &= \\mu & \\text{probability of $Y$ falling between $-\\infty$ and $\\infty$}. \\end{align*}\\] Combining all of the above we get \\[\\begin{align*} E[X] &= \\frac{\\sqrt{2\\sigma^2}}{\\sqrt{\\pi}} \\times 0 + \\mu &= \\mu.\\\\ \\end{align*}\\] \\[\\begin{align*} Var[X] &= E[(X - E[X])^2] \\\\ &= E[(X - \\mu)^2] \\\\ &= \\frac{1}{\\sqrt{2\\pi \\sigma^2}} \\int_{-\\infty}^\\infty (x - \\mu)^2 e^{-\\frac{(x - \\mu)^2}{2\\sigma^2}} dx \\\\ &= \\frac{\\sigma^2}{\\sqrt{2\\pi}} \\int_{-\\infty}^\\infty t^2 e^{-\\frac{t^2}{2}} dt \\\\ &= \\frac{\\sigma^2}{\\sqrt{2\\pi}} \\bigg(\\Big(- t e^{-\\frac{t^2}{2}} |_{-\\infty}^\\infty \\Big) + \\int_{-\\infty}^\\infty e^{-\\frac{t^2}{2}} \\bigg) dt & \\text{integration by parts} \\\\ &= \\frac{\\sigma^2}{\\sqrt{2\\pi}} \\sqrt{2 \\pi} \\int_{-\\infty}^\\infty \\frac{1}{\\sqrt(\\pi)}e^{-s^2} \\bigg) & s = \\frac{t}{\\sqrt{2}} \\text{ and evaluating the left expression at the bounds} \\\\ &= \\frac{\\sigma^2}{\\sqrt{2\\pi}} \\sqrt{2 \\pi} \\Big(\\text{erf}(\\infty) & \\text{definition of error function} \\\\ &= \\sigma^2. \\end{align*}\\] 7.3 Sums, functions, conditional expectations Exercise 7.9 (Expectation of transformations) Let \\(X\\) follow a normal distribution with mean \\(\\mu\\) and variance \\(\\sigma^2\\). Find \\(E[2X + 4]\\). Find \\(E[X^2]\\). Find \\(E[\\exp(X)]\\). Hint: Use the error function \\(\\text{erf}(x) = \\frac{1}{\\sqrt(\\pi)} \\int_{-x}^x e^{-t^2} dt\\). Also, \\(\\text{erf}(\\infty) = 1\\). R: Check your results numerically for \\(\\mu = 0.4\\) and \\(\\sigma^2 = 0.25\\) and plot the densities of all four distributions. Solution. \\[\\begin{align} E[2X + 4] &= 2E[X] + 4 & \\text{linearity of expectation} \\\\ &= 2\\mu + 4. \\\\ \\end{align}\\] \\[\\begin{align} E[X^2] &= E[X]^2 + Var[X] & \\text{definition of variance} \\\\ &= \\mu^2 + \\sigma^2. \\end{align}\\] \\[\\begin{align} E[\\exp(X)] &= \\int_{-\\infty}^\\infty \\frac{1}{\\sqrt{2\\pi \\sigma^2}} e^{-\\frac{(x - \\mu)^2}{2\\sigma^2}} e^x dx & \\\\ &= \\frac{1}{\\sqrt{2\\pi \\sigma^2}} \\int_{-\\infty}^\\infty e^{\\frac{2 \\sigma^2 x}{2\\sigma^2} -\\frac{(x - \\mu)^2}{2\\sigma^2}} dx & \\\\ &= \\frac{1}{\\sqrt{2\\pi \\sigma^2}} \\int_{-\\infty}^\\infty e^{-\\frac{x^2 - 2x(\\mu + \\sigma^2) + \\mu^2}{2\\sigma^2}} dx & \\\\ &= \\frac{1}{\\sqrt{2\\pi \\sigma^2}} \\int_{-\\infty}^\\infty e^{-\\frac{(x - (\\mu + \\sigma^2))^2 + \\mu^2 - (\\mu + \\sigma^2)^2}{2\\sigma^2}} dx & \\text{complete the square} \\\\ &= \\frac{1}{\\sqrt{2\\pi \\sigma^2}} e^{\\frac{- \\mu^2 + (\\mu + \\sigma^2)^2}{2\\sigma^2}} \\int_{-\\infty}^\\infty e^{-\\frac{(x - (\\mu + \\sigma^2))^2}{2\\sigma^2}} dx & \\\\ &= \\frac{1}{\\sqrt{2\\pi \\sigma^2}} e^{\\frac{- \\mu^2 + (\\mu + \\sigma^2)^2}{2\\sigma^2}} \\sigma \\sqrt{2 \\pi} \\text{erf}(\\infty) & \\\\ &= e^{\\frac{2\\mu + \\sigma^2}{2}}. \\end{align}\\] set.seed(1) mu <- 0.4 sigma <- 0.5 x <- rnorm(100000, mean = mu, sd = sigma) mean(2*x + 4) ## [1] 4.797756 2 * mu + 4 ## [1] 4.8 mean(x^2) ## [1] 0.4108658 mu^2 + sigma^2 ## [1] 0.41 mean(exp(x)) ## [1] 1.689794 exp((2 * mu + sigma^2) / 2) ## [1] 1.690459 Exercise 7.10 (Sum of independent random variables) Borrowed from Wasserman. Let \\(X_1, X_2,...,X_n\\) be IID random variables with expected value \\(E[X_i] = \\mu\\) and variance \\(Var[X_i] = \\sigma^2\\). 
Find the expected value and variance of \\(\\bar{X} = \\frac{1}{n} \\sum_{i=1}^n X_i\\). \\(\\bar{X}\\) is called a statistic (a function of the values in a sample). It is itself a random variable and its distribution is called a sampling distribution. R: Take \\(n = 5, 10, 100, 1000\\) samples from the N(\\(2\\), \\(6\\)) distribution 10000 times. Plot the theoretical density and the densities of \\(\\bar{X}\\) statistic for each \\(n\\). Intuitively, are the results in correspondence with your calculations? Check them numerically. Solution. Let us start with the expectation of \\(\\bar{X}\\). \\[\\begin{align} E[\\bar{X}] &= E[\\frac{1}{n} \\sum_{i=1}^n X_i] & \\\\ &= \\frac{1}{n} E[\\sum_{i=1}^n X_i] & \\text{ (multiplication with a scalar)} \\\\ &= \\frac{1}{n} \\sum_{i=1}^n E[X_i] & \\text{ (linearity)} \\\\ &= \\frac{1}{n} n \\mu & \\\\ &= \\mu. \\end{align}\\] Now the variance \\[\\begin{align} Var[\\bar{X}] &= Var[\\frac{1}{n} \\sum_{i=1}^n X_i] & \\\\ &= \\frac{1}{n^2} Var[\\sum_{i=1}^n X_i] & \\text{ (multiplication with a scalar)} \\\\ &= \\frac{1}{n^2} \\sum_{i=1}^n Var[X_i] & \\text{ (independence of samples)} \\\\ &= \\frac{1}{n^2} n \\sigma^2 & \\\\ &= \\frac{1}{n} \\sigma^2. \\end{align}\\] set.seed(1) nsamps <- 10000 mu <- 2 sigma <- sqrt(6) N <- c(5, 10, 100, 500) X <- matrix(data = NA, nrow = nsamps, ncol = length(N)) ind <- 1 for (n in N) { for (i in 1:nsamps) { X[i,ind] <- mean(rnorm(n, mu, sigma)) } ind <- ind + 1 } colnames(X) <- N X <- melt(as.data.frame(X)) ggplot(data = X, aes(x = value, colour = variable)) + geom_density() + stat_function(data = data.frame(x = seq(-2, 6, by = 0.01)), aes(x = x), fun = dnorm, args = list(mean = mu, sd = sigma), color = "black") Exercise 7.11 (Conditional expectation) Let \\(X \\in \\mathbb{R}_0^+\\) and \\(Y \\in \\mathbb{N}_0\\) be random variables with joint distribution \\(p_{XY}(X,Y) = \\frac{1}{y + 1} e^{-\\frac{x}{y + 1}} 0.5^{y + 1}\\). Find \\(E[X | Y = y]\\) by first finding \\(p_Y\\) and then \\(p_{X|Y}\\). Find \\(E[X]\\). R: check your answers to a) and b) by drawing 10000 samples from \\(p_Y\\) and \\(p_{X|Y}\\). Solution. \\[\\begin{align} p(y) &= \\int_0^\\infty \\frac{1}{y + 1} e^{-\\frac{x}{y + 1}} 0.5^{y + 1} dx \\\\ &= \\frac{0.5^{y + 1}}{y + 1} \\int_0^\\infty e^{-\\frac{x}{y + 1}} dx \\\\ &= \\frac{0.5^{y + 1}}{y + 1} (y + 1) \\\\ &= 0.5^{y + 1} \\\\ &= 0.5(1 - 0.5)^y. \\end{align}\\] We recognize this as the geometric distribution. \\[\\begin{align} p(x|y) &= \\frac{p(x,y)}{p(y)} \\\\ &= \\frac{1}{y + 1} e^{-\\frac{x}{y + 1}}. \\end{align}\\] We recognize this as the exponential distribution. \\[\\begin{align} E[X | Y = y] &= \\int_0^\\infty x \\frac{1}{y + 1} e^{-\\frac{x}{y + 1}} dx \\\\ &= y + 1 & \\text{expected value of the exponential distribution} \\end{align}\\] Use the law of iterated expectation. \\[\\begin{align} E[X] &= E[E[X | Y]] \\\\ &= E[Y + 1] \\\\ &= E[Y] + 1 \\\\ &= \\frac{1 - 0.5}{0.5} + 1 \\\\ &= 2. \\end{align}\\] set.seed(1) y <- rgeom(100000, 0.5) x <- rexp(100000, rate = 1 / (y + 1)) x2 <- x[y == 3] mean(x2) ## [1] 4.048501 3 + 1 ## [1] 4 mean(x) ## [1] 2.007639 (1 - 0.5) / 0.5 + 1 ## [1] 2 Exercise 7.12 (Cauchy distribution) Let \\(p(x | x_0, \\gamma) = \\frac{1}{\\pi \\gamma \\Big(1 + \\big(\\frac{x - x_0}{\\gamma}\\big)^2\\Big)}\\). A random variable with this PDF follows a Cauchy distribution. This distribution is symmetric and has wider tails than the normal distribution. R: Draw \\(n = 1,...,1000\\) samples from a standard normal and \\(\\text{Cauchy}(0, 1)\\). 
For each \\(n\\) plot the mean and the median of the sample using facets. Interpret the results. To get a mathematical explanation of the results in a), evaluate the integral \\(\\int_0^\\infty \\frac{x}{1 + x^2} dx\\) and consider that \\(E[X] = \\int_{-\\infty}^\\infty \\frac{x}{1 + x^2}dx\\). set.seed(1) n <- 1000 means_n <- vector(mode = "numeric", length = n) means_c <- vector(mode = "numeric", length = n) medians_n <- vector(mode = "numeric", length = n) medians_c <- vector(mode = "numeric", length = n) for (i in 1:n) { tmp_n <- rnorm(i) tmp_c <- rcauchy(i) means_n[i] <- mean(tmp_n) means_c[i] <- mean(tmp_c) medians_n[i] <- median(tmp_n) medians_c[i] <- median(tmp_c) } df <- data.frame("distribution" = c(rep("normal", 2 * n), rep("Cauchy", 2 * n)), "type" = c(rep("mean", n), rep("median", n), rep("mean", n), rep("median", n)), "value" = c(means_n, medians_n, means_c, medians_c), "n" = rep(1:n, times = 4)) ggplot(df, aes(x = n, y = value)) + geom_line(alpha = 0.5) + facet_wrap(~ type + distribution, scales = "free") Solution. \\[\\begin{align} \\int_0^\\infty \\frac{x}{1 + x^2} dx &= \\frac{1}{2} \\int_1^\\infty \\frac{1}{u} du & u = 1 + x^2 \\\\ &= \\frac{1}{2} \\ln(u) \\big|_1^\\infty. \\end{align}\\] This integral is not finite. The same holds for the negative part. Therefore, the expectation is undefined, as \\(E[|X|] = \\infty\\). Why can we not just claim that \\(f(x) = x / (1 + x^2)\\) is odd and \\(\\int_{-\\infty}^\\infty f(x) dx = 0\\)? By definition of the Lebesgue integral \\(\\int_{-\\infty}^{\\infty} f = \\int_{-\\infty}^{\\infty} f_+ - \\int_{-\\infty}^{\\infty} f_-\\). At least one of the two integrals needs to be finite for \\(\\int_{-\\infty}^{\\infty} f\\) to be well-defined. However \\(\\int_{-\\infty}^{\\infty} f_+ = \\int_0^{\\infty} x/(1+x^2) dx\\) and \\(\\int_{-\\infty}^{\\infty} f_- = \\int_{-\\infty}^{0} |x|/(1+x^2) dx\\). We have just shown that both of these integrals are infinite, so neither side is finite. 7.4 Covariance Exercise 7.13 Below are paired observations of random variables \\(X\\) and \\(Y\\): (2.1, 8), (-0.5, 11), (1, 10), (-2, 12), (4, 9). Find the sample covariance of \\(X\\) and \\(Y\\). Find the sample variances of \\(X\\) and \\(Y\\). Find the sample correlation of \\(X\\) and \\(Y\\). Find the sample variance of \\(Z = 2X - 3Y\\). Solution. \\(\\bar{X} = 0.92\\) and \\(\\bar{Y} = 10\\). \\[\\begin{align} s(X, Y) &= \\frac{1}{n - 1} \\sum_{i=1}^5 (X_i - 0.92) (Y_i - 10) \\\\ &= -3.175. \\end{align}\\] \\[\\begin{align} s(X) &= \\frac{\\sum_{i=1}^5(X_i - 0.92)^2}{5 - 1} \\\\ &= 5.357. \\end{align}\\] \\[\\begin{align} s(Y) &= \\frac{\\sum_{i=1}^5(Y_i - 10)^2}{5 - 1} \\\\ &= 2.5. \\end{align}\\] \\[\\begin{align} r(X,Y) &= \\frac{s(X,Y)}{\\sqrt{s(X)s(Y)}} \\\\ &= \\frac{-3.175}{\\sqrt{5.357 \\times 2.5}} \\\\ &= -0.868. \\end{align}\\] \\[\\begin{align} s(Z) &= 2^2 s(X) + (-3)^2 s(Y) + 2 \\times 2 \\times (-3) \\times s(X, Y) \\\\ &= 4 \\times 5.357 + 9 \\times 2.5 - 12 \\times (-3.175) \\\\ &= 82.028. \\end{align}\\] Exercise 7.14 Let \\(X \\sim \\text{Uniform}(0,1)\\) and \\(Y | X = x \\sim \\text{Uniform}(0,x)\\). Find the covariance of \\(X\\) and \\(Y\\). Find the correlation of \\(X\\) and \\(Y\\). R: Check your answers to a) and b) with simulation. Plot \\(X\\) against \\(Y\\) on a scatterplot. Solution. The joint PDF is \\(p(x,y) = p(x)p(y|x) = \\frac{1}{x}\\).
\\[\\begin{align} Cov(X,Y) &= E[XY] - E[X]E[Y] \\\\ \\end{align}\\] Let us first evaluate the first term: \\[\\begin{align} E[XY] &= \\int_0^1 \\int_0^x x y \\frac{1}{x} dy dx \\\\ &= \\int_0^1 \\int_0^x y dy dx \\\\ &= \\int_0^1 \\frac{x^2}{2} dx \\\\ &= \\frac{1}{6}. \\end{align}\\] Now let us find \\(E[Y]\\), \\(E[X]\\) is trivial. \\[\\begin{align} E[Y] = E[E[Y | X]] = E[\\frac{X}{2}] = \\frac{1}{2} \\int_0^1 x dx = \\frac{1}{4}. \\end{align}\\] Combining all: \\[\\begin{align} Cov(X,Y) &= \\frac{1}{6} - \\frac{1}{2} \\frac{1}{4} = \\frac{1}{24}. \\end{align}\\] \\[\\begin{align} \\rho(X,Y) &= \\frac{Cov(X,Y)}{\\sqrt{Var[X]Var[Y]}} \\\\ \\end{align}\\] Let us calculate \\(Var[X]\\). \\[\\begin{align} Var[X] &= E[X^2] - \\frac{1}{4} \\\\ &= \\int_0^1 x^2 - \\frac{1}{4} \\\\ &= \\frac{1}{3} - \\frac{1}{4} \\\\ &= \\frac{1}{12}. \\end{align}\\] Let us calculate \\(E[E[Y^2|X]]\\). \\[\\begin{align} E[E[Y^2|X]] &= E[\\frac{x^2}{3}] \\\\ &= \\frac{1}{9}. \\end{align}\\] Then \\(Var[Y] = \\frac{1}{9} - \\frac{1}{16} = \\frac{7}{144}\\). Combining all \\[\\begin{align} \\rho(X,Y) &= \\frac{\\frac{1}{24}}{\\sqrt{\\frac{1}{12}\\frac{7}{144}}} \\\\ &= 0.65. \\end{align}\\] set.seed(1) nsamps <- 10000 x <- runif(nsamps) y <- runif(nsamps, 0, x) cov(x, y) ## [1] 0.04274061 1/24 ## [1] 0.04166667 cor(x, y) ## [1] 0.6629567 (1 / 24) / (sqrt(7 / (12 * 144))) ## [1] 0.6546537 ggplot(data.frame(x = x, y = y), aes(x = x, y = y)) + geom_point(alpha = 0.2) + geom_smooth(method = "lm") "],["mrv.html", "Chapter 8 Multivariate random variables 8.1 Multinomial random variables 8.2 Multivariate normal random variables 8.3 Transformations", " Chapter 8 Multivariate random variables This chapter deals with multivariate random variables. The students are expected to acquire the following knowledge: Theoretical Multinomial distribution. Multivariate normal distribution. Cholesky decomposition. Eigendecomposition. R Sampling from the multinomial distribution. Sampling from the multivariate normal distribution. Matrix decompositions. .fold-btn { float: right; margin: 5px 5px 0 0; } .fold { border: 1px solid black; min-height: 40px; } 8.1 Multinomial random variables Exercise 8.1 Let \\(X_i\\), \\(i = 1,...,k\\) represent \\(k\\) events, and \\(p_i\\) the probabilities of these events happening in a trial. Let \\(n\\) be the number of trials, and \\(X\\) a multivariate random variable, the collection of \\(X_i\\). Then \\(p(x) = \\frac{n!}{x_1!x_2!...x_k!} p_1^{x_1} p_2^{x_2}...p_k^{x_k}\\) is the PMF of a multinomial distribution, where \\(n = \\sum_{i = 1}^k x_i\\). Show that the marginal distribution of \\(X_i\\) is a binomial distribution. Take 1000 samples from the multinomial distribution with \\(n=4\\) and probabilities \\(p = (0.2, 0.2, 0.5, 0.1)\\). Then take 1000 samples from four binomial distributions with the same parameters. Inspect the results visually. Solution. We will approach this proof from the probabilistic point of view. W.L.O.G. let \\(x_1\\) be the marginal distribution we are interested in. The term \\(p^{x_1}\\) denotes the probability that event 1 happened \\(x_1\\) times. For this event not to happen, one of the other events needs to happen. So for each of the remaining trials, the probability of another event is \\(\\sum_{i=2}^k p_i = 1 - p_1\\), and there were \\(n - x_1\\) such trials. What is left to do is to calculate the number of permutations of event 1 happening and event 1 not happening. We choose \\(x_1\\) trials, from \\(n\\) trials. 
Therefore \\(p(x_1) = \\binom{n}{x_1} p_1^{x_1} (1 - p_1)^{n - x_1}\\), which is the binomial PMF. Interested students are encouraged to prove this mathematically. set.seed(1) nsamps <- 1000 samps_mult <- rmultinom(nsamps, 4, prob = c(0.2, 0.2, 0.5, 0.1)) samps_mult <- as_tibble(t(samps_mult)) %>% gather() samps <- tibble( V1 = rbinom(nsamps, 4, 0.2), V2 = rbinom(nsamps, 4, 0.2), V3 = rbinom(nsamps, 4, 0.5), V4 = rbinom(nsamps, 4, 0.1) ) %>% gather() %>% bind_rows(samps_mult) %>% bind_cols("dist" = c(rep("binomial", 4*nsamps), rep("multinomial", 4*nsamps))) ggplot(samps, aes(x = value, fill = dist)) + geom_bar(position = "dodge") + facet_wrap(~ key) Exercise 8.2 (Multinomial expected value) Find the expected value, variance and covariance of the multinomial distribution. Hint: First find the expected value for \\(n = 1\\) and then use the fact that the trials are independent. Solution. Let us first calculate the expected value of \\(X_1\\), when \\(n = 1\\). \\[\\begin{align} E[X_1] &= \\sum_{n_1 = 0}^1 \\sum_{n_2 = 0}^1 ... \\sum_{n_k = 0}^1 \\frac{1}{n_1!n_2!...n_k!}p_1^{n_1}p_2^{n_2}...p_k^{n_k}n_1 \\\\ &= \\sum_{n_1 = 0}^1 \\frac{p_1^{n_1} n_1}{n_1!} \\sum_{n_2 = 0}^1 ... \\sum_{n_k = 0}^1 \\frac{1}{n_2!...n_k!}p_2^{n_2}...p_k^{n_k} \\end{align}\\] When \\(n_1 = 0\\) then the whole terms is zero, so we do not need to evaluate other sums. When \\(n_1 = 1\\), all other \\(n_i\\) must be zero, as we have \\(1 = \\sum_{i=1}^k n_i\\). Therefore the other sums equal \\(1\\). So \\(E[X_1] = p_1\\) and \\(E[X_i] = p_i\\) for \\(i = 1,...,k\\). Now let \\(Y_j\\), \\(j = 1,...,n\\), have a multinomial distribution with \\(n = 1\\), and let \\(X\\) have a multinomial distribution with an arbitrary \\(n\\). Then we can write \\(X = \\sum_{j=1}^n Y_j\\). And due to independence \\[\\begin{align} E[X] &= E[\\sum_{j=1}^n Y_j] \\\\ &= \\sum_{j=1}^n E[Y_j] \\\\ &= np. \\end{align}\\] For the variance, we need \\(E[X^2]\\). Let us follow the same procedure as above and first calculate \\(E[X_i]\\) for \\(n = 1\\). The only thing that changes is that the term \\(n_i\\) becomes \\(n_i^2\\). Since we only have \\(0\\) and \\(1\\) this does not change the outcome. So \\[\\begin{align} Var[X_i] &= E[X_i^2] - E[X_i]^2\\\\ &= p_i(1 - p_i). \\end{align}\\] Analogous to above for arbitrary \\(n\\) \\[\\begin{align} Var[X] &= E[X^2] - E[X]^2 \\\\ &= \\sum_{j=1}^n E[Y_j^2] - \\sum_{j=1}^n E[Y_j]^2 \\\\ &= \\sum_{j=1}^n E[Y_j^2] - E[Y_j]^2 \\\\ &= \\sum_{j=1}^n p(1-p) \\\\ &= np(1-p). \\end{align}\\] To calculate the covariance, we need \\(E[X_i X_j]\\). Again, let us start with \\(n = 1\\). Without loss of generality, let us assume \\(i = 1\\) and \\(j = 2\\). \\[\\begin{align} E[X_1 X_2] = \\sum_{n_1 = 0}^1 \\sum_{n_2 = 0}^1 \\frac{p_1^{n_1} n_1}{n_1!} \\frac{p_2^{n_2} n_2}{n_2!} \\sum_{n_3 = 0}^1 ... \\sum_{n_k = 0}^1 \\frac{1}{n_3!...n_k!}p_3^{n_3}...p_k^{n_k}. \\end{align}\\] In the above expression, at each iteration we multiply with \\(n_1\\) and \\(n_2\\). Since \\(n = 1\\), one of these always has to be zero. Therefore \\(E[X_1 X_2] = 0\\) and \\[\\begin{align} Cov(X_i, X_j) &= E[X_i X_j] - E[X_i]E[X_j] \\\\ &= - p_i p_j. \\end{align}\\] For arbitrary \\(n\\), let \\(X = \\sum_{t = 1}^n Y_t\\) be the sum of independent multinomial random variables \\(Y_t = [X_{1t}, X_{2t},...,X_{kt}]^T\\) with \\(n=1\\). Then \\(X_1 = \\sum_{t = 1}^n X_{1t}\\) and \\(X_2 = \\sum_{l = 1}^n X_{2l}\\). 
\\[\\begin{align} Cov(X_1, X_2) &= E[X_1 X_2] - E[X_1] E[X_2] \\\\ &= E[\\sum_{t = 1}^n X_{1t} \\sum_{l = 1}^n X_{2l}] - n^2 p_1 p_2 \\\\ &= \\sum_{t = 1}^n \\sum_{l = 1}^n E[X_{1t} X_{2l}] - n^2 p_1 p_2. \\end{align}\\] For \\(X_{1t}\\) and \\(X_{2l}\\) the expected value is zero when \\(t = l\\). When \\(t \\neq l\\) then they are independent, so the expected value is the product \\(p_1 p_2\\). There are \\(n^2\\) total terms, and for \\(n\\) of them \\(t = l\\) holds. So \\(E[X_1 X_2] = (n^2 - n) p_1 p_2\\). Inserting into the above \\[\\begin{align} Cov(X_1, X_2) &= (n^2 - n) p_1 p_2 - n^2 p_1 p_2 \\\\ &= - n p_1 p_2. \\end{align}\\] 8.2 Multivariate normal random variables Exercise 8.3 (Cholesky decomposition) Let \\(X\\) be a random vector of length \\(k\\) with \\(X_i \\sim \\text{N}(0, 1)\\) and \\(LL^*\\) the Cholesky decomposition of a Hermitian positive-definite matrix \\(A\\). Let \\(\\mu\\) be a vector of length \\(k\\). Find the distribution of the random vector \\(Y = \\mu + L X\\). Find the Cholesky decomposition of \\(A = \\begin{bmatrix} 2 & 1.2 \\\\ 1.2 & 1 \\end{bmatrix}\\). R: Use the results from a) and b) to sample from the MVN distribution \\(\\text{N}(\\mu, A)\\), where \\(\\mu = [1.5, -1]^T\\). Plot a scatterplot and compare it to direct samples from the multivariate normal distribution (rmvnorm). Solution. \\(X\\) has an independent normal distribution of dimension \\(k\\). Then \\[\\begin{align} Y = \\mu + L X &\\sim \\text{N}(\\mu, LL^T) \\\\ &\\sim \\text{N}(\\mu, A). \\end{align}\\] Solve \\[\\begin{align} \\begin{bmatrix} a & 0 \\\\ b & c \\end{bmatrix} \\begin{bmatrix} a & b \\\\ 0 & c \\end{bmatrix} = \\begin{bmatrix} 2 & 1.2 \\\\ 1.2 & 1 \\end{bmatrix} \\end{align}\\] # a set.seed(1) nsamps <- 1000 X <- matrix(data = rnorm(nsamps * 2), ncol = 2) mu <- c(1.5, -1) L <- matrix(data = c(sqrt(2), 0, 1.2 / sqrt(2), sqrt(1 - 1.2^2/2)), ncol = 2, byrow = TRUE) Y <- t(mu + L %*% t(X)) plot_df <- data.frame(rbind(X, Y), c(rep("X", nsamps), rep("Y", nsamps))) colnames(plot_df) <- c("D1", "D2", "var") ggplot(data = plot_df, aes(x = D1, y = D2, colour = as.factor(var))) + geom_point() Exercise 8.4 (Eigendecomposition) R: Let \\(\\Sigma = U \\Lambda U^T\\) be the eigendecomposition of covariance matrix \\(\\Sigma\\). Follow the procedure below, to sample from a multivariate normal with \\(\\mu = [-2, 1]^T\\) and \\(\\Sigma = \\begin{bmatrix} 0.3, -0.5 \\\\ -0.5, 1.6 \\end{bmatrix}\\): Sample from two independent standardized normal distributions to get \\(X\\). Find the eigendecomposition of \\(X\\) (eigen). Multiply \\(X\\) by \\(\\Lambda^{\\frac{1}{2}}\\) to get \\(X2\\). Consider how the eigendecomposition for \\(X2\\) changes compared to \\(X\\). Multiply \\(X2\\) by \\(U\\) to get \\(X3\\). Consider how the eigendecomposition for \\(X3\\) changes compared to \\(X2\\). Add \\(\\mu\\) to \\(X3\\). Consider how the eigendecomposition for \\(X4\\) changes compared to \\(X3\\). Plot the data and the eigenvectors (scaled with \\(\\Lambda^{\\frac{1}{2}}\\)) at each step. Hint: Use geom_segment for the eigenvectors. 
# a set.seed(1) sigma <- matrix(data = c(0.3, -0.5, -0.5, 1.6), nrow = 2, byrow = TRUE) ed <- eigen(sigma) e_val <- ed$values e_vec <- ed$vectors # b set.seed(1) nsamps <- 1000 X <- matrix(data = rnorm(nsamps * 2), ncol = 2) vec1 <- matrix(c(1,0,0,1), nrow = 2) X2 <- t(sqrt(diag(e_val)) %*% t(X)) vec2 <- sqrt(diag(e_val)) %*% vec1 X3 <- t(e_vec %*% t(X2)) vec3 <- e_vec %*% vec2 X4 <- t(c(-2, 1) + t(X3)) vec4 <- c(-2, 1) + vec3 vec_mat <- data.frame(matrix(c(0,0,0,0,0,0,0,0,0,0,0,0,-2,1,-2,1), ncol = 2, byrow = TRUE), t(cbind(vec1, vec2, vec3, vec4)), c(1,1,2,2,3,3,4,4)) df <- data.frame(rbind(X, X2, X3, X4), c(rep(1, nsamps), rep(2, nsamps), rep(3, nsamps), rep(4, nsamps))) colnames(df) <- c("D1", "D2", "wh") colnames(vec_mat) <- c("D1", "D2", "E1", "E2", "wh") ggplot(data = df, aes(x = D1, y = D2)) + geom_point() + geom_segment(data = vec_mat, aes(xend = E1, yend = E2), color = "red") + facet_wrap(~ wh) + coord_fixed() Exercise 8.5 (Marginal and conditional distributions) Let \\(X \\sim \\text{N}(\\mu, \\Sigma)\\), where \\(\\mu = [2, 0, -1]^T\\) and \\(\\Sigma = \\begin{bmatrix} 1 & -0.2 & 0.5 \\\\ -0.2 & 1.4 & -1.2 \\\\ 0.5 & -1.2 & 2 \\\\ \\end{bmatrix}\\). Let \\(A\\) represent the first two random variables and \\(B\\) the third random variable. R: For the calculation in the following points, you can use R. Find the marginal distribution of \\(B\\). Find the conditional distribution of \\(B | A = [a_1, a_2]^T\\). Find the marginal distribution of \\(A\\). Find the conditional distribution of \\(A | B = b\\). R: Visually compare the distributions of a) and b), and c) and d) at three different conditional values. mu <- c(2, 0, -1) Sigma <- matrix(c(1, -0.2, 0.5, -0.2, 1.4, -1.2, 0.5, -1.2, 2), nrow = 3, byrow = TRUE) mu_A <- c(2, 0) mu_B <- -1 Sigma_A <- Sigma[1:2, 1:2] Sigma_B <- Sigma[3, 3] Sigma_AB <- Sigma[1:2, 3] # b tmp_b <- t(Sigma_AB) %*% solve(Sigma_A) mu_b <- mu_B - tmp_b %*% mu_A Sigma_b <- Sigma_B - t(Sigma_AB) %*% solve(Sigma_A) %*% Sigma_AB mu_b ## [,1] ## [1,] -1.676471 tmp_b ## [,1] [,2] ## [1,] 0.3382353 -0.8088235 Sigma_b ## [,1] ## [1,] 0.8602941 # d tmp_a <- Sigma_AB * (1 / Sigma_B) mu_a <- mu_A - tmp_a * mu_B Sigma_d <- Sigma_A - (Sigma_AB * (1 / Sigma_B)) %*% t(Sigma_AB) mu_a ## [1] 2.25 -0.60 tmp_a ## [1] 0.25 -0.60 Sigma_d ## [,1] [,2] ## [1,] 0.875 0.10 ## [2,] 0.100 0.68 Solution. \\(B \\sim \\text{N}(-1, 2)\\). \\(B | A = a \\sim \\text{N}(-1.68 + [0.34, -0.81] a, 0.86)\\). \\(\\mu_A = [2, 0]^T\\) and \\(\\Sigma_A = \\begin{bmatrix} 1 & -0.2 & \\\\ -0.2 & 1.4 \\\\ \\end{bmatrix}\\). 
\\[\\begin{align} A | B = b &\\sim \\text{N}(\\mu_t, \\Sigma_t), \\\\ \\mu_t &= [2.25, -0.6]^T + [0.25, -0.6]^T b, \\\\ \\Sigma_t &= \\begin{bmatrix} 0.875 & 0.1 \\\\ 0.1 & 0.68 \\\\ \\end{bmatrix} \\end{align}\\] library(mvtnorm) set.seed(1) nsamps <- 1000 # a and b samps <- as.data.frame(matrix(data = NA, nrow = 4 * nsamps, ncol = 2)) samps[1:nsamps,1] <- rnorm(nsamps, mu_B, Sigma_B) samps[1:nsamps,2] <- "marginal" for (i in 1:3) { a <- rmvnorm(1, mu_A, Sigma_A) samps[(i*nsamps + 1):((i + 1) * nsamps), 1] <- rnorm(nsamps, mu_b + tmp_b %*% t(a), Sigma_b) samps[(i*nsamps + 1):((i + 1) * nsamps), 2] <- paste0(# "cond", round(a, digits = 2), collapse = "-") } colnames(samps) <- c("x", "dist") ggplot(samps, aes(x = x)) + geom_density() + facet_wrap(~ dist) # c and d samps <- as.data.frame(matrix(data = NA, nrow = 4 * nsamps, ncol = 3)) samps[1:nsamps,1:2] <- rmvnorm(nsamps, mu_A, Sigma_A) samps[1:nsamps,3] <- "marginal" for (i in 1:3) { b <- rnorm(1, mu_B, Sigma_B) samps[(i*nsamps + 1):((i + 1) * nsamps), 1:2] <- rmvnorm(nsamps, mu_a + tmp_a * b, Sigma_d) samps[(i*nsamps + 1):((i + 1) * nsamps), 3] <- b } colnames(samps) <- c("x", "y", "dist") ggplot(samps, aes(x = x, y = y)) + geom_point() + geom_smooth(method = "lm") + facet_wrap(~ dist) 8.3 Transformations Exercise 8.6 Let \\((U,V)\\) be a random variable with PDF \\(p(u,v) = \\frac{1}{4 \\sqrt{u}}\\), \\(U \\in [0,4]\\) and \\(V \\in [\\sqrt{U}, \\sqrt{U} + 1]\\). Let \\(X = \\sqrt{U}\\) and \\(Y = V - \\sqrt{U}\\). Find PDF of \\((X,Y)\\). What can you tell about distributions of \\(X\\) and \\(Y\\)? This exercise shows how we can simplify a probabilistic problem with a clever use of transformations. R: Take 1000 samples from \\((X,Y)\\) and transform them with inverses of the above functions to get samples from \\((U,V)\\). Plot both sets of samples. Solution. First we need to find the inverse functions. Since \\(x = \\sqrt{u}\\) it follows that \\(u = x^2\\), and that \\(x \\in [0,2]\\). Similarly \\(v = y + x\\) and \\(y \\in [0,1]\\). Let us first find the Jacobian. \\[\\renewcommand\\arraystretch{1.6} J(x,y) = \\begin{bmatrix} \\frac{\\partial u}{\\partial x} & \\frac{\\partial v}{\\partial x} \\\\%[1ex] % <-- 1ex more space between rows of matrix \\frac{\\partial u}{\\partial y} & \\frac{\\partial v}{\\partial y} \\end{bmatrix} = \\begin{bmatrix} 2x & 1 \\\\%[1ex] % <-- 1ex more space between rows of matrix 0 & 1 \\end{bmatrix}, \\] and the determinant is \\(|J(x,y)| = 2x\\). Putting everything together, we get \\[\\begin{align} p_{X,Y}(x,y) = p_{U,V}(x^2, y + x) |J(x,y)| = \\frac{1}{4 \\sqrt{x^2}} 2x = \\frac{1}{2}. \\end{align}\\] This reminds us of the Uniform distribution. Indeed we can see that \\(p_X(x) = \\frac{1}{2}\\) and \\(p_Y(y) = 1\\). So instead of dealing with an awkward PDF of \\((U,V)\\) and the corresponding dynamic bounds, we are now looking at two independent Uniform random variables. In practice, this could make modeling much easier. set.seed(1) nsamps <- 2000 x <- runif(nsamps, min = 0, max = 2) y <- runif(nsamps) orig <- tibble(x = x, y = y, vrs = "original") u <- x^2 v <- y + x transf <- tibble(x = u, y = v, vrs = "transformed") df <- bind_rows(orig, transf) ggplot(df, aes(x = x, y = y, color = vrs)) + geom_point(alpha = 0.3) Exercise 8.7 R: Write a function that will calculate the probability density of an arbitraty multivariate normal distribution, based on independent standardized normal PDFs. Compare with dmvnorm from the mvtnorm package. 
library(mvtnorm) set.seed(1) mvn_dens <- function (y, mu, Sigma) { L <- chol(Sigma) L_inv <- solve(t(L)) g_inv <- L_inv %*% t(y - mu) J <- L_inv J_det <- det(J) return(prod(dnorm(g_inv)) * J_det) } mu_v <- c(-2, 0, 1) cov_m <- matrix(c(1, -0.2, 0.5, -0.2, 2, 0.3, 0.5, 0.3, 1.6), ncol = 3, byrow = TRUE) n_comp <- 20 for (i in 1:n_comp) { x <- rmvnorm(1, mean = mu_v, sigma = cov_m) print(paste0("My function: ", mvn_dens(x, mu_v, cov_m), ", dmvnorm: ", dmvnorm(x, mu_v, cov_m))) } ## [1] "My function: 0.0229514237156383, dmvnorm: 0.0229514237156383" ## [1] "My function: 0.00763138915406231, dmvnorm: 0.00763138915406231" ## [1] "My function: 0.0230688881105741, dmvnorm: 0.0230688881105741" ## [1] "My function: 0.0113616213114731, dmvnorm: 0.0113616213114731" ## [1] "My function: 0.00151808500121907, dmvnorm: 0.00151808500121907" ## [1] "My function: 0.0257658045974509, dmvnorm: 0.0257658045974509" ## [1] "My function: 0.0157963825730805, dmvnorm: 0.0157963825730805" ## [1] "My function: 0.00408856287529248, dmvnorm: 0.00408856287529248" ## [1] "My function: 0.0327793540101256, dmvnorm: 0.0327793540101256" ## [1] "My function: 0.0111606542967978, dmvnorm: 0.0111606542967978" ## [1] "My function: 0.0147636757585684, dmvnorm: 0.0147636757585684" ## [1] "My function: 0.0142948300412207, dmvnorm: 0.0142948300412207" ## [1] "My function: 0.0203093820657542, dmvnorm: 0.0203093820657542" ## [1] "My function: 0.0287533273357481, dmvnorm: 0.0287533273357481" ## [1] "My function: 0.0213402305128623, dmvnorm: 0.0213402305128623" ## [1] "My function: 0.0218356957993885, dmvnorm: 0.0218356957993885" ## [1] "My function: 0.0250750113961771, dmvnorm: 0.0250750113961771" ## [1] "My function: 0.0166498666348048, dmvnorm: 0.0166498666348048" ## [1] "My function: 0.00189725106874659, dmvnorm: 0.00189725106874659" ## [1] "My function: 0.0196697814975113, dmvnorm: 0.0196697814975113" "],["ard.html", "Chapter 9 Alternative representation of distributions 9.1 Probability generating functions (PGFs) 9.2 Moment generating functions (MGFs)", " Chapter 9 Alternative representation of distributions This chapter deals with alternative representation of distributions. The students are expected to acquire the following knowledge: Theoretical Probability generating functions. Moment generating functions. .fold-btn { float: right; margin: 5px 5px 0 0; } .fold { border: 1px solid black; min-height: 40px; } 9.1 Probability generating functions (PGFs) Exercise 9.1 Show that the sum of independent Poisson random variables is itself a Poisson random variable. R: Let \\(X\\) be a sum of three Poisson distributions with \\(\\lambda_i \\in \\{2, 5.2, 10\\}\\). Take 1000 samples and plot the three distributions and the sum. Then take 1000 samples from the theoretical distribution of \\(X\\) and compare them to the sum. Solution. Let \\(X_i \\sim \\text{Poisson}(\\lambda_i)\\) for \\(i = 1,...,n\\), and let \\(X = \\sum_{i=1}^n X_i\\). 
\\[\\begin{align} \\alpha_X(t) &= \\prod_{i=1}^n \\alpha_{X_i}(t) \\\\ &= \\prod_{i=1}^n \\bigg( \\sum_{j=0}^\\infty t^j \\frac{\\lambda_i^j e^{-\\lambda_i}}{j!} \\bigg) \\\\ &= \\prod_{i=1}^n \\bigg( e^{-\\lambda_i} \\sum_{j=0}^\\infty \\frac{(t\\lambda_i)^j }{j!} \\bigg) \\\\ &= \\prod_{i=1}^n \\bigg( e^{-\\lambda_i} e^{t \\lambda_i} \\bigg) & \\text{power series} \\\\ &= \\prod_{i=1}^n \\bigg( e^{\\lambda_i(t - 1)} \\bigg) \\\\ &= e^{\\sum_{i=1}^n \\lambda_i(t - 1)} \\\\ &= e^{t \\sum_{i=1}^n \\lambda_i - \\sum_{i=1}^n \\lambda_i} \\\\ &= e^{-\\sum_{i=1}^n \\lambda_i} \\sum_{j=0}^\\infty \\frac{(t \\sum_{i=1}^n \\lambda_i)^j}{j!}\\\\ &= \\sum_{j=0}^\\infty \\frac{e^{-\\sum_{i=1}^n \\lambda_i} (t \\sum_{i=1}^n \\lambda_i)^j}{j!}\\\\ \\end{align}\\] The last term is the PGF of a Poisson random variable with parameter \\(\\sum_{i=1}^n \\lambda_i\\). Because the PGF is unique, \\(X\\) is a Poisson random variable. set.seed(1) library(tidyr) nsamps <- 1000 samps <- matrix(data = NA, nrow = nsamps, ncol = 4) samps[ ,1] <- rpois(nsamps, 2) samps[ ,2] <- rpois(nsamps, 5.2) samps[ ,3] <- rpois(nsamps, 10) samps[ ,4] <- samps[ ,1] + samps[ ,2] + samps[ ,3] colnames(samps) <- c(2, 2.5, 10, "sum") gsamps <- as_tibble(samps) gsamps <- gather(gsamps, key = "dist", value = "value") ggplot(gsamps, aes(x = value)) + geom_bar() + facet_wrap(~ dist) samps <- cbind(samps, "theoretical" = rpois(nsamps, 2 + 5.2 + 10)) gsamps <- as_tibble(samps[ ,4:5]) gsamps <- gather(gsamps, key = "dist", value = "value") ggplot(gsamps, aes(x = value, fill = dist)) + geom_bar(position = "dodge") Exercise 9.2 Find the expected value and variance of the negative binomial distribution. Hint: Find the Taylor series of \\((1 - y)^{-r}\\) at point 0. Solution. Let \\(X \\sim \\text{NB}(r, p)\\). \\[\\begin{align} \\alpha_X(t) &= E[t^X] \\\\ &= \\sum_{j=0}^\\infty t^j \\binom{j + r - 1}{j} (1 - p)^r p^j \\\\ &= (1 - p)^r \\sum_{j=0}^\\infty \\binom{j + r - 1}{j} (tp)^j \\\\ &= (1 - p)^r \\sum_{j=0}^\\infty \\frac{(j + r - 1)(j + r - 2)...r}{j!} (tp)^j. \\\\ \\end{align}\\] Let us look at the Taylor series of \\((1 - y)^{-r}\\) at 0 \\[\\begin{align} (1 - y)^{-r} = &1 + \\frac{-r(-1)}{1!}y + \\frac{-r(-r - 1)(-1)^2}{2!}y^2 + \\\\ &\\frac{-r(-r - 1)(-r - 2)(-1)^3}{3!}y^3 + ... \\\\ \\end{align}\\] How does the \\(k\\)-th term look like? We have \\(k\\) derivatives of our function so \\[\\begin{align} \\frac{d^k}{d^k y} (1 - y)^{-r} &= \\frac{-r(-r - 1)...(-r - k + 1)(-1)^k}{k!}y^k \\\\ &= \\frac{r(r + 1)...(r + k - 1)}{k!}y^k. \\end{align}\\] We observe that this equals to the \\(j\\)-th term in the sum of NB PGF. Therefore \\[\\begin{align} \\alpha_X(t) &= (1 - p)^r (1 - tp)^{-r} \\\\ &= \\Big(\\frac{1 - p}{1 - tp}\\Big)^r \\end{align}\\] To find the expected value, we need to differentiate \\[\\begin{align} \\frac{d}{dt} \\Big(\\frac{1 - p}{1 - tp}\\Big)^r &= r \\Big(\\frac{1 - p}{1 - tp}\\Big)^{r-1} \\frac{d}{dt} \\frac{1 - p}{1 - tp} \\\\ &= r \\Big(\\frac{1 - p}{1 - tp}\\Big)^{r-1} \\frac{p(1 - p)}{(1 - tp)^2}. \\\\ \\end{align}\\] Evaluating this at 1, we get: \\[\\begin{align} E[X] = \\frac{rp}{1 - p}. \\end{align}\\] For the variance we need the second derivative. 
\\[\\begin{align} \\frac{d^2}{d^2t} \\Big(\\frac{1 - p}{1 - tp}\\Big)^r &= \\frac{p^2 r (r + 1) (\\frac{1 - p}{1 - tp})^r}{(tp - 1)^2} \\end{align}\\] Evaluating this at 1 and inserting the first derivatives, we get: \\[\\begin{align} Var[X] &= \\frac{d^2}{dt^2} \\alpha_X(1) + \\frac{d}{dt}\\alpha_X(1) - \\Big(\\frac{d}{dt}\\alpha_X(t) \\Big)^2 \\\\ &= \\frac{p^2 r (r + 1)}{(1 - p)^2} + \\frac{rp}{1 - p} - \\frac{r^2p^2}{(1 - p)^2} \\\\ &= \\frac{rp}{(1 - p)^2}. \\end{align}\\] library(tidyr) set.seed(1) nsamps <- 100000 find_p <- function (mu, r) { return (10 / (r + 10)) } r <- c(1,2,10,20) p <- find_p(10, r) sigma <- rep(sqrt(p*r / (1 - p)^2), each = nsamps) samps <- cbind("r=1" = rnbinom(nsamps, size = r[1], prob = 1 - p[1]), "r=2" = rnbinom(nsamps, size = r[2], prob = 1 - p[2]), "r=4" = rnbinom(nsamps, size = r[3], prob = 1 - p[3]), "r=20" = rnbinom(nsamps, size = r[4], prob = 1 - p[4])) gsamps <- gather(as.data.frame(samps)) iw <- (gsamps$value > sigma + 10) | (gsamps$value < sigma - 10) ggplot(gsamps, aes(x = value, fill = iw)) + geom_bar() + # geom_density() + facet_wrap(~ key) 9.2 Moment generating functions (MGFs) Exercise 9.3 Find the variance of the geometric distribution. Solution. Let \\(X \\sim \\text{Geometric}(p)\\). The MGF of the geometric distribution is \\[\\begin{align} M_X(t) &= E[e^{tX}] \\\\ &= \\sum_{k=0}^\\infty p(1 - p)^k e^{tk} \\\\ &= p \\sum_{k=0}^\\infty ((1 - p)e^t)^k. \\end{align}\\] Let us assume that \\((1 - p)e^t < 1\\). Then, by using the geometric series we get \\[\\begin{align} M_X(t) &= \\frac{p}{1 - e^t + pe^t}. \\end{align}\\] The first derivative of the above expression is \\[\\begin{align} \\frac{d}{dt}M_X(t) &= \\frac{-p(-e^t + pe^t)}{(1 - e^t + pe^t)^2}, \\end{align}\\] and evaluating at \\(t = 0\\), we get \\(\\frac{1 - p}{p}\\), which we already recognize as the expected value of the geometric distribution. The second derivative is \\[\\begin{align} \\frac{d^2}{dt^2}M_X(t) &= \\frac{(p-1)pe^t((p-1)e^t - 1)}{((p - 1)e^t + 1)^3}, \\end{align}\\] and evaluating at \\(t = 0\\), we get \\(\\frac{(p - 1)(p - 2)}{p^2}\\). Combining we get the variance \\[\\begin{align} Var(X) &= \\frac{(p - 1)(p - 2)}{p^2} - \\frac{(1 - p)^2}{p^2} \\\\ &= \\frac{(p-1)(p-2) - (1-p)^2}{p^2} \\\\ &= \\frac{1 - p}{p^2}. \\end{align}\\] Exercise 9.4 Find the distribution of sum of two normal random variables \\(X\\) and \\(Y\\), by comparing \\(M_{X+Y}(t)\\) to \\(M_X(t)\\). R: To illustrate the result draw random samples from N\\((-3, 1)\\) and N\\((5, 1.2)\\) and calculate the empirical mean and variance of \\(X+Y\\). Plot all three histograms in one plot. Solution. Let \\(X \\sim \\text{N}(\\mu_X, 1)\\) and \\(Y \\sim \\text{N}(\\mu_Y, 1)\\). The MGF of the sum is \\[\\begin{align} M_{X+Y}(t) &= M_X(t) M_Y(t). \\end{align}\\] Let us calculate \\(M_X(t)\\), the MGF for \\(Y\\) then follows analogously. 
\\[\\begin{align} M_X(t) &= \\int_{-\\infty}^\\infty e^{tx} \\frac{1}{\\sqrt{2 \\pi \\sigma_X^2}} e^{-\\frac{(x - \\mu_X)^2}{2\\sigma_X^2}} dx \\\\ &= \\int_{-\\infty}^\\infty \\frac{1}{\\sqrt{2 \\pi \\sigma_X^2}} e^{-\\frac{(x - \\mu_X)^2 - 2\\sigma_X^2 tx}{2\\sigma_X^2}} dx \\\\ &= \\int_{-\\infty}^\\infty \\frac{1}{\\sqrt{2 \\pi \\sigma_X^2}} e^{-\\frac{x^2 - 2\\mu_X x + \\mu_X^2 - 2\\sigma_X^2 tx}{2\\sigma_X^2}} dx \\\\ &= \\int_{-\\infty}^\\infty \\frac{1}{\\sqrt{2 \\pi \\sigma_X^2}} e^{-\\frac{(x - (\\mu_X + \\sigma_X^2 t))^2 + \\mu_X^2 - (\\mu_X + \\sigma_X^2 t)^2}{2\\sigma_X^2}} dx & \\text{complete the square}\\\\ &= e^{-\\frac{\\mu_X^2 - (\\mu_X + \\sigma_X^2 t)^2}{2\\sigma_X^2}} \\int_{-\\infty}^\\infty \\frac{1}{\\sqrt{2 \\pi \\sigma_X^2}} e^{-\\frac{(x - (\\mu_X + \\sigma_X^2 t))^2}{2\\sigma_X^2}} dx & \\\\ &= e^{-\\frac{\\mu_X^2 - (\\mu_X + \\sigma_X^2 t)^2}{2\\sigma_X^2}} & \\text{normal PDF integrates to 1} \\\\ &= e^{\\frac{2\\mu_X \\sigma_X^2 t + \\sigma_X^4 t^2}{2\\sigma_X^2}} \\\\ &= e^{\\mu_X t + \\frac{\\sigma_X^2 t^2}{2}}. \\\\ \\end{align}\\] The MGF of the sum is then \\[\\begin{align} M_{X+Y}(t) &= e^{\\mu_X t + \\frac{\\sigma_X^2 t^2}{2}} e^{\\mu_Y t + \\frac{\\sigma_Y^2 t^2}{2}} \\\\ &= e^{t(\\mu_X + \\mu_Y) + \\frac{t^2}{2}(\\sigma_X^2 + \\sigma_Y^2)}. \\end{align}\\] By comparing \\(M_{X+Y}(t)\\) and \\(M_X(t)\\) we observe that both have the same form: the coefficient of \\(t\\) is the mean and the coefficient of \\(\\frac{t^2}{2}\\) is the variance. Since MGFs are unique, we conclude that \\(Z = X + Y \\sim \\text{N}(\\mu_X + \\mu_Y, \\sigma_X^2 + \\sigma_Y^2)\\). library(tidyr) library(ggplot2) set.seed(1) nsamps <- 1000 x <- rnorm(nsamps, -3, 1) y <- rnorm(nsamps, 5, 1.2) z <- x + y mean(z) ## [1] 1.968838 var(z) ## [1] 2.645034 df <- data.frame(x = x, y = y, z = z) %>% gather() ggplot(df, aes(x = value, fill = key)) + geom_histogram(position = "dodge") "],["ci.html", "Chapter 10 Concentration inequalities 10.1 Comparison 10.2 Practical", " Chapter 10 Concentration inequalities This chapter deals with concentration inequalities. The students are expected to acquire the following knowledge: Theoretical More assumptions produce closer bounds. R Optimization. Estimating probability inequalities. 10.1 Comparison Exercise 10.1 R: Let \\(X\\) be a geometric random variable with \\(p = 0.7\\). Visually compare the Markov bound, Chernoff bound, and the theoretical probabilities for \\(x = 1,...,12\\). To get the best fitting Chernoff bound, you will need to optimize the bound depending on \\(t\\). Use either analytical or numerical optimization.
bound_chernoff <- function (t, p, a) { return ((p / (1 - exp(t) + p * exp(t))) / exp(a * t)) } set.seed(1) p <- 0.7 a <- seq(1, 12, by = 1) ci_markov <- (1 - p) / p / a t <- vector(mode = "numeric", length = length(a)) for (i in 1:length(t)) { t[i] <- optimize(bound_chernoff, interval = c(0, log(1 / (1 - p))), p = p, a = a[i])$minimum } t ## [1] 0.5108267 0.7984981 0.9162927 0.9808238 1.0216635 1.0498233 1.0704327 ## [8] 1.0861944 1.0986159 1.1086800 1.1169653 1.1239426 ci_chernoff <- (p / (1 - exp(t) + p * exp(t))) / exp(a * t) actual <- 1 - pgeom(a, 0.7) plot_df <- rbind( data.frame(x = a, y = ci_markov, type = "Markov"), data.frame(x = a, y = ci_chernoff, type = "Chernoff"), data.frame(x = a, y = actual, type = "Actual") ) ggplot(plot_df, aes(x = x, y = y, color = type)) + geom_line() Exercise 10.2 R: Let \\(X\\) be a sum of 100 Beta distributions with random parameters. Take 1000 samples and plot the Chebyshev bound, Hoeffding bound, and the empirical probabilities. set.seed(1) nvars <- 100 nsamps <- 1000 samps <- matrix(data = NA, nrow = nsamps, ncol = nvars) Sn_mean <- 0 Sn_var <- 0 for (i in 1:nvars) { alpha1 <- rgamma(1, 10, 1) beta1 <- rgamma(1, 10, 1) X <- rbeta(nsamps, alpha1, beta1) Sn_mean <- Sn_mean + alpha1 / (alpha1 + beta1) Sn_var <- Sn_var + alpha1 * beta1 / ((alpha1 + beta1)^2 * (alpha1 + beta1 + 1)) samps[ ,i] <- X } mean(apply(samps, 1, sum)) ## [1] 51.12511 Sn_mean ## [1] 51.15723 var(apply(samps, 1, sum)) ## [1] 1.170652 Sn_var ## [1] 1.166183 a <- 1:30 b <- a / sqrt(Sn_var) ci_chebyshev <- 1 / b^2 ci_hoeffding <- 2 * exp(- 2 * a^2 / nvars) empirical <- NULL for (i in 1:length(a)) { empirical[i] <- sum(abs((apply(samps, 1, sum)) - Sn_mean) >= a[i])/ nsamps } plot_df <- rbind( data.frame(x = a, y = ci_chebyshev, type = "Chebyshev"), data.frame(x = a, y = ci_hoeffding, type = "Hoeffding"), data.frame(x = a, y = empirical, type = "Empirical") ) ggplot(plot_df, aes(x = x, y = y, color = type)) + geom_line() ggplot(plot_df, aes(x = x, y = y, color = type)) + geom_line() + coord_cartesian(xlim = c(15, 25), ylim = c(0, 0.05)) 10.2 Practical Exercise 10.3 From Jagannathan. Let \\(X_i\\), \\(i = 1,...n\\), be a random sample of size \\(n\\) of a random variable \\(X\\). Let \\(X\\) have mean \\(\\mu\\) and variance \\(\\sigma^2\\). Find the size of the sample \\(n\\) required so that the probability that the difference between sample mean and true mean is smaller than \\(\\frac{\\sigma}{10}\\) is at least 0.95. Hint: Derive a version of the Chebyshev inequality for \\(P(|X - \\mu| \\geq a)\\) using Markov inequality. Solution. Let \\(\\bar{X} = \\sum_{i=1}^n X_i\\). Then \\(E[\\bar{X}] = \\mu\\) and \\(Var[\\bar{X}] = \\frac{\\sigma^2}{n}\\). Let us first derive another representation of Chebyshev inequality. \\[\\begin{align} P(|X - \\mu| \\geq a) = P(|X - \\mu|^2 \\geq a^2) \\leq \\frac{E[|X - \\mu|^2]}{a^2} = \\frac{Var[X]}{a^2}. \\end{align}\\] Let us use that on our sampling distribution: \\[\\begin{align} P(|\\bar{X} - \\mu| \\geq \\frac{\\sigma}{10}) \\leq \\frac{100 Var[\\bar{X}]}{\\sigma^2} = \\frac{100 Var[X]}{n \\sigma^2} = \\frac{100}{n}. \\end{align}\\] We are interested in the difference being smaller, therefore \\[\\begin{align} P(|\\bar{X} - \\mu| < \\frac{\\sigma}{10}) = 1 - P(|\\bar{X} - \\mu| \\geq \\frac{\\sigma}{10}) \\geq 1 - \\frac{100}{n} \\geq 0.95. \\end{align}\\] It follows that we need a sample size of \\(n \\geq \\frac{100}{0.05} = 2000\\). 
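Because the Chebyshev inequality is distribution-free, the sample size \\(n \\geq 2000\\) obtained in Exercise 10.3 is typically very conservative. A quick simulation makes this concrete; the exponential distribution below is an arbitrary choice for illustration (any distribution with finite variance would do).

```r
set.seed(1)
n   <- 2000  # sample size suggested by the Chebyshev bound
nit <- 1000  # number of simulated samples
mu  <- 1     # mean of Exp(1)
sig <- 1     # standard deviation of Exp(1)
within <- replicate(nit, abs(mean(rexp(n, rate = 1)) - mu) < sig / 10)
mean(within)  # guaranteed to be at least 0.95; in practice it is essentially 1
```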
"],["crv.html", "Chapter 11 Convergence of random variables", " Chapter 11 Convergence of random variables This chapter deals with convergence of random variables. The students are expected to acquire the following knowledge: Theoretical Finding convergences of random variables. .fold-btn { float: right; margin: 5px 5px 0 0; } .fold { border: 1px solid black; min-height: 40px; } Exercise 11.1 Let \\(X_1\\), \\(X_2\\),…, \\(X_n\\) be a sequence of Bernoulli random variables. Let \\(Y_n = \\frac{X_1 + X_2 + ... + X_n}{n^2}\\). Show that this sequence converges point-wise to the zero random variable. R: Use a simulation to check your answer. Solution. Let \\(\\epsilon\\) be arbitrary. We need to find such \\(n_0\\), that for every \\(n\\) greater than \\(n_0\\) \\(|Y_n| < \\epsilon\\) holds. \\[\\begin{align} |Y_n| &= |\\frac{X_1 + X_2 + ... + X_n}{n^2}| \\\\ &\\leq |\\frac{n}{n^2}| \\\\ &= \\frac{1}{n}. \\end{align}\\] So we need to find such \\(n_0\\), that for every \\(n > n_0\\) we will have \\(\\frac{1}{n} < \\epsilon\\). So \\(n_0 > \\frac{1}{\\epsilon}\\). x <- 1:1000 X <- matrix(data = NA, nrow = length(x), ncol = 100) y <- vector(mode = "numeric", length = length(x)) for (i in 1:length(x)) { X[i, ] <- rbinom(100, size = 1, prob = 0.5) } X <- apply(X, 2, cumsum) tmp_mat <- matrix(data = (1:1000)^2, nrow = 1000, ncol = 100) X <- X / tmp_mat y <- apply(X, 1, mean) ggplot(data.frame(x = x, y = y), aes(x = x, y = y)) + geom_line() Exercise 11.2 Let \\(\\Omega = [0,1]\\) and let \\(X_n\\) be a sequence of random variables, defined as \\[\\begin{align} X_n(\\omega) = \\begin{cases} \\omega^3, &\\omega = \\frac{i}{n}, &0 \\leq i \\leq 1 \\\\ 1, & \\text{otherwise.} \\end{cases} \\end{align}\\] Show that \\(X_n\\) converges almost surely to \\(X \\sim \\text{Uniform}(0,1)\\). Solution. We need to show \\(P(\\{\\omega: X_n(\\omega) \\rightarrow X(\\omega)\\}) = 1\\). Let \\(\\omega \\neq \\frac{i}{n}\\). Then for any \\(\\omega\\), \\(X_n\\) converges pointwise to \\(X\\): \\[\\begin{align} X_n(\\omega) = 1 \\implies |X_n(\\omega) - X(s)| = |1 - 1| < \\epsilon. \\end{align}\\] The above is independent of \\(n\\). Since there are countably infinite number of elements in the complement (\\(\\frac{i}{n}\\)), the probability of this set is 1. Exercise 11.3 Borrowed from Wasserman. Let \\(X_n \\sim \\text{N}(0, \\frac{1}{n})\\) and let \\(X\\) be a random variable with CDF \\[\\begin{align} F_X(x) = \\begin{cases} 0, &x < 0 \\\\ 1, &x \\geq 0. \\end{cases} \\end{align}\\] Does \\(X_n\\) converge to \\(X\\) in distribution? How about in probability? Prove or disprove these statement. R: Plot the CDF of \\(X_n\\) for \\(n = 1, 2, 5, 10, 100, 1000\\). Solution. Let us first check convergence in distribution. \\[\\begin{align} \\lim_{n \\rightarrow \\infty} F_{X_n}(x) &= \\lim_{n \\rightarrow \\infty} \\phi (\\sqrt(n) x). \\end{align}\\] We have two cases, for \\(x < 0\\) and \\(x > 0\\). We do not need to check for \\(x = 0\\), since \\(F_X\\) is not continuous in that point. \\[\\begin{align} \\lim_{n \\rightarrow \\infty} \\phi (\\sqrt(n) x) = \\begin{cases} 0, & x < 0 \\\\ 1, & x > 0. \\end{cases} \\end{align}\\] This is the same as \\(F_X\\). Let us now check convergence in probability. 
Since \\(X\\) is a point-mass distribution at zero, we have \\[\\begin{align} \\lim_{n \\rightarrow \\infty} P(|X_n| > \\epsilon) &= \\lim_{n \\rightarrow \\infty} (P(X_n > \\epsilon) + P(X_n < -\\epsilon)) \\\\ &= \\lim_{n \\rightarrow \\infty} (1 - P(X_n < \\epsilon) + P(X_n < -\\epsilon)) \\\\ &= \\lim_{n \\rightarrow \\infty} (1 - \\phi(\\sqrt{n} \\epsilon) + \\phi(- \\sqrt{n} \\epsilon)) \\\\ &= 0. \\end{align}\\] n <- c(1,2,5,10,100,1000) ggplot(data = data.frame(x = seq(-5, 5, by = 0.01)), aes(x = x)) + stat_function(fun = pnorm, args = list(mean = 0, sd = 1/1), aes(color = "sd = 1/1")) + stat_function(fun = pnorm, args = list(mean = 0, sd = 1/2), aes(color = "sd = 1/2")) + stat_function(fun = pnorm, args = list(mean = 0, sd = 1/5), aes(color = "sd = 1/5")) + stat_function(fun = pnorm, args = list(mean = 0, sd = 1/10), aes(color = "sd = 1/10")) + stat_function(fun = pnorm, args = list(mean = 0, sd = 1/100), aes(color = "sd = 1/100")) + stat_function(fun = pnorm, args = list(mean = 0, sd = 1/1000), aes(color = "sd = 1/1000")) + stat_function(fun = pnorm, args = list(mean = 0, sd = 1/10000), aes(color = "sd = 1/10000")) Exercise 11.4 Let \\(X_i\\) be i.i.d. and \\(\\mu = E(X_1)\\). Let variance of \\(X_1\\) be finite. Show that the mean of \\(X_i\\), \\(\\bar{X}_n = \\frac{1}{n}\\sum_{i=1}^n X_i\\) converges in quadratic mean to \\(\\mu\\). Solution. \\[\\begin{align} \\lim_{n \\rightarrow \\infty} E(|\\bar{X_n} - \\mu|^2) &= \\lim_{n \\rightarrow \\infty} E(\\bar{X_n}^2 - 2 \\bar{X_n} \\mu + \\mu^2) \\\\ &= \\lim_{n \\rightarrow \\infty} (E(\\bar{X_n}^2) - 2 \\mu E(\\frac{\\sum_{i=1}^n X_i}{n}) + \\mu^2) \\\\ &= \\lim_{n \\rightarrow \\infty} E(\\bar{X_n})^2 + \\lim_{n \\rightarrow \\infty} Var(\\bar{X_n}) - 2 \\mu^2 + \\mu^2 \\\\ &= \\lim_{n \\rightarrow \\infty} \\frac{n^2 \\mu^2}{n^2} + \\lim_{n \\rightarrow \\infty} \\frac{\\sigma^2}{n} - \\mu^2 \\\\ &= \\mu^2 - \\mu^2 + \\lim_{n \\rightarrow \\infty} \\frac{\\sigma^2}{n} \\\\ &= 0. \\end{align}\\] "],["lt.html", "Chapter 12 Limit theorems", " Chapter 12 Limit theorems This chapter deals with limit theorems. The students are expected to acquire the following knowledge: Theoretical Monte Carlo integration convergence. Difference between weak and strong law of large numbers. .fold-btn { float: right; margin: 5px 5px 0 0; } .fold { border: 1px solid black; min-height: 40px; } Exercise 12.1 Show that Monte Carlo integration converges almost surely to the true integral of a bounded function. Solution. Let \\(g\\) be a function defined on \\(\\Omega\\). Let \\(X_i\\), \\(i = 1,...,n\\) be i.i.d. (multivariate) uniform random variables with bounds defined on \\(\\Omega\\). Let \\(Y_i\\) = \\(g(X_i)\\). Then it follows that \\(Y_i\\) are also i.i.d. random variables and their expected value is \\(E[g(X)] = \\int_{\\Omega} g(x) f_X(x) dx = \\frac{1}{V_{\\Omega}} \\int_{\\Omega} g(x) dx\\). By the strong law of large numbers, we have \\[\\begin{equation} \\frac{1}{n}\\sum_{i=1}^n Y_i \\xrightarrow{\\text{a.s.}} E[g(X)]. \\end{equation}\\] It follows that \\[\\begin{equation} V_{\\Omega} \\frac{1}{n}\\sum_{i=1}^n Y_i \\xrightarrow{\\text{a.s.}} \\int_{\\Omega} g(x) dx. \\end{equation}\\] Exercise 12.2 Let \\(X\\) be a geometric random variable with probability 0.5 and support in positive integers. Let \\(Y = 2^X (-1)^X X^{-1}\\). Find the expected value of \\(Y\\) by using conditional convergence (this variable does not have an expected value in the conventional sense – the series is not absolutely convergent). 
R: Draw \\(10000\\) samples from a geometric distribution with probability 0.5 and support in positive integers to get \\(X\\). Then calculate \\(Y\\) and plot the means at each iteration (sample). Additionally, plot the expected value calculated in a. Try it with different seeds. What do you notice? Solution. \\[\\begin{align*} E[Y] &= \\sum_{x=1}^{\\infty} \\frac{2^x (-1)^x}{x} 0.5^x \\\\ &= \\sum_{x=1}^{\\infty} \\frac{(-1)^x}{x} \\\\ &= - \\sum_{x=1}^{\\infty} \\frac{(-1)^{x+1}}{x} \\\\ &= - \\ln(2) \\end{align*}\\] set.seed(3) x <- rgeom(100000, prob = 0.5) + 1 y <- 2^x * (-1)^x * x^{-1} y_means <- cumsum(y) / seq_along(y) df <- data.frame(x = 1:length(y_means), y = y_means) ggplot(data = df, aes(x = x, y = y)) + geom_line() + geom_hline(yintercept = -log(2)) "],["eb.html", "Chapter 13 Estimation basics 13.1 ECDF 13.2 Properties of estimators", " Chapter 13 Estimation basics This chapter deals with estimation basics. The students are expected to acquire the following knowledge: Biased and unbiased estimators. Consistent estimators. Empirical cumulative distribution function. .fold-btn { float: right; margin: 5px 5px 0 0; } .fold { border: 1px solid black; min-height: 40px; } 13.1 ECDF Exercise 13.1 (ECDF intuition) Take any univariate continuous distribution that is readily available in R and plot its CDF (\\(F\\)). Draw one sample (\\(n = 1\\)) from the chosen distribution and draw the ECDF (\\(F_n\\)) of that one sample. Use the definition of the ECDF, not an existing function in R. Implementation hint: ECDFs are always piecewise constant - they only jump at the sampled values and by \\(1/n\\). Repeat (b) for \\(n = 5, 10, 100, 1000...\\) Theory says that \\(F_n\\) should converge to \\(F\\). Can you observe that? For \\(n = 100\\) repeat the process \\(m = 20\\) times and plot every \\(F_n^{(m)}\\). Theory says that \\(F_n\\) will converge to \\(F\\) the slowest where \\(F\\) is close to 0.5 (where the variance is largest). Can you observe that? library(ggplot2) set.seed(1) ggplot(data = data.frame(x = seq(-5, 5, by = 0.01))) + # stat_function(aes(x = x), fun = pbeta, args = list(shape1 = 1, shape2 = 2)) stat_function(aes(x = x), fun = pnorm, args = list(mean = 0, sd = 1)) one_samp <- rnorm(1) X <- data.frame(x = c(-5, one_samp, 5), y = c(0,1,1)) ggplot(data = data.frame(x = seq(-5, 5, by = 0.01))) + # stat_function(aes(x = x), fun = pbeta, args = list(shape1 = 1, shape2 = 2)) stat_function(aes(x = x), fun = pnorm, args = list(mean = 0, sd = 1)) + geom_step(data = X, aes(x = x, y = y)) N <- c(5, 10, 100, 1000) X <- NULL for (n in N) { tmp <- rnorm(n) tmp_X <- data.frame(x = c(-5, sort(tmp), 5), y = c(0, seq(1/n, 1, by = 1/n), 1), n = n) X <- rbind(X, tmp_X) } ggplot(data = data.frame(x = seq(-5, 5, by = 0.01))) + # stat_function(aes(x = x), fun = pbeta, args = list(shape1 = 1, shape2 = 2)) stat_function(aes(x = x), fun = pnorm, args = list(mean = 0, sd = 1)) + geom_step(data = X, aes(x = x, y = y, color = as.factor(n))) + labs(color = "N") 13.2 Properties of estimators Exercise 13.2 Show that the sample average is, as an estimator of the mean: unbiased, consistent, asymptotically normal. Solution. \\[\\begin{align*} E[\\frac{1}{n} \\sum_{i=1}^n X_i] &= \\frac{1}{n} \\sum_{i=i}^n E[X_i] \\\\ &= E[X]. 
\\end{align*}\\] \\[\\begin{align*} \\lim_{n \\rightarrow \\infty} P(|\\frac{1}{n} \\sum_{i=1}^n X_i - E[X]| > \\epsilon) &= \\lim_{n \\rightarrow \\infty} P((\\frac{1}{n} \\sum_{i=1}^n X_i - E[X])^2 > \\epsilon^2) \\\\ & \\leq \\lim_{n \\rightarrow \\infty} \\frac{E[(\\frac{1}{n} \\sum_{i=1}^n X_i - E[X])^2]}{\\epsilon^2} & \\text{Markov inequality} \\\\ & = \\lim_{n \\rightarrow \\infty} \\frac{E[(\\frac{1}{n} \\sum_{i=1}^n X_i)^2 - 2 \\frac{1}{n} \\sum_{i=1}^n X_i E[X] + E[X]^2]}{\\epsilon^2} \\\\ & = \\lim_{n \\rightarrow \\infty} \\frac{E[(\\frac{1}{n} \\sum_{i=1}^n X_i)^2] - 2 E[X]^2 + E[X]^2}{\\epsilon^2} \\\\ &= 0 \\end{align*}\\] For the last equality see the solution to ??. Follows directly from the CLT. Exercise 13.3 (Consistent but biased estimator) Show that sample variance (the plug-in estimator of variance) is a biased estimator of variance. Show that sample variance is a consistent estimator of variance. Show that the estimator with (\\(N-1\\)) (Bessel correction) is unbiased. Solution. \\[\\begin{align*} E[\\frac{1}{n} \\sum_{i=1}^n (Y_i - \\bar{Y})^2] &= \\frac{1}{n} \\sum_{i=1}^n E[(Y_i - \\bar{Y})^2] \\\\ &= \\frac{1}{n} \\sum_{i=1}^n E[Y_i^2] - 2 E[Y_i \\bar{Y}] + \\bar{Y}^2)] \\\\ &= \\frac{1}{n} \\sum_{i=1}^n E[Y_i^2 - 2 Y_i \\bar{Y} + \\bar{Y}^2] \\\\ &= \\frac{1}{n} \\sum_{i=1}^n E[Y_i^2 - \\frac{2}{n} Y_i^2 - \\frac{2}{n} \\sum_{i \\neq j} Y_i Y_j + \\frac{1}{n^2}\\sum_j \\sum_{k \\neq j} Y_j Y_k + \\frac{1}{n^2} \\sum_j Y_j^2] \\\\ &= \\frac{1}{n} \\sum_{i=1}^n \\frac{n - 2}{n} (\\sigma^2 + \\mu^2) - \\frac{2}{n} (n - 1) \\mu^2 + \\frac{1}{n^2}n(n-1)\\mu^2 + \\frac{1}{n^2}n(\\sigma^2 + \\mu^2) \\\\ &= \\frac{n-1}{n}\\sigma^2 \\\\ < \\sigma^2. \\end{align*}\\] Let \\(S_n\\) denote the sample variance. Then we can write it as \\[\\begin{align*} S_n &= \\frac{1}{n} \\sum_{i=1}^n (X_i - \\bar{X})^2 = \\frac{1}{n} \\sum_{i=1}^n (X_i - \\mu)^2 + 2(X_i - \\mu)(\\mu - \\bar{X}) + (\\mu - \\bar{X})^2. \\end{align*}\\] Now \\(\\bar{X}\\) converges in probability (by WLLN) to \\(\\mu\\) therefore the right terms converge in probability to zero. The left term converges in probability to \\(\\sigma^2\\), also by WLLN. Therefore the sample variance is a consistent estimatior of the variance. The denominator changes in the second-to-last line of a., therefore the last line is now equality. Exercise 13.4 (Estimating the median) Show that the sample median is an unbiased estimator of the median for N\\((\\mu, \\sigma^2)\\). Show that the sample median is an unbiased estimator of the mean for any distribution with symmetric density. Hint 1: The pdf of an order statistic is \\(f_{X_{(k)}}(x) = \\frac{n!}{(n - k)!(k - 1)!}f_X(x)\\Big(F_X(x)^{k-1} (1 - F_X(x)^{n - k}) \\Big)\\). Hint 2: A distribution is symmetric when \\(X\\) and \\(2a - X\\) have the same distribution for some \\(a\\). Solution. Let \\(Z_i\\), \\(i = 1,...,n\\) be i.i.d. variables with a symmetric distribution and let \\(Z_{k:n}\\) denote the \\(k\\)-th order statistic. We will distinguish two cases, when \\(n\\) is odd and when \\(n\\) is even. Let first \\(n = 2m + 1\\) be odd. Then the sample median is \\(M = Z_{m+1:2m+1}\\). Its PDF is \\[\\begin{align*} f_M(x) = (m+1)\\binom{2m + 1}{m}f_Z(x)\\Big(F_Z(x)^m (1 - F_Z(x)^m) \\Big). \\end{align*}\\] For every symmetric distribution, it holds that \\(F_X(x) = 1 - F(2a - x)\\). Let \\(a = \\mu\\), the population mean. Plugging this into the PDF, we get that \\(f_M(x) = f_M(2\\mu -x)\\). 
It follows that \\[\\begin{align*} E[M] &= E[2\\mu - M] \\\\ 2E[M] &= 2\\mu \\\\ E[M] &= \\mu. \\end{align*}\\] Now let \\(n = 2m\\) be even. Then the sample median is \\(M = \\frac{Z_{m:2m} + Z_{m+1:2m}}{2}\\). It can be shown, that the joint PDF of these terms is also symmetric. Therefore, similar to the above \\[\\begin{align*} E[M] &= E[\\frac{Z_{m:2m} + Z_{m+1:2m}}{2}] \\\\ &= E[\\frac{2\\mu - M + 2\\mu - M}{2}] \\\\ &= E[2\\mu - M]. \\end{align*}\\] The above also proves point a. as the median and the mean are the same in normal distribution. Exercise 13.5 (Matrix trace estimation) The Hutchinson trace estimator [1] is an estimator of the trace of a symmetric positive semidefinite matrix A that relies on Monte Carlo sampling. The estimator is defined as \\[\\begin{align*} \\textrm{tr}(A) \\approx \\frac{1}{n} \\Sigma_{i=1}^n z_i^T A z_i, &\\\\ z_i \\sim_{\\mathrm{IID}} \\textrm{Uniform}(\\{-1, 1\\}^m), & \\end{align*}\\] where \\(A \\in \\mathbb{R}^{m \\times m}\\) is a symmetric positive semidefinite matrix. Elements of each vector \\(z_i\\) are either \\(-1\\) or \\(1\\) with equal probability. This is also called a Rademacher distribution. Data scientists often want the trace of a Hessian to obtain valuable curvature information for a loss function. Per [2], an example is classifying ten digits based on \\((28,28)\\) grayscale images (i.e. MNIST data) using logistic regression. The number of parameters is \\(m = 28^2 \\cdot 10 = 7840\\) and the size of the Hessian is \\(m^2\\), roughly \\(6 \\cdot 10^6\\). The diagonal average is equal to the average eigenvalue, which may be useful for optimization; in MCMC contexts, this would be useful for preconditioners and step size optimization. Computing Hessians (as a means of getting eigenvalue information) is often intractable, but Hessian-vector products can be computed faster by autodifferentiation (with e.g. Tensorflow, Pytorch, Jax). This is one motivation for the use of a stochastic trace estimator as outlined above. References: A stochastic estimator of the trace of the influence matrix for laplacian smoothing splines (Hutchinson, 1990) A Modern Analysis of Hutchinson’s Trace Estimator (Skorski, 2020) Prove that the Hutchinson trace estimator is an unbiased estimator of the trace. Solution. We first simplify our task: \\[\\begin{align} \\mathbb{E}\\left[\\frac{1}{n} \\Sigma_{i=1}^n z_i^T A z_i \\right] &= \\frac{1}{n} \\Sigma_{i=1}^n \\mathbb{E}\\left[z_i^T A z_i \\right] \\\\ &= \\mathbb{E}\\left[z_i^T A z_i \\right], \\end{align}\\] where the second equality is due to having \\(n\\) IID vectors \\(z_i\\). We now only need to show that \\(\\mathbb{E}\\left[z^T A z \\right] = \\mathrm{tr}(A)\\). We omit the index due to all vectors being IID: \\[\\begin{align} \\mathrm{tr}(A) &= \\mathrm{tr}(AI) \\\\ &= \\mathrm{tr}(A\\mathbb{E}[zz^T]) \\\\ &= \\mathbb{E}[\\mathrm{tr}(Azz^T)] \\\\ &= \\mathbb{E}[\\mathrm{tr}(z^TAz)] \\\\ &= \\mathbb{E}[z^TAz]. \\end{align}\\] This concludes the proof. We clarify some equalities below. The second equality assumes that \\(\\mathbb{E}[zz^T] = I\\). By noting that the mean of the Rademacher distribution is 0, we have \\[\\begin{align} \\mathrm{Cov}[z, z] &= \\mathbb{E}[(z - \\mathbb{E}[z])(z - \\mathbb{E}[z])^T] \\\\ &= \\mathbb{E}[zz^T]. \\end{align}\\] Dimensions of \\(z\\) are independent, so \\(\\mathrm{Cov}[z, z]_{ij} = 0\\) for \\(i \\neq j\\). 
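Both facts used in this argument — that \\(\\mathbb{E}[zz^T] = I\\) for Rademacher vectors and that the resulting trace estimator is unbiased — are easy to check numerically. The sketch below constructs an arbitrary symmetric positive semidefinite matrix purely for illustration; none of these names come from the exercise.

```r
set.seed(1)
m <- 50                       # dimension of A
B <- matrix(rnorm(m * m), m)  # arbitrary matrix
A <- t(B) %*% B               # symmetric PSD by construction

n <- 10000                    # number of Rademacher probe vectors
Z <- matrix(sample(c(-1, 1), m * n, replace = TRUE), nrow = m)

# Empirical check that E[z z^T] is close to the identity matrix
max(abs(Z %*% t(Z) / n - diag(m)))

# Hutchinson estimate (1/n) * sum_i z_i^T A z_i versus the exact trace
c(estimate = mean(colSums(Z * (A %*% Z))), exact = sum(diag(A)))
```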
The diagonal will contain variances, which are equal to \\(1\\) for all dimensions \\(k = 1 \\dots m\\): \\(\\mathrm{Var}[z^{(k)}] = \\mathbb{E}[z^{(k)}z^{(k)}] - \\mathbb{E}[z^{(k)}]^2 = 1 - 0 = 1\\). It follows that the covariance is an identity matrix. Note that this is a general result for vectors with IID dimensions sampled from a distribution with mean 0 and variance 1. We could therefore use something else instead of the Rademacher, e.g. \\(z ~ N(0, I)\\). The third equality uses the fact that the expectation of a trace equals the trace of an expectation. If \\(X\\) is a random matrix, then \\(\\mathbb{E}[X]_{ij} = \\mathbb{E}[X_{ij}]\\). Therefore: \\[\\begin{align} \\mathrm{tr}(\\mathbb{E}[X]) &= \\Sigma_{i=1}^m(\\mathbb{E}[X]_{ii}) \\\\ &= \\Sigma_{i=1}^m(\\mathbb{E}[X_{ii}]) \\\\ &= \\mathbb{E}[\\Sigma_{i=1}^m(X_{ii})] \\\\ &= \\mathbb{E}[\\mathrm{tr}(X)], \\end{align}\\] where we used the linearity of the expectation in the third step. The fourth equality uses the fact that \\(\\mathrm{tr}(AB) = \\mathrm{tr}(BA)\\) for any matrices \\(A \\in \\mathbb{R}^{n \\times m}, B \\in \\mathbb{R}^{m \\times n}\\). The last inequality uses the fact that the trace of a \\(1 \\times 1\\) matrix is just its element. "],["boot.html", "Chapter 14 Bootstrap", " Chapter 14 Bootstrap This chapter deals with bootstrap. The students are expected to acquire the following knowledge: How to use bootstrap to generate coverage intervals. .fold-btn { float: right; margin: 5px 5px 0 0; } .fold { border: 1px solid black; min-height: 40px; } Exercise 14.1 Ideally, a \\(1-\\alpha\\) CI would have \\(1-\\alpha\\) coverage. That is, say a 95% CI should, in the long run, contain the true value of the parameter 95% of the time. In practice, it is impossible to assess the coverage of our CI method, because we rarely know the true parameter. In simulation, however, we can. Let’s assess the coverage of bootstrap percentile intervals. Pick a univariate distribution with readily available mean and one that you can easily sample from. Draw \\(n = 30\\) random samples from the chosen distribution and use the bootstrap (with large enough m) and percentile CI method to construct 95% CI. Repeat the process many times and count how many times the CI contains the true mean. That is, compute the actual coverage probability (don’t forget to include the standard error of the coverage probability!). What can you observe? Try one or two different distributions. What can you observe? Repeat (b) and (c) using BCa intervals (R package boot). How does the coverage compare to percentile intervals? As (d) but using intervals based on asymptotic normality (+/- 1.96 SE). How do results from (b), (d), and (e) change if we increase the sample size to n = 200? What about n = 5? library(boot) set.seed(0) nit <- 1000 # Repeat the process "many times" alpha <- 0.05 # CI parameter nboot <- 100 # m parameter for bootstrap ("large enough m") # f: change this to 200 or 5. nsample <- 30 # n = 30 random samples from the chosen distribution. Comment out BCa code if it breaks. 
covers <- matrix(nrow = nit, ncol = 3) covers_BCa <- matrix(nrow = nit, ncol = 3) covers_asymp_norm <- matrix(nrow = nit, ncol = 3) isin <- function (x, lower, upper) { (x > lower) & (x < upper) } for (j in 1:nit) { # Repeating many times # a: pick a univariate distribution - standard normal x1 <- rnorm(nsample) # c: one or two different distributions - beta and poisson x2 <- rbeta(nsample, 1, 2) x3 <- rpois(nsample, 5) X1 <- matrix(data = NA, nrow = nsample, ncol = nboot) X2 <- matrix(data = NA, nrow = nsample, ncol = nboot) X3 <- matrix(data = NA, nrow = nsample, ncol = nboot) for (i in 1:nboot) { X1[ ,i] <- sample(x1, nsample, replace = T) X2[ ,i] <- sample(x2, nsample, T) X3[ ,i] <- sample(x3, nsample, T) } X1_func <- apply(X1, 2, mean) X2_func <- apply(X2, 2, mean) X3_func <- apply(X3, 2, mean) X1_quant <- quantile(X1_func, probs = c(alpha / 2, 1 - alpha / 2)) X2_quant <- quantile(X2_func, probs = c(alpha / 2, 1 - alpha / 2)) X3_quant <- quantile(X3_func, probs = c(alpha / 2, 1 - alpha / 2)) covers[j,1] <- (0 > X1_quant[1]) & (0 < X1_quant[2]) covers[j,2] <- ((1 / 3) > X2_quant[1]) & ((1 / 3) < X2_quant[2]) covers[j,3] <- (5 > X3_quant[1]) & (5 < X3_quant[2]) mf <- function (x, i) return(mean(x[i])) bootX1 <- boot(x1, statistic = mf, R = nboot) bootX2 <- boot(x2, statistic = mf, R = nboot) bootX3 <- boot(x3, statistic = mf, R = nboot) X1_quant_BCa <- boot.ci(bootX1, type = "bca")$bca X2_quant_BCa <- boot.ci(bootX2, type = "bca")$bca X3_quant_BCa <- boot.ci(bootX3, type = "bca")$bca covers_BCa[j,1] <- (0 > X1_quant_BCa[4]) & (0 < X1_quant_BCa[5]) covers_BCa[j,2] <- ((1 / 3) > X2_quant_BCa[4]) & ((1 / 3) < X2_quant_BCa[5]) covers_BCa[j,3] <- (5 > X3_quant_BCa[4]) & (5 < X3_quant_BCa[5]) # e: estimate mean and standard error # sample mean: x1_bar <- mean(x1) x2_bar <- mean(x2) x3_bar <- mean(x3) # standard error (of the sample mean) estimate: sample standard deviation / sqrt(n) x1_bar_SE <- sd(x1) / sqrt(nsample) x2_bar_SE <- sd(x2) / sqrt(nsample) x3_bar_SE <- sd(x3) / sqrt(nsample) covers_asymp_norm[j,1] <- isin(0, x1_bar - 1.96 * x1_bar_SE, x1_bar + 1.96 * x1_bar_SE) covers_asymp_norm[j,2] <- isin(1/3, x2_bar - 1.96 * x2_bar_SE, x2_bar + 1.96 * x2_bar_SE) covers_asymp_norm[j,3] <- isin(5, x3_bar - 1.96 * x3_bar_SE, x3_bar + 1.96 * x3_bar_SE) } apply(covers, 2, mean) ## [1] 0.918 0.925 0.905 apply(covers, 2, sd) / sqrt(nit) ## [1] 0.008680516 0.008333333 0.009276910 apply(covers_BCa, 2, mean) ## [1] 0.927 0.944 0.927 apply(covers_BCa, 2, sd) / sqrt(nit) ## [1] 0.008230355 0.007274401 0.008230355 apply(covers_asymp_norm, 2, mean) ## [1] 0.939 0.937 0.930 apply(covers_asymp_norm, 2, sd) / sqrt(nit) ## [1] 0.007572076 0.007687008 0.008072494 Exercise 14.2 You are given a sample of independent observations from a process of interest: Index 1 2 3 4 5 6 7 8 X 7 2 4 6 4 5 9 10 Compute the plug-in estimate of mean and 95% symmetric CI based on asymptotic normality. Use the plug-in estimate of SE. Same as (a), but use the unbiased estimate of SE. Apply nonparametric bootstrap with 1000 bootstrap replications and estimate the 95% CI for the mean with percentile-based CI. 
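Part (c) of Exercise 14.2 can also be cross-checked with the boot package already used in Exercise 14.1; this is just an alternative sketch of the same percentile interval, so the numbers will differ slightly from the manual loop in the solution below because the resamples are random.

```r
library(boot)
set.seed(0)
x <- c(7, 2, 4, 6, 4, 5, 9, 10)
b <- boot(x, statistic = function(d, i) mean(d[i]), R = 1000)
boot.ci(b, type = "perc")$percent[4:5]  # lower and upper 95% percentile limits
```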
# a x <- c(7, 2, 4, 6, 4, 5, 9, 10) n <- length(x) mu <- mean(x) SE <- sqrt(mean((x - mu)^2)) / sqrt(n) SE ## [1] 0.8915839 z <- qnorm(1 - 0.05 / 2) c(mu - z * SE, mu + z * SE) ## [1] 4.127528 7.622472 # b SE <- sd(x) / sqrt(n) SE ## [1] 0.9531433 c(mu - z * SE, mu + z * SE) ## [1] 4.006873 7.743127 # c set.seed(0) m <- 1000 T_mean <- function(x) {mean(x)} est_boot <- array(NA, m) for (i in 1:m) { x_boot <- x[sample(1:n, n, rep = T)] est_boot[i] <- T_mean(x_boot) } quantile(est_boot, p = c(0.025, 0.975)) ## 2.5% 97.5% ## 4.250 7.625 Exercise 14.3 We are given a sample of 10 independent paired (bivariate) observations: Index 1 2 3 4 5 6 7 8 9 10 X 1.26 -0.33 1.33 1.27 0.41 -1.54 -0.93 -0.29 -0.01 2.40 Y 2.64 0.33 0.48 0.06 -0.88 -2.14 -2.21 0.95 0.83 1.45 Compute Pearson correlation between X and Y. Use the cor.test() from R to estimate a 95% CI for the estimate from (a). Apply nonparametric bootstrap with 1000 bootstrap replications and estimate the 95% CI for the Pearson correlation with percentile-based CI. Compare CI from (b) and (c). Are they similar? How would the bootstrap estimation of CI change if we were interested in Spearman or Kendall correlation instead? x <- c(1.26, -0.33, 1.33, 1.27, 0.41, -1.54, -0.93, -0.29, -0.01, 2.40) y <- c(2.64, 0.33, 0.48, 0.06, -0.88, -2.14, -2.21, 0.95, 0.83, 1.45) # a cor(x, y) ## [1] 0.6991247 # b res <- cor.test(x, y) res$conf.int[1:2] ## [1] 0.1241458 0.9226238 # c set.seed(0) m <- 1000 n <- length(x) T_cor <- function(x, y) {cor(x, y)} est_boot <- array(NA, m) for (i in 1:m) { idx <- sample(1:n, n, rep = T) # !!! important to use same indices to keep dependency between x and y est_boot[i] <- T_cor(x[idx], y[idx]) } quantile(est_boot, p = c(0.025, 0.975)) ## 2.5% 97.5% ## 0.2565537 0.9057664 # d # Yes, but the bootstrap CI is more narrow. # e # We just use the functions for Kendall/Spearman coefficients instead: T_kendall <- function(x, y) {cor(x, y, method = "kendall")} T_spearman <- function(x, y) {cor(x, y, method = "spearman")} # Put this in a function that returns the CI bootstrap_95_ci <- function(x, y, t, m = 1000) { n <- length(x) est_boot <- array(NA, m) for (i in 1:m) { idx <- sample(1:n, n, rep = T) # !!! important to use same indices to keep dependency between x and y est_boot[i] <- t(x[idx], y[idx]) } quantile(est_boot, p = c(0.025, 0.975)) } bootstrap_95_ci(x, y, T_kendall) ## 2.5% 97.5% ## -0.08108108 0.78378378 bootstrap_95_ci(x, y, T_spearman) ## 2.5% 97.5% ## -0.1701115 0.8867925 Exercise 14.4 In this problem we will illustrate the use of the nonparametric bootstrap for estimating CIs of regression model coefficients. Load the longley dataset from base R with data(longley). Use lm() to apply linear regression using “Employed” as the target (dependent) variable and all other variables as the predictors (independent). Using lm() results, print the estimated regression coefficients and standard errors. Estimate 95% CI for the coefficients using +/- 1.96 * SE. Use nonparametric bootstrap with 100 replications to estimate the SE of the coefficients from (b). Compare the SE from (c) with those from (b). # a data(longley) # b res <- lm(Employed ~ . 
, longley) tmp <- data.frame(summary(res)$coefficients[,1:2]) tmp$LB <- tmp[,1] - 1.96 * tmp[,2] tmp$UB <- tmp[,1] + 1.96 * tmp[,2] tmp ## Estimate Std..Error LB UB ## (Intercept) -3.482259e+03 8.904204e+02 -5.227483e+03 -1.737035e+03 ## GNP.deflator 1.506187e-02 8.491493e-02 -1.513714e-01 1.814951e-01 ## GNP -3.581918e-02 3.349101e-02 -1.014616e-01 2.982320e-02 ## Unemployed -2.020230e-02 4.883997e-03 -2.977493e-02 -1.062966e-02 ## Armed.Forces -1.033227e-02 2.142742e-03 -1.453204e-02 -6.132495e-03 ## Population -5.110411e-02 2.260732e-01 -4.942076e-01 3.919994e-01 ## Year 1.829151e+00 4.554785e-01 9.364136e-01 2.721889e+00 # c set.seed(0) m <- 100 n <- nrow(longley) T_coef <- function(x) { lm(Employed ~ . , x)$coefficients } est_boot <- array(NA, c(m, ncol(longley))) for (i in 1:m) { idx <- sample(1:n, n, rep = T) est_boot[i,] <- T_coef(longley[idx,]) } SE <- apply(est_boot, 2, sd) SE ## [1] 1.826011e+03 1.605981e-01 5.693746e-02 8.204892e-03 3.802225e-03 ## [6] 3.907527e-01 9.414436e-01 # Show the standard errors around coefficients library(ggplot2) library(reshape2) df <- data.frame(index = 1:7, bootstrap_SE = SE, lm_SE = tmp$Std..Error) melted_df <- melt(df[2:nrow(df), ], id.vars = "index") # Ignore bias which has a really large magnitude ggplot(melted_df, aes(x = index, y = value, fill = variable)) + geom_bar(stat="identity", position="dodge") + xlab("Coefficient") + ylab("Standard error") # + scale_y_continuous(trans = "log") # If you want to also plot bias Exercise 14.5 This exercise shows a shortcoming of the bootstrap method when using the plug in estimator for the maximum. Compute the 95% bootstrap CI for the maximum of a standard normal distribution. Compute the 95% bootstrap CI for the maximum of a binomial distribution with n = 15 and p = 0.2. Repeat (b) using p = 0.9. Why is the result different? # bootstrap CI for maximum alpha <- 0.05 T_max <- function(x) {max(x)} # Equal to T_max = max bootstrap <- function(x, t, m = 1000) { n <- length(x) values <- rep(0, m) for (i in 1:m) { values[i] <- t(sample(x, n, replace = T)) } quantile(values, probs = c(alpha / 2, 1 - alpha / 2)) } # a # Meaningless, as the normal distribution can yield arbitrarily large values. x <- rnorm(100) bootstrap(x, T_max) ## 2.5% 97.5% ## 1.819425 2.961743 # b x <- rbinom(100, size = 15, prob = 0.2) # min = 0, max = 15 bootstrap(x, T_max) ## 2.5% 97.5% ## 6 7 # c x <- rbinom(100, size = 15, prob = 0.9) # min = 0, max = 15 bootstrap(x, T_max) ## 2.5% 97.5% ## 15 15 # Observation: to estimate the maximum, we need sufficient probability mass near the maximum value the distribution can yield. # Using bootstrap is pointless when there is too little mass near the true maximum. # In general, bootstrap will fail when estimating the CI for the maximum. "],["ml.html", "Chapter 15 Maximum likelihood 15.1 Deriving MLE 15.2 Fisher information 15.3 The German tank problem", " Chapter 15 Maximum likelihood This chapter deals with maximum likelihood estimation. The students are expected to acquire the following knowledge: How to derive MLE. Applying MLE in R. Calculating and interpreting Fisher information. Practical use of MLE. .fold-btn { float: right; margin: 5px 5px 0 0; } .fold { border: 1px solid black; min-height: 40px; } 15.1 Deriving MLE Exercise 15.1 Derive the maximum likelihood estimator of variance for N\\((\\mu, \\sigma^2)\\). Compare with results from 13.3. What does that say about the MLE estimator? Solution. 
The mean is assumed constant, so we have the likelihood \\[\\begin{align} L(\\sigma^2; y) &= \\prod_{i=1}^n \\frac{1}{\\sqrt{2 \\pi \\sigma^2}} e^{-\\frac{(y_i - \\mu)^2}{2 \\sigma^2}} \\\\ &= \\frac{1}{\\sqrt{2 \\pi \\sigma^2}^n} e^{\\frac{-\\sum_{i=1}^n (y_i - \\mu)^2}{2 \\sigma^2}} \\end{align}\\] We need to find the maximum of this function. We first observe that we can replace \\(\\frac{-\\sum_{i=1}^n (y_i - \\mu)^2}{2}\\) with a constant \\(c\\), since none of the terms are dependent on \\(\\sigma^2\\). Additionally, the term \\(\\frac{1}{\\sqrt{2 \\pi}^n}\\) does not affect the calculation of the maximum. So now we have \\[\\begin{align} L(\\sigma^2; y) &= (\\sigma^2)^{-\\frac{n}{2}} e^{\\frac{c}{\\sigma^2}}. \\end{align}\\] Differentiating we get \\[\\begin{align} \\frac{d}{d \\sigma^2} L(\\sigma^2; y) &= (\\sigma^2)^{-\\frac{n}{2}} \\frac{d}{d \\sigma^2} e^{\\frac{c}{\\sigma^2}} + e^{\\frac{c}{\\sigma^2}} \\frac{d}{d \\sigma^2} (\\sigma^2)^{-\\frac{n}{2}} \\\\ &= - (\\sigma^2)^{-\\frac{n}{2}} e^{\\frac{c}{\\sigma^2}} \\frac{c}{(\\sigma^2)^2} - e^{\\frac{c}{\\sigma^2}} \\frac{n}{2} (\\sigma^2)^{-\\frac{n + 2}{2}} \\\\ &= - (\\sigma^2)^{-\\frac{n + 4}{2}} e^{\\frac{c}{\\sigma^2}} c - e^{\\frac{c}{\\sigma^2}} \\frac{n}{2} (\\sigma^2)^{-\\frac{n + 2}{2}} \\\\ &= - e^{\\frac{c}{\\sigma^2}} (\\sigma^2)^{-\\frac{n + 4}{2}} \\Big(c + \\frac{n}{2}\\sigma^2 \\Big). \\end{align}\\] To get the maximum, this has to equal to 0, so \\[\\begin{align} c + \\frac{n}{2}\\sigma^2 &= 0 \\\\ \\sigma^2 &= -\\frac{2c}{n} \\\\ \\sigma^2 &= \\frac{\\sum_{i=1}^n (Y_i - \\mu)^2}{n}. \\end{align}\\] The MLE estimator is biased. Exercise 15.2 (Multivariate normal distribution) Derive the maximum likelihood estimate for the mean and covariance matrix of the multivariate normal. Simulate \\(n = 40\\) samples from a bivariate normal distribution (choose non-trivial parameters, that is, mean \\(\\neq 0\\) and covariance \\(\\neq 0\\)). Compute the MLE for the sample. Overlay the data with an ellipse that is determined by the MLE and an ellipse that is determined by the chosen true parameters. Repeat b. several times and observe how the estimates (ellipses) vary around the true value. Hint: For the derivation of MLE, these identities will be helpful: \\(\\frac{\\partial b^T a}{\\partial a} = \\frac{\\partial a^T b}{\\partial a} = b\\), \\(\\frac{\\partial a^T A a}{\\partial a} = (A + A^T)a\\), \\(\\frac{\\partial \\text{tr}(BA)}{\\partial A} = B^T\\), \\(\\frac{\\partial \\ln |A|}{\\partial A} = (A^{-1})^T\\), \\(a^T A a = \\text{tr}(a^T A a) = \\text{tr}(a a^T A) = \\text{tr}(Aaa^T)\\). Solution. The log likelihood of the MVN distribution is \\[\\begin{align*} l(\\mu, \\Sigma ; x) &= -\\frac{1}{2}\\Big(\\sum_{i=1}^n k\\ln(2\\pi) + |\\Sigma| + (x_i - \\mu)^T \\Sigma^{-1} (x_i - \\mu)\\Big) \\\\ &= -\\frac{n}{2}\\ln|\\Sigma| + -\\frac{1}{2}\\Big(\\sum_{i=1}^n(x_i - \\mu)^T \\Sigma^{-1} (x_i - \\mu)\\Big) + c, \\end{align*}\\] where \\(c\\) is a constant with respect to \\(\\mu\\) and \\(\\Sigma\\). To find the MLE we first need to find partial derivatives. Let us start with \\(\\mu\\). \\[\\begin{align*} \\frac{\\partial}{\\partial \\mu}l(\\mu, \\Sigma ; x) &= \\frac{\\partial}{\\partial \\mu} -\\frac{1}{2}\\Big(\\sum_{i=1}^n x_i^T \\Sigma^{-1} x_i - x_i^T \\Sigma^{-1} \\mu - \\mu^T \\Sigma^{-1} x_i + \\mu^T \\Sigma^{-1} \\mu \\Big) \\\\ &= -\\frac{1}{2}\\Big(\\sum_{i=1}^n - \\Sigma^{-1} x_i - \\Sigma^{-1} x_i + 2 \\Sigma^{-1} \\mu \\Big) \\\\ &= -\\Sigma^{-1}\\Big(\\sum_{i=1}^n - x_i + \\mu \\Big). 
\\end{align*}\\] Equating above with zero, we get \\[\\begin{align*} \\sum_{i=1}^n - x_i + \\mu &= 0 \\\\ \\hat{\\mu} = \\frac{1}{n} \\sum_{i=1}^n x_i, \\end{align*}\\] which is the dimension-wise empirical mean. Now for the covariance matrix \\[\\begin{align*} \\frac{\\partial}{\\partial \\Sigma^{-1}}l(\\mu, \\Sigma ; x) &= \\frac{\\partial}{\\partial \\Sigma^{-1}} -\\frac{n}{2}\\ln|\\Sigma| + -\\frac{1}{2}\\Big(\\sum_{i=1}^n(x_i - \\mu)^T \\Sigma^{-1} (x_i - \\mu)\\Big) \\\\ &= \\frac{\\partial}{\\partial \\Sigma^{-1}} -\\frac{n}{2}\\ln|\\Sigma| + -\\frac{1}{2}\\Big(\\sum_{i=1}^n \\text{tr}((x_i - \\mu)^T \\Sigma^{-1} (x_i - \\mu))\\Big) \\\\ &= \\frac{\\partial}{\\partial \\Sigma^{-1}} -\\frac{n}{2}\\ln|\\Sigma| + -\\frac{1}{2}\\Big(\\sum_{i=1}^n \\text{tr}((\\Sigma^{-1} (x_i - \\mu) (x_i - \\mu)^T )\\Big) \\\\ &= \\frac{n}{2}\\Sigma + -\\frac{1}{2}\\Big(\\sum_{i=1}^n (x_i - \\mu) (x_i - \\mu)^T \\Big). \\end{align*}\\] Equating above with zero, we get \\[\\begin{align*} \\hat{\\Sigma} = \\frac{1}{n}\\sum_{i=1}^n (x_i - \\mu) (x_i - \\mu)^T. \\end{align*}\\] set.seed(1) n <- 40 mu <- c(1, -2) Sigma <- matrix(data = c(2, -1.6, -1.6, 1.8), ncol = 2) X <- mvrnorm(n = n, mu = mu, Sigma = Sigma) colnames(X) <- c("X1", "X2") X <- as.data.frame(X) # plot.new() tru_ellip <- ellipse(mu, Sigma, draw = FALSE) colnames(tru_ellip) <- c("X1", "X2") tru_ellip <- as.data.frame(tru_ellip) mu_est <- apply(X, 2, mean) tmp <- as.matrix(sweep(X, 2, mu_est)) Sigma_est <- (1 / n) * t(tmp) %*% tmp est_ellip <- ellipse(mu_est, Sigma_est, draw = FALSE) colnames(est_ellip) <- c("X1", "X2") est_ellip <- as.data.frame(est_ellip) ggplot(data = X, aes(x = X1, y = X2)) + geom_point() + geom_path(data = tru_ellip, aes(x = X1, y = X2, color = "truth")) + geom_path(data = est_ellip, aes(x = X1, y = X2, color = "estimated")) + labs(color = "type") Exercise 15.3 (Logistic regression) Logistic regression is a popular discriminative model when our target variable is binary (categorical with 2 values). One of the ways of looking at logistic regression is that it is linear regression but instead of using the linear term as the mean of a normal RV, we use it as the mean of a Bernoulli RV. Of course, the mean of a Bernoulli is bounded on \\([0,1]\\), so, to avoid non-sensical values, we squeeze the linear between 0 and 1 with the inverse logit function inv_logit\\((z) = 1 / (1 + e^{-z})\\). This leads to the following model: \\(y_i | \\beta, x_i \\sim \\text{Bernoulli}(\\text{inv_logit}(\\beta x_i))\\). Explicitly write the likelihood function of beta. Implement the likelihood function in R. Use black-box box-constraint optimization (for example, optim() with L-BFGS) to find the maximum likelihood estimate for beta for \\(x\\) and \\(y\\) defined below. Plot the estimated probability as a function of the independent variable. Compare with the truth. Let \\(y2\\) be a response defined below. Will logistic regression work well on this dataset? Why not? How can we still use the model, without changing it? inv_log <- function (z) { return (1 / (1 + exp(-z))) } set.seed(1) x <- rnorm(100) y <- rbinom(100, size = 1, prob = inv_log(1.2 * x)) y2 <- rbinom(100, size = 1, prob = inv_log(1.2 * x + 1.4 * x^2)) Solution. \\[\\begin{align*} l(\\beta; x, y) &= p(y | x, \\beta) \\\\ &= \\ln(\\prod_{i=1}^n \\text{inv_logit}(\\beta x_i)^{y_i} (1 - \\text{inv_logit}(\\beta x_i))^{1 - y_i}) \\\\ &= \\sum_{i=1}^n y_i \\ln(\\text{inv_logit}(\\beta x_i)) + (1 - y_i) \\ln(1 - \\text{inv_logit}(\\beta x_i)). 
\\end{align*}\\] set.seed(1) inv_log <- function (z) { return (1 / (1 + exp(-z))) } x <- rnorm(100) y <- x y <- rbinom(100, size = 1, prob = inv_log(1.2 * x)) l_logistic <- function (beta, X, y) { logl <- -sum(y * log(inv_log(as.vector(beta %*% X))) + (1 - y) * log((1 - inv_log(as.vector(beta %*% X))))) return(logl) } my_optim <- optim(par = 0.5, fn = l_logistic, method = "L-BFGS-B", lower = 0, upper = 10, X = x, y = y) my_optim$par ## [1] 1.166558 truth_p <- data.frame(x = x, prob = inv_log(1.2 * x), type = "truth") est_p <- data.frame(x = x, prob = inv_log(my_optim$par * x), type = "estimated") plot_df <- rbind(truth_p, est_p) ggplot(data = plot_df, aes(x = x, y = prob, color = type)) + geom_point(alpha = 0.3) y2 <- rbinom(2000, size = 1, prob = inv_log(1.2 * x + 1.4 * x^2)) X2 <- cbind(x, x^2) my_optim2 <- optim(par = c(0, 0), fn = l_logistic, method = "L-BFGS-B", lower = c(0, 0), upper = c(2, 2), X = t(X2), y = y2) my_optim2$par ## [1] 1.153656 1.257649 tmp <- sweep(data.frame(x = x, x2 = x^2), 2, my_optim2$par, FUN = "*") tmp <- tmp[ ,1] + tmp[ ,2] truth_p <- data.frame(x = x, prob = inv_log(1.2 * x + 1.4 * x^2), type = "truth") est_p <- data.frame(x = x, prob = inv_log(tmp), type = "estimated") plot_df <- rbind(truth_p, est_p) ggplot(data = plot_df, aes(x = x, y = prob, color = type)) + geom_point(alpha = 0.3) Exercise 15.4 (Linear regression) For the data generated below, do the following: Compute the least squares (MLE) estimate of coefficients beta using the matrix exact solution. Compute the MLE by minimizing the sum of squared residuals using black-box optimization (optim()). Compute the MLE by using the output built-in linear regression (lm() ). Compare (a-c and the true coefficients). Compute 95% CI on the beta coefficients using the output of built-in linear regression. Compute 95% CI on the beta coefficients by using (a or b) and the bootstrap with percentile method for CI. Compare with d. 
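As a hint for part (a) of Exercise 15.4, the matrix solution follows from the normal equations; the short derivation below is not part of the original text but uses the same matrix-calculus identities as the hint in Exercise 15.2. \\[\\begin{align*} \\frac{\\partial}{\\partial \\beta} \\|y - X\\beta\\|^2 &= \\frac{\\partial}{\\partial \\beta} \\big(y^Ty - 2\\beta^TX^Ty + \\beta^TX^TX\\beta\\big) \\\\ &= -2X^Ty + 2X^TX\\beta. \\end{align*}\\] Setting this to zero gives \\(X^TX\\hat{\\beta} = X^Ty\\), so \\(\\hat{\\beta} = (X^TX)^{-1}X^Ty\\) whenever \\(X^TX\\) is invertible, which is exactly the expression evaluated in the solution below.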
set.seed(1) n <- 100 x1 <- rnorm(n) x2 <- rnorm(n) x3 <- rnorm(n) X <- cbind(x1, x2, x3) beta <- c(0.2, 0.6, -1.2) y <- as.vector(t(beta %*% t(X))) + rnorm(n, sd = 0.2) set.seed(1) n <- 100 x1 <- rnorm(n) x2 <- rnorm(n) x3 <- rnorm(n) X <- cbind(x1, x2, x3) beta <- c(0.2, 0.6, -1.2) y <- as.vector(t(beta %*% t(X))) + rnorm(n, sd = 0.2) LS_fun <- function (beta, X, y) { return(sum((y - beta %*% t(X))^2)) } my_optim <- optim(par = c(0, 0, 0), fn = LS_fun, lower = -5, upper = 5, X = X, y = y, method = "L-BFGS-B") my_optim$par ## [1] 0.1898162 0.5885946 -1.1788264 df <- data.frame(y = y, x1 = x1, x2 = x2, x3 = x3) my_lm <- lm(y ~ x1 + x2 + x3 - 1, data = df) my_lm ## ## Call: ## lm(formula = y ~ x1 + x2 + x3 - 1, data = df) ## ## Coefficients: ## x1 x2 x3 ## 0.1898 0.5886 -1.1788 # matrix solution beta_hat <- solve(t(X) %*% X) %*% t(X) %*% y beta_hat ## [,1] ## x1 0.1898162 ## x2 0.5885946 ## x3 -1.1788264 out <- summary(my_lm) out$coefficients[ ,2] ## x1 x2 x3 ## 0.02209328 0.02087542 0.01934506 # bootstrap CI nboot <- 1000 beta_boot <- matrix(data = NA, ncol = length(beta), nrow = nboot) for (i in 1:nboot) { inds <- sample(1:n, n, replace = T) new_df <- df[inds, ] X_tmp <- as.matrix(new_df[ ,-1]) y_tmp <- new_df[ ,1] # print(nrow(new_df)) tmp_beta <- solve(t(X_tmp) %*% X_tmp) %*% t(X_tmp) %*% y_tmp beta_boot[i, ] <- tmp_beta } apply(beta_boot, 2, mean) ## [1] 0.1893281 0.5887068 -1.1800738 apply(beta_boot, 2, quantile, probs = c(0.025, 0.975)) ## [,1] [,2] [,3] ## 2.5% 0.1389441 0.5436911 -1.221560 ## 97.5% 0.2386295 0.6363102 -1.140416 out$coefficients[ ,2] ## x1 x2 x3 ## 0.02209328 0.02087542 0.01934506 Exercise 15.5 (Principal component analysis) Load the olympic data set from package ade4. The data show decathlon results for 33 men in 1988 Olympic Games. This data set serves as a great example of finding the latent structure in the data, as there are certain characteristics of the athletes that make them excel at different events. For example an explosive athlete will do particulary well in sprints and long jumps. Perform PCA (prcomp) on the data set and interpret the first 2 latent dimensions. Hint: Standardize the data first to get meaningful results. Use MLE to estimate the covariance of the standardized multivariate distribution. Decompose the estimated covariance matrix with the eigendecomposition. Compare the eigenvectors to the output of PCA. 
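One caveat for part (b) of Exercise 15.5: eigenvectors, and therefore PCA loadings, are only identified up to sign, so corresponding columns of the eigen() and prcomp() output may differ by a factor of \\(-1\\) while spanning the same directions. A sign-agnostic comparison is sketched below (it assumes the ade4 package is attached for the data, as in the solution).

```r
library(ade4)
data(olympic)
X_scaled <- scale(olympic$tab)
S <- (1 / nrow(X_scaled)) * t(X_scaled) %*% X_scaled  # MLE estimate of the covariance
# Compare the directions while ignoring the arbitrary signs of the columns
max(abs(abs(eigen(S)$vectors) - abs(unname(prcomp(X_scaled)$rotation))))
```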
data(olympic) X <- olympic$tab X_scaled <- scale(X) my_pca <- prcomp(X_scaled) summary(my_pca) ## Importance of components: ## PC1 PC2 PC3 PC4 PC5 PC6 PC7 ## Standard deviation 1.8488 1.6144 0.97123 0.9370 0.74607 0.70088 0.65620 ## Proportion of Variance 0.3418 0.2606 0.09433 0.0878 0.05566 0.04912 0.04306 ## Cumulative Proportion 0.3418 0.6025 0.69679 0.7846 0.84026 0.88938 0.93244 ## PC8 PC9 PC10 ## Standard deviation 0.55389 0.51667 0.31915 ## Proportion of Variance 0.03068 0.02669 0.01019 ## Cumulative Proportion 0.96312 0.98981 1.00000 autoplot(my_pca, data = X, loadings = TRUE, loadings.colour = 'blue', loadings.label = TRUE, loadings.label.size = 3) Sigma_est <- (1 / nrow(X_scaled)) * t(X_scaled) %*% X_scaled Sigma_dec <- eigen(Sigma_est) Sigma_dec$vectors ## [,1] [,2] [,3] [,4] [,5] [,6] ## [1,] 0.4158823 0.1488081 -0.26747198 -0.08833244 -0.442314456 0.03071237 ## [2,] -0.3940515 -0.1520815 -0.16894945 -0.24424963 0.368913901 -0.09378242 ## [3,] -0.2691057 0.4835374 0.09853273 -0.10776276 -0.009754680 0.23002054 ## [4,] -0.2122818 0.0278985 -0.85498656 0.38794393 -0.001876311 0.07454380 ## [5,] 0.3558474 0.3521598 -0.18949642 0.08057457 0.146965351 -0.32692886 ## [6,] 0.4334816 0.0695682 -0.12616012 -0.38229029 -0.088802794 0.21049130 ## [7,] -0.1757923 0.5033347 0.04609969 0.02558404 0.019358607 0.61491241 ## [8,] -0.3840821 0.1495820 0.13687235 0.14396548 -0.716743474 -0.34776037 ## [9,] -0.1799436 0.3719570 -0.19232803 -0.60046566 0.095582043 -0.43744387 ## [10,] 0.1701426 0.4209653 0.22255233 0.48564231 0.339772188 -0.30032419 ## [,7] [,8] [,9] [,10] ## [1,] 0.2543985 0.663712826 -0.10839531 0.10948045 ## [2,] 0.7505343 0.141264141 0.04613910 0.05580431 ## [3,] -0.1106637 0.072505560 0.42247611 0.65073655 ## [4,] -0.1351242 -0.155435871 -0.10206505 0.11941181 ## [5,] 0.1413388 -0.146839303 0.65076229 -0.33681395 ## [6,] 0.2725296 -0.639003579 -0.20723854 0.25971800 ## [7,] 0.1439726 0.009400445 -0.16724055 -0.53450315 ## [8,] 0.2732665 -0.276873049 -0.01766443 -0.06589572 ## [9,] -0.3419099 0.058519366 -0.30619617 -0.13093187 ## [10,] 0.1868704 0.007310045 -0.45688227 0.24311846 my_pca$rotation ## PC1 PC2 PC3 PC4 PC5 PC6 ## 100 -0.4158823 0.1488081 0.26747198 -0.08833244 -0.442314456 0.03071237 ## long 0.3940515 -0.1520815 0.16894945 -0.24424963 0.368913901 -0.09378242 ## poid 0.2691057 0.4835374 -0.09853273 -0.10776276 -0.009754680 0.23002054 ## haut 0.2122818 0.0278985 0.85498656 0.38794393 -0.001876311 0.07454380 ## 400 -0.3558474 0.3521598 0.18949642 0.08057457 0.146965351 -0.32692886 ## 110 -0.4334816 0.0695682 0.12616012 -0.38229029 -0.088802794 0.21049130 ## disq 0.1757923 0.5033347 -0.04609969 0.02558404 0.019358607 0.61491241 ## perc 0.3840821 0.1495820 -0.13687235 0.14396548 -0.716743474 -0.34776037 ## jave 0.1799436 0.3719570 0.19232803 -0.60046566 0.095582043 -0.43744387 ## 1500 -0.1701426 0.4209653 -0.22255233 0.48564231 0.339772188 -0.30032419 ## PC7 PC8 PC9 PC10 ## 100 0.2543985 -0.663712826 0.10839531 -0.10948045 ## long 0.7505343 -0.141264141 -0.04613910 -0.05580431 ## poid -0.1106637 -0.072505560 -0.42247611 -0.65073655 ## haut -0.1351242 0.155435871 0.10206505 -0.11941181 ## 400 0.1413388 0.146839303 -0.65076229 0.33681395 ## 110 0.2725296 0.639003579 0.20723854 -0.25971800 ## disq 0.1439726 -0.009400445 0.16724055 0.53450315 ## perc 0.2732665 0.276873049 0.01766443 0.06589572 ## jave -0.3419099 -0.058519366 0.30619617 0.13093187 ## 1500 0.1868704 -0.007310045 0.45688227 -0.24311846 15.2 Fisher information Exercise 15.6 Let us assume a Poisson likelihood. 
Derive the MLE estimate of the mean. Derive the Fisher information. For the data below compute the MLE and construct confidence intervals. Use bootstrap to construct the CI for the mean. Compare with c) and discuss. x <- c(2, 5, 3, 1, 2, 1, 0, 3, 0, 2) Solution. The log likelihood of the Poisson is \\[\\begin{align*} l(\\lambda; x) = \\sum_{i=1}^n x_i \\ln \\lambda - n \\lambda - \\sum_{i=1}^n \\ln x_i! \\end{align*}\\] Taking the derivative and equating with 0 we get \\[\\begin{align*} \\frac{1}{\\hat{\\lambda}}\\sum_{i=1}^n x_i - n &= 0 \\\\ \\hat{\\lambda} &= \\frac{1}{n} \\sum_{i=1}^n x_i. \\end{align*}\\] Since \\(\\lambda\\) is the mean parameter, this was expected. For the Fischer information, we first need the second derivative, which is \\[\\begin{align*} - \\lambda^{-2} \\sum_{i=1}^n x_i. \\\\ \\end{align*}\\] Now taking the expectation of the negative of the above, we get \\[\\begin{align*} E[\\lambda^{-2} \\sum_{i=1}^n x_i] &= \\lambda^{-2} E[\\sum_{i=1}^n x_i] \\\\ &= \\lambda^{-2} n \\lambda \\\\ &= \\frac{n}{\\lambda}. \\end{align*}\\] set.seed(1) x <- c(2, 5, 3, 1, 2, 1, 0, 3, 0, 2) lambda_hat <- mean(x) finfo <- length(x) / lambda_hat mle_CI <- c(lambda_hat - 1.96 * sqrt(1 / finfo), lambda_hat + 1.96 * sqrt(1 / finfo)) boot_lambda <- c() nboot <- 1000 for (i in 1:nboot) { tmp_x <- sample(x, length(x), replace = T) boot_lambda[i] <- mean(tmp_x) } boot_CI <- c(quantile(boot_lambda, 0.025), quantile(boot_lambda, 0.975)) mle_CI ## [1] 1.045656 2.754344 boot_CI ## 2.5% 97.5% ## 1.0 2.7 Exercise 15.7 Find the Fisher information matrix for the Gamma distribution. Generate 20 samples from a Gamma distribution and plot a confidence ellipse of the inverse of Fisher information matrix around the ML estimates of the parameters. Also plot the theoretical values. Repeat the sampling several times. What do you observe? Discuss what a non-diagonal Fisher matrix implies. Hint: The digamma function is defined as \\(\\psi(x) = \\frac{\\frac{d}{dx} \\Gamma(x)}{\\Gamma(x)}\\). Additionally, you do not need to evaluate \\(\\frac{d}{dx} \\psi(x)\\). To calculate its value in R, use package numDeriv. Solution. The log likelihood of the Gamma is \\[\\begin{equation*} l(\\alpha, \\beta; x) = n \\alpha \\ln \\beta - n \\ln \\Gamma(\\alpha) + (\\alpha - 1) \\sum_{i=1}^n \\ln x_i - \\beta \\sum_{i=1}^n x_i. \\end{equation*}\\] Let us calculate the derivatives. \\[\\begin{align*} \\frac{\\partial}{\\partial \\alpha} l(\\alpha, \\beta; x) &= n \\ln \\beta - n \\psi(\\alpha) + \\sum_{i=1}^n \\ln x_i, \\\\ \\frac{\\partial}{\\partial \\beta} l(\\alpha, \\beta; x) &= \\frac{n \\alpha}{\\beta} - \\sum_{i=1}^n x_i, \\\\ \\frac{\\partial^2}{\\partial \\alpha \\beta} l(\\alpha, \\beta; x) &= \\frac{n}{\\beta}, \\\\ \\frac{\\partial^2}{\\partial \\alpha^2} l(\\alpha, \\beta; x) &= - n \\frac{\\partial}{\\partial \\alpha} \\psi(\\alpha), \\\\ \\frac{\\partial^2}{\\partial \\beta^2} l(\\alpha, \\beta; x) &= - \\frac{n \\alpha}{\\beta^2}. \\end{align*}\\] The Fisher information matrix is then \\[\\begin{align*} I(\\alpha, \\beta) = - E[ \\begin{bmatrix} - n \\psi'(\\alpha) & \\frac{n}{\\beta} \\\\ \\frac{n}{\\beta} & - \\frac{n \\alpha}{\\beta^2} \\end{bmatrix} ] = \\begin{bmatrix} n \\psi'(\\alpha) & - \\frac{n}{\\beta} \\\\ - \\frac{n}{\\beta} & \\frac{n \\alpha}{\\beta^2} \\end{bmatrix} \\end{align*}\\] A non-diagonal Fisher matrix implies that the parameter estimates are linearly dependent. 
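A practical note for the R part of Exercise 15.7: the derivative of the digamma function, \\(\\psi'(\\alpha)\\), is the trigamma function, which is available in base R as trigamma(). The numDeriv-based numerical derivative used in the solution below can therefore be cross-checked (or replaced) as sketched here; the test value is arbitrary.

```r
library(numDeriv)
alpha <- 2.5  # arbitrary test value, not taken from the exercise
c(numerical = grad(digamma, alpha), exact = trigamma(alpha))
# The two values should agree to numerical precision.
```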
set.seed(1) n <- 20 pars_theor <- c(5, 2) x <- rgamma(n, 5, 2) # MLE for alpha and beta log_lik <- function (pars, x) { n <- length(x) return (- (n * pars[1] * log(pars[2]) - n * log(gamma(pars[1])) + (pars[1] - 1) * sum(log(x)) - pars[2] * sum(x))) } my_optim <- optim(par = c(1,1), fn = log_lik, method = "L-BFGS-B", lower = c(0.001, 0.001), upper = c(8, 8), x = x) pars_mle <- my_optim$par fish_mat <- matrix(data = NA, nrow = 2, ncol = 2) fish_mat[1,2] <- - n / pars_mle[2] fish_mat[2,1] <- - n / pars_mle[2] fish_mat[2,2] <- (n * pars_mle[1]) / (pars_mle[2]^2) fish_mat[1,1] <- n * grad(digamma, pars_mle[1]) fish_mat_inv <- solve(fish_mat) est_ellip <- ellipse(pars_mle, fish_mat_inv, draw = FALSE) colnames(est_ellip) <- c("X1", "X2") est_ellip <- as.data.frame(est_ellip) ggplot() + geom_point(data = data.frame(x = pars_mle[1], y = pars_mle[2]), aes(x = x, y = y)) + geom_path(data = est_ellip, aes(x = X1, y = X2)) + geom_point(aes(x = pars_theor[1], y = pars_theor[2]), color = "red") + geom_text(aes(x = pars_theor[1], y = pars_theor[2], label = "Theoretical parameters"), color = "red", nudge_y = -0.2) 15.3 The German tank problem Exercise 15.8 (The German tank problem) During WWII the allied intelligence were faced with an important problem of estimating the total production of certain German tanks, such as the Panther. What turned out to be a successful approach was to estimate the maximum from the serial numbers of the small sample of captured or destroyed tanks (describe the statistical model used). What assumptions were made by using the above model? Do you think they are reasonable assumptions in practice? Show that the plug-in estimate for the maximum (i.e. the maximum of the sample) is a biased estimator. Derive the maximum likelihood estimate of the maximum. Check that the following estimator is not biased: \\(\\hat{n} = \\frac{k + 1}{k}m - 1\\). Solution. The data are the serial numbers of the tanks. The parameter is \\(n\\), the total production of the tank. The distribution of the serial numbers is a discrete uniform distribution over all serial numbers. One of the assumptions is that we have i.i.d samples, however in practice this might not be true, as some tanks produced later could be sent to the field later, therefore already in theory we would not be able to recover some values from the population. To find the expected value we first need to find the distribution of \\(m\\). Let us start with the CDF. \\[\\begin{align*} F_m(x) = P(Y_1 < x,...,Y_k < x). \\end{align*}\\] If \\(x < k\\) then \\(F_m(x) = 0\\) and if \\(x \\geq 1\\) then \\(F_m(x) = 1\\). What about between those values. So the probability that the maximum value is less than or equal to \\(m\\) is just the number of possible draws from \\(Y\\) that are all smaller than \\(m\\), divided by all possible draws. This is \\(\\frac{{x}\\choose{k}}{{n}\\choose{k}}\\). The PDF on the suitable bounds is then \\[\\begin{align*} P(m = x) = F_m(x) - F_m(x - 1) = \\frac{\\binom{x}{k} - \\binom{x - 1}{k}}{\\binom{n}{k}} = \\frac{\\binom{x - 1}{k - 1}}{\\binom{n}{k}}. \\end{align*}\\] Now we can calculate the expected value of \\(m\\) using some combinatorial identities. \\[\\begin{align*} E[m] &= \\sum_{i = k}^n i \\frac{{i - 1}\\choose{k - 1}}{{n}\\choose{k}} \\\\ &= \\sum_{i = k}^n i \\frac{\\frac{(i - 1)!}{(k - 1)!(i - k)!}}{{n}\\choose{k}} \\\\ &= \\frac{k}{\\binom{n}{k}}\\sum_{i = k}^n \\binom{i}{k} \\\\ &= \\frac{k}{\\binom{n}{k}} \\binom{n + 1}{k + 1} \\\\ &= \\frac{k(n + 1)}{k + 1}. 
\\end{align*}\\] The bias of this estimator is then \\[\\begin{align*} E[m] - n = \\frac{k(n + 1)}{k + 1} - n = \\frac{k - n}{k + 1}. \\end{align*}\\] The probability that we observed our sample \\(Y = {Y_1, Y_2,...,,Y_k}\\) given \\(n\\) is \\(\\frac{1}{{n}\\choose{k}}\\). We need to find such \\(n^*\\) that this function is maximized. Additionally, we have a constraint that \\(n^* \\geq m = \\max{(Y)}\\). Let us plot this function for \\(m = 10\\) and \\(k = 4\\). library(ggplot2) my_fun <- function (x, m, k) { tmp <- 1 / (choose(x, k)) tmp[x < m] <- 0 return (tmp) } x <- 1:20 y <- my_fun(x, 10, 4) df <- data.frame(x = x, y = y) ggplot(data = df, aes(x = x, y = y)) + geom_line() ::: {.solution} (continued) We observe that the maximum of this function lies at the maximum value of the sample. Therefore \\(n^* = m\\) and ML estimate equals the plug-in estimate. \\[\\begin{align*} E[\\hat{n}] &= \\frac{k + 1}{k} E[m] - 1 \\\\ &= \\frac{k + 1}{k} \\frac{k(n + 1)}{k + 1} - 1 \\\\ &= n. \\end{align*}\\] ::: "],["nhst.html", "Chapter 16 Null hypothesis significance testing", " Chapter 16 Null hypothesis significance testing This chapter deals with null hypothesis significance testing. The students are expected to acquire the following knowledge: Binomial test. t-test. Chi-squared test. .fold-btn { float: right; margin: 5px 5px 0 0; } .fold { border: 1px solid black; min-height: 40px; } Exercise 16.1 (Binomial test) We assume \\(y_i \\in \\{0,1\\}\\), \\(i = 1,...,n\\) and \\(y_i | \\theta = 0.5 \\sim i.i.d.\\) Bernoulli\\((\\theta)\\). The test statistic is \\(X = \\sum_{i=1}^n\\) and the rejection region R is defined as the region where the probability of obtaining such or more extreme \\(X\\) given \\(\\theta = 0.5\\) is less than 0.05. Derive and plot the power function of the test for \\(n=100\\). What is the significance level of this test if \\(H0: \\theta = 0.5\\)? At which values of X will we reject the null hypothesis? # a # First we need the rejection region, so we need to find X_min and X_max n <- 100 qbinom(0.025, n, 0.5) ## [1] 40 qbinom(0.975, n, 0.5) ## [1] 60 pbinom(40, n, 0.5) ## [1] 0.02844397 pbinom(60, n, 0.5) ## [1] 0.9823999 X_min <- 39 X_max <- 60 thetas <- seq(0, 1, by = 0.01) beta_t <- 1 - pbinom(X_max, size = n, prob = thetas) + pbinom(X_min, size = n, prob = thetas) plot(beta_t) # b # The significance level is beta_t[51] ## [1] 0.0352002 # We will reject the null hypothesis at X values below X_min and above X_max. Exercise 16.2 (Long-run guarantees of the t-test) Generate a sample of size \\(n = 10\\) from the standard normal. Use the two-sided t-test with \\(H0: \\mu = 0\\) and record the p-value. Can you reject H0 at 0.05 significance level? (before simulating) If we repeated (b) many times, what would be the relative frequency of false positives/Type I errors (rejecting the null that is true)? What would be the relative frequency of false negatives /Type II errors (retaining the null when the null is false)? (now simulate b and check if the simulation results match your answer in b) Similar to (a-c) but now we generate data from N(-0.5, 1). Similar to (a-c) but now we generate data from N(\\(\\mu\\), 1) where we every time pick a different \\(\\mu < 0\\) and use a one-sided test \\(H0: \\mu <= 0\\). 
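For part (d) of Exercise 16.2, the Type II error rate at the fixed alternative \\(\\mu = -0.5\\) can also be obtained without simulation from the power of the t-test; the simulated relative frequency in the solution below should be close to one minus this power. The call is a sketch using base R's power.t.test(), where delta is the absolute difference between the true and hypothesized means.

```r
pw <- power.t.test(n = 10, delta = 0.5, sd = 1, sig.level = 0.05,
                   type = "one.sample", alternative = "two.sided")
pw$power      # power of the two-sided t-test against mu = -0.5
1 - pw$power  # theoretical Type II error rate
```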
set.seed(2) # a x <- rnorm(10) my_test <- t.test(x, alternative = "two.sided", mu = 0) my_test ## ## One Sample t-test ## ## data: x ## t = 0.6779, df = 9, p-value = 0.5149 ## alternative hypothesis: true mean is not equal to 0 ## 95 percent confidence interval: ## -0.4934661 0.9157694 ## sample estimates: ## mean of x ## 0.2111516 # we can not reject the null hypothesis # b # The expected value of false positives would be 0.05. The expected value of # true negatives would be 0, as there are no negatives (the null hypothesis is # always the truth). nit <- 1000 typeIerr <- vector(mode = "logical", length = nit) typeIIerr <- vector(mode = "logical", length = nit) for (i in 1:nit) { x <- rnorm(10) my_test <- t.test(x, alternative = "two.sided", mu = 0) if (my_test$p.value < 0.05) { typeIerr[i] <- T } else { typeIerr[i] <- F } } mean(typeIerr) ## [1] 0.052 sd(typeIerr) / sqrt(nit) ## [1] 0.007024624 # d # We can not estimate the percentage of true negatives, but it will probably be # higher than 0.05. There will be no false positives as the null hypothesis is # always false. typeIIerr <- vector(mode = "logical", length = nit) for (i in 1:nit) { x <- rnorm(10, -0.5) my_test <- t.test(x, alternative = "two.sided", mu = 0) if (my_test$p.value < 0.05) { typeIIerr[i] <- F } else { typeIIerr[i] <- T } } mean(typeIIerr) ## [1] 0.719 sd(typeIIerr) / sqrt(nit) ## [1] 0.01422115 # e # The expected value of false positives would be lower than 0.05. The expected # value of true negatives would be 0, as there are no negatives (the null # hypothesis is always the truth). typeIerr <- vector(mode = "logical", length = nit) for (i in 1:nit) { u <- runif(1, -1, 0) x <- rnorm(10, u) my_test <- t.test(x, alternative = "greater", mu = 0) if (my_test$p.value < 0.05) { typeIerr[i] <- T } else { typeIerr[i] <- F } } mean(typeIerr) ## [1] 0.012 sd(typeIerr) / sqrt(nit) ## [1] 0.003444977 Exercise 16.3 (T-test, confidence intervals, and bootstrap) Sample \\(n=20\\) from a standard normal distribution and calculate the p-value using t-test, confidence intervals based on normal distribution, and bootstrap. Repeat this several times and check how many times we rejected the null hypothesis (made a type I error). Hint: For the confidence intervals you can use function CI from the Rmisc package. set.seed(1) library(Rmisc) nit <- 1000 n_boot <- 100 t_logic <- rep(F, nit) boot_logic <- rep(F, nit) norm_logic <- rep(F, nit) for (i in 1:nit) { x <- rnorm(20) my_test <- t.test(x) my_CI <- CI(x) if (my_test$p.value <= 0.05) t_logic[i] <- T boot_tmp <- vector(mode = "numeric", length = n_boot) for (j in 1:n_boot) { tmp_samp <- sample(x, size = 20, replace = T) boot_tmp[j] <- mean(tmp_samp) } if ((quantile(boot_tmp, 0.025) >= 0) | (quantile(boot_tmp, 0.975) <= 0)) { boot_logic[i] <- T } if ((my_CI[3] >= 0) | (my_CI[1] <= 0)) { norm_logic[i] <- T } } mean(t_logic) ## [1] 0.053 sd(t_logic) / sqrt(nit) ## [1] 0.007088106 mean(boot_logic) ## [1] 0.093 sd(boot_logic) / sqrt(nit) ## [1] 0.009188876 mean(norm_logic) ## [1] 0.053 sd(norm_logic) / sqrt(nit) ## [1] 0.007088106 Exercise 16.4 (Chi-squared test) Show that the \\(\\chi^2 = \\sum_{i=1}^k \\frac{(O_i - E_i)^2}{E_i}\\) test statistic is approximately \\(\\chi^2\\) distributed when we have two categories. Let us look at the US voting data here. Compare the number of voters who voted for Trump or Hillary depending on their income (less or more than 100.000 dollars per year). Manually calculate the chi-squared statistic, compare to the chisq.test in R, and discuss the results. 
Visualize the test. Solution. Let \\(X_i\\) be binary variables, \\(i = 1,...,n\\). We can then express the test statistic as \\[\\begin{align} \\chi^2 = &\\frac{(O_i - np)^2}{np} + \\frac{(n - O_i - n(1 - p))^2}{n(1 - p)} \\\\ &= \\frac{(O_i - np)^2}{np(1 - p)} \\\\ &= (\\frac{O_i - np}{\\sqrt{np(1 - p)}})^2. \\end{align}\\] When \\(n\\) is large, this distrbution is approximately normal with \\(\\mu = np\\) and \\(\\sigma^2 = np(1 - p)\\) (binomial converges in distribution to standard normal). By definition, the chi-squared distribution with \\(k\\) degrees of freedom is a sum of squares of \\(k\\) independent standard normal random variables. n <- 24588 less100 <- round(0.66 * n * c(0.49, 0.45, 0.06)) # some rounding, but it should not affect results more100 <- round(0.34 * n * c(0.47, 0.47, 0.06)) x <- rbind(less100, more100) colnames(x) <- c("Clinton", "Trump", "other/no answer") print(x) ## Clinton Trump other/no answer ## less100 7952 7303 974 ## more100 3929 3929 502 chisq.test(x) ## ## Pearson's Chi-squared test ## ## data: x ## X-squared = 9.3945, df = 2, p-value = 0.00912 x ## Clinton Trump other/no answer ## less100 7952 7303 974 ## more100 3929 3929 502 csum <- apply(x, 2, sum) rsum <- apply(x, 1, sum) chi2 <- (x[1,1] - csum[1] * rsum[1] / sum(x))^2 / (csum[1] * rsum[1] / sum(x)) + (x[1,2] - csum[2] * rsum[1] / sum(x))^2 / (csum[2] * rsum[1] / sum(x)) + (x[1,3] - csum[3] * rsum[1] / sum(x))^2 / (csum[3] * rsum[1] / sum(x)) + (x[2,1] - csum[1] * rsum[2] / sum(x))^2 / (csum[1] * rsum[2] / sum(x)) + (x[2,2] - csum[2] * rsum[2] / sum(x))^2 / (csum[2] * rsum[2] / sum(x)) + (x[2,3] - csum[3] * rsum[2] / sum(x))^2 / (csum[3] * rsum[2] / sum(x)) chi2 ## Clinton ## 9.394536 1 - pchisq(chi2, df = 2) ## Clinton ## 0.009120161 x <- seq(0, 15, by = 0.01) df <- data.frame(x = x) ggplot(data = df, aes(x = x)) + stat_function(fun = dchisq, args = list(df = 2)) + geom_segment(aes(x = chi2, y = 0, xend = chi2, yend = dchisq(chi2, df = 2))) + stat_function(fun = dchisq, args = list(df = 2), xlim = c(chi2, 15), geom = "area", fill = "red") "],["bi.html", "Chapter 17 Bayesian inference 17.1 Conjugate priors 17.2 Posterior sampling", " Chapter 17 Bayesian inference This chapter deals with Bayesian inference. The students are expected to acquire the following knowledge: How to set prior distribution. Compute posterior distribution. Compute posterior predictive distribution. Use sampling for inference. .fold-btn { float: right; margin: 5px 5px 0 0; } .fold { border: 1px solid black; min-height: 40px; } 17.1 Conjugate priors Exercise 17.1 (Poisson-gamma model) Let us assume a Poisson likelihood and a gamma prior on the Poisson mean parameter (this is a conjugate prior). Derive posterior Below we have some data, which represents number of goals in a football match. Choose sensible prior for this data (draw the gamma density if necessary), justify it. Compute the posterior. Compute an interval such that the probability that the true mean is in there is 95%. What is the probability that the true mean is greater than 2.5? Back to theory: Compute prior predictive and posterior predictive. Discuss why the posterior predictive is overdispersed and not Poisson? Draw a histogram of the prior predictive and posterior predictive for the data from (b). Discuss. Generate 10 and 100 random samples from a Poisson distribution and compare the posteriors with a flat prior, and a prior concentrated away from the truth. x <- c(3, 2, 1, 1, 5, 4, 0, 0, 4, 3) Solution. 
\\[\\begin{align*} p(\\lambda | X) &= \\frac{p(X | \\lambda) p(\\lambda)}{\\int_0^\\infty p(X | \\lambda) p(\\lambda) d\\lambda} \\\\ &\\propto p(X | \\lambda) p(\\lambda) \\\\ &= \\Big(\\prod_{i=1}^n \\frac{1}{x_i!} \\lambda^{x_i} e^{-\\lambda}\\Big) \\frac{\\beta^\\alpha}{\\Gamma(\\alpha)} \\lambda^{\\alpha - 1} e^{-\\beta \\lambda} \\\\ &\\propto \\lambda^{\\sum_{i=1}^n x_i + \\alpha - 1} e^{- \\lambda (n + \\beta)} \\\\ \\end{align*}\\] We recognize this as the shape of a gamma distribution, therefore \\[\\begin{align*} \\lambda | X \\sim \\text{gamma}(\\alpha + \\sum_{i=1}^n x_i, \\beta + n) \\end{align*}\\] For the prior predictive, we have \\[\\begin{align*} p(x^*) &= \\int_0^\\infty p(x^*, \\lambda) d\\lambda \\\\ &= \\int_0^\\infty p(x^* | \\lambda) p(\\lambda) d\\lambda \\\\ &= \\int_0^\\infty \\frac{1}{x^*!} \\lambda^{x^*} e^{-\\lambda} \\frac{\\beta^\\alpha}{\\Gamma(\\alpha)} \\lambda^{\\alpha - 1} e^{-\\beta \\lambda} d\\lambda \\\\ &= \\frac{\\beta^\\alpha}{\\Gamma(x^* + 1)\\Gamma(\\alpha)} \\int_0^\\infty \\lambda^{x^* + \\alpha - 1} e^{-\\lambda (1 + \\beta)} d\\lambda \\\\ &= \\frac{\\beta^\\alpha}{\\Gamma(x^* + 1)\\Gamma(\\alpha)} \\frac{\\Gamma(x^* + \\alpha)}{(1 + \\beta)^{x^* + \\alpha}} \\int_0^\\infty \\frac{(1 + \\beta)^{x^* + \\alpha}}{\\Gamma(x^* + \\alpha)} \\lambda^{x^* + \\alpha - 1} e^{-\\lambda (1 + \\beta)} d\\lambda \\\\ &= \\frac{\\beta^\\alpha}{\\Gamma(x^* + 1)\\Gamma(\\alpha)} \\frac{\\Gamma(x^* + \\alpha)}{(1 + \\beta)^{x^* + \\alpha}} \\\\ &= \\frac{\\Gamma(x^* + \\alpha)}{\\Gamma(x^* + 1)\\Gamma(\\alpha)} (\\frac{\\beta}{1 + \\beta})^\\alpha (\\frac{1}{1 + \\beta})^{x^*}, \\end{align*}\\] which we recognize as the negative binomial distribution with \\(r = \\alpha\\) and \\(p = \\frac{1}{\\beta + 1}\\). For the posterior predictive, the calculation is the same, only now the parameters are \\(r = \\alpha + \\sum_{i=1}^n x_i\\) and \\(p = \\frac{1}{\\beta + n + 1}\\). There are two sources of uncertainty in the predictive distribution. First is the uncertainty about the population. Second is the variability in sampling from the population. When \\(n\\) is large, the latter is going to be very small. But when \\(n\\) is small, the latter is going to be higher, resulting in an overdispersed predictive distribution. 
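As a small numerical aside (our own sketch, not part of the original solution), we can check the overdispersion argument directly: drawing \\(\\lambda\\) from the gamma posterior and then a new observation from a Poisson with that rate should reproduce the negative binomial posterior predictive derived above, with variance larger than the mean. The sketch assumes the same gamma(1, 1) prior and the same data as the code that follows.
# sketch: gamma-mixed Poisson draws vs. the negative binomial posterior predictive
set.seed(1)
a_post <- 1 + 23    # posterior shape: prior alpha = 1 plus sum(x) = 23
b_post <- 1 + 10    # posterior rate: prior beta = 1 plus n = 10
lam    <- rgamma(100000, shape = a_post, rate = b_post)   # posterior draws of the mean
x_mix  <- rpois(100000, lam)                              # new observations given those means
x_nb   <- rnbinom(100000, size = a_post, prob = b_post / (b_post + 1))  # direct negative binomial
c(mean(x_mix), var(x_mix))  # variance exceeds the mean (overdispersion)
c(mean(x_nb), var(x_nb))    # closely matches the mixed draws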
x <- c(3, 2, 1, 1, 5, 4, 0, 0, 4, 3) # b # quick visual check of the prior ggplot(data = data.frame(x = seq(0, 5, by = 0.01)), aes(x = x)) + stat_function(fun = dgamma, args = list(shape = 1, rate = 1)) palpha <- 1 pbeta <- 1 alpha_post <- palpha + sum(x) beta_post <- pbeta + length(x) ggplot(data = data.frame(x = seq(0, 5, by = 0.01)), aes(x = x)) + stat_function(fun = dgamma, args = list(shape = alpha_post, rate = beta_post)) # probability of being higher than 2.5 1 - pgamma(2.5, alpha_post, beta_post) ## [1] 0.2267148 # interval qgamma(c(0.025, 0.975), alpha_post, beta_post) ## [1] 1.397932 3.137390 # d prior_pred <- rnbinom(1000, size = palpha, prob = 1 - 1 / (pbeta + 1)) post_pred <- rnbinom(1000, size = palpha + sum(x), prob = 1 - 1 / (pbeta + 10 + 1)) df <- data.frame(prior = prior_pred, posterior = post_pred) df <- gather(df) ggplot(df, aes(x = value, fill = key)) + geom_histogram(position = "dodge") # e set.seed(1) x1 <- rpois(10, 2.5) x2 <- rpois(100, 2.5) alpha_flat <- 1 beta_flat <- 0.1 alpha_conc <- 50 beta_conc <- 10 n <- 10000 df_flat <- data.frame(x1 = rgamma(n, alpha_flat + sum(x1), beta_flat + 10), x2 = rgamma(n, alpha_flat + sum(x2), beta_flat + 100), type = "flat") df_flat <- tidyr::gather(df_flat, key = "key", value = "value", - type) df_conc <- data.frame(x1 = rgamma(n, alpha_conc + sum(x1), beta_conc + 10), x2 = rgamma(n, alpha_conc + sum(x2), beta_conc + 100), type = "conc") df_conc <- tidyr::gather(df_conc, key = "key", value = "value", - type) df <- rbind(df_flat, df_conc) ggplot(data = df, aes(x = value, color = type)) + facet_wrap(~ key) + geom_density() 17.2 Posterior sampling Exercise 17.2 (Bayesian logistic regression) In Chapter 15 we implemented a MLE for logistic regression (see the code below). For this model, conjugate priors do not exist, which complicates the calculation of the posterior. However, we can use sampling from the numerator of the posterior, using rejection sampling. Set a sensible prior distribution on \\(\\beta\\) and use rejection sampling to find the posterior distribution. In a) you will get a distribution of parameter \\(\\beta\\). Plot the probabilities (as in exercise 15.3) for each sample of \\(\\beta\\) and compare to the truth. Hint: We can use rejection sampling even for functions which are not PDFs – they do not have to sum/integrate to 1. We just need to use a suitable envelope that we know how to sample from. For example, here we could use a uniform distribution and scale it suitably. set.seed(1) inv_log <- function (z) { return (1 / (1 + exp(-z))) } x <- rnorm(100) y <- x y <- rbinom(100, size = 1, prob = inv_log(1.2 * x)) l_logistic <- function (beta, X, y) { logl <- -sum(y * log(inv_log(as.vector(beta %*% X))) + (1 - y) * log((1 - inv_log(as.vector(beta %*% X))))) return(logl) } my_optim <- optim(par = 0.5, fn = l_logistic, method = "L-BFGS-B", lower = 0, upper = 10, X = x, y = y) my_optim$par # Let's say we believe that the mean of beta is 0.5. Since we are not very sure # about this, we will give it a relatively high variance. So a normal prior with # mean 0.5 and standard deviation 5. But there is no right solution to this, # this is basically us expressing our prior belief in the parameter values. 
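As a small addition (not part of the original solution), one way to eyeball this prior choice before running the sampler is to plot the assumed N(0.5, 5^2) density over a plausible range of \\(\\beta\\):
# sketch: visualize the assumed normal prior for beta
beta_grid <- seq(-15, 15, by = 0.1)
plot(beta_grid, dnorm(beta_grid, mean = 0.5, sd = 5), type = "l", xlab = "beta", ylab = "prior density")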
set.seed(1) inv_log <- function (z) { return (1 / (1 + exp(-z))) } x <- rnorm(100) y <- x y <- rbinom(100, size = 1, prob = inv_log(1.2 * x)) l_logistic <- function (beta, X, y) { logl <- -sum(y * log(inv_log(as.vector(beta %*% X))) + (1 - y) * log((1 - inv_log(as.vector(beta %*% X))))) if (is.nan(logl)) logl <- Inf return(logl) } my_optim <- optim(par = 0.5, fn = l_logistic, method = "L-BFGS-B", lower = 0, upper = 10, X = x, y = y) my_optim$par ## [1] 1.166558 f_logistic <- function (beta, X, y) { logl <- prod(inv_log(as.vector(beta %*% X))^y * (1 - inv_log(as.vector(beta %*% X)))^(1 - y)) return(logl) } a <- seq(0, 3, by = 0.01) my_l <- c() for (i in a) { my_l <- c(my_l, f_logistic(i, x, y) * dnorm(i, 0.5, 5)) } plot(my_l) envlp <- 10^(-25.8) * dunif(a, -5, 5) # found by trial and error tmp <- data.frame(envel = envlp, l = my_l, t = a) tmp <- gather(tmp, key = "key", value = "value", - t) ggplot(tmp, aes(x = t, y = value, color = key)) + geom_line() # envelope OK set.seed(1) nsamps <- 1000 samps <- c() for (i in 1:nsamps) { tmp <- runif(1, -5, 5) u <- runif(1, 0, 1) if (u < (f_logistic(tmp, x, y) * dnorm(tmp, 0.5, 5)) / (10^(-25.8) * dunif(tmp, -5, 5))) { samps <- c(samps, tmp) } } plot(density(samps)) mean(samps) ## [1] 1.211578 median(samps) ## [1] 1.204279 truth_p <- data.frame(x = x, prob = inv_log(1.2 * x), type = "truth") preds <- inv_log(x %*% t(samps)) preds <- gather(cbind(as.data.frame(preds), x = x), key = "key", "value" = value, - x) ggplot(preds, aes(x = x, y = value)) + geom_line(aes(group = key), color = "gray", alpha = 0.7) + geom_point(data = truth_p, aes(y = prob), color = "red", alpha = 0.7) + theme_bw() "],["distributions-intutition.html", "Chapter 18 Distributions intutition 18.1 Discrete distributions 18.2 Continuous distributions", " Chapter 18 Distributions intutition This chapter is intended to help you familiarize yourself with the different probability distributions you will encounter in this course. Use Appendix B as a reference for the basic properties of distributions. .fold-btn { float: right; margin: 5px 5px 0 0; } .fold { border: 1px solid black; min-height: 40px; } 18.1 Discrete distributions Exercise 18.1 (Bernoulli intuition 1) Arguably the simplest distribution you will enocounter is the Bernoulli distribution. It is a discrete probability distribution used to represent the outcome of a yes/no question. It has one parameter \\(p\\) which is the probability of success. The probability of failure is \\(1-p\\), sometimes denoted as \\(q\\). A classic way to think about a Bernoulli trial (a yes/no experiment) is a coin flip. Real coins are fair, meaning the probability of either heads (1) or tails (0) are the same i.e. \\(p=0.5\\), shown below in figure a. Alternatively we may want to represent a process that doesn’t have equal probabilities of outcomes like “Will a throw of a fair die result in a 6?”. In this case \\(p=\\frac{1}{6}\\), shown in figure b. Using your knowledge of the Bernoulli distribution use the throw of a fair die to think of events, such that: \\(p = 0.5\\) \\(p = \\frac{5}{6}\\) \\(q = \\frac{2}{3}\\) Solution. An event that is equally likely to happen or not happen i.e. \\(p = 0.5\\) would be throwing an even number. More formally we can name this event \\(A\\) and write: \\(A = \\{2,4,6\\}\\), its probability being \\(P(A) = 0.5\\) An example of an event with \\(p = \\frac{5}{6}\\) would be throwing a number greater than 1. Defined as \\(B = \\{2,3,4,5,6\\}\\). We need an event that fails \\(\\frac{2}{3}\\) of the time. 
Alternatively we can reverse the problem and find an event that succeeds \\(\\frac{1}{3}\\) of the time, since: \\(q = 1 - p \\implies p = 1 - q = \\frac{1}{3}\\). The event that our outcome is divisible by 3: \\(C = \\{3, 6\\}\\) satisfies this condition. Exercise 18.2 (Binomial intuition 1) The binomial distribution is a generalization of the Bernoulli distribution. Instead of considering a single Bernoulli trial, we now consider a sequence of \\(n\\) trials, which are independent and have the same parameter \\(p\\). So the binomial distribution has two parameters \\(n\\) - the number of trials and \\(p\\) - the probability of success for each trial. If we return to our coin flip representation, we now flip a coin several times. The binomial distribution will give us the probabilities of all possible outcomes. Below we show the distribution for a series of 10 coin flips with a fair coin (left) and a biased coin (right). The numbers on the x axis represent the number of times the coin landed heads. Using your knowledge of the binomial distribution: Take the pmf of the binomial distribution and plug in \\(n=1\\), check that it is in fact equivalent to a Bernoulli distribution. In our examples we show the graph of a binomial distribution over 10 trials with \\(p=0.8\\). If we take a look at the graph, it appears as though the probabilities of getting 0,1,2 or 3 heads in 10 flips are zero. Is it actually zero? Check by plugging in the values into the pmf. Solution. The pmf of a binomial distribution is \\(\\binom{n}{k} p^k (1 - p)^{n - k}\\), now we insert \\(n=1\\) to get: \\[\\binom{1}{k} p^k (1 - p)^{1 - k}\\]. Not quite equivalent to a Bernoulli, however note that the support of the binomial distribution is defined as \\(k \\in \\{0,1,\\dots,n\\}\\), so in our case \\(k = \\{0,1\\}\\), then: \\[\\binom{1}{0} = \\binom{1}{1} = 1\\] we get: \\(p^k (1 - p)^{1 - k}\\) ,the Bernoulli distribution. As we already know \\(p=0.8, n=10\\), so: \\[\\binom{10}{0} 0.8^0 (1 - 0.8)^{10 - 0} = 1.024 \\cdot 10^{-7}\\] \\[\\binom{10}{1} 0.8^1 (1 - 0.8)^{10 - 1} = 4.096 \\cdot 10^{-6}\\] \\[\\binom{10}{2} 0.8^2 (1 - 0.8)^{10 - 2} = 7.3728 \\cdot 10^{-5}\\] \\[\\binom{10}{3} 0.8^3 (1 - 0.8)^{10 - 3} = 7.86432\\cdot 10^{-4}\\] So the probabilities are not zero, just very small. Exercise 18.3 (Poisson intuition 1) Below are shown 3 different graphs of the Poisson distribution. Your task is to replicate them on your own in R by varying the \\(\\lambda\\) parameter. Hint: You can use dpois() to get the probabilities. library(ggplot2) library(gridExtra) x = 0:15 # Create Poisson data data1 <- data.frame(x = x, y = dpois(x, lambda = 0.1)) data2 <- data.frame(x = x, y = dpois(x, lambda = 1)) data3 <- data.frame(x = x, y = dpois(x, lambda = 7.5)) # Create individual ggplot objects plot1 <- ggplot(data1, aes(x, y)) + geom_col() + xlab("x") + ylab("Probability") + ylim(0,1) plot2 <- ggplot(data2, aes(x, y)) + geom_col() + xlab("x") + ylab(NULL) + ylim(0,1) plot3 <- ggplot(data3, aes(x, y)) + geom_col() + xlab("x") + ylab(NULL) + ylim(0,1) # Combine the plots grid.arrange(plot1, plot2, plot3, ncol = 3) Exercise 18.4 (Poisson intuition 2) The Poisson distribution is a discrete probability distribution that models processes where events occur at a constant mean rate and are independent of each other. It has a single parameter \\(\\lambda\\), which represents the constant mean rate. 
A classic example of a scenario that can be modeled using the Poisson distribution is the number of calls received by a call center in a day (or in fact any other time interval). Suppose you work in a call center and have some understanding of probability distributions. You overhear your supervisor mentioning that the call center receives an average of 2.5 calls per day. Using your knowledge of the Poisson distribution, calculate: The probability you will get no calls today. The probability you will get more than 5 calls today. Solution. First recall the Poisson pmf: \\[p(k) = \\frac{\\lambda^k e^{-\\lambda}}{k!}\\] As stated previously, our parameter is \\(\\lambda = 2.5\\). To get the probability of no calls we simply plug in \\(k = 0\\), so: \\[p(0) = \\frac{2.5^0 e^{-2.5}}{0!} = e^{-2.5} \\approx 0.082\\] The support of the Poisson distribution is the non-negative integers. So if we wanted to calculate the probability of getting more than 5 calls directly, we would need to add up the probabilities of getting 6 calls, 7 calls, and so on up to infinity. Let us instead remember that all the probabilities sum to 1 and reverse the problem: we ask “What is the probability we get 5 calls or less?” and subtract this complementary probability from 1 to answer the original question. \\[P(k > 5) = 1 - P(k \\leq 5)\\] \\[\\begin{align*} P(k \\leq 5) &= \\sum_{i=0}^{5} p(i) = p(0) + p(1) + p(2) + p(3) + p(4) + p(5) \\\\ &= \\frac{2.5^0 e^{-2.5}}{0!} + \\frac{2.5^1 e^{-2.5}}{1!} + \\dots + \\frac{2.5^5 e^{-2.5}}{5!} \\\\ &\\approx 0.957979 \\end{align*}\\] So the probability of getting more than 5 calls is \\(1 - 0.957979 = 0.042021\\). Exercise 18.5 (Geometric intuition 1) The geometric distribution is a discrete distribution that models the number of failures before the first success in a sequence of independent Bernoulli trials. It has a single parameter \\(p\\), representing the probability of success. Disclaimer: There are two forms of this distribution, the one we just described and another version that models the number of trials up to and including the first success. The difference is subtle yet significant and you are likely to encounter both forms, though here we will limit ourselves to the former. In the graph below we show the pmf of a geometric distribution with \\(p=0.5\\). This can be thought of as the number of successive failures (tails) in the flip of a fair coin. You can see that there’s a 50% chance you will have zero failures i.e. you will flip heads on your very first attempt. But there is some smaller chance that you will flip a sequence of tails in a row, with longer sequences having ever lower probability. Create an equivalent graph that represents the probability of rolling a 6 with a fair 6-sided die. Use the formula for the mean of the geometric distribution to determine the average number of failures before you roll a 6. Look up the alternative form of the geometric distribution and again use the formula for the mean to determine the average number of trials before you roll a 6. Solution. The parameter \\(p\\) (the probability of success) for rolling a 6 is \\(p=\\frac{1}{6}\\). 
library(ggplot2) # Parameters p <- 1/6 x_vals <- 0:9 # Starting from 0 probs <- dgeom(x_vals, p) # Data data <- data.frame(x_vals, probs) # Plot ggplot(data, aes(x=x_vals, y=probs)) + geom_segment(aes(xend=x_vals, yend=0), color="black", size=1) + geom_point(color="red", size=2) + labs(x = "Number of trials", y = "Probability") + theme_minimal() + scale_x_continuous(breaks = x_vals) # This line ensures integer x-axis labels ::: {.solution} b) \\[E[X] = \\frac{1-p}{p}= \\frac{1- \\frac{1}{6}}{\\frac{1}{6}} = \\frac{5}{6}\\cdot 6 = 5\\] On average we will fail 5 times before we roll our first 6. The alternative form of this distribution (with support on all positive integers) has a slightly different formula for the mean. This change reflects the difference in the way we posed our question: \\[E[X] = \\frac{1}{p} = \\frac{1}{\\frac{1}{6}} = 6\\] On average we will have to throw the die 6 times before we roll a 6. ::: 18.2 Continuous distributions "],["A1.html", "A R programming language A.1 Basic characteristics A.2 Why R? A.3 Setting up A.4 R basics A.5 Functions A.6 Other tips A.7 Further reading and references", " A R programming language A.1 Basic characteristics R is free software for statistical computing and graphics. It is widely used by statisticians, scientists, and other professionals for software development and data analysis. It is an interpreted language and therefore the programs do not need compilation. A.2 Why R? R is one of the main two languages used for statistics and machine learning (the other being Python). Pros Libraries. Comprehensive collection of statistical and machine learning packages. Easy to code. Open source. Anyone can access R and develop new methods. Additionally, it is relatively simple to get source code of established methods. Large community. The use of R has been rising for some time, in industry and academia. Therefore a large collection of blogs and tutorials exists, along with people offering help on pages like StackExchange and CrossValidated. Integration with other languages and LaTeX. New methods. Many researchers develop R packages based on their research, therefore new methods are available soon after development. Cons Slow. Programs run slower than in other programming languages, however this can be somewhat ammended by effective coding or integration with other languages. Memory intensive. This can become a problem with large data sets, as they need to be stored in the memory, along with all the information the models produce. Some packages are not as good as they should be, or have poor documentation. Object oriented programming in R can be very confusing and complex. A.3 Setting up https://www.r-project.org/. A.3.1 RStudio RStudio is the most widely used IDE for R. It is free, you can download it from https://rstudio.com/. While console R is sufficient for the requirements of this course, we recommend the students install RStudio for its better user interface. A.3.2 Libraries for data science Listed below are some of the more useful libraries (packages) for data science. Students are also encouraged to find other useful packages. dplyr Efficient data manipulation. Part of the wider package collection called tidyverse. ggplot2 Plotting based on grammar of graphics. stats Several statistical models. rstan Bayesian inference using Hamiltonian Monte Carlo. Very flexible model building. MCMCpack Bayesian inference. rmarkdown, knitr, and bookdown Dynamic reports (for example such as this one). devtools Package development. 
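As a quick illustration (assuming a working internet connection and the default CRAN mirror), packages are installed once with install.packages() and then loaded in each session with library():
# install (once) and load (each session) a few of the packages listed above
install.packages(c("dplyr", "ggplot2", "rmarkdown"))
library(dplyr)
library(ggplot2)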
A.4 R basics A.4.1 Variables and types Important information and tips: no type declaration define variables with <- instead of = (although both work, there is a slight difference, additionally most of the packages use the arrow) for strings use \"\" for comments use # change types with as.type() functions no special type for single character like C++ for example n <- 20 x <- 2.7 m <- n # m gets value 20 my_flag <- TRUE student_name <- "Luka" typeof(n) ## [1] "double" typeof(student_name) ## [1] "character" typeof(my_flag) ## [1] "logical" typeof(as.integer(n)) ## [1] "integer" typeof(as.character(n)) ## [1] "character" A.4.2 Basic operations n + x ## [1] 22.7 n - x ## [1] 17.3 diff <- n - x # variable diff gets the difference between n and x diff ## [1] 17.3 n * x ## [1] 54 n / x ## [1] 7.407407 x^2 ## [1] 7.29 sqrt(x) ## [1] 1.643168 n > 2 * n ## [1] FALSE n == n ## [1] TRUE n == 2 * n ## [1] FALSE n != n ## [1] FALSE paste(student_name, "is", n, "years old") ## [1] "Luka is 20 years old" A.4.3 Vectors use c() to combine elements into vectors can only contain one type of variable if different types are provided, all are transformed to the most basic type in the vector access elements by indexes or logical vectors of the same length a scalar value is regarded as a vector of length 1 1:4 # creates a vector of integers from 1 to 4 ## [1] 1 2 3 4 student_ages <- c(20, 23, 21) student_names <- c("Luke", "Jen", "Mike") passed <- c(TRUE, TRUE, FALSE) length(student_ages) ## [1] 3 # access by index student_ages[2] ## [1] 23 student_ages[1:2] ## [1] 20 23 student_ages[2] <- 24 # change values # access by logical vectors student_ages[passed == TRUE] # same as student_ages[passed] ## [1] 20 24 student_ages[student_names %in% c("Luke", "Mike")] ## [1] 20 21 student_names[student_ages > 20] ## [1] "Jen" "Mike" A.4.3.1 Operations with vectors most operations are element-wise if we operate on vectors of different lengths, the shorter vector periodically repeats its elements until it reaches the length of the longer one a <- c(1, 3, 5) b <- c(2, 2, 1) d <- c(6, 7) a + b ## [1] 3 5 6 a * b ## [1] 2 6 5 a + d ## Warning in a + d: longer object length is not a multiple of shorter object ## length ## [1] 7 10 11 a + 2 * b ## [1] 5 7 7 a > b ## [1] FALSE TRUE TRUE b == a ## [1] FALSE FALSE FALSE a %*% b # vector multiplication, not element-wise ## [,1] ## [1,] 13 A.4.4 Factors vectors of finite predetermined classes suitable for categorical variables ordinal (ordered) or nominal (unordered) car_brand <- factor(c("Audi", "BMW", "Mercedes", "BMW"), ordered = FALSE) car_brand ## [1] Audi BMW Mercedes BMW ## Levels: Audi BMW Mercedes freq <- factor(x = NA, levels = c("never","rarely","sometimes","often","always"), ordered = TRUE) freq[1:3] <- c("rarely", "sometimes", "rarely") freq ## [1] rarely sometimes rarely ## Levels: never < rarely < sometimes < often < always freq[4] <- "quite_often" # non-existing level, returns NA ## Warning in `[<-.factor`(`*tmp*`, 4, value = "quite_often"): invalid factor ## level, NA generated freq ## [1] rarely sometimes rarely <NA> ## Levels: never < rarely < sometimes < often < always A.4.5 Matrices two-dimensional generalizations of vectors my_matrix <- matrix(c(1, 2, 1, 5, 4, 2), nrow = 2, byrow = TRUE) my_matrix ## [,1] [,2] [,3] ## [1,] 1 2 1 ## [2,] 5 4 2 my_square_matrix <- matrix(c(1, 3, 2, 3), nrow = 2) my_square_matrix ## [,1] [,2] ## [1,] 1 2 ## [2,] 3 3 my_matrix[1,2] # first row, second column ## [1] 2 my_matrix[2, ] # second row ## [1] 5 4 2 my_matrix[ ,3] # third 
column ## [1] 1 2 A.4.5.1 Matrix functions and operations most operation element-wise mind the dimensions when using matrix multiplication %*% nrow(my_matrix) # number of matrix rows ## [1] 2 ncol(my_matrix) # number of matrix columns ## [1] 3 dim(my_matrix) # matrix dimension ## [1] 2 3 t(my_matrix) # transpose ## [,1] [,2] ## [1,] 1 5 ## [2,] 2 4 ## [3,] 1 2 diag(my_matrix) # the diagonal of the matrix as vector ## [1] 1 4 diag(1, nrow = 3) # creates a diagonal matrix ## [,1] [,2] [,3] ## [1,] 1 0 0 ## [2,] 0 1 0 ## [3,] 0 0 1 det(my_square_matrix) # matrix determinant ## [1] -3 my_matrix + 2 * my_matrix ## [,1] [,2] [,3] ## [1,] 3 6 3 ## [2,] 15 12 6 my_matrix * my_matrix # element-wise multiplication ## [,1] [,2] [,3] ## [1,] 1 4 1 ## [2,] 25 16 4 my_matrix %*% t(my_matrix) # matrix multiplication ## [,1] [,2] ## [1,] 6 15 ## [2,] 15 45 my_vec <- as.vector(my_matrix) # transform to vector my_vec ## [1] 1 5 2 4 1 2 A.4.6 Arrays multi-dimensional generalizations of matrices my_array <- array(c(1, 2, 3, 4, 5, 6, 7, 8), dim = c(2, 2, 2)) my_array[1, 1, 1] ## [1] 1 my_array[2, 2, 1] ## [1] 4 my_array[1, , ] ## [,1] [,2] ## [1,] 1 5 ## [2,] 3 7 dim(my_array) ## [1] 2 2 2 A.4.7 Data frames basic data structure for analysis differ from matrices as columns can be of different types student_data <- data.frame("Name" = student_names, "Age" = student_ages, "Pass" = passed) student_data ## Name Age Pass ## 1 Luke 20 TRUE ## 2 Jen 24 TRUE ## 3 Mike 21 FALSE colnames(student_data) <- c("name", "age", "pass") # change column names student_data[1, ] ## name age pass ## 1 Luke 20 TRUE student_data[ ,colnames(student_data) %in% c("name", "pass")] ## name pass ## 1 Luke TRUE ## 2 Jen TRUE ## 3 Mike FALSE student_data$pass # access column by name ## [1] TRUE TRUE FALSE student_data[student_data$pass == TRUE, ] ## name age pass ## 1 Luke 20 TRUE ## 2 Jen 24 TRUE A.4.8 Lists useful for storing different data structures access elements with double square brackets elements can be named first_list <- list(student_ages, my_matrix, student_data) second_list <- list(student_ages, my_matrix, student_data, first_list) first_list[[1]] ## [1] 20 24 21 second_list[[4]] ## [[1]] ## [1] 20 24 21 ## ## [[2]] ## [,1] [,2] [,3] ## [1,] 1 2 1 ## [2,] 5 4 2 ## ## [[3]] ## name age pass ## 1 Luke 20 TRUE ## 2 Jen 24 TRUE ## 3 Mike 21 FALSE second_list[[4]][[1]] # first element of the fourth element of second_list ## [1] 20 24 21 length(second_list) ## [1] 4 second_list[[length(second_list) + 1]] <- "add_me" # append an element names(first_list) <- c("Age", "Matrix", "Data") first_list$Age ## [1] 20 24 21 A.4.9 Loops mostly for loop for loop can iterate over an arbitrary vector # iterate over consecutive natural numbers my_sum <- 0 for (i in 1:10) { my_sum <- my_sum + i } my_sum ## [1] 55 # iterate over an arbirary vector my_sum <- 0 some_numbers <- c(2, 3.5, 6, 100) for (i in some_numbers) { my_sum <- my_sum + i } my_sum ## [1] 111.5 A.5 Functions for help use ?function_name A.5.1 Writing functions We can write our own functions with function(). In the brackets, we define the parameters the function gets, and in curly brackets we define what the function does. We use return() to return values. sum_first_n_elements <- function (n) { my_sum <- 0 for (i in 1:n) { my_sum <- my_sum + i } return (my_sum) } sum_first_n_elements(10) ## [1] 55 A.6 Other tips Use set.seed(arbitrary_number) at the beginning of a script to set the seed and ensure replication. 
To dynamically set the working directory in R Studio to the parent folder of a R script use setwd(dirname(rstudioapi::getSourceEditorContext()$path)). To avoid slow R loops use the apply family of functions. See ?apply and ?lapply. To make your data manipulation (and therefore your life) a whole lot easier, use the dplyr package. Use getAnywhere(function_name) to get the source code of any function. Use browser for debugging. See ?browser. A.7 Further reading and references Getting started with R Studio: https://www.youtube.com/watch?v=lVKMsaWju8w Official R manuals: https://cran.r-project.org/manuals.html Cheatsheets: https://www.rstudio.com/resources/cheatsheets/ Workshop on R, dplyr, ggplot2, and R Markdown: https://github.com/bstatcomp/Rworkshop "],["distributions.html", "B Probability distributions", " B Probability distributions Name parameters support pdf/pmf mean variance Bernoulli \\(p \\in [0,1]\\) \\(k \\in \\{0,1\\}\\) \\(p^k (1 - p)^{1 - k}\\) 1.12 \\(p\\) 7.1 \\(p(1-p)\\) 7.1 binomial \\(n \\in \\mathbb{N}\\), \\(p \\in [0,1]\\) \\(k \\in \\{0,1,\\dots,n\\}\\) \\(\\binom{n}{k} p^k (1 - p)^{n - k}\\) 4.4 \\(np\\) 7.2 \\(np(1-p)\\) 7.2 Poisson \\(\\lambda > 0\\) \\(k \\in \\mathbb{N}_0\\) \\(\\frac{\\lambda^k e^{-\\lambda}}{k!}\\) 4.6 \\(\\lambda\\) 7.3 \\(\\lambda\\) 7.3 geometric \\(p \\in (0,1]\\) \\(k \\in \\mathbb{N}_0\\) \\(p(1-p)^k\\) 4.5 \\(\\frac{1 - p}{p}\\) 7.4 \\(\\frac{1 - p}{p^2}\\) 9.3 normal \\(\\mu \\in \\mathbb{R}\\), \\(\\sigma^2 > 0\\) \\(x \\in \\mathbb{R}\\) \\(\\frac{1}{\\sqrt{2 \\pi \\sigma^2}} e^{-\\frac{(x - \\mu)^2}{2 \\sigma^2}}\\) 4.12 \\(\\mu\\) 7.8 \\(\\sigma^2\\) 7.8 uniform \\(a,b \\in \\mathbb{R}\\), \\(a < b\\) \\(x \\in [a,b]\\) \\(\\frac{1}{b-a}\\) 4.9 \\(\\frac{a+b}{2}\\) \\(\\frac{(b-a)^2}{12}\\) beta \\(\\alpha,\\beta > 0\\) \\(x \\in [0,1]\\) \\(\\frac{x^{\\alpha - 1} (1 - x)^{\\beta - 1}}{\\text{B}(\\alpha, \\beta)}\\) 4.10 \\(\\frac{\\alpha}{\\alpha + \\beta}\\) 7.6 \\(\\frac{\\alpha \\beta}{(\\alpha + \\beta)^2(\\alpha + \\beta + 1)}\\) 7.6 gamma \\(\\alpha,\\beta > 0\\) \\(x \\in (0, \\infty)\\) \\(\\frac{\\beta^\\alpha}{\\Gamma(\\alpha)} x^{\\alpha - 1}e^{-\\beta x}\\) 4.11 \\(\\frac{\\alpha}{\\beta}\\) 7.5 \\(\\frac{\\alpha}{\\beta^2}\\) 7.5 exponential \\(\\lambda > 0\\) \\(x \\in [0, \\infty)\\) \\(\\lambda e^{-\\lambda x}\\) 4.8 \\(\\frac{1}{\\lambda}\\) 7.7 \\(\\frac{1}{\\lambda^2}\\) 7.7 logistic \\(\\mu \\in \\mathbb{R}\\), \\(s > 0\\) \\(x \\in \\mathbb{R}\\) \\(\\frac{e^{-\\frac{x - \\mu}{s}}}{s(1 + e^{-\\frac{x - \\mu}{s}})^2}\\) 4.13 \\(\\mu\\) \\(\\frac{s^2 \\pi^2}{3}\\) negative binomial \\(r \\in \\mathbb{N}\\), \\(p \\in [0,1]\\) \\(k \\in \\mathbb{N}_0\\) \\(\\binom{k + r - 1}{k}(1-p)^r p^k\\) 4.7 \\(\\frac{rp}{1 - p}\\) 9.2 \\(\\frac{rp}{(1 - p)^2}\\) 9.2 multinomial \\(n \\in \\mathbb{N}\\), \\(k \\in \\mathbb{N}\\) \\(p_i \\in [0,1]\\), \\(\\sum p_i = 1\\) \\(x_i \\in \\{0,..., n\\}\\), \\(i \\in \\{1,...,k\\}\\), \\(\\sum{x_i} = n\\) \\(\\frac{n!}{x_1!x_2!...x_k!} p_1^{x_1} p_2^{x_2}...p_k^{x_k}\\) 8.1 \\(np_i\\) \\(np_i(1-p_i)\\) "],["references.html", "References", " References "],["404.html", "Page not found", " Page not found The page you requested cannot be found (perhaps it was moved or renamed). You may want to try searching to find the page's new location, or use the table of contents to find the page you are looking for. 
"]] +[["index.html", "Principles of Uncertainty – exercises Preface", " Principles of Uncertainty – exercises Gregor Pirš and Erik Štrumbelj 2023-09-28 Preface These are the exercises for the Principles of Uncertainty course of the Data Science Master’s at University of Ljubljana, Faculty of Computer and Information Science. This document will be extended each week as the course progresses. At the end of each exercise session, we will post the solutions to the exercises worked in class and select exercises for homework. Students are also encouraged to solve the remaining exercises to further extend their knowledge. Some exercises require the use of R. Those exercises (or parts of) are coloured blue. Students that are not familiar with R programming language should study A to learn the basics. As the course progresses, we will cover more relevant uses of R for data science. "],["introduction.html", "Chapter 1 Probability spaces 1.1 Measure and probability spaces 1.2 Properties of probability measures 1.3 Discrete probability spaces", " Chapter 1 Probability spaces This chapter deals with measures and probability spaces. At the end of the chapter, we look more closely at discrete probability spaces. The students are expected to acquire the following knowledge: Theoretical Use properties of probability to calculate probabilities. Combinatorics. Understanding of continuity of probability. R Vectors and vector operations. For loop. Estimating probability with simulation. sample function. Matrices and matrix operations. .fold-btn { float: right; margin: 5px 5px 0 0; } .fold { border: 1px solid black; min-height: 40px; } 1.1 Measure and probability spaces Exercise 1.1 (Completing a set to a sigma algebra) Let \\(\\Omega = \\{1,2,...,10\\}\\) and let \\(A = \\{\\emptyset, \\{1\\}, \\{2\\}, \\Omega \\}\\). Show that \\(A\\) is not a sigma algebra of \\(\\Omega\\). Find the minimum number of elements to complete A to a sigma algebra of \\(\\Omega\\). Solution. \\(1^c = \\{2,3,...,10\\} \\notin A \\implies\\) \\(A\\) is not sigma algebra. First we need the complements of all elements, so we need to add sets \\(\\{2,3,...,10\\}\\) and \\(\\{1,3,4,...,10\\}\\). Next we need unions of all sets – we add the set \\(\\{1,2\\}\\). Again we need the complement of this set, so we add \\(\\{3,4,...,10\\}\\). So the minimum number of elements we need to add is 4. Exercise 1.2 (Diversity of sigma algebras) Let \\(\\Omega\\) be a set. Find the smallest sigma algebra of \\(\\Omega\\). Find the largest sigma algebra of \\(\\Omega\\). Solution. \\(A = \\{\\emptyset, \\Omega\\}\\) \\(2^{\\Omega}\\) Exercise 1.3 Find all sigma algebras for \\(\\Omega = \\{0, 1, 2\\}\\). Solution. \\(A = \\{\\emptyset, \\Omega\\}\\) \\(A = 2^{\\Omega}\\) \\(A = \\{\\emptyset, \\{0\\}, \\{1,2\\}, \\Omega\\}\\) \\(A = \\{\\emptyset, \\{1\\}, \\{0,2\\}, \\Omega\\}\\) \\(A = \\{\\emptyset, \\{2\\}, \\{0,1\\}, \\Omega\\}\\) Exercise 1.4 (Difference between algebra and sigma algebra) Let \\(\\Omega = \\mathbb{N}\\) and \\(\\mathcal{A} = \\{A \\subseteq \\mathbb{N}: A \\text{ is finite or } A^c \\text{ is finite.} \\}\\). Show that \\(\\mathcal{A}\\) is an algebra but not a sigma algebra. Solution. \\(\\emptyset\\) is finite so \\(\\emptyset \\in \\mathcal{A}\\). Let \\(A \\in \\mathcal{A}\\) and \\(B \\in \\mathcal{A}\\). If both are finite, then their union is also finite and therefore in \\(\\mathcal{A}\\). Let at least one of them not be finite. Then their union is not finite. But \\((A \\cup B)^c = A^c \\cap B^c\\). 
And since at least one is infinite, then its complement is finite and the intersection is too. So finite unions are in \\(\\mathcal{A}\\). Let us look at numbers \\(2n\\). For any \\(n\\), \\(2n \\in \\mathcal{A}\\) as it is finite. But \\(\\bigcup_{k = 1}^{\\infty} 2n \\notin \\mathcal{A}\\). Exercise 1.5 We define \\(\\sigma(X) = \\cap_{\\lambda \\in I} S_\\lambda\\) to be a sigma algebra, generated by the set \\(X\\), where \\(S_\\lambda\\) are all sigma algebras such that \\(X \\subseteq S_\\lambda\\). \\(S_\\lambda\\) are indexed by \\(\\lambda \\in I\\). Let \\(A, B \\subseteq 2^{\\Omega}\\). Prove that \\(\\sigma(A) = \\sigma(B) \\iff A \\subseteq \\sigma(B) \\land B \\subseteq \\sigma(A)\\). Solution. To prove the equivalence, we need to prove that the left hand side implies the right hand side and vice versa. Proving \\(\\sigma(A) = \\sigma(B) \\Rightarrow A \\subseteq \\sigma(B) \\land B \\subseteq \\sigma(A)\\): we know \\(A \\subseteq \\sigma(A)\\) is always true, so by substituting in \\(\\sigma(B)\\) from the left hand side equality we obtain \\(A \\subseteq \\sigma(B)\\). We obtain \\(B \\subseteq \\sigma(A)\\) by symmetry. This proves the implication. Proving \\(A \\subseteq \\sigma(B) \\land B \\subseteq \\sigma(A) \\Rightarrow \\sigma(A) = \\sigma(B)\\): by definition of a sigma algebra, generated by a set, we have \\(\\sigma(B) = \\cap_{\\lambda \\in I} S_\\lambda\\) where \\(S_\\lambda\\) are all sigma algebras where \\(B \\subseteq S_\\lambda\\). But \\(\\sigma(A)\\) is one of \\(S_\\lambda\\), so we can write \\(\\sigma(B) = \\sigma(A) \\cap \\left(\\cap_{\\lambda \\in I} S_\\lambda \\right)\\), which implies \\(\\sigma(B) \\subseteq \\sigma(A)\\). By symmetry, we have \\(\\sigma(A) \\subseteq \\sigma(B)\\). Since \\(\\sigma(A) \\subseteq \\sigma(B)\\) and \\(\\sigma(B) \\subseteq \\sigma(A)\\), we obtain \\(\\sigma(A) = \\sigma(B)\\), which proves the implication and completes the equivalence proof. Exercise 1.6 (Intro to measure) Take the measurable space \\(\\Omega = \\{1,2\\}\\), \\(F = 2^{\\Omega}\\). Which of the following is a measure? Which is a probability measure? \\(\\mu(\\emptyset) = 0\\), \\(\\mu(\\{1\\}) = 5\\), \\(\\mu(\\{2\\}) = 6\\), \\(\\mu(\\{1,2\\}) = 11\\) \\(\\mu(\\emptyset) = 0\\), \\(\\mu(\\{1\\}) = 0\\), \\(\\mu(\\{2\\}) = 0\\), \\(\\mu(\\{1,2\\}) = 1\\) \\(\\mu(\\emptyset) = 0\\), \\(\\mu(\\{1\\}) = 0\\), \\(\\mu(\\{2\\}) = 0\\), \\(\\mu(\\{1,2\\}) = 0\\) \\(\\mu(\\emptyset) = 0\\), \\(\\mu(\\{1\\}) = 0\\), \\(\\mu(\\{2\\}) = 1\\), \\(\\mu(\\{1,2\\}) = 1\\) \\(\\mu(\\emptyset)=0\\), \\(\\mu(\\{1\\})=0\\), \\(\\mu(\\{2\\})=\\infty\\), \\(\\mu(\\{1,2\\})=\\infty\\) Solution. Measure. Not probability measure since \\(\\mu(\\Omega) > 1\\). Neither due to countable additivity. Measure. Not probability measure since \\(\\mu(\\Omega) = 0\\). Probability measure. Measure. Not probability measure since \\(\\mu(\\Omega) > 1\\). Exercise 1.7 Define a probability space that could be used to model the outcome of throwing two fair 6-sided dice. Solution. \\(\\Omega = \\{\\{i,j\\}, i = 1,...,6, j = 1,...,6\\}\\) \\(F = 2^{\\Omega}\\) \\(\\forall \\omega \\in \\Omega\\), \\(P(\\omega) = \\frac{1}{6} \\times \\frac{1}{6} = \\frac{1}{36}\\) 1.2 Properties of probability measures Exercise 1.8 A standard deck (52 cards) is distributed to two persons: 26 cards to each person. All partitions are equally likely. Find the probability that: The first person gets 4 Queens. The first person gets at least 2 Queens. R: Use simulation (sample) to check the above answers. 
Solution. \\(\\frac{\\binom{48}{22}}{\\binom{52}{26}}\\) 1 - \\(\\frac{\\binom{48}{26} + 4 \\times \\binom{48}{25}}{\\binom{52}{26}}\\) For the simulation, let us represent cards with numbers from 1 to 52, and let 1 through 4 represent Queens. set.seed(1) cards <- 1:52 n <- 10000 q4 <- vector(mode = "logical", length = n) q2 <- vector(mode = "logical", length = n) tmp <- vector(mode = "logical", length = n) for (i in 1:n) { p1 <- sample(1:52, 26) q4[i] <- sum(1:4 %in% p1) == 4 q2[i] <- sum(1:4 %in% p1) >= 2 } sum(q4) / n ## [1] 0.0572 sum(q2) / n ## [1] 0.6894 Exercise 1.9 Let \\(A\\) and \\(B\\) be events with probabilities \\(P(A) = \\frac{2}{3}\\) and \\(P(B) = \\frac{1}{2}\\). Show that \\(\\frac{1}{6} \\leq P(A\\cap B) \\leq \\frac{1}{2}\\), and give examples to show that both extremes are possible. Find corresponding bounds for \\(P(A\\cup B)\\). R: Draw samples from the examples and show the probability bounds of \\(P(A \\cap B)\\) . Solution. From the properties of probability we have \\[\\begin{equation} P(A \\cup B) = P(A) + P(B) - P(A \\cap B) \\leq 1. \\end{equation}\\] From this follows \\[\\begin{align} P(A \\cap B) &\\geq P(A) + P(B) - 1 \\\\ &= \\frac{2}{3} + \\frac{1}{2} - 1 \\\\ &= \\frac{1}{6}, \\end{align}\\] which is the lower bound for the intersection. Conversely, we have \\[\\begin{equation} P(A \\cup B) = P(A) + P(B) - P(A \\cap B) \\geq P(A). \\end{equation}\\] From this follows \\[\\begin{align} P(A \\cap B) &\\leq P(B) \\\\ &= \\frac{1}{2}, \\end{align}\\] which is the upper bound for the intersection. For an example take a fair die. To achieve the lower bound let \\(A = \\{3,4,5,6\\}\\) and \\(B = \\{1,2,3\\}\\), then their intersection is \\(A \\cap B = \\{3\\}\\). To achieve the upper bound take \\(A = \\{1,2,3,4\\}\\) and $B = {1,2,3} $. For the bounds of the union we will use the results from the first part. Again from the properties of probability we have \\[\\begin{align} P(A \\cup B) &= P(A) + P(B) - P(A \\cap B) \\\\ &\\geq P(A) + P(B) - \\frac{1}{2} \\\\ &= \\frac{2}{3}. \\end{align}\\] Conversely \\[\\begin{align} P(A \\cup B) &= P(A) + P(B) - P(A \\cap B) \\\\ &\\leq P(A) + P(B) - \\frac{1}{6} \\\\ &= 1. \\end{align}\\] Therefore \\(\\frac{2}{3} \\leq P(A \\cup B) \\leq 1\\). We use sample in R: set.seed(1) n <- 10000 samps <- sample(1:6, n, replace = TRUE) # lower bound lb <- vector(mode = "logical", length = n) A <- c(1,2,3) B <- c(3,4,5,6) for (i in 1:n) { lb[i] <- samps[i] %in% A & samps[i] %in% B } sum(lb) / n ## [1] 0.1605 # upper bound ub <- vector(mode = "logical", length = n) A <- c(1,2,3) B <- c(1,2,3,4) for (i in 1:n) { ub[i] <- samps[i] %in% A & samps[i] %in% B } sum(ub) / n ## [1] 0.4913 Exercise 1.10 A fair coin is tossed repeatedly. Show that, with probability one, a head turns up sooner or later. Show similarly that any given finite sequence of heads and tails occurs eventually with probability one. Solution. \\[\\begin{align} P(\\text{no heads}) &= \\lim_{n \\rightarrow \\infty} P(\\text{no heads in first }n \\text{ tosses}) \\\\ &= \\lim_{n \\rightarrow \\infty} \\frac{1}{2^n} \\\\ &= 0. \\end{align}\\] For the second part, let us fix the given sequence of heads and tails of length \\(k\\) as \\(s\\). A probability that this happens in \\(k\\) tosses is \\(\\frac{1}{2^k}\\). 
\\[\\begin{align} P(s \\text{ occurs}) &= \\lim_{n \\rightarrow \\infty} P(s \\text{ occurs in first } nk \\text{ tosses}) \\end{align}\\] The right part of the upper equation is greater than if \\(s\\) occurs either in the first \\(k\\) tosses, second \\(k\\) tosses,…, \\(n\\)-th \\(k\\) tosses. Therefore \\[\\begin{align} P(s \\text{ occurs}) &\\geq \\lim_{n \\rightarrow \\infty} P(s \\text{ occurs in first } n \\text{ disjoint sequences of length } k) \\\\ &= \\lim_{n \\rightarrow \\infty} (1 - P(s \\text{ does not occur in first } n \\text{ disjoint sequences})) \\\\ &= 1 - \\lim_{n \\rightarrow \\infty} P(s \\text{ does not occur in first } n \\text{ disjoint sequences}) \\\\ &= 1 - \\lim_{n \\rightarrow \\infty} (1 - \\frac{1}{2^k})^n \\\\ &= 1. \\end{align}\\] Exercise 1.11 An Erdos-Renyi random graph \\(G(n,p)\\) is a model with \\(n\\) nodes, where each pair of nodes is connected with probability \\(p\\). Calculate the probability that there exists a node that is not connected to any other node in \\(G(4,0.6)\\). Show that the upper bound for the probability that there exist 2 nodes that are not connected to any other node for an arbitrary \\(G(n,p)\\) is \\(\\binom{n}{2} (1-p)^{2n - 3}\\). R: Estimate the probability from the first point using simulation. Solution. Let \\(A_i\\) be the event that the \\(i\\)-th node is not connected to any other node. Then our goal is to calculate \\(P(\\cup_{i=1}^n A_i)\\). Using the inclusion-exclusion principle, we get \\[\\begin{align} P(\\cup_{i=1}^n A_i) &= \\sum_i A_i - \\sum_{i<j} P(A_i \\cap A_j) + \\sum_{i<j<k} P(A_i \\cap A_j \\cap A_k) - P(A_1 \\cap A_2 \\cap A_3 \\cap A_4) \\\\ &=4 (1 - p)^3 - \\binom{4}{2} (1 - p)^5 + \\binom{4}{3} (1 - p)^6 - (1 - p)^6 \\\\ &\\approx 0.21. \\end{align}\\] Let \\(A_{ij}\\) be the event that nodes \\(i\\) and \\(j\\) are not connected to any other node. We are interested in \\(P(\\cup_{i<j}A_{ij})\\). By using Boole`s inequality, we get \\[\\begin{align} P(\\cup_{i<j}A_{ij}) \\leq \\sum_{i<j} P(A_{ij}). \\end{align}\\] What is the probability of \\(A_{ij}\\)? There need to be no connections to the \\(i\\)-th node to the remaining nodes (excluding \\(j\\)), the same for the \\(j\\)-th node, and there can be no connection between them. Therefore \\[\\begin{align} P(\\cup_{i<j}A_{ij}) &\\leq \\sum_{i<j} (1 - p)^{2(n-2) + 1} \\\\ &= \\binom{n}{2} (1 - p)^{2n - 3}. \\end{align}\\] set.seed(1) n_samp <- 100000 n <- 4 p <- 0.6 conn_samp <- vector(mode = "logical", length = n_samp) for (i in 1:n_samp) { tmp_mat <- matrix(data = 0, nrow = n, ncol = n) samp_conn <- sample(c(0,1), choose(4,2), replace = TRUE, prob = c(1 - p, p)) tmp_mat[lower.tri(tmp_mat)] <- samp_conn tmp_mat[upper.tri(tmp_mat)] <- t(tmp_mat)[upper.tri(t(tmp_mat))] not_conn <- apply(tmp_mat, 1, sum) if (any(not_conn == 0)) { conn_samp[i] <- TRUE } else { conn_samp[i] <- FALSE } } sum(conn_samp) / n_samp ## [1] 0.20565 1.3 Discrete probability spaces Exercise 1.12 Show that the standard measurable space on \\(\\Omega = \\{0,1,...,n\\}\\) equipped with binomial measure is a discrete probability space. Define another probability measure on this measurable space. Show that for \\(n=1\\) the binomial measure is the same as the Bernoulli measure. R: Draw 1000 samples from the binomial distribution \\(p=0.5\\), \\(n=20\\) (rbinom) and compare relative frequencies with theoretical probability measure. Solution. We need to show that the terms of \\(\\sum_{k=0}^n \\binom{n}{k} p^k (1 - p)^{n - k}\\) sum to 1. 
For that we use the binomial theorem \\(\\sum_{k=0}^n \\binom{n}{k} x^k y^{n-k} = (x + y)^n\\). So \\[\\begin{equation} \\sum_{k=0}^n \\binom{n}{k} p^k (1 - p)^{n - k} = (p + 1 - p)^n = 1. \\end{equation}\\] \\(P(\\{k\\}) = \\frac{1}{n + 1}\\). When \\(n=1\\) then \\(k \\in \\{0,1\\}\\). Inserting \\(n=1\\) into the binomial measure, we get \\(\\binom{1}{k}p^k (1-p)^{1 - k}\\). Now \\(\\binom{1}{1} = \\binom{1}{0} = 1\\), so the measure is \\(p^k (1-p)^{1 - k}\\), which is the Bernoulli measure. set.seed(1) library(ggplot2) library(dplyr) bin_samp <- rbinom(n = 1000, size = 20, prob = 0.5) bin_samp <- data.frame(x = bin_samp) %>% count(x) %>% mutate(n = n / 1000, type = "empirical_frequencies") %>% bind_rows(data.frame(x = 0:20, n = dbinom(0:20, size = 20, prob = 0.5), type = "theoretical_measure")) bin_plot <- ggplot(data = bin_samp, aes(x = x, y = n, fill = type)) + geom_bar(stat="identity", position = "dodge") plot(bin_plot) Exercise 1.13 Show that the standard measurable space on \\(\\Omega = \\{0,1,...,\\infty\\}\\) equipped with geometric measure is a discrete probability space, equipped with Poisson measure is a discrete probability space. Define another probability measure on this measurable space. R: Draw 1000 samples from the Poisson distribution \\(\\lambda = 10\\) (rpois) and compare relative frequencies with theoretical probability measure. Solution. \\(\\sum_{k = 0}^{\\infty} p(1 - p)^k = p \\sum_{k = 0}^{\\infty} (1 - p)^k = p \\frac{1}{1 - 1 + p} = 1\\). We used the formula for geometric series. \\(\\sum_{k = 0}^{\\infty} \\frac{\\lambda^k e^{-\\lambda}}{k!} = e^{-\\lambda} \\sum_{k = 0}^{\\infty} \\frac{\\lambda^k}{k!} = e^{-\\lambda} e^{\\lambda} = 1.\\) We used the Taylor expansion of the exponential function. Since we only have to define a probability measure, we could only assign probabilities that sum to one to a finite number of events in \\(\\Omega\\), and probability zero to the other infinite number of events. However to make this solution more educational, we will try to find a measure that assigns a non-zero probability to all events in \\(\\Omega\\). A good start for this would be to find a converging infinite series, as the probabilities will have to sum to one. One simple converging series is the geometric series \\(\\sum_{k=0}^{\\infty} p^k\\) for \\(|p| < 1\\). Let us choose an arbitrary \\(p = 0.5\\). Then \\(\\sum_{k=0}^{\\infty} p^k = \\frac{1}{1 - 0.5} = 2\\). To complete the measure, we have to normalize it, so it sums to one, therefore \\(P(\\{k\\}) = \\frac{0.5^k}{2}\\) is a probability measure on \\(\\Omega\\). We could make it even more difficult by making this measure dependent on some parameter \\(\\alpha\\), but this is out of the scope of this introductory chapter. set.seed(1) pois_samp <- rpois(n = 1000, lambda = 10) pois_samp <- data.frame(x = pois_samp) %>% count(x) %>% mutate(n = n / 1000, type = "empirical_frequencies") %>% bind_rows(data.frame(x = 0:25, n = dpois(0:25, lambda = 10), type = "theoretical_measure")) pois_plot <- ggplot(data = pois_samp, aes(x = x, y = n, fill = type)) + geom_bar(stat="identity", position = "dodge") plot(pois_plot) Exercise 1.14 Define a probability measure on \\((\\Omega = \\mathbb{Z}, 2^{\\mathbb{Z}})\\). Define a probability measure such that \\(P(\\omega) > 0, \\forall \\omega \\in \\Omega\\). R: Implement a random generator that will generate samples with the relative frequency that corresponds to your probability measure. Compare relative frequencies with theoretical probability measure . Solution. 
\\(P(0) = 1, P(\\omega) = 0, \\forall \\omega \\neq 0\\). \\(P(\\{k\\}) = \\sum_{k = -\\infty}^{\\infty} \\frac{p(1 - p)^{|k|}}{2^{1 - 1_0(k)}}\\), where \\(1_0(k)\\) is the indicator function, which equals to one if \\(k\\) is 0, and equals to zero in every other case. n <- 1000 geom_samps <- rgeom(n, prob = 0.5) sign_samps <- sample(c(FALSE, TRUE), size = n, replace = TRUE) geom_samps[sign_samps] <- -geom_samps[sign_samps] my_pmf <- function (k, p) { indic <- rep(1, length(k)) indic[k == 0] <- 0 return ((p * (1 - p)^(abs(k))) / 2^indic) } geom_samps <- data.frame(x = geom_samps) %>% count(x) %>% mutate(n = n / 1000, type = "empirical_frequencies") %>% bind_rows(data.frame(x = -10:10, n = my_pmf(-10:10, 0.5), type = "theoretical_measure")) geom_plot <- ggplot(data = geom_samps, aes(x = x, y = n, fill = type)) + geom_bar(stat="identity", position = "dodge") plot(geom_plot) Exercise 1.15 Define a probability measure on \\(\\Omega = \\{1,2,3,4,5,6\\}\\) with parameter \\(m \\in \\{1,2,3,4,5,6\\}\\), so that the probability of outcome at distance \\(1\\) from \\(m\\) is half of the probability at distance \\(0\\), at distance \\(2\\) is half of the probability at distance \\(1\\), etc. R: Implement a random generator that will generate samples with the relative frequency that corresponds to your probability measure. Compare relative frequencies with theoretical probability measure . Solution. \\(P(\\{k\\}) = \\frac{\\frac{1}{2}^{|m - k|}}{\\sum_{i=1}^6 \\frac{1}{2}^{|m - i|}}\\) n <- 10000 m <- 4 my_pmf <- function (k, m) { denom <- sum(0.5^abs(m - 1:6)) return (0.5^abs(m - k) / denom) } samps <- c() for (i in 1:n) { a <- sample(1:6, 1) a_val <- my_pmf(a, m) prob <- runif(1) if (prob < a_val) { samps <- c(samps, a) } } samps <- data.frame(x = samps) %>% count(x) %>% mutate(n = n / length(samps), type = "empirical_frequencies") %>% bind_rows(data.frame(x = 1:6, n = my_pmf(1:6, m), type = "theoretical_measure")) my_plot <- ggplot(data = samps, aes(x = x, y = n, fill = type)) + geom_bar(stat="identity", position = "dodge") plot(my_plot) "],["uprobspaces.html", "Chapter 2 Uncountable probability spaces 2.1 Borel sets 2.2 Lebesgue measure", " Chapter 2 Uncountable probability spaces This chapter deals with uncountable probability spaces. The students are expected to acquire the following knowledge: Theoretical Understand Borel sets and identify them. Estimate Lebesgue measure for different sets. Know when sets are Borel-measurable. Understanding of countable and uncountable sets. R Uniform sampling. .fold-btn { float: right; margin: 5px 5px 0 0; } .fold { border: 1px solid black; min-height: 40px; } 2.1 Borel sets Exercise 2.1 Prove that the intersection of two sigma algebras on \\(\\Omega\\) is a sigma algebra. Prove that the collection of all open subsets \\((a,b)\\) on \\((0,1]\\) is not a sigma algebra of \\((0,1]\\). Solution. Empty set: \\[\\begin{equation} \\emptyset \\in \\mathcal{A} \\wedge \\emptyset \\in \\mathcal{B} \\Rightarrow \\emptyset \\in \\mathcal{A} \\cap \\mathcal{B} \\end{equation}\\] Complement: \\[\\begin{equation} \\text{Let } A \\in \\mathcal{A} \\cap \\mathcal{B} \\Rightarrow A \\in \\mathcal{A} \\wedge A \\in \\mathcal{B} \\Rightarrow A^c \\in \\mathcal{A} \\wedge A^c \\in \\mathcal{B} \\Rightarrow A^c \\in \\mathcal{A} \\cap \\mathcal{B} \\end{equation}\\] Countable additivity: Let \\(\\{A_i\\}\\) be a countable sequence of subsets in \\(\\mathcal{A} \\cap \\mathcal{B}\\). 
\\[\\begin{equation} \\forall i: A_i \\in \\mathcal{A} \\cap \\mathcal{B} \\Rightarrow A_i \\in \\mathcal{A} \\wedge A_i \\in \\mathcal{B} \\Rightarrow \\cup A_i \\in \\mathcal{A} \\wedge \\cup A_i \\in \\mathcal{B} \\Rightarrow \\cup A_i \\in \\mathcal{A} \\cap \\mathcal{B} \\end{equation}\\] Let \\(A\\) denote the collection of all open subsets \\((a,b)\\) on \\((0,1]\\). Then \\((0,1) \\in A\\). But \\((0,1)^c = 1 \\notin A\\). Exercise 2.2 Show that \\(\\mathcal{C} = \\sigma(\\mathcal{C})\\) if and only if \\(\\mathcal{C}\\) is a sigma algebra. Solution. “\\(\\Rightarrow\\)” This follows from the definition of a generated sigma algebra. “\\(\\Leftarrow\\)” Let \\(\\mathcal{F} = \\cap_i F_i\\) be the intersection of all sigma algebras that contain \\(\\mathcal{C}\\). Then \\(\\sigma(\\mathcal{C}) = \\mathcal{F}\\). Additionally, \\(\\forall i: \\mathcal{C} \\in F_i\\). So each \\(F_i\\) can be written as \\(F_i = \\mathcal{C} \\cup D\\), where \\(D\\) are the rest of the elements in the sigma algebra. In other words, each sigma algebra in the collection contains at least \\(\\mathcal{C}\\), but can contain other elements. Now for some \\(j\\), \\(F_j = \\mathcal{C}\\) as \\(\\{F_i\\}\\) contains all sigma algebras that contain \\(\\mathcal{C}\\) and \\(\\mathcal{C}\\) is such a sigma algebra. Since this is the smallest subset in the intersection it follows that \\(\\sigma(\\mathcal{C}) = \\mathcal{F} = \\mathcal{C}\\). Exercise 2.3 Let \\(\\mathcal{C}\\) and \\(\\mathcal{D}\\) be two collections of subsets on \\(\\Omega\\) such that \\(\\mathcal{C} \\subset \\mathcal{D}\\). Prove that \\(\\sigma(\\mathcal{C}) \\subseteq \\sigma(\\mathcal{D})\\). Solution. \\(\\sigma(\\mathcal{D})\\) is a sigma algebra that contains \\(\\mathcal{D}\\). It follows that \\(\\sigma(\\mathcal{D})\\) is a sigma algebra that contains \\(\\mathcal{C}\\). Let us write \\(\\sigma(\\mathcal{C}) = \\cap_i F_i\\), where \\(\\{F_i\\}\\) is the collection of all sigma algebras that contain \\(\\mathcal{C}\\). Since \\(\\sigma(\\mathcal{D})\\) is such a sigma algebra, there exists an index \\(j\\), so that \\(F_j = \\sigma(\\mathcal{D})\\). Then we can write \\[\\begin{align} \\sigma(\\mathcal{C}) &= (\\cap_{i \\neq j} F_i) \\cap \\sigma(\\mathcal{D}) \\\\ &\\subseteq \\sigma(\\mathcal{D}). \\end{align}\\] Exercise 2.4 Prove that the following subsets of \\((0,1]\\) are Borel-measurable by finding their measure. Any countable set. The set of numbers in (0,1] whose decimal expansion does not contain 7. Solution. This follows directly from the fact that every countable set is a union of singletons, whose measure is 0. Let us first look at numbers which have a 7 as the first decimal numbers. Their measure is 0.1. Then we take all the numbers with a 7 as the second decimal number (excluding those who already have it as the first). These have the measure 0.01, and there are 9 of them, so their total measure is 0.09. We can continue to do so infinitely many times. At each \\(n\\), we have the measure of the intervals which is \\(10^n\\) and the number of those intervals is \\(9^{n-1}\\). Now \\[\\begin{align} \\lambda(A) &= 1 - \\sum_{n = 0}^{\\infty} \\frac{9^n}{10^{n+1}} \\\\ &= 1 - \\frac{1}{10} \\sum_{n = 0}^{\\infty} (\\frac{9}{10})^n \\\\ &= 1 - \\frac{1}{10} \\frac{10}{1} \\\\ &= 0. \\end{align}\\] Since we have shown that the measure of the set is \\(0\\), we have also shown that the set is measurable. 
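As a purely numerical illustration of b) (our addition, not part of the original solution), the sketch below estimates the measure of the set of numbers in (0,1] whose first N decimal digits avoid the digit 7. That measure is (9/10)^N, which shrinks towards zero as N grows, in line with the result above. The helper no_seven_first_N and the chosen values of N are ours, and the digit extraction is only reliable for moderate N because of floating-point precision.

```r
# Monte Carlo illustration (added sketch): forbid the digit 7 only among the
# first N decimal digits; the surviving set has measure (9/10)^N, which tends
# to zero, consistent with the measure of the full set being 0.
set.seed(1)
nsamps <- 10000
no_seven_first_N <- function (x, N) {
  # first N decimal digits of x (reliable only for moderate N)
  digits <- floor(x * 10^(1:N)) %% 10
  all(digits != 7)
}
for (N in c(2, 4, 6, 8)) {
  u <- runif(nsamps)
  est <- mean(sapply(u, no_seven_first_N, N = N))
  cat(sprintf("N = %d: empirical %.4f, (9/10)^N = %.4f\n", N, est, 0.9^N))
}
```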
Exercise 2.5 Let \\(\\Omega = [0,1]\\), and let \\(\\mathcal{F}_3\\) consist of all countable subsets of \\(\\Omega\\), and all subsets of \\(\\Omega\\) having a countable complement. Show that \\(\\mathcal{F}_3\\) is a sigma algebra. Let us define \\(P(A)=0\\) if \\(A\\) is countable, and \\(P(A) = 1\\) if \\(A\\) has a countable complement. Is \\((\\Omega, \\mathcal{F}_3, P)\\) a legitimate probability space? Solution. The empty set is countable, therefore it is in \\(\\mathcal{F}_3\\). For any \\(A \\in \\mathcal{F}_3\\). If \\(A\\) is countable, then \\(A^c\\) has a countable complement and is in \\(\\mathcal{F}_3\\). If \\(A\\) is uncountable, then it has a countable complement \\(A^c\\) which is therefore also in \\(\\mathcal{F}_3\\). We are left with showing countable additivity. Let \\(\\{A_i\\}\\) be an arbitrary collection of sets in \\(\\mathcal{F}_3\\). We will look at two possibilities. First let all \\(A_i\\) be countable. A countable union of countable sets is countable, and therefore in \\(\\mathcal{F}_3\\). Second, let at least one \\(A_i\\) be uncountable. It follows that it has a countable complement. We can write \\[\\begin{equation} (\\cup_{i=1}^{\\infty} A_i)^c = \\cap_{i=1}^{\\infty} A_i^c. \\end{equation}\\] Since at least one \\(A_i^c\\) on the right side is countable, the whole intersection is countable, and therefore the union has a countable complement. It follows that the union is in \\(\\mathcal{F}_3\\). The tuple \\((\\Omega, \\mathcal{F}_3)\\) is a measurable space. Therefore, we only need to check whether \\(P\\) is a probability measure. The measure of the empty set is zero as it is countable. We have to check for countable additivity. Let us look at three situations. Let \\(A_i\\) be disjoint sets. First, let all \\(A_i\\) be countable. \\[\\begin{equation} P(\\cup_{i=1}^{\\infty} A_i) = \\sum_{i=1}^{\\infty}P( A_i)) = 0. \\end{equation}\\] Since the union is countable, the above equation holds. Second, let exactly one \\(A_i\\) be uncountable. W.L.O.G. let that be \\(A_1\\). Then \\[\\begin{equation} P(\\cup_{i=1}^{\\infty} A_i) = 1 + \\sum_{i=2}^{\\infty}P( A_i)) = 1. \\end{equation}\\] Since the union is uncountable, the above equation holds. Third, let at least two \\(A_i\\) be uncountable. We have to check whether it is possible for two uncountable sets in \\(\\mathcal{F}_3\\) to be disjoint. If that is possible, then their measures would sum to more than one and \\(P\\) would not be a probability measure. W.L.O.G. let \\(A_1\\) and \\(A_2\\) be uncountable. Then we have \\[\\begin{equation} A_1 \\cap A_2 = (A_1^c \\cup A_2^c)^c. \\end{equation}\\] Now \\(A_1^c\\) and \\(A_2^c\\) are countable and their union is therefore countable. Let \\(B = A_1^c \\cup A_2^c\\). So the intersection of \\(A_1\\) and \\(A_2\\) equals the complement of \\(B\\), which is countable. For the intersection to be the empty set, \\(B\\) would have to equal to \\(\\Omega\\). But \\(\\Omega\\) is uncountable and therefore \\(B\\) can not equal to \\(\\Omega\\). It follows that two uncountable sets in \\(\\mathcal{F}_3\\) can not have an empty intersection. Therefore the tuple is a legitimate probability space. 2.2 Lebesgue measure Exercise 2.6 Show that the Lebesgue measure of rational numbers on \\([0,1]\\) is 0. R: Implement a random number generator, which generates uniform samples of irrational numbers in \\([0,1]\\) by uniformly sampling from \\([0,1]\\) and rejecting a sample if it is rational. Solution. There are a countable number of rational numbers. 
Therefore, we can write \\[\\begin{align} \\lambda(\\mathbb{Q}) &= \\lambda(\\cup_{i = 1}^{\\infty} q_i) &\\\\ &= \\sum_{i = 1}^{\\infty} \\lambda(q_i) &\\text{ (countable additivity)} \\\\ &= \\sum_{i = 1}^{\\infty} 0 &\\text{ (Lebesgue measure of a singleton)} \\\\ &= 0. \\end{align}\\] Exercise 2.7 Prove that the Lebesgue measure of \\(\\mathbb{R}\\) is infinity. Paradox. Show that the cardinality of \\(\\mathbb{R}\\) and \\((0,1)\\) is the same, while their Lebesgue measures are infinity and one respectively. Solution. Let \\(a_i\\) be the \\(i\\)-th integer for \\(i \\in \\mathbb{Z}\\). We can write \\(\\mathbb{R} = \\cup_{-\\infty}^{\\infty} (a_i, a_{i + 1}]\\). \\[\\begin{align} \\lambda(\\mathbb{R}) &= \\lambda(\\cup_{i = -\\infty}^{\\infty} (a_i, a_{i + 1}]) \\\\ &= \\lambda(\\lim_{n \\rightarrow \\infty} \\cup_{i = -n}^{n} (a_i, a_{i + 1}]) \\\\ &= \\lim_{n \\rightarrow \\infty} \\lambda(\\cup_{i = -n}^{n} (a_i, a_{i + 1}]) \\\\ &= \\lim_{n \\rightarrow \\infty} \\sum_{i = -n}^{n} \\lambda((a_i, a_{i + 1}]) \\\\ &= \\lim_{n \\rightarrow \\infty} \\sum_{i = -n}^{n} 1 \\\\ &= \\lim_{n \\rightarrow \\infty} 2n \\\\ &= \\infty. \\end{align}\\] We need to find a bijection between \\(\\mathbb{R}\\) and \\((0,1)\\). A well-known function that maps from a bounded interval to \\(\\mathbb{R}\\) is the tangent. To make the bijection easier to achieve, we will take the inverse, which maps from \\(\\mathbb{R}\\) to \\((-\\frac{\\pi}{2}, \\frac{\\pi}{2})\\). However, we need to change the function so it maps to \\((0,1)\\). First we add \\(\\frac{\\pi}{2}\\), so that we move the function above zero. Then we only have to divide by the max value, which in this case is \\(\\pi\\). So our bijection is \\[\\begin{equation} f(x) = \\frac{\\tan^{-1}(x) + \\frac{\\pi}{2}}{\\pi}. \\end{equation}\\] Exercise 2.8 Take the measure space \\((\\Omega_1 = (0,1], B_{(0,1]}, \\lambda)\\) (we know that this is a probability space on \\((0,1]\\)). Define a map (function) from \\(\\Omega_1\\) to \\(\\Omega_2 = \\{1,2,3,4,5,6\\}\\) such that the measure space \\((\\Omega_2, 2^{\\Omega_2}, \\lambda(f^{-1}()))\\) will be a discrete probability space with uniform probabilities (\\(P(\\omega) = \\frac{1}{6}, \\forall \\omega \\in \\Omega_2)\\). Is the map that you defined in (a) the only such map? How would you in the same fashion define a map that would result in a probability space that can be interpreted as a coin toss with probability \\(p\\) of heads? R: Use the map in (a) as a basis for a random generator for this fair die. Solution. In other words, we have to assign disjunct intervals of the same size to each element of \\(\\Omega_2\\). Therefore \\[\\begin{equation} f(x) = \\lceil 6x \\rceil. \\end{equation}\\] No, we could for example rearrange the order in which the intervals are mapped to integers. Additionally, we could have several disjoint intervals that mapped to the same integer, as long as the Lebesgue measure of their union would be \\(\\frac{1}{6}\\) and the function would remain injective. We have \\(\\Omega_3 = \\{0,1\\}\\), where zero represents heads and one represents tails. Then \\[\\begin{equation} f(x) = 0^{I_{A}(x)}, \\end{equation}\\] where \\(A = \\{y \\in (0,1] : y < p\\}\\). 
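Before the book's generator for the fair die below, here is a minimal sketch of the coin-toss map from c) (our addition, with an arbitrary choice p = 0.7): heads corresponds to the uniform sample falling into A = (0, p), and the relative frequency of heads should be close to p.

```r
# Minimal sketch of the coin-toss map from c) (added; p = 0.7 is arbitrary):
# heads (0) when the uniform sample falls in A = (0, p), tails (1) otherwise.
set.seed(1)
p <- 0.7
u <- runif(10000)
coin <- ifelse(u < p, 0, 1)
mean(coin == 0)  # relative frequency of heads, should be close to p
```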
set.seed(1) unif_s <- runif(1000) die_s <- ceiling(6 * unif_s) summary(as.factor(die_s)) ## 1 2 3 4 5 6 ## 166 154 200 146 166 168 "],["condprob.html", "Chapter 3 Conditional probability 3.1 Calculating conditional probabilities 3.2 Conditional independence 3.3 Monty Hall problem", " Chapter 3 Conditional probability This chapter deals with conditional probability. The students are expected to acquire the following knowledge: Theoretical Identify whether variables are independent. Calculation of conditional probabilities. Understanding of conditional dependence and independence. How to apply Bayes’ theorem to solve difficult probabilistic questions. R Simulating conditional probabilities. cumsum. apply. .fold-btn { float: right; margin: 5px 5px 0 0; } .fold { border: 1px solid black; min-height: 40px; } 3.1 Calculating conditional probabilities Exercise 3.1 A military officer is in charge of identifying enemy aircraft and shooting them down. He is able to positively identify an enemy airplane 95% of the time and positively identify a friendly airplane 90% of the time. Furthermore, 99% of the airplanes are friendly. When the officer identifies an airplane as an enemy airplane, what is the probability that it is not and they will shoot at a friendly airplane? Solution. Let \\(E = 0\\) denote that the observed plane is friendly and \\(E=1\\) that it is an enemy. Let \\(I = 0\\) denote that the officer identified it as friendly and \\(I = 1\\) as enemy. Then \\[\\begin{align} P(E = 0 | I = 1) &= \\frac{P(I = 1 | E = 0)P(E = 0)}{P(I = 1)} \\\\ &= \\frac{P(I = 1 | E = 0)P(E = 0)}{P(I = 1 | E = 0)P(E = 0) + P(I = 1 | E = 1)P(E = 1)} \\\\ &= \\frac{0.1 \\times 0.99}{0.1 \\times 0.99 + 0.95 \\times 0.01} \\\\ &= 0.91. \\end{align}\\] Exercise 3.2 R: Consider tossing a fair die. Let \\(A = \\{2,4,6\\}\\) and \\(B = \\{1,2,3,4\\}\\). Then \\(P(A) = \\frac{1}{2}\\), \\(P(B) = \\frac{2}{3}\\) and \\(P(AB) = \\frac{1}{3}\\). Since \\(P(AB) = P(A)P(B)\\), the events \\(A\\) and \\(B\\) are independent. Simulate draws from the sample space and verify that the proportions are the same. Then find two events \\(C\\) and \\(D\\) that are not independent and repeat the simulation. set.seed(1) nsamps <- 10000 tosses <- sample(1:6, nsamps, replace = TRUE) PA <- sum(tosses %in% c(2,4,6)) / nsamps PB <- sum(tosses %in% c(1,2,3,4)) / nsamps PA * PB ## [1] 0.3295095 sum(tosses %in% c(2,4)) / nsamps ## [1] 0.3323 # Let C = {1,2} and D = {2,3} PC <- sum(tosses %in% c(1,2)) / nsamps PD <- sum(tosses %in% c(2,3)) / nsamps PC * PD ## [1] 0.1067492 sum(tosses %in% c(2)) / nsamps ## [1] 0.1622 Exercise 3.3 A machine reports the true value of a thrown 12-sided die 5 out of 6 times. If the machine reports a 1 has been tossed, what is the probability that it is actually a 1? Now let the machine only report whether a 1 has been tossed or not. Does the probability change? R: Use simulation to check your answers to a) and b). Solution. Let \\(T = 1\\) denote that the toss is 1 and \\(M = 1\\) that the machine reports a 1. \\[\\begin{align} P(T = 1 | M = 1) &= \\frac{P(M = 1 | T = 1)P(T = 1)}{P(M = 1)} \\\\ &= \\frac{P(M = 1 | T = 1)P(T = 1)}{\\sum_{k=1}^{12} P(M = 1 | T = k)P(T = k)} \\\\ &= \\frac{\\frac{5}{6}\\frac{1}{12}}{\\frac{5}{6}\\frac{1}{12} + 11 \\frac{1}{6} \\frac{1}{11} \\frac{1}{12}} \\\\ &= \\frac{5}{6}. \\end{align}\\] Yes. 
\\[\\begin{align} P(T = 1 | M = 1) &= \\frac{P(M = 1 | T = 1)P(T = 1)}{P(M = 1)} \\\\ &= \\frac{P(M = 1 | T = 1)P(T = 1)}{\\sum_{k=1}^{12} P(M = 1 | T = k)P(T = k)} \\\\ &= \\frac{\\frac{5}{6}\\frac{1}{12}}{\\frac{5}{6}\\frac{1}{12} + 11 \\frac{1}{6} \\frac{1}{12}} \\\\ &= \\frac{5}{16}. \\end{align}\\] set.seed(1) nsamps <- 10000 report_a <- vector(mode = "numeric", length = nsamps) report_b <- vector(mode = "logical", length = nsamps) truths <- vector(mode = "logical", length = nsamps) for (i in 1:10000) { toss <- sample(1:12, size = 1) truth <- sample(c(TRUE, FALSE), size = 1, prob = c(5/6, 1/6)) truths[i] <- truth if (truth) { report_a[i] <- toss report_b[i] <- toss == 1 } else { remaining <- (1:12)[1:12 != toss] report_a[i] <- sample(remaining, size = 1) report_b[i] <- toss != 1 } } truth_a1 <- truths[report_a == 1] sum(truth_a1) / length(truth_a1) ## [1] 0.8300733 truth_b1 <- truths[report_b] sum(truth_b1) / length(truth_b1) ## [1] 0.3046209 Exercise 3.4 A coin is tossed independently \\(n\\) times. The probability of heads at each toss is \\(p\\). At each time \\(k\\), \\((k = 2,3,...,n)\\) we get a reward at time \\(k+1\\) if \\(k\\)-th toss was a head and the previous toss was a tail. Let \\(A_k\\) be the event that a reward is obtained at time \\(k\\). Are events \\(A_k\\) and \\(A_{k+1}\\) independent? Are events \\(A_k\\) and \\(A_{k+2}\\) independent? R: simulate 10 tosses 10000 times, where \\(p = 0.7\\). Check your answers to a) and b) by counting the frequencies of the events \\(A_5\\), \\(A_6\\), and \\(A_7\\). Solution. For \\(A_k\\) to happen, we need the tosses \\(k-2\\) and \\(k-1\\) be tails and heads respectively. For \\(A_{k+1}\\) to happen, we need tosses \\(k-1\\) and \\(k\\) be tails and heads respectively. As the toss \\(k-1\\) need to be heads for one and tails for the other, these two events can not happen simultaneously. Therefore the probability of their intersection is 0. But the probability of each of them separately is \\(p(1-p) > 0\\). Therefore, they are not independent. For \\(A_k\\) to happen, we need the tosses \\(k-2\\) and \\(k-1\\) be tails and heads respectively. For \\(A_{k+2}\\) to happen, we need tosses \\(k\\) and \\(k+1\\) be tails and heads respectively. So the probability of intersection is \\(p^2(1-p)^2\\). And the probability of each separately is again \\(p(1-p)\\). Therefore, they are independent. set.seed(1) nsamps <- 10000 p <- 0.7 rewardA_5 <- vector(mode = "logical", length = nsamps) rewardA_6 <- vector(mode = "logical", length = nsamps) rewardA_7 <- vector(mode = "logical", length = nsamps) rewardA_56 <- vector(mode = "logical", length = nsamps) rewardA_57 <- vector(mode = "logical", length = nsamps) for (i in 1:nsamps) { samps <- sample(c(0,1), size = 10, replace = TRUE, prob = c(0.7, 0.3)) rewardA_5[i] <- (samps[4] == 0 & samps[3] == 1) rewardA_6[i] <- (samps[5] == 0 & samps[4] == 1) rewardA_7[i] <- (samps[6] == 0 & samps[5] == 1) rewardA_56[i] <- (rewardA_5[i] & rewardA_6[i]) rewardA_57[i] <- (rewardA_5[i] & rewardA_7[i]) } sum(rewardA_5) / nsamps ## [1] 0.2141 sum(rewardA_6) / nsamps ## [1] 0.2122 sum(rewardA_7) / nsamps ## [1] 0.2107 sum(rewardA_56) / nsamps ## [1] 0 sum(rewardA_57) / nsamps ## [1] 0.0454 Exercise 3.5 A drawer contains two coins. One is an unbiased coin, the other is a biased coin, which will turn up heads with probability \\(p\\) and tails with probability \\(1-p\\). One coin is selected uniformly at random. The selected coin is tossed \\(n\\) times. 
The coin turns up heads \\(k\\) times and tails \\(n-k\\) times. What is the probability that the coin is biased? The selected coin is tossed repeatedly until it turns up heads \\(k\\) times. Given that it is tossed \\(n\\) times in total, what is the probability that the coin is biased? Solution. Let \\(B = 1\\) denote that the coin is biased and let \\(H = k\\) denote that we’ve seen \\(k\\) heads. \\[\\begin{align} P(B = 1 | H = k) &= \\frac{P(H = k | B = 1)P(B = 1)}{P(H = k)} \\\\ &= \\frac{P(H = k | B = 1)P(B = 1)}{P(H = k | B = 1)P(B = 1) + P(H = k | B = 0)P(B = 0)} \\\\ &= \\frac{p^k(1-p)^{n-k} 0.5}{p^k(1-p)^{n-k} 0.5 + 0.5^{n+1}} \\\\ &= \\frac{p^k(1-p)^{n-k}}{p^k(1-p)^{n-k} + 0.5^n}. \\end{align}\\] The same results as in a). The only difference between these two scenarios is that in b) the last throw must be heads. However, this holds for the biased and the unbiased coin and therefore does not affect the probability of the coin being biased. Exercise 3.6 Judy goes around the company for Women’s day and shares flowers. In every office she leaves a flower, if there is at least one woman inside. The probability that there’s a woman in the office is \\(\\frac{3}{5}\\). What is the probability that Judy leaves her first flower in the fourth office? Given that she has given away exactly three flowers in the first four offices, what is the probability that she gives her fourth flower in the eighth office? What is the probability that she leaves the second flower in the fifth office? What is the probability that she leaves the second flower in the fifth office, given that she did not leave the second flower in the second office? Judy needs a new supply of flowers immediately after the office, where she gives away her last flower. What is the probability that she visits at least five offices, if she starts with two flowers? R: simulate Judy’s walk 10000 times to check your answers a) - e). Solution. Let \\(X_i = k\\) denote the event that … \\(i\\)-th sample on the \\(k\\)-th run. Since the events are independent, we can multiply their probabilities to get \\[\\begin{equation} P(X_1 = 4) = 0.4^3 \\times 0.6 = 0.0384. \\end{equation}\\] Same as in a) as we have a fresh start after first four offices. For this to be possible, she had to leave the first flower in one of the first four offices. Therefore there are four possibilities, and for each of those the probability is \\(0.4^3 \\times 0.6\\). Additionally, the probability that she leaves a flower in the fifth office is \\(0.6\\). So \\[\\begin{equation} P(X_2 = 5) = \\binom{4}{1} \\times 0.4^3 \\times 0.6^2 = 0.09216. \\end{equation}\\] We use Bayes’ theorem. \\[\\begin{align} P(X_2 = 5 | X_2 \\neq 2) &= \\frac{P(X_2 \\neq 2 | X_2 = 5)P(X_2 = 5)}{P(X_2 \\neq 2)} \\\\ &= \\frac{0.09216}{0.64} \\\\ &= 0.144. \\end{align}\\] The denominator in the second equation can be calculated as follows. One of three things has to happen for the second not to be dealt in the second round. First, both are zero, so \\(0.4^2\\). Second, first is zero, and second is one, so \\(0.4 \\times 0.6\\). Third, the first is one and the second one zero, so \\(0.6 \\times 0.4\\). Summing these values we get \\(0.64\\). We will look at the complement, so the events that she gave away exactly two flowers after two, three and four offices. \\[\\begin{equation} P(X_2 \\geq 5) = 1 - 0.6^2 - 2 \\times 0.4 \\times 0.6^2 - 3 \\times 0.4^2 \\times 0.6^2 = 0.1792. \\end{equation}\\] The multiplying parts represent the possibilities of the first flower. 
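The closed-form answers to a), c), and e) can also be cross-checked against the geometric and negative binomial distributions introduced in Chapter 4. This short check is our addition and relies on R's parameterization by the number of failures before the first (or r-th) success; the book's full simulation follows.

```r
# Added cross-check: read the answers to a), c), and e) off the geometric and
# negative binomial distributions (R counts failures before the successes).
dgeom(3, prob = 0.6)                  # a) first flower in the 4th office: 0.0384
dnbinom(3, size = 2, prob = 0.6)      # c) second flower in the 5th office: 0.09216
1 - pnbinom(2, size = 2, prob = 0.6)  # e) at least five offices with two flowers: 0.1792
```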
set.seed(1) nsamps <- 100000 Judyswalks <- matrix(data = NA, nrow = nsamps, ncol = 8) for (i in 1:nsamps) { thiswalk <- sample(c(0,1), size = 8, replace = TRUE, prob = c(0.4, 0.6)) Judyswalks[i, ] <- thiswalk } csJudy <- t(apply(Judyswalks, 1, cumsum)) # a sum(csJudy[ ,4] == 1 & csJudy[ ,3] == 0) / nsamps ## [1] 0.03848 # b csJsubset <- csJudy[csJudy[ ,4] == 3 & csJudy[ ,3] == 2, ] sum(csJsubset[ ,8] == 4 & csJsubset[ ,7] == 3) / nrow(csJsubset) ## [1] 0.03665893 # c sum(csJudy[ ,5] == 2 & csJudy[ ,4] == 1) / nsamps ## [1] 0.09117 # d sum(csJudy[ ,5] == 2 & csJudy[ ,4] == 1) / sum(csJudy[ ,2] != 2) ## [1] 0.1422398 # e sum(csJudy[ ,4] < 2) / nsamps ## [1] 0.17818 3.2 Conditional independence Exercise 3.7 Describe: A real-world example of two events \\(A\\) and \\(B\\) that are dependent but become conditionally independent if conditioned on a third event \\(C\\). A real-world example of two events \\(A\\) and \\(B\\) that are independent, but become dependent if conditioned on some third event \\(C\\). Solution. Let \\(A\\) be the height of a person and let \\(B\\) be the person’s knowledge of the Dutch language. These events are dependent since the Dutch are known to be taller than average. However if \\(C\\) is the nationality of the person, then \\(A\\) and \\(B\\) are independent given \\(C\\). Let \\(A\\) be the event that Mary passes the exam and let \\(B\\) be the event that John passes the exam. These events are independent. However, if the event \\(C\\) is that Mary and John studied together, then \\(A\\) and \\(B\\) are conditionally dependent given \\(C\\). Exercise 3.8 We have two coins of identical appearance. We know that one is a fair coin and the other flips heads 80% of the time. We choose one of the two coins uniformly at random. We discard the coin that was not chosen. We now flip the chosen coin independently 10 times, producing a sequence \\(Y_1 = y_1\\), \\(Y_2 = y_2\\), …, \\(Y_{10} = y_{10}\\). Intuitively, without doing and computation, are these random variables independent? Compute the probability \\(P(Y_1 = 1)\\). Compute the probabilities \\(P(Y_2 = 1 | Y_1 = 1)\\) and \\(P(Y_{10} = 1 | Y_1 = 1,...,Y_9 = 1)\\). Given your answers to b) and c), would you now change your answer to a)? If so, discuss why your intuition had failed. Solution. \\(P(Y_1 = 1) = 0.5 * 0.8 + 0.5 * 0.5 = 0.65\\). Since we know that \\(Y_1 = 1\\) this should change our view of the probability of the coin being biased or not. Let \\(B = 1\\) denote the event that the coin is biased and let \\(B = 0\\) denote that the coin is unbiased. By using marginal probability, we can write \\[\\begin{align} P(Y_2 = 1 | Y_1 = 1) &= P(Y_2 = 1, B = 1 | Y_1 = 1) + P(Y_2 = 1, B = 0 | Y_1 = 1) \\\\ &= \\sum_{k=1}^2 P(Y_2 = 1 | B = k, Y_1 = 1)P(B = k | Y_1 = 1) \\\\ &= 0.8 \\frac{P(Y_1 = 1 | B = 1)P(B = 1)}{P(Y_1 = 1)} + 0.5 \\frac{P(Y_1 = 1 | B = 0)P(B = 0)}{P(Y_1 = 1)} \\\\ &= 0.8 \\frac{0.8 \\times 0.5}{0.65} + 0.5 \\frac{0.5 \\times 0.5}{0.65} \\\\ &\\approx 0.68. \\end{align}\\] For the other calculation we follow the same procedure. Let \\(X = 1\\) denote that first nine tosses are all heads (equivalent to \\(Y_1 = 1\\),…, \\(Y_9 = 1\\)). 
\\[\\begin{align} P(Y_{10} = 1 | X = 1) &= P(Y_2 = 1, B = 1 | X = 1) + P(Y_2 = 1, B = 0 | X = 1) \\\\ &= \\sum_{k=1}^2 P(Y_2 = 1 | B = k, X = 1)P(B = k | X = 1) \\\\ &= 0.8 \\frac{P(X = 1 | B = 1)P(B = 1)}{P(X = 1)} + 0.5 \\frac{P(X = 1 | B = 0)P(B = 0)}{P(X = 1)} \\\\ &= 0.8 \\frac{0.8^9 \\times 0.5}{0.5 \\times 0.8^9 + 0.5 \\times 0.5^9} + 0.5 \\frac{0.5^9 \\times 0.5}{0.5 \\times 0.8^9 + 0.5 \\times 0.5^9} \\\\ &\\approx 0.8. \\end{align}\\] 3.3 Monty Hall problem The Monty Hall problem is a famous probability puzzle with non-intuitive outcome. Many established mathematicians and statisticians had problems solving it and many even disregarded the correct solution until they’ve seen the proof by simulation. Here we will show how it can be solved relatively simply with the use of Bayes’ theorem if we select the variables in a smart way. Exercise 3.9 (Monty Hall problem) A prize is placed at random behind one of three doors. You pick a door. Now Monty Hall chooses one of the other two doors, opens it and shows you that it is empty. He then gives you the opportunity to keep your door or switch to the other unopened door. Should you stay or switch? Use Bayes’ theorem to calculate the probability of winning if you switch and if you do not. R: Check your answers in R. Solution. W.L.O.G. assume we always pick the first door. The host can only open door 2 or door 3, as he can not open the door we picked. Let \\(k \\in \\{2,3\\}\\). Let us first look at what happens if we do not change. Then we have \\[\\begin{align} P(\\text{car in 1} | \\text{open $k$}) &= \\frac{P(\\text{open $k$} | \\text{car in 1})P(\\text{car in 1})}{P(\\text{open $k$})} \\\\ &= \\frac{P(\\text{open $k$} | \\text{car in 1})P(\\text{car in 1})}{\\sum_{n=1}^3 P(\\text{open $k$} | \\text{car in $n$})P(\\text{car in $n$)}}. \\end{align}\\] The probability that he opened \\(k\\) if the car is in 1 is \\(\\frac{1}{2}\\), as he can choose between door 2 and 3 as both have a goat behind it. Let us look at the normalization constant. When \\(n = 1\\) we get the value in the nominator. When \\(n=k\\), we get 0, as he will not open the door if there’s a prize behind. The remaining option is that we select 1, the car is behind \\(k\\) and he opens the only door left. Since he can’t open 1 due to it being our pick and \\(k\\) due to having the prize, the probability of opening the remaining door is 1, and the prior probability of the car being behind this door is \\(\\frac{1}{3}\\). So we have \\[\\begin{align} P(\\text{car in 1} | \\text{open $k$}) &= \\frac{\\frac{1}{2}\\frac{1}{3}}{\\frac{1}{2}\\frac{1}{3} + \\frac{1}{3}} \\\\ &= \\frac{1}{3}. \\end{align}\\] Now let us look at what happens if we do change. Let \\(k' \\in \\{2,3\\}\\) be the door that is not opened. If we change, we select this door, so we have \\[\\begin{align} P(\\text{car in $k'$} | \\text{open $k$}) &= \\frac{P(\\text{open $k$} | \\text{car in $k'$})P(\\text{car in $k'$})}{P(\\text{open $k$})} \\\\ &= \\frac{P(\\text{open $k$} | \\text{car in $k'$})P(\\text{car in $k'$})}{\\sum_{n=1}^3 P(\\text{open $k$} | \\text{car in $n$})P(\\text{car in $n$)}}. \\end{align}\\] The denominator stays the same, the only thing that is different from before is \\(P(\\text{open $k$} | \\text{car in $k'$})\\). We have a situation where we initially selected door 1 and the car is in door \\(k'\\). The probability that the host will open door \\(k\\) is then 1, as he can not pick any other door. 
So we have \\[\\begin{align} P(\\text{car in $k'$} | \\text{open $k$}) &= \\frac{\\frac{1}{3}}{\\frac{1}{2}\\frac{1}{3} + \\frac{1}{3}} \\\\ &= \\frac{2}{3}. \\end{align}\\] Therefore it makes sense to change the door. set.seed(1) nsamps <- 1000 ifchange <- vector(mode = "logical", length = nsamps) ifstay <- vector(mode = "logical", length = nsamps) for (i in 1:nsamps) { where_car <- sample(c(1:3), 1) where_player <- sample(c(1:3), 1) open_samp <- (1:3)[where_car != (1:3) & where_player != (1:3)] if (length(open_samp) == 1) { where_open <- open_samp } else { where_open <- sample(open_samp, 1) } ifstay[i] <- where_car == where_player where_ifchange <- (1:3)[where_open != (1:3) & where_player != (1:3)] ifchange[i] <- where_ifchange == where_car } sum(ifstay) / nsamps ## [1] 0.328 sum(ifchange) / nsamps ## [1] 0.672 "],["rvs.html", "Chapter 4 Random variables 4.1 General properties and calculations 4.2 Discrete random variables 4.3 Continuous random variables 4.4 Singular random variables 4.5 Transformations", " Chapter 4 Random variables This chapter deals with random variables and their distributions. The students are expected to acquire the following knowledge: Theoretical Identification of random variables. Convolutions of random variables. Derivation of PDF, PMF, CDF, and quantile function. Definitions and properties of common discrete random variables. Definitions and properties of common continuous random variables. Transforming univariate random variables. R Familiarize with PDF, PMF, CDF, and quantile functions for several distributions. Visual inspection of probability distributions. Analytical and empirical calculation of probabilities based on distributions. New R functions for plotting (for example, facet_wrap). Creating random number generators based on the Uniform distribution. .fold-btn { float: right; margin: 5px 5px 0 0; } .fold { border: 1px solid black; min-height: 40px; } 4.1 General properties and calculations Exercise 4.1 Which of the functions below are valid CDFs? Find their respective densities. R: Plot the three functions. \\[\\begin{equation} F(x) = \\begin{cases} 1 - e^{-x^2} & x \\geq 0 \\\\ 0 & x < 0. \\end{cases} \\end{equation}\\] \\[\\begin{equation} F(x) = \\begin{cases} e^{-\\frac{1}{x}} & x > 0 \\\\ 0 & x \\leq 0. \\end{cases} \\end{equation}\\] \\[\\begin{equation} F(x) = \\begin{cases} 0 & x \\leq 0 \\\\ \\frac{1}{3} & 0 < x \\leq \\frac{1}{2} \\\\ 1 & x > \\frac{1}{2}. \\end{cases} \\end{equation}\\] Solution. Yes. First, let us check the limits. \\(\\lim_{x \\rightarrow -\\infty} (0) = 0\\). \\(\\lim_{x \\rightarrow \\infty} (1 - e^{-x^2}) = 1 - \\lim_{x \\rightarrow \\infty} e^{-x^2} = 1 - 0 = 1\\). Second, let us check whether the function is increasing. Let \\(x > y \\geq 0\\). Then \\(1 - e^{-x^2} \\geq 1 - e^{-y^2}\\). We only have to check right continuity for the point zero. \\(F(0) = 0\\) and \\(\\lim_{\\epsilon \\downarrow 0}F (0 + \\epsilon) = \\lim_{\\epsilon \\downarrow 0} 1 - e^{-\\epsilon^2} = 1 - \\lim_{\\epsilon \\downarrow 0} e^{-\\epsilon^2} = 1 - 1 = 0\\). We get the density by differentiating the CDF. \\(p(x) = \\frac{d}{dx} 1 - e^{-x^2} = 2xe^{-x^2}.\\) Students are encouraged to check that this is a proper PDF. Yes. First, let us check the limits. $_{x -} (0) = 0 and \\(\\lim_{x \\rightarrow \\infty} (e^{-\\frac{1}{x}}) = 1\\). Second, let us check whether the function is increasing. Let \\(x > y \\geq 0\\). Then \\(e^{-\\frac{1}{x}} \\geq e^{-\\frac{1}{y}}\\). We only have to check right continuity for the point zero. 
\\(F(0) = 0\\) and \\(\\lim_{\\epsilon \\downarrow 0}F (0 + \\epsilon) = \\lim_{\\epsilon \\downarrow 0} e^{-\\frac{1}{\\epsilon}} = 0\\). We get the density by differentiating the CDF. \\(p(x) = \\frac{d}{dx} e^{-\\frac{1}{x}} = \\frac{1}{x^2}e^{-\\frac{1}{x}}.\\) Students are encouraged to check that this is a proper PDF. No. The function is not right continuous as \\(F(\\frac{1}{2}) = \\frac{1}{3}\\), but \\(\\lim_{\\epsilon \\downarrow 0} F(\\frac{1}{2} + \\epsilon) = 1\\). f1 <- function (x) { tmp <- 1 - exp(-x^2) tmp[x < 0] <- 0 return(tmp) } f2 <- function (x) { tmp <- exp(-(1 / x)) tmp[x <= 0] <- 0 return(tmp) } f3 <- function (x) { tmp <- x tmp[x == x] <- 1 tmp[x <= 0.5] <- 1/3 tmp[x <= 0] <- 0 return(tmp) } cdf_data <- tibble(x = seq(-1, 20, by = 0.001), f1 = f1(x), f2 = f2(x), f3 = f3(x)) %>% melt(id.vars = "x") cdf_plot <- ggplot(data = cdf_data, aes(x = x, y = value, color = variable)) + geom_hline(yintercept = 1) + geom_line() plot(cdf_plot) Exercise 4.2 Let \\(X\\) be a random variable with CDF \\[\\begin{equation} F(x) = \\begin{cases} 0 & x < 0 \\\\ \\frac{x^2}{2} & 0 \\leq x < 1 \\\\ \\frac{1}{2} + \\frac{p}{2} & 1 \\leq x < 2 \\\\ \\frac{1}{2} + \\frac{p}{2} + \\frac{1 - p}{2} & x \\geq 2 \\end{cases} \\end{equation}\\] R: Plot this CDF for \\(p = 0.3\\). Is it a discrete, continuous, or mixed random varible? Find the probability density/mass of \\(X\\). f1 <- function (x, p) { tmp <- x tmp[x >= 2] <- 0.5 + (p * 0.5) + ((1-p) * 0.5) tmp[x < 2] <- 0.5 + (p * 0.5) tmp[x < 1] <- (x[x < 1])^2 / 2 tmp[x < 0] <- 0 return(tmp) } cdf_data <- tibble(x = seq(-1, 5, by = 0.001), y = f1(x, 0.3)) cdf_plot <- ggplot(data = cdf_data, aes(x = x, y = y)) + geom_hline(yintercept = 1) + geom_line(color = "blue") plot(cdf_plot) ::: {.solution} \\(X\\) is a mixed random variable. Since \\(X\\) is a mixed random variable, we have to find the PDF of the continuous part and the PMF of the discrete part. We get the continuous part by differentiating the corresponding CDF, \\(\\frac{d}{dx}\\frac{x^2}{2} = x\\). So the PDF, when \\(0 \\leq x < 1\\), is \\(p(x) = x\\). Let us look at the discrete part now. It has two steps, so this is a discrete distribution with two outcomes – numbers 1 and 2. The first happens with probability \\(\\frac{p}{2}\\), and the second with probability \\(\\frac{1 - p}{2}\\). This reminds us of the Bernoulli distribution. The PMF for the discrete part is \\(P(X = x) = (\\frac{p}{2})^{2 - x} (\\frac{1 - p}{2})^{x - 1}\\). ::: Exercise 4.3 (Convolutions) Convolutions are probability distributions that correspond to sums of independent random variables. Let \\(X\\) and \\(Y\\) be independent discrete variables. Find the PMF of \\(Z = X + Y\\). Hint: Use the law of total probability. Let \\(X\\) and \\(Y\\) be independent continuous variables. Find the PDF of \\(Z = X + Y\\). Hint: Start with the CDF. Solution. \\[\\begin{align} P(Z = z) &= P(X + Y = z) & \\\\ &= \\sum_{k = -\\infty}^\\infty P(X + Y = z | Y = k) P(Y = k) & \\text{ (law of total probability)} \\\\ &= \\sum_{k = -\\infty}^\\infty P(X + k = z | Y = k) P(Y = k) & \\\\ &= \\sum_{k = -\\infty}^\\infty P(X + k = z) P(Y = k) & \\text{ (independence of $X$ and $Y$)} \\\\ &= \\sum_{k = -\\infty}^\\infty P(X = z - k) P(Y = k). & \\end{align}\\] Let \\(f\\) and \\(g\\) be the PDFs of \\(X\\) and \\(Y\\) respectively. 
\\[\\begin{align} F(z) &= P(Z < z) \\\\ &= P(X + Y < z) \\\\ &= \\int_{-\\infty}^{\\infty} P(X + Y < z | Y = y)P(Y = y)dy \\\\ &= \\int_{-\\infty}^{\\infty} P(X + y < z | Y = y)P(Y = y)dy \\\\ &= \\int_{-\\infty}^{\\infty} P(X + y < z)P(Y = y)dy \\\\ &= \\int_{-\\infty}^{\\infty} P(X < z - y)P(Y = y)dy \\\\ &= \\int_{-\\infty}^{\\infty} (\\int_{-\\infty}^{z - y} f(x) dx) g(y) dy \\end{align}\\] Now \\[\\begin{align} p(z) &= \\frac{d}{dz} F(z) & \\\\ &= \\int_{-\\infty}^{\\infty} (\\frac{d}{dz}\\int_{-\\infty}^{z - y} f(x) dx) g(y) dy & \\\\ &= \\int_{-\\infty}^{\\infty} f(z - y) g(y) dy & \\text{ (fundamental theorem of calculus)}. \\end{align}\\] 4.2 Discrete random variables Exercise 4.4 (Binomial random variable) Let \\(X_k\\), \\(k = 1,...,n\\), be random variables with the Bernoulli measure as the PMF. Let \\(X = \\sum_{k=1}^n X_k\\). We call \\(X_k\\) a Bernoulli random variable with parameter \\(p \\in (0,1)\\). Find the CDF of \\(X_k\\). Find PMF of \\(X\\). This is a Binomial random variable with support in \\(\\{0,1,2,...,n\\}\\) and parameters \\(p \\in (0,1)\\) and \\(n \\in \\mathbb{N}_0\\). We denote \\[\\begin{equation} X | n,p \\sim \\text{binomial}(n,p). \\end{equation}\\] Find CDF of \\(X\\). R: Simulate from the binomial distribution with \\(n = 10\\) and \\(p = 0.5\\), and from \\(n\\) Bernoulli distributions with \\(p = 0.5\\). Visually compare the sum of Bernoullis and the binomial. Hint: there is no standard function like rpois for a Bernoulli random variable. Check exercise 1.12 to find out how to sample from a Bernoulli distribution. Solution. There are two outcomes – zero and one. Zero happens with probability \\(1 - p\\). Therefore \\[\\begin{equation} F(k) = \\begin{cases} 0 & k < 0 \\\\ 1 - p & 0 \\leq k < 1 \\\\ 1 & k \\geq 1. \\end{cases} \\end{equation}\\] For the probability of \\(X\\) to be equal to some \\(k \\leq n\\), exactly \\(k\\) Bernoulli variables need to be one, and the others zero. So \\(p^k(1-p)^{n-k}\\). There are \\(\\binom{n}{k}\\) such possible arrangements. Therefore \\[\\begin{align} P(X = k) = \\binom{n}{k} p^k (1 - p)^{n-k}. \\end{align}\\] \\[\\begin{equation} F(k) = \\sum_{i = 0}^{\\lfloor k \\rfloor} \\binom{n}{i} p^i (1 - p)^{n - i} \\end{equation}\\] set.seed(1) nsamps <- 10000 binom_samp <- rbinom(nsamps, size = 10, prob = 0.5) bernoulli_mat <- matrix(data = NA, nrow = nsamps, ncol = 10) for (i in 1:nsamps) { bernoulli_mat[i, ] <- rbinom(10, size = 1, prob = 0.5) } bern_samp <- apply(bernoulli_mat, 1, sum) b_data <- tibble(x = c(binom_samp, bern_samp), type = c(rep("binomial", 10000), rep("Bernoulli_sum", 10000))) b_plot <- ggplot(data = b_data, aes(x = x, fill = type)) + geom_bar(position = "dodge") plot(b_plot) Exercise 4.5 (Geometric random variable) A variable with PMF \\[\\begin{equation} P(k) = p(1-p)^k \\end{equation}\\] is a geometric random variable with support in non-negative integers. It has one parameter \\(p \\in (0,1]\\). We denote \\[\\begin{equation} X | p \\sim \\text{geometric}(p) \\end{equation}\\] Derive the CDF of a geometric random variable. R: Draw 1000 samples from the geometric distribution with \\(p = 0.3\\) and compare their frequencies to theoretical values. Solution. 
\\[\\begin{align} P(X \\leq k) &= \\sum_{i = 0}^k p(1-p)^i \\\\ &= p \\sum_{i = 0}^k (1-p)^i \\\\ &= p \\frac{1 - (1-p)^{k+1}}{1 - (1 - p)} \\\\ &= 1 - (1-p)^{k + 1} \\end{align}\\] set.seed(1) geo_samp <- rgeom(n = 1000, prob = 0.3) geo_samp <- data.frame(x = geo_samp) %>% count(x) %>% mutate(n = n / 1000, type = "empirical_frequencies") %>% bind_rows(data.frame(x = 0:20, n = dgeom(0:20, prob = 0.3), type = "theoretical_measure")) geo_plot <- ggplot(data = geo_samp, aes(x = x, y = n, fill = type)) + geom_bar(stat="identity", position = "dodge") plot(geo_plot) Exercise 4.6 (Poisson random variable) A variable with PMF \\[\\begin{equation} P(k) = \\frac{\\lambda^k e^{-\\lambda}}{k!} \\end{equation}\\] is a Poisson random variable with support in non-negative integers. It has one positive parameter \\(\\lambda\\), which also represents its mean value and variance (a measure of the deviation of the values from the mean – more on mean and variance in the next chapter). We denote \\[\\begin{equation} X | \\lambda \\sim \\text{Poisson}(\\lambda). \\end{equation}\\] This distribution is usually the default choice for modeling counts. We have already encountered a Poisson random variable in exercise 1.13, where we also sampled from this distribution. The CDF of a Poisson random variable is \\(P(X <= x) = e^{-\\lambda} \\sum_{i=0}^x \\frac{\\lambda^{i}}{i!}\\). R: Draw 1000 samples from the Poisson distribution with \\(\\lambda = 5\\) and compare their empirical cumulative distribution function with the theoretical CDF. set.seed(1) pois_samp <- rpois(n = 1000, lambda = 5) pois_samp <- data.frame(x = pois_samp) pois_plot <- ggplot(data = pois_samp, aes(x = x, colour = "ECDF")) + stat_ecdf(geom = "step") + geom_step(data = tibble(x = 0:17, y = ppois(x, 5)), aes(x = x, y = y, colour = "CDF")) + scale_colour_manual("Lgend title", values = c("black", "red")) plot(pois_plot) Exercise 4.7 (Negative binomial random variable) A variable with PMF \\[\\begin{equation} p(k) = \\binom{k + r - 1}{k}(1-p)^r p^k \\end{equation}\\] is a negative binomial random variable with support in non-negative integers. It has two parameters \\(r > 0\\) and \\(p \\in (0,1)\\). We denote \\[\\begin{equation} X | r,p \\sim \\text{NB}(r,p). \\end{equation}\\] Let us reparameterize the negative binomial distribution with \\(q = 1 - p\\). Find the PMF of \\(X \\sim \\text{NB}(1, q)\\). Do you recognize this distribution? Show that the sum of two negative binomial random variables with the same \\(p\\) is also a negative binomial random variable. Hint: Use the fact that the number of ways to place \\(n\\) indistinct balls into \\(k\\) boxes is \\(\\binom{n + k - 1}{n}\\). R: Draw samples from \\(X \\sim \\text{NB}(5, 0.4)\\) and \\(Y \\sim \\text{NB}(3, 0.4)\\). Draw samples from \\(Z = X + Y\\), where you use the parameters calculated in b). Plot both distributions, their sum, and \\(Z\\) using facet_wrap. Be careful, as R uses a different parameterization size=\\(r\\) and prob=\\(1 - p\\). Solution. \\[\\begin{align} P(X = k) &= \\binom{k + 1 - 1}{k}q^1 (1-q)^k \\\\ &= q(1-q)^k. \\end{align}\\] This is the geometric distribution. Let \\(X \\sim \\text{NB}(r_1, p)\\) and \\(Y \\sim \\text{NB}(r_2, p)\\). Let \\(Z = X + Y\\). 
\\[\\begin{align} P(Z = z) &= \\sum_{k = 0}^{\\infty} P(X = z - k)P(Y = k), \\text{ if k < 0, then the probabilities are 0} \\\\ &= \\sum_{k = 0}^{z} P(X = z - k)P(Y = k), \\text{ if k > z, then the probabilities are 0} \\\\ &= \\sum_{k = 0}^{z} \\binom{z - k + r_1 - 1}{z - k}(1 - p)^{r_1} p^{z - k} \\binom{k + r_2 - 1}{k}(1 - p)^{r_2} p^{k} & \\\\ &= \\sum_{k = 0}^{z} \\binom{z - k + r_1 - 1}{z - k} \\binom{k + r_2 - 1}{k}(1 - p)^{r_1 + r_2} p^{z} & \\\\ &= (1 - p)^{r_1 + r_2} p^{z} \\sum_{k = 0}^{z} \\binom{z - k + r_1 - 1}{z - k} \\binom{k + r_2 - 1}{k}& \\end{align}\\] The part before the sum reminds us of the negative binomial distribution with parameters \\(r_1 + r_2\\) and \\(p\\). To complete this term to the negative binomial PMF we need \\(\\binom{z + r_1 + r_2 -1}{z}\\). So the only thing we need to prove is that the sum equals this term. Both terms in the sum can be interpreted as numbers of ways to place a number of balls into boxes. For the left term it is \\(z-k\\) balls into \\(r_1\\) boxes, and for the right \\(k\\) balls into \\(r_2\\) boxes. For each \\(k\\) we are distributing \\(z\\) balls in total. By summing over all \\(k\\), we actually get all the possible placements of \\(z\\) balls into \\(r_1 + r_2\\) boxes. Therefore \\[\\begin{align} P(Z = z) &= (1 - p)^{r_1 + r_2} p^{z} \\sum_{k = 0}^{z} \\binom{z - k + r_1 - 1}{z - k} \\binom{k + r_2 - 1}{k}& \\\\ &= \\binom{z + r_1 + r_2 -1}{z} (1 - p)^{r_1 + r_2} p^{z}. \\end{align}\\] From this it also follows that the sum of geometric distributions with the same parameter is a negative binomial distribution. \\(Z \\sim \\text{NB}(8, 0.4)\\). set.seed(1) nsamps <- 10000 x <- rnbinom(nsamps, size = 5, prob = 0.6) y <- rnbinom(nsamps, size = 3, prob = 0.6) xpy <- x + y z <- rnbinom(nsamps, size = 8, prob = 0.6) samps <- tibble(x, y, xpy, z) samps <- melt(samps) ggplot(data = samps, aes(x = value)) + geom_bar() + facet_wrap(~ variable) 4.3 Continuous random variables Exercise 4.8 (Exponential random variable) A variable \\(X\\) with PDF \\(\\lambda e^{-\\lambda x}\\) is an exponential random variable with support in non-negative real numbers. It has one positive parameter \\(\\lambda\\). We denote \\[\\begin{equation} X | \\lambda \\sim \\text{Exp}(\\lambda). \\end{equation}\\] Find the CDF of an exponential random variable. Find the quantile function of an exponential random variable. Calculate the probability \\(P(1 \\leq X \\leq 3)\\), where \\(X \\sim \\text{Exp(1.5)}\\). R: Check your answer to c) with a simulation (rexp). Plot the probability in a meaningful way. R: Implement PDF, CDF, and the quantile function and compare their values with corresponding R functions visually. Hint: use the size parameter to make one of the curves wider. Solution. \\[\\begin{align} F(x) &= \\int_{0}^{x} \\lambda e^{-\\lambda t} dt \\\\ &= \\lambda \\int_{0}^{x} e^{-\\lambda t} dt \\\\ &= \\lambda (\\frac{1}{-\\lambda}e^{-\\lambda t} |_{0}^{x}) \\\\ &= \\lambda(\\frac{1}{\\lambda} - \\frac{1}{\\lambda} e^{-\\lambda x}) \\\\ &= 1 - e^{-\\lambda x}. \\end{align}\\] \\[\\begin{align} F(F^{-1}(x)) &= x \\\\ 1 - e^{-\\lambda F^{-1}(x)} &= x \\\\ e^{-\\lambda F^{-1}(x)} &= 1 - x \\\\ -\\lambda F^{-1}(x) &= \\ln(1 - x) \\\\ F^{-1}(x) &= - \\frac{ln(1 - x)}{\\lambda}. \\end{align}\\] \\[\\begin{align} P(1 \\leq X \\leq 3) &= P(X \\leq 3) - P(X \\leq 1) \\\\ &= P(X \\leq 3) - P(X \\leq 1) \\\\ &= 1 - e^{-1.5 \\times 3} - 1 + e^{-1.5 \\times 1} \\\\ &\\approx 0.212. 
\\end{align}\\] set.seed(1) nsamps <- 1000 samps <- rexp(nsamps, rate = 1.5) sum(samps >= 1 & samps <= 3) / nsamps ## [1] 0.212 exp_plot <- ggplot(data.frame(x = seq(0, 5, by = 0.01)), aes(x = x)) + stat_function(fun = dexp, args = list(rate = 1.5)) + stat_function(fun = dexp, args = list(rate = 1.5), xlim = c(1,3), geom = "area", fill = "red") plot(exp_plot) exp_pdf <- function(x, lambda) { return (lambda * exp(-lambda * x)) } exp_cdf <- function(x, lambda) { return (1 - exp(-lambda * x)) } exp_quant <- function(q, lambda) { return (-(log(1 - q) / lambda)) } ggplot(data = data.frame(x = seq(0, 5, by = 0.01)), aes(x = x)) + stat_function(fun = dexp, args = list(rate = 1.5), aes(color = "R"), size = 2.5) + stat_function(fun = exp_pdf, args = list(lambda = 1.5), aes(color = "Mine"), size = 1.2) + scale_color_manual(values = c("red", "black")) ggplot(data = data.frame(x = seq(0, 5, by = 0.01)), aes(x = x)) + stat_function(fun = pexp, args = list(rate = 1.5), aes(color = "R"), size = 2.5) + stat_function(fun = exp_cdf, args = list(lambda = 1.5), aes(color = "Mine"), size = 1.2) + scale_color_manual(values = c("red", "black")) ggplot(data = data.frame(x = seq(0, 1, by = 0.01)), aes(x = x)) + stat_function(fun = qexp, args = list(rate = 1.5), aes(color = "R"), size = 2.5) + stat_function(fun = exp_quant, args = list(lambda = 1.5), aes(color = "Mine"), size = 1.2) + scale_color_manual(values = c("red", "black")) Exercise 4.9 (Uniform random variable) Continuous uniform random variable with parameters \\(a\\) and \\(b\\) has the PDF \\[\\begin{equation} p(x) = \\begin{cases} \\frac{1}{b - a} & x \\in [a,b] \\\\ 0 & \\text{otherwise}. \\end{cases} \\end{equation}\\] Find the CDF of the uniform random variable. Find the quantile function of the uniform random variable. Let \\(X \\sim \\text{Uniform}(a,b)\\). Find the CDF of the variable \\(Y = \\frac{X - a}{b - a}\\). This is the standard uniform random variable. Let \\(X \\sim \\text{Uniform}(-1, 3)\\). Find such \\(z\\) that \\(P(X < z + \\mu_x) = \\frac{1}{5}\\). R: Check your result from d) using simulation. Solution. \\[\\begin{align} F(x) &= \\int_{a}^x \\frac{1}{b - a} dt \\\\ &= \\frac{1}{b - a} \\int_{a}^x dt \\\\ &= \\frac{x - a}{b - a}. \\end{align}\\] \\[\\begin{align} F(F^{-1}(p)) &= p \\\\ \\frac{F^{-1}(p) - a}{b - a} &= p \\\\ F^{-1}(p) &= p(b - a) + a. \\end{align}\\] \\[\\begin{align} F_Y(y) &= P(Y < y) \\\\ &= P(\\frac{X - a}{b - a} < y) \\\\ &= P(X < y(b - a) + a) \\\\ &= F_X(y(b - a) + a) \\\\ &= \\frac{(y(b - a) + a) - a}{b - a} \\\\ &= y. \\end{align}\\] \\[\\begin{align} P(X < z + 1) &= \\frac{1}{5} \\\\ F(z + 1) &= \\frac{1}{5} \\\\ z + 1 &= F^{-1}(\\frac{1}{5}) \\\\ z &= \\frac{1}{5}4 - 1 - 1 \\\\ z &= -1.2. \\end{align}\\] set.seed(1) a <- -1 b <- 3 nsamps <- 10000 unif_samp <- runif(nsamps, a, b) mu_x <- mean(unif_samp) new_samp <- unif_samp - mu_x quantile(new_samp, probs = 1/5) ## 20% ## -1.203192 punif(-0.2, -1, 3) ## [1] 0.2 Exercise 4.10 (Beta random variable) A variable \\(X\\) with PDF \\[\\begin{equation} p(x) = \\frac{x^{\\alpha - 1} (1 - x)^{\\beta - 1}}{\\text{B}(\\alpha, \\beta)}, \\end{equation}\\] where \\(\\text{B}(\\alpha, \\beta) = \\frac{\\Gamma(\\alpha) \\Gamma(\\beta)}{\\Gamma(\\alpha + \\beta)}\\) and \\(\\Gamma(x) = \\int_0^{\\infty} x^{z - 1} e^{-x} dx\\) is a Beta random variable with support on \\([0,1]\\). It has two positive parameters \\(\\alpha\\) and \\(\\beta\\). 
Notation: \\[\\begin{equation} X | \\alpha, \\beta \\sim \\text{Beta}(\\alpha, \\beta) \\end{equation}\\] It is often used in modeling rates. Calculate the PDF for \\(\\alpha = 1\\) and \\(\\beta = 1\\). What do you notice? R: Plot densities of the beta distribution for parameter pairs (2, 2), (4, 1), (1, 4), (2, 5), and (0.1, 0.1). R: Sample from \\(X \\sim \\text{Beta}(2, 5)\\) and compare the histogram with Beta PDF. Solution. \\[\\begin{equation} p(x) = \\frac{x^{1 - 1} (1 - x)^{1 - 1}}{\\text{B}(1, 1)} = 1. \\end{equation}\\] This is the standard uniform distribution. set.seed(1) ggplot(data = data.frame(x = seq(0, 1, by = 0.01)), aes(x = x)) + stat_function(fun = dbeta, args = list(shape1 = 2, shape2 = 2), aes(color = "alpha = 0.5")) + stat_function(fun = dbeta, args = list(shape1 = 4, shape2 = 1), aes(color = "alpha = 4")) + stat_function(fun = dbeta, args = list(shape1 = 1, shape2 = 4), aes(color = "alpha = 1")) + stat_function(fun = dbeta, args = list(shape1 = 2, shape2 = 5), aes(color = "alpha = 25")) + stat_function(fun = dbeta, args = list(shape1 = 0.1, shape2 = 0.1), aes(color = "alpha = 0.1")) set.seed(1) nsamps <- 1000 samps <- rbeta(nsamps, 2, 5) ggplot(data = data.frame(x = samps), aes(x = x)) + geom_histogram(aes(y = ..density..), color = "black") + stat_function(data = data.frame(x = seq(0, 1, by = 0.01)), aes(x = x), fun = dbeta, args = list(shape1 = 2, shape2 = 5), color = "red", size = 1.2) Exercise 4.11 (Gamma random variable) A random variable with PDF \\[\\begin{equation} p(x) = \\frac{\\beta^\\alpha}{\\Gamma(\\alpha)} x^{\\alpha - 1}e^{-\\beta x} \\end{equation}\\] is a Gamma random variable with support on the positive numbers and parameters shape \\(\\alpha > 0\\) and rate \\(\\beta > 0\\). We write \\[\\begin{equation} X | \\alpha, \\beta \\sim \\text{Gamma}(\\alpha, \\beta) \\end{equation}\\] and it’s CDF is \\[\\begin{equation} \\frac{\\gamma(\\alpha, \\beta x)}{\\Gamma(\\alpha)}, \\end{equation}\\] where \\(\\gamma(s, x) = \\int_0^x t^{s-1} e^{-t} dt\\). It is usually used in modeling positive phenomena (for example insurance claims and rainfalls). Let \\(X \\sim \\text{Gamma}(1, \\beta)\\). Find the PDF of \\(X\\). Do you recognize this PDF? Let \\(k = \\alpha\\) and \\(\\theta = \\frac{1}{\\beta}\\). Find the PDF of \\(X | k, \\theta \\sim \\text{Gamma}(k, \\theta)\\). Random variables can be reparameterized, and sometimes a reparameterized distribution is more suitable for certain calculations. The first parameterization is for example usually used in Bayesian statistics, while this parameterization is more common in econometrics and some other applied fields. Note that you also need to pay attention to the parameters in statistical software, so diligently read the help files when using functions like rgamma to see how the function is parameterized. R: Plot gamma CDF for random variables with shape and rate parameters (1,1), (10,1), (1,10). Solution. \\[\\begin{align} p(x) &= \\frac{\\beta^1}{\\Gamma(1)} x^{1 - 1}e^{-\\beta x} \\\\ &= \\beta e^{-\\beta x} \\end{align}\\] This is the PDF of the exponential distribution with parameter \\(\\beta\\). \\[\\begin{align} p(x) &= \\frac{1}{\\Gamma(k)\\beta^k} x^{k - 1}e^{-\\frac{x}{\\theta}}. 
\\end{align}\\] set.seed(1) ggplot(data = data.frame(x = seq(0, 25, by = 0.01)), aes(x = x)) + stat_function(fun = pgamma, args = list(shape = 1, rate = 1), aes(color = "Gamma(1,1)")) + stat_function(fun = pgamma, args = list(shape = 10, rate = 1), aes(color = "Gamma(10,1)")) + stat_function(fun = pgamma, args = list(shape = 1, rate = 10), aes(color = "Gamma(1,10)")) Exercise 4.12 (Normal random variable) A random variable with PDF \\[\\begin{equation} p(x) = \\frac{1}{\\sqrt{2 \\pi \\sigma^2}} e^{-\\frac{(x - \\mu)^2}{2 \\sigma^2}} \\end{equation}\\] is a normal random variable with support on the real axis and parameters \\(\\mu\\) in reals and \\(\\sigma^2 > 0\\). The first is the mean parameter and the second is the variance parameter. Many statistical methods assume a normal distribution. We denote \\[\\begin{equation} X | \\mu, \\sigma \\sim \\text{N}(\\mu, \\sigma^2), \\end{equation}\\] and it’s CDF is \\[\\begin{equation} F(x) = \\int_{-\\infty}^x \\frac{1}{\\sqrt{2 \\pi \\sigma^2}} e^{-\\frac{(t - \\mu)^2}{2 \\sigma^2}} dt, \\end{equation}\\] which is intractable and is usually approximated. Due to its flexibility it is also one of the most researched distributions. For that reason statisticians often use transformations of variables or approximate distributions with the normal distribution. Show that a variable \\(\\frac{X - \\mu}{\\sigma} \\sim \\text{N}(0,1)\\). This transformation is called standardization, and \\(\\text{N}(0,1)\\) is a standard normal distribution. R: Plot the normal distribution with \\(\\mu = 0\\) and different values for the \\(\\sigma\\) parameter. R: The normal distribution provides a good approximation for the Poisson distribution with a large \\(\\lambda\\). Let \\(X \\sim \\text{Poisson}(50)\\). Approximate \\(X\\) with the normal distribution and compare its density with the Poisson histogram. What are the values of \\(\\mu\\) and \\(\\sigma^2\\) that should provide the best approximation? Note that R function rnorm takes standard deviation (\\(\\sigma\\)) as a parameter and not variance. Solution. \\[\\begin{align} P(\\frac{X - \\mu}{\\sigma} < x) &= P(X < \\sigma x + \\mu) \\\\ &= F(\\sigma x + \\mu) \\\\ &= \\int_{-\\infty}^{\\sigma x + \\mu} \\frac{1}{\\sqrt{2 \\pi \\sigma^2}} e^{-\\frac{(t - \\mu)^2}{2\\sigma^2}} dt \\end{align}\\] Now let \\(s = f(t) = \\frac{t - \\mu}{\\sigma}\\), then \\(ds = \\frac{dt}{\\sigma}\\) and \\(f(\\sigma x + \\mu) = x\\), so \\[\\begin{align} P(\\frac{X - \\mu}{\\sigma} < x) &= \\int_{-\\infty}^{x} \\frac{1}{\\sqrt{2 \\pi}} e^{-\\frac{s^2}{2}} ds. \\end{align}\\] There is no need to evaluate this integral, as we recognize it as the CDF of a normal distribution with \\(\\mu = 0\\) and \\(\\sigma^2 = 1\\). 
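As a quick empirical check of the standardization result in a) (our addition, with arbitrary parameters mu = 3 and sigma = 2), standardized samples should have mean close to 0, standard deviation close to 1, and approximately standard normal quantiles. The book's code for b) and c) follows.

```r
# Added check of a): standardized samples from N(3, 2^2) should look standard normal.
set.seed(1)
x <- rnorm(100000, mean = 3, sd = 2)
z <- (x - 3) / 2
c(mean = mean(z), sd = sd(z))     # close to 0 and 1
quantile(z, c(0.025, 0.975))      # close to -1.96 and 1.96
qnorm(c(0.025, 0.975))            # standard normal reference quantiles
```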
set.seed(1) # b ggplot(data = data.frame(x = seq(-15, 15, by = 0.01)), aes(x = x)) + stat_function(fun = dnorm, args = list(mean = 0, sd = 1), aes(color = "sd = 1")) + stat_function(fun = dnorm, args = list(mean = 0, sd = 0.4), aes(color = "sd = 0.1")) + stat_function(fun = dnorm, args = list(mean = 0, sd = 2), aes(color = "sd = 2")) + stat_function(fun = dnorm, args = list(mean = 0, sd = 5), aes(color = "sd = 5")) # c mean_par <- 50 nsamps <- 100000 pois_samps <- rpois(nsamps, lambda = mean_par) norm_samps <- rnorm(nsamps, mean = mean_par, sd = sqrt(mean_par)) my_plot <- ggplot() + geom_bar(data = tibble(x = pois_samps), aes(x = x, y = (..count..)/sum(..count..))) + geom_density(data = tibble(x = norm_samps), aes(x = x), color = "red") plot(my_plot) Exercise 4.13 (Logistic random variable) A logistic random variable has CDF \\[\\begin{equation} F(x) = \\frac{1}{1 + e^{-\\frac{x - \\mu}{s}}}, \\end{equation}\\] where \\(\\mu\\) is real and \\(s > 0\\). The support is on the real axis. We denote \\[\\begin{equation} X | \\mu, s \\sim \\text{Logistic}(\\mu, s). \\end{equation}\\] The distribution of the logistic random variable resembles a normal random variable, however it has heavier tails. Find the PDF of a logistic random variable. R: Implement logistic PDF and CDF and visually compare both for \\(X \\sim \\text{N}(0, 1)\\) and \\(Y \\sim \\text{logit}(0, \\sqrt{\\frac{3}{\\pi^2}})\\). These distributions have the same mean and variance. Additionally, plot the same plot on the interval [5,10], to better see the difference in the tails. R: For the distributions in b) find the probability \\(P(|X| > 4)\\) and interpret the result. Solution. \\[\\begin{align} p(x) &= \\frac{d}{dx} \\frac{1}{1 + e^{-\\frac{x - \\mu}{s}}} \\\\ &= \\frac{- \\frac{d}{dx} (1 + e^{-\\frac{x - \\mu}{s}})}{(1 + e{-\\frac{x - \\mu}{s}})^2} \\\\ &= \\frac{e^{-\\frac{x - \\mu}{s}}}{s(1 + e{-\\frac{x - \\mu}{s}})^2}. \\end{align}\\] # b set.seed(1) logit_pdf <- function (x, mu, s) { return ((exp(-(x - mu)/(s))) / (s * (1 + exp(-(x - mu)/(s)))^2)) } nl_plot <- ggplot(data = data.frame(x = seq(-12, 12, by = 0.01)), aes(x = x)) + stat_function(fun = dnorm, args = list(mean = 0, sd = 2), aes(color = "normal")) + stat_function(fun = logit_pdf, args = list(mu = 0, s = sqrt(12/pi^2)), aes(color = "logit")) plot(nl_plot) nl_plot <- ggplot(data = data.frame(x = seq(5, 10, by = 0.01)), aes(x = x)) + stat_function(fun = dnorm, args = list(mean = 0, sd = 2), aes(color = "normal")) + stat_function(fun = logit_pdf, args = list(mu = 0, s = sqrt(12/pi^2)), aes(color = "logit")) plot(nl_plot) # c logit_cdf <- function (x, mu, s) { return (1 / (1 + exp(-(x - mu) / s))) } p_logistic <- 1 - logit_cdf(4, 0, sqrt(12/pi^2)) + logit_cdf(-4, 0, sqrt(12/pi^2)) p_norm <- 1 - pnorm(4, 0, 2) + pnorm(-4, 0, 2) p_logistic ## [1] 0.05178347 p_norm ## [1] 0.04550026 # Logistic distribution has wider tails, therefore the probability of larger # absolute values is higher. 4.4 Singular random variables Exercise 4.14 (Cantor distribution) The Cantor set is a subset of \\([0,1]\\), which we create by iteratively deleting the middle third of the interval. For example, in the first iteration, we get the sets \\([0,\\frac{1}{3}]\\) and \\([\\frac{2}{3},1]\\). In the second iteration, we get \\([0,\\frac{1}{9}]\\), \\([\\frac{2}{9},\\frac{1}{3}]\\), \\([\\frac{2}{3}, \\frac{7}{9}]\\), and \\([\\frac{8}{9}, 1]\\). 
On the \\(n\\)-th iteration, we have \\[\\begin{equation} C_n = \\frac{C_{n-1}}{3} \\cup \\bigg(\\frac{2}{3} + \\frac{C_{n-1}}{3} \\bigg), \\end{equation}\\] where \\(C_0 = [0,1]\\). The Cantor set is then defined as the intersection of these sets \\[\\begin{equation} C = \\cap_{n=1}^{\\infty} C_n. \\end{equation}\\] It has the same cardinality as \\([0,1]\\). Another way to define the Cantor set is the set of all numbers on \\([0,1]\\), that do not have a 1 in the ternary representation \\(x = \\sum_{n=1}^\\infty \\frac{x_i}{3^i}, x_i \\in \\{0,1,2\\}\\). A random variable follows the Cantor distribution, if its CDF is the Cantor function (below). You can find the implementations of random number generator, CDF, and quantile functions for the Cantor distributions at https://github.com/Henrygb/CantorDist.R. Show that the Lebesgue measure of the Cantor set is 0. (Jagannathan) Let us look at an infinite sequence of independent fair-coin tosses. If the outcome is heads, let \\(x_i = 2\\) and \\(x_i = 0\\), when tails. Then use these to create \\(x = \\sum_{n=1}^\\infty \\frac{x_i}{3^i}\\). This is a random variable with the Cantor distribution. Show that \\(X\\) has a singular distribution. Solution. \\[\\begin{align} \\lambda(C) &= 1 - \\lambda(C^c) \\\\ &= 1 - \\frac{1}{3}\\sum_{k = 0}^\\infty (\\frac{2}{3})^k \\\\ &= 1 - \\frac{\\frac{1}{3}}{1 - \\frac{2}{3}} \\\\ &= 0. \\end{align}\\] First, for every \\(x\\), the probability of observing it is \\(\\lim_{n \\rightarrow \\infty} \\frac{1}{2^n} = 0\\). Second, the probability that we observe one of all the possible sequences is 1. Therefore \\(P(C) = 1\\). So this is a singular variable. The CDF only increments on the elements of the Cantor set. 4.5 Transformations Exercise 4.15 Let \\(X\\) be a random variable that is uniformly distributed on \\(\\{-2, -1, 0, 1, 2\\}\\). Find the PMF of \\(Y = X^2\\). Solution. \\[\\begin{align} P_Y(y) = \\sum_{x \\in \\sqrt(y)} P_X(x) = \\begin{cases} 0 & y \\notin \\{0,1,4\\} \\\\ \\frac{1}{5} & y = 0 \\\\ \\frac{2}{5} & y \\in \\{1,4\\} \\end{cases} \\end{align}\\] Exercise 4.16 (Lognormal random variable) A lognormal random variable is a variable whose logarithm is normally distributed. In practice, we often encounter skewed data. Usually using a log transformation on such data makes it more symmetric and therefore more suitable for modeling with the normal distribution (more on why we wish to model data with the normal distribution in the following chapters). Let \\(X \\sim \\text{N}(\\mu,\\sigma)\\). Find the PDF of \\(Y: \\log(Y) = X\\). R: Sample from the lognormal distribution with parameters \\(\\mu = 5\\) and \\(\\sigma = 2\\). Plot a histogram of the samples. Then log-transform the samples and plot a histogram along with the theoretical normal PDF. Solution. \\[\\begin{align} p_Y(y) &= p_X(\\log(y)) \\frac{d}{dy} \\log(y) \\\\ &= \\frac{1}{\\sqrt{2 \\pi \\sigma^2}} e^{\\frac{(\\log(y) - \\mu)^2}{2 \\sigma^2}} \\frac{1}{y} \\\\ &= \\frac{1}{y \\sqrt{2 \\pi \\sigma^2}} e^{\\frac{(\\log(y) - \\mu)^2}{2 \\sigma^2}}. 
\\end{align}\\] set.seed(1) nsamps <- 10000 mu <- 0.5 sigma <- 0.4 ln_samps <- rlnorm(nsamps, mu, sigma) ln_plot <- ggplot(data = data.frame(x = ln_samps), aes(x = x)) + geom_histogram(color = "black") plot(ln_plot) norm_samps <- log(ln_samps) n_plot <- ggplot(data = data.frame(x = norm_samps), aes(x = x)) + geom_histogram(aes(y = ..density..), color = "black") + stat_function(fun = dnorm, args = list(mean = mu, sd = sigma), color = "red") plot(n_plot) Exercise 4.17 (Probability integral transform) This exercise is borrowed from Wasserman. Let \\(X\\) have a continuous, strictly increasing CDF \\(F\\). Let \\(Y = F(X)\\). Find the density of \\(Y\\). This is called the probability integral transform. Let \\(U \\sim \\text{Uniform}(0,1)\\) and let \\(X = F^{-1}(U)\\). Show that \\(X \\sim F\\). R: Implement a program that takes Uniform(0,1) random variables and generates random variables from an exponential(\\(\\beta\\)) distribution. Compare your implemented function with function rexp in R. Solution. \\[\\begin{align} F_Y(y) &= P(Y < y) \\\\ &= P(F(X) < y) \\\\ &= P(X < F_X^{-1}(y)) \\\\ &= F_X(F_X^{-1}(y)) \\\\ &= y. \\end{align}\\] From the above it follows that \\(p(y) = 1\\). Note that we need to know the inverse CDF to be able to apply this procedure. \\[\\begin{align} P(X < x) &= P(F^{-1}(U) < x) \\\\ &= P(U < F(x)) \\\\ &= F_U(F(x)) \\\\ &= F(x). \\end{align}\\] set.seed(1) nsamps <- 10000 beta <- 4 generate_exp <- function (n, beta) { tmp <- runif(n) X <- qexp(tmp, beta) return (X) } df <- tibble("R" = rexp(nsamps, beta), "myGenerator" = generate_exp(nsamps, beta)) %>% gather() ggplot(data = df, aes(x = value, fill = key)) + geom_histogram(position = "dodge") "],["mrvs.html", "Chapter 5 Multiple random variables 5.1 General 5.2 Bivariate distribution examples 5.3 Transformations", " Chapter 5 Multiple random variables This chapter deals with multiple random variables and their distributions. The students are expected to acquire the following knowledge: Theoretical Calculation of PDF of transformed multiple random variables. Finding marginal and conditional distributions. R Scatterplots of bivariate random variables. New R functions (for example, expand.grid). .fold-btn { float: right; margin: 5px 5px 0 0; } .fold { border: 1px solid black; min-height: 40px; } 5.1 General Exercise 5.1 Let \\(X \\sim \\text{N}(0,1)\\) and \\(Y \\sim \\text{N}(0,1)\\) be independent random variables. Draw 1000 samples from \\((X,Y)\\) and plot a scatterplot. Now let \\(X \\sim \\text{N}(0,1)\\) and \\(Y | X = x \\sim N(ax, 1)\\). Draw 1000 samples from \\((X,Y)\\) for \\(a = 1\\), \\(a=0\\), and \\(a=-0.5\\). Plot the scatterplots. How would you interpret parameter \\(a\\)? Plot the marginal distribution of \\(Y\\) for cases \\(a=1\\), \\(a=0\\), and \\(a=-0.5\\). Can you guess which distribution it is? set.seed(1) nsamps <- 1000 x <- rnorm(nsamps) y <- rnorm(nsamps) ggplot(data.frame(x, y), aes(x = x, y = y)) + geom_point() y1 <- rnorm(nsamps, mean = 1 * x) y2 <- rnorm(nsamps, mean = 0 * x) y3 <- rnorm(nsamps, mean = -0.5 * x) df <- tibble(x = c(x,x,x), y = c(y1,y2,y3), a = c(rep(1, nsamps), rep(0, nsamps), rep(-0.5, nsamps))) ggplot(df, aes(x = x, y = y)) + geom_point() + facet_wrap(~a) + coord_equal(ratio=1) # Parameter a controls the scale of linear dependency between X and Y. 
ggplot(df, aes(x = y)) + geom_density() + facet_wrap(~a) 5.2 Bivariate distribution examples Exercise 5.2 (Discrete bivariate random variable) Let \\(X\\) represent the event that a die rolls an even number and let \\(Y\\) represent the event that a die rolls one, two, or a three. Find the marginal distributions of \\(X\\) and \\(Y\\). Find the PMF of \\((X,Y)\\). Find the CDF of \\((X,Y)\\). Find \\(P(X = 1 | Y = 1)\\). Solution. \\[\\begin{align} P(X = 1) = \\frac{1}{2} \\text{ and } P(X = 0) = \\frac{1}{2} \\\\ P(Y = 1) = \\frac{1}{2} \\text{ and } P(Y = 0) = \\frac{1}{2} \\\\ \\end{align}\\] \\[\\begin{align} P(X = 1, Y = 1) = \\frac{1}{6} \\\\ P(X = 1, Y = 0) = \\frac{2}{6} \\\\ P(X = 0, Y = 1) = \\frac{2}{6} \\\\ P(X = 0, Y = 0) = \\frac{1}{6} \\end{align}\\] \\[\\begin{align} P(X \\leq x, Y \\leq y) = \\begin{cases} \\frac{1}{6} & x = 0, y = 0 \\\\ \\frac{3}{6} & x \\neq y \\\\ 1 & x = 1, y = 1 \\end{cases} \\end{align}\\] \\[\\begin{align} P(X = 1 | Y = 1) = \\frac{1}{3} \\end{align}\\] Exercise 5.3 (Continuous bivariate random variable) Let \\(p(x,y) = 6 (x - y)^2\\) be the PDF of a bivariate random variable \\((X,Y)\\), where both variables range from zero to one. Find CDF. Find marginal distributions. Find conditional distributions. R: Plot a grid of points and colour them by value – this can help us visualize the PDF. R: Implement a random number generator, which will generate numbers from \\((X,Y)\\) and visually check the results. R: Plot the marginal distribution of \\(Y\\) and the conditional distributions of \\(X | Y = y\\), where \\(y \\in \\{0, 0.1, 0.5\\}\\). Solution. \\[\\begin{align} F(x,y) &= \\int_0^{x} \\int_0^{y} 6 (t - s)^2 ds dt\\\\ &= 6 \\int_0^{x} \\int_0^{y} t^2 - 2ts + s^2 ds dt\\\\ &= 6 \\int_0^{x} t^2y - ty^2 + \\frac{y^3}{3} dt \\\\ &= 6 (\\frac{x^3 y}{3} - \\frac{x^2y^2}{2} + \\frac{x y^3}{3}) \\\\ &= 2 x^3 y - 3 t^2y^2 + 2 x y^3 \\end{align}\\] \\[\\begin{align} p(x) &= \\int_0^{1} 6 (x - y)^2 dy\\\\ &= 6 (x^2 - x + \\frac{1}{3}) \\\\ &= 6x^2 - 6x + 2 \\end{align}\\] \\[\\begin{align} p(y) &= \\int_0^{1} 6 (x - y)^2 dx\\\\ &= 6 (y^2 - y + \\frac{1}{3}) \\\\ &= 6y^2 - 6y + 2 \\end{align}\\] \\[\\begin{align} p(x|y) &= \\frac{p(xy)}{p(y)} \\\\ &= \\frac{6 (x - y)^2}{6 (y^2 - y + \\frac{1}{3})} \\\\ &= \\frac{(x - y)^2}{y^2 - y + \\frac{1}{3}} \\end{align}\\] \\[\\begin{align} p(y|x) &= \\frac{p(xy)}{p(x)} \\\\ &= \\frac{6 (x - y)^2}{6 (x^2 - x + \\frac{1}{3})} \\\\ &= \\frac{(x - y)^2}{x^2 - x + \\frac{1}{3}} \\end{align}\\] set.seed(1) # d pxy <- function (x, y) { return ((x - y)^2) } x_axis <- seq(0, 1, length.out = 100) y_axis <- seq(0, 1, length.out = 100) df <- expand.grid(x_axis, y_axis) colnames(df) <- c("x", "y") df <- cbind(df, pdf = pxy(df$x, df$y)) ggplot(data = df, aes(x = x, y = y, color = pdf)) + geom_point() # e samps <- NULL for (i in 1:10000) { xt <- runif(1, 0, 1) yt <- runif(1, 0, 1) pdft <- pxy(xt, yt) acc <- runif(1, 0, 6) if (acc <= pdft) { samps <- rbind(samps, c(xt, yt)) } } colnames(samps) <- c("x", "y") ggplot(data = as.data.frame(samps), aes(x = x, y = y)) + geom_point() # f mar_pdf <- function (x) { return (6 * x^2 - 6 * x + 2) } cond_pdf <- function (x, y) { return (((x - y)^2) / (y^2 - y + 1/3)) } df <- tibble(x = x_axis, mar = mar_pdf(x), y0 = cond_pdf(x, 0), y0.1 = cond_pdf(x, 0.1), y0.5 = cond_pdf(x, 0.5)) %>% gather(dist, value, -x) ggplot(df, aes(x = x, y = value, color = dist)) + geom_line() Exercise 5.4 (Mixed bivariate random variable) Let \\(f(x,y) = \\frac{\\beta^{\\alpha}}{\\Gamma(\\alpha)y!} x^{y+ \\alpha -1} 
e^{-x(1 + \\beta)}\\) be the PDF of a bivariate random variable, where \\(x \\in (0, \\infty)\\) and \\(y \\in \\mathbb{N}_0\\). Find the marginal distribution of \\(X\\). Do you recognize this distribution? Find the conditional distribution of \\(Y | X\\). Do you recognize this distribution? Calculate the probability \\(P(Y = 2 | X = 2.5)\\) for \\((X,Y)\\). Find the marginal distribution of \\(Y\\). Do you recognize this distribution? R: Take 1000 random samples from \\((X,Y)\\) with parameters \\(\\beta = 1\\) and \\(\\alpha = 1\\). Plot a scatterplot. Plot a bar plot of the marginal distribution of \\(Y\\), and the theoretical PMF calculated from d) on the range from 0 to 10. Hint: Use the gamma function in R.? Solution. \\[\\begin{align} p(x) &= \\sum_{k = 0}^{\\infty} \\frac{\\beta^{\\alpha}}{\\Gamma(\\alpha)k!} x^{k + \\alpha -1} e^{-x(1 + \\beta)} & \\\\ &= \\sum_{k = 0}^{\\infty} \\frac{\\beta^{\\alpha}}{\\Gamma(\\alpha)k!} x^{k} x^{\\alpha -1} e^{-x} e^{-\\beta x} & \\\\ &= \\frac{\\beta^{\\alpha}}{\\Gamma(\\alpha)} x^{\\alpha -1} e^{-\\beta x} \\sum_{k = 0}^{\\infty} \\frac{1}{k!} x^{k} e^{-x} & \\\\ &= \\frac{\\beta^{\\alpha}}{\\Gamma(\\alpha)} x^{\\alpha -1} e^{-\\beta x} & \\text{the last term above sums to one} \\end{align}\\] This is the Gamma PDF. \\[\\begin{align} p(y|x) &= \\frac{p(x,y)}{p(x)} \\\\ &= \\frac{\\frac{\\beta^{\\alpha}}{\\Gamma(\\alpha)y!} x^{y+ \\alpha -1} e^{-x(1 + \\beta)}}{\\frac{\\beta^{\\alpha}}{\\Gamma(\\alpha)} x^{\\alpha -1} e^{-\\beta x}} \\\\ &= \\frac{x^y e^{-x}}{y!}. \\end{align}\\] This is the Poisson PMF. \\[\\begin{align} P(Y = 2 | X = 2.5) = \\frac{2.5^2 e^{-2.5}}{2!} \\approx 0.26. \\end{align}\\] \\[\\begin{align} p(y) &= \\int_{0}^{\\infty} \\frac{\\beta^{\\alpha}}{\\Gamma(\\alpha)y!} x^{y + \\alpha -1} e^{-x(1 + \\beta)} dx & \\\\ &= \\frac{1}{y!} \\int_{0}^{\\infty} \\frac{\\beta^{\\alpha}}{\\Gamma(\\alpha)} x^{(y + \\alpha) -1} e^{-(1 + \\beta)x} dx & \\\\ &= \\frac{1}{y!} \\frac{\\beta^{\\alpha}}{\\Gamma(\\alpha)} \\int_{0}^{\\infty} \\frac{\\Gamma(y + \\alpha)}{(1 + \\beta)^{y + \\alpha}} \\frac{(1 + \\beta)^{y + \\alpha}}{\\Gamma(y + \\alpha)} x^{(y + \\alpha) -1} e^{-(1 + \\beta)x} dx & \\text{complete to Gamma PDF} \\\\ &= \\frac{1}{y!} \\frac{\\beta^{\\alpha}}{\\Gamma(\\alpha)} \\frac{\\Gamma(y + \\alpha)}{(1 + \\beta)^{y + \\alpha}}. \\end{align}\\] We add the terms in the third equality to get a Gamma PDF inside the integral, which then integrates to one. We do not recognize this distribution. set.seed(1) px <- function (x, alpha, beta) { return((1 / factorial(x)) * (beta^alpha / gamma(alpha)) * (gamma(x + alpha) / (1 + beta)^(x + alpha))) } nsamps <- 1000 rx <- rgamma(nsamps, 1, 1) ryx <- rpois(nsamps, rx) ggplot(data = data.frame(x = rx, y = ryx), aes(x = x, y = y)) + geom_point() ggplot(data = data.frame(x = rx, y = ryx), aes(x = y)) + geom_bar(aes(y = (..count..)/sum(..count..))) + stat_function(fun = px, args = list(alpha = 1, beta = 1), color = "red") Exercise 5.5 Let \\(f(x,y) = cx^2y\\) for \\(x^2 \\leq y \\leq 1\\) and zero otherwise. Find such \\(c\\) that \\(f\\) is a PDF of a bivariate random variable. This exercise is borrowed from Wasserman. Solution. \\[\\begin{align} 1 &= \\int_{-1}^{1} \\int_{x^2}^1 cx^2y dy dx \\\\ &= \\int_{-1}^{1} cx^2 (\\frac{1}{2} - \\frac{x^4}{2}) dx \\\\ &= \\frac{c}{2} \\int_{-1}^{1} x^2 - x^6 dx \\\\ &= \\frac{c}{2} (\\frac{1}{3} + \\frac{1}{3} - \\frac{1}{7} - \\frac{1}{7}) \\\\ &= \\frac{c}{2} \\frac{8}{21} \\\\ &= \\frac{4c}{21} \\end{align}\\] It follows \\(c = \\frac{21}{4}\\). 
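R: As a quick numerical sanity check (not part of the original exercise), we can approximate the integral of \\(x^2 y\\) over the region \\(x^2 \\leq y \\leq 1\\) with simple Monte Carlo and confirm that multiplying it by \\(c = \\frac{21}{4}\\) gives approximately 1. This is only an illustrative sketch; the reference value \\(\\frac{4}{21}\\) is the integral derived above.
set.seed(1)
# Monte Carlo check of the normalizing constant c = 21/4 (illustrative sketch)
nsamps <- 100000
x <- runif(nsamps, -1, 1)            # uniform proposals on the bounding rectangle [-1,1] x [0,1]
y <- runif(nsamps, 0, 1)
inside <- x^2 <= y                   # indicator of the support x^2 <= y <= 1
rect_area <- 2 * 1                   # area of the bounding rectangle
int_approx <- rect_area * mean(x^2 * y * inside)
int_approx                           # should be close to 4/21, roughly 0.1905
(21 / 4) * int_approx                # should be close to 1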
5.3 Transformations Exercise 5.6 Let \\((X,Y)\\) be uniformly distributed on the unit ball \\(\\{(x,y,z) : x^2 + y^2 + z^2 \\leq 1\\}\\). Let \\(R = \\sqrt{X^2 + Y^2 + Z^2}\\). Find the CDF and PDF of \\(R\\). Solution. \\[\\begin{align} P(R < r) &= P(\\sqrt{X^2 + Y^2 + Z^2} < r) \\\\ &= P(X^2 + Y^2 + Z^2 < r^2) \\\\ &= \\frac{\\frac{4}{3} \\pi r^3}{\\frac{4}{3}\\pi} \\\\ &= r^3. \\end{align}\\] The second line shows us that we are looking at the probability which is represented by a smaller ball with radius \\(r\\). To get the probability, we divide it by the radius of the whole ball. We get the PDF by differentiating the CDF, so \\(p(r) = 3r^2\\). "],["integ.html", "Chapter 6 Integration 6.1 Monte Carlo integration 6.2 Lebesgue integrals", " Chapter 6 Integration This chapter deals with abstract and Monte Carlo integration. The students are expected to acquire the following knowledge: Theoretical How to calculate Lebesgue integrals for non-simple functions. R Monte Carlo integration. .fold-btn { float: right; margin: 5px 5px 0 0; } .fold { border: 1px solid black; min-height: 40px; } 6.1 Monte Carlo integration Exercise 6.1 Let \\(X\\) and \\(Y\\) be continuous random variables on the unit interval and \\(p(x,y) = 6(x - y)^2\\). Use Monte Carlo integration to estimate the probability \\(P(0.2 \\leq X \\leq 0.5, \\: 0.1 \\leq Y \\leq 0.2)\\). Can you find the exact value? set.seed(1) nsamps <- 1000 V <- (0.5 - 0.2) * (0.2 - 0.1) x1 <- runif(nsamps, 0.2, 0.5) x2 <- runif(nsamps, 0.1, 0.2) f_1 <- function (x, y) { return (6 * (x - y)^2) } mcint <- V * (1 / nsamps) * sum(f_1(x1, x2)) sdm <- sqrt((V^2 / nsamps) * var(f_1(x1, x2))) mcint ## [1] 0.008793445 sdm ## [1] 0.0002197686 F_1 <- function (x, y) { return (2 * x^3 * y - 3 * x^2 * y^2 + 2 * x * y^3) } F_1(0.5, 0.2) - F_1(0.2, 0.2) - F_1(0.5, 0.1) + F_1(0.2, 0.1) ## [1] 0.0087 6.2 Lebesgue integrals Exercise 6.2 (borrowed from Jagannathan) Find the Lebesgue integral of the following functions on (\\(\\mathbb{R}\\), \\(\\mathcal{B}(\\mathbb{R})\\), \\(\\lambda\\)). \\[\\begin{align} f(\\omega) = \\begin{cases} \\omega, & \\text{for } \\omega = 0,1,...,n \\\\ 0, & \\text{elsewhere} \\end{cases} \\end{align}\\] \\[\\begin{align} f(\\omega) = \\begin{cases} 1, & \\text{for } \\omega = \\mathbb{Q}^c \\cap [0,1] \\\\ 0, & \\text{elsewhere} \\end{cases} \\end{align}\\] \\[\\begin{align} f(\\omega) = \\begin{cases} n, & \\text{for } \\omega = \\mathbb{Q}^c \\cap [0,n] \\\\ 0, & \\text{elsewhere} \\end{cases} \\end{align}\\] Solution. \\[\\begin{align} \\int f(\\omega) d\\lambda = \\sum_{\\omega = 0}^n \\omega \\lambda(\\omega) = 0. \\end{align}\\] \\[\\begin{align} \\int f(\\omega) d\\lambda = 1 \\times \\lambda(\\mathbb{Q}^c \\cap [0,1]) = 1. \\end{align}\\] \\[\\begin{align} \\int f(\\omega) d\\lambda = n \\times \\lambda(\\mathbb{Q}^c \\cap [0,n]) = n^2. \\end{align}\\] Exercise 6.3 (borrowed from Jagannathan) Let \\(c \\in \\mathbb{R}\\) be fixed and (\\(\\mathbb{R}\\), \\(\\mathcal{B}(\\mathbb{R})\\)) a measurable space. If for any Borel set \\(A\\), \\(\\delta_c (A) = 1\\) if \\(c \\in A\\), and \\(\\delta_c (A) = 0\\) otherwise, then \\(\\delta_c\\) is called a Dirac measure. Let \\(g\\) be a non-negative, measurable function. Show that \\(\\int g d \\delta_c = g(c)\\). Solution. 
\\[\\begin{align} \\int g d \\delta_c &= \\sup_{q \\in S(g)} \\int q d \\delta_c \\\\ &= \\sup_{q \\in S(g)} \\sum_{i = 1}^n a_i \\delta_c(A_i) \\\\ &= \\sup_{q \\in S(g)} \\sum_{i = 1}^n a_i \\text{I}_{A_i}(c) \\\\ &= \\sup_{q \\in S(g)} q(c) \\\\ &= g(c) \\end{align}\\] "],["ev.html", "Chapter 7 Expected value 7.1 Discrete random variables 7.2 Continuous random variables 7.3 Sums, functions, conditional expectations 7.4 Covariance", " Chapter 7 Expected value This chapter deals with expected values of random variables. The students are expected to acquire the following knowledge: Theoretical Calculation of the expected value. Calculation of variance and covariance. Cauchy distribution. R Estimation of expected value. Estimation of variance and covariance. .fold-btn { float: right; margin: 5px 5px 0 0; } .fold { border: 1px solid black; min-height: 40px; } 7.1 Discrete random variables Exercise 7.1 (Bernoulli) Let \\(X \\sim \\text{Bernoulli}(p)\\). Find \\(E[X]\\). Find \\(Var[X]\\). R: Let \\(p = 0.4\\). Check your answers to a) and b) with a simulation. Solution. \\[\\begin{align*} E[X] = \\sum_{k=0}^1 p^k (1-p)^{1-k} k = p. \\end{align*}\\] \\[\\begin{align*} Var[X] = E[X^2] - E[X]^2 = \\sum_{k=0}^1 (p^k (1-p)^{1-k} k^2) - p^2 = p(1-p). \\end{align*}\\] set.seed(1) nsamps <- 1000 x <- rbinom(nsamps, 1, 0.4) mean(x) ## [1] 0.394 var(x) ## [1] 0.239003 0.4 * (1 - 0.4) ## [1] 0.24 Exercise 7.2 (Binomial) Let \\(X \\sim \\text{Binomial}(n,p)\\). Find \\(E[X]\\). Find \\(Var[X]\\). Solution. Let \\(X = \\sum_{i=0}^n X_i\\), where \\(X_i \\sim \\text{Bernoulli}(p)\\). Then, due to linearity of expectation \\[\\begin{align*} E[X] = E[\\sum_{i=0}^n X_i] = \\sum_{i=0}^n E[X_i] = np. \\end{align*}\\] Again let \\(X = \\sum_{i=0}^n X_i\\), where \\(X_i \\sim \\text{Bernoulli}(p)\\). Since the Bernoulli variables \\(X_i\\) are independent we have \\[\\begin{align*} Var[X] = Var[\\sum_{i=0}^n X_i] = \\sum_{i=0}^n Var[X_i] = np(1-p). \\end{align*}\\] Exercise 7.3 (Poisson) Let \\(X \\sim \\text{Poisson}(\\lambda)\\). Find \\(E[X]\\). Find \\(Var[X]\\). Solution. \\[\\begin{align*} E[X] &= \\sum_{k=0}^\\infty \\frac{\\lambda^k e^{-\\lambda}}{k!} k & \\\\ &= \\sum_{k=1}^\\infty \\frac{\\lambda^k e^{-\\lambda}}{k!} k & \\text{term at $k=0$ is 0} \\\\ &= e^{-\\lambda} \\lambda \\sum_{k=1}^\\infty \\frac{\\lambda^{k-1}}{(k - 1)!} & \\\\ &= e^{-\\lambda} \\lambda \\sum_{k=0}^\\infty \\frac{\\lambda^{k}}{k!} & \\\\ &= e^{-\\lambda} \\lambda e^\\lambda & \\\\ &= \\lambda. \\end{align*}\\] \\[\\begin{align*} Var[X] &= E[X^2] - E[X]^2 & \\\\ &= e^{-\\lambda} \\lambda \\sum_{k=1}^\\infty k \\frac{\\lambda^{k-1}}{(k - 1)!} - \\lambda^2 & \\\\ &= e^{-\\lambda} \\lambda \\sum_{k=1}^\\infty (k - 1) + 1) \\frac{\\lambda^{k-1}}{(k - 1)!} - \\lambda^2 & \\\\ &= e^{-\\lambda} \\lambda \\big(\\sum_{k=1}^\\infty (k - 1) \\frac{\\lambda^{k-1}}{(k - 1)!} + \\sum_{k=1}^\\infty \\frac{\\lambda^{k-1}}{(k - 1)!}\\Big) - \\lambda^2 & \\\\ &= e^{-\\lambda} \\lambda \\big(\\lambda\\sum_{k=2}^\\infty \\frac{\\lambda^{k-2}}{(k - 2)!} + e^\\lambda\\Big) - \\lambda^2 & \\\\ &= e^{-\\lambda} \\lambda \\big(\\lambda e^\\lambda + e^\\lambda\\Big) - \\lambda^2 & \\\\ &= \\lambda^2 + \\lambda - \\lambda^2 & \\\\ &= \\lambda. \\end{align*}\\] Exercise 7.4 (Geometric) Let \\(X \\sim \\text{Geometric}(p)\\). Find \\(E[X]\\). Hint: \\(\\frac{d}{dx} x^k = k x^{(k - 1)}\\). Solution. 
\\[\\begin{align*} E[X] &= \\sum_{k=0}^\\infty (1 - p)^k p k & \\\\ &= p (1 - p) \\sum_{k=0}^\\infty (1 - p)^{k-1} k & \\\\ &= p (1 - p) \\sum_{k=0}^\\infty -\\frac{d}{dp}(1 - p)^k & \\\\ &= p (1 - p) \\Big(-\\frac{d}{dp}\\Big) \\sum_{k=0}^\\infty (1 - p)^k & \\\\ &= p (1 - p) \\Big(-\\frac{d}{dp}\\Big) \\frac{1}{1 - (1 - p)} & \\text{geometric series} \\\\ &= \\frac{1 - p}{p} \\end{align*}\\] 7.2 Continuous random variables Exercise 7.5 (Gamma) Let \\(X \\sim \\text{Gamma}(\\alpha, \\beta)\\). Hint: \\(\\Gamma(z) = \\int_0^\\infty t^{z-1}e^{-t} dt\\) and \\(\\Gamma(z + 1) = z \\Gamma(z)\\). Find \\(E[X]\\). Find \\(Var[X]\\). R: Let \\(\\alpha = 10\\) and \\(\\beta = 2\\). Plot the density of \\(X\\). Add a horizontal line at the expected value that touches the density curve (geom_segment). Shade the area within a standard deviation of the expected value. Solution. \\[\\begin{align*} E[X] &= \\int_0^\\infty \\frac{\\beta^\\alpha}{\\Gamma(\\alpha)}x^\\alpha e^{-\\beta x} dx & \\\\ &= \\frac{\\beta^\\alpha}{\\Gamma(\\alpha)} \\int_0^\\infty x^\\alpha e^{-\\beta x} dx & \\text{ (let $t = \\beta x$)} \\\\ &= \\frac{\\beta^\\alpha}{\\Gamma(\\alpha) }\\int_0^\\infty \\frac{t^\\alpha}{\\beta^\\alpha} e^{-t} \\frac{dt}{\\beta} & \\\\ &= \\frac{1}{\\beta \\Gamma(\\alpha) }\\int_0^\\infty t^\\alpha e^{-t} dt & \\\\ &= \\frac{\\Gamma(\\alpha + 1)}{\\beta \\Gamma(\\alpha)} & \\\\ &= \\frac{\\alpha \\Gamma(\\alpha)}{\\beta \\Gamma(\\alpha)} & \\\\ &= \\frac{\\alpha}{\\beta}. & \\end{align*}\\] \\[\\begin{align*} Var[X] &= E[X^2] - E[X]^2 \\\\ &= \\int_0^\\infty \\frac{\\beta^\\alpha}{\\Gamma(\\alpha)}x^{\\alpha+1} e^{-\\beta x} dx - \\frac{\\alpha^2}{\\beta^2} \\\\ &= \\frac{\\Gamma(\\alpha + 2)}{\\beta^2 \\Gamma(\\alpha)} - \\frac{\\alpha^2}{\\beta^2} \\\\ &= \\frac{(\\alpha + 1)\\alpha\\Gamma(\\alpha)}{\\beta^2 \\Gamma(\\alpha)} - \\frac{\\alpha^2}{\\beta^2} \\\\ &= \\frac{\\alpha^2 + \\alpha}{\\beta^2} - \\frac{\\alpha^2}{\\beta^2} \\\\ &= \\frac{\\alpha}{\\beta^2}. \\end{align*}\\] set.seed(1) x <- seq(0, 25, by = 0.01) y <- dgamma(x, shape = 10, rate = 2) df <- data.frame(x = x, y = y) ggplot(df, aes(x = x, y = y)) + geom_line() + geom_segment(aes(x = 5, y = 0, xend = 5, yend = dgamma(5, shape = 10, rate = 2)), color = "red") + stat_function(fun = dgamma, args = list(shape = 10, rate = 2), xlim = c(5 - sqrt(10/4), 5 + sqrt(10/4)), geom = "area", fill = "gray", alpha = 0.4) Exercise 7.6 (Beta) Let \\(X \\sim \\text{Beta}(\\alpha, \\beta)\\). Find \\(E[X]\\). Hint 1: \\(\\text{B}(x,y) = \\int_0^1 t^{x-1} (1 - t)^{y-1} dt\\). Hint 2: \\(\\text{B}(x + 1, y) = \\text{B}(x,y)\\frac{x}{x + y}\\). Find \\(Var[X]\\). Solution. \\[\\begin{align*} E[X] &= \\int_0^1 \\frac{x^{\\alpha - 1} (1 - x)^{\\beta - 1}}{\\text{B}(\\alpha, \\beta)} x dx \\\\ &= \\frac{1}{\\text{B}(\\alpha, \\beta)}\\int_0^1 x^{\\alpha} (1 - x)^{\\beta - 1} dx \\\\ &= \\frac{1}{\\text{B}(\\alpha, \\beta)} \\text{B}(\\alpha + 1, \\beta) \\\\ &= \\frac{1}{\\text{B}(\\alpha, \\beta)} \\text{B}(\\alpha, \\beta) \\frac{\\alpha}{\\alpha + \\beta} \\\\ &= \\frac{\\alpha}{\\alpha + \\beta}. 
\\\\ \\end{align*}\\] \\[\\begin{align*} Var[X] &= E[X^2] - E[X]^2 \\\\ &= \\int_0^1 \\frac{x^{\\alpha - 1} (1 - x)^{\\beta - 1}}{\\text{B}(\\alpha, \\beta)} x^2 dx - \\frac{\\alpha^2}{(\\alpha + \\beta)^2} \\\\ &= \\frac{1}{\\text{B}(\\alpha, \\beta)}\\int_0^1 x^{\\alpha + 1} (1 - x)^{\\beta - 1} dx - \\frac{\\alpha^2}{(\\alpha + \\beta)^2} \\\\ &= \\frac{1}{\\text{B}(\\alpha, \\beta)} \\text{B}(\\alpha + 2, \\beta) - \\frac{\\alpha^2}{(\\alpha + \\beta)^2} \\\\ &= \\frac{1}{\\text{B}(\\alpha, \\beta)} \\text{B}(\\alpha + 1, \\beta) \\frac{\\alpha + 1}{\\alpha + \\beta + 1} - \\frac{\\alpha^2}{(\\alpha + \\beta)^2} \\\\ &= \\frac{\\alpha + 1}{\\alpha + \\beta + 1} \\frac{\\alpha}{\\alpha + \\beta} - \\frac{\\alpha^2}{(\\alpha + \\beta)^2}\\\\ &= \\frac{\\alpha \\beta}{(\\alpha + \\beta)^2(\\alpha + \\beta + 1)}. \\end{align*}\\] Exercise 7.7 (Exponential) Let \\(X \\sim \\text{Exp}(\\lambda)\\). Find \\(E[X]\\). Hint: \\(\\Gamma(z + 1) = z\\Gamma(z)\\) and \\(\\Gamma(1) = 1\\). Find \\(Var[X]\\). Solution. \\[\\begin{align*} E[X] &= \\int_0^\\infty \\lambda e^{-\\lambda x} x dx & \\\\ &= \\lambda \\int_0^\\infty x e^{-\\lambda x} dx & \\\\ &= \\lambda \\int_0^\\infty \\frac{t}{\\lambda} e^{-t} \\frac{dt}{\\lambda} & \\text{$t = \\lambda x$}\\\\ &= \\lambda \\lambda^{-2} \\Gamma(2) & \\text{definition of gamma function} \\\\ &= \\lambda^{-1}. \\end{align*}\\] \\[\\begin{align*} Var[X] &= E[X^2] - E[X]^2 & \\\\ &= \\int_0^\\infty \\lambda e^{-\\lambda x} x^2 dx - \\lambda^{-2} & \\\\ &= \\lambda \\int_0^\\infty \\frac{t^2}{\\lambda^2} e^{-t} \\frac{dt}{\\lambda} - \\lambda^{-2} & \\text{$t = \\lambda x$} \\\\ &= \\lambda \\lambda^{-3} \\Gamma(3) - \\lambda^{-2} & \\text{definition of gamma function} & \\\\ &= \\lambda^{-2} 2 \\Gamma(2) - \\lambda^{-2} & \\\\ &= 2 \\lambda^{-2} - \\lambda^{-2} & \\\\ &= \\lambda^{-2}. & \\\\ \\end{align*}\\] Exercise 7.8 (Normal) Let \\(X \\sim \\text{N}(\\mu, \\sigma)\\). Show that \\(E[X] = \\mu\\). Hint: Use the error function \\(\\text{erf}(x) = \\frac{1}{\\sqrt(\\pi)} \\int_{-x}^x e^{-t^2} dt\\). The statistical interpretation of this function is that if \\(Y \\sim \\text{N}(0, 0.5)\\), then the error function describes the probability of \\(Y\\) falling between \\(-x\\) and \\(x\\). Also, \\(\\text{erf}(\\infty) = 1\\). Show that \\(Var[X] = \\sigma^2\\). Hint: Start with the definition of variance. Solution. \\[\\begin{align*} E[X] &= \\int_{-\\infty}^\\infty \\frac{1}{\\sqrt{2\\pi \\sigma^2}} e^{-\\frac{(x - \\mu)^2}{2\\sigma^2}} x dx & \\\\ &= \\frac{1}{\\sqrt{2\\pi \\sigma^2}} \\int_{-\\infty}^\\infty x e^{-\\frac{(x - \\mu)^2}{2\\sigma^2}} dx & \\\\ &= \\frac{1}{\\sqrt{2\\pi \\sigma^2}} \\int_{-\\infty}^\\infty \\Big(t \\sqrt{2\\sigma^2} + \\mu\\Big)e^{-t^2} \\sqrt{2 \\sigma^2} dt & t = \\frac{x - \\mu}{\\sqrt{2}\\sigma} \\\\ &= \\frac{\\sqrt{2\\sigma^2}}{\\sqrt{\\pi}} \\int_{-\\infty}^\\infty t e^{-t^2} dt + \\frac{1}{\\sqrt{\\pi}} \\int_{-\\infty}^\\infty \\mu e^{-t^2} dt & \\\\ \\end{align*}\\] Let us calculate these integrals separately. \\[\\begin{align*} \\int t e^{-t^2} dt &= -\\frac{1}{2}\\int e^{s} ds & s = -t^2 \\\\ &= -\\frac{e^s}{2} + C \\\\ &= -\\frac{e^{-t^2}}{2} + C & \\text{undoing substitution}. \\end{align*}\\] Inserting the integration limits we get \\[\\begin{align*} \\int_{-\\infty}^\\infty t e^{-t^2} dt &= 0, \\end{align*}\\] due to the integrated function being symmetric. 
Reordering the second integral we get \\[\\begin{align*} \\mu \\frac{1}{\\sqrt{\\pi}} \\int_{-\\infty}^\\infty e^{-t^2} dt &= \\mu \\text{erf}(\\infty) & \\text{definition of error function} \\\\ &= \\mu & \\text{probability of $Y$ falling between $-\\infty$ and $\\infty$}. \\end{align*}\\] Combining all of the above we get \\[\\begin{align*} E[X] &= \\frac{\\sqrt{2\\sigma^2}}{\\sqrt{\\pi}} \\times 0 + \\mu \\\\ &= \\mu. \\end{align*}\\] \\[\\begin{align*} Var[X] &= E[(X - E[X])^2] \\\\ &= E[(X - \\mu)^2] \\\\ &= \\frac{1}{\\sqrt{2\\pi \\sigma^2}} \\int_{-\\infty}^\\infty (x - \\mu)^2 e^{-\\frac{(x - \\mu)^2}{2\\sigma^2}} dx \\\\ &= \\frac{\\sigma^2}{\\sqrt{2\\pi}} \\int_{-\\infty}^\\infty t^2 e^{-\\frac{t^2}{2}} dt & t = \\frac{x - \\mu}{\\sigma} \\\\ &= \\frac{\\sigma^2}{\\sqrt{2\\pi}} \\bigg(\\Big(- t e^{-\\frac{t^2}{2}} \\Big|_{-\\infty}^\\infty \\Big) + \\int_{-\\infty}^\\infty e^{-\\frac{t^2}{2}} dt \\bigg) & \\text{integration by parts} \\\\ &= \\frac{\\sigma^2}{\\sqrt{2\\pi}} \\sqrt{2 \\pi} \\frac{1}{\\sqrt{\\pi}} \\int_{-\\infty}^\\infty e^{-s^2} ds & s = \\frac{t}{\\sqrt{2}} \\text{, the first term evaluates to } 0 \\\\ &= \\frac{\\sigma^2}{\\sqrt{2\\pi}} \\sqrt{2 \\pi} \\, \\text{erf}(\\infty) & \\text{definition of error function} \\\\ &= \\sigma^2. \\end{align*}\\] 7.3 Sums, functions, conditional expectations Exercise 7.9 (Expectation of transformations) Let \\(X\\) follow a normal distribution with mean \\(\\mu\\) and variance \\(\\sigma^2\\). Find \\(E[2X + 4]\\). Find \\(E[X^2]\\). Find \\(E[\\exp(X)]\\). Hint: Use the error function \\(\\text{erf}(x) = \\frac{1}{\\sqrt{\\pi}} \\int_{-x}^x e^{-t^2} dt\\). Also, \\(\\text{erf}(\\infty) = 1\\). R: Check your results numerically for \\(\\mu = 0.4\\) and \\(\\sigma^2 = 0.25\\) and plot the densities of all four distributions. Solution. \\[\\begin{align} E[2X + 4] &= 2E[X] + 4 & \\text{linearity of expectation} \\\\ &= 2\\mu + 4. \\\\ \\end{align}\\] \\[\\begin{align} E[X^2] &= E[X]^2 + Var[X] & \\text{definition of variance} \\\\ &= \\mu^2 + \\sigma^2. \\end{align}\\] \\[\\begin{align} E[\\exp(X)] &= \\int_{-\\infty}^\\infty \\frac{1}{\\sqrt{2\\pi \\sigma^2}} e^{-\\frac{(x - \\mu)^2}{2\\sigma^2}} e^x dx & \\\\ &= \\frac{1}{\\sqrt{2\\pi \\sigma^2}} \\int_{-\\infty}^\\infty e^{\\frac{2 \\sigma^2 x}{2\\sigma^2} -\\frac{(x - \\mu)^2}{2\\sigma^2}} dx & \\\\ &= \\frac{1}{\\sqrt{2\\pi \\sigma^2}} \\int_{-\\infty}^\\infty e^{-\\frac{x^2 - 2x(\\mu + \\sigma^2) + \\mu^2}{2\\sigma^2}} dx & \\\\ &= \\frac{1}{\\sqrt{2\\pi \\sigma^2}} \\int_{-\\infty}^\\infty e^{-\\frac{(x - (\\mu + \\sigma^2))^2 + \\mu^2 - (\\mu + \\sigma^2)^2}{2\\sigma^2}} dx & \\text{complete the square} \\\\ &= \\frac{1}{\\sqrt{2\\pi \\sigma^2}} e^{\\frac{- \\mu^2 + (\\mu + \\sigma^2)^2}{2\\sigma^2}} \\int_{-\\infty}^\\infty e^{-\\frac{(x - (\\mu + \\sigma^2))^2}{2\\sigma^2}} dx & \\\\ &= \\frac{1}{\\sqrt{2\\pi \\sigma^2}} e^{\\frac{- \\mu^2 + (\\mu + \\sigma^2)^2}{2\\sigma^2}} \\sigma \\sqrt{2 \\pi} \\text{erf}(\\infty) & \\\\ &= e^{\\frac{2\\mu + \\sigma^2}{2}}. \\end{align}\\] set.seed(1) mu <- 0.4 sigma <- 0.5 x <- rnorm(100000, mean = mu, sd = sigma) mean(2*x + 4) ## [1] 4.797756 2 * mu + 4 ## [1] 4.8 mean(x^2) ## [1] 0.4108658 mu^2 + sigma^2 ## [1] 0.41 mean(exp(x)) ## [1] 1.689794 exp((2 * mu + sigma^2) / 2) ## [1] 1.690459 Exercise 7.10 (Sum of independent random variables) Borrowed from Wasserman. Let \\(X_1, X_2,...,X_n\\) be IID random variables with expected value \\(E[X_i] = \\mu\\) and variance \\(Var[X_i] = \\sigma^2\\). 
Find the expected value and variance of \\(\\bar{X} = \\frac{1}{n} \\sum_{i=1}^n X_i\\). \\(\\bar{X}\\) is called a statistic (a function of the values in a sample). It is itself a random variable and its distribution is called a sampling distribution. R: Take \\(n = 5, 10, 100, 1000\\) samples from the N(\\(2\\), \\(6\\)) distribution 10000 times. Plot the theoretical density and the densities of \\(\\bar{X}\\) statistic for each \\(n\\). Intuitively, are the results in correspondence with your calculations? Check them numerically. Solution. Let us start with the expectation of \\(\\bar{X}\\). \\[\\begin{align} E[\\bar{X}] &= E[\\frac{1}{n} \\sum_{i=1}^n X_i] & \\\\ &= \\frac{1}{n} E[\\sum_{i=1}^n X_i] & \\text{ (multiplication with a scalar)} \\\\ &= \\frac{1}{n} \\sum_{i=1}^n E[X_i] & \\text{ (linearity)} \\\\ &= \\frac{1}{n} n \\mu & \\\\ &= \\mu. \\end{align}\\] Now the variance \\[\\begin{align} Var[\\bar{X}] &= Var[\\frac{1}{n} \\sum_{i=1}^n X_i] & \\\\ &= \\frac{1}{n^2} Var[\\sum_{i=1}^n X_i] & \\text{ (multiplication with a scalar)} \\\\ &= \\frac{1}{n^2} \\sum_{i=1}^n Var[X_i] & \\text{ (independence of samples)} \\\\ &= \\frac{1}{n^2} n \\sigma^2 & \\\\ &= \\frac{1}{n} \\sigma^2. \\end{align}\\] set.seed(1) nsamps <- 10000 mu <- 2 sigma <- sqrt(6) N <- c(5, 10, 100, 500) X <- matrix(data = NA, nrow = nsamps, ncol = length(N)) ind <- 1 for (n in N) { for (i in 1:nsamps) { X[i,ind] <- mean(rnorm(n, mu, sigma)) } ind <- ind + 1 } colnames(X) <- N X <- melt(as.data.frame(X)) ggplot(data = X, aes(x = value, colour = variable)) + geom_density() + stat_function(data = data.frame(x = seq(-2, 6, by = 0.01)), aes(x = x), fun = dnorm, args = list(mean = mu, sd = sigma), color = "black") Exercise 7.11 (Conditional expectation) Let \\(X \\in \\mathbb{R}_0^+\\) and \\(Y \\in \\mathbb{N}_0\\) be random variables with joint distribution \\(p_{XY}(X,Y) = \\frac{1}{y + 1} e^{-\\frac{x}{y + 1}} 0.5^{y + 1}\\). Find \\(E[X | Y = y]\\) by first finding \\(p_Y\\) and then \\(p_{X|Y}\\). Find \\(E[X]\\). R: check your answers to a) and b) by drawing 10000 samples from \\(p_Y\\) and \\(p_{X|Y}\\). Solution. \\[\\begin{align} p(y) &= \\int_0^\\infty \\frac{1}{y + 1} e^{-\\frac{x}{y + 1}} 0.5^{y + 1} dx \\\\ &= \\frac{0.5^{y + 1}}{y + 1} \\int_0^\\infty e^{-\\frac{x}{y + 1}} dx \\\\ &= \\frac{0.5^{y + 1}}{y + 1} (y + 1) \\\\ &= 0.5^{y + 1} \\\\ &= 0.5(1 - 0.5)^y. \\end{align}\\] We recognize this as the geometric distribution. \\[\\begin{align} p(x|y) &= \\frac{p(x,y)}{p(y)} \\\\ &= \\frac{1}{y + 1} e^{-\\frac{x}{y + 1}}. \\end{align}\\] We recognize this as the exponential distribution. \\[\\begin{align} E[X | Y = y] &= \\int_0^\\infty x \\frac{1}{y + 1} e^{-\\frac{x}{y + 1}} dx \\\\ &= y + 1 & \\text{expected value of the exponential distribution} \\end{align}\\] Use the law of iterated expectation. \\[\\begin{align} E[X] &= E[E[X | Y]] \\\\ &= E[Y + 1] \\\\ &= E[Y] + 1 \\\\ &= \\frac{1 - 0.5}{0.5} + 1 \\\\ &= 2. \\end{align}\\] set.seed(1) y <- rgeom(100000, 0.5) x <- rexp(100000, rate = 1 / (y + 1)) x2 <- x[y == 3] mean(x2) ## [1] 4.048501 3 + 1 ## [1] 4 mean(x) ## [1] 2.007639 (1 - 0.5) / 0.5 + 1 ## [1] 2 Exercise 7.12 (Cauchy distribution) Let \\(p(x | x_0, \\gamma) = \\frac{1}{\\pi \\gamma \\Big(1 + \\big(\\frac{x - x_0}{\\gamma}\\big)^2\\Big)}\\). A random variable with this PDF follows a Cauchy distribution. This distribution is symmetric and has wider tails than the normal distribution. R: Draw \\(n = 1,...,1000\\) samples from a standard normal and \\(\\text{Cauchy}(0, 1)\\). 
For each \\(n\\) plot the mean and the median of the sample using facets. Interpret the results. To get a mathematical explanation of the results in a), evaluate the integral \\(\\int_0^\\infty \\frac{x}{1 + x^2} dx\\) and consider that \\(E[X] = \\int_{-\\infty}^\\infty \\frac{x}{1 + x^2}dx\\). set.seed(1) n <- 1000 means_n <- vector(mode = "numeric", length = n) means_c <- vector(mode = "numeric", length = n) medians_n <- vector(mode = "numeric", length = n) medians_c <- vector(mode = "numeric", length = n) for (i in 1:n) { tmp_n <- rnorm(i) tmp_c <- rcauchy(i) means_n[i] <- mean(tmp_n) means_c[i] <- mean(tmp_c) medians_n[i] <- median(tmp_n) medians_c[i] <- median(tmp_c) } df <- data.frame("distribution" = c(rep("normal", 2 * n), rep("Cauchy", 2 * n)), "type" = c(rep("mean", n), rep("median", n), rep("mean", n), rep("median", n)), "value" = c(means_n, medians_n, means_c, medians_c), "n" = rep(1:n, times = 4)) ggplot(df, aes(x = n, y = value)) + geom_line(alpha = 0.5) + facet_wrap(~ type + distribution , scales = "free") Solution. \\[\\begin{align} \\int_0^\\infty \\frac{x}{1 + x^2} dx &= \\frac{1}{2} \\int_1^\\infty \\frac{1}{u} du & u = 1 + x^2 \\\\ &= \\frac{1}{2} \\ln(x) |_0^\\infty. \\end{align}\\] This integral is not finite. The same holds for the negative part. Therefore, the expectation is undefined, as \\(E[|X|] = \\infty\\). Why can we not just claim that \\(f(x) = x / (1 + x^2)\\) is odd and \\(\\int_{-\\infty}^\\infty f(x) = 0\\)? By definition of the Lebesgue integral \\(\\int_{-\\infty}^{\\infty} f= \\int_{-\\infty}^{\\infty} f_+-\\int_{-\\infty}^{\\infty} f_-\\). At least one of the two integrals needs to be finite for \\(\\int_{-\\infty}^{\\infty} f\\) to be well-defined. However \\(\\int_{-\\infty}^{\\infty} f_+=\\int_0^{\\infty} x/(1+x^2)\\) and \\(\\int_{-\\infty}^{\\infty} f_-=\\int_{-\\infty}^{0} |x|/(1+x^2)\\). We have just shown that both of these integrals are infinite, which implies that their sum is also infinite. 7.4 Covariance Exercise 7.13 Below is a table of values for random variables \\(X\\) and \\(Y\\). X Y 2.1 8 -0.5 11 1 10 -2 12 4 9 Find sample covariance of \\(X\\) and \\(Y\\). Find sample variances of \\(X\\) and \\(Y\\). Find sample correlation of \\(X\\) and \\(Y\\). Find sample variance of \\(Z = 2X - 3Y\\). Solution. \\(\\bar{X} = 0.92\\) and \\(\\bar{Y} = 10\\). \\[\\begin{align} s(X, Y) &= \\frac{1}{n - 1} \\sum_{i=1}^5 (X_i - 0.92) (Y_i - 10) \\\\ &= -3.175. \\end{align}\\] \\[\\begin{align} s(X) &= \\frac{\\sum_{i=1}^5(X_i - 0.92)^2}{5 - 1} \\\\ &= 5.357. \\end{align}\\] \\[\\begin{align} s(Y) &= \\frac{\\sum_{i=1}^5(Y_i - 10)^2}{5 - 1} \\\\ &= 2.5. \\end{align}\\] \\[\\begin{align} r(X,Y) &= \\frac{Cov(X,Y)}{\\sqrt{Var[X]Var[Y]}} \\\\ &= \\frac{-3.175}{\\sqrt{5.357 \\times 2.5}} \\\\ &= -8.68. \\end{align}\\] \\[\\begin{align} s(Z) &= 2^2 s(X) + 3^2 s(Y) + 2 \\times 2 \\times 3 s(X, Y) \\\\ &= 4 \\times 5.357 + 9 \\times 2.5 + 12 \\times 3.175 \\\\ &= 82.028. \\end{align}\\] Exercise 7.14 Let \\(X \\sim \\text{Uniform}(0,1)\\) and \\(Y | X = x \\sim \\text{Uniform(0,x)}\\). Find the covariance of \\(X\\) and \\(Y\\). Find the correlation of \\(X\\) and \\(Y\\). R: check your answers to a) and b) with simulation. Plot \\(X\\) against \\(Y\\) on a scatterplot. Solution. The joint PDF is \\(p(x,y) = p(x)p(y|x) = \\frac{1}{x}\\). 
\\[\\begin{align} Cov(X,Y) &= E[XY] - E[X]E[Y] \\\\ \\end{align}\\] Let us first evaluate the first term: \\[\\begin{align} E[XY] &= \\int_0^1 \\int_0^x x y \\frac{1}{x} dy dx \\\\ &= \\int_0^1 \\int_0^x y dy dx \\\\ &= \\int_0^1 \\frac{x^2}{2} dx \\\\ &= \\frac{1}{6}. \\end{align}\\] Now let us find \\(E[Y]\\), \\(E[X]\\) is trivial. \\[\\begin{align} E[Y] = E[E[Y | X]] = E[\\frac{X}{2}] = \\frac{1}{2} \\int_0^1 x dx = \\frac{1}{4}. \\end{align}\\] Combining all: \\[\\begin{align} Cov(X,Y) &= \\frac{1}{6} - \\frac{1}{2} \\frac{1}{4} = \\frac{1}{24}. \\end{align}\\] \\[\\begin{align} \\rho(X,Y) &= \\frac{Cov(X,Y)}{\\sqrt{Var[X]Var[Y]}} \\\\ \\end{align}\\] Let us calculate \\(Var[X]\\). \\[\\begin{align} Var[X] &= E[X^2] - \\frac{1}{4} \\\\ &= \\int_0^1 x^2 - \\frac{1}{4} \\\\ &= \\frac{1}{3} - \\frac{1}{4} \\\\ &= \\frac{1}{12}. \\end{align}\\] Let us calculate \\(E[E[Y^2|X]]\\). \\[\\begin{align} E[E[Y^2|X]] &= E[\\frac{x^2}{3}] \\\\ &= \\frac{1}{9}. \\end{align}\\] Then \\(Var[Y] = \\frac{1}{9} - \\frac{1}{16} = \\frac{7}{144}\\). Combining all \\[\\begin{align} \\rho(X,Y) &= \\frac{\\frac{1}{24}}{\\sqrt{\\frac{1}{12}\\frac{7}{144}}} \\\\ &= 0.65. \\end{align}\\] set.seed(1) nsamps <- 10000 x <- runif(nsamps) y <- runif(nsamps, 0, x) cov(x, y) ## [1] 0.04274061 1/24 ## [1] 0.04166667 cor(x, y) ## [1] 0.6629567 (1 / 24) / (sqrt(7 / (12 * 144))) ## [1] 0.6546537 ggplot(data.frame(x = x, y = y), aes(x = x, y = y)) + geom_point(alpha = 0.2) + geom_smooth(method = "lm") "],["mrv.html", "Chapter 8 Multivariate random variables 8.1 Multinomial random variables 8.2 Multivariate normal random variables 8.3 Transformations", " Chapter 8 Multivariate random variables This chapter deals with multivariate random variables. The students are expected to acquire the following knowledge: Theoretical Multinomial distribution. Multivariate normal distribution. Cholesky decomposition. Eigendecomposition. R Sampling from the multinomial distribution. Sampling from the multivariate normal distribution. Matrix decompositions. .fold-btn { float: right; margin: 5px 5px 0 0; } .fold { border: 1px solid black; min-height: 40px; } 8.1 Multinomial random variables Exercise 8.1 Let \\(X_i\\), \\(i = 1,...,k\\) represent \\(k\\) events, and \\(p_i\\) the probabilities of these events happening in a trial. Let \\(n\\) be the number of trials, and \\(X\\) a multivariate random variable, the collection of \\(X_i\\). Then \\(p(x) = \\frac{n!}{x_1!x_2!...x_k!} p_1^{x_1} p_2^{x_2}...p_k^{x_k}\\) is the PMF of a multinomial distribution, where \\(n = \\sum_{i = 1}^k x_i\\). Show that the marginal distribution of \\(X_i\\) is a binomial distribution. Take 1000 samples from the multinomial distribution with \\(n=4\\) and probabilities \\(p = (0.2, 0.2, 0.5, 0.1)\\). Then take 1000 samples from four binomial distributions with the same parameters. Inspect the results visually. Solution. We will approach this proof from the probabilistic point of view. W.L.O.G. let \\(x_1\\) be the marginal distribution we are interested in. The term \\(p^{x_1}\\) denotes the probability that event 1 happened \\(x_1\\) times. For this event not to happen, one of the other events needs to happen. So for each of the remaining trials, the probability of another event is \\(\\sum_{i=2}^k p_i = 1 - p_1\\), and there were \\(n - x_1\\) such trials. What is left to do is to calculate the number of permutations of event 1 happening and event 1 not happening. We choose \\(x_1\\) trials, from \\(n\\) trials. 
Therefore \\(p(x_1) = \\binom{n}{x_1} p_1^{x_1} (1 - p_1)^{n - x_1}\\), which is the binomial PMF. Interested students are encouraged to prove this mathematically. set.seed(1) nsamps <- 1000 samps_mult <- rmultinom(nsamps, 4, prob = c(0.2, 0.2, 0.5, 0.1)) samps_mult <- as_tibble(t(samps_mult)) %>% gather() samps <- tibble( V1 = rbinom(nsamps, 4, 0.2), V2 = rbinom(nsamps, 4, 0.2), V3 = rbinom(nsamps, 4, 0.5), V4 = rbinom(nsamps, 4, 0.1) ) %>% gather() %>% bind_rows(samps_mult) %>% bind_cols("dist" = c(rep("binomial", 4*nsamps), rep("multinomial", 4*nsamps))) ggplot(samps, aes(x = value, fill = dist)) + geom_bar(position = "dodge") + facet_wrap(~ key) Exercise 8.2 (Multinomial expected value) Find the expected value, variance and covariance of the multinomial distribution. Hint: First find the expected value for \\(n = 1\\) and then use the fact that the trials are independent. Solution. Let us first calculate the expected value of \\(X_1\\), when \\(n = 1\\). \\[\\begin{align} E[X_1] &= \\sum_{n_1 = 0}^1 \\sum_{n_2 = 0}^1 ... \\sum_{n_k = 0}^1 \\frac{1}{n_1!n_2!...n_k!}p_1^{n_1}p_2^{n_2}...p_k^{n_k}n_1 \\\\ &= \\sum_{n_1 = 0}^1 \\frac{p_1^{n_1} n_1}{n_1!} \\sum_{n_2 = 0}^1 ... \\sum_{n_k = 0}^1 \\frac{1}{n_2!...n_k!}p_2^{n_2}...p_k^{n_k} \\end{align}\\] When \\(n_1 = 0\\) then the whole terms is zero, so we do not need to evaluate other sums. When \\(n_1 = 1\\), all other \\(n_i\\) must be zero, as we have \\(1 = \\sum_{i=1}^k n_i\\). Therefore the other sums equal \\(1\\). So \\(E[X_1] = p_1\\) and \\(E[X_i] = p_i\\) for \\(i = 1,...,k\\). Now let \\(Y_j\\), \\(j = 1,...,n\\), have a multinomial distribution with \\(n = 1\\), and let \\(X\\) have a multinomial distribution with an arbitrary \\(n\\). Then we can write \\(X = \\sum_{j=1}^n Y_j\\). And due to independence \\[\\begin{align} E[X] &= E[\\sum_{j=1}^n Y_j] \\\\ &= \\sum_{j=1}^n E[Y_j] \\\\ &= np. \\end{align}\\] For the variance, we need \\(E[X^2]\\). Let us follow the same procedure as above and first calculate \\(E[X_i]\\) for \\(n = 1\\). The only thing that changes is that the term \\(n_i\\) becomes \\(n_i^2\\). Since we only have \\(0\\) and \\(1\\) this does not change the outcome. So \\[\\begin{align} Var[X_i] &= E[X_i^2] - E[X_i]^2\\\\ &= p_i(1 - p_i). \\end{align}\\] Analogous to above for arbitrary \\(n\\) \\[\\begin{align} Var[X] &= E[X^2] - E[X]^2 \\\\ &= \\sum_{j=1}^n E[Y_j^2] - \\sum_{j=1}^n E[Y_j]^2 \\\\ &= \\sum_{j=1}^n E[Y_j^2] - E[Y_j]^2 \\\\ &= \\sum_{j=1}^n p(1-p) \\\\ &= np(1-p). \\end{align}\\] To calculate the covariance, we need \\(E[X_i X_j]\\). Again, let us start with \\(n = 1\\). Without loss of generality, let us assume \\(i = 1\\) and \\(j = 2\\). \\[\\begin{align} E[X_1 X_2] = \\sum_{n_1 = 0}^1 \\sum_{n_2 = 0}^1 \\frac{p_1^{n_1} n_1}{n_1!} \\frac{p_2^{n_2} n_2}{n_2!} \\sum_{n_3 = 0}^1 ... \\sum_{n_k = 0}^1 \\frac{1}{n_3!...n_k!}p_3^{n_3}...p_k^{n_k}. \\end{align}\\] In the above expression, at each iteration we multiply with \\(n_1\\) and \\(n_2\\). Since \\(n = 1\\), one of these always has to be zero. Therefore \\(E[X_1 X_2] = 0\\) and \\[\\begin{align} Cov(X_i, X_j) &= E[X_i X_j] - E[X_i]E[X_j] \\\\ &= - p_i p_j. \\end{align}\\] For arbitrary \\(n\\), let \\(X = \\sum_{t = 1}^n Y_t\\) be the sum of independent multinomial random variables \\(Y_t = [X_{1t}, X_{2t},...,X_{kt}]^T\\) with \\(n=1\\). Then \\(X_1 = \\sum_{t = 1}^n X_{1t}\\) and \\(X_2 = \\sum_{l = 1}^n X_{2l}\\). 
\\[\\begin{align} Cov(X_1, X_2) &= E[X_1 X_2] - E[X_1] E[X_2] \\\\ &= E[\\sum_{t = 1}^n X_{1t} \\sum_{l = 1}^n X_{2l}] - n^2 p_1 p_2 \\\\ &= \\sum_{t = 1}^n \\sum_{l = 1}^n E[X_{1t} X_{2l}] - n^2 p_1 p_2. \\end{align}\\] For \\(X_{1t}\\) and \\(X_{2l}\\) the expected value is zero when \\(t = l\\). When \\(t \\neq l\\) then they are independent, so the expected value is the product \\(p_1 p_2\\). There are \\(n^2\\) total terms, and for \\(n\\) of them \\(t = l\\) holds. So \\(E[X_1 X_2] = (n^2 - n) p_1 p_2\\). Inserting into the above \\[\\begin{align} Cov(X_1, X_2) &= (n^2 - n) p_1 p_2 - n^2 p_1 p_2 \\\\ &= - n p_1 p_2. \\end{align}\\] 8.2 Multivariate normal random variables Exercise 8.3 (Cholesky decomposition) Let \\(X\\) be a random vector of length \\(k\\) with \\(X_i \\sim \\text{N}(0, 1)\\) and \\(LL^*\\) the Cholesky decomposition of a Hermitian positive-definite matrix \\(A\\). Let \\(\\mu\\) be a vector of length \\(k\\). Find the distribution of the random vector \\(Y = \\mu + L X\\). Find the Cholesky decomposition of \\(A = \\begin{bmatrix} 2 & 1.2 \\\\ 1.2 & 1 \\end{bmatrix}\\). R: Use the results from a) and b) to sample from the MVN distribution \\(\\text{N}(\\mu, A)\\), where \\(\\mu = [1.5, -1]^T\\). Plot a scatterplot and compare it to direct samples from the multivariate normal distribution (rmvnorm). Solution. \\(X\\) has an independent normal distribution of dimension \\(k\\). Then \\[\\begin{align} Y = \\mu + L X &\\sim \\text{N}(\\mu, LL^T) \\\\ &\\sim \\text{N}(\\mu, A). \\end{align}\\] Solve \\[\\begin{align} \\begin{bmatrix} a & 0 \\\\ b & c \\end{bmatrix} \\begin{bmatrix} a & b \\\\ 0 & c \\end{bmatrix} = \\begin{bmatrix} 2 & 1.2 \\\\ 1.2 & 1 \\end{bmatrix} \\end{align}\\] # a set.seed(1) nsamps <- 1000 X <- matrix(data = rnorm(nsamps * 2), ncol = 2) mu <- c(1.5, -1) L <- matrix(data = c(sqrt(2), 0, 1.2 / sqrt(2), sqrt(1 - 1.2^2/2)), ncol = 2, byrow = TRUE) Y <- t(mu + L %*% t(X)) plot_df <- data.frame(rbind(X, Y), c(rep("X", nsamps), rep("Y", nsamps))) colnames(plot_df) <- c("D1", "D2", "var") ggplot(data = plot_df, aes(x = D1, y = D2, colour = as.factor(var))) + geom_point() Exercise 8.4 (Eigendecomposition) R: Let \\(\\Sigma = U \\Lambda U^T\\) be the eigendecomposition of covariance matrix \\(\\Sigma\\). Follow the procedure below, to sample from a multivariate normal with \\(\\mu = [-2, 1]^T\\) and \\(\\Sigma = \\begin{bmatrix} 0.3, -0.5 \\\\ -0.5, 1.6 \\end{bmatrix}\\): Sample from two independent standardized normal distributions to get \\(X\\). Find the eigendecomposition of \\(X\\) (eigen). Multiply \\(X\\) by \\(\\Lambda^{\\frac{1}{2}}\\) to get \\(X2\\). Consider how the eigendecomposition for \\(X2\\) changes compared to \\(X\\). Multiply \\(X2\\) by \\(U\\) to get \\(X3\\). Consider how the eigendecomposition for \\(X3\\) changes compared to \\(X2\\). Add \\(\\mu\\) to \\(X3\\). Consider how the eigendecomposition for \\(X4\\) changes compared to \\(X3\\). Plot the data and the eigenvectors (scaled with \\(\\Lambda^{\\frac{1}{2}}\\)) at each step. Hint: Use geom_segment for the eigenvectors. 
# a set.seed(1) sigma <- matrix(data = c(0.3, -0.5, -0.5, 1.6), nrow = 2, byrow = TRUE) ed <- eigen(sigma) e_val <- ed$values e_vec <- ed$vectors # b set.seed(1) nsamps <- 1000 X <- matrix(data = rnorm(nsamps * 2), ncol = 2) vec1 <- matrix(c(1,0,0,1), nrow = 2) X2 <- t(sqrt(diag(e_val)) %*% t(X)) vec2 <- sqrt(diag(e_val)) %*% vec1 X3 <- t(e_vec %*% t(X2)) vec3 <- e_vec %*% vec2 X4 <- t(c(-2, 1) + t(X3)) vec4 <- c(-2, 1) + vec3 vec_mat <- data.frame(matrix(c(0,0,0,0,0,0,0,0,0,0,0,0,-2,1,-2,1), ncol = 2, byrow = TRUE), t(cbind(vec1, vec2, vec3, vec4)), c(1,1,2,2,3,3,4,4)) df <- data.frame(rbind(X, X2, X3, X4), c(rep(1, nsamps), rep(2, nsamps), rep(3, nsamps), rep(4, nsamps))) colnames(df) <- c("D1", "D2", "wh") colnames(vec_mat) <- c("D1", "D2", "E1", "E2", "wh") ggplot(data = df, aes(x = D1, y = D2)) + geom_point() + geom_segment(data = vec_mat, aes(xend = E1, yend = E2), color = "red") + facet_wrap(~ wh) + coord_fixed() Exercise 8.5 (Marginal and conditional distributions) Let \\(X \\sim \\text{N}(\\mu, \\Sigma)\\), where \\(\\mu = [2, 0, -1]^T\\) and \\(\\Sigma = \\begin{bmatrix} 1 & -0.2 & 0.5 \\\\ -0.2 & 1.4 & -1.2 \\\\ 0.5 & -1.2 & 2 \\\\ \\end{bmatrix}\\). Let \\(A\\) represent the first two random variables and \\(B\\) the third random variable. R: For the calculation in the following points, you can use R. Find the marginal distribution of \\(B\\). Find the conditional distribution of \\(B | A = [a_1, a_2]^T\\). Find the marginal distribution of \\(A\\). Find the conditional distribution of \\(A | B = b\\). R: Visually compare the distributions of a) and b), and c) and d) at three different conditional values. mu <- c(2, 0, -1) Sigma <- matrix(c(1, -0.2, 0.5, -0.2, 1.4, -1.2, 0.5, -1.2, 2), nrow = 3, byrow = TRUE) mu_A <- c(2, 0) mu_B <- -1 Sigma_A <- Sigma[1:2, 1:2] Sigma_B <- Sigma[3, 3] Sigma_AB <- Sigma[1:2, 3] # b tmp_b <- t(Sigma_AB) %*% solve(Sigma_A) mu_b <- mu_B - tmp_b %*% mu_A Sigma_b <- Sigma_B - t(Sigma_AB) %*% solve(Sigma_A) %*% Sigma_AB mu_b ## [,1] ## [1,] -1.676471 tmp_b ## [,1] [,2] ## [1,] 0.3382353 -0.8088235 Sigma_b ## [,1] ## [1,] 0.8602941 # d tmp_a <- Sigma_AB * (1 / Sigma_B) mu_a <- mu_A - tmp_a * mu_B Sigma_d <- Sigma_A - (Sigma_AB * (1 / Sigma_B)) %*% t(Sigma_AB) mu_a ## [1] 2.25 -0.60 tmp_a ## [1] 0.25 -0.60 Sigma_d ## [,1] [,2] ## [1,] 0.875 0.10 ## [2,] 0.100 0.68 Solution. \\(B \\sim \\text{N}(-1, 2)\\). \\(B | A = a \\sim \\text{N}(-1.68 + [0.34, -0.81] a, 0.86)\\). \\(\\mu_A = [2, 0]^T\\) and \\(\\Sigma_A = \\begin{bmatrix} 1 & -0.2 & \\\\ -0.2 & 1.4 \\\\ \\end{bmatrix}\\). 
\\[\\begin{align} A | B = b &\\sim \\text{N}(\\mu_t, \\Sigma_t), \\\\ \\mu_t &= [2.25, -0.6]^T + [0.25, -0.6]^T b, \\\\ \\Sigma_t &= \\begin{bmatrix} 0.875 & 0.1 \\\\ 0.1 & 0.68 \\\\ \\end{bmatrix} \\end{align}\\] library(mvtnorm) set.seed(1) nsamps <- 1000 # a and b samps <- as.data.frame(matrix(data = NA, nrow = 4 * nsamps, ncol = 2)) samps[1:nsamps,1] <- rnorm(nsamps, mu_B, Sigma_B) samps[1:nsamps,2] <- "marginal" for (i in 1:3) { a <- rmvnorm(1, mu_A, Sigma_A) samps[(i*nsamps + 1):((i + 1) * nsamps), 1] <- rnorm(nsamps, mu_b + tmp_b %*% t(a), Sigma_b) samps[(i*nsamps + 1):((i + 1) * nsamps), 2] <- paste0(# "cond", round(a, digits = 2), collapse = "-") } colnames(samps) <- c("x", "dist") ggplot(samps, aes(x = x)) + geom_density() + facet_wrap(~ dist) # c and d samps <- as.data.frame(matrix(data = NA, nrow = 4 * nsamps, ncol = 3)) samps[1:nsamps,1:2] <- rmvnorm(nsamps, mu_A, Sigma_A) samps[1:nsamps,3] <- "marginal" for (i in 1:3) { b <- rnorm(1, mu_B, Sigma_B) samps[(i*nsamps + 1):((i + 1) * nsamps), 1:2] <- rmvnorm(nsamps, mu_a + tmp_a * b, Sigma_d) samps[(i*nsamps + 1):((i + 1) * nsamps), 3] <- b } colnames(samps) <- c("x", "y", "dist") ggplot(samps, aes(x = x, y = y)) + geom_point() + geom_smooth(method = "lm") + facet_wrap(~ dist) 8.3 Transformations Exercise 8.6 Let \\((U,V)\\) be a random variable with PDF \\(p(u,v) = \\frac{1}{4 \\sqrt{u}}\\), \\(U \\in [0,4]\\) and \\(V \\in [\\sqrt{U}, \\sqrt{U} + 1]\\). Let \\(X = \\sqrt{U}\\) and \\(Y = V - \\sqrt{U}\\). Find PDF of \\((X,Y)\\). What can you tell about distributions of \\(X\\) and \\(Y\\)? This exercise shows how we can simplify a probabilistic problem with a clever use of transformations. R: Take 1000 samples from \\((X,Y)\\) and transform them with inverses of the above functions to get samples from \\((U,V)\\). Plot both sets of samples. Solution. First we need to find the inverse functions. Since \\(x = \\sqrt{u}\\) it follows that \\(u = x^2\\), and that \\(x \\in [0,2]\\). Similarly \\(v = y + x\\) and \\(y \\in [0,1]\\). Let us first find the Jacobian. \\[\\renewcommand\\arraystretch{1.6} J(x,y) = \\begin{bmatrix} \\frac{\\partial u}{\\partial x} & \\frac{\\partial v}{\\partial x} \\\\%[1ex] % <-- 1ex more space between rows of matrix \\frac{\\partial u}{\\partial y} & \\frac{\\partial v}{\\partial y} \\end{bmatrix} = \\begin{bmatrix} 2x & 1 \\\\%[1ex] % <-- 1ex more space between rows of matrix 0 & 1 \\end{bmatrix}, \\] and the determinant is \\(|J(x,y)| = 2x\\). Putting everything together, we get \\[\\begin{align} p_{X,Y}(x,y) = p_{U,V}(x^2, y + x) |J(x,y)| = \\frac{1}{4 \\sqrt{x^2}} 2x = \\frac{1}{2}. \\end{align}\\] This reminds us of the Uniform distribution. Indeed we can see that \\(p_X(x) = \\frac{1}{2}\\) and \\(p_Y(y) = 1\\). So instead of dealing with an awkward PDF of \\((U,V)\\) and the corresponding dynamic bounds, we are now looking at two independent Uniform random variables. In practice, this could make modeling much easier. set.seed(1) nsamps <- 2000 x <- runif(nsamps, min = 0, max = 2) y <- runif(nsamps) orig <- tibble(x = x, y = y, vrs = "original") u <- x^2 v <- y + x transf <- tibble(x = u, y = v, vrs = "transformed") df <- bind_rows(orig, transf) ggplot(df, aes(x = x, y = y, color = vrs)) + geom_point(alpha = 0.3) Exercise 8.7 R: Write a function that will calculate the probability density of an arbitraty multivariate normal distribution, based on independent standardized normal PDFs. Compare with dmvnorm from the mvtnorm package. 
library(mvtnorm) set.seed(1) mvn_dens <- function (y, mu, Sigma) { L <- chol(Sigma) L_inv <- solve(t(L)) g_inv <- L_inv %*% t(y - mu) J <- L_inv J_det <- det(J) return(prod(dnorm(g_inv)) * J_det) } mu_v <- c(-2, 0, 1) cov_m <- matrix(c(1, -0.2, 0.5, -0.2, 2, 0.3, 0.5, 0.3, 1.6), ncol = 3, byrow = TRUE) n_comp <- 20 for (i in 1:n_comp) { x <- rmvnorm(1, mean = mu_v, sigma = cov_m) print(paste0("My function: ", mvn_dens(x, mu_v, cov_m), ", dmvnorm: ", dmvnorm(x, mu_v, cov_m))) } ## [1] "My function: 0.0229514237156383, dmvnorm: 0.0229514237156383" ## [1] "My function: 0.00763138915406231, dmvnorm: 0.00763138915406231" ## [1] "My function: 0.0230688881105741, dmvnorm: 0.0230688881105741" ## [1] "My function: 0.0113616213114731, dmvnorm: 0.0113616213114731" ## [1] "My function: 0.00151808500121907, dmvnorm: 0.00151808500121907" ## [1] "My function: 0.0257658045974509, dmvnorm: 0.0257658045974509" ## [1] "My function: 0.0157963825730805, dmvnorm: 0.0157963825730805" ## [1] "My function: 0.00408856287529248, dmvnorm: 0.00408856287529248" ## [1] "My function: 0.0327793540101256, dmvnorm: 0.0327793540101256" ## [1] "My function: 0.0111606542967978, dmvnorm: 0.0111606542967978" ## [1] "My function: 0.0147636757585684, dmvnorm: 0.0147636757585684" ## [1] "My function: 0.0142948300412207, dmvnorm: 0.0142948300412207" ## [1] "My function: 0.0203093820657542, dmvnorm: 0.0203093820657542" ## [1] "My function: 0.0287533273357481, dmvnorm: 0.0287533273357481" ## [1] "My function: 0.0213402305128623, dmvnorm: 0.0213402305128623" ## [1] "My function: 0.0218356957993885, dmvnorm: 0.0218356957993885" ## [1] "My function: 0.0250750113961771, dmvnorm: 0.0250750113961771" ## [1] "My function: 0.0166498666348048, dmvnorm: 0.0166498666348048" ## [1] "My function: 0.00189725106874659, dmvnorm: 0.00189725106874659" ## [1] "My function: 0.0196697814975113, dmvnorm: 0.0196697814975113" "],["ard.html", "Chapter 9 Alternative representation of distributions 9.1 Probability generating functions (PGFs) 9.2 Moment generating functions (MGFs)", " Chapter 9 Alternative representation of distributions This chapter deals with alternative representation of distributions. The students are expected to acquire the following knowledge: Theoretical Probability generating functions. Moment generating functions. .fold-btn { float: right; margin: 5px 5px 0 0; } .fold { border: 1px solid black; min-height: 40px; } 9.1 Probability generating functions (PGFs) Exercise 9.1 Show that the sum of independent Poisson random variables is itself a Poisson random variable. R: Let \\(X\\) be a sum of three Poisson distributions with \\(\\lambda_i \\in \\{2, 5.2, 10\\}\\). Take 1000 samples and plot the three distributions and the sum. Then take 1000 samples from the theoretical distribution of \\(X\\) and compare them to the sum. Solution. Let \\(X_i \\sim \\text{Poisson}(\\lambda_i)\\) for \\(i = 1,...,n\\), and let \\(X = \\sum_{i=1}^n X_i\\). 
\\[\\begin{align} \\alpha_X(t) &= \\prod_{i=1}^n \\alpha_{X_i}(t) \\\\ &= \\prod_{i=1}^n \\bigg( \\sum_{j=0}^\\infty t^j \\frac{\\lambda_i^j e^{-\\lambda_i}}{j!} \\bigg) \\\\ &= \\prod_{i=1}^n \\bigg( e^{-\\lambda_i} \\sum_{j=0}^\\infty \\frac{(t\\lambda_i)^j }{j!} \\bigg) \\\\ &= \\prod_{i=1}^n \\bigg( e^{-\\lambda_i} e^{t \\lambda_i} \\bigg) & \\text{power series} \\\\ &= \\prod_{i=1}^n \\bigg( e^{\\lambda_i(t - 1)} \\bigg) \\\\ &= e^{\\sum_{i=1}^n \\lambda_i(t - 1)} \\\\ &= e^{t \\sum_{i=1}^n \\lambda_i - \\sum_{i=1}^n \\lambda_i} \\\\ &= e^{-\\sum_{i=1}^n \\lambda_i} \\sum_{j=0}^\\infty \\frac{(t \\sum_{i=1}^n \\lambda_i)^j}{j!}\\\\ &= \\sum_{j=0}^\\infty \\frac{e^{-\\sum_{i=1}^n \\lambda_i} (t \\sum_{i=1}^n \\lambda_i)^j}{j!}\\\\ \\end{align}\\] The last term is the PGF of a Poisson random variable with parameter \\(\\sum_{i=1}^n \\lambda_i\\). Because the PGF is unique, \\(X\\) is a Poisson random variable. set.seed(1) library(tidyr) nsamps <- 1000 samps <- matrix(data = NA, nrow = nsamps, ncol = 4) samps[ ,1] <- rpois(nsamps, 2) samps[ ,2] <- rpois(nsamps, 5.2) samps[ ,3] <- rpois(nsamps, 10) samps[ ,4] <- samps[ ,1] + samps[ ,2] + samps[ ,3] colnames(samps) <- c(2, 2.5, 10, "sum") gsamps <- as_tibble(samps) gsamps <- gather(gsamps, key = "dist", value = "value") ggplot(gsamps, aes(x = value)) + geom_bar() + facet_wrap(~ dist) samps <- cbind(samps, "theoretical" = rpois(nsamps, 2 + 5.2 + 10)) gsamps <- as_tibble(samps[ ,4:5]) gsamps <- gather(gsamps, key = "dist", value = "value") ggplot(gsamps, aes(x = value, fill = dist)) + geom_bar(position = "dodge") Exercise 9.2 Find the expected value and variance of the negative binomial distribution. Hint: Find the Taylor series of \\((1 - y)^{-r}\\) at point 0. Solution. Let \\(X \\sim \\text{NB}(r, p)\\). \\[\\begin{align} \\alpha_X(t) &= E[t^X] \\\\ &= \\sum_{j=0}^\\infty t^j \\binom{j + r - 1}{j} (1 - p)^r p^j \\\\ &= (1 - p)^r \\sum_{j=0}^\\infty \\binom{j + r - 1}{j} (tp)^j \\\\ &= (1 - p)^r \\sum_{j=0}^\\infty \\frac{(j + r - 1)(j + r - 2)...r}{j!} (tp)^j. \\\\ \\end{align}\\] Let us look at the Taylor series of \\((1 - y)^{-r}\\) at 0 \\[\\begin{align} (1 - y)^{-r} = &1 + \\frac{-r(-1)}{1!}y + \\frac{-r(-r - 1)(-1)^2}{2!}y^2 + \\\\ &\\frac{-r(-r - 1)(-r - 2)(-1)^3}{3!}y^3 + ... \\\\ \\end{align}\\] How does the \\(k\\)-th term look like? We have \\(k\\) derivatives of our function so \\[\\begin{align} \\frac{d^k}{d^k y} (1 - y)^{-r} &= \\frac{-r(-r - 1)...(-r - k + 1)(-1)^k}{k!}y^k \\\\ &= \\frac{r(r + 1)...(r + k - 1)}{k!}y^k. \\end{align}\\] We observe that this equals to the \\(j\\)-th term in the sum of NB PGF. Therefore \\[\\begin{align} \\alpha_X(t) &= (1 - p)^r (1 - tp)^{-r} \\\\ &= \\Big(\\frac{1 - p}{1 - tp}\\Big)^r \\end{align}\\] To find the expected value, we need to differentiate \\[\\begin{align} \\frac{d}{dt} \\Big(\\frac{1 - p}{1 - tp}\\Big)^r &= r \\Big(\\frac{1 - p}{1 - tp}\\Big)^{r-1} \\frac{d}{dt} \\frac{1 - p}{1 - tp} \\\\ &= r \\Big(\\frac{1 - p}{1 - tp}\\Big)^{r-1} \\frac{p(1 - p)}{(1 - tp)^2}. \\\\ \\end{align}\\] Evaluating this at 1, we get: \\[\\begin{align} E[X] = \\frac{rp}{1 - p}. \\end{align}\\] For the variance we need the second derivative. 
\\[\\begin{align} \\frac{d^2}{d^2t} \\Big(\\frac{1 - p}{1 - tp}\\Big)^r &= \\frac{p^2 r (r + 1) (\\frac{1 - p}{1 - tp})^r}{(tp - 1)^2} \\end{align}\\] Evaluating this at 1 and inserting the first derivatives, we get: \\[\\begin{align} Var[X] &= \\frac{d^2}{dt^2} \\alpha_X(1) + \\frac{d}{dt}\\alpha_X(1) - \\Big(\\frac{d}{dt}\\alpha_X(t) \\Big)^2 \\\\ &= \\frac{p^2 r (r + 1)}{(1 - p)^2} + \\frac{rp}{1 - p} - \\frac{r^2p^2}{(1 - p)^2} \\\\ &= \\frac{rp}{(1 - p)^2}. \\end{align}\\] library(tidyr) set.seed(1) nsamps <- 100000 find_p <- function (mu, r) { return (10 / (r + 10)) } r <- c(1,2,10,20) p <- find_p(10, r) sigma <- rep(sqrt(p*r / (1 - p)^2), each = nsamps) samps <- cbind("r=1" = rnbinom(nsamps, size = r[1], prob = 1 - p[1]), "r=2" = rnbinom(nsamps, size = r[2], prob = 1 - p[2]), "r=4" = rnbinom(nsamps, size = r[3], prob = 1 - p[3]), "r=20" = rnbinom(nsamps, size = r[4], prob = 1 - p[4])) gsamps <- gather(as.data.frame(samps)) iw <- (gsamps$value > sigma + 10) | (gsamps$value < sigma - 10) ggplot(gsamps, aes(x = value, fill = iw)) + geom_bar() + # geom_density() + facet_wrap(~ key) 9.2 Moment generating functions (MGFs) Exercise 9.3 Find the variance of the geometric distribution. Solution. Let \\(X \\sim \\text{Geometric}(p)\\). The MGF of the geometric distribution is \\[\\begin{align} M_X(t) &= E[e^{tX}] \\\\ &= \\sum_{k=0}^\\infty p(1 - p)^k e^{tk} \\\\ &= p \\sum_{k=0}^\\infty ((1 - p)e^t)^k. \\end{align}\\] Let us assume that \\((1 - p)e^t < 1\\). Then, by using the geometric series we get \\[\\begin{align} M_X(t) &= \\frac{p}{1 - e^t + pe^t}. \\end{align}\\] The first derivative of the above expression is \\[\\begin{align} \\frac{d}{dt}M_X(t) &= \\frac{-p(-e^t + pe^t)}{(1 - e^t + pe^t)^2}, \\end{align}\\] and evaluating at \\(t = 0\\), we get \\(\\frac{1 - p}{p}\\), which we already recognize as the expected value of the geometric distribution. The second derivative is \\[\\begin{align} \\frac{d^2}{dt^2}M_X(t) &= \\frac{(p-1)pe^t((p-1)e^t - 1)}{((p - 1)e^t + 1)^3}, \\end{align}\\] and evaluating at \\(t = 0\\), we get \\(\\frac{(p - 1)(p - 2)}{p^2}\\). Combining we get the variance \\[\\begin{align} Var(X) &= \\frac{(p - 1)(p - 2)}{p^2} - \\frac{(1 - p)^2}{p^2} \\\\ &= \\frac{(p-1)(p-2) - (1-p)^2}{p^2} \\\\ &= \\frac{1 - p}{p^2}. \\end{align}\\] Exercise 9.4 Find the distribution of sum of two normal random variables \\(X\\) and \\(Y\\), by comparing \\(M_{X+Y}(t)\\) to \\(M_X(t)\\). R: To illustrate the result draw random samples from N\\((-3, 1)\\) and N\\((5, 1.2)\\) and calculate the empirical mean and variance of \\(X+Y\\). Plot all three histograms in one plot. Solution. Let \\(X \\sim \\text{N}(\\mu_X, 1)\\) and \\(Y \\sim \\text{N}(\\mu_Y, 1)\\). The MGF of the sum is \\[\\begin{align} M_{X+Y}(t) &= M_X(t) M_Y(t). \\end{align}\\] Let us calculate \\(M_X(t)\\), the MGF for \\(Y\\) then follows analogously. 
\\[\\begin{align} M_X(t) &= \\int_{-\\infty}^\\infty e^{tx} \\frac{1}{\\sqrt{2 \\pi \\sigma_X^2}} e^{-\\frac{(x - \\mu_X)^2}{2\\sigma_X^2}} dx \\\\ &= \\int_{-\\infty}^\\infty \\frac{1}{\\sqrt{2 \\pi \\sigma_X^2}} e^{-\\frac{(x - \\mu_X)^2 - 2\\sigma_X^2 tx}{2\\sigma_X^2}} dx \\\\ &= \\int_{-\\infty}^\\infty \\frac{1}{\\sqrt{2 \\pi \\sigma_X^2}} e^{-\\frac{x^2 - 2\\mu_X x + \\mu_X^2 - 2\\sigma_X^2 tx}{2\\sigma_X^2}} dx \\\\ &= \\int_{-\\infty}^\\infty \\frac{1}{\\sqrt{2 \\pi \\sigma_X^2}} e^{-\\frac{(x - (\\mu_X + \\sigma_X^2 t))^2 + \\mu_X^2 - (\\mu_X + \\sigma_X^2 t)^2}{2\\sigma_X^2}} dx & \\text{complete the square}\\\\ &= e^{-\\frac{\\mu_X^2 - (\\mu_X + \\sigma_X^2 t)^2}{2\\sigma_X^2}} \\int_{-\\infty}^\\infty \\frac{1}{\\sqrt{2 \\pi \\sigma_X^2}} e^{-\\frac{(x - (\\mu_X + \\sigma_X^2 t))^2}{2\\sigma_X^2}} dx & \\\\ &= e^{-\\frac{\\mu_X^2 - (\\mu_X + \\sigma_X^2 t)^2}{2\\sigma_X^2}} & \\text{normal PDF} \\\\ &= e^{-\\frac{\\mu_X^2 - \\mu_X^2 - 2\\mu_X \\sigma_X^2 t - \\sigma_X^4 t^2}{2\\sigma_X^2}} \\\\ &= e^{\\mu_X t + \\frac{\\sigma_X^2 t^2}{2}}. \\\\ \\end{align}\\] The MGF of the sum is then \\[\\begin{align} M_{X+Y}(t) &= e^{\\mu_X t + 0.5 \\sigma_X^2 t^2} e^{\\mu_Y t + 0.5 \\sigma_Y^2 t^2} \\\\ &= e^{t(\\mu_X + \\mu_Y) + 0.5 t^2 (\\sigma_X^2 + \\sigma_Y^2)}. \\end{align}\\] By comparing \\(M_{X+Y}(t)\\) and \\(M_X(t)\\) we observe that both have the same form: the coefficient of \\(t\\) is the mean and the coefficient of \\(\\frac{t^2}{2}\\) is the variance. Since MGFs are unique, we conclude that \\(Z = X + Y \\sim \\text{N}(\\mu_X + \\mu_Y, \\sigma_X^2 + \\sigma_Y^2)\\). library(tidyr) library(ggplot2) set.seed(1) nsamps <- 1000 x <- rnorm(nsamps, -3, 1) y <- rnorm(nsamps, 5, 1.2) z <- x + y mean(z) ## [1] 1.968838 var(z) ## [1] 2.645034 df <- data.frame(x = x, y = y, z = z) %>% gather() ggplot(df, aes(x = value, fill = key)) + geom_histogram(position = "dodge") "],["ci.html", "Chapter 10 Concentration inequalities 10.1 Comparison 10.2 Practical", " Chapter 10 Concentration inequalities This chapter deals with concentration inequalities. The students are expected to acquire the following knowledge: Theoretical More assumptions produce closer bounds. R Optimization. Estimating probability inequalities. 10.1 Comparison Exercise 10.1 R: Let \\(X\\) be a geometric random variable with \\(p = 0.7\\). Visually compare the Markov bound, Chernoff bound, and the theoretical probabilities for \\(x = 1,...,12\\). To get the best fitting Chernoff bound, you will need to optimize the bound depending on \\(t\\). Use either analytical or numerical optimization.
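For reference, here are the two bounds that the solution code below evaluates — a short sketch, assuming the same parametrization of the geometric distribution as in Exercise 9.3 (support \\(\\{0, 1, 2, ...\\}\\), which is also what R's pgeom uses). The Markov bound only needs \\(E[X] = \\frac{1 - p}{p}\\), while the Chernoff bound uses the MGF derived in Exercise 9.3: \\[\\begin{align} P(X \\geq a) &\\leq \\frac{E[X]}{a} = \\frac{1 - p}{p a} & \\text{Markov} \\\\ P(X \\geq a) &\\leq \\min_{0 < t < \\ln \\frac{1}{1 - p}} e^{-ta} M_X(t) = \\min_{0 < t < \\ln \\frac{1}{1 - p}} \\frac{p}{(1 - (1 - p)e^t) \\, e^{ta}} & \\text{Chernoff} \\end{align}\\] The upper limit on \\(t\\) is needed for the MGF to exist, and it is exactly the interval passed to optimize() in the code below.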
bound_chernoff <- function (t, p, a) { return ((p / (1 - exp(t) + p * exp(t))) / exp(a * t)) } set.seed(1) p <- 0.7 a <- seq(1, 12, by = 1) ci_markov <- (1 - p) / p / a t <- vector(mode = "numeric", length = length(a)) for (i in 1:length(t)) { t[i] <- optimize(bound_chernoff, interval = c(0, log(1 / (1 - p))), p = p, a = a[i])$minimum } t ## [1] 0.5108267 0.7984981 0.9162927 0.9808238 1.0216635 1.0498233 1.0704327 ## [8] 1.0861944 1.0986159 1.1086800 1.1169653 1.1239426 ci_chernoff <- (p / (1 - exp(t) + p * exp(t))) / exp(a * t) actual <- 1 - pgeom(a, 0.7) plot_df <- rbind( data.frame(x = a, y = ci_markov, type = "Markov"), data.frame(x = a, y = ci_chernoff, type = "Chernoff"), data.frame(x = a, y = actual, type = "Actual") ) ggplot(plot_df, aes(x = x, y = y, color = type)) + geom_line() Exercise 10.2 R: Let \\(X\\) be a sum of 100 Beta distributions with random parameters. Take 1000 samples and plot the Chebyshev bound, Hoeffding bound, and the empirical probabilities. set.seed(1) nvars <- 100 nsamps <- 1000 samps <- matrix(data = NA, nrow = nsamps, ncol = nvars) Sn_mean <- 0 Sn_var <- 0 for (i in 1:nvars) { alpha1 <- rgamma(1, 10, 1) beta1 <- rgamma(1, 10, 1) X <- rbeta(nsamps, alpha1, beta1) Sn_mean <- Sn_mean + alpha1 / (alpha1 + beta1) Sn_var <- Sn_var + alpha1 * beta1 / ((alpha1 + beta1)^2 * (alpha1 + beta1 + 1)) samps[ ,i] <- X } mean(apply(samps, 1, sum)) ## [1] 51.12511 Sn_mean ## [1] 51.15723 var(apply(samps, 1, sum)) ## [1] 1.170652 Sn_var ## [1] 1.166183 a <- 1:30 b <- a / sqrt(Sn_var) ci_chebyshev <- 1 / b^2 ci_hoeffding <- 2 * exp(- 2 * a^2 / nvars) empirical <- NULL for (i in 1:length(a)) { empirical[i] <- sum(abs((apply(samps, 1, sum)) - Sn_mean) >= a[i])/ nsamps } plot_df <- rbind( data.frame(x = a, y = ci_chebyshev, type = "Chebyshev"), data.frame(x = a, y = ci_hoeffding, type = "Hoeffding"), data.frame(x = a, y = empirical, type = "Empirical") ) ggplot(plot_df, aes(x = x, y = y, color = type)) + geom_line() ggplot(plot_df, aes(x = x, y = y, color = type)) + geom_line() + coord_cartesian(xlim = c(15, 25), ylim = c(0, 0.05)) 10.2 Practical Exercise 10.3 From Jagannathan. Let \\(X_i\\), \\(i = 1,...n\\), be a random sample of size \\(n\\) of a random variable \\(X\\). Let \\(X\\) have mean \\(\\mu\\) and variance \\(\\sigma^2\\). Find the size of the sample \\(n\\) required so that the probability that the difference between sample mean and true mean is smaller than \\(\\frac{\\sigma}{10}\\) is at least 0.95. Hint: Derive a version of the Chebyshev inequality for \\(P(|X - \\mu| \\geq a)\\) using Markov inequality. Solution. Let \\(\\bar{X} = \\sum_{i=1}^n X_i\\). Then \\(E[\\bar{X}] = \\mu\\) and \\(Var[\\bar{X}] = \\frac{\\sigma^2}{n}\\). Let us first derive another representation of Chebyshev inequality. \\[\\begin{align} P(|X - \\mu| \\geq a) = P(|X - \\mu|^2 \\geq a^2) \\leq \\frac{E[|X - \\mu|^2]}{a^2} = \\frac{Var[X]}{a^2}. \\end{align}\\] Let us use that on our sampling distribution: \\[\\begin{align} P(|\\bar{X} - \\mu| \\geq \\frac{\\sigma}{10}) \\leq \\frac{100 Var[\\bar{X}]}{\\sigma^2} = \\frac{100 Var[X]}{n \\sigma^2} = \\frac{100}{n}. \\end{align}\\] We are interested in the difference being smaller, therefore \\[\\begin{align} P(|\\bar{X} - \\mu| < \\frac{\\sigma}{10}) = 1 - P(|\\bar{X} - \\mu| \\geq \\frac{\\sigma}{10}) \\geq 1 - \\frac{100}{n} \\geq 0.95. \\end{align}\\] It follows that we need a sample size of \\(n \\geq \\frac{100}{0.05} = 2000\\). 
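As a quick empirical sanity check of this sample size we can simulate; this is only a sketch, assuming an arbitrary example distribution (Exponential(1), for which \\(\\mu = \\sigma = 1\\)):

```r
set.seed(1)
n     <- 2000   # sample size suggested by the Chebyshev bound
nreps <- 1000   # number of simulated samples
mu    <- 1      # Exp(1) mean
sigma <- 1      # Exp(1) standard deviation

# proportion of samples whose mean lands within sigma / 10 of the true mean
within <- replicate(nreps, abs(mean(rexp(n, rate = 1)) - mu) < sigma / 10)
mean(within)    # should comfortably exceed 0.95, since Chebyshev is conservative
```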
"],["crv.html", "Chapter 11 Convergence of random variables", " Chapter 11 Convergence of random variables This chapter deals with convergence of random variables. The students are expected to acquire the following knowledge: Theoretical Finding convergences of random variables. .fold-btn { float: right; margin: 5px 5px 0 0; } .fold { border: 1px solid black; min-height: 40px; } Exercise 11.1 Let \\(X_1\\), \\(X_2\\),…, \\(X_n\\) be a sequence of Bernoulli random variables. Let \\(Y_n = \\frac{X_1 + X_2 + ... + X_n}{n^2}\\). Show that this sequence converges point-wise to the zero random variable. R: Use a simulation to check your answer. Solution. Let \\(\\epsilon\\) be arbitrary. We need to find such \\(n_0\\), that for every \\(n\\) greater than \\(n_0\\) \\(|Y_n| < \\epsilon\\) holds. \\[\\begin{align} |Y_n| &= |\\frac{X_1 + X_2 + ... + X_n}{n^2}| \\\\ &\\leq |\\frac{n}{n^2}| \\\\ &= \\frac{1}{n}. \\end{align}\\] So we need to find such \\(n_0\\), that for every \\(n > n_0\\) we will have \\(\\frac{1}{n} < \\epsilon\\). So \\(n_0 > \\frac{1}{\\epsilon}\\). x <- 1:1000 X <- matrix(data = NA, nrow = length(x), ncol = 100) y <- vector(mode = "numeric", length = length(x)) for (i in 1:length(x)) { X[i, ] <- rbinom(100, size = 1, prob = 0.5) } X <- apply(X, 2, cumsum) tmp_mat <- matrix(data = (1:1000)^2, nrow = 1000, ncol = 100) X <- X / tmp_mat y <- apply(X, 1, mean) ggplot(data.frame(x = x, y = y), aes(x = x, y = y)) + geom_line() Exercise 11.2 Let \\(\\Omega = [0,1]\\) and let \\(X_n\\) be a sequence of random variables, defined as \\[\\begin{align} X_n(\\omega) = \\begin{cases} \\omega^3, &\\omega = \\frac{i}{n}, &0 \\leq i \\leq 1 \\\\ 1, & \\text{otherwise.} \\end{cases} \\end{align}\\] Show that \\(X_n\\) converges almost surely to \\(X \\sim \\text{Uniform}(0,1)\\). Solution. We need to show \\(P(\\{\\omega: X_n(\\omega) \\rightarrow X(\\omega)\\}) = 1\\). Let \\(\\omega \\neq \\frac{i}{n}\\). Then for any \\(\\omega\\), \\(X_n\\) converges pointwise to \\(X\\): \\[\\begin{align} X_n(\\omega) = 1 \\implies |X_n(\\omega) - X(s)| = |1 - 1| < \\epsilon. \\end{align}\\] The above is independent of \\(n\\). Since there are countably infinite number of elements in the complement (\\(\\frac{i}{n}\\)), the probability of this set is 1. Exercise 11.3 Borrowed from Wasserman. Let \\(X_n \\sim \\text{N}(0, \\frac{1}{n})\\) and let \\(X\\) be a random variable with CDF \\[\\begin{align} F_X(x) = \\begin{cases} 0, &x < 0 \\\\ 1, &x \\geq 0. \\end{cases} \\end{align}\\] Does \\(X_n\\) converge to \\(X\\) in distribution? How about in probability? Prove or disprove these statement. R: Plot the CDF of \\(X_n\\) for \\(n = 1, 2, 5, 10, 100, 1000\\). Solution. Let us first check convergence in distribution. \\[\\begin{align} \\lim_{n \\rightarrow \\infty} F_{X_n}(x) &= \\lim_{n \\rightarrow \\infty} \\phi (\\sqrt(n) x). \\end{align}\\] We have two cases, for \\(x < 0\\) and \\(x > 0\\). We do not need to check for \\(x = 0\\), since \\(F_X\\) is not continuous in that point. \\[\\begin{align} \\lim_{n \\rightarrow \\infty} \\phi (\\sqrt(n) x) = \\begin{cases} 0, & x < 0 \\\\ 1, & x > 0. \\end{cases} \\end{align}\\] This is the same as \\(F_X\\). Let us now check convergence in probability. 
Since \\(X\\) is a point-mass distribution at zero, we have \\[\\begin{align} \\lim_{n \\rightarrow \\infty} P(|X_n| > \\epsilon) &= \\lim_{n \\rightarrow \\infty} (P(X_n > \\epsilon) + P(X_n < -\\epsilon)) \\\\ &= \\lim_{n \\rightarrow \\infty} (1 - P(X_n < \\epsilon) + P(X_n < -\\epsilon)) \\\\ &= \\lim_{n \\rightarrow \\infty} (1 - \\phi(\\sqrt{n} \\epsilon) + \\phi(- \\sqrt{n} \\epsilon)) \\\\ &= 0. \\end{align}\\] n <- c(1,2,5,10,100,1000) ggplot(data = data.frame(x = seq(-5, 5, by = 0.01)), aes(x = x)) + stat_function(fun = pnorm, args = list(mean = 0, sd = 1/1), aes(color = "sd = 1/1")) + stat_function(fun = pnorm, args = list(mean = 0, sd = 1/2), aes(color = "sd = 1/2")) + stat_function(fun = pnorm, args = list(mean = 0, sd = 1/5), aes(color = "sd = 1/5")) + stat_function(fun = pnorm, args = list(mean = 0, sd = 1/10), aes(color = "sd = 1/10")) + stat_function(fun = pnorm, args = list(mean = 0, sd = 1/100), aes(color = "sd = 1/100")) + stat_function(fun = pnorm, args = list(mean = 0, sd = 1/1000), aes(color = "sd = 1/1000")) + stat_function(fun = pnorm, args = list(mean = 0, sd = 1/10000), aes(color = "sd = 1/10000")) Exercise 11.4 Let \\(X_i\\) be i.i.d. and \\(\\mu = E(X_1)\\). Let variance of \\(X_1\\) be finite. Show that the mean of \\(X_i\\), \\(\\bar{X}_n = \\frac{1}{n}\\sum_{i=1}^n X_i\\) converges in quadratic mean to \\(\\mu\\). Solution. \\[\\begin{align} \\lim_{n \\rightarrow \\infty} E(|\\bar{X_n} - \\mu|^2) &= \\lim_{n \\rightarrow \\infty} E(\\bar{X_n}^2 - 2 \\bar{X_n} \\mu + \\mu^2) \\\\ &= \\lim_{n \\rightarrow \\infty} (E(\\bar{X_n}^2) - 2 \\mu E(\\frac{\\sum_{i=1}^n X_i}{n}) + \\mu^2) \\\\ &= \\lim_{n \\rightarrow \\infty} E(\\bar{X_n})^2 + \\lim_{n \\rightarrow \\infty} Var(\\bar{X_n}) - 2 \\mu^2 + \\mu^2 \\\\ &= \\lim_{n \\rightarrow \\infty} \\frac{n^2 \\mu^2}{n^2} + \\lim_{n \\rightarrow \\infty} \\frac{\\sigma^2}{n} - \\mu^2 \\\\ &= \\mu^2 - \\mu^2 + \\lim_{n \\rightarrow \\infty} \\frac{\\sigma^2}{n} \\\\ &= 0. \\end{align}\\] "],["lt.html", "Chapter 12 Limit theorems", " Chapter 12 Limit theorems This chapter deals with limit theorems. The students are expected to acquire the following knowledge: Theoretical Monte Carlo integration convergence. Difference between weak and strong law of large numbers. .fold-btn { float: right; margin: 5px 5px 0 0; } .fold { border: 1px solid black; min-height: 40px; } Exercise 12.1 Show that Monte Carlo integration converges almost surely to the true integral of a bounded function. Solution. Let \\(g\\) be a function defined on \\(\\Omega\\). Let \\(X_i\\), \\(i = 1,...,n\\) be i.i.d. (multivariate) uniform random variables with bounds defined on \\(\\Omega\\). Let \\(Y_i\\) = \\(g(X_i)\\). Then it follows that \\(Y_i\\) are also i.i.d. random variables and their expected value is \\(E[g(X)] = \\int_{\\Omega} g(x) f_X(x) dx = \\frac{1}{V_{\\Omega}} \\int_{\\Omega} g(x) dx\\). By the strong law of large numbers, we have \\[\\begin{equation} \\frac{1}{n}\\sum_{i=1}^n Y_i \\xrightarrow{\\text{a.s.}} E[g(X)]. \\end{equation}\\] It follows that \\[\\begin{equation} V_{\\Omega} \\frac{1}{n}\\sum_{i=1}^n Y_i \\xrightarrow{\\text{a.s.}} \\int_{\\Omega} g(x) dx. \\end{equation}\\] Exercise 12.2 Let \\(X\\) be a geometric random variable with probability 0.5 and support in positive integers. Let \\(Y = 2^X (-1)^X X^{-1}\\). Find the expected value of \\(Y\\) by using conditional convergence (this variable does not have an expected value in the conventional sense – the series is not absolutely convergent). 
R: Draw \\(10000\\) samples from a geometric distribution with probability 0.5 and support in positive integers to get \\(X\\). Then calculate \\(Y\\) and plot the means at each iteration (sample). Additionally, plot the expected value calculated in a. Try it with different seeds. What do you notice? Solution. \\[\\begin{align*} E[Y] &= \\sum_{x=1}^{\\infty} \\frac{2^x (-1)^x}{x} 0.5^x \\\\ &= \\sum_{x=1}^{\\infty} \\frac{(-1)^x}{x} \\\\ &= - \\sum_{x=1}^{\\infty} \\frac{(-1)^{x+1}}{x} \\\\ &= - \\ln(2) \\end{align*}\\] set.seed(3) x <- rgeom(100000, prob = 0.5) + 1 y <- 2^x * (-1)^x * x^{-1} y_means <- cumsum(y) / seq_along(y) df <- data.frame(x = 1:length(y_means), y = y_means) ggplot(data = df, aes(x = x, y = y)) + geom_line() + geom_hline(yintercept = -log(2)) "],["eb.html", "Chapter 13 Estimation basics 13.1 ECDF 13.2 Properties of estimators", " Chapter 13 Estimation basics This chapter deals with estimation basics. The students are expected to acquire the following knowledge: Biased and unbiased estimators. Consistent estimators. Empirical cumulative distribution function. .fold-btn { float: right; margin: 5px 5px 0 0; } .fold { border: 1px solid black; min-height: 40px; } 13.1 ECDF Exercise 13.1 (ECDF intuition) Take any univariate continuous distribution that is readily available in R and plot its CDF (\\(F\\)). Draw one sample (\\(n = 1\\)) from the chosen distribution and draw the ECDF (\\(F_n\\)) of that one sample. Use the definition of the ECDF, not an existing function in R. Implementation hint: ECDFs are always piecewise constant - they only jump at the sampled values and by \\(1/n\\). Repeat (b) for \\(n = 5, 10, 100, 1000...\\) Theory says that \\(F_n\\) should converge to \\(F\\). Can you observe that? For \\(n = 100\\) repeat the process \\(m = 20\\) times and plot every \\(F_n^{(m)}\\). Theory says that \\(F_n\\) will converge to \\(F\\) the slowest where \\(F\\) is close to 0.5 (where the variance is largest). Can you observe that? library(ggplot2) set.seed(1) ggplot(data = data.frame(x = seq(-5, 5, by = 0.01))) + # stat_function(aes(x = x), fun = pbeta, args = list(shape1 = 1, shape2 = 2)) stat_function(aes(x = x), fun = pnorm, args = list(mean = 0, sd = 1)) one_samp <- rnorm(1) X <- data.frame(x = c(-5, one_samp, 5), y = c(0,1,1)) ggplot(data = data.frame(x = seq(-5, 5, by = 0.01))) + # stat_function(aes(x = x), fun = pbeta, args = list(shape1 = 1, shape2 = 2)) stat_function(aes(x = x), fun = pnorm, args = list(mean = 0, sd = 1)) + geom_step(data = X, aes(x = x, y = y)) N <- c(5, 10, 100, 1000) X <- NULL for (n in N) { tmp <- rnorm(n) tmp_X <- data.frame(x = c(-5, sort(tmp), 5), y = c(0, seq(1/n, 1, by = 1/n), 1), n = n) X <- rbind(X, tmp_X) } ggplot(data = data.frame(x = seq(-5, 5, by = 0.01))) + # stat_function(aes(x = x), fun = pbeta, args = list(shape1 = 1, shape2 = 2)) stat_function(aes(x = x), fun = pnorm, args = list(mean = 0, sd = 1)) + geom_step(data = X, aes(x = x, y = y, color = as.factor(n))) + labs(color = "N") 13.2 Properties of estimators Exercise 13.2 Show that the sample average is, as an estimator of the mean: unbiased, consistent, asymptotically normal. Solution. \\[\\begin{align*} E[\\frac{1}{n} \\sum_{i=1}^n X_i] &= \\frac{1}{n} \\sum_{i=i}^n E[X_i] \\\\ &= E[X]. 
\\end{align*}\\] \\[\\begin{align*} \\lim_{n \\rightarrow \\infty} P(|\\frac{1}{n} \\sum_{i=1}^n X_i - E[X]| > \\epsilon) &= \\lim_{n \\rightarrow \\infty} P((\\frac{1}{n} \\sum_{i=1}^n X_i - E[X])^2 > \\epsilon^2) \\\\ & \\leq \\lim_{n \\rightarrow \\infty} \\frac{E[(\\frac{1}{n} \\sum_{i=1}^n X_i - E[X])^2]}{\\epsilon^2} & \\text{Markov inequality} \\\\ & = \\lim_{n \\rightarrow \\infty} \\frac{E[(\\frac{1}{n} \\sum_{i=1}^n X_i)^2 - 2 \\frac{1}{n} \\sum_{i=1}^n X_i E[X] + E[X]^2]}{\\epsilon^2} \\\\ & = \\lim_{n \\rightarrow \\infty} \\frac{E[(\\frac{1}{n} \\sum_{i=1}^n X_i)^2] - 2 E[X]^2 + E[X]^2}{\\epsilon^2} \\\\ &= 0 \\end{align*}\\] For the last equality see the solution to ??. Follows directly from the CLT. Exercise 13.3 (Consistent but biased estimator) Show that sample variance (the plug-in estimator of variance) is a biased estimator of variance. Show that sample variance is a consistent estimator of variance. Show that the estimator with (\\(N-1\\)) (Bessel correction) is unbiased. Solution. \\[\\begin{align*} E[\\frac{1}{n} \\sum_{i=1}^n (Y_i - \\bar{Y})^2] &= \\frac{1}{n} \\sum_{i=1}^n E[(Y_i - \\bar{Y})^2] \\\\ &= \\frac{1}{n} \\sum_{i=1}^n E[Y_i^2] - 2 E[Y_i \\bar{Y}] + \\bar{Y}^2)] \\\\ &= \\frac{1}{n} \\sum_{i=1}^n E[Y_i^2 - 2 Y_i \\bar{Y} + \\bar{Y}^2] \\\\ &= \\frac{1}{n} \\sum_{i=1}^n E[Y_i^2 - \\frac{2}{n} Y_i^2 - \\frac{2}{n} \\sum_{i \\neq j} Y_i Y_j + \\frac{1}{n^2}\\sum_j \\sum_{k \\neq j} Y_j Y_k + \\frac{1}{n^2} \\sum_j Y_j^2] \\\\ &= \\frac{1}{n} \\sum_{i=1}^n \\frac{n - 2}{n} (\\sigma^2 + \\mu^2) - \\frac{2}{n} (n - 1) \\mu^2 + \\frac{1}{n^2}n(n-1)\\mu^2 + \\frac{1}{n^2}n(\\sigma^2 + \\mu^2) \\\\ &= \\frac{n-1}{n}\\sigma^2 \\\\ < \\sigma^2. \\end{align*}\\] Let \\(S_n\\) denote the sample variance. Then we can write it as \\[\\begin{align*} S_n &= \\frac{1}{n} \\sum_{i=1}^n (X_i - \\bar{X})^2 = \\frac{1}{n} \\sum_{i=1}^n (X_i - \\mu)^2 + 2(X_i - \\mu)(\\mu - \\bar{X}) + (\\mu - \\bar{X})^2. \\end{align*}\\] Now \\(\\bar{X}\\) converges in probability (by WLLN) to \\(\\mu\\) therefore the right terms converge in probability to zero. The left term converges in probability to \\(\\sigma^2\\), also by WLLN. Therefore the sample variance is a consistent estimatior of the variance. The denominator changes in the second-to-last line of a., therefore the last line is now equality. Exercise 13.4 (Estimating the median) Show that the sample median is an unbiased estimator of the median for N\\((\\mu, \\sigma^2)\\). Show that the sample median is an unbiased estimator of the mean for any distribution with symmetric density. Hint 1: The pdf of an order statistic is \\(f_{X_{(k)}}(x) = \\frac{n!}{(n - k)!(k - 1)!}f_X(x)\\Big(F_X(x)^{k-1} (1 - F_X(x)^{n - k}) \\Big)\\). Hint 2: A distribution is symmetric when \\(X\\) and \\(2a - X\\) have the same distribution for some \\(a\\). Solution. Let \\(Z_i\\), \\(i = 1,...,n\\) be i.i.d. variables with a symmetric distribution and let \\(Z_{k:n}\\) denote the \\(k\\)-th order statistic. We will distinguish two cases, when \\(n\\) is odd and when \\(n\\) is even. Let first \\(n = 2m + 1\\) be odd. Then the sample median is \\(M = Z_{m+1:2m+1}\\). Its PDF is \\[\\begin{align*} f_M(x) = (m+1)\\binom{2m + 1}{m}f_Z(x)\\Big(F_Z(x)^m (1 - F_Z(x)^m) \\Big). \\end{align*}\\] For every symmetric distribution, it holds that \\(F_X(x) = 1 - F(2a - x)\\). Let \\(a = \\mu\\), the population mean. Plugging this into the PDF, we get that \\(f_M(x) = f_M(2\\mu -x)\\). 
It follows that \\[\\begin{align*} E[M] &= E[2\\mu - M] \\\\ 2E[M] &= 2\\mu \\\\ E[M] &= \\mu. \\end{align*}\\] Now let \\(n = 2m\\) be even. Then the sample median is \\(M = \\frac{Z_{m:2m} + Z_{m+1:2m}}{2}\\). It can be shown, that the joint PDF of these terms is also symmetric. Therefore, similar to the above \\[\\begin{align*} E[M] &= E[\\frac{Z_{m:2m} + Z_{m+1:2m}}{2}] \\\\ &= E[\\frac{2\\mu - M + 2\\mu - M}{2}] \\\\ &= E[2\\mu - M]. \\end{align*}\\] The above also proves point a. as the median and the mean are the same in normal distribution. Exercise 13.5 (Matrix trace estimation) The Hutchinson trace estimator [1] is an estimator of the trace of a symmetric positive semidefinite matrix A that relies on Monte Carlo sampling. The estimator is defined as \\[\\begin{align*} \\textrm{tr}(A) \\approx \\frac{1}{n} \\Sigma_{i=1}^n z_i^T A z_i, &\\\\ z_i \\sim_{\\mathrm{IID}} \\textrm{Uniform}(\\{-1, 1\\}^m), & \\end{align*}\\] where \\(A \\in \\mathbb{R}^{m \\times m}\\) is a symmetric positive semidefinite matrix. Elements of each vector \\(z_i\\) are either \\(-1\\) or \\(1\\) with equal probability. This is also called a Rademacher distribution. Data scientists often want the trace of a Hessian to obtain valuable curvature information for a loss function. Per [2], an example is classifying ten digits based on \\((28,28)\\) grayscale images (i.e. MNIST data) using logistic regression. The number of parameters is \\(m = 28^2 \\cdot 10 = 7840\\) and the size of the Hessian is \\(m^2\\), roughly \\(6 \\cdot 10^6\\). The diagonal average is equal to the average eigenvalue, which may be useful for optimization; in MCMC contexts, this would be useful for preconditioners and step size optimization. Computing Hessians (as a means of getting eigenvalue information) is often intractable, but Hessian-vector products can be computed faster by autodifferentiation (with e.g. Tensorflow, Pytorch, Jax). This is one motivation for the use of a stochastic trace estimator as outlined above. References: A stochastic estimator of the trace of the influence matrix for laplacian smoothing splines (Hutchinson, 1990) A Modern Analysis of Hutchinson’s Trace Estimator (Skorski, 2020) Prove that the Hutchinson trace estimator is an unbiased estimator of the trace. Solution. We first simplify our task: \\[\\begin{align} \\mathbb{E}\\left[\\frac{1}{n} \\Sigma_{i=1}^n z_i^T A z_i \\right] &= \\frac{1}{n} \\Sigma_{i=1}^n \\mathbb{E}\\left[z_i^T A z_i \\right] \\\\ &= \\mathbb{E}\\left[z_i^T A z_i \\right], \\end{align}\\] where the second equality is due to having \\(n\\) IID vectors \\(z_i\\). We now only need to show that \\(\\mathbb{E}\\left[z^T A z \\right] = \\mathrm{tr}(A)\\). We omit the index due to all vectors being IID: \\[\\begin{align} \\mathrm{tr}(A) &= \\mathrm{tr}(AI) \\\\ &= \\mathrm{tr}(A\\mathbb{E}[zz^T]) \\\\ &= \\mathbb{E}[\\mathrm{tr}(Azz^T)] \\\\ &= \\mathbb{E}[\\mathrm{tr}(z^TAz)] \\\\ &= \\mathbb{E}[z^TAz]. \\end{align}\\] This concludes the proof. We clarify some equalities below. The second equality assumes that \\(\\mathbb{E}[zz^T] = I\\). By noting that the mean of the Rademacher distribution is 0, we have \\[\\begin{align} \\mathrm{Cov}[z, z] &= \\mathbb{E}[(z - \\mathbb{E}[z])(z - \\mathbb{E}[z])^T] \\\\ &= \\mathbb{E}[zz^T]. \\end{align}\\] Dimensions of \\(z\\) are independent, so \\(\\mathrm{Cov}[z, z]_{ij} = 0\\) for \\(i \\neq j\\). 
The diagonal will contain variances, which are equal to \\(1\\) for all dimensions \\(k = 1 \\dots m\\): \\(\\mathrm{Var}[z^{(k)}] = \\mathbb{E}[z^{(k)}z^{(k)}] - \\mathbb{E}[z^{(k)}]^2 = 1 - 0 = 1\\). It follows that the covariance is an identity matrix. Note that this is a general result for vectors with IID dimensions sampled from a distribution with mean 0 and variance 1. We could therefore use something else instead of the Rademacher, e.g. \\(z ~ N(0, I)\\). The third equality uses the fact that the expectation of a trace equals the trace of an expectation. If \\(X\\) is a random matrix, then \\(\\mathbb{E}[X]_{ij} = \\mathbb{E}[X_{ij}]\\). Therefore: \\[\\begin{align} \\mathrm{tr}(\\mathbb{E}[X]) &= \\Sigma_{i=1}^m(\\mathbb{E}[X]_{ii}) \\\\ &= \\Sigma_{i=1}^m(\\mathbb{E}[X_{ii}]) \\\\ &= \\mathbb{E}[\\Sigma_{i=1}^m(X_{ii})] \\\\ &= \\mathbb{E}[\\mathrm{tr}(X)], \\end{align}\\] where we used the linearity of the expectation in the third step. The fourth equality uses the fact that \\(\\mathrm{tr}(AB) = \\mathrm{tr}(BA)\\) for any matrices \\(A \\in \\mathbb{R}^{n \\times m}, B \\in \\mathbb{R}^{m \\times n}\\). The last inequality uses the fact that the trace of a \\(1 \\times 1\\) matrix is just its element. "],["boot.html", "Chapter 14 Bootstrap", " Chapter 14 Bootstrap This chapter deals with bootstrap. The students are expected to acquire the following knowledge: How to use bootstrap to generate coverage intervals. .fold-btn { float: right; margin: 5px 5px 0 0; } .fold { border: 1px solid black; min-height: 40px; } Exercise 14.1 Ideally, a \\(1-\\alpha\\) CI would have \\(1-\\alpha\\) coverage. That is, say a 95% CI should, in the long run, contain the true value of the parameter 95% of the time. In practice, it is impossible to assess the coverage of our CI method, because we rarely know the true parameter. In simulation, however, we can. Let’s assess the coverage of bootstrap percentile intervals. Pick a univariate distribution with readily available mean and one that you can easily sample from. Draw \\(n = 30\\) random samples from the chosen distribution and use the bootstrap (with large enough m) and percentile CI method to construct 95% CI. Repeat the process many times and count how many times the CI contains the true mean. That is, compute the actual coverage probability (don’t forget to include the standard error of the coverage probability!). What can you observe? Try one or two different distributions. What can you observe? Repeat (b) and (c) using BCa intervals (R package boot). How does the coverage compare to percentile intervals? As (d) but using intervals based on asymptotic normality (+/- 1.96 SE). How do results from (b), (d), and (e) change if we increase the sample size to n = 200? What about n = 5? library(boot) set.seed(0) nit <- 1000 # Repeat the process "many times" alpha <- 0.05 # CI parameter nboot <- 100 # m parameter for bootstrap ("large enough m") # f: change this to 200 or 5. nsample <- 30 # n = 30 random samples from the chosen distribution. Comment out BCa code if it breaks. 
covers <- matrix(nrow = nit, ncol = 3) covers_BCa <- matrix(nrow = nit, ncol = 3) covers_asymp_norm <- matrix(nrow = nit, ncol = 3) isin <- function (x, lower, upper) { (x > lower) & (x < upper) } for (j in 1:nit) { # Repeating many times # a: pick a univariate distribution - standard normal x1 <- rnorm(nsample) # c: one or two different distributions - beta and poisson x2 <- rbeta(nsample, 1, 2) x3 <- rpois(nsample, 5) X1 <- matrix(data = NA, nrow = nsample, ncol = nboot) X2 <- matrix(data = NA, nrow = nsample, ncol = nboot) X3 <- matrix(data = NA, nrow = nsample, ncol = nboot) for (i in 1:nboot) { X1[ ,i] <- sample(x1, nsample, replace = T) X2[ ,i] <- sample(x2, nsample, T) X3[ ,i] <- sample(x3, nsample, T) } X1_func <- apply(X1, 2, mean) X2_func <- apply(X2, 2, mean) X3_func <- apply(X3, 2, mean) X1_quant <- quantile(X1_func, probs = c(alpha / 2, 1 - alpha / 2)) X2_quant <- quantile(X2_func, probs = c(alpha / 2, 1 - alpha / 2)) X3_quant <- quantile(X3_func, probs = c(alpha / 2, 1 - alpha / 2)) covers[j,1] <- (0 > X1_quant[1]) & (0 < X1_quant[2]) covers[j,2] <- ((1 / 3) > X2_quant[1]) & ((1 / 3) < X2_quant[2]) covers[j,3] <- (5 > X3_quant[1]) & (5 < X3_quant[2]) mf <- function (x, i) return(mean(x[i])) bootX1 <- boot(x1, statistic = mf, R = nboot) bootX2 <- boot(x2, statistic = mf, R = nboot) bootX3 <- boot(x3, statistic = mf, R = nboot) X1_quant_BCa <- boot.ci(bootX1, type = "bca")$bca X2_quant_BCa <- boot.ci(bootX2, type = "bca")$bca X3_quant_BCa <- boot.ci(bootX3, type = "bca")$bca covers_BCa[j,1] <- (0 > X1_quant_BCa[4]) & (0 < X1_quant_BCa[5]) covers_BCa[j,2] <- ((1 / 3) > X2_quant_BCa[4]) & ((1 / 3) < X2_quant_BCa[5]) covers_BCa[j,3] <- (5 > X3_quant_BCa[4]) & (5 < X3_quant_BCa[5]) # e: estimate mean and standard error # sample mean: x1_bar <- mean(x1) x2_bar <- mean(x2) x3_bar <- mean(x3) # standard error (of the sample mean) estimate: sample standard deviation / sqrt(n) x1_bar_SE <- sd(x1) / sqrt(nsample) x2_bar_SE <- sd(x2) / sqrt(nsample) x3_bar_SE <- sd(x3) / sqrt(nsample) covers_asymp_norm[j,1] <- isin(0, x1_bar - 1.96 * x1_bar_SE, x1_bar + 1.96 * x1_bar_SE) covers_asymp_norm[j,2] <- isin(1/3, x2_bar - 1.96 * x2_bar_SE, x2_bar + 1.96 * x2_bar_SE) covers_asymp_norm[j,3] <- isin(5, x3_bar - 1.96 * x3_bar_SE, x3_bar + 1.96 * x3_bar_SE) } apply(covers, 2, mean) ## [1] 0.918 0.925 0.905 apply(covers, 2, sd) / sqrt(nit) ## [1] 0.008680516 0.008333333 0.009276910 apply(covers_BCa, 2, mean) ## [1] 0.927 0.944 0.927 apply(covers_BCa, 2, sd) / sqrt(nit) ## [1] 0.008230355 0.007274401 0.008230355 apply(covers_asymp_norm, 2, mean) ## [1] 0.939 0.937 0.930 apply(covers_asymp_norm, 2, sd) / sqrt(nit) ## [1] 0.007572076 0.007687008 0.008072494 Exercise 14.2 You are given a sample of independent observations from a process of interest: Index 1 2 3 4 5 6 7 8 X 7 2 4 6 4 5 9 10 Compute the plug-in estimate of mean and 95% symmetric CI based on asymptotic normality. Use the plug-in estimate of SE. Same as (a), but use the unbiased estimate of SE. Apply nonparametric bootstrap with 1000 bootstrap replications and estimate the 95% CI for the mean with percentile-based CI. 
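A note on the difference between (a) and (b), which the solution code below makes use of: the plug-in SE divides the sum of squares by \\(n\\) (the MLE of the standard deviation), while the unbiased version uses Bessel's correction (\\(n - 1\\)), so the two differ only by a constant factor: \\[\\begin{align} \\widehat{SE}_{\\text{plug-in}} = \\frac{1}{\\sqrt{n}} \\sqrt{\\frac{1}{n} \\sum_{i=1}^n (x_i - \\bar{x})^2} = \\sqrt{\\frac{n - 1}{n}} \\frac{s}{\\sqrt{n}} = \\sqrt{\\frac{n - 1}{n}} \\, \\widehat{SE}_{\\text{unbiased}}. \\end{align}\\]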
# a x <- c(7, 2, 4, 6, 4, 5, 9, 10) n <- length(x) mu <- mean(x) SE <- sqrt(mean((x - mu)^2)) / sqrt(n) SE ## [1] 0.8915839 z <- qnorm(1 - 0.05 / 2) c(mu - z * SE, mu + z * SE) ## [1] 4.127528 7.622472 # b SE <- sd(x) / sqrt(n) SE ## [1] 0.9531433 c(mu - z * SE, mu + z * SE) ## [1] 4.006873 7.743127 # c set.seed(0) m <- 1000 T_mean <- function(x) {mean(x)} est_boot <- array(NA, m) for (i in 1:m) { x_boot <- x[sample(1:n, n, rep = T)] est_boot[i] <- T_mean(x_boot) } quantile(est_boot, p = c(0.025, 0.975)) ## 2.5% 97.5% ## 4.250 7.625 Exercise 14.3 We are given a sample of 10 independent paired (bivariate) observations: Index 1 2 3 4 5 6 7 8 9 10 X 1.26 -0.33 1.33 1.27 0.41 -1.54 -0.93 -0.29 -0.01 2.40 Y 2.64 0.33 0.48 0.06 -0.88 -2.14 -2.21 0.95 0.83 1.45 Compute Pearson correlation between X and Y. Use the cor.test() from R to estimate a 95% CI for the estimate from (a). Apply nonparametric bootstrap with 1000 bootstrap replications and estimate the 95% CI for the Pearson correlation with percentile-based CI. Compare CI from (b) and (c). Are they similar? How would the bootstrap estimation of CI change if we were interested in Spearman or Kendall correlation instead? x <- c(1.26, -0.33, 1.33, 1.27, 0.41, -1.54, -0.93, -0.29, -0.01, 2.40) y <- c(2.64, 0.33, 0.48, 0.06, -0.88, -2.14, -2.21, 0.95, 0.83, 1.45) # a cor(x, y) ## [1] 0.6991247 # b res <- cor.test(x, y) res$conf.int[1:2] ## [1] 0.1241458 0.9226238 # c set.seed(0) m <- 1000 n <- length(x) T_cor <- function(x, y) {cor(x, y)} est_boot <- array(NA, m) for (i in 1:m) { idx <- sample(1:n, n, rep = T) # !!! important to use same indices to keep dependency between x and y est_boot[i] <- T_cor(x[idx], y[idx]) } quantile(est_boot, p = c(0.025, 0.975)) ## 2.5% 97.5% ## 0.2565537 0.9057664 # d # Yes, but the bootstrap CI is more narrow. # e # We just use the functions for Kendall/Spearman coefficients instead: T_kendall <- function(x, y) {cor(x, y, method = "kendall")} T_spearman <- function(x, y) {cor(x, y, method = "spearman")} # Put this in a function that returns the CI bootstrap_95_ci <- function(x, y, t, m = 1000) { n <- length(x) est_boot <- array(NA, m) for (i in 1:m) { idx <- sample(1:n, n, rep = T) # !!! important to use same indices to keep dependency between x and y est_boot[i] <- t(x[idx], y[idx]) } quantile(est_boot, p = c(0.025, 0.975)) } bootstrap_95_ci(x, y, T_kendall) ## 2.5% 97.5% ## -0.08108108 0.78378378 bootstrap_95_ci(x, y, T_spearman) ## 2.5% 97.5% ## -0.1701115 0.8867925 Exercise 14.4 In this problem we will illustrate the use of the nonparametric bootstrap for estimating CIs of regression model coefficients. Load the longley dataset from base R with data(longley). Use lm() to apply linear regression using “Employed” as the target (dependent) variable and all other variables as the predictors (independent). Using lm() results, print the estimated regression coefficients and standard errors. Estimate 95% CI for the coefficients using +/- 1.96 * SE. Use nonparametric bootstrap with 100 replications to estimate the SE of the coefficients from (b). Compare the SE from (c) with those from (b). # a data(longley) # b res <- lm(Employed ~ . 
, longley) tmp <- data.frame(summary(res)$coefficients[,1:2]) tmp$LB <- tmp[,1] - 1.96 * tmp[,2] tmp$UB <- tmp[,1] + 1.96 * tmp[,2] tmp ## Estimate Std..Error LB UB ## (Intercept) -3.482259e+03 8.904204e+02 -5.227483e+03 -1.737035e+03 ## GNP.deflator 1.506187e-02 8.491493e-02 -1.513714e-01 1.814951e-01 ## GNP -3.581918e-02 3.349101e-02 -1.014616e-01 2.982320e-02 ## Unemployed -2.020230e-02 4.883997e-03 -2.977493e-02 -1.062966e-02 ## Armed.Forces -1.033227e-02 2.142742e-03 -1.453204e-02 -6.132495e-03 ## Population -5.110411e-02 2.260732e-01 -4.942076e-01 3.919994e-01 ## Year 1.829151e+00 4.554785e-01 9.364136e-01 2.721889e+00 # c set.seed(0) m <- 100 n <- nrow(longley) T_coef <- function(x) { lm(Employed ~ . , x)$coefficients } est_boot <- array(NA, c(m, ncol(longley))) for (i in 1:m) { idx <- sample(1:n, n, rep = T) est_boot[i,] <- T_coef(longley[idx,]) } SE <- apply(est_boot, 2, sd) SE ## [1] 1.826011e+03 1.605981e-01 5.693746e-02 8.204892e-03 3.802225e-03 ## [6] 3.907527e-01 9.414436e-01 # Show the standard errors around coefficients library(ggplot2) library(reshape2) df <- data.frame(index = 1:7, bootstrap_SE = SE, lm_SE = tmp$Std..Error) melted_df <- melt(df[2:nrow(df), ], id.vars = "index") # Ignore bias which has a really large magnitude ggplot(melted_df, aes(x = index, y = value, fill = variable)) + geom_bar(stat="identity", position="dodge") + xlab("Coefficient") + ylab("Standard error") # + scale_y_continuous(trans = "log") # If you want to also plot bias Exercise 14.5 This exercise shows a shortcoming of the bootstrap method when using the plug in estimator for the maximum. Compute the 95% bootstrap CI for the maximum of a standard normal distribution. Compute the 95% bootstrap CI for the maximum of a binomial distribution with n = 15 and p = 0.2. Repeat (b) using p = 0.9. Why is the result different? # bootstrap CI for maximum alpha <- 0.05 T_max <- function(x) {max(x)} # Equal to T_max = max bootstrap <- function(x, t, m = 1000) { n <- length(x) values <- rep(0, m) for (i in 1:m) { values[i] <- t(sample(x, n, replace = T)) } quantile(values, probs = c(alpha / 2, 1 - alpha / 2)) } # a # Meaningless, as the normal distribution can yield arbitrarily large values. x <- rnorm(100) bootstrap(x, T_max) ## 2.5% 97.5% ## 1.819425 2.961743 # b x <- rbinom(100, size = 15, prob = 0.2) # min = 0, max = 15 bootstrap(x, T_max) ## 2.5% 97.5% ## 6 7 # c x <- rbinom(100, size = 15, prob = 0.9) # min = 0, max = 15 bootstrap(x, T_max) ## 2.5% 97.5% ## 15 15 # Observation: to estimate the maximum, we need sufficient probability mass near the maximum value the distribution can yield. # Using bootstrap is pointless when there is too little mass near the true maximum. # In general, bootstrap will fail when estimating the CI for the maximum. "],["ml.html", "Chapter 15 Maximum likelihood 15.1 Deriving MLE 15.2 Fisher information 15.3 The German tank problem", " Chapter 15 Maximum likelihood This chapter deals with maximum likelihood estimation. The students are expected to acquire the following knowledge: How to derive MLE. Applying MLE in R. Calculating and interpreting Fisher information. Practical use of MLE. .fold-btn { float: right; margin: 5px 5px 0 0; } .fold { border: 1px solid black; min-height: 40px; } 15.1 Deriving MLE Exercise 15.1 Derive the maximum likelihood estimator of variance for N\\((\\mu, \\sigma^2)\\). Compare with results from 13.3. What does that say about the MLE estimator? Solution. 
The mean is assumed constant, so we have the likelihood \\[\\begin{align} L(\\sigma^2; y) &= \\prod_{i=1}^n \\frac{1}{\\sqrt{2 \\pi \\sigma^2}} e^{-\\frac{(y_i - \\mu)^2}{2 \\sigma^2}} \\\\ &= \\frac{1}{\\sqrt{2 \\pi \\sigma^2}^n} e^{\\frac{-\\sum_{i=1}^n (y_i - \\mu)^2}{2 \\sigma^2}} \\end{align}\\] We need to find the maximum of this function. We first observe that we can replace \\(\\frac{-\\sum_{i=1}^n (y_i - \\mu)^2}{2}\\) with a constant \\(c\\), since none of the terms are dependent on \\(\\sigma^2\\). Additionally, the term \\(\\frac{1}{\\sqrt{2 \\pi}^n}\\) does not affect the calculation of the maximum. So now we have \\[\\begin{align} L(\\sigma^2; y) &= (\\sigma^2)^{-\\frac{n}{2}} e^{\\frac{c}{\\sigma^2}}. \\end{align}\\] Differentiating we get \\[\\begin{align} \\frac{d}{d \\sigma^2} L(\\sigma^2; y) &= (\\sigma^2)^{-\\frac{n}{2}} \\frac{d}{d \\sigma^2} e^{\\frac{c}{\\sigma^2}} + e^{\\frac{c}{\\sigma^2}} \\frac{d}{d \\sigma^2} (\\sigma^2)^{-\\frac{n}{2}} \\\\ &= - (\\sigma^2)^{-\\frac{n}{2}} e^{\\frac{c}{\\sigma^2}} \\frac{c}{(\\sigma^2)^2} - e^{\\frac{c}{\\sigma^2}} \\frac{n}{2} (\\sigma^2)^{-\\frac{n + 2}{2}} \\\\ &= - (\\sigma^2)^{-\\frac{n + 4}{2}} e^{\\frac{c}{\\sigma^2}} c - e^{\\frac{c}{\\sigma^2}} \\frac{n}{2} (\\sigma^2)^{-\\frac{n + 2}{2}} \\\\ &= - e^{\\frac{c}{\\sigma^2}} (\\sigma^2)^{-\\frac{n + 4}{2}} \\Big(c + \\frac{n}{2}\\sigma^2 \\Big). \\end{align}\\] To get the maximum, this has to equal to 0, so \\[\\begin{align} c + \\frac{n}{2}\\sigma^2 &= 0 \\\\ \\sigma^2 &= -\\frac{2c}{n} \\\\ \\sigma^2 &= \\frac{\\sum_{i=1}^n (Y_i - \\mu)^2}{n}. \\end{align}\\] The MLE estimator is biased. Exercise 15.2 (Multivariate normal distribution) Derive the maximum likelihood estimate for the mean and covariance matrix of the multivariate normal. Simulate \\(n = 40\\) samples from a bivariate normal distribution (choose non-trivial parameters, that is, mean \\(\\neq 0\\) and covariance \\(\\neq 0\\)). Compute the MLE for the sample. Overlay the data with an ellipse that is determined by the MLE and an ellipse that is determined by the chosen true parameters. Repeat b. several times and observe how the estimates (ellipses) vary around the true value. Hint: For the derivation of MLE, these identities will be helpful: \\(\\frac{\\partial b^T a}{\\partial a} = \\frac{\\partial a^T b}{\\partial a} = b\\), \\(\\frac{\\partial a^T A a}{\\partial a} = (A + A^T)a\\), \\(\\frac{\\partial \\text{tr}(BA)}{\\partial A} = B^T\\), \\(\\frac{\\partial \\ln |A|}{\\partial A} = (A^{-1})^T\\), \\(a^T A a = \\text{tr}(a^T A a) = \\text{tr}(a a^T A) = \\text{tr}(Aaa^T)\\). Solution. The log likelihood of the MVN distribution is \\[\\begin{align*} l(\\mu, \\Sigma ; x) &= -\\frac{1}{2}\\Big(\\sum_{i=1}^n k\\ln(2\\pi) + |\\Sigma| + (x_i - \\mu)^T \\Sigma^{-1} (x_i - \\mu)\\Big) \\\\ &= -\\frac{n}{2}\\ln|\\Sigma| + -\\frac{1}{2}\\Big(\\sum_{i=1}^n(x_i - \\mu)^T \\Sigma^{-1} (x_i - \\mu)\\Big) + c, \\end{align*}\\] where \\(c\\) is a constant with respect to \\(\\mu\\) and \\(\\Sigma\\). To find the MLE we first need to find partial derivatives. Let us start with \\(\\mu\\). \\[\\begin{align*} \\frac{\\partial}{\\partial \\mu}l(\\mu, \\Sigma ; x) &= \\frac{\\partial}{\\partial \\mu} -\\frac{1}{2}\\Big(\\sum_{i=1}^n x_i^T \\Sigma^{-1} x_i - x_i^T \\Sigma^{-1} \\mu - \\mu^T \\Sigma^{-1} x_i + \\mu^T \\Sigma^{-1} \\mu \\Big) \\\\ &= -\\frac{1}{2}\\Big(\\sum_{i=1}^n - \\Sigma^{-1} x_i - \\Sigma^{-1} x_i + 2 \\Sigma^{-1} \\mu \\Big) \\\\ &= -\\Sigma^{-1}\\Big(\\sum_{i=1}^n - x_i + \\mu \\Big). 
\\end{align*}\\] Equating above with zero, we get \\[\\begin{align*} \\sum_{i=1}^n - x_i + \\mu &= 0 \\\\ \\hat{\\mu} = \\frac{1}{n} \\sum_{i=1}^n x_i, \\end{align*}\\] which is the dimension-wise empirical mean. Now for the covariance matrix \\[\\begin{align*} \\frac{\\partial}{\\partial \\Sigma^{-1}}l(\\mu, \\Sigma ; x) &= \\frac{\\partial}{\\partial \\Sigma^{-1}} -\\frac{n}{2}\\ln|\\Sigma| + -\\frac{1}{2}\\Big(\\sum_{i=1}^n(x_i - \\mu)^T \\Sigma^{-1} (x_i - \\mu)\\Big) \\\\ &= \\frac{\\partial}{\\partial \\Sigma^{-1}} -\\frac{n}{2}\\ln|\\Sigma| + -\\frac{1}{2}\\Big(\\sum_{i=1}^n \\text{tr}((x_i - \\mu)^T \\Sigma^{-1} (x_i - \\mu))\\Big) \\\\ &= \\frac{\\partial}{\\partial \\Sigma^{-1}} -\\frac{n}{2}\\ln|\\Sigma| + -\\frac{1}{2}\\Big(\\sum_{i=1}^n \\text{tr}((\\Sigma^{-1} (x_i - \\mu) (x_i - \\mu)^T )\\Big) \\\\ &= \\frac{n}{2}\\Sigma + -\\frac{1}{2}\\Big(\\sum_{i=1}^n (x_i - \\mu) (x_i - \\mu)^T \\Big). \\end{align*}\\] Equating above with zero, we get \\[\\begin{align*} \\hat{\\Sigma} = \\frac{1}{n}\\sum_{i=1}^n (x_i - \\mu) (x_i - \\mu)^T. \\end{align*}\\] set.seed(1) n <- 40 mu <- c(1, -2) Sigma <- matrix(data = c(2, -1.6, -1.6, 1.8), ncol = 2) X <- mvrnorm(n = n, mu = mu, Sigma = Sigma) colnames(X) <- c("X1", "X2") X <- as.data.frame(X) # plot.new() tru_ellip <- ellipse(mu, Sigma, draw = FALSE) colnames(tru_ellip) <- c("X1", "X2") tru_ellip <- as.data.frame(tru_ellip) mu_est <- apply(X, 2, mean) tmp <- as.matrix(sweep(X, 2, mu_est)) Sigma_est <- (1 / n) * t(tmp) %*% tmp est_ellip <- ellipse(mu_est, Sigma_est, draw = FALSE) colnames(est_ellip) <- c("X1", "X2") est_ellip <- as.data.frame(est_ellip) ggplot(data = X, aes(x = X1, y = X2)) + geom_point() + geom_path(data = tru_ellip, aes(x = X1, y = X2, color = "truth")) + geom_path(data = est_ellip, aes(x = X1, y = X2, color = "estimated")) + labs(color = "type") Exercise 15.3 (Logistic regression) Logistic regression is a popular discriminative model when our target variable is binary (categorical with 2 values). One of the ways of looking at logistic regression is that it is linear regression but instead of using the linear term as the mean of a normal RV, we use it as the mean of a Bernoulli RV. Of course, the mean of a Bernoulli is bounded on \\([0,1]\\), so, to avoid non-sensical values, we squeeze the linear between 0 and 1 with the inverse logit function inv_logit\\((z) = 1 / (1 + e^{-z})\\). This leads to the following model: \\(y_i | \\beta, x_i \\sim \\text{Bernoulli}(\\text{inv_logit}(\\beta x_i))\\). Explicitly write the likelihood function of beta. Implement the likelihood function in R. Use black-box box-constraint optimization (for example, optim() with L-BFGS) to find the maximum likelihood estimate for beta for \\(x\\) and \\(y\\) defined below. Plot the estimated probability as a function of the independent variable. Compare with the truth. Let \\(y2\\) be a response defined below. Will logistic regression work well on this dataset? Why not? How can we still use the model, without changing it? inv_log <- function (z) { return (1 / (1 + exp(-z))) } set.seed(1) x <- rnorm(100) y <- rbinom(100, size = 1, prob = inv_log(1.2 * x)) y2 <- rbinom(100, size = 1, prob = inv_log(1.2 * x + 1.4 * x^2)) Solution. \\[\\begin{align*} l(\\beta; x, y) &= p(y | x, \\beta) \\\\ &= \\ln(\\prod_{i=1}^n \\text{inv_logit}(\\beta x_i)^{y_i} (1 - \\text{inv_logit}(\\beta x_i))^{1 - y_i}) \\\\ &= \\sum_{i=1}^n y_i \\ln(\\text{inv_logit}(\\beta x_i)) + (1 - y_i) \\ln(1 - \\text{inv_logit}(\\beta x_i)). 
\\end{align*}\\] set.seed(1) inv_log <- function (z) { return (1 / (1 + exp(-z))) } x <- rnorm(100) y <- x y <- rbinom(100, size = 1, prob = inv_log(1.2 * x)) l_logistic <- function (beta, X, y) { logl <- -sum(y * log(inv_log(as.vector(beta %*% X))) + (1 - y) * log((1 - inv_log(as.vector(beta %*% X))))) return(logl) } my_optim <- optim(par = 0.5, fn = l_logistic, method = "L-BFGS-B", lower = 0, upper = 10, X = x, y = y) my_optim$par ## [1] 1.166558 truth_p <- data.frame(x = x, prob = inv_log(1.2 * x), type = "truth") est_p <- data.frame(x = x, prob = inv_log(my_optim$par * x), type = "estimated") plot_df <- rbind(truth_p, est_p) ggplot(data = plot_df, aes(x = x, y = prob, color = type)) + geom_point(alpha = 0.3) y2 <- rbinom(2000, size = 1, prob = inv_log(1.2 * x + 1.4 * x^2)) X2 <- cbind(x, x^2) my_optim2 <- optim(par = c(0, 0), fn = l_logistic, method = "L-BFGS-B", lower = c(0, 0), upper = c(2, 2), X = t(X2), y = y2) my_optim2$par ## [1] 1.153656 1.257649 tmp <- sweep(data.frame(x = x, x2 = x^2), 2, my_optim2$par, FUN = "*") tmp <- tmp[ ,1] + tmp[ ,2] truth_p <- data.frame(x = x, prob = inv_log(1.2 * x + 1.4 * x^2), type = "truth") est_p <- data.frame(x = x, prob = inv_log(tmp), type = "estimated") plot_df <- rbind(truth_p, est_p) ggplot(data = plot_df, aes(x = x, y = prob, color = type)) + geom_point(alpha = 0.3) Exercise 15.4 (Linear regression) For the data generated below, do the following: Compute the least squares (MLE) estimate of coefficients beta using the matrix exact solution. Compute the MLE by minimizing the sum of squared residuals using black-box optimization (optim()). Compute the MLE by using the output built-in linear regression (lm() ). Compare (a-c and the true coefficients). Compute 95% CI on the beta coefficients using the output of built-in linear regression. Compute 95% CI on the beta coefficients by using (a or b) and the bootstrap with percentile method for CI. Compare with d. 
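For reference, the "matrix exact solution" in (a) — used verbatim in the solution code below as solve(t(X) %*% X) %*% t(X) %*% y — comes from setting the gradient of the residual sum of squares to zero: \\[\\begin{align} \\frac{\\partial}{\\partial \\beta} \\|y - X\\beta\\|^2 = -2 X^T (y - X\\beta) = 0 \\implies \\hat{\\beta} = (X^T X)^{-1} X^T y. \\end{align}\\]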
set.seed(1) n <- 100 x1 <- rnorm(n) x2 <- rnorm(n) x3 <- rnorm(n) X <- cbind(x1, x2, x3) beta <- c(0.2, 0.6, -1.2) y <- as.vector(t(beta %*% t(X))) + rnorm(n, sd = 0.2) set.seed(1) n <- 100 x1 <- rnorm(n) x2 <- rnorm(n) x3 <- rnorm(n) X <- cbind(x1, x2, x3) beta <- c(0.2, 0.6, -1.2) y <- as.vector(t(beta %*% t(X))) + rnorm(n, sd = 0.2) LS_fun <- function (beta, X, y) { return(sum((y - beta %*% t(X))^2)) } my_optim <- optim(par = c(0, 0, 0), fn = LS_fun, lower = -5, upper = 5, X = X, y = y, method = "L-BFGS-B") my_optim$par ## [1] 0.1898162 0.5885946 -1.1788264 df <- data.frame(y = y, x1 = x1, x2 = x2, x3 = x3) my_lm <- lm(y ~ x1 + x2 + x3 - 1, data = df) my_lm ## ## Call: ## lm(formula = y ~ x1 + x2 + x3 - 1, data = df) ## ## Coefficients: ## x1 x2 x3 ## 0.1898 0.5886 -1.1788 # matrix solution beta_hat <- solve(t(X) %*% X) %*% t(X) %*% y beta_hat ## [,1] ## x1 0.1898162 ## x2 0.5885946 ## x3 -1.1788264 out <- summary(my_lm) out$coefficients[ ,2] ## x1 x2 x3 ## 0.02209328 0.02087542 0.01934506 # bootstrap CI nboot <- 1000 beta_boot <- matrix(data = NA, ncol = length(beta), nrow = nboot) for (i in 1:nboot) { inds <- sample(1:n, n, replace = T) new_df <- df[inds, ] X_tmp <- as.matrix(new_df[ ,-1]) y_tmp <- new_df[ ,1] # print(nrow(new_df)) tmp_beta <- solve(t(X_tmp) %*% X_tmp) %*% t(X_tmp) %*% y_tmp beta_boot[i, ] <- tmp_beta } apply(beta_boot, 2, mean) ## [1] 0.1893281 0.5887068 -1.1800738 apply(beta_boot, 2, quantile, probs = c(0.025, 0.975)) ## [,1] [,2] [,3] ## 2.5% 0.1389441 0.5436911 -1.221560 ## 97.5% 0.2386295 0.6363102 -1.140416 out$coefficients[ ,2] ## x1 x2 x3 ## 0.02209328 0.02087542 0.01934506 Exercise 15.5 (Principal component analysis) Load the olympic data set from package ade4. The data show decathlon results for 33 men in 1988 Olympic Games. This data set serves as a great example of finding the latent structure in the data, as there are certain characteristics of the athletes that make them excel at different events. For example an explosive athlete will do particulary well in sprints and long jumps. Perform PCA (prcomp) on the data set and interpret the first 2 latent dimensions. Hint: Standardize the data first to get meaningful results. Use MLE to estimate the covariance of the standardized multivariate distribution. Decompose the estimated covariance matrix with the eigendecomposition. Compare the eigenvectors to the output of PCA. 
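Two things to keep in mind when comparing the outputs below. First, because the data are standardized, the MLE covariance estimate \\(\\frac{1}{n} X^T X\\) is (up to the \\(n\\) versus \\(n - 1\\) divisor) the sample correlation matrix. Second, eigenvectors are only determined up to sign, so individual columns of the eigendecomposition may be \\(-1\\) times the corresponding prcomp rotation column. A small check one can run after the solution code below (a sketch using the object names defined there):

```r
# columns should agree up to sign, hence the absolute values
all.equal(abs(unname(Sigma_dec$vectors)), abs(unname(my_pca$rotation)),
          check.attributes = FALSE)
```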
data(olympic) X <- olympic$tab X_scaled <- scale(X) my_pca <- prcomp(X_scaled) summary(my_pca) ## Importance of components: ## PC1 PC2 PC3 PC4 PC5 PC6 PC7 ## Standard deviation 1.8488 1.6144 0.97123 0.9370 0.74607 0.70088 0.65620 ## Proportion of Variance 0.3418 0.2606 0.09433 0.0878 0.05566 0.04912 0.04306 ## Cumulative Proportion 0.3418 0.6025 0.69679 0.7846 0.84026 0.88938 0.93244 ## PC8 PC9 PC10 ## Standard deviation 0.55389 0.51667 0.31915 ## Proportion of Variance 0.03068 0.02669 0.01019 ## Cumulative Proportion 0.96312 0.98981 1.00000 autoplot(my_pca, data = X, loadings = TRUE, loadings.colour = 'blue', loadings.label = TRUE, loadings.label.size = 3) Sigma_est <- (1 / nrow(X_scaled)) * t(X_scaled) %*% X_scaled Sigma_dec <- eigen(Sigma_est) Sigma_dec$vectors ## [,1] [,2] [,3] [,4] [,5] [,6] ## [1,] 0.4158823 0.1488081 -0.26747198 -0.08833244 -0.442314456 0.03071237 ## [2,] -0.3940515 -0.1520815 -0.16894945 -0.24424963 0.368913901 -0.09378242 ## [3,] -0.2691057 0.4835374 0.09853273 -0.10776276 -0.009754680 0.23002054 ## [4,] -0.2122818 0.0278985 -0.85498656 0.38794393 -0.001876311 0.07454380 ## [5,] 0.3558474 0.3521598 -0.18949642 0.08057457 0.146965351 -0.32692886 ## [6,] 0.4334816 0.0695682 -0.12616012 -0.38229029 -0.088802794 0.21049130 ## [7,] -0.1757923 0.5033347 0.04609969 0.02558404 0.019358607 0.61491241 ## [8,] -0.3840821 0.1495820 0.13687235 0.14396548 -0.716743474 -0.34776037 ## [9,] -0.1799436 0.3719570 -0.19232803 -0.60046566 0.095582043 -0.43744387 ## [10,] 0.1701426 0.4209653 0.22255233 0.48564231 0.339772188 -0.30032419 ## [,7] [,8] [,9] [,10] ## [1,] 0.2543985 0.663712826 -0.10839531 0.10948045 ## [2,] 0.7505343 0.141264141 0.04613910 0.05580431 ## [3,] -0.1106637 0.072505560 0.42247611 0.65073655 ## [4,] -0.1351242 -0.155435871 -0.10206505 0.11941181 ## [5,] 0.1413388 -0.146839303 0.65076229 -0.33681395 ## [6,] 0.2725296 -0.639003579 -0.20723854 0.25971800 ## [7,] 0.1439726 0.009400445 -0.16724055 -0.53450315 ## [8,] 0.2732665 -0.276873049 -0.01766443 -0.06589572 ## [9,] -0.3419099 0.058519366 -0.30619617 -0.13093187 ## [10,] 0.1868704 0.007310045 -0.45688227 0.24311846 my_pca$rotation ## PC1 PC2 PC3 PC4 PC5 PC6 ## 100 -0.4158823 0.1488081 0.26747198 -0.08833244 -0.442314456 0.03071237 ## long 0.3940515 -0.1520815 0.16894945 -0.24424963 0.368913901 -0.09378242 ## poid 0.2691057 0.4835374 -0.09853273 -0.10776276 -0.009754680 0.23002054 ## haut 0.2122818 0.0278985 0.85498656 0.38794393 -0.001876311 0.07454380 ## 400 -0.3558474 0.3521598 0.18949642 0.08057457 0.146965351 -0.32692886 ## 110 -0.4334816 0.0695682 0.12616012 -0.38229029 -0.088802794 0.21049130 ## disq 0.1757923 0.5033347 -0.04609969 0.02558404 0.019358607 0.61491241 ## perc 0.3840821 0.1495820 -0.13687235 0.14396548 -0.716743474 -0.34776037 ## jave 0.1799436 0.3719570 0.19232803 -0.60046566 0.095582043 -0.43744387 ## 1500 -0.1701426 0.4209653 -0.22255233 0.48564231 0.339772188 -0.30032419 ## PC7 PC8 PC9 PC10 ## 100 0.2543985 -0.663712826 0.10839531 -0.10948045 ## long 0.7505343 -0.141264141 -0.04613910 -0.05580431 ## poid -0.1106637 -0.072505560 -0.42247611 -0.65073655 ## haut -0.1351242 0.155435871 0.10206505 -0.11941181 ## 400 0.1413388 0.146839303 -0.65076229 0.33681395 ## 110 0.2725296 0.639003579 0.20723854 -0.25971800 ## disq 0.1439726 -0.009400445 0.16724055 0.53450315 ## perc 0.2732665 0.276873049 0.01766443 0.06589572 ## jave -0.3419099 -0.058519366 0.30619617 0.13093187 ## 1500 0.1868704 -0.007310045 0.45688227 -0.24311846 15.2 Fisher information Exercise 15.6 Let us assume a Poisson likelihood. 
Derive the MLE estimate of the mean. Derive the Fisher information. For the data below compute the MLE and construct confidence intervals. Use bootstrap to construct the CI for the mean. Compare with c) and discuss. x <- c(2, 5, 3, 1, 2, 1, 0, 3, 0, 2) Solution. The log likelihood of the Poisson is \\[\\begin{align*} l(\\lambda; x) = \\sum_{i=1}^n x_i \\ln \\lambda - n \\lambda - \\sum_{i=1}^n \\ln x_i! \\end{align*}\\] Taking the derivative and equating with 0 we get \\[\\begin{align*} \\frac{1}{\\hat{\\lambda}}\\sum_{i=1}^n x_i - n &= 0 \\\\ \\hat{\\lambda} &= \\frac{1}{n} \\sum_{i=1}^n x_i. \\end{align*}\\] Since \\(\\lambda\\) is the mean parameter, this was expected. For the Fischer information, we first need the second derivative, which is \\[\\begin{align*} - \\lambda^{-2} \\sum_{i=1}^n x_i. \\\\ \\end{align*}\\] Now taking the expectation of the negative of the above, we get \\[\\begin{align*} E[\\lambda^{-2} \\sum_{i=1}^n x_i] &= \\lambda^{-2} E[\\sum_{i=1}^n x_i] \\\\ &= \\lambda^{-2} n \\lambda \\\\ &= \\frac{n}{\\lambda}. \\end{align*}\\] set.seed(1) x <- c(2, 5, 3, 1, 2, 1, 0, 3, 0, 2) lambda_hat <- mean(x) finfo <- length(x) / lambda_hat mle_CI <- c(lambda_hat - 1.96 * sqrt(1 / finfo), lambda_hat + 1.96 * sqrt(1 / finfo)) boot_lambda <- c() nboot <- 1000 for (i in 1:nboot) { tmp_x <- sample(x, length(x), replace = T) boot_lambda[i] <- mean(tmp_x) } boot_CI <- c(quantile(boot_lambda, 0.025), quantile(boot_lambda, 0.975)) mle_CI ## [1] 1.045656 2.754344 boot_CI ## 2.5% 97.5% ## 1.0 2.7 Exercise 15.7 Find the Fisher information matrix for the Gamma distribution. Generate 20 samples from a Gamma distribution and plot a confidence ellipse of the inverse of Fisher information matrix around the ML estimates of the parameters. Also plot the theoretical values. Repeat the sampling several times. What do you observe? Discuss what a non-diagonal Fisher matrix implies. Hint: The digamma function is defined as \\(\\psi(x) = \\frac{\\frac{d}{dx} \\Gamma(x)}{\\Gamma(x)}\\). Additionally, you do not need to evaluate \\(\\frac{d}{dx} \\psi(x)\\). To calculate its value in R, use package numDeriv. Solution. The log likelihood of the Gamma is \\[\\begin{equation*} l(\\alpha, \\beta; x) = n \\alpha \\ln \\beta - n \\ln \\Gamma(\\alpha) + (\\alpha - 1) \\sum_{i=1}^n \\ln x_i - \\beta \\sum_{i=1}^n x_i. \\end{equation*}\\] Let us calculate the derivatives. \\[\\begin{align*} \\frac{\\partial}{\\partial \\alpha} l(\\alpha, \\beta; x) &= n \\ln \\beta - n \\psi(\\alpha) + \\sum_{i=1}^n \\ln x_i, \\\\ \\frac{\\partial}{\\partial \\beta} l(\\alpha, \\beta; x) &= \\frac{n \\alpha}{\\beta} - \\sum_{i=1}^n x_i, \\\\ \\frac{\\partial^2}{\\partial \\alpha \\beta} l(\\alpha, \\beta; x) &= \\frac{n}{\\beta}, \\\\ \\frac{\\partial^2}{\\partial \\alpha^2} l(\\alpha, \\beta; x) &= - n \\frac{\\partial}{\\partial \\alpha} \\psi(\\alpha), \\\\ \\frac{\\partial^2}{\\partial \\beta^2} l(\\alpha, \\beta; x) &= - \\frac{n \\alpha}{\\beta^2}. \\end{align*}\\] The Fisher information matrix is then \\[\\begin{align*} I(\\alpha, \\beta) = - E[ \\begin{bmatrix} - n \\psi'(\\alpha) & \\frac{n}{\\beta} \\\\ \\frac{n}{\\beta} & - \\frac{n \\alpha}{\\beta^2} \\end{bmatrix} ] = \\begin{bmatrix} n \\psi'(\\alpha) & - \\frac{n}{\\beta} \\\\ - \\frac{n}{\\beta} & \\frac{n \\alpha}{\\beta^2} \\end{bmatrix} \\end{align*}\\] A non-diagonal Fisher matrix implies that the parameter estimates are linearly dependent. 
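A more precise way to read this: the inverse Fisher information approximates the covariance matrix of the ML estimates, \\[\\begin{align} (\\hat{\\alpha}, \\hat{\\beta}) \\overset{\\text{approx.}}{\\sim} \\text{N}\\big((\\alpha, \\beta), I(\\alpha, \\beta)^{-1}\\big), \\end{align}\\] so non-zero off-diagonal entries mean that the estimation errors of \\(\\hat{\\alpha}\\) and \\(\\hat{\\beta}\\) are correlated. This is why the confidence ellipse drawn by the code below, which uses the inverse of the estimated Fisher information as its covariance, is tilted rather than axis-aligned.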
set.seed(1) n <- 20 pars_theor <- c(5, 2) x <- rgamma(n, 5, 2) # MLE for alpha and beta log_lik <- function (pars, x) { n <- length(x) return (- (n * pars[1] * log(pars[2]) - n * log(gamma(pars[1])) + (pars[1] - 1) * sum(log(x)) - pars[2] * sum(x))) } my_optim <- optim(par = c(1,1), fn = log_lik, method = "L-BFGS-B", lower = c(0.001, 0.001), upper = c(8, 8), x = x) pars_mle <- my_optim$par fish_mat <- matrix(data = NA, nrow = 2, ncol = 2) fish_mat[1,2] <- - n / pars_mle[2] fish_mat[2,1] <- - n / pars_mle[2] fish_mat[2,2] <- (n * pars_mle[1]) / (pars_mle[2]^2) fish_mat[1,1] <- n * grad(digamma, pars_mle[1]) fish_mat_inv <- solve(fish_mat) est_ellip <- ellipse(pars_mle, fish_mat_inv, draw = FALSE) colnames(est_ellip) <- c("X1", "X2") est_ellip <- as.data.frame(est_ellip) ggplot() + geom_point(data = data.frame(x = pars_mle[1], y = pars_mle[2]), aes(x = x, y = y)) + geom_path(data = est_ellip, aes(x = X1, y = X2)) + geom_point(aes(x = pars_theor[1], y = pars_theor[2]), color = "red") + geom_text(aes(x = pars_theor[1], y = pars_theor[2], label = "Theoretical parameters"), color = "red", nudge_y = -0.2) 15.3 The German tank problem Exercise 15.8 (The German tank problem) During WWII the Allied intelligence was faced with an important problem of estimating the total production of certain German tanks, such as the Panther. What turned out to be a successful approach was to estimate the maximum from the serial numbers of the small sample of captured or destroyed tanks (describe the statistical model used). What assumptions were made by using the above model? Do you think they are reasonable assumptions in practice? Show that the plug-in estimate for the maximum (i.e. the maximum of the sample) is a biased estimator. Derive the maximum likelihood estimate of the maximum. Check that the following estimator is not biased: \\(\\hat{n} = \\frac{k + 1}{k}m - 1\\). Solution. The data are the serial numbers of the tanks. The parameter is \\(n\\), the total production of the tank. The distribution of the serial numbers is a discrete uniform distribution over all serial numbers. One of the assumptions is that we have i.i.d. samples; in practice this might not be true, as tanks produced later could also be sent to the field later, so some serial numbers would be impossible to observe at the time of sampling. To find the expected value we first need to find the distribution of \\(m\\), the maximum of the \\(k\\) observed serial numbers. Let us start with the CDF. \\[\\begin{align*} F_m(x) = P(Y_1 \\leq x,...,Y_k \\leq x). \\end{align*}\\] If \\(x < k\\) then \\(F_m(x) = 0\\) and if \\(x \\geq n\\) then \\(F_m(x) = 1\\). What about the values in between? The probability that the maximum is smaller than or equal to \\(x\\) is the number of possible draws from \\(Y\\) that are all smaller than or equal to \\(x\\), divided by the number of all possible draws. This is \\(\\frac{\\binom{x}{k}}{\\binom{n}{k}}\\). The PDF on the suitable bounds is then \\[\\begin{align*} P(m = x) = F_m(x) - F_m(x - 1) = \\frac{\\binom{x}{k} - \\binom{x - 1}{k}}{\\binom{n}{k}} = \\frac{\\binom{x - 1}{k - 1}}{\\binom{n}{k}}. \\end{align*}\\] Now we can calculate the expected value of \\(m\\) using some combinatorial identities. \\[\\begin{align*} E[m] &= \\sum_{i = k}^n i \\frac{\\binom{i - 1}{k - 1}}{\\binom{n}{k}} \\\\ &= \\sum_{i = k}^n i \\frac{\\frac{(i - 1)!}{(k - 1)!(i - k)!}}{\\binom{n}{k}} \\\\ &= \\frac{k}{\\binom{n}{k}}\\sum_{i = k}^n \\binom{i}{k} \\\\ &= \\frac{k}{\\binom{n}{k}} \\binom{n + 1}{k + 1} \\\\ &= \\frac{k(n + 1)}{k + 1}.
\\end{align*}\\] The bias of this estimator is then \\[\\begin{align*} E[m] - n = \\frac{k(n + 1)}{k + 1} - n = \\frac{k - n}{k + 1}. \\end{align*}\\] The probability that we observed our sample \\(Y = {Y_1, Y_2,...,,Y_k}\\) given \\(n\\) is \\(\\frac{1}{{n}\\choose{k}}\\). We need to find such \\(n^*\\) that this function is maximized. Additionally, we have a constraint that \\(n^* \\geq m = \\max{(Y)}\\). Let us plot this function for \\(m = 10\\) and \\(k = 4\\). library(ggplot2) my_fun <- function (x, m, k) { tmp <- 1 / (choose(x, k)) tmp[x < m] <- 0 return (tmp) } x <- 1:20 y <- my_fun(x, 10, 4) df <- data.frame(x = x, y = y) ggplot(data = df, aes(x = x, y = y)) + geom_line() ::: {.solution} (continued) We observe that the maximum of this function lies at the maximum value of the sample. Therefore \\(n^* = m\\) and ML estimate equals the plug-in estimate. \\[\\begin{align*} E[\\hat{n}] &= \\frac{k + 1}{k} E[m] - 1 \\\\ &= \\frac{k + 1}{k} \\frac{k(n + 1)}{k + 1} - 1 \\\\ &= n. \\end{align*}\\] ::: "],["nhst.html", "Chapter 16 Null hypothesis significance testing", " Chapter 16 Null hypothesis significance testing This chapter deals with null hypothesis significance testing. The students are expected to acquire the following knowledge: Binomial test. t-test. Chi-squared test. .fold-btn { float: right; margin: 5px 5px 0 0; } .fold { border: 1px solid black; min-height: 40px; } Exercise 16.1 (Binomial test) We assume \\(y_i \\in \\{0,1\\}\\), \\(i = 1,...,n\\) and \\(y_i | \\theta = 0.5 \\sim i.i.d.\\) Bernoulli\\((\\theta)\\). The test statistic is \\(X = \\sum_{i=1}^n\\) and the rejection region R is defined as the region where the probability of obtaining such or more extreme \\(X\\) given \\(\\theta = 0.5\\) is less than 0.05. Derive and plot the power function of the test for \\(n=100\\). What is the significance level of this test if \\(H0: \\theta = 0.5\\)? At which values of X will we reject the null hypothesis? # a # First we need the rejection region, so we need to find X_min and X_max n <- 100 qbinom(0.025, n, 0.5) ## [1] 40 qbinom(0.975, n, 0.5) ## [1] 60 pbinom(40, n, 0.5) ## [1] 0.02844397 pbinom(60, n, 0.5) ## [1] 0.9823999 X_min <- 39 X_max <- 60 thetas <- seq(0, 1, by = 0.01) beta_t <- 1 - pbinom(X_max, size = n, prob = thetas) + pbinom(X_min, size = n, prob = thetas) plot(beta_t) # b # The significance level is beta_t[51] ## [1] 0.0352002 # We will reject the null hypothesis at X values below X_min and above X_max. Exercise 16.2 (Long-run guarantees of the t-test) Generate a sample of size \\(n = 10\\) from the standard normal. Use the two-sided t-test with \\(H0: \\mu = 0\\) and record the p-value. Can you reject H0 at 0.05 significance level? (before simulating) If we repeated (b) many times, what would be the relative frequency of false positives/Type I errors (rejecting the null that is true)? What would be the relative frequency of false negatives /Type II errors (retaining the null when the null is false)? (now simulate b and check if the simulation results match your answer in b) Similar to (a-c) but now we generate data from N(-0.5, 1). Similar to (a-c) but now we generate data from N(\\(\\mu\\), 1) where we every time pick a different \\(\\mu < 0\\) and use a one-sided test \\(H0: \\mu <= 0\\). 
set.seed(2) # a x <- rnorm(10) my_test <- t.test(x, alternative = "two.sided", mu = 0) my_test ## ## One Sample t-test ## ## data: x ## t = 0.6779, df = 9, p-value = 0.5149 ## alternative hypothesis: true mean is not equal to 0 ## 95 percent confidence interval: ## -0.4934661 0.9157694 ## sample estimates: ## mean of x ## 0.2111516 # we can not reject the null hypothesis # b # The expected value of false positives would be 0.05. The expected value of # true negatives would be 0, as there are no negatives (the null hypothesis is # always the truth). nit <- 1000 typeIerr <- vector(mode = "logical", length = nit) typeIIerr <- vector(mode = "logical", length = nit) for (i in 1:nit) { x <- rnorm(10) my_test <- t.test(x, alternative = "two.sided", mu = 0) if (my_test$p.value < 0.05) { typeIerr[i] <- T } else { typeIerr[i] <- F } } mean(typeIerr) ## [1] 0.052 sd(typeIerr) / sqrt(nit) ## [1] 0.007024624 # d # We can not estimate the percentage of true negatives, but it will probably be # higher than 0.05. There will be no false positives as the null hypothesis is # always false. typeIIerr <- vector(mode = "logical", length = nit) for (i in 1:nit) { x <- rnorm(10, -0.5) my_test <- t.test(x, alternative = "two.sided", mu = 0) if (my_test$p.value < 0.05) { typeIIerr[i] <- F } else { typeIIerr[i] <- T } } mean(typeIIerr) ## [1] 0.719 sd(typeIIerr) / sqrt(nit) ## [1] 0.01422115 # e # The expected value of false positives would be lower than 0.05. The expected # value of true negatives would be 0, as there are no negatives (the null # hypothesis is always the truth). typeIerr <- vector(mode = "logical", length = nit) for (i in 1:nit) { u <- runif(1, -1, 0) x <- rnorm(10, u) my_test <- t.test(x, alternative = "greater", mu = 0) if (my_test$p.value < 0.05) { typeIerr[i] <- T } else { typeIerr[i] <- F } } mean(typeIerr) ## [1] 0.012 sd(typeIerr) / sqrt(nit) ## [1] 0.003444977 Exercise 16.3 (T-test, confidence intervals, and bootstrap) Sample \\(n=20\\) from a standard normal distribution and calculate the p-value using t-test, confidence intervals based on normal distribution, and bootstrap. Repeat this several times and check how many times we rejected the null hypothesis (made a type I error). Hint: For the confidence intervals you can use function CI from the Rmisc package. set.seed(1) library(Rmisc) nit <- 1000 n_boot <- 100 t_logic <- rep(F, nit) boot_logic <- rep(F, nit) norm_logic <- rep(F, nit) for (i in 1:nit) { x <- rnorm(20) my_test <- t.test(x) my_CI <- CI(x) if (my_test$p.value <= 0.05) t_logic[i] <- T boot_tmp <- vector(mode = "numeric", length = n_boot) for (j in 1:n_boot) { tmp_samp <- sample(x, size = 20, replace = T) boot_tmp[j] <- mean(tmp_samp) } if ((quantile(boot_tmp, 0.025) >= 0) | (quantile(boot_tmp, 0.975) <= 0)) { boot_logic[i] <- T } if ((my_CI[3] >= 0) | (my_CI[1] <= 0)) { norm_logic[i] <- T } } mean(t_logic) ## [1] 0.053 sd(t_logic) / sqrt(nit) ## [1] 0.007088106 mean(boot_logic) ## [1] 0.093 sd(boot_logic) / sqrt(nit) ## [1] 0.009188876 mean(norm_logic) ## [1] 0.053 sd(norm_logic) / sqrt(nit) ## [1] 0.007088106 Exercise 16.4 (Chi-squared test) Show that the \\(\\chi^2 = \\sum_{i=1}^k \\frac{(O_i - E_i)^2}{E_i}\\) test statistic is approximately \\(\\chi^2\\) distributed when we have two categories. Let us look at the US voting data here. Compare the number of voters who voted for Trump or Hillary depending on their income (less or more than 100.000 dollars per year). Manually calculate the chi-squared statistic, compare to the chisq.test in R, and discuss the results. 
Visualize the test. Solution. Let \\(X_i\\) be binary variables, \\(i = 1,...,n\\). We can then express the test statistic as \\[\\begin{align} \\chi^2 = &\\frac{(O_i - np)^2}{np} + \\frac{(n - O_i - n(1 - p))^2}{n(1 - p)} \\\\ &= \\frac{(O_i - np)^2}{np(1 - p)} \\\\ &= (\\frac{O_i - np}{\\sqrt{np(1 - p)}})^2. \\end{align}\\] When \\(n\\) is large, this distrbution is approximately normal with \\(\\mu = np\\) and \\(\\sigma^2 = np(1 - p)\\) (binomial converges in distribution to standard normal). By definition, the chi-squared distribution with \\(k\\) degrees of freedom is a sum of squares of \\(k\\) independent standard normal random variables. n <- 24588 less100 <- round(0.66 * n * c(0.49, 0.45, 0.06)) # some rounding, but it should not affect results more100 <- round(0.34 * n * c(0.47, 0.47, 0.06)) x <- rbind(less100, more100) colnames(x) <- c("Clinton", "Trump", "other/no answer") print(x) ## Clinton Trump other/no answer ## less100 7952 7303 974 ## more100 3929 3929 502 chisq.test(x) ## ## Pearson's Chi-squared test ## ## data: x ## X-squared = 9.3945, df = 2, p-value = 0.00912 x ## Clinton Trump other/no answer ## less100 7952 7303 974 ## more100 3929 3929 502 csum <- apply(x, 2, sum) rsum <- apply(x, 1, sum) chi2 <- (x[1,1] - csum[1] * rsum[1] / sum(x))^2 / (csum[1] * rsum[1] / sum(x)) + (x[1,2] - csum[2] * rsum[1] / sum(x))^2 / (csum[2] * rsum[1] / sum(x)) + (x[1,3] - csum[3] * rsum[1] / sum(x))^2 / (csum[3] * rsum[1] / sum(x)) + (x[2,1] - csum[1] * rsum[2] / sum(x))^2 / (csum[1] * rsum[2] / sum(x)) + (x[2,2] - csum[2] * rsum[2] / sum(x))^2 / (csum[2] * rsum[2] / sum(x)) + (x[2,3] - csum[3] * rsum[2] / sum(x))^2 / (csum[3] * rsum[2] / sum(x)) chi2 ## Clinton ## 9.394536 1 - pchisq(chi2, df = 2) ## Clinton ## 0.009120161 x <- seq(0, 15, by = 0.01) df <- data.frame(x = x) ggplot(data = df, aes(x = x)) + stat_function(fun = dchisq, args = list(df = 2)) + geom_segment(aes(x = chi2, y = 0, xend = chi2, yend = dchisq(chi2, df = 2))) + stat_function(fun = dchisq, args = list(df = 2), xlim = c(chi2, 15), geom = "area", fill = "red") "],["bi.html", "Chapter 17 Bayesian inference 17.1 Conjugate priors 17.2 Posterior sampling", " Chapter 17 Bayesian inference This chapter deals with Bayesian inference. The students are expected to acquire the following knowledge: How to set prior distribution. Compute posterior distribution. Compute posterior predictive distribution. Use sampling for inference. .fold-btn { float: right; margin: 5px 5px 0 0; } .fold { border: 1px solid black; min-height: 40px; } 17.1 Conjugate priors Exercise 17.1 (Poisson-gamma model) Let us assume a Poisson likelihood and a gamma prior on the Poisson mean parameter (this is a conjugate prior). Derive posterior Below we have some data, which represents number of goals in a football match. Choose sensible prior for this data (draw the gamma density if necessary), justify it. Compute the posterior. Compute an interval such that the probability that the true mean is in there is 95%. What is the probability that the true mean is greater than 2.5? Back to theory: Compute prior predictive and posterior predictive. Discuss why the posterior predictive is overdispersed and not Poisson? Draw a histogram of the prior predictive and posterior predictive for the data from (b). Discuss. Generate 10 and 100 random samples from a Poisson distribution and compare the posteriors with a flat prior, and a prior concentrated away from the truth. x <- c(3, 2, 1, 1, 5, 4, 0, 0, 4, 3) Solution. 
\\[\\begin{align*} p(\\lambda | X) &= \\frac{p(X | \\lambda) p(\\lambda)}{\\int_0^\\infty p(X | \\lambda) p(\\lambda) d\\lambda} \\\\ &\\propto p(X | \\lambda) p(\\lambda) \\\\ &= \\Big(\\prod_{i=1}^n \\frac{1}{x_i!} \\lambda^{x_i} e^{-\\lambda}\\Big) \\frac{\\beta^\\alpha}{\\Gamma(\\alpha)} \\lambda^{\\alpha - 1} e^{-\\beta \\lambda} \\\\ &\\propto \\lambda^{\\sum_{i=1}^n x_i + \\alpha - 1} e^{- \\lambda (n + \\beta)} \\\\ \\end{align*}\\] We recognize this as the shape of a gamma distribution, therefore \\[\\begin{align*} \\lambda | X \\sim \\text{gamma}(\\alpha + \\sum_{i=1}^n x_i, \\beta + n) \\end{align*}\\] For the prior predictive, we have \\[\\begin{align*} p(x^*) &= \\int_0^\\infty p(x^*, \\lambda) d\\lambda \\\\ &= \\int_0^\\infty p(x^* | \\lambda) p(\\lambda) d\\lambda \\\\ &= \\int_0^\\infty \\frac{1}{x^*!} \\lambda^{x^*} e^{-\\lambda} \\frac{\\beta^\\alpha}{\\Gamma(\\alpha)} \\lambda^{\\alpha - 1} e^{-\\beta \\lambda} d\\lambda \\\\ &= \\frac{\\beta^\\alpha}{\\Gamma(x^* + 1)\\Gamma(\\alpha)} \\int_0^\\infty \\lambda^{x^* + \\alpha - 1} e^{-\\lambda (1 + \\beta)} d\\lambda \\\\ &= \\frac{\\beta^\\alpha}{\\Gamma(x^* + 1)\\Gamma(\\alpha)} \\frac{\\Gamma(x^* + \\alpha)}{(1 + \\beta)^{x^* + \\alpha}} \\int_0^\\infty \\frac{(1 + \\beta)^{x^* + \\alpha}}{\\Gamma(x^* + \\alpha)} \\lambda^{x^* + \\alpha - 1} e^{-\\lambda (1 + \\beta)} d\\lambda \\\\ &= \\frac{\\beta^\\alpha}{\\Gamma(x^* + 1)\\Gamma(\\alpha)} \\frac{\\Gamma(x^* + \\alpha)}{(1 + \\beta)^{x^* + \\alpha}} \\\\ &= \\frac{\\Gamma(x^* + \\alpha)}{\\Gamma(x^* + 1)\\Gamma(\\alpha)} (\\frac{\\beta}{1 + \\beta})^\\alpha (\\frac{1}{1 + \\beta})^{x^*}, \\end{align*}\\] which we recognize as the negative binomial distribution with \\(r = \\alpha\\) and \\(p = \\frac{1}{\\beta + 1}\\). For the posterior predictive, the calculation is the same, only now the parameters are \\(r = \\alpha + \\sum_{i=1}^n x_i\\) and \\(p = \\frac{1}{\\beta + n + 1}\\). There are two sources of uncertainty in the predictive distribution. First is the uncertainty about the population. Second is the variability in sampling from the population. When \\(n\\) is large, the latter is going to be very small. But when \\(n\\) is small, the latter is going to be higher, resulting in an overdispersed predictive distribution. 
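The overdispersion argument can also be checked by simulation; a minimal sketch, assuming the flat gamma(1, 1) prior that is also used in the code below (the seed and number of draws are arbitrary choices for illustration): draw lambda from the posterior, draw a new observation from Poisson(lambda), and compare with the negative binomial form derived above.

```r
# Monte Carlo check of the posterior predictive (assumes the gamma(1, 1) prior used below).
set.seed(42)
x     <- c(3, 2, 1, 1, 5, 4, 0, 0, 4, 3)
alpha <- 1
beta  <- 1
n     <- length(x)
lambda_draws <- rgamma(1e5, alpha + sum(x), beta + n)   # posterior draws of lambda
pred_sim     <- rpois(1e5, lambda_draws)                # integrate out lambda by simulation
pred_nb      <- rnbinom(1e5, size = alpha + sum(x), prob = 1 - 1 / (beta + n + 1))
c(mean(pred_sim), mean(pred_nb))  # the two means agree
c(var(pred_sim),  var(pred_nb))   # variances agree and exceed the mean (overdispersion)
```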
x <- c(3, 2, 1, 1, 5, 4, 0, 0, 4, 3) # b # quick visual check of the prior ggplot(data = data.frame(x = seq(0, 5, by = 0.01)), aes(x = x)) + stat_function(fun = dgamma, args = list(shape = 1, rate = 1)) palpha <- 1 pbeta <- 1 alpha_post <- palpha + sum(x) beta_post <- pbeta + length(x) ggplot(data = data.frame(x = seq(0, 5, by = 0.01)), aes(x = x)) + stat_function(fun = dgamma, args = list(shape = alpha_post, rate = beta_post)) # probability of being higher than 2.5 1 - pgamma(2.5, alpha_post, beta_post) ## [1] 0.2267148 # interval qgamma(c(0.025, 0.975), alpha_post, beta_post) ## [1] 1.397932 3.137390 # d prior_pred <- rnbinom(1000, size = palpha, prob = 1 - 1 / (pbeta + 1)) post_pred <- rnbinom(1000, size = palpha + sum(x), prob = 1 - 1 / (pbeta + 10 + 1)) df <- data.frame(prior = prior_pred, posterior = post_pred) df <- gather(df) ggplot(df, aes(x = value, fill = key)) + geom_histogram(position = "dodge") # e set.seed(1) x1 <- rpois(10, 2.5) x2 <- rpois(100, 2.5) alpha_flat <- 1 beta_flat <- 0.1 alpha_conc <- 50 beta_conc <- 10 n <- 10000 df_flat <- data.frame(x1 = rgamma(n, alpha_flat + sum(x1), beta_flat + 10), x2 = rgamma(n, alpha_flat + sum(x2), beta_flat + 100), type = "flat") df_flat <- tidyr::gather(df_flat, key = "key", value = "value", - type) df_conc <- data.frame(x1 = rgamma(n, alpha_conc + sum(x1), beta_conc + 10), x2 = rgamma(n, alpha_conc + sum(x2), beta_conc + 100), type = "conc") df_conc <- tidyr::gather(df_conc, key = "key", value = "value", - type) df <- rbind(df_flat, df_conc) ggplot(data = df, aes(x = value, color = type)) + facet_wrap(~ key) + geom_density() 17.2 Posterior sampling Exercise 17.2 (Bayesian logistic regression) In Chapter 15 we implemented a MLE for logistic regression (see the code below). For this model, conjugate priors do not exist, which complicates the calculation of the posterior. However, we can use sampling from the numerator of the posterior, using rejection sampling. Set a sensible prior distribution on \\(\\beta\\) and use rejection sampling to find the posterior distribution. In a) you will get a distribution of parameter \\(\\beta\\). Plot the probabilities (as in exercise 15.3) for each sample of \\(\\beta\\) and compare to the truth. Hint: We can use rejection sampling even for functions which are not PDFs – they do not have to sum/integrate to 1. We just need to use a suitable envelope that we know how to sample from. For example, here we could use a uniform distribution and scale it suitably. set.seed(1) inv_log <- function (z) { return (1 / (1 + exp(-z))) } x <- rnorm(100) y <- x y <- rbinom(100, size = 1, prob = inv_log(1.2 * x)) l_logistic <- function (beta, X, y) { logl <- -sum(y * log(inv_log(as.vector(beta %*% X))) + (1 - y) * log((1 - inv_log(as.vector(beta %*% X))))) return(logl) } my_optim <- optim(par = 0.5, fn = l_logistic, method = "L-BFGS-B", lower = 0, upper = 10, X = x, y = y) my_optim$par # Let's say we believe that the mean of beta is 0.5. Since we are not very sure # about this, we will give it a relatively high variance. So a normal prior with # mean 0.5 and standard deviation 5. But there is no right solution to this, # this is basically us expressing our prior belief in the parameter values. 
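For reference, the prior just described is very diffuse over the range of beta values we might plausibly see; a one-line sketch of its density:

```r
# A quick look at how flat the stated prior, N(0.5, 5^2), is over plausible beta values.
curve(dnorm(x, mean = 0.5, sd = 5), from = -5, to = 5,
      xlab = "beta", ylab = "prior density")
```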
set.seed(1) inv_log <- function (z) { return (1 / (1 + exp(-z))) } x <- rnorm(100) y <- x y <- rbinom(100, size = 1, prob = inv_log(1.2 * x)) l_logistic <- function (beta, X, y) { logl <- -sum(y * log(inv_log(as.vector(beta %*% X))) + (1 - y) * log((1 - inv_log(as.vector(beta %*% X))))) if (is.nan(logl)) logl <- Inf return(logl) } my_optim <- optim(par = 0.5, fn = l_logistic, method = "L-BFGS-B", lower = 0, upper = 10, X = x, y = y) my_optim$par ## [1] 1.166558 f_logistic <- function (beta, X, y) { logl <- prod(inv_log(as.vector(beta %*% X))^y * (1 - inv_log(as.vector(beta %*% X)))^(1 - y)) return(logl) } a <- seq(0, 3, by = 0.01) my_l <- c() for (i in a) { my_l <- c(my_l, f_logistic(i, x, y) * dnorm(i, 0.5, 5)) } plot(my_l) envlp <- 10^(-25.8) * dunif(a, -5, 5) # found by trial and error tmp <- data.frame(envel = envlp, l = my_l, t = a) tmp <- gather(tmp, key = "key", value = "value", - t) ggplot(tmp, aes(x = t, y = value, color = key)) + geom_line() # envelope OK set.seed(1) nsamps <- 1000 samps <- c() for (i in 1:nsamps) { tmp <- runif(1, -5, 5) u <- runif(1, 0, 1) if (u < (f_logistic(tmp, x, y) * dnorm(tmp, 0.5, 5)) / (10^(-25.8) * dunif(tmp, -5, 5))) { samps <- c(samps, tmp) } } plot(density(samps)) mean(samps) ## [1] 1.211578 median(samps) ## [1] 1.204279 truth_p <- data.frame(x = x, prob = inv_log(1.2 * x), type = "truth") preds <- inv_log(x %*% t(samps)) preds <- gather(cbind(as.data.frame(preds), x = x), key = "key", "value" = value, - x) ggplot(preds, aes(x = x, y = value)) + geom_line(aes(group = key), color = "gray", alpha = 0.7) + geom_point(data = truth_p, aes(y = prob), color = "red", alpha = 0.7) + theme_bw() "],["distributions-intutition.html", "Chapter 18 Distributions intutition 18.1 Discrete distributions 18.2 Continuous distributions", " Chapter 18 Distributions intutition This chapter is intended to help you familiarize yourself with the different probability distributions you will encounter in this course. Use Appendix B as a reference for the basic properties of distributions. .fold-btn { float: right; margin: 5px 5px 0 0; } .fold { border: 1px solid black; min-height: 40px; } 18.1 Discrete distributions Exercise 18.1 (Bernoulli intuition 1) Arguably the simplest distribution you will enocounter is the Bernoulli distribution. It is a discrete probability distribution used to represent the outcome of a yes/no question. It has one parameter \\(p\\) which is the probability of success. The probability of failure is \\(1-p\\), sometimes denoted as \\(q\\). A classic way to think about a Bernoulli trial (a yes/no experiment) is a coin flip. Real coins are fair, meaning the probability of either heads (1) or tails (0) are the same i.e. \\(p=0.5\\), shown below in figure a. Alternatively we may want to represent a process that doesn’t have equal probabilities of outcomes like “Will a throw of a fair die result in a 6?”. In this case \\(p=\\frac{1}{6}\\), shown in figure b. Using your knowledge of the Bernoulli distribution use the throw of a fair die to think of events, such that: \\(p = 0.5\\) \\(p = \\frac{5}{6}\\) \\(q = \\frac{2}{3}\\) Solution. An event that is equally likely to happen or not happen i.e. \\(p = 0.5\\) would be throwing an even number. More formally we can name this event \\(A\\) and write: \\(A = \\{2,4,6\\}\\), its probability being \\(P(A) = 0.5\\) An example of an event with \\(p = \\frac{5}{6}\\) would be throwing a number greater than 1. Defined as \\(B = \\{2,3,4,5,6\\}\\). We need an event that fails \\(\\frac{2}{3}\\) of the time. 
Alternatively we can reverse the problem and find an event that succeeds \\(\\frac{1}{3}\\) of the time, since: \\(q = 1 - p \\implies p = 1 - q = \\frac{1}{3}\\). The event that our outcome is divisible by 3: \\(C = \\{3, 6\\}\\) satisfies this condition. Exercise 18.2 (Binomial intuition 1) The binomial distribution is a generalization of the Bernoulli distribution. Instead of considering a single Bernoulli trial, we now consider a sequence of \\(n\\) trials, which are independent and have the same parameter \\(p\\). So the binomial distribution has two parameters \\(n\\) - the number of trials and \\(p\\) - the probability of success for each trial. If we return to our coin flip representation, we now flip a coin several times. The binomial distribution will give us the probabilities of all possible outcomes. Below we show the distribution for a series of 10 coin flips with a fair coin (left) and a biased coin (right). The numbers on the x axis represent the number of times the coin landed heads. Using your knowledge of the binomial distribution: Take the pmf of the binomial distribution and plug in \\(n=1\\), check that it is in fact equivalent to a Bernoulli distribution. In our examples we show the graph of a binomial distribution over 10 trials with \\(p=0.8\\). If we take a look at the graph, it appears as though the probabilities of getting 0,1,2 or 3 heads in 10 flips are zero. Is it actually zero? Check by plugging in the values into the pmf. Solution. The pmf of a binomial distribution is \\(\\binom{n}{k} p^k (1 - p)^{n - k}\\), now we insert \\(n=1\\) to get: \\[\\binom{1}{k} p^k (1 - p)^{1 - k}\\]. Not quite equivalent to a Bernoulli, however note that the support of the binomial distribution is defined as \\(k \\in \\{0,1,\\dots,n\\}\\), so in our case \\(k = \\{0,1\\}\\), then: \\[\\binom{1}{0} = \\binom{1}{1} = 1\\] we get: \\(p^k (1 - p)^{1 - k}\\) ,the Bernoulli distribution. As we already know \\(p=0.8, n=10\\), so: \\[\\binom{10}{0} 0.8^0 (1 - 0.8)^{10 - 0} = 1.024 \\cdot 10^{-7}\\] \\[\\binom{10}{1} 0.8^1 (1 - 0.8)^{10 - 1} = 4.096 \\cdot 10^{-6}\\] \\[\\binom{10}{2} 0.8^2 (1 - 0.8)^{10 - 2} = 7.3728 \\cdot 10^{-5}\\] \\[\\binom{10}{3} 0.8^3 (1 - 0.8)^{10 - 3} = 7.86432\\cdot 10^{-4}\\] So the probabilities are not zero, just very small. Exercise 18.3 (Poisson intuition 1) Below are shown 3 different graphs of the Poisson distribution. Your task is to replicate them on your own in R by varying the \\(\\lambda\\) parameter. Hint: You can use dpois() to get the probabilities. library(ggplot2) library(gridExtra) x = 0:15 # Create Poisson data data1 <- data.frame(x = x, y = dpois(x, lambda = 0.1)) data2 <- data.frame(x = x, y = dpois(x, lambda = 1)) data3 <- data.frame(x = x, y = dpois(x, lambda = 7.5)) # Create individual ggplot objects plot1 <- ggplot(data1, aes(x, y)) + geom_col() + xlab("x") + ylab("Probability") + ylim(0,1) plot2 <- ggplot(data2, aes(x, y)) + geom_col() + xlab("x") + ylab(NULL) + ylim(0,1) plot3 <- ggplot(data3, aes(x, y)) + geom_col() + xlab("x") + ylab(NULL) + ylim(0,1) # Combine the plots grid.arrange(plot1, plot2, plot3, ncol = 3) Exercise 18.4 (Poisson intuition 2) The Poisson distribution is a discrete probability distribution that models processes where events occur at a constant mean rate and are independent of each other. It has a single parameter \\(\\lambda\\), which represents the constant mean rate. 
A classic example of a scenario that can be modeled using the Poisson distribution is the number of calls received by a call center in a day (or in fact any other time interval). Suppose you work in a call center and have some understanding of probability distributions. You overhear your supervisor mentioning that the call center receives an average of 2.5 calls per day. Using your knowledge of the Poisson distribution, calculate: The probability you will get no calls today. The probability you will get more than 5 calls today. Solution. First recall the Poisson pmf: \\[p(k) = \\frac{\\lambda^k e^{-\\lambda}}{k!}\\] as stated previously our parameter \\(\\lambda = 2.5\\) To get the probability of no calls we simply plug in \\(k = 0\\), so: \\[p(0) = \\frac{2.5^0 e^{-2.5}}{0!} = e^{-2.5} \\approx 0.082\\] The support of the Poisson distribution is non-negative integers. So if we wanted to calculate the probability of getting more than 5 calls we would need to add up the probabilities of getting 6 calls and 7 calls and so on up to infinity. Let us instead remember that the sum of all probabilties will be 1, we will reverse the problem and instead ask “What is the probability we get 5 calls or less?”. We can subtract the probability of the opposite outcome (the complement) from 1 to get the probability of our original question. \\[P(k > 5) = 1 - P(k \\leq 5)\\] \\[P(k \\leq 5) = \\sum_{i=0}^{5} p(i) = p(0) + p(1) + p(2) + p(3) + p(4) + p(5) =\\] \\[= \\frac{2.5^0 e^{-2.5}}{0!} + \\frac{2.5^1 e^{-2.5}}{1!} + \\dots =\\] \\[=0.957979\\] So the probability of geting more than 5 calls will be \\(1 - 0.957979 = 0.042021\\) Exercise 18.5 (Geometric intuition 1) The geometric distribution is a discrete distribution that models the number of failures before the first success in a sequence of independent Bernoulli trials. It has a single parameter \\(p\\), representing the probability of success. Disclaimer: There are two forms of this distribution, the one we just described and another version that models the number of trials before the first success. The difference is subtle yet significant and you are likely to encounter both forms, though here we will limit ourselves to the former. In the graph below we show the pmf of a geometric distribution with \\(p=0.5\\). This can be thought of as the number of successive failures (tails) in the flip of a fair coin. You can see that there’s a 50% chance you will have zero failures i.e. you will flip a heads on your very first attempt. But there is some smaller chance that you will flip a sequence of tails in a row, with longer sequences having ever lower probability. Create an equivalent graph that represents the probability of rolling a 6 with a fair 6-sided die. Use the formula for the mean of the geometric distribution and determine the average number of failures before you roll a 6. Look up the alternative form of the geometric distribtuion and again use the formula for the mean to determine the average number of trials before you roll a 6. Solution. Parameter p (the probability of success) for rolling a 6 is \\(p=\\frac{1}{6}\\). 
library(ggplot2) # Parameters p <- 1/6 x_vals <- 0:9 # Starting from 0 probs <- dgeom(x_vals, p) # Data data <- data.frame(x_vals, probs) # Plot ggplot(data, aes(x=x_vals, y=probs)) + geom_segment(aes(xend=x_vals, yend=0), color="black", size=1) + geom_point(color="red", size=2) + labs(x = "Number of trials", y = "Probability") + theme_minimal() + scale_x_continuous(breaks = x_vals) # This line ensures integer x-axis labels ::: {.solution} b) The expected value of a random variable (the mean) is denoted as \\(E[X]\\). \\[E[X] = \\frac{1-p}{p}= \\frac{1- \\frac{1}{6}}{\\frac{1}{6}} = \\frac{5}{6}\\cdot 6 = 5\\] On average we will fail 5 times before we roll our first 6. The alternative form of this distribution (with support on all positive integers) has a slightly different formula for the mean. This change reflects the difference in the way we posed our question: \\[E[X] = \\frac{1}{p} = \\frac{1}{\\frac{1}{6}} = 6\\] On average we will have to throw the die 6 times before we roll a 6. ::: 18.2 Continuous distributions Exercise 18.6 (Uniform intuition 1) The need for a randomness is a common problem. A practical solution are so-called random number generators (RNGs). The simplest RNG one would think of is choosing a set of numbers and having the generator return a number at random, where the probability of returning any number from this set is the same. If this set is an interval of real numbers, then we’ve basically described the continuous uniform distribution. It has two parameters \\(a\\) and \\(b\\), which define the beginning and end of its support respectively. Let’s think of the mean intuitively. The expected value or mean of a distribution is the pivot point on our x-axis, which “balances” the graph. Given parameters \\(a\\) and \\(b\\) what is your intuitive guess of the mean for this distribution? A special case of the uniform distribution is the standard uniform distribution with \\(a=0\\),\\(b=1\\). Write the pdf of this particular case. Solution. It’s the midpoint between \\(a\\) and \\(b\\), so \\(\\frac{b-a}{2}\\) Inserting the parameter values we get:\\[f(x) = \\begin{cases} 1 & \\text{if } 0 \\leq x \\leq 1 \\\\ 0 & \\text{otherwise} \\end{cases} \\] Notice how the pdf is just a constant \\(1\\) across all values of \\(x \\in [0,1]\\). Here it is important to distinguish between probability and probability density. The density may be 1, but the probability is not and while discreet distributions never exceeded 1 on the y-axis, continuous distributions can go as high as you like. Exercise 18.7 (Normal intuition 1) a Exercise 18.8 (Beta intuition 1) The beta distribution is a continuous distribution defined on the unit interval \\([0,1]\\). It has two strictly positive paramters \\(\\alpha\\) and \\(\\beta\\), which determine its shape. Its support makes it especially suitable to model distribtuions of percentages and proportions. Below you’ve been provided with some code that you can copy into Rstudio. Once you run the code, an interactive Shiny app will appear and you will be able to manipulate the graph of the beta distribution. Play around with the parameters to get: A straight line from (0,0) to (1,2) A straight line from (0,2) to (1,0) A symmetric bell curve A bowl-shaped curve The standard uniform distribution is actually a special case of the beta distribution. Find the exact parameters \\(\\alpha\\) and \\(\\beta\\). Once you do, prove the equality by inserting the values into our pdf. 
Hint: The beta function is evaluated as \\(\\text{B}(a,b) = \\frac{\\Gamma(a)\\Gamma(b)}{\\Gamma(a+b)}\\), the gamma function for positive integers \\(n\\) is evaluated as \\(\\Gamma(n)= (n-1)!\\) # Install and load necessary packages install.packages(c("shiny", "ggplot2")) library(shiny) library(ggplot2) # The Shiny App ui <- fluidPage( titlePanel("Beta Distribution Viewer"), sidebarLayout( sidebarPanel( sliderInput("alpha", "Alpha:", min = 0.1, max = 10, value = 2, step = 0.1), sliderInput("beta", "Beta:", min = 0.1, max = 10, value = 2, step = 0.1) ), mainPanel( plotOutput("betaPlot") ) ) ) server <- function(input, output) { output$betaPlot <- renderPlot({ x <- seq(0, 1, by = 0.01) y <- dbeta(x, shape1 = input$alpha, shape2 = input$beta) ggplot(data.frame(x = x, y = y), aes(x = x, y = y)) + geom_line() + labs(x = "Value", y = "Density") + theme_minimal() }) } shinyApp(ui = ui, server = server) Solution. \\(\\alpha = 2, \\beta=1\\) \\(\\alpha = 1, \\beta=2\\) Possible solution \\(\\alpha = \\beta= 5\\) Possible solution \\(\\alpha = \\beta= 0.5\\) The correct parameters are \\(\\alpha = 1, \\beta=1\\), to prove the equality we insert them into the beta pdf: \\[\\frac{x^{\\alpha - 1} (1 - x)^{\\beta - 1}}{\\text{B}(\\alpha, \\beta)} = \\frac{x^{1 - 1} (1 - x)^{1 - 1}}{\\text{B}(1, 1)} = \\frac{1}{\\frac{\\Gamma(1)\\Gamma(1)}{\\Gamma(1+1)}}= \\frac{1}{\\frac{(1-1)!(1-1)!}{(2-1)!}} = 1\\] Exercise 18.9 (Gamma intuition 1) a Exercise 18.10 (Exponential intuition 1) a "],["A1.html", "A R programming language A.1 Basic characteristics A.2 Why R? A.3 Setting up A.4 R basics A.5 Functions A.6 Other tips A.7 Further reading and references", " A R programming language A.1 Basic characteristics R is free software for statistical computing and graphics. It is widely used by statisticians, scientists, and other professionals for software development and data analysis. It is an interpreted language and therefore the programs do not need compilation. A.2 Why R? R is one of the main two languages used for statistics and machine learning (the other being Python). Pros Libraries. Comprehensive collection of statistical and machine learning packages. Easy to code. Open source. Anyone can access R and develop new methods. Additionally, it is relatively simple to get source code of established methods. Large community. The use of R has been rising for some time, in industry and academia. Therefore a large collection of blogs and tutorials exists, along with people offering help on pages like StackExchange and CrossValidated. Integration with other languages and LaTeX. New methods. Many researchers develop R packages based on their research, therefore new methods are available soon after development. Cons Slow. Programs run slower than in other programming languages, however this can be somewhat ammended by effective coding or integration with other languages. Memory intensive. This can become a problem with large data sets, as they need to be stored in the memory, along with all the information the models produce. Some packages are not as good as they should be, or have poor documentation. Object oriented programming in R can be very confusing and complex. A.3 Setting up https://www.r-project.org/. A.3.1 RStudio RStudio is the most widely used IDE for R. It is free, you can download it from https://rstudio.com/. While console R is sufficient for the requirements of this course, we recommend the students install RStudio for its better user interface. 
A.3.2 Libraries for data science Listed below are some of the more useful libraries (packages) for data science. Students are also encouraged to find other useful packages. dplyr Efficient data manipulation. Part of the wider package collection called tidyverse. ggplot2 Plotting based on grammar of graphics. stats Several statistical models. rstan Bayesian inference using Hamiltonian Monte Carlo. Very flexible model building. MCMCpack Bayesian inference. rmarkdown, knitr, and bookdown Dynamic reports (for example such as this one). devtools Package development. A.4 R basics A.4.1 Variables and types Important information and tips: no type declaration define variables with <- instead of = (although both work, there is a slight difference, additionally most of the packages use the arrow) for strings use \"\" for comments use # change types with as.type() functions no special type for single character like C++ for example n <- 20 x <- 2.7 m <- n # m gets value 20 my_flag <- TRUE student_name <- "Luka" typeof(n) ## [1] "double" typeof(student_name) ## [1] "character" typeof(my_flag) ## [1] "logical" typeof(as.integer(n)) ## [1] "integer" typeof(as.character(n)) ## [1] "character" A.4.2 Basic operations n + x ## [1] 22.7 n - x ## [1] 17.3 diff <- n - x # variable diff gets the difference between n and x diff ## [1] 17.3 n * x ## [1] 54 n / x ## [1] 7.407407 x^2 ## [1] 7.29 sqrt(x) ## [1] 1.643168 n > 2 * n ## [1] FALSE n == n ## [1] TRUE n == 2 * n ## [1] FALSE n != n ## [1] FALSE paste(student_name, "is", n, "years old") ## [1] "Luka is 20 years old" A.4.3 Vectors use c() to combine elements into vectors can only contain one type of variable if different types are provided, all are transformed to the most basic type in the vector access elements by indexes or logical vectors of the same length a scalar value is regarded as a vector of length 1 1:4 # creates a vector of integers from 1 to 4 ## [1] 1 2 3 4 student_ages <- c(20, 23, 21) student_names <- c("Luke", "Jen", "Mike") passed <- c(TRUE, TRUE, FALSE) length(student_ages) ## [1] 3 # access by index student_ages[2] ## [1] 23 student_ages[1:2] ## [1] 20 23 student_ages[2] <- 24 # change values # access by logical vectors student_ages[passed == TRUE] # same as student_ages[passed] ## [1] 20 24 student_ages[student_names %in% c("Luke", "Mike")] ## [1] 20 21 student_names[student_ages > 20] ## [1] "Jen" "Mike" A.4.3.1 Operations with vectors most operations are element-wise if we operate on vectors of different lengths, the shorter vector periodically repeats its elements until it reaches the length of the longer one a <- c(1, 3, 5) b <- c(2, 2, 1) d <- c(6, 7) a + b ## [1] 3 5 6 a * b ## [1] 2 6 5 a + d ## Warning in a + d: longer object length is not a multiple of shorter object ## length ## [1] 7 10 11 a + 2 * b ## [1] 5 7 7 a > b ## [1] FALSE TRUE TRUE b == a ## [1] FALSE FALSE FALSE a %*% b # vector multiplication, not element-wise ## [,1] ## [1,] 13 A.4.4 Factors vectors of finite predetermined classes suitable for categorical variables ordinal (ordered) or nominal (unordered) car_brand <- factor(c("Audi", "BMW", "Mercedes", "BMW"), ordered = FALSE) car_brand ## [1] Audi BMW Mercedes BMW ## Levels: Audi BMW Mercedes freq <- factor(x = NA, levels = c("never","rarely","sometimes","often","always"), ordered = TRUE) freq[1:3] <- c("rarely", "sometimes", "rarely") freq ## [1] rarely sometimes rarely ## Levels: never < rarely < sometimes < often < always freq[4] <- "quite_often" # non-existing level, returns NA ## Warning in 
`[<-.factor`(`*tmp*`, 4, value = "quite_often"): invalid factor ## level, NA generated freq ## [1] rarely sometimes rarely <NA> ## Levels: never < rarely < sometimes < often < always A.4.5 Matrices two-dimensional generalizations of vectors my_matrix <- matrix(c(1, 2, 1, 5, 4, 2), nrow = 2, byrow = TRUE) my_matrix ## [,1] [,2] [,3] ## [1,] 1 2 1 ## [2,] 5 4 2 my_square_matrix <- matrix(c(1, 3, 2, 3), nrow = 2) my_square_matrix ## [,1] [,2] ## [1,] 1 2 ## [2,] 3 3 my_matrix[1,2] # first row, second column ## [1] 2 my_matrix[2, ] # second row ## [1] 5 4 2 my_matrix[ ,3] # third column ## [1] 1 2 A.4.5.1 Matrix functions and operations most operation element-wise mind the dimensions when using matrix multiplication %*% nrow(my_matrix) # number of matrix rows ## [1] 2 ncol(my_matrix) # number of matrix columns ## [1] 3 dim(my_matrix) # matrix dimension ## [1] 2 3 t(my_matrix) # transpose ## [,1] [,2] ## [1,] 1 5 ## [2,] 2 4 ## [3,] 1 2 diag(my_matrix) # the diagonal of the matrix as vector ## [1] 1 4 diag(1, nrow = 3) # creates a diagonal matrix ## [,1] [,2] [,3] ## [1,] 1 0 0 ## [2,] 0 1 0 ## [3,] 0 0 1 det(my_square_matrix) # matrix determinant ## [1] -3 my_matrix + 2 * my_matrix ## [,1] [,2] [,3] ## [1,] 3 6 3 ## [2,] 15 12 6 my_matrix * my_matrix # element-wise multiplication ## [,1] [,2] [,3] ## [1,] 1 4 1 ## [2,] 25 16 4 my_matrix %*% t(my_matrix) # matrix multiplication ## [,1] [,2] ## [1,] 6 15 ## [2,] 15 45 my_vec <- as.vector(my_matrix) # transform to vector my_vec ## [1] 1 5 2 4 1 2 A.4.6 Arrays multi-dimensional generalizations of matrices my_array <- array(c(1, 2, 3, 4, 5, 6, 7, 8), dim = c(2, 2, 2)) my_array[1, 1, 1] ## [1] 1 my_array[2, 2, 1] ## [1] 4 my_array[1, , ] ## [,1] [,2] ## [1,] 1 5 ## [2,] 3 7 dim(my_array) ## [1] 2 2 2 A.4.7 Data frames basic data structure for analysis differ from matrices as columns can be of different types student_data <- data.frame("Name" = student_names, "Age" = student_ages, "Pass" = passed) student_data ## Name Age Pass ## 1 Luke 20 TRUE ## 2 Jen 24 TRUE ## 3 Mike 21 FALSE colnames(student_data) <- c("name", "age", "pass") # change column names student_data[1, ] ## name age pass ## 1 Luke 20 TRUE student_data[ ,colnames(student_data) %in% c("name", "pass")] ## name pass ## 1 Luke TRUE ## 2 Jen TRUE ## 3 Mike FALSE student_data$pass # access column by name ## [1] TRUE TRUE FALSE student_data[student_data$pass == TRUE, ] ## name age pass ## 1 Luke 20 TRUE ## 2 Jen 24 TRUE A.4.8 Lists useful for storing different data structures access elements with double square brackets elements can be named first_list <- list(student_ages, my_matrix, student_data) second_list <- list(student_ages, my_matrix, student_data, first_list) first_list[[1]] ## [1] 20 24 21 second_list[[4]] ## [[1]] ## [1] 20 24 21 ## ## [[2]] ## [,1] [,2] [,3] ## [1,] 1 2 1 ## [2,] 5 4 2 ## ## [[3]] ## name age pass ## 1 Luke 20 TRUE ## 2 Jen 24 TRUE ## 3 Mike 21 FALSE second_list[[4]][[1]] # first element of the fourth element of second_list ## [1] 20 24 21 length(second_list) ## [1] 4 second_list[[length(second_list) + 1]] <- "add_me" # append an element names(first_list) <- c("Age", "Matrix", "Data") first_list$Age ## [1] 20 24 21 A.4.9 Loops mostly for loop for loop can iterate over an arbitrary vector # iterate over consecutive natural numbers my_sum <- 0 for (i in 1:10) { my_sum <- my_sum + i } my_sum ## [1] 55 # iterate over an arbirary vector my_sum <- 0 some_numbers <- c(2, 3.5, 6, 100) for (i in some_numbers) { my_sum <- my_sum + i } my_sum ## [1] 111.5 A.5 Functions for 
help use ?function_name A.5.1 Writing functions We can write our own functions with function(). In the brackets, we define the parameters the function gets, and in curly brackets we define what the function does. We use return() to return values. sum_first_n_elements <- function (n) { my_sum <- 0 for (i in 1:n) { my_sum <- my_sum + i } return (my_sum) } sum_first_n_elements(10) ## [1] 55 A.6 Other tips Use set.seed(arbitrary_number) at the beginning of a script to set the seed and ensure replication. To dynamically set the working directory in R Studio to the parent folder of a R script use setwd(dirname(rstudioapi::getSourceEditorContext()$path)). To avoid slow R loops use the apply family of functions. See ?apply and ?lapply. To make your data manipulation (and therefore your life) a whole lot easier, use the dplyr package. Use getAnywhere(function_name) to get the source code of any function. Use browser for debugging. See ?browser. A.7 Further reading and references Getting started with R Studio: https://www.youtube.com/watch?v=lVKMsaWju8w Official R manuals: https://cran.r-project.org/manuals.html Cheatsheets: https://www.rstudio.com/resources/cheatsheets/ Workshop on R, dplyr, ggplot2, and R Markdown: https://github.com/bstatcomp/Rworkshop "],["distributions.html", "B Probability distributions", " B Probability distributions Name parameters support pdf/pmf mean variance Bernoulli \\(p \\in [0,1]\\) \\(k \\in \\{0,1\\}\\) \\(p^k (1 - p)^{1 - k}\\) 1.12 \\(p\\) 7.1 \\(p(1-p)\\) 7.1 binomial \\(n \\in \\mathbb{N}\\), \\(p \\in [0,1]\\) \\(k \\in \\{0,1,\\dots,n\\}\\) \\(\\binom{n}{k} p^k (1 - p)^{n - k}\\) 4.4 \\(np\\) 7.2 \\(np(1-p)\\) 7.2 Poisson \\(\\lambda > 0\\) \\(k \\in \\mathbb{N}_0\\) \\(\\frac{\\lambda^k e^{-\\lambda}}{k!}\\) 4.6 \\(\\lambda\\) 7.3 \\(\\lambda\\) 7.3 geometric \\(p \\in (0,1]\\) \\(k \\in \\mathbb{N}_0\\) \\(p(1-p)^k\\) 4.5 \\(\\frac{1 - p}{p}\\) 7.4 \\(\\frac{1 - p}{p^2}\\) 9.3 normal \\(\\mu \\in \\mathbb{R}\\), \\(\\sigma^2 > 0\\) \\(x \\in \\mathbb{R}\\) \\(\\frac{1}{\\sqrt{2 \\pi \\sigma^2}} e^{-\\frac{(x - \\mu)^2}{2 \\sigma^2}}\\) 4.12 \\(\\mu\\) 7.8 \\(\\sigma^2\\) 7.8 uniform \\(a,b \\in \\mathbb{R}\\), \\(a < b\\) \\(x \\in [a,b]\\) \\(\\frac{1}{b-a}\\) 4.9 \\(\\frac{a+b}{2}\\) \\(\\frac{(b-a)^2}{12}\\) beta \\(\\alpha,\\beta > 0\\) \\(x \\in [0,1]\\) \\(\\frac{x^{\\alpha - 1} (1 - x)^{\\beta - 1}}{\\text{B}(\\alpha, \\beta)}\\) 4.10 \\(\\frac{\\alpha}{\\alpha + \\beta}\\) 7.6 \\(\\frac{\\alpha \\beta}{(\\alpha + \\beta)^2(\\alpha + \\beta + 1)}\\) 7.6 gamma \\(\\alpha,\\beta > 0\\) \\(x \\in (0, \\infty)\\) \\(\\frac{\\beta^\\alpha}{\\Gamma(\\alpha)} x^{\\alpha - 1}e^{-\\beta x}\\) 4.11 \\(\\frac{\\alpha}{\\beta}\\) 7.5 \\(\\frac{\\alpha}{\\beta^2}\\) 7.5 exponential \\(\\lambda > 0\\) \\(x \\in [0, \\infty)\\) \\(\\lambda e^{-\\lambda x}\\) 4.8 \\(\\frac{1}{\\lambda}\\) 7.7 \\(\\frac{1}{\\lambda^2}\\) 7.7 logistic \\(\\mu \\in \\mathbb{R}\\), \\(s > 0\\) \\(x \\in \\mathbb{R}\\) \\(\\frac{e^{-\\frac{x - \\mu}{s}}}{s(1 + e^{-\\frac{x - \\mu}{s}})^2}\\) 4.13 \\(\\mu\\) \\(\\frac{s^2 \\pi^2}{3}\\) negative binomial \\(r \\in \\mathbb{N}\\), \\(p \\in [0,1]\\) \\(k \\in \\mathbb{N}_0\\) \\(\\binom{k + r - 1}{k}(1-p)^r p^k\\) 4.7 \\(\\frac{rp}{1 - p}\\) 9.2 \\(\\frac{rp}{(1 - p)^2}\\) 9.2 multinomial \\(n \\in \\mathbb{N}\\), \\(k \\in \\mathbb{N}\\) \\(p_i \\in [0,1]\\), \\(\\sum p_i = 1\\) \\(x_i \\in \\{0,..., n\\}\\), \\(i \\in \\{1,...,k\\}\\), \\(\\sum{x_i} = n\\) \\(\\frac{n!}{x_1!x_2!...x_k!} p_1^{x_1} p_2^{x_2}...p_k^{x_k}\\) 8.1 \\(np_i\\) 
\\(np_i(1-p_i)\\) "],["references.html", "References", " References "],["404.html", "Page not found", " Page not found The page you requested cannot be found (perhaps it was moved or renamed). You may want to try searching to find the page's new location, or use the table of contents to find the page you are looking for. "]] diff --git a/docs/uprobspaces.html b/docs/uprobspaces.html index 31c6cd1..c119e98 100644 --- a/docs/uprobspaces.html +++ b/docs/uprobspaces.html @@ -23,7 +23,7 @@ - + From 01136aae666c2eaada0c8d2fda75dd87ee76ecf9 Mon Sep 17 00:00:00 2001 From: LeonHvastja Date: Sat, 30 Sep 2023 23:25:24 +0200 Subject: [PATCH 4/7] first iteration finished --- 18-distributions_intuition.Rmd | 208 +++++++- docs/404.html | 2 +- .../figure-html/unnamed-chunk-22-1.png | Bin 28042 -> 109778 bytes .../figure-html/unnamed-chunk-24-1.png | Bin 17288 -> 92696 bytes ...-from-wide-to-long-format-for-ggplot2.html | 380 ++++++++++++++ docs/create-data.html | 381 ++++++++++++++ docs/define-the-iq-distributions.html | 381 ++++++++++++++ docs/distributions-intutition.html | 289 ++++++++--- ...bution-factor-match-our-desired-order.html | 380 ++++++++++++++ docs/plot.html | 468 ++++++++++++++++++ docs/reference-keys.txt | 13 + docs/search_index.json | 2 +- 12 files changed, 2413 insertions(+), 91 deletions(-) create mode 100644 docs/convert-from-wide-to-long-format-for-ggplot2.html create mode 100644 docs/create-data.html create mode 100644 docs/define-the-iq-distributions.html create mode 100644 docs/ensure-the-levels-of-the-distribution-factor-match-our-desired-order.html create mode 100644 docs/plot.html diff --git a/18-distributions_intuition.Rmd b/18-distributions_intuition.Rmd index 1deb49e..cd2f50a 100644 --- a/18-distributions_intuition.Rmd +++ b/18-distributions_intuition.Rmd @@ -3,7 +3,7 @@ This chapter is intended to help you familiarize yourself with the different probability distributions you will encounter in this course. -Use [Appendix B](#distributions) as a reference for the basic properties of distributions. +You will need to use [Appendix B](#distributions) extensively as a reference for the basic properties of distributions, so keep it close! ```{r, echo = FALSE} @@ -44,11 +44,11 @@ $(document).ready(function() { Arguably the simplest distribution you will enocounter is the Bernoulli distribution. It is a discrete probability distribution used to represent the outcome of a yes/no question. It has one parameter $p$ which is the probability of success. The -probability of failure is $1-p$, sometimes denoted as $q$. +probability of failure is $(1-p)$, sometimes denoted as $q$. A classic way to think about a Bernoulli trial (a yes/no experiment) is a coin flip. Real coins are fair, meaning the probability of either heads (1) -or tails (0) are the same i.e. $p=0.5$, shown below in *figure a*. Alternatively +or tails (0) are the same, so $p=0.5$ as shown below in *figure a*. Alternatively we may want to represent a process that doesn't have equal probabilities of outcomes like "Will a throw of a fair die result in a 6?". In this case $p=\frac{1}{6}$, shown in *figure b*. @@ -233,8 +233,9 @@ grid.arrange(plot1, plot2, plot3, ncol = 3) ```{exercise, name = "Poisson intuition 2"} -The Poisson distribution is a discrete probability distribution that models -processes where events occur at a constant mean rate and are independent of each other. 
+The Poisson distribution is a discrete probability distribution that models the +probability of a given number of events occuring within processes where events +occur at a constant mean rate and independently of each other - a **Poisson process**. It has a single parameter $\lambda$, which represents the constant mean rate. @@ -282,10 +283,9 @@ The geometric distribution is a discrete distribution that models the **number o failures** before the first success in a sequence of independent Bernoulli trials. It has a single parameter $p$, representing the probability of success. -*Disclaimer*: There are two forms of this distribution, the one we just described -and another version that models the **number of trials** before the first success. The -difference is subtle yet significant and you are likely to encounter both forms, -though here we will limit ourselves to the former. +NOTE: There are two forms of this distribution, the one we just described +and another that models the **number of trials** before the first success. The +difference is subtle yet significant and you are likely to encounter both forms. In the graph below we show the pmf of a geometric distribution with $p=0.5$. This can be thought of as the number of successive failures (tails) in the flip of a fair coin. @@ -369,7 +369,7 @@ The need for a randomness is a common problem. A practical solution are so-calle It has two parameters $a$ and $b$, which define the beginning and end of its support respectively. a) Let's think of the mean intuitively. The expected value or mean of a distribution is the pivot point on our x-axis, which "balances" the graph. Given parameters $a$ and $b$ what is your intuitive guess of the mean for this distribution? -b) A special case of the uniform distribution is the **standard uniform distribution** with $a=0$,$b=1$. Write the pdf of this particular case. +b) A special case of the uniform distribution is the **standard uniform distribution** with $a=0$ and $b=1$. Write the pdf $f(x)$ of this particular distribution. ``` ```{r, fig.width=5, fig.height=3, echo=FALSE, warning=FALSE, message=FALSE} # Load required libraries @@ -395,21 +395,117 @@ print(p) ```
```{solution, echo = togs} -a. It's the midpoint between $a$ and $b$, so $\frac{b-a}{2}$ +a. It's the midpoint between $a$ and $b$, so $\frac{a+b}{2}$ b. Inserting the parameter values we get:$$f(x) = \begin{cases} 1 & \text{if } 0 \leq x \leq 1 \\ 0 & \text{otherwise} \end{cases} $$ -Notice how the pdf is just a constant $1$ across all values of $x \in [0,1]$. Here it is important to distinguish between probability and **probability density**. The density may be 1, but the probability is not and while discreet distributions never exceeded 1 on the y-axis, continuous distributions can go as high as you like. +Notice how the pdf is just a constant $1$ across all values of $x \in [0,1]$. Here it is important to distinguish between probability and **probability density**. The density may be 1, but the probability is not, and while discrete distributions never exceed 1 on the y-axis, continuous densities can go as high as you like. ```
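The distinction between density and probability is easy to check numerically; a small sketch using a Uniform(0, 0.1), whose density is constant at 10 while its total probability still integrates to 1:

```r
# Density vs. probability for a Uniform(0, 0.1): the density exceeds 1 everywhere on the
# support, yet the probabilities still integrate to 1.
dunif(0.05, min = 0, max = 0.1)                                      # density = 1 / (b - a) = 10
integrate(dunif, lower = 0, upper = 0.1, min = 0, max = 0.1)$value   # total probability = 1
punif(0.1, min = 0, max = 0.1) - punif(0, min = 0, max = 0.1)        # same thing via the CDF
```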
      ```{exercise, name = "Normal intuition 1"} -a +The normal distribution, also known as the Gaussian distribution, is a continuous distribution that encompasses the entire real number line. It has two parameters: the mean, denoted by $\mu$, and the variance, represented by $\sigma^2$. Its shape resembles the iconic bell curve. The position of its peak is determined by the parameter $\mu$, while the variance determines the spread or width of the curve. A smaller variance results in a sharper, narrower peak, while a larger variance leads to a broader, more spread-out curve. + +Below, we graph the distribution of IQ scores for two different populations. + +We aim to identify individuals with an IQ at or above 140 for an experiment. We can identify them reliably; however, we only have time to examine one of the two groups. Which group should we investigate to have the best chance? + +NOTE: The graph below displays the parameter $\sigma$, which is the square root of the variance, more commonly referred to as the **standard deviation**. Keep this in mind when solving the problems. + +a) Insert the values of either population into the pdf of a normal distribution and determine which one has a higher density at $x=140$. + +b) Generate the graph yourself and zoom into the relevant area to graphically verify your answer. + +To determine probability density, we can use the pdf. However, if we wish to know the proportion of the population that falls within certain parameters, we would need to integrate the pdf. Fortunately, the integrals of common distributions are well-established. This integral gives us the cumulative distribution function $F(x)$ (CDF). + +c) BONUS: Look up the CDF of the normal distribution and input the appropriate values to determine the percentage of each population that comprises individuals with an IQ of 140 or higher. + + +``` +```{r, echo=FALSE, warning=FALSE, message=FALSE} +library(ggplot2) +library(tidyr) + +# Create data +x <- seq(70, 135, by = 0.01) # Adjusting the x range to account for the larger standard deviations +df <- data.frame(x = x) + +# Define the IQ distributions +df$IQ_mu100_sd10 <- dnorm(df$x, mean = 100, sd = 10) +df$IQ_mu105_sd8 <- dnorm(df$x, mean = 105, sd = 8) + +# Convert from wide to long format for ggplot2 +df_long <- gather(df, distribution, density, -x) + +# Ensure the levels of the 'distribution' factor match our desired order +df_long$distribution <- factor(df_long$distribution, levels = c("IQ_mu100_sd10", "IQ_mu105_sd8")) + +# Plot +ggplot(df_long, aes(x = x, y = density, color = distribution)) + + geom_line() + + labs(x = "IQ Score", y = "Density") + + scale_color_manual( + name = "IQ Distribution", + values = c(IQ_mu100_sd10 = "red", IQ_mu105_sd8 = "blue"), + labels = c("Group 1 (µ=100, σ=10)", "Group 2 (µ=105, σ=8)") + ) + + theme_minimal() + +``` + +
+```{solution, echo = togs} +a. Group 1: $\mu = 100, \sigma=10 \rightarrow \sigma^2 = 100$ + $$\frac{1}{\sqrt{2 \pi \sigma^2}} e^{-\frac{(x - \mu)^2}{2 \sigma^2}} = + \frac{1}{\sqrt{2 \pi 100}} e^{-\frac{(140 - 100)^2}{2 \cdot 100}} \approx 1.34e-05$$ + Group 2: $\mu = 105, \sigma=8 \rightarrow \sigma^2 = 64$ + $$\frac{1}{\sqrt{2 \pi \sigma^2}} e^{-\frac{(x - \mu)^2}{2 \sigma^2}} = + \frac{1}{\sqrt{2 \pi 64}} e^{-\frac{(140 - 105)^2}{2 \cdot 64}} \approx 3.48e-06$$ + + So despite the fact that group 1 has a lower average IQ, we are more likely to find individuals with an IQ of 140 in group 1, because its larger standard deviation puts more mass in the upper tail. +``` +b. +```{r, echo=togs} +library(ggplot2) +library(tidyr) + +# Create data +x <- seq(135, 145, by = 0.01) # Adjusting the x range to account for the larger standard deviations +df <- data.frame(x = x) + +# Define the IQ distributions +df$IQ_mu100_sd10 <- dnorm(df$x, mean = 100, sd = 10) +df$IQ_mu105_sd8 <- dnorm(df$x, mean = 105, sd = 8) + +# Convert from wide to long format for ggplot2 +df_long <- gather(df, distribution, density, -x) + +# Ensure the levels of the 'distribution' factor match our desired order +df_long$distribution <- factor(df_long$distribution, levels = c("IQ_mu100_sd10", "IQ_mu105_sd8")) + +# Plot +ggplot(df_long, aes(x = x, y = density, color = distribution)) + + geom_line() + + labs(x = "IQ Score", y = "Density") + + scale_color_manual( + name = "IQ Distribution", + values = c(IQ_mu100_sd10 = "red", IQ_mu105_sd8 = "blue"), + labels = c("Group 1 (µ=100, σ=10)", "Group 2 (µ=105, σ=8)") + ) + + theme_minimal() +``` +```{solution, echo = togs} +c. The CDF of the normal distribution is $\Phi(x) = \frac{1}{2} \left[ 1 + \text{erf} \left( \frac{x - \mu}{\sigma \sqrt{2}} \right) \right]$. The CDF is defined as the integral of the distribution density up to x. So to get the total percentage of individuals with IQ at 140 or higher we will need to subtract the value from 1. + + Group 1: $$1 - \Phi(140) = \frac{1}{2} \left[ 1 - \text{erf} \left( \frac{140 - 100}{10 \sqrt{2}} \right) \right] \approx 3.17e-05 $$ + Group 2: $$1 - \Phi(140) = \frac{1}{2} \left[ 1 - \text{erf} \left( \frac{140 - 105}{8 \sqrt{2}} \right) \right] \approx 6.07e-06 $$ + So roughly 0.003% and 0.0006% of individuals in groups 1 and 2 respectively have an IQ at or above 140. ``` +
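The tail probabilities from part (c) can also be read directly off R's built-in normal CDF, which serves as a handy cross-check:

```r
# Cross-check of part (c) using pnorm(): proportion of each group with an IQ of 140 or more.
1 - pnorm(140, mean = 100, sd = 10)   # Group 1, roughly 3.17e-05
1 - pnorm(140, mean = 105, sd = 8)    # Group 2, roughly 6.07e-06
```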
      ```{exercise, name = "Beta intuition 1"} The beta distribution is a continuous distribution defined on the unit interval $[0,1]$. It has two strictly positive paramters $\alpha$ and $\beta$, which determine its shape. Its support makes it especially suitable to model distribtuions of percentages and proportions. @@ -483,12 +579,92 @@ shinyApp(ui = ui, server = server) ```
      +```{exercise, name = "Exponential intuition 1"} +The exponential distribution represents the distributon of **time between events** in a Poisson process. It is the continuous analogue of the geometric distribution. + +It has a single parameter $\lambda$, which is strictly positive and represents the *constant rate* of the corresponding Poisson process. The support is all positive reals, since time between events is non-negative, but not bound upwards. + +Let's revisit the call center from our Poisson problem. We get 2.5 calls per day on average, this is our rate parameter $\lambda$. A work day is 8 hours. + +a) What is the mean time between phone calls? + +The cdf $F(x)$ tells us what percentage of calls occur within x amount of time of each other. + +b) You want to take an hour long lunch break but are worried about missing calls. Calculate the percentage of calls you are likely to miss if you're gone for an hour. Hint: The cdf is $F(x) = \int_{-\infty}^{x} f(x) dx$ +``` + +
      +```{solution, echo = togs} +a. Taking $\lambda = \frac{2.5 \text{ calls}}{8 \text{ hours}} = \frac{1 \text{ call}}{3.2 \text{ hours}}$ + $$E[X] = \frac{1}{\lambda} = \frac{3.2 \text{ hours}}{\text{call}}$$ + +b. First we derive the CDF, we can integrate from 0 instead of $-\infty$, since we have no support in the negatives: + \begin{align} + F(x) &= \int_{0}^{x} \lambda e^{-\lambda t} dt \\ + &= \lambda \int_{0}^{x} e^{-\lambda t} dt \\ + &= \lambda (\frac{1}{-\lambda}e^{-\lambda t} |_{0}^{x}) \\ + &= \lambda(\frac{1}{\lambda} - \frac{1}{\lambda} e^{-\lambda x}) \\ + &= 1 - e^{-\lambda x}. + \end{align} + Then we just evaluate it for a time of 1 hour: + $$F(1 \text{ hour}) = 1 - e^{-\frac{1 \text{ call}}{3.2 \text{ hours}} \cdot 1 \text{ hour}}= + 1 - e^{-\frac{1 \text{ call}}{3.2 \text{ hours}}} \approx 0.268$$ + So we have about a 27% chance of missing a call if we're gone for an hour. +``` +
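The same number can be read off R's built-in exponential CDF; a quick check, using the rate of one call per 3.2 hours from part (a):

```r
# Probability that a call arrives within a 1 hour break, with rate = 1 call per 3.2 hours;
# this matches the roughly 27% chance of missing a call computed above.
pexp(1, rate = 1 / 3.2)   # about 0.268
```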
```{exercise, name = "Gamma intuition 1"}
-a
+The gamma distribution is a continuous distribution characterized by two parameters, $\alpha$ and $\beta$, both strictly greater than 0. These parameters afford the distribution a broad range of shapes (a few example shapes are plotted just after this exercise), which is why it is commonly referred to as a *family of distributions*. Given its support over the positive real numbers, it is well suited for modeling a diverse range of positive-valued phenomena.
+
+a) The exponential distribution is actually just a particular form of the gamma distribution. What are the values of $\alpha$ and $\beta$?
+b) Copy the code from our beta distribution Shiny app and modify it to simulate the gamma distribution. Then get it to show the exponential.
```
-```{exercise, name = "Exponential intuition 1"}
-a
+
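+To give a feel for how flexible this family is, here is a minimal sketch that plots a few gamma densities with `dgamma`, using the rate parameterization for $\beta$ (the parameter values are chosen arbitrarily):
+
+```{r, echo = togs}
+library(ggplot2)
+
+x <- seq(0.01, 15, by = 0.01)
+df <- rbind(
+  data.frame(x = x, density = dgamma(x, shape = 1, rate = 1),   params = "alpha = 1, beta = 1"),
+  data.frame(x = x, density = dgamma(x, shape = 2, rate = 1),   params = "alpha = 2, beta = 1"),
+  data.frame(x = x, density = dgamma(x, shape = 5, rate = 1),   params = "alpha = 5, beta = 1"),
+  data.frame(x = x, density = dgamma(x, shape = 2, rate = 0.5), params = "alpha = 2, beta = 0.5")
+)
+
+ggplot(df, aes(x = x, y = density, color = params)) +
+  geom_line() +
+  labs(x = "Value", y = "Density", color = "Parameters") +
+  theme_minimal()
+```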
+```{solution, echo = togs}
+a. Let's start by taking a look at the pdfs of the two distributions side by side:
+  $$\frac{\beta^\alpha}{\Gamma(\alpha)} x^{\alpha - 1}e^{-\beta x} = \lambda e^{-\lambda x}$$
+  The $x^{\alpha - 1}$ term is not found anywhere in the pdf of the exponential, so we need to eliminate it by setting $\alpha = 1$. This also makes the fraction evaluate to $\frac{\beta^1}{\Gamma(1)} = \beta$, which leaves us with $$\beta \cdot e^{-\beta x}.$$
+  Now we can see that $\beta = \lambda$ and $\alpha = 1$.
+
+b.
+```
+```{r, echo = tog_ex, eval=FALSE}
+# Install and load the necessary packages
+install.packages(c("shiny", "ggplot2"))
+library(shiny)
+library(ggplot2)
+
+# The Shiny App
+ui <- fluidPage(
+  titlePanel("Gamma Distribution Viewer"),
+
+  sidebarLayout(
+    sidebarPanel(
+      sliderInput("shape", "Shape (α):", min = 0.1, max = 10, value = 2, step = 0.1),
+      sliderInput("rate", "Rate (β):", min = 0.1, max = 10, value = 2, step = 0.1)
+    ),
+
+    mainPanel(
+      plotOutput("gammaPlot")
+    )
+  )
+)
+
+server <- function(input, output) {
+  output$gammaPlot <- renderPlot({
+    x <- seq(0, 25, by = 0.1)
+    # beta is the rate parameter of the pdf above, so we pass it as `rate`
+    y <- dgamma(x, shape = input$shape, rate = input$rate)
+
+    ggplot(data.frame(x = x, y = y), aes(x = x, y = y)) +
+      geom_line() +
+      labs(x = "Value", y = "Density") +
+      theme_minimal()
+  })
+}
+
+shinyApp(ui = ui, server = server)
+
+```
+
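+Part a can also be confirmed numerically: a gamma density with $\alpha = 1$ and rate $\beta = \lambda$ matches the exponential density with rate $\lambda$ (the rate value below is arbitrary):
+
+```{r, echo = togs}
+lambda <- 0.8
+x <- seq(0, 10, by = 0.5)
+
+# Gamma(shape = 1, rate = lambda) has the same density as Exponential(rate = lambda)
+all.equal(dgamma(x, shape = 1, rate = lambda), dexp(x, rate = lambda))
+```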
diff --git a/docs/404.html b/docs/404.html
index 9efe05c..47877bf 100644
--- a/docs/404.html
+++ b/docs/404.html
@@ -23,7 +23,7 @@
diff --git a/docs/bookdown-pou_files/figure-html/unnamed-chunk-22-1.png b/docs/bookdown-pou_files/figure-html/unnamed-chunk-22-1.png
index 082779d36a8ab8d240da88b4a92fa503a694b7e6..8b9d4fa0d53b00ff914aa48330a50f7cc88f57a2 100644
GIT binary patch
(binary PNG figure data omitted)
zHMga#kiGj$n9k>ljK)^Yfo*Gqu7oCK&RNd!cw#>uQu0q#EWnFW$b2_D$jgxXy06maOt!i&S}1ZK9ODX zy(2Nb*>)~ptF)ybq-qjfDs{$IpWf(W7*jL+K5*eV z4z9N~rHi7`S9(thK3FDkRK6C>LrBBJm>%gZ&**EY;SVT#=1 zNVg;@k45rGdwz+|FffL#U;p3Bo}dQrBP=4Y5dsP97X2hT%YA6O)n3sPN@(-dLePXr zQ!|g=jGr{0&5~KEuk}*ITG3sKo8YKUk^6XQ2FQcGoHK|uXO%=Dc77W&wViG8nL8}- zYEDO({HKmEHP`_&wM+#&9E=vhrm(QemY<9!0xb>Z<#F71=o(#tg}_36D@KUjnqWk4 z#4sJ%?n=^-U^|1yRdCmL;fRMzYA`#`hj__j8|AhUxc5W2k0iH~Q14McN+y`H)5qD9 zU?NQYvpwV^84hhUMb$dD;N7ogYA*~5UCeIBV-uzR>SRTi?&#u>I?%aow&99 zI+w{~SUL*#xkWpPb+Gknih?2z%1w-*)A9~9V0baXGSpb`)g$%I7t#CbUaFufIdn~p zKvL6{X2#LjtL*pNzgQ``8*rl{J`W;Uu!aZeM_z;@F2$Cq9T9a%x&|bU=5VNN-)Kz2 zT#&lpOLO$gAymd{^~!N7RFa7KUoedVBoyJ2%RH<~SlVv+yC4dQg^#D(bq>FjY=O#9 zXNa%dmRj;~*8jV+AlP$fz@%Di*S(%U{>&6b`b(H8i^m&+rjTd59 z7j<%`xKuIdnYMepkN({rJ?!?D#AKuqXrE$&ISz<+(_5A=(q^uNiG!P3k0Uxm zEXT;^3rGbKo8g@5EQ%A~UcEgUA?@M1ziir9^o0XtGjbpNhq56oAHR&0=Wtm~zgpW) zt-`_S^8YK_a30nmn=D!NYUydPo=0c2J`RVuNWDC?)_9Smk38&CFzuxMHpd$BFh>U^ z7itUinu-}OGX_BKMFf;l1u0|(OLgl3o!2K!&Sf_-$1PZE>=EGLfqQzI!WG4>~i zbM;8A&F`0ga2~lhXbc!%_4KAa52dfRN`E#^q3Fjb&Eaw(cC0rXN-?z8A&#^8jsWtR z7gOeoBeIQpdQ zY$z?<6uWtJqnxfJk?m}@bvuQGL8U{YG-&K90|~`XW};qgmG-0>(^1+M-(+uKGOm*h z*z!R=wLG#%Hxfdb9!>qp^^m4pok^upQLy@6Lvk|Hgr(UdgiOh>K3=Yn5cPMd3VChX ze-iQ{8SZX9f&1q~Ykt%Z=qz>I1b=*^vkSRTw5aG=MODahSMvNQjJCSjxu$0s;ok@T zD`3&d=6_*Vi0ryeb#>{V_e+l++wlNrB0@`4fv(sn;=0EVH5aY^GVYM6U3<1mFqp0qGgQe7Oel|n} zG_}iE_!s_e1kd*6)}BT^?#U?48m9Gexzz%tr3%xqQl_rzP57b)${FV}u32_B8xj|J zqmk^*6)W;v8ScW~H5!v&9mPxUo2oT|CS>huT+l|JyVUrBK}ogy-EA~cnhcQm82jFT zIl^EXm|MOi_6`GqmKrnq<>;>~^vEA>%a+-4swwOmkyWNyD9fFOUD&5rc5BrO>7*7l z8rC7@9mHMFJ?d}2m%0h;PIOVGWmd9hfh49`6wRWzo|dS^sN1KMuO!N{52qaBG2iVE z(esnx+Cu$TQq@Tg+_V^L)@)SxC2XJje0-Nb6O_xE>Fq~!S1gy?e&t|NKer}9Sn2&# zvs0C4(4H-fopC*cT2B%Fr zrDQr)V(^=kfNX+@n`8?=DeXL~dEtPQx!9B$S%ncz;KjRNJjXJ-R~WjuOv-Op?pCzC zi`#?;(Sqv#i*d)mKsr_-ybXVHJP64K9ZzWb9bG<0aXoK%tg0^69ZqESIRPwqOp1v? 
zmbf2hush7)dYf$ECmlxMKEUX9YtPpzH>t^+%x}Wpu8H#Onv1N-wS9bvqePC*xA(TI z@c%-&Os|2Sqf|p&cjxCocxJUCq+}e}e`ff#VG4u(IwivOm<{19xAN+{8mA99{4i+EwUOhuQgg;%_47N?X_~Thjwatp zr7lm_KFdEdB@6h5W{JWIA(RSJBB{{Aj=9aHTYl{TS&wEf5vAdP)5?`HsCtNFLctt^ zPc8OktL|u@vn%M+?s6q1zeNwhXz;T_gfWT9=5-veO}+3KWUEi^_|=~~#^lZ=dmMg8 z1!j4BcU&WEQ*7_XOM(e$GA#`IX7K8gy%X}7-w#0jZQ!wNEG+~Izb9hcx@!}IYKuLrh`fOvkRgpM6S3zHX8-l=RXnnrpg)g zrJf{;dFUsGd_|WGx7a9<-2nmAKf#{hK;Tq#nBl1$eq%YP`-vTW#_g?9Gk&IKCli zq%seaoYr>L1M{l%)P1zSy-Y%ec)y7QYb!YqB&r~P&_Sts^OV-zGih#G^WvD53~EHX znY7NQE;!GXA5B&KaN@dONpwD25a;AL%WXL4LQ$4U_TT&S!28QBTx#v?3Sc{KC^#H#bexpP9Kp+Fir7Y+EcK|XqjSHj-Kxwj!ejlTC}Q9feRaMH zO+SPFFov`xGOWyJDB*0EWO?#kzt@*xH4-G# z4(k@xLz0&wjpkAP#dASzzqn^FG<&As zXxMFhnxEmumP2U{h%x`{3x*Y48#}$9-Zcl^Bo+(Pq%&&uHytO?trja^SIK1nu%HiF z`uFcP@ojSd@KPu+x0J}CCYad){EJ;#II|g9AQ=-S`T2Z*&R+RCIgBohekGG%Yg-NP zGM){hYnG+svO(W>`bcUKF9#L6iZu z>h_kW4j%6Jm=SL=?!Zs=Rj{1h6^I-y5zmDT!TF zqPQG~T=F<&WXqQpWH(=)4e?%xgi+N!F&Q}6s>+C@r;8<9`J+`5;1kbQ@XmGO>!H!e z#imTHzdvUs$)um$L0RtobTPI&{4M!059dqWfCS#F)&30<_laL@(BMFJ z4|ZFJnmrI+kE8s+`g0xc41x4VPy(O7n-LSAtIzq>Dy@sw*Psaq$j4({7QjTj%;K%kl1R^VcYnr*A2E4K5G2&>pI}B&^M1Mvu<&McSRFptu~xT)H?#u@Zh0Pir=B66J9eDUqrmu{c zN!(~l#~V>FYD8MPBl1`~`k-e60=`}$8jhmt+V^^%t5r{?Y~E@$VD4$jfw%p}B>oHk z1j1%;k;PmZ4#)D_0|wU+t{lU#)!>r>$?={%jRq6^Px2~=KZ5L9`eT^YOCWLL>~kh# zc0ZpMj3&Rh6}VExW|9Hv6kolk=|p~%y=-Zko`mKK_F3xK&dd%Kwi6t$lN`k+weFsb zRDUwV{*mNXXiLp?%TvIfS;Fd&+3~4J-?4dgqB7DJcitFl>3QVwtg0i_CC5;%YWUla z)!7+H3kX=S{Sg;H5fHkf$HTi%R&3_cNftyi{XrE``kUyJ>J0`I$@4eYIa%b#B@r_4 zbc!NW@uX<@AIGx%s&luRnpkX2e&2mBq!8WxCw`*~W*k;y8^u(ch;`isl-k#|vJndi6#r5tg z4#eOKubk5WPJ>d3(P(W|E zL>6gIx2g;X3E9HI-ssZ2wvM6C7-C>|dpj;|z}Cu54h!bty^n$miLqJReFDFM=~I7N z`hAOw=CSK%o#j%$Ov=C|zI{HJYe|eHTqsavl!OA`)pr6`ejJW#7BT==+&tzRRaQ*@ z$G=ciq`+NY5s=5#|4mZ#(rnO+JQ9~Ph4SY43B1&uqGw8)dCdrd5bV@is4M&BjH`^{ zFrH>L;|onE2+x1Px?*0y!W&q>9T3Ej5wIHNf0;F4bEIni#xE<)VK+#AGv`JIjlTB1 zGSLN-1JcCqKbc>?2%zT$Q>ur)AP*5c>8RP{VBYE$4*GeU3FVXyl1MlIvo|$TgZn$; zQ8bnl5e%5;BhdyIpqs$)M4hHGa&fxG7rK(?*XZ46NKJ4Xse*%#cti@PLc=T7N9CdQ zv%4<%!khE_yAI4{}olip?4}%+-hy=G9BugF= z)onfl>*59u1XN*3jUPKac^-LFaT{BVB81 zcuAa|@1mT@L(bM`#qC}k0;s&_VkJ);AwX{4_ zRn1~TMl)sqe5)41CSLEgyX)YGy=%1AjC)53{`kIF+#Eu`zD?z<-0j#n=*Tj>*uv7W zO&l<-{Nzt5_zdvqEdT=>VSsUf>}*JH_#-2L`0DPN0FK#&Pz&q0xIb>4{uaEjFiFb5 zh$u>6y?}-l^9~hF8e*cu4@B-ACL+77OD#6~_oNg2s}V}I{Ml1Ie;f+TPMteNLTG1r z@p%LNb3JuB+46O(sH+xlTNb2qHBO`)-&`G@?rM%^hZuK#Jiy(~FH0{}&j16QYHUW( z{!na)0yxX+L*Zd1@GW+7IJ}0z-{Y#}EDn%p#ukPHOis>;a|hTpX<5!6Iuxktn-FG4 zW1U_*<=C0$m}{;D@l7|bkIry^IKKOCkN8>i-;X^;0!>tO)<+R2SNosAR?3jx@VA55 z$k-B4L1DFNsX6P-qE@F%BJ&YE&FGxBcA-*Sl1$8mA%83>X%KK{m-pKHS2FKss%kA3 zRN?TN7JpZ@Q`~KC_2u9dXx!N2a7cY49TYkKDbG2x+pn@ejw(T$7u9B+d?I32J4clr zR)>>Uq?eKwjr^x{P^Zj&yt?EKyV?eP3SJ`PC~F=8F0N4A$l+)Gqqh6>N6tNEqp2x` zUE1C)K~HEZ3>Ut(7pX)*3bMBULLM2QmT)2#>hrOYdYMu}xONon^GtEpT#`Ls_KS_O zTbSmQq}0nv$*x|Qs|88JC*ivVMlnX2)pkKv9HvZVHw6OEOwuoQqIj=C{bS@FT8ii2 zCo{l$oxBPgRIYoI7D+xukCcmEUfy+W5DtURNZ?^lXRJ4(ew!ptwtes1@hvog*86YO z&l&9YnhNb^E7?vzdVbG}&)oe{fX#8Bwst|1&>ml-=ewJ@&(QEL1)q=BdPn_%O~2>d zBf{1C9p3S{1y5A+skijKRD}S>li<$14MTwWV!4K}FVs!^0c9RfU?p4&c(0P;@Oop? 
zI(@i6-}pC_o%%e_3lfpu9m5Ztru5M3IvmGsL$7NmuX;VmeheMmWTR!q#>B*!b0t2D z$obXD+X@Wscd^_PXy(QPq$NpzO?sz`$LdGXtOxR|b)x&4ZZe>srk6cMR`n`9>WHMF z@TwxPm(VMf<%8!)R#15l!w}a_+p*=}mbV&OKry8vlf+887uu<0{oGl*Zs>LmJpF7| z0xk!pZ?r$J?|)m*8>p^QFdafT1J&W65Xe5UAOO2KuV(V;mZ}*|f~6mewoq#9Or1E1 zGik$N=-S~R@BbZ-?@s>%=TsW}9M|aqkw(qzZS)`^sx;^I__7y!sr99XV!=@^*SYTYkN%DM^ z%#fWyUr}9&_O^hUpA(^;tpKO??4H;4so=FXM)yul-aY%Yql7h7$Edz{sE`pUuLV~9 z!XjU+46<)8hBwN;8MGl84yIa<1pn>rBUOh_R7T?xQTh-bRtJBu($9`+fty>R?m~&y zb$s|>)>Wy+ijn#h67bpebEuel|&lxM{JA9db|2NRr`x;E8flSBd z#HkHm3p&oJftH#`4f_-w=>TFL^19d!im2)j#yDn}RE3nqS}pJeELIO#Uiym#TS{wv zi+QA@Fi86W(rZo9*ng5#V;iw$Y~SvOV>DTASjZZ9{P>PWYu5!UQl_-hVu~;hH>V1n zTS5kvsz$Q0>sC>c?$;P>@e1Sop^_^Uo5>o)L9Z}Bo`_J0mtIPrZck*_b>MBm3doY} z1D!-U$HHhOAq>ZIxq7sji1WSe|r9K)mKfGc{rs zNB!Xp_MbDqqylGeq8;QVC2`L7agbKFKcVJfcf**;w>rw`z37%Tj8b$nn5TJ4^s>6f zXq5IyCzSXRoa7_&KecDYxPXt0JwDi-A;><2Lz4)Sb-DI>W?rAsJk8RAz%f3eSySmi zRok(>kHH}_`}BTKo@SGC#LDUsN;)lE?MPcO7$xF(D6>Xy-D0fE#OV(Bo=1mpPUtRJhbD|#E>D5zEi ziPDR6BdDt*tKNE7CZuZLLs)06PJ3fx9A)3ggaM+LLP1uy5c_=-c=8e9r?>bBp%38# zU9tF&s1Su+A51ewI{CTd)?2Zme&0wv==TLe_U-&@N-lqhfmos#$ADLQQAdHF2) zwF=-{=G^E`ojpCJ`NDXw)If3`FRsI?WAt+pB5P`1g|L(3qytaYdhuAj!?QAk%{PKr zl+_=$iki$APO{AQQcWvq@Gr~d%84!9M~aflq1QwA2z3;>ve9lyg|t}Yu6|fD$-uS2 z@5KB7Ds?3;tqJu{v@dAnjh~Skm3(zLrK@V-Yh*0{Ymb0cls39tzD|=zs^Ln!LfU(7 z#5PEB$%eIMAi*wF#XM%ni~Qh3Y|!#ncaAZFMy4yFCo*?Dn)3~?oONdrU5_d}8L484 z6R{}jR4yh%k!`!5Ympx;4@{+H)oaxCZ0I%e%oQ(IZ5neWTq)3LDVZg4D--DJ(HGNI z2NBUh;;I}I8m|;5M*^Nr0RUXwI{jV9Wlx)EG zr2pkXwA1Be}|b>*%y)mb!V4s&YF>f{q#9bXfg3@dtVO&JQtjpdg4ufYv`qH~apl}{Cl z_q)mXg9&MfmJD#UAKOuQ4tkj>Uop2F^2cR?5h(^>gq&OwuM0McmspisLlCH^atg#} z7HYaa&odg$%`oi+QqtgP=ucnyexnA5L%$$Kw+rh4dL~@KHmjzwrdzLlRHV1L6~{5v zYNe00%Hkm%^K>h5r;$F0=!eTP^3TNhg&z>m+myJrq^wS-h`lMb zie`JfO@Bb3HWjkjj}5eCj3tGv9EK-V&L>E^8nV0I(5Q&3x;+}z9E$=+EaKA9UdK?J zd2Eh52-p-g7MRU47u0y`pVZE+`g?7Sh=x@(Q0A63j@d4c$;{;>hs5xe3phDYzA!R3 zni5Jx4mLtPaE;^mENgSti3da9q~3^@v~Wgvzd5$bhd~=ylOUtAKv`}#tg38gdoazL zq3P0+xnq8^DTLl7!Iadd&FA6ZOa(-CgbtD$e5|NdTcmC`V>1b+eu*bNLTijXi63R^ z`c5>~nA~HAa`j3w_ruo$^BI|0TfO%|M3dUyrzTUrvkCaF>ZTZdB(A}kQW1?8Ql+h< zTLr62*hMZu#2&n-$%}*zQ)y-XzJ_s8F}8E+>tm2hf@o4|8Yd76*Xysy6_G?vlz&TK z*Y^}IR((9LzG8`0O{1ijW7Z{PGZUGBo#BNh^_0Wl%tf|30W?{9IeK0%XOS*{EVi)V z$du*rq{dv&?BpsH&M94Q^*QeHl9)N`hS?FZ9w1%V#YWHa_rZeU$m&;4(#=q2?`^%QxO~pCtdFpRY+m$+a|QVE1qs z3@ln7P4gI@0ri(&US5vByzs0qwyK0rz_1h*NF006ic*NxZz}z>ub>U6J#f9sh$6;3 zs{!S(5@P8YP}I0pVwkBn8i2p8r}1dn=IPa87z?DWK$2>fDFZCHme?_3lm5n5M1INdzPx$2Rdk z(WSlwIUqKUjVVdpBsknWx5uFG!;J?~^DLOzpVURaHh*G1YbyaP98!v#)i1o5pj;-o zrmvb#6~N$;FN#~rzK=$b&Z+y}X6ctV30tcIJlf+I&4)(JGm{dfEdz}%7wi3e(17E^ z@|FQP>5AdxI{6T*k1NaHUF6s8lh`bbp`GaBM_9Toe1(cLA|VsXNTs z5>DG|W+if4PURpcn#8sX|2 z4gEZ%W!+iYx;PU;o4Gjuil#Gn>Y#giDWA&3ee1V?!oFPH_b#+*2_@PcLj7A4W%}Hw z_&vSoUUL)Axp-!sM~yxw3qSevb~A@{mQlTyg+QW0`nVG7aio}S5)m=9o36-UzS$#J zu^^V#QYy#9!N>iKClu6pv2&5n%x(?=kCTy5hvS$$$no-SE@vv_>EQtmZL_BaG zV;Po+S+h)Q={w$n9OkRm#DP?bGMQe#2KLq+(;fq@+55MZ+BbG5l&m;sExQJK?WLFC zp-V|@_$-|gv_csADGSa@iR)xhG4(upJ9hz+2)#~BIoFDrR4Z+vON>XoeRYv0FuLb0)&W{$i zs?(eVj4C0)TvmB>MP@Ur4QqzaCbl>$)CWd{oQ?n2-ggE?xi#wof)dPt2#Bx^fGD6u z0SUSlL~_nSGLlqs8iI<-77@u&OynSGkPIR@XNF-06a<7Bg5)r8dKBDxk9F>?y1!1< z=MT#o3g7jv>F)LP)4iVV;}dpt=v6bW>T=dUn(&>HfhLt!a;fBgS4wu&rx}|YIV2Op zV`qC7DWsBPTp5wemNZ=o_NFCX&Z*tltB-7XRmBEvL68&fi2c~VsGDSm=HI@Lug79- z+ZsL#=k-SD?5!^&g*o5idmYlOYoMFYc_+`os_A)ij3 zJg4(Wq8L}r5aD_;Uti+RQMjnQf~v-)H4ERf>y+Hk=@%&2n6IFG9g1N$=i?W)cp7Bo>RN76MDw=o{wfqtK>i3c1s7=oDB_n zI!>SrlM7lhH9Inss1ka+S-IL(9rY~in_e#qHkXz4*cPUywZ&!aqw%u4Q+0D-;=@Rz z?yjfX@Jp2_nN{HEJ)mBJAPD+@T&g>AU3cX2HLsvWcqw9B3;zcW&~V74$)haNV?9V=n+JcfRXL2C#? 
z#EY65mI=;T$6r}k8hbkTIT+6@T{474H;kKBNi&x^wYksvLPX25t5NLZTXrnLe^^_y zp*6^mn(WPN_sTQMi5wm*I?~Ty7QBo9#u?>Mc0a}D}GMflP;yJ{y6ZCiDK|g2%e* z<2|8e+L61j`>P8)cd1_8WtMQc~q2p@YrDUI`<~Lu(5a4+`QSFIb_kYC)099u|lbXp-D96 zI9ccHfx5{jUHtQTLGg+$iayfc?X;f7FYuZX-#dbwZJ%hUiC=4{MuyEplPEf|tcCd+ z*1D%UdWK8=ovRw7ai7g2%+cj%a#{8H1;xKzSKyaNXDctdVBH@?yCj-8@QJz_0yapS z$j+rXrJ2&%5o)Xa(uA}<{H%fUoqPMp__^w_Cz34h7~3D3U#E}K`}l|c#P)Rz#&N+xup`k`~1d)=W>ywI6wY5t9Plf zA1elVx$^-H!Rnn;$B7|wi?Rv3ITNF^dIcfuncU;fGt2250mmK)<^N6izE7W@kW3V3 zpKc>|-=f%+J5p4n3%Ztcb?qyr1+498j`bi%uFLD*tmDq+K&@V~(}le%(njRmJ$v_Q z%R_Z{Ka+UcrVt%t4<#;-bVKnILm%zz9BvfyKFYGPXGQZO;0Ed-diA_rW#=@yz0}$U zdnqHc(9nD=KHS}QxkxXF$xr8*oJB>MYhjF_>GB~IUbkQ{h&Qt74(cM^PuER^E|aos z0w09gOS4}Tnj(46v&dcI$AmiAFJUne-1E%~O4CTMB1?(aD7$sW=6|radIr=pExCgf zPZz8nLNcUATLy0TDln1XJwY-VyEW{xxNJUFF*QIImqTFgr|k7m<>tR!sNllcIG}p2 zVxJvd^BZtTWN9CAXupoWR(5wmeB@ICQusrC2#x&h_KYOc=PZA*l!POWp^Y}M%ex#m z0#`&5ELqPoMLofwg>NlsuH~>)sFkgKn{X6j^?brIU51R^cX!~@^H8qdb$$`^WZp>P zC60$O=)+Ud4buS9CDxt7%_=^g5pcZx(W5XqUB6K_*q23>k@-rK%|-FCuRJy70r>N7 zp5C%W9LgJlwp)tmVA-2JEt(Qgn zdD!9}JHJt+XLwf~YBKLTFmxFx(!%C$e*8lo&mjIq2l=F8bGb5HqF28#T4oDFVu1YG*8i%nvSE+0ZX;b?xF2-zOC8 za({pHkKV^5z)9%76-*3_FP#vO6BPKPe%`g*t|Dxw9kTTG(Ii%CV(3Cy$*qlfCO?yP zF{g=lwd4QHJo#d-R$xh{$e%{vJN%FoMn%^k( zs5}YWqg#H`SUTS-mQIDb89!6DZ0NM*GNrpeXQCMdu(k2%!r-hMD=5nsnGr+lGZm~X z78b#&I$bS#Y9WrAO;Ie{tnc>%%D;a-Qzr^Lf>fLas2z9aTX!wM<(Wt+kBzxHLDy-0 z(3(>Ca@I}@b33|;tK+QVb029t#@CkcdS4xHm{tC~ zAhA;-`SC{A)1+F@`I|AAi~PT6AGX8@p7od>i*-0&2cRe!cY8HeE%_Ee9T*rZ{_5;0 z^zkI2sDUz1*Sra(?3Fo}*C)oq>V5b8PcNz)eq7vR0@T2Rc1z*UJA39eEbSE^k8=dQ z!sLmO+zrYHSBgiTioIGbTkO@#ExT@)yM6$)3gPRwDxkl3_O7DNj_sYERW)gVD^=Pi0q#8P%+|QcySkO;qriMUt=g zzHIHB6Kmxh{5ID!xlY1G7(Nfwo#{VH1u8C*aJP1)sGLOU@u`CFVRx26$z5}e5XVLh zl0DzArTlKp@UE2}6n8BSy;($t{geUv834b3y=*g(q>W)tCs+M(XGV#o=Z&J?{D0%- zFIM4KN((%sYh>K+KR*4LB$WE*$920@w|VMqu77(4-xQ(86Kimn>B+Y9{k8(BP2!Ej z7d8rVJG}wFExcX~l5cm{2J7?N9MSKuzIqafJD)5;?w>u$zq;jrUi!c9_nnu z<^OKphQ_L!0+A8lFB$!AfMCl~5>)NG1YTqb+5OoF+lZcrqDU_K&F*);nSUi9ezW`S zMS?ggc%CIg+}IzV{Y;7Id0%gJY5v*s@vv_2ykwX?_-D`KCwiVs zl{L?Q_q^FHJcjC-dJn30&NYgZQCf&W6LpMA&y1gp$JhT66+JD(DF|N&_Z~X=h@7r~ z{4dVaY_fYfRisGibFLmz-g~t2#NiXMl#T3mLn|J93R9#LJo&oC01hZYuT2 z50x*+m2B`trXtqG$6<@3U=IVn+B;_caS!hk-?QHSV&tFS^Z)agb3Zuc=!4kNM+WV3 zuOc9b+9aTC*3{gbU+g?J!0WRrXVaacvax$8)1U%T))IB08flNYB_bl?GF>#T`*1%C z4uqB!`vdj?t#OI$;Q>}8>8{<{5Z7^=>oZF4yGlzhdCt`^D739G7mSkFHWsOR3chke*8M@>CKW_$ zFEJuOlGqj~%cUMIqaxmOijo8;HK~RE9MkC;JnPnq%oO~yd}MC#_{i{1X@Kn@WjlpL zqA{u8EERP5wNpsalNZ4d5j!YZEP+b2-cY$WOYO(Uv<73^Z<;~PSHJR_9gJhJcZ4=p z)N9wlI}g{NC1vEti@8b*-DD&s50sVid-jkO(N7^sgZ}dFX0`rop7T_%kmr)^vX|uY zpkqXAtj#}1xC5Cl!s2g^SGkZMT=DTUd+|w0V)YayVslKmA_sO5Ln!N=QV_CH(BC8(1?PRxVCL^Py z<5CW@T!GzpzN|>_E(aW?ATFT3Fp#drTQd4w4p$#?GRdQScCROu;ZAJu)AsY1yj;_3 z3E#+uFS!e@|5H|yM!Mz&tsbPEUZJ86XSr1~ABBdxsOvONYfm;JKDVVDtjbuNhLUZz zf@-2E5J6Zg!rd}=TACTEsANf-%D4ezKr(Gnx%Q|nRV_F1c_!qQ%gRQo)+6X|P#>V6 zp<84$%rSX@ez~Q0tjv`lnV1U@%$TR?$cA=myrs6nX+;~Aw;X( ztl{u767)L)&ZsbI^zAxTo?sJWh^Q3xYPig~W7(LSI ztLj;;tT^Y%?lUBFsKgM~_x?d!2JJ%5={@hwnkiW`t3An8Y1^-I^yxbWnVX+pwk&yP z*&|sUGO{PBlf5AQB|X_ETFy!SjI8Ooa;va75A78k9v~=r)TL@t3fT1@8HCgmsk<-U zn8f5j?>K^z47MRxHk7>xl~@Y#LK?tV!ID4WOTim-akR(Y{j?q?AlMXnFaUb;t3 z&-K2i+`d$l7=B(+#oei8ymo8&Ma9*&TczUy;X>C$&v2bQXyW%uqMAZQA?kb;(qYJ+ zsjE;flIOC<|G61FvlmWfNU`aY!8;4Q5QNJrg7d7M0uolE{U)j%M-#aw#}ZQFJSJ1^ zL)k48VVfIhCTR9hWBtM9#ERx#KIkXkm6j&W#qNq?Mts52VciwJ!cBiv&#n*%3tk+ zNNr!jt|Y7OB~{a=9h-6-WCn*nQgWVu9v_dlc(&ALx)8Fl#%JF6ULiIAF_vS7_+#pn z$XfokOC$a-{tnoiQ-7h@JwPc#+;dxV#c!s9gfVew%d?)I?b&l*iztm*waYD^U_NJ+ z5*`=l*qRe^Ql_~1IgA%NQ%}B#dbwx+vFmQ~L0hSx@4clun@dVct{CO(qe+-Q>djUm 
z@l6Cm&;r68I#p_TZ!b)!EWdr@|6;AvWTO4$YVG)v%qZj1;nJT7AqPN#se<-BDNb^M z*CjU~i22*kgfKw4Y>}J58jXky=re$ze&^5ApSbW0hvo{DPLYNeJf zSYSMLrHnKOo%;-yYMC9PYNuJmTrvtAh9rt(Jlpt>MLr1Mu`gdhD2l8rak`1^5S3nn17=ClkiM3c;@h~ zuyn0#A%%p42Kn}cwMF}a;*aTY99OsO_n95O>YlP-I*4;`qFzs;4B^hC-Vh{%#-2Nlogbs-_Eb~bdrb*P|KdnJXW4Q$`= z1-Ag2(VHi?L(qMFBKp`Oy}kO0-l{hv8nX1BB^=5WVhNvjf8aSK@x?H^p>hm`_(^`n zV%M4JG;zX0*I3=5?m(CcfL{38g+W4wSFDBj%eS)OkX2VwP*0{_A54)l2IYws*TZVb ziq37`@uJQd5Y4KP>I39KI)GYD=-r^4V+R?y{UatqP?mO%$;{4$3=I>SR=SS^&*E4znU;QdT3=j%T`_I~Ilq^~%w*S2 zAU9?lGWY(Xv`$7rN`B?^JM*R^#M29O`}u~P3vE%I3Z=?`c)X?-rCD2RpD>QFo6nm1 zSy)&UAFu|?a=F|cC7nUs6_?cmhlIVpd+4Tr;BF%Mf! zooiSS53FTxYG^o>*ok8;KdpY}aQTta6QJ=ZX-oX`&ru3Hm}9$U zbqxUgnGYR%fB$$VukR$PLahy*Y(4nrpR3YA=JCJv;eHu(=$|jHm`k)lJ4|s80f#PZ zgZ2S)a3uJ8(&F3s#awVv-m_JEkTQQruqFKeU=6kiTW3- zA06OQ7pU^>(7n)2^m%EiMR-UXEYZ)Ch*pAFC9z2gu@-v{9DkCU?rR2347dnA`y4n; zKiVpp*ZA$)V>?e8hT0SGcg*L#d(6kpRodX3nnGp4mFu3We zhW{c8g<2X7RT|hQJa{m}6ioUTySf-n_Jq`fZ&76X9*-=w7X+7Hpxj{cM4I}UDQG%S zh~}%&%r>-}B+Krr*domI=ELodKA0cZ%u|Bni@kWq_UKt9R296Y)v6zl*9xA;_3%{VZ)T2$N3g!z465Ib8ys^ zuYCUJ3Wu0##AK{P`Pcn9_83u{{&zQ3_RARj`cSFFeu^ma)P7mic1uIe*~APvNweig zhfY4HMojMQzP1UI2Hy!|-Kqtij-SmgHG#HGyAZ``K%cIQf_Ywi{AEV&U&47v(lWY$ z27OmalERMAExP*_-=9<6qHk1z@D}d1x>={$TB2?ol@Tfd>Cvj_0v;iO&{7uK)xJI&z-F4X#U0@tWiHBjVWIa z;QwB#qR8AXe({z~rbl;=`_Ygj<(R-rotV3EdpcA8I!_nJQOOjI=Z} zdv983)$yuwiTe7j05=LI+HjasgJRvB-NV2Zl%(R?zNCd+KFT=K2vM|;S*STEXzfd+(e9r7A;=Hg%Cz3; z&?;cS)p#nI)YoP{z{sH)dB~gT${z1X$Nuo|0I~)j{~Ohz<6%E6VC5wZr<_Wa61_7o z?~PeCI|1vE1C*`N{2caDN{ffXUu-TU+lTu5$4pq^`g4vjUVLKj+IS|J?&yXW5c8P; zf_YY=qob{f{V5m#8A&*$aBe-McA-56_cqe7vCxJ^t*Q@5`fdtIK|0?hM~rh{bnR6Q zmD|fOMDmWGY4EM*>=fr;Xa9BgFo(jiw{vRr&A|Vsok?HptJo|Mv>)tev4qSK5@7!I z;d=)LJpn)Luu6cRJ&1?P%z8EI`=|HyjjIFp`BjRH$I+hSC`t6S2E8wRXLAe0w+4;| zA?L-c_n)b!deS5HdNaNOj>_v&lEj#Y?`M@5+yvjDARYMNl93|^IBtgBB26bRiCNES z_1hOA>&{aQ2YbpJBY9_v*NOA{ct*Nr)bK(MSR8fv%G0KaXh4uS^cw0D?o5a=;+lVxK^ykr;FM2o{Qx( z4`h}$hKWE)bIOHwXbaH;)W62(H^k=9g$kLt?~Ww(h~mVqh@zsRGqP42ibf}v$NT6P z($2MTB)EN`!>J3q(K}0O^?6Y2=Y2m+Cd~e|4Y9=E^+pZ?r`-FJhsQlQ>!mceiiA1Ssn;(5MI_yluE z;E=w;GtBYV;d{C4YG>%V)G1@E*_YKD#g_+1_}Ic9GsZvm0C-Ex>|^g3Vmh`q#wAP6 zwbhSso>W6caK&^;h^~B{y0t}Zk(OS@>d;-Nc3#uenVpL%yPH{j9hdhw4K?Glwf_EZ zD*1#50a@8qyYK0jU;pS|oW8T{fRhYw#o2aqa)`6mt$C>xBV<^E+Giosx}NkiBN7L8l+I zzr1+g7C{4Pt0O#?ik&NYn(d!dRPla2D>xZKm!aPI1EDVp@`c~uqm^&s#;y&FF-sftK+F)T`moPg%tDoh4(vFWa45E`w2GQE56l zI#iaakPLas6Fz8KX|C4aq^3@911{>{J>YaF9$XHshkoqKb=_fxij@avtdjXGTaN*} z(eAXtzbx=QAAh{MkGCOYT->4o@ zzMToVOBMl&oDQ_?@{dY*gXG+sxC{IB9LngO8aAPTiHEVTG?T7hJ5OsXx?IM55Zn*_ zJ^Z1!Q~_<+48Dgs5yqJ_$B#!hFfqP)QrMLYl6afnA}jj+DQE^QBrPmQwUg!stmCs^ zS6cHiz7AQRtLu25&=|&TRPLoJDS8C5h=2E}Q;UdovCOU8SeWCy+w8D$@3%cLNl^ma z;b^oDD19?F2v&Q<3fT=l!8Z=;m$+o6r{_otdb+C+d8&JpF?fAU8O>Shk35E@{TgtZ zzg!(QetRR_-}jBhlV!h-2?7y{z}Ppd##23~amZvdebz@|CNUB26-(f-3(m6-@f~IB@WMON@kZB?-sNId;8#t3zfo9>>iM zxT@7i>j3cRG(~XZm3-Dh(fUd9j9qW8!jV(%p_&beigT5^1_DOD;r(fNC~Ikh3h%)g zUZnHMJ90fi+;c2FDKV)rO(Pu#hh=MFONBe|$IZbQ_Z|(cqMlpHHp+|wR?ldZi-lDx zj(g2shVu9`J#?mCVM^wM3Ot}ejTGf74ct(L&%5l2h(=|NBs{1_`p*Yu0d4im>s)6A z>-_4-dy)3Xr-!Nc!WoU&oPSY*XQIi2+&0M<1LL1(-8paDyNq6>Z0JJ>-iMu|Ah@LYGv6B%5 zUWz*ew}eJ%@3pVV_Z-#Z0ml|MD{-IoO`lWJ==R>sDw)39Bo}cm2iUGC3!`Oopscgj zw?={`r!@r~u$PSbe2lW+^}+VU_@{6IyDn&O&%dGE7F&us?ml{Gsw+8I?=%Bz_oB1g zuSFlf1={+RESiDBudcdr+ZVS#9a6rn>L*T5!A)y^Yf=eWT$yg;sj}&1X=#a$cg247 z50j|Q;VhnX!9tb1k3u}!h5A@)g1Fekx8o@3zEoVZk|d!!Q&ZSEs4R6FKPi#@Do2do z4H*9{&upVcT=&sr-trZbcUizuLJhjetE!!dNC7E+Ci`%?X6A^%k+fQ+p8Kj2e+%9% zF^Z@livt4tjgfiySuLTUnzWqD-W_)YrwD*DC$#2v+Ooe#G;1!P(qgbj6@O?w_*s#f 
zyKu#Vo*D-FitYUqCPY1k0DK5PZij5Zq3?l#scV{;m%V)!%);b03?=JJrP+iM$zV?S zm*|u@SFF1bV#7_Z*vWcuIC;RpYp#aJbP)o9YgN-pnWyiUsS#YWfax}SgY=jfH`Ghw z;*P~$Fbz<6A%THe)yW!2NM+boQlrNo?q&Q;mfRSiP{1jbmLp^2LvU6*uMsM0eVCJV zqgw>)HWGw{j&lDOsVBPj{~HGU4j*CLm4ClVWcy}D&FMsWX>&z;RBjD0(hiUh=88H` zYKL=a`2TIizI4H~wjXjcOn(fu^3pU$tUsag10*&E40nQL(uUKDpPqfM3+LVmykaPo z-QfEUfi5pFVT_qfV}^Xvgh$V!QFgI~g7KSS^E;8%4-b?KXY*IL0{Yrm)qo&GJ19i$|xe!LsB&iZ-%_updx{-=Y~=XYCC zI*3O8xzespETVl;bd3M?y!r_l?<+%${Le4{+z)`z|7$mF^FTyI?Cc&w4uS;!$X-*t Knsvq4|Gxlx(t_gv literal 28042 zcmeFZ2{@GR+c&N;lR+`Eh3rX0T2NUB2?=8_GAd;a*>}1nDOpNUglV%)sO-z|ZLuYc z7-B4iu`k)zG4DNNsqgoBp5O64@9{j(`+tw)Ki$o7nESe~>s+q$yw2;q&d+@prLVP{ ziI<6vj&8S(_Gu$JIvA3Uj{Yw=J^0I~V$m@=x*g4^b0(VLpLBEzbab|K3JP=z;CC#Y zZ6ObjDi6;U9#vHyRqz|&xsuAu%X|Fzaj+e=74>7Qpb)EIYpY-jeitgl#wx@T!DpyX zP(N2xuYebzdL>mA056;#_2Z(a7p><4URb^8+AH7ys^A4SymAEqt^lc50Nj;S@anLo zQDdt>2aaK@5KA3|`iTYq*~S*y#u9A{!HWWR=i(CW5{-3!OE^cEQ#TU3sN#StxK!{0aH+IM zaUHmhj@oFw+Gv;BTJUK~ErrsNdZhygbl_4uaNtd_Z}7W=7^^^xwIvpUS1fU(QzH1G z5Gj-atjj=c^gt~*GG%OR3>=r? zd}wktE?Oqx^E<8jiL~(+qoc{~mLun;&MFLDX?fJ7_|0ACL4TgiU9F40*QaXhM=Ira zbD*<=Gay+zI8ejE6@qt*SF06bD$te8F4slR8Y;M=T-}syq{{ot6ap>82 zc1i8Q=nQseh;lyM&KbS?EgY{!viH{{=DDPv|wR&Q-iph;v_zyi)rNn8;O~t2 z$RF}V6Y}x|DOFRr3L{d3lU3pBcMX^p6aP3>Hwu1IB&)}DDXKwh{?nh52)RLDu-Tn4G{$5~)0AverYLtd0y{6W3J%kr#r z%`XS0SQroH?)pO}4v&1=^}PDfsvN%g-_R_t%O(F{A^k#6x9iGKkD_rxXJ)!VhnWAP zA~wmBM#Gn;z}becYmu7j5S>39OE90YoKad#e)INGJN%E%-#sew6@2e+FvE(r&Q1x^ zh(e)7e|P`?+_e3_WK{YIl%=QyGpt?3S_*~N$6V&lrwf-desnDl6v(VkckP-z0~)ga zV}o8a=X8PyL{0-X!~)?2ryx5iuHCGuV*1DUcujB5Ck(uG7w0oYxCyoSqcZG}K*m&$ zwVw|HNPHgZ!2sL?Rq8O)tZO>$-9++`UjKnDOP_68_fI3JKbp!-X<|V4cCS}IEhWF~ zX0`5-p3Okqc=y#UoQb8en~)^9HkxBJdTy~a$un`d=Amk%-qJ+h{oz?Vb<9+|-Px&}^(x(tMMWR3hVG8VhLI7f z^*$@ZTGx*~E*dfH$sruS@LGUm6)^dso6tUEXu3w{>^8jef)+7riJITCJh;+)!Ueh8~bLCGV zk9yaKpQL^e#s6$RWSYWKYH2@{G!IcMvVHv3VBg3lw-04uiKLp4>44naFl3Ysp)Bj? zRJ%dSq`>TPU#NnT#p|prQlgZ>d+cGZ+tRLYvhE=@L5H<}d2S*csTzzsnRk_`6_xGg zGIyB#I`oze)}kki3ve`5o`IwY$09GPb~RYTq-0IX#-UBl!iZ?*lqeYYl?S8>MeJ!Gq{1CZK z8f2Su=o<_)9+Sue1q&XSmHFpBsmmu^1$}}tfBg8KW%i8T3*Ne^0ruCy;zn+#A{8?(>ABU{Z0k-C&{PQ$MnB>p z&SKYIl&vV>1NWQ;s2`_}_-Qb%8q=*Rs!42*87P|c503#u7)EP%R} zxL+yZrPOXM!1Z+uH9?$Y9*Ed1bx}Pm=3`#7zGu(&pS^V|*5kDr{~gP9;(1J_=p@_J zuH+qIOKj?9H^RzP0wIVOS_seq2uU#rUxkT=bcG@qLqU1LE}w!6AKJUz&lRSPphgMs zdNufk54D0tLZ?H8z*SD|Jd1iQ)0y>RnD8DbxGXV!bVsP09hs?-hCdLZ1{jF=TZs3e z4^&;$BA{v@tD$fAg~f8-Uvpzt-wy}vnj?%P6!zxQK#V0qMD4$W1Zi{3SizGkQ#?CG zWz*<{r9g7ihz*^m26a?!kZTD$LV1t9zX~JXUL?l&-k_jQV}zEddcg01M@fX|u&OJ9{BQDefNMCS<2+inp!-aO zz@jKU5Q>iy8vV>hV{>bKwu5jLs-8UQ)NSrPAG?K*2GzO2Cm4j#JwIB7pI5Wj+7^ki zETr2q2tKz2PJ?YOox*ysqwrg88!T;gnSW*&mlb`){vRUwpKrW(?!?I1Ftsr)=Un*T z$c93B|Bx5LKI$gY<~OVCkZd-D{WxeK+oG3)K%%Hx`^_Q}^G-DE&E_N+$$&XZ;SK@{ zIx*)l22nnbJ^R|qzxXqCxd=I0_LB%7ZT(zUN`gQPRegR^6mf*pU0S| z-<80(3=DpnAmIiEZ7MfaIz=Nf;Sj#5N9P=%a;h#yIkl%`U3u&NH0?h3A^hxqzcKM} zWK(2nV;nJfv-d$^JhgXo%!!IJ+!3gMVtG=CoU3!@25GXaKpPzH`3}->C1Yk3J>5#K z)A8Fg2jg6XCN<#w<6bgT#c>cKb~RUs+K<31WYz6+lXZSftm-F1?fI#9GZ@qtRqp9w z`V)f{OE{{UW$$6 zVU8DqBmo!sl=>4ei}eWPssW;aMo=Gz{U=vZru-+0&g}LE^e*+z1-X;^qChsrro`~_ zdkYezbcHrZWZI9%u8Qlpk3IJZ;DcBt#JugRfkQj954ZxQjRG#oJr|M|FCB$=S>|)t z+i+|UV7wPl@hXpUaRtm+C)FTwZC;F?ivtD1_ydoWt7;=41V7R%r&$J@dkq!o!fA*M zud`6cr$UZYr+Yt)ZP>xRWkrnz4UrDw*2h#O8eyCwkqvsKzapO-!m`GW83A8##*??1 z-YieqzPZ$sJ97vE6TSlaS|Ny6@T2qTNUtC$!hSkA0m)aRYUvllgS z#`%15;zJMuMND>SrS}C|JUb-)QIX5o@~(5w&&`TE-vZ4mg6(v&I=q#9qQsJNaPjbH zrGKHcYMrK~774O&w}?dP6rcIfaqIdR8gsj&a89L2(PzP|_sxP;d)s1dA*qHgss(w#LKtP; zU9V$3roFJcx!JXsbLWL^4h^bWDz|j{o*x@^xNei2%9vANIuTh=SSm3w8d7Dl?0W2- zT=K=! 
zU9XHP)iIs&h4cOdb1UT)E6dy}+izs3V{IM6-eO{}jrp+R3q8!ti3o44{D#MI5#5j5 zD2Tzz7k6FR?ZdO7_mOH5+-j{7iRM46exe@_zBDY-_R*SO zYhYfVSp%)=FOH0NuY$aU@%~@d&Pvva4Yf!~d7E_#vcE_1LkhaglaOL+HfBLP zpiJtYL1)-p&YrnsdleYPmO(9qJ>jwh0rEQJu`!Iz-WeYRb-Z;QZW25tjc=`ZsR&E7 z+CBDVFUiwj)8R3R;IRR9Q9-@T)`9(Ec$0SXQ^SW)C)A6O_3fXc^p==R8pX~SB_NX) zKX+9eEh|Fu2lE+X7Cf0{f8Cy&gjPZy??ZnKd{RqVPpvjl#axemr)n0ZrV5nyi=pbf zL>Wzl_r|fXXLude!z}EpKz7G@fK-EeGR8E~%hot@1?jn;R(un((AfNWWU2&ot zv;D>L>6Jb{Ye1f%pWEKiEgETm{i*>Vk7ZZKRGC91?e-Q2572zG8iqGC=ua z1r*Zt*@gjXEO~YR8Ni~N&0aF3us3P4{arbm5wz5f6P_PG6!dE}Ay z)0k?g;RdG)z(PbiQnBkk5~e)*fZdSB9GidFK3$$Qof(Dm#5W z!oD*tg#%c~y>u8aVE1F5C;ZK`t+g)tG)CY;1X3{`i3vJ%9_|^T#=6BI{|=zkoJ!n| zv8>H;;1FjmBxVxaB0jtPSE=K65JChamH<<#y5Hs`ZZ1_qcyB0l6wL{I;o12HdmzGs zrV@Dmzv$Lx&oF@;x4&c_RYn>~-rsacaAk9saqkY2HU=RAYVqADffUCMZlIRnFZ?$9 zi_*rV*EQTB)(6xz)DZNz>#za8_a_@q`a)L*_Nz7q`VXxvW)bz{i2qItfAl`*l=g4s zrRjC8>D$43XY@^EH?~?7j3c|(unYO*k>n&*e{r3Pb=2Xplzkh!fiR>Z0+HcCu!mak zJFA?^)DMl%73Os8Cc!v;PMt{VrpSxRCGXQh*fWF{AzS93z~QY51uuX5!$*~vgWJvD zHYq;Pe|+-L1Mkf(P1YTnSf;X4;Ep+iJ_2xkU{cmkqnvxaPRE<4g`_wmw;$A`5Y$ll zAC%u<@(jof;%xzA&y#cIOGgeYo9*9Pyu|<6mYEuTiq~5%f3OwCPmqcp4y;(tnEGw> z{Lj|*Y{S2P@9gYjp+S`Hb5j#zJ=bO4?kIEoio0kaN1pUjJqE@j)G&4*7~$T4j+SMK zhbCf|lNlhOw7y1DH@;Mj|7H!0BRatKAJi}eGH*p-S8E`wHPVAC-VsG=hF}7T#8hDY zMKVef|8y?=;6(5-HGs;Q&nmJ)L5jy|ku&(*O3vur4K3V6sy*SS?u@;KP9MUf;c#$e zIs&Ku5FLs-cT@lZDbVMCyFFhOnADF}`BEpC*I!$VB8Csbt2p|Fg-~OyJAQA&UJ7v- z-@kJcnQhGQ=8_nGokjV^rgJhx+}oUk^uM>KT51Q=_Ok8(EYexdZ~45-aM{!=f9<-( zMr|KqfawY0WdauTX`6bUQ&4~l++wRrhQVP9dqO|j2M;5POQroWu$@Ro~ysgA@vhbmOp96e3*&uma? zhuh+(_N9pt+NDm-$7tQYiByD@E^uYYhazKkQq@Df&u@xxnkCYkQ~flYHiiNc^N;{C zg%dyFvMFuqIJcmVAN`p;S#K?Lt#%$@=|Z#7@)6Wq%CJN-H?Y!hB#ne zc>cT~9Z(%ZosKa~JAei5VSg#0l(HLPzqZwO*b|b?YE_jzsJFrjmkp+7pQv1T3Ic=H zfzVQc{dh0AP-H)Ct)ntQ56|c2mBhzuf)2ch?T81<*8L063aCaoW528J8h%!`iFm3`uEi0uZmYeqlW0 zW~tSd6z-6_)OS`&=3O472}egFo>5_w}2RaW%#?td~o{Zi9As}nHa#aaF1>W zz`~v3w(wp9vod|%>y8|7k5o9nYu=oE&dGwmL&5t>8PXYhFNp~l^`6r9(@KOd=y|hb z;qd+6i&8G9-gf3o)TbsQECPKt5)t`pKbm`9zBPw}g4_i-*4R~__xokbp5a~HqKhH% zgat#SoAS-#>?v&6vw#clotn0YTdiC`p=@KY`B`vbl?@iW@ZJt~$MZ%8(jZP9?0f$O z^zz&3TLu9!A3@a%;V(n1a?4;AGS}hxg|1@wYImZu^IOMIkPWlU zByq_3GbDMe^^WSDE!ppqaj?=HSm*ocl?yiu)F*f6pGzYbCY}W|Av%9%LY99I^c+Ii zCtra)Vd;GI(=(E-vqiION18)aa$4H(+4fUS?zSwwCd52OlcFf^nz^##qivGu@6U*S zw*NZh0^ecCeLVkb-pEN4_p>aOAo^x9%3~mjVk%!pSKH05wEjUm4N!AgK*NpzZ|CwzaN5xoe@vhI&%CW zQhMnA#q%?SKb?g)ji_ES(b`CJRo>VYD+3~XbbtcTezl<39@o{=x*>d}u zT|Yzag1~K@2y%G5MIGjN%5Fd*Y9K{P1&~iaDE_UYE&lu>r+pBL$dR!#L$J1Vvd@14 zk#PVSiF8T~l(An#Ii0(S1Jn3#Y+1E4#~m(e=K03T=#W)FxiZawAzHF`pCZIWtJl9?Z&q z8-RG_5D()E&*SVlGgLzn8AnH@@lxTqFAk;1IXU@G%a3PYjz9{I50?eobfq!2o;PRD zI9gY#6nP{DX}ye(p8lS`RSU56As44Ly>@80kqV;bD{ z^=$T9FfwfP;#C{v>^6=JuTT|;9$&kj9Aez3*10xnI{1xuT${|)tDWlC%VTp~xsGhs|@Gv=umqCv(YRIPI%MzqO5a%fW&|K*PRMK3x zy)QViCWbV2KJ#W#<$w{H&++&h>6Q14vYde=_paBe3W2$C(V~Wm#V!Fw>`sVk1_u#9doYz(GPtxl`s>tbWpqNN$T4BFV!GLK?<eMagASv}I#1K0S(QB_JI=6LxgPir~II}9m;-NcMJ;_FS zShbrkIe)3=Jkrv1G2qm-h07wP$lZSJMQ4H0{e1o6HZPrQFg@lZ+k1!49l$I1r@2E` z4+Q@)I@O&}%|$5#S#T$y;DHIdedhs`2E8}H)$7We*jyAp3elPW^wE(%=?l8=djs;k zIm7!-&#K*)WZxDJOH#vN)>yvMsid=j*_c0*tcP!>_RE!(%J5)I;K#iT0WcT|11;aJ zakx$=7(YUFT;&N!~6zui=O&4WIaEMPcnqNSQyo>t-r5oLLAk;zKF;;|ME=! zl-BG`!uR-~DEE?ZB&k$C`yR;|qL*C+CR4!U4?&pv9n}O$6#1E^Fc7lLYI!1!+>{ut}&)TZd}6*D7?*vbv~_$$ZBdW4C*yr;Czmpk&!| z7U*2R+X=Hc+)7^sZl^k=q?|f7zlI4axRF-a#%NX+8XhdFk9LGu`6oDW6a6B7S! 
zjL|^sBdTqZc#YA-YSc)1e1YAbWLQE!+QFG@JBDjnO?6cTcB*9@h~T9LRiQ`3KYoNd z?ootUMBnqeex?;>(KC-rEp_sFQKK*jH6$K_Y@8B4gEO;bkWal<$}EwajI%0@=f@+4 zZ|*AwmmJz5;Mcov>3}q8NCU;P=~5RM*~L=BK8~p zt3H+u47;Df4Vh)Zpjh_KeK3PEcp8a;dtTV4 ziOYU;{re@pz|^2SNbYcI)JlEc9Viq~62aTh;yNL4j5!GBqU*0+LQpW<4<5FD1-AkX zt0#PSA|POb7(6Q-JRM<#Q8=H3A~Il`;e(x1D>SN+pduLoul# zZDq=gT(o{RLYgM3lhj5l-!|Y+5kl~aMA%-=Z zsha+8cqLtOeoax+}0+XszV`^&2}~hl;{qCcPqbj7j`z{!HBAKBe@ny z+vY2{(X_O0yUB*u;I01|G5%}n!j_u2g1!QEOQv{{M9gDB&iY{~MC06&?M<>Jy25kIfo z?F}zgD)&Y4;^ND0gDr~fmFC^viCsO|?Oif;ruM_<%1?GG<-YdwGKkAG*glPH_+@H29+)U(UVF@1-JY*$#t z#dKLgLiNKgL2%j+iv(Ol{8ckk>cAR@V4BI@m&T92cs-ME-(LW_My1;f&-v6mX^g7B z6~!OpzFS^y6py zK0O(*1jFmiEbs1{t6%z%{?DGiSz}0o_q+L>D-Nx{ zeM8Fq>iuteT4A0WN$%rWad1~$<%?KO{n-=rn7UE>$AD*iIaRoaGZprAE9ZuouKghN zh?+3KFKoEa1LHeue_ks_&WewkFW$2oGthTYkq<^O<4{+Bwbfp}=5JRr@pbbQ_!BHq z)mjVl%KW#>;E}>dt{pqj@H98je(gtlk7(cwlR8)9wnKXG9D+Mpl9D_%t^k>EoMOwO z50m=sp2lVH%wEgo{!fR?i|-47*;g=n-}ZKwA>_+O=#e_?FQGX#$s^^hsy9cd=j0y_{U&z~+-F>&9gi<|XZOXnEd5*ke-!_A>))$^W(fOa8?1(sI5k1*&cxwgx->7}EpJ;1Q92AT#DGhnJzer%^AC*q{&TQjZnH$Pp zfd`yZCsrP^rqG2AQ!^^}*^5t!hJo za@hQjXQye;-dGq_(&<^bD-8Om^lIzyi99RonGftY5z3+V0-$3&1#d(r_byE?d_>UP zJE6Sb^;XdpShztgx@vRb$to{3SMmyRH%fs9sfn2s)EVyFyd{h}7PLX+M7%DhPd!bA zI4>rsv;*EO3RYiGi?~46azydbTcuuL-3GPP>vYoL8C_DOc_jF-u}}nyz7)z_?Rwf9q9zaHe-MV#j4SFasGzuXY6H z?IIh&Hg{4Js4e8CC_RLlG=cwXG73ikj5O ziZP+D;b2;s?sD5i=-aDc-WdYVbKwpX6Fn^g*3Dl)|Eps8KPv0lZ-JY`)ctkYci3I$ z^IXO1=C5E42&-w4C-e=fTHX^J5IK`#*N%msTKL$aND{wUJVoKy@A>wfslssHaS_CD zOOWdP4Yx5x^60WeXPv>G(| zeQ$|&4&{S9(XqndzJYqu{TdwMu8+?2G+)cqZEVY{uCƜuI9KFim)(26#}z$B%v zNc8GrfD9#)wX9dFquqGIraryj&Bd$F>OGOC$kI=pl;`vC)e2m!otzpP#v)&MmVxYD zy#`U1`9-EIe1!Y?r=R3)qI}4zM9NFV1Yp^x9(v42dd=W&r0jPkzv_%0TQ|^9XBfGZ z%hD)=CsS71i!CdYO_v8_yelIr)^8XIESjZvi&e4V=K_ANRr_FGuuR3=Q(ZX#P*VDX}#r1P7lw&E;=> zQUT8|pxr3xLeFn`zI9~)58MVS$ujyml}>BV&aHP<6%usVzO;v0`l`FE562)!e!N%o z{R&Yw_-xA9SJZIT{I9#&`uFf_uN6knC%u7hA12-_o-9KS1;FO+yf1U4oNH|98hr$B za6VVc9ld7it6<+-?XGaL5?SwrJ%HOO6GHzns;?bAze}W7)p$Dpy4M#gBdZdnCiKBW zBNY_mSiG9T_q4TIEAk)UPO>57HA)iGErfbJ zcM*~@Xl+%ta6DwefHL7qnFu=>hIs$17`n1tS?Mj!J?x5JEm2K+6oF9Z+LM5+44LR` zBFD(Ae@zWque?~c&~$Pqh<>r%2O*a^wFld+8dNSHFT2%L89d!NrC)yYaQ0>Ek`YM1M1)#xAfNhFdn@D?}z`1yJou4T0hmGDX5k4<1PNf zPd2ZI5Kh6XH8mYk@nzWDiNsP1T;Pe-spGv}-E)S)pTHgnzVVc_!&61(l1)RmD&s>a zRjMhohrbF)o|N*u;Y&R!GU<3c`zLb9tZoZAgdKP}zF;twlQlExQp>`oQX7!cy)LAv z;_ES4cGPK$guZ>lCLk}9YU|9Zs}1W1SAUZGE*kd`%9bZmE36)ycgJ?m*p|5a5u3rn zb7ua10#rnhsmZj8pL9syqsNOaMYFDX>Z*&FVk^$l(y)OIB!P)vYVMhslO{LL5C8Z@ zEeg)OA~BWqm7B5@*WQ{bHUV9)$>o0OYF4=*A;YLiG*ppUPYn7YsiARFb@p{GxtaK=h>3F_7ww6|!1^c3mq_jYYNMXMk(LX=Wd~io z4@fGK<|Ihj8n8RJSy*B+;xsUQEvFH~zhu8)9WT6cThA=N>~kOKSC9H|`T4C+qN4+d zVX6bU9kQNl{dGNLg1Pee8P}K!T-3~D)|1LNnZdESH{(e`8xSOu8`?CEPN}fYvF0_s zC6X6?jwj5xdE2}K0ZB3oARy^b&YSvN>E$`%q1h2*?*;) zRCa@EN0saB1lO{bRR#phD=d^(!`BDy%@}=9qcPPgj34hI z)%4t*Bd?LQLVPAVxtH>)_s4G->9eWZ6oU@-#iq)pRJjAw8^FERnad)SrpBaH%DfDD zPe|L7(|a~tilCj=zrp*}c?pwcC?k8Xy~bVry3y%H2+n(g*(PQ674B--#!vgEPOmJ= z*veE=xG6(>qrkSWHVGJ_t^?PY$RBGBNPjq`0>^G)l7&V_odn)`y|Tvs(1e(1 zI1TGK;CpB&1pZ6Kw^MiL#xI|1Iv$Yz3xJTxhbhy1GTCNLfy>Gixyc+f%bmQxP#;QW z>zqI|VhPV+6nKE2ESXw?lkAhxggS;ZHS$9xL5C>*MLZV-$n}5d=!mJr-BDx9;E_D( zgP>tt4*HlkDWiIEl{1q42ZE_ifG*PNb?eimn96sOl_B~5%OKdX@>fGQq^wbS zme0;EWc}M{M{g517Q2B$l5LI=3wf6Gi&(|eJk%71?9FX@)MG`COn5)#Gbxo=mN%8u z(o(ZW^5jV$^UC=4L#{0^%dJ2h)>-8*WeZ)mK&MM8st7zGxH9xEUJq=9g~0kXjsGl9RJRI4#r?-o;f7Q@wn4;qCUiL zVWeo5Ma3d>=wunJV&+H7)Ogo&bw`b>T~sU9zl&T?U0lTL#B)iNUkW%HO>7A;3ni^e ztV>9ae>mk}CAs;i>MbqIrm1>1IMH@GeX=A1h4tK+17DnJh$lY z8n#-|g)!8a6_lGwLkL;^1282G%zknry 
z35ahRB9tsxVEi`$P6t*9S~2hQ=3MQDW~7IZ-CCH`@_5{7|F2CyzyhKCeSiYg(PV=L zhZj~1iK*d|^{aJG6GO=PF^wseS=Szw$rTVl_0BZ=B*Q2oV@m^P$@;JQ+!2H?TG$m5 z-k`S)eY>gh38W#XRdJLC0+Zqne3bjq2QIq_nqq$)C=^Z=aGTifa}Do&|9L1Pp!{br;wNUs=P4S20tUSL{PNVEGyxd=`F#`snk{d zHp_V(J?k}}6~zaZWT@2z%uoqi7D+A0n^l(W~(i#$YcOqBF`){3kU0sHA_AFt=N z>1F-jtbL~*kZ`{`5@s^nRtXL_MR#-V+m~ll9q`i%LfBMc=*;>9s+zAZP!~ngXLq^v z2e&s-&Ixg>(W{ax$DwBjV?bN9Iy;#YO%~g~{bxm7P{sEL|1A(}*>^}|tuE0X9tC>% zGzcR30rv@7f(o6cRmJ>&A9(ej#(v$?#*EFsT~)H5_)a~9VKV>VS;k}b*R9POBLZ8v z{tor?e}I}${FjIShR0%u{MR>nOX|A*vC)9#2;4+#uL5kRrbpdYEdcy72<3_elf^W2 zSVr^_PbS1ik!`dOBQOfrNZ){*Z9&NX|3--aMiDq%0Sa$dO6nenSbYF&fNJ_S)CQ`` zqHKPy5nJ74zG(1ppiQH9`$%PrpnUH!*`v^vIsJ#E$E?b~s|gi8;w6yqNc^`q*p|;6 zAW~)a=fVuyefFL8=E;grGu#t*20VolDvq4*Pz0e_;PK46A?oqWkoiVI;Xwfq|Flz| zPW=YdF1ImmAUkz0n1u;eFmPGjtmsX=gLDc5z?9y;4;|Q9zTR#$*g~(Tp)#?k&}8Fi z?gP&Vl50HWQ$&$PWYs{ zQ_)2TC-<=u@gWH%YWb%Z6~%ON2mAvl7J?VF({O`K8cd5uCYUb-4fp_(2CDk_7b6lL zjqj93-)6WB*#K?rdPW!ag9^Svv;#fz68`)Jpg=0?%(w-!I% zI-;-v3)>p+F(k>bm{*|1p>Z%+cS+ZHz3%cq}X1BqIR7BoJ zZ`;U6O)>m@EUT%?-_e~M$;g0&rMDYyX0zne_z`5{TJtAr<@v#j+h=B5V)`4Jk*21< z!TdI4`L7x=v)e~nuv&uU;?Hkv!TGAh@J273AzoI}{@(w zjDFYB_jj(FnLKfKknNrNv*qy0lJkE@vPuL4j_)t#vJD&l0c=FVI+vqS(9+Yt$CNh2 zbYjoR17?#PGJ}?Azi+O*ru<|TZUq;x7UVY%V(|R855W3AR?F^0QRdc`J8n##GF(uF zu)rTH4>k}2Xdq{ieb^tYuw2<$gnaFK@7BdshJ?d-wBSKbZA`#hDz>9z!$|{B|BXV% z>s^z~TNCWy>PusF)fs64?pv4? zQGB}3L#f=rVPcN}M6ROyHCXu(>h3l1E8pTPA|trVLQ*UBOLAZr$8d1&Wr?tbsQjEa zg~MFK7S`XM?z(Kf>=YoGdZn@;8{YW*r`y)CDigKJehaPHw6Mb5TKLEklPJb_tJ`JNI~yom^0a)a@HY5BtND!U``G>7BH~thnAETyc=V4-OUNXTK!VlZ^K!^;n-h z&XnJqIp!a(=9KO1=N#cC z{CD{BIv3R=vurtsg>DIK)GUX>=al>kmi&o%IDNqJjGJ`xVfIXQI2bj}EcoV)489^5@P5&J6)A@5iDP@tn??bjfdiqv&36*w3m78evxL z3%ZpoBas2FD5+~n*+DPTke25AR*0_^D=gtC9qCD}owQ@}R*APr((Q9DlKm3N{i}l> z&F&AzHHC&4V3# zi8>Mfx2rF}(WfE8Xb><-L&))|V;peT>qS+20~fsV-btIKj{(66v z79)fnzmS-2NqiA}E&HAHJMJvK*IYZChSadx@sMniZ2;RBb?a22ZuwV4B|fa)bKxu^ zcm_lTI%>+Gu=!!gyke+@^NUlFrz$$W3-l+x80X3`eJT2=Q@cPc%e5-9cV4%LR4-mV zXsl*f5uZFdhUIIc5>?zqPr#b2KnQAYk9v8r&YfZ+zV5t8O2{sZE?jK6=Z0&WZt+)} zXHPf^P+_me;GRPfCW-FNW_j`c`sXL$;nlBP-rnnTuO*M#*PBx0roVkTV}x-C`We0M zNkG_>tu-*DkVDE$oeBki&G2VBXO|3=nq4V)UlO)*Cxc7ILH3e!%_HWgdDpShvd}C8 zryVF!@K}n9q=Z;8_!E~pyLP3J;%V9Y9=n~dR5XLaI^GM9 zO^iW4JU;qmWD9u1K~j8B3QgqgePp~h^g5GCz&Z0KWD-O#i6K4FuU-(_K!iG0-Vun= zbA!E-KsDu?*XUvz9=Be`MiXwfroY{>%Gi1v#kxCYvwVUIZl$84clpVbYhWDGkz?nf zj%VIDkLQgZF|z!2(eOHM>Dvl?&euLsCKZavs9A|;Jg51k`zWKjoUGC|_W82PxdHd@ z=|@XoSxb)mk>iB$KcX-D5>*6wsv^&0=DBmt1PiI4kZ2|6iusA(l|dGw8^ePqy$4Q^?#Rfh>b~2AK!N+^ zt-CoE9V+W%`O_EfO>9RQhwy#OAdP(QhrqI+R+XEDD|zsW?YokA5wOb5rdFUn73u3g zG#JyP^DtNSQT-NvNaNG}iwwiS-B7Bs4se7R!VacIB;=KSf*Wa-5Yo2PB_FKMv9 z#Q4gWIK@4jC$=KJ;HtxTCD015)F#RUjtsB2I0KPgn-Sm*!j~Dzz^CEH-J7*LmEijU zj@*DCLPd)mP0^GK)(rk$O(c^osFm^`^19G-*E8$$g}eEAn?+POO?c~+3ah0u4K6d@ z;V={KE3+0CH-0Rw-e9q8+4yfq(9dwCttwF?V72K!e{|s9xWGeT9rXs%bUPr0 z`n)?lHFeeRkm`3SXen;Z?BT@_-3rMVq}tvvYPsNM=>MOLDYr@ZSApmM8;!tZNB2(| z_c)!t;HxH(38&`GsyeN$*#$<7;mB35>A$w#5V5gY`ahQ#3Z8GkzYPoB+zqLly%olj zSA93$ys$mB!NCs$lg;&8rG8u63@|w`VA$K*F6I$*%z<%hu7loY+zaZ+BCzz1n+A{C zY97lu$DQFF+0psWequTI?MKa%{w;4FK$<-8<%$I?F(_G=pBE2oR?u#lyYQj!BBNHC5mzm|3s@b-84~<4|yPq!pdQ<^=|$$cQS>8!3BKh2^HSA zX)fTqc#uG$yrrNwbH2?z|L{Um;GUw*s@&h5!1FmT$MCT5QR`{9wwSd{JNIOi%6_q$ zEU?7P(?OzWZ@u|_w~*Py)lLQ|?Tp$ik^MdA%bVU#Yqnfua9< z@8MwVBzyCyKLV1ym(BjK;!H5!NCv(BcL`K?VmR`U+N3DHvwS}M1$zd?#;7a?E_|YwFDr0QD}oRi3~}~*a}8b? 
z#~J^oz6Bux=IdP~LGu>}-?X!U81CF>={lM~OoGVek0FNB^f2FSSD8&J878fiOfX3R z>s4syOC0E4SwS*QAbL)sENsJ_7j5N%M6Jp+(9n8KI3MaY7wT|j&izTB9p^`t*h7c)3(*F8U9*kHY5L96RK#5w4^ zxlb-|L<*kY6eLjSPs6X@#$KWW9MkFZQ;D!FWhN$|t^Nu6A*3Ost1U?zM7ULjSAELa zV_%??zPr6sq;vfN#ERG^_hxEO#+^O)S1r1W-_a|k+e3kc^QyErd3-dz5dw_bwm`F- z@`UjPx)?uCz-&e`KC{r;6@zXSL1j+0Dmnk)uwS-BroSmlbR^Dk*~b-!=2Q?|x~(yT zjQ7?AKL+d^dvFpm@h+HFn|vFK(Zm!8FYL)kYXFldG2cCpKh>XO&-m`|SW>6-MtV`* zYiJCIkHuJFIxq+Dj*0%_V8A$^o8&m#vD*fokDmpf`n4;}_K!(k9O<>OOB^rG%nprX z^6V}@RiCy}=6x98J@jfYZ^C^DeEmn0&Gq7U-1kY(AO(#jt8EB-H{yASoUL}jYGN91 z`e9U(*;nEyVJyT9+G#_ky|~7G0K|9K9}+C=X|?!3hFYj{w_V(W#tbq!ScotIokf$0 zT0{m#ulwSt33&+=8E($x>CBzs_r^h>zY|*O4bOMGXpxJ~%~Qjy=5DAs`q=^SO_nlK zVkZQPy1`Cyu+B%upu)i-bb7-iHF&;56IUD-=b~1Bfa#u6VbrN_()eRnLi5W#cIQ;7la8lX;%fL;GZWfQz-I>Nt(UXr7|^kfL~< zGv5t~ZGwEa(3J4`?wQ?N-X4r>`q(li0Hs7E?!ZH4KUZ8A41LJitHo1#@K8P3=aG;9 z8lUgZEk7Ch(lk-J9vd%p!pr2wqx4@5cJTqp2ui)pHXBMettI_?-5j`^^1h=&0|t1~HTk;qZ;8 zQ8mfVPlLQcuzx${Qj*o@ak4xDd@&kBDE3Bw1yk?35H=`72btHt50dS$lQCZ%rTFSv zW)H;qqUNFUMCrngS8~3RiTPYTo}P zQaNf2$#^ON=i_`#$3DghSVab9=vhfx|qge-zmI8yH6 zOpjSTFP}EHqiw9Oe_Se{(YrX4aty3cG6k2Ymmb3{gFwm9?(?oSVnG< zTTIPwh=}-=ax2VYLt>N>x$JCm4bi&go|rI$GH&ZOe%~_}RJ+f!zkPns^ZRH2n&&y^ zd(M2n-*Z0a`#GP_=k-3g=7i2TxmC0Fwxl%6-a?L(niiexWKYVy#~cKq4wO4-2Vc%& zubx7xSLf2BC2gx=4ZTB7s}v6+cf_uo`Q9}+G5RB7{I1DlW@6g?k>c`+RksrfT)T4= zx8@<+zQd*D`eDhjzs>zD?`Fr*-fr{oOm@Pz+Lexl`C$3f@6lF4aA6;$!SZusng!;c za4Oqb?{{mYtm(t+TvGR%bebfWKPod>@`Q)(&oC2*2ivC;jc`(^4WIVmU{>8~{T_+d z6}wK`$741E*wfvG=8I|rk)vr|SPoq)13LheEh$8dk}J=R zk&PrZMXXE!6m8jM^#zXlbPOK+c=X2;X3j*R`FLyI_STQ3vZ7M0=xtYCw*&46dWkm@ z$hjJS^|Z_P*W1&!o|CS}+#I`(=`d_cCVP|BDaHq2D)LRkr)LJm;<_BHp0X37S|f8T z)v)&W&nDu5EFvrvLMxL+^ug3+$yg6^Z`>fEOF~OV!EqSrJ_k3gFWoUc1nLgi@ZV3x~d0z_s9S_oy zxxN$SUzn-u9FFWLA*HLRigX5iARyXzcS(uBit(dYE4dc*6c-OA#S!pROe zUBVJSAS2iuV-m__(+zQ?l#W0*OPrx0U(Q@%dw?9})W1|QotTsX2b@f}_x57{>T1rf zQK_H^JOVveRLpQB4wa3kp}DYDC92#Xf;ro#bSQ%LfHpmniWJ`IZFaCUEX?z83DW!< z>p_2l0FXPx9yC1KA78?$oNv!P&6l%R#0b}^#%+?#hsxvVhs|-pv;!BJz49%S^#=Zl z#qdQe*_!|78IF_AL0ez42kU0U+2d+N?=(y3)I-`3U;qK2DWZMlIj`$tM-V*7_C}#C z?i9B*Zu<=}@-BUx06<>E*{b$T`k5E76B332?v=}%a|Bq`W3+@e6afQ17YKTU zaGo@qlx-!@qR0AungUoPmleeb61%Y|TACfOP;4teAo-AHEJH#PWH5Q&I^UitDE>l?^5^ukk=K)MQv zd$qoala8LT^|-Repmz@m`D3D_T-Py7cH;&$wsC&FJg7;uSgRx*IAe9^Fj>jn`=L^e z=RMO!S#To`yjSw!XS2y?`H*t4$mmC13K*-~lAj$}5E_8pE1D$*?@4k@FLxti!d$nA zAN+KlW_eLv4!LS9eT|L5?#2eMsT&h(Y}Pq7f(^?_1W)l_MrhF*f3Ry^Gjd~kRIua{ zkUy;k9NtE+BUx&ftYNa$H$f<5eQ5tQVu?`!)z?=Zs+aixEj>!68Q<;;&grLuMCa&T zo@VL(I4*j$LJrtHm`%|TEG|Mw{!=WCHK&{WQR}02QuryBNYuLRh=+jXr!E=FL)iQG zy7`9SIW*YH1TGJLowrUGGR;gpU&w(~p*Fs)WbSV^&c6hcKYGVygr149#jkPZ=RmMAVp-NZ1LD8uUZS+e7 z08c?c2_SK-7Pm|$fkOeV$WNm+*AF$6NU(xx8&4)*5|w6_P(lrB2F28k5baPhl1F7k zxq#Z`y4C)HX@W-~T^;tHM*pvQ(swE9KDj1cy?-lwX=LQVf+z_TA_R%Wd=Fe3riRxp zGo1CpPQdHrAe?4@V*;4%^7oM~kla}5CmWMM77=hYhK74Glh&jDKe3gCz}9px0qIpA zzFHJWmNKH7mU7rM89Y^DCqhy(=Jg3ya5WnVt9+w5&0tJbd4k^--uZi z2F}Ids!bw-do4lCIV~{#{zTfZz^j9f{dv9T;%vcbCohu>i1T0p(c-N|<*xTVGW#RA z#SbE!2yLt>iQ?hUWqJl1sauAr1M}XL23fz@5p}F%*PR5&Q)LK`X-r4W{;yW28^>|q z(>1h#LlhX#vnqX+6`}ki$?p9A;Na0y!=ubRgZF2O-ue9+rd(K^N~cxjZQu?7GGz?m zZkrT;|CA(9b>gMdXFuS+;`U|}1_qQj00#C}vD5W~)B{GYc~LZR`XNiYyM{rrf!dV~ zZQ&xK;;fP)VG7Nm(pXb015woC03J8ufTR_pNbxECM-makbDf zAZHACxeQYUmb@^NS}WnHs^U|u_Q#dn#1B2yJdZpr1cSkcki3d315wILo1nE&c}TZ% z2PI_4XfwGc2P{xZcdWt>y|?WXb`5&0E^A^2sM@r%vv%7ZRX2fYz}S#U{^eo^1cBSU zrRq^H+_3?13;N(og1}$=%1gBv6EYvFwq2dDHTJE1L$;iI&R-*Ie&S17%}DOqm$aIy zD}z>RNe%MNSS>QG9UdSbLu6E3nh0;B9vSPSQ(fO~U-mx;115iTVfk(CA>UhP*TOuX zcA(Suh)SyqGVrl?GtmzQIe{qCGgAhg1Tklo5OdbQmifY{ermyc;*Q{yJzr9 
zQY?VDq5>PdT>HF8>~{eGd7FG>?CfTUY=(2T;@5?>VHHUbTziMoTHBsSyEdGqHI;8x zVe+a&sk33B>hkT1x%dm31h?PX zYN_(#)#37v6$XXp;c=IHX)>l8PNsb|V40<*uVm&#)Rd091o3dxE()N_a#_SG^-c1RN@5lorDNQ+Qms)+65a3dVJ_e3;$w%r z&Fkqz+eil5%_pI<^#Hz<$%U1K9b)&R4$=fbvF|$U1 ze(M(2asj$f?;f&E_qsgU|F+sn_Be|6L#m&WP&t6v0=k9o+u>#^VM_+xcEuQ(Q8zFD z@2R zSQSn)@sFsqNKXjdAlzJCt0A8AjPqY4KL3Pi`^z6*8WE$K6E9C%>i{yLB4E-Lv+OC;kaCw4K8-#xHGD^4F$*GiUy8U?^sFVFQm0BH9Nubg_MGrlVC1Fu}N uBD4?uO8))dzxSD`P2}uh=}OCg1n3d5z$E~ z5z&dJGpE2`ww?}-5fPD)TT4r;+e*txJK8xqX+C>mW+7+cVButKrXhEqi0DpOn6{zS z741usN!6u1*LniJWo1Q^V{XTk*=h8@7+=6xl-f`7zOgI27umTym2Ax+ID433%_Vv1 z-o;NJzQ6g7)&InK)h2HxgfFd3J&h#b`d%Me?)?BQ=~A^|Y3-XT+xo7_{a^3aeCMZ~ zSEtD$t9bR}rr9C9^o4vs2WLv#f3>uTx_pB5orm@D@r1{6$8r!e%zkbhUs+>mn;zMz z+{Ich6{^b-2Tr{l&A;=7b;g1SL)}_p=udL*po4V8bdR(}%mKEuuT(QB8D>7$A@ zcVJhTu9a~6M&Mr6FPf(i*eai;wdhFS-OQ)K_jp%BW;ZBwvL#z|b5i2(9~tviUTR~P zcxCkNoJ+?I*TZeJn{M*kGc(b5Y02nA*#&gU*Y9`u>6UY5Tr{q5@r#(5>`<9_^Mx4E zh@k(e{)YL@ih*PcbwXGt^wSH^W$GIyR5Fn)7p#5sP13W^(l1!OU&yq2zc=)K8TpxO zA462BWrOKhTIj!VKMTwDyD#_inw2(N-8o-Rt31~}(x=bZzTH$1P>2rLyTbZr?#AV- z=`7KlotoyM=KZ3IO_L_+F97ToW(VJyuPzL)j3Dk4NKj`oiB z+J1>0aWE2vIp~Je%IY9$hn-O_*DUF6_*^>1za=~MVm}+tk9;(q7tuS1IqiGGZ)rlO z^7fL@SDmx48b|#WxlIXGbID9TZM!F>>1mH|>!2abT|}^IkJ>Yqr}Pi!b~~k_Z98|P znK}Gsr{^Sp^f)7&5YLhR9fsfeos6A~pEElnSOF9BzZ~d^Rs}<$dHDI z%j@5qP$gaDGojYWW#x&B!Ofj>v!-W@ZgJC```MIweW_-l_DX*id3n6mPoj_Bx9)s) zRKOh3PIzWoI3rHoIph@E)vLIB;d}mxz5}Z+VSmfn`Zc0_$t;f@4G+#1QzYchP(br1 zT{ZgNJm#fseGdorX`)e|dL!(?kr!<9#5W{SRVwxbmfeG_)oi#60K^@OM+#3>REW62 z=QBjaudRtlz$aqxLl1t4h#;wu|NP}7H1))PKA&j9{j-9i`5_Szlt@A5o|Xsk^3>^f z=Gt#>tnTi7y>a7Q^Tlgblqc^|w8%%Z+*PDhzt>7~$_0x?71r{jFAp1T>+3$JE|u4~ z7{q!)Uh5jK2@wg!rB*22(+;ZIwZ>k)+=W%tB(;OK#Qg3ON?&_7Wiw^|o0e|9Yft^- zY2*7>q5ed~5Rx+&S^n_>div~^iwVLdi11(FFE9N`K5#+^z6Acm$iqTBUM!twhPU6J zZRGvE$p7_}|Fd~tF@!ReQLIc<_L!4^N6r7Ab^kcu|JNVYP% zs+PS|u6*=mQ{2nj+ibkdI?Cel+xttSkD}e3)LNbEJ(~?+VQ(d;5CczhF8CCBhgrvY zT`EH5l$#d{$(ihL{49S}*Y@o?eM8jbpf_#r_A5==^IAy3EQZM>vqEaXU6)dSHTx-X zG~@IxyHZ?BdW59jmebDiSn;?`o~uVhSAV@lS6nMwggB*bQN7#Ogq!JC?zwtgQj^%$ z%0OrhKQaq@x=el}2L1EQ2ePzBRnMMej3j6cKel942<8ae zpN)_*x^{1Hsb=5`{57+0mDP3jnE_`*>+q((PLc&TzIXoh)mk;ca9UCEPA@{LQC)t` zaKM}Hn4)VpgPy?C8R{d#d<8K8m+rID=8LV-96A5kF_}Mj^lbsSgR%+Q^W$gv7N~~-|BjOWul;^sx$qNqWF3X)lC@MC8&r)f_7lZ z)5}nez2046?a3F7x_Is``M>}GTiaBTLHF90sQBmLn773Jvj*MUP%X}Hb+QDuTUZ{~>wGnX{?++^GaPX9G){)Ur!EdGK*l(dWb819UsNln zDE5&UVKgxlp~hM11MRP16osojV3}}7jQ`lrjQccl!;V1hM3sRuHJ{RQZ{sw%e7N^+ z`2jvTt=z#@Zp`=A)f8DK?BfI9@=&+`iZ`b4bX?0=e0~w1rT)-rz|$@X$u!$YsIaY$ zxmaV3#=&PfHU~8FRC$IT{?&-!I^b><8pacR)O1zn8hVGm>x2L=ifYZR+c+su3vxMG+H-v*ymfWR|O>*dcb@w@326 zc{PVc2^c(DqxmL>zb#9jInWr60NcVk$m6&ND>lFDVNu3AgBz!KZsAju2>zT@-akB~ zQ+0|}UCDpP{SLOX&H+{7OJJcyb-;bhUhA}Tu>GOemUZj<{kksJ-px*}SvtafqXGWM znX|ljNDjB#9%6GFFQ)t|eL-&0N9itv>LdYcn=*hXaJK$RRi#-Nb42Lv@&*(=93_<4 zOu#jL@}yJ-uE>=}#{#tEJaNI}W7egEbz-V} zJ|I{VPNO7KR`7A&z6SAu*2qhQciOoGV!+<#RIULhZAkktZtqT)eHzUrgbH;)DtJ|{ zgE8DLVnpVvLCyly7xgo0cc;S9QbrY@Ukj#yUHnWwhRlyy^j(KK+{UU)ngL)HHa?k4 zmbj0fBn zE&b&|po4~dKn;ePi#E+j&3OtgjiuD;x_Ns&oYOTH9KOm#z=n(0qks?uosz!(^AcUz zIZS(ejG+8OLKHj)v`dsXM2u9NNBtVK^vlYmClv9rn9vZN0OS;=yLPoghehA9J}|z% zjIssiEQBK~gjoux2pABm4?t*A%We57e7Wh!c1nb->=<`S-3!Og3sFQ}EPc*RzU4R2 zzK3tdD+p{b5!?wRHsZW}WOQlZx^?2FQ?A_ohcoC7T7F(7^BIDtU!;2g9)Ib|1&R-6 z%^%a0=Y87Uc_6t~vMB9Kc+{9jKxA72*vgzpzfq${D!=B1W$6%&QAM6{Ch{eJxF$7tas zFdILanEPXuS;U?-4Sg4=sWR^;eT{GCa6uZ$f;vSdX(($XW0l{^LjVHDfJF=rvdW~{ zKHfTOo@7V4p2o-XermtAW39e2^hq#b+VcBlVp4UrlMt#)9ps{^ZZeozC7a^{fG?cD zYihVj@v!rx*D3x0cc#ARh`8T)!VPzz1+L>nk*H&7O!Ik?!5Gl!nBfkhlcE6<>E3DOm zz|8NpY>UK5Sy2!egc=;mk)1Y-Ut+uQ?6ME?rBtC>rVU}H_(LAhLc_b-!zz%mdj}&m 
zNn#g{MImLtd<**X#bvS3icSs+{rZoTn}+UkzP%Fm)Qu(tU@*l3z~Fp(e#CjI4kK@q zg4M&%A;*Sm|MhW`&Td_Cf$};1zX*#gxQ>wY)yRrVqpR8!OXAs-XxGL-pC6wqmS3e3 zJU*ElJU)jl&sc?`!^?|R?DyM{t9)x=fVhbRg|c|E0C?91WJXky)0=|sp~-JgsGs@rno&^3lT#Fii4ZH@}?ctXX*{Ap;8C{?< zIuAR57EavVLf~tmIz0}z&b@?=1G5F-P`4p%;H4CjdB4bas%MHqjNq_lIDvh7-8Ewf zB5|5oi|lIS_q1biZWy>E6+K2>j2w~Y<^6XQ{M|+u2rl1Oi1RT;oo4uxp%p_Rtin?n zlp_LujQdBs@%*yH1TbHt0{A;2Yv2bw0!AN`0?grk-GHJe$G8siVBv$`i7U$RPxHjF zL@Qq(4)fso0;&HonR>o(-L*@_!-)^x?GKH}U`EY;Mm&B&KvZ&?gy~PPkO{{|3I5vU zlIO0^+Zo+MS1OUd1_ws2L)z;}$it^Cun{Bgl{@H#bUB|oqYJ!`ORn|Yoqt!g|0~Jt z$LE*D;CFk)gagCDYOSMklCb^g(=U_Cgsnj4ur-pM`T zl|@Eykey2l3nUA$-6`7nWcTSR*{22N!}RWut-r|l1N9xMEn2)prd$SJaYC74Tm2rr z-qTdOQqJ#_Z;u!V{_0NBJv^MDZ48X?JDe>ZP~<7NXm;7u`05ApS~kP;4S-X-JZicD z@gP8^7-l4-n5ZamO}Fqw%WJ8_9~A9u-7|Gt14?EfO;$YNP#ac->^tp?JwVdB*JF>? zB0!#v9Ck-X-@crBHE)*@f&P{qJU8z)VNc7M;$fQKfL>f5X~f)?ME6Ium5&%Pa3;HG z_$){>>%AYl)zWZZozYFMqSuImXIDyqMj+@gT8YiLMyTXTbU)*;{S*O6#b~6aQ>8LE zR6WT+{P=>aa3C=|l_9S+AkWjogy|hBsaL%h-oKt%Z)c7WGTRDd3xD)u{psZT1c-kf zYEW5^EMm211`!yS{JM=V4@$AWFJLbk?|r77`PfVzqQqZ)EgYc!z0T2FEz{=^I+alc zH8$qnt4#Y50zn>!K^;Xkq9w2$2iHgA1Wt@gCUnGzgsVH%;G@aCD_#rl*B5mgy(R;U zlkZix(&&dCBrJx*V6gRUsUxn!s+r)K$qc|9=Jd+SO7!n4KV#VTP-{c&YK<-5{$q(d zd@ucm@{FjJ4lEh7t}Qg(iXA;YZ02nTUyd9ZcKg!1W_+rMFp?>gNlmx(55MNM|1+L; z{(eBA z;RQ^;!z2*-)@JMdgJ~P3nz^qc$K+{lAK%$#l-JZ&b+ROYJaD0t6}{0RaQJ-A zSc0o0A6r3aruwwtg-ZVXdS%xeHe!n|b)Z=9ms<9wT=cjdlt77K-fKEN6W@4faWh0P zd3Q6m$j2gM^RKL*>nFN^z+dKk_w4>n9>MN$Wht)9ZnI$`G_-t*3Zrk-B=5F~K;9to$B(^+e3%bR&#q6JND%wNR-E82E-t{?3F#NG4s~jd1{ch3MFS<2t{iYEB9`1-Q_WKyob|Tcn`%w zO%4^Uk;jvpUZ5pzWDZcqt&x?*<+?(T+x-7>GbIEWU&>nXODpIx9 za1i`-*x2uA=H0vI^0z%F5@!@?4yQsuaf ziFs5I*V$qu;a1l;Sc0+HaII?>a%$XC*?@+?nZoQUb#{jZ!W;N115;{!X$RM;MRX1h zPDD92JYM;sI`x77iDK_vrPqmqUr?S#hT-)aUwG8^x{`m%jCmGh44%(G+^S!7P1*R) zByqGg*cw{pvX+}yD7vur`@6(cZZ_1bH%iAdWB00}Gf9fiMrYdlt!ZwFk=S2nGDuHY z|JCyKtzgoiNqnUStT%x!+S}`bX56j*pj#F}`d4t~=|%}ummMll^TFx%stb9~`+3xr zw+mx(1J!N9qb30}^JCtKDLwZoSF@#pO8&S{+3f3F#Cj4J=Itw|blx#9O+BC%2w zk4K%5X;0gA=ViY>uvt%C#FxA*s)y7lY4l?1+oGRUit8_1+=KNi$9jU@ptjp(EFzgC*--{CT|6 zlFJ{P>4kNc@=H}8x{LNMG;IvTHwd|nTW7=F3qBtGMO&9zB~ZUkznU~}bBl{Fo0$)g zMPQuXHoy2jL;BQBjpcO4+2a0_UgQ#YybuchJIQ`ewhlL6%Stt+$Y;&xTPZ)Xh2Jq96=ed}X9hDbvF6#jCS)cQTV}WXAw8VC zq8&X?28`W@S}m$r>o^q9qsGq-D2$d7Fo45MNy^LR0rtvzEZ+NFtilU^{FB)DKwKqY z%`YM!d-nsOy%Ca2wdMaShW}Q~mtRSalGbA`4m~cP(gpnXyIS`}U+a{aeEMtrW#~Y) zxTeDVt};8HHS_A1SJqD#frX)MXZG`%^ja!tZ7@61toJ%n^Ft3(yQwIh7#vyv%ens6 zli{@-QuEj1&JAPYtc|Ho^sjP;7Oe0*763NuXI)evLWV&Su{QUecvF`^GYx;^;l|!9 zGxpKMZT3s^J<0k8XTHa1s^%D?Bvv9R-y)8dEkia(Gcs3k-qdpgJ#N>=*l%&B6{$2$*a0a=?=z;i|wWF?BxGb=N8uRVl}B6BRha?(j@8JOwT2Xuw?JL7v@#u$y|1Na3W6LkjE=La|&nE?xz z*cqKMNJYsYxjGXIDG~fkTTu}{=99Yn$_Lq1e@r_LQL8ZMrI5D;sQXIRV#z#P2n-5w z%8=2$r1-*OAj2CQ`Nx*Y)RrlpPj4{EbBL={S6 zn@(J~)SmG+k!$~L+??O4Ga_Wl9SQq>)y!e7q>FzYi9JFVPkSx3YU>(OBlp@wg6Av~ zFQc&-BsRppV(jTlQ2_F1W2?+QLmvB;h>0!ue|g>&C+EBV{c$Z-VIML6d5IyjG#wk5 zTdVaecjUhpw=?k+wlx5leo!;|P_1`#f9_{jJgwk^!RS4gds*efdbaAxZ>po>>az1b zXNRz5zF6>>Nl;#X6I2k}&Z}4U!EJg7)R5)qvvpz{FwTBEj2hL?blmo?-duUdT*e*q zs|PFM1{chj`HQu7#P?UQYuYhXMf0*k{@zYM0ZpR2%KcLrA>y)O6loOr?`_@4{Cv%6InIA?taPVT-5^ zfdgxI7Xtx;c$*KD9_-hNtQx#YN}9=YT!p39{yH{zYy-igaF0$m$thCv)629s)18Kc z4%0%|)})y8k8vg+fgrXGotq3*VM!7bDVZ|XtgbhCGB;wN@BNh^Q|b@J#jp&sxjK;M zv*xcGD%|{h3oU-%K1YzVho8b78Mc(bIx*9P^ePn0V;He%lBpoCc!&he?LR@J(gs1P!SWM_F9yK$wVYeL@ zr(c~VUj04{>8uAbw= zZ3?>A(2Acl6avdwg`eSKY}l!(SbHo%+Ud+UZ4jRqzT;-Q;S0Z43Np*gQtu=Pj!1Pe zk|>!d&XDw*W~yb=tvyN7KCgk?5P~EIu3-YC4*Lw?b`$Y3b~Or_{rY~?`>@^HB18oQ z#H@jWXOo?JOxwj1$>=N3S9K^ie{3_sH#x9SGTi3s4KCuY@1U;1g3t1%Dv-NrSFw^PUC>5BZ5Vn_nNMAiT 
zgQsE8(?Fcu`b-uTz0aQ0JLT8oxiUujkRUy)2$EQO18@o*TGHh*vlg!<9nr(hC04)R zpZFK+3JFM0k1InxGA^g0(_~pfp^Q&qzYcvr*y!l3HTi0`jL(1I+DQWCTE9CM0+l{l zGQy~rCv|+W{)RSKmvdG$1(coFD;(;eR%dixk}m)x^o}juWKu*D z@pY@8dC^n~35ffIfc2DK7-c08n1o1Nvbt(nIrT0i?vqISV|>E_od*YW!_X}~VK|%r z@f_xmY~myBeS+$58LsrYZN>t@p|l)%75(8@;VCKJXfXnmFau-^C!Tu(DWL4w8I{11 z{nsWBy69A+$-TV${rm(uOnY3y)op=f8U<;+wI_X{&;__5m^w1i`Ytp;oDYRE9^c;L*3Q2dKe{xR>b~6Fk|Nu^8hOi@_|% zrUu<52--}Pan-1^9^~wOyzu;N&Js5AJfV0woRU3Ou_dsXeZydPdkLSv);QoP>ERz- zE7$UwhmUIo3*`mL`jRlYGciiI(v9$BD&^pLkq5p=*w0`!A%e?M0xrfqQ-2|2atF@a z#VWS(dt7>dOe*)dnk*fVO&1T^UCQg3(B7qp{MMn}fYM{@7Oz14h;h2LoIX`V0o4wR&hc*N9QEm}tbmKQV*{aJLXT zLk{^#dxNa((CpfM{AXSPr2k2G`r*Zd$foa`{+MU?+s6sJTvWiSDWP%n_)W;z=0On& zHdj{*?TkdQHvVV}M-3j=Yce31(nFf|od3ahlHREyt8sk|UQ+hBYn-zSL^D?uGr z_8amnVvDX65|5u7`|d=5G8p4l6@l+Y3^_tk0gGCe4c(cXm#ygPiT%#(abqK|$htQ? zFp*$)D4^D99n(bBBnXa?a+YNL5sUZK6AHmBS_z=IQX82{ z6ICMj2{OOmi~&C}SvJ@Dgy44Fx7k4bPSHC;l4f7+*X;uP^8~)he;ZUtWGOnP_t4|% z)p?&rj&Zo6apsd4E=-H7o3-SY1{=-M8eZ9_QY-U}c=39LPpuYcf=MCS6w7!hQ z*<44=AP=kG!uvv0Z8*WEvcOAYRB1OO8JA&6y2%E737sG}&>?7v803f|dqcpH|2p-9 zBwpdZJ2t;zz?^g&O5;iHe>=r+t|YFE^W#+PQBmW~4XMK?hwbBp`@RPZWEA~aGX%pm z*6h4uvt{hQlkra=iP49E2f@0zaa9>YskHA_c$$|!j%yV*ou&hTz~ph3MzBA zRtjs`8n@#a$JCMulspt{>{6$(!`8$TQ5ApyaK73&?Kt`M_sql(ktVo;3#=x)o||?w)3C+{n$*L3!Y*u|ISI*DJTu zNBXX*10o^U*sx>bTN841h_JEG5Y#)zU8M_~r&uBxpL?`?jSVjRHJ`$~pY~v?&mKnL z-sXdHZVe`YlBRM_1Cgl@X14k|TS-%k%#kX#Z3rBdblfDn{cW!w%%Ok&aG;CLZ|_IpVq*`1to4 zTBv;$5jo%8wc3aG5O{70oFBxxD>I))wtHwhN_b1)rBF4o^i8pX*LdWhcN8U}mvkFjVjH9+37BR*m%76735nYVMxf>;5 z^p&8+qmLRK*zo38(XkcC*to4+1yYjwRXbAz2+(T@+k)ZzII=~c6V-*KjiSo>(k;k_zJ?(_ZA!?`EH@t1Cx_*yI)^B z!bQ`ZusAPD-1a4GlAxV&K?t`$!^5MiH-e$PNeIJJNXyHI*EN{%#X%Hjfi-0{q(IEg zG6A}JyHkwEHiwNL_#QCM-uc2$z`h;ogz1L&A$4{T&y@BOi!Mc~VK7l(TMQ-x+6FR} zY%6QMX2XgiQq(Zs_dXuhX3Juj5xG2a+jk>-;0nyhvkjU@&u=M`s9 z;>u8u>O7wJ_kofK4NpOt#w^3oeUL)Cp5lu}J+dt*w-Qpo<>NvYL*L&Xe$>4Dj-wiv z6^m+^Q~s0pV)C4eN_#JKg6~g_g4dwx;~qoJVs4&r_9uQv_P_I91K#mlOj*1jAz2do z;to(Ibt&SpAoOE5n9Ae0Snl;(&#dRlFqgxl@ApbRIG9HNTJYu|7G3$u+)@@m?y>Vp z)V@36Cfh&+V3Ey+V0p~+NAE(BY4LJ(!`$}TyVE(bTkctU14~L?_PX=}kdC}p|pF!JkX@JSd#f$LblO-5-*|8_DwcHHqu5Fa|UOr&)Ch(`m zaXmffPpfp0EMASQ|Cyk78Rghy6#Lxg6_NdVFyk82b5KZXIMx1B6kOm%^bWhKcj zmpHQ*HcDLtUBHU8>HY*JhYB=zCVxmcB6ax)?VXs`STc0D>_gap+SdT!cty;Ol~djQ zk5K0e*M1ObT$t&Y?=A9X;3jYtgfxMWz8xQDRd@Q+^#tIX^5=j>6asS;EypX4@3S;5 zfc?`CJYt|Dq~{DyA$OBCYYMT1{orxnc9vn;(-#v;pcTzK6#5OR8ol=JGQOL9nuVGt z3BX$i7Xd38)siUk_H9;}+4gV44q}c;t3YdU3nv1D;-OR|B_hm>H(`4H8knO}R^ze{ zqy;vA5y+<>U^zM|HwrpIX1-L&Fh+c!jb0y}jlwR}Rjj=s7>}L;LC|>eY~ct=+og_a z%<#BmHONVm;m0*lZ`?UXJc8n&gMYh1e)VbfxEQ~2cRH)Q;3_`hp!L9YT4K*(f`CRW zl_Z)m{qUSE?udnho?X%W*ERn@{69Bv2>=8X2`!PNOvq?z%KJ2*{f-9AXk4>(qQeXG zp{ch4OA4NkaKkGub_gSRO-GV>iZLr|QN~Wz`Yn9&1Si3ze7S#!;M|z4x}(wYqy5ST z&0}E6>NFso8y)*iN~`zKotkFWUxh|8k;Xm>a~H(>K3fS9r1KIIfX=qmEvuw1k(6lb z1TBXZ6>WKYI%YV-xd}M#22ja9K8lhPP6LiH3tFcR8eocZo>+??1O+W913;yUPb{QS z(7cl+@eT~@CAc8ES_FiVId%sGgUjkp;N;O4B7dKWd5}7v?3nj5+HbHm(uNQlHE>

      6n)oYyoUSeFPmABJlz2aVZcs~3xh$ard} zo?Ff}R{7>jixV`wTjhZDO8d7}Q;$fzruTRB`d~f@fWy4D$*5DFzO>dT#Hy)06;&A! zpnsjcPZ<#Xsf}$2^!1soE0&jiUZ2Z!_&3$y;kY*5y{p6bAltNms+nE1THKiKl_=yh zvm=IwYJ7tWg*a=U+HX53$k-?w7vvj&Gq7EVJ; zr*=ZdKpE3*Zi%~q--z=0_*<25sPn%l26zpR0`^L-V*L!Lz5?Q67|SI}9es~WX!h9v zg6Tx~?~uCq|3UMKx~c^Qx7{_fHdp>v--w%Y@p>au*>IHtQ##kbpi3vcrnf=*Ohx@!|LMiAPT;a(V^y)n+-2Zuc#PEAX)R8nI2~QmCq}ox0V;ZT3htSJI@i1H_%SN8YM~SXSO~! z2v5v?}o^DAsGZePcMqAkvX?%sGW(CHCbl}53O272a2r)ABrJE zd^)2CS~xD^k^Vn}zuR%3SspDsecg0Tg2%qw59iNXE7sBzbYz_aE(jy{j4+strtlaM zv)$f;4IK-rSs*+Xj@36KsgXCq!q|kzKMwq0a0ot(Z(L04S>nbcyE+Y!u0N51OCnTF zZR^~-ljztYSG^0{_)ebHe`ZD~t#skbEA6si80d_Y)aGe0yfs>ANI`%|S%9DRqR&2a z;PM*IDO6(=uXXPmmfhJ{OBpAB5)y^yEG)2qlaSLZ8(vkHg{SN`pP46pTVuFkT0m87-ryEoA8530F8Tyu@ZqlZ;2 zuo!uBFZl5y`M-Q+;0K<#JARRSgchI3L=5*JyaCdzOokcby5 zIe{-lQh*fb$GJgZ2=9!!qo-!eyP;r-%}s$&FnQJfJjRd!gx`QXX!5(H^fMO|GN8iV zZszSfocAbbur;nV1nE5S!zWJD2O!_+8y=BFaM?e@ zNjp1cJStl|50Kb*lO4al;rog%?|}%=lGpH4U4H}ZC0z?&=nRhBa70ko zSZ`wxlZF@<9s~NyxREh#uW)u~ssm(@mQZa4YuB<(Pk!wXi|1mjER;=7-{Cr!cVgSVAjm-{e%xf?zpn^?%t9F zT;r@cvxgsFVFOk&=a@R`jfX%Vf`Z2oMDSN+<_BvG#jU?$-F%(Bw+x?KG*5iPL#uzJ zB2x=amhgt8fMmob7Hz7(EB71!5KB0qwJfp#x(w<&KFTfQw$$E#(wsfDvSSnY#SJ3( zkCRy!{M;?4yoa=PTOEBjlPJW>cgAho`{kG=W@eiuo{bo~w_fnuYu*Es3zzZ=Yi#W5 zw}*?tcu9L!l7r~v>iKF%(82y34574iiy*I%(ej6a-{KWhZWE)Rw+{CTs^*HZGa*kH zoH*g&y##srlw-Bim-n-=f~+7+{2XsOE~1uVTJX=;6PSExlD;6J*!f$QYYqBww^s-6#!*C>80x7zU7M{qfm;I^+msp!X9w4>RTp@R-|(OPim5{K5M1c zZ2>*;k6nHtF)PR!O)!x1fzf+0Cwm6huvz4HwA4C3_Fk;_-Ce=(J2QbEYu-o(&r|!^R{C-U!1wmMA&mQ2jG*y~n*lkX&M`Pd>YiJ_k3w5{yL`VY!)9 z-vqomI20QwHk-oEy-{mp25tMJV34A1+;y~FF4!Klk*~j-7tg-)iwdzPnSGs0sgC;_}NhT?PMVOL`du0!+bi}dr5^Rx@WBnxYP*aOJM(VCxM2JQNLi}E2I z2CtF8O2^ujiaV%`H^H2V=O)M(SqFUxNT8E>#ec^FTHaG6ZVj8nvP*Z5#Z`x#98ZN4}~q* zPurU-R0ZB|)YmXM5&zW$C=ZG6ewQ%~XfRTPR$5XVLtqRdiDjCYJv;IMZ7r7 z5^>6<=tBt;%udU0k-hIaJY3B`E9VgPzy(C^+zs`K+;h7PyY=z{TJ2Yym6t)6=mS^O zMDAAFVP)e{!K3Ju==W=0BehxM-vZmQJC%)LNe(qRd#{#pb3R(wGe^x*htfVj$czJ< z$;Ej2tLLKcVD@IsOq?pGY1mFJ0A#+ju>W03)=|-aC;l+cIO33vIlN5V=pRn1$qWRH zVHp)eochxqw0;us_(V3txaJ}ET(msCZjQb6Y`gbrHK*_I0Ol6(mLYoU#;t=Mv9(ad z;T&g_>Q#~0Xby_Z!HV;|Tw{q(nX+EkCGsUo)=v8#tURBKlA}R9m--FAaBi}GR29*j zhj^*$@hsY1eD8a}^KTF3R5lhYFWAc1^$tsx7Zjrto=)i)W~;63#q@u>!8oM@VZ_=M z_IOO1hVW&H-@v_?D9S^bdZn~ilIL6es#4c1Zpet%5BjR%(bo1hb*bnjz66jy@~6 zT^{X&inZw|{uh@vnS&0$4YG-0lEWNyj{b*&qN*&(vzuVFg)cght3OsXZ>PE!#{aZC z@%D!zFsQa^S~=y$qeI8!@g3fJ`!;6R4_ldFzM0N^WYyO|RUZ{Mh+S&LECs&KG`G@o zOXfJp=3ZU(A8U-Z5qEu+cp%l_wzW~+s8*{K_w)0wg@66go=;L z6L|E|M8hnmVO_CQN(4+Ncz`zJ+WU+s19t1SpJovfG^h(*O&je}n48l(6_wP2a@aNO zQDf`=qk&t0RW@uFL{X+uae*-pxngfUjHH1AX`j|BqrIFIn&$E)N$`fb4#_R-P_8Pi z^p^wV8Lt0b9hc64VU&5UX)2JmVfPW($5;37er($s6Y1{o=?^i|nuM)Z&(qAGXRK1I zuN<^q8rZ)nbuiXTqn+LR9lNE5)w(9VG!=&i=*k2yd<$Ovgs>j>5gk!TguSH%BT_$k z*pg|5_#&9q6y`^jEmkM1*;Z4Slr(n-_wVNEzEBGqbiqWr6lVz83La%RPB_%9x6Mp) zfDuN6cc6xIbNX?vS?j~FA}}vy^ASu*Nbu&mk{>2xZe&t%#b;(XcW5bc1W>)+Z%p>OpY?|3=XO3bW!&!9?sgBnhp@;hOGcr{)c_ouXFztS}CyY1V! zB%toPJFek35oMF#(2&tSX#3TQR+cV@d{@nYr9La8aXn>;sxoPj!ul80ZVq=ic5iM) zVs@`St7g7wWNT|b0}K(B9g6ThSuKlx`pgKtrSL&z3V2@=tsqbPVEfkCJ_B-3_W4oB z*6w#H%&ZiL3&oH1vYV5VC^S8-q07VPVIgH#Bn!%Xz+933#BJF0(E$=`G^vVMdF^c+ zDS0A$uzkvVt-f`!Gp0jDD#@;-^Xvg90!+7M=wOX88}Z|3`@l%r&sNK%Es>i)FKH-j zXS^*%v9@%&9&L{ZsLI7}kj(}ajH}yhzG85=&OiB1Coqdt#o9?*t@rol(7;?S?)`{? 
[GIT binary patch data omitted: no recoverable text content.]

[A new generated page, docs/convert-from-wide-to-long-format-for-ggplot2.html ("Chapter 21 Convert from wide to long format for ggplot2 | Principles of Uncertainty – exercises"), is added next; its HTML head and navigation markup are omitted.]
      Chapter 21 Convert from wide to long format for ggplot2


      df_long <- gather(df, distribution, density, -x)
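For orientation, here is a minimal sketch of what this reshape produces, assuming df carries the wide layout built in the preceding steps (an x column plus one density column per group). gather() is assumed to come from tidyr, which is not shown being loaded on this page, and the toy values below are illustrative only:

```r
library(tidyr)

# Toy data frame with the same wide layout as df above (values illustrative only)
toy_wide <- data.frame(
  x             = c(100, 101),
  IQ_mu100_sd10 = dnorm(c(100, 101), mean = 100, sd = 10),
  IQ_mu105_sd8  = dnorm(c(100, 101), mean = 105, sd = 8)
)

# Stack the two density columns into key/value pairs, keeping x as an id column
toy_long <- gather(toy_wide, distribution, density, -x)
str(toy_long)  # 4 rows with columns x, distribution, density

# The same reshape with the newer tidyr interface
toy_long2 <- pivot_longer(toy_wide, cols = -x,
                          names_to = "distribution", values_to = "density")
```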

diff --git a/docs/create-data.html b/docs/create-data.html
new file mode 100644
index 0000000..323fc2a
--- /dev/null
+++ b/docs/create-data.html
@@ -0,0 +1,381 @@

      Chapter 19 Create data


x <- seq(135, 145, by = 0.01) # Adjusting the x range to account for the larger standard deviations
df <- data.frame(x = x)

diff --git a/docs/define-the-iq-distributions.html b/docs/define-the-iq-distributions.html
new file mode 100644
index 0000000..dc209ce
--- /dev/null
+++ b/docs/define-the-iq-distributions.html
@@ -0,0 +1,381 @@

      Chapter 20 Define the IQ distributions


df$IQ_mu100_sd10 <- dnorm(df$x, mean = 100, sd = 10)
df$IQ_mu105_sd8 <- dnorm(df$x, mean = 105, sd = 8)
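A small sanity check, added here for illustration: the normal density peaks at its mean with height \(1/(\sigma\sqrt{2\pi})\), so these two curves top out near 0.040 and 0.050 respectively; note that both peaks lie below the x range of 135 to 145 defined in the previous step.

```r
# Peak heights of the two densities: dnorm(m, mean = m, sd = s) equals 1 / (s * sqrt(2 * pi))
dnorm(100, mean = 100, sd = 10)  # ~0.0399
dnorm(105, mean = 105, sd = 8)   # ~0.0499
```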

diff --git a/docs/distributions-intutition.html b/docs/distributions-intutition.html
index 424bfb4..5a4f4d9 100644
--- a/docs/distributions-intutition.html
+++ b/docs/distributions-intutition.html
@@ -23,7 +23,7 @@
-
+
@@ -301,7 +301,7 @@

 Chapter 18 Distributions intutition
 This chapter is intended to help you familiarize yourself with the different probability distributions you will encounter in this course.
-Use Appendix B as a reference for the basic properties of distributions.
+You will need to use Appendix B extensively as a reference for the basic properties of distributions, so keep it close!

[A new generated page, docs/ensure-the-levels-of-the-distribution-factor-match-our-desired-order.html ("Chapter 22"), is added here; its HTML head and navigation markup are omitted.]

      Chapter 22 Ensure the levels of the ‘distribution’ factor match our desired order


df_long$distribution <- factor(df_long$distribution, levels = c("IQ_mu100_sd10", "IQ_mu105_sd8"))
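A brief aside on why the explicit levels matter (an illustration added here, not part of the original page): ggplot2 orders legend entries and colour assignments by the factor's level order, so pinning the levels keeps the two IQ groups in a fixed order regardless of how they happen to appear in the data. A toy example:

```r
g <- c("b", "a", "b", "a")
levels(factor(g))                        # "a" "b"  (default: alphabetical)
levels(factor(g, levels = c("b", "a")))  # "b" "a"  (order imposed explicitly)
```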

diff --git a/docs/plot.html b/docs/plot.html
new file mode 100644
index 0000000..68148a2
--- /dev/null
+++ b/docs/plot.html
@@ -0,0 +1,468 @@

      Chapter 23 Plot


ggplot(df_long, aes(x = x, y = density, color = distribution)) +
  geom_line() +
  labs(x = "IQ Score", y = "Density") +
  scale_color_manual(
    name = "IQ Distribution",
    values = c(IQ_mu100_sd10 = "red", IQ_mu105_sd8 = "blue"),
    labels = c("Group 1 (µ=100, σ=10)", "Group 2 (µ=105, σ=8)")
  ) +
  theme_minimal()


Exercise 23.1 (Beta intuition 1) The beta distribution is a continuous distribution defined on the unit interval \([0,1]\). It has two strictly positive parameters \(\alpha\) and \(\beta\), which determine its shape. Its support makes it especially suitable for modeling distributions of percentages and proportions.
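For reference, the density the exercise later calls "our pdf" is the standard beta density (the usual parametrization, assumed to match the one given in Appendix B):

\[ f(x; \alpha, \beta) = \frac{x^{\alpha-1}(1-x)^{\beta-1}}{\text{B}(\alpha, \beta)}, \qquad 0 \le x \le 1. \]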


Below you've been provided with some code that you can copy into RStudio. Once you run the code, an interactive Shiny app will appear and you will be able to manipulate the graph of the beta distribution.


      Play around with the parameters to get:

1. A straight line from (0,0) to (1,2)
2. A straight line from (0,2) to (1,0)
3. A symmetric bell curve
4. A bowl-shaped curve
5. The standard uniform distribution is actually a special case of the beta distribution. Find the exact parameters \(\alpha\) and \(\beta\). Once you do, prove the equality by inserting the values into our pdf.

Hint: the beta function is evaluated as \(\text{B}(a,b) = \frac{\Gamma(a)\Gamma(b)}{\Gamma(a+b)}\), and the gamma function for positive integers \(n\) is evaluated as \(\Gamma(n) = (n-1)!\).
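A quick worked value of the hint, added for illustration: \(\text{B}(2,3) = \frac{\Gamma(2)\Gamma(3)}{\Gamma(5)} = \frac{1! \cdot 2!}{4!} = \frac{2}{24} = \frac{1}{12}\).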

      # Install and load necessary packages
      +install.packages(c("shiny", "ggplot2"))
      +library(shiny)
      +library(ggplot2)
      +
      +# The Shiny App
      +ui <- fluidPage(
      +  titlePanel("Beta Distribution Viewer"),
      +  
      +  sidebarLayout(
      +    sidebarPanel(
      +      sliderInput("alpha", "Alpha:", min = 0.1, max = 10, value = 2, step = 0.1),
      +      sliderInput("beta", "Beta:", min = 0.1, max = 10, value = 2, step = 0.1)
      +    ),
      +    
      +    mainPanel(
      +      plotOutput("betaPlot")
      +    )
      +  )
      +)
      +
      +server <- function(input, output) {
      +  output$betaPlot <- renderPlot({
      +    x <- seq(0, 1, by = 0.01)
      +    y <- dbeta(x, shape1 = input$alpha, shape2 = input$beta)
      +    
      +    ggplot(data.frame(x = x, y = y), aes(x = x, y = y)) +
      +      geom_line() +
      +      labs(x = "Value", y = "Density") +
      +      theme_minimal()
      +  })
      +}
      +
      +shinyApp(ui = ui, server = server)
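If you would rather not run a Shiny app, the same curves can be inspected from a plain R prompt; the snippet below is an alternative sketch, not part of the provided app, and the parameter pair (2, 1) is chosen only as an example:

```r
# Density of Beta(shape1 = 2, shape2 = 1) on [0, 1], drawn with base graphics
curve(dbeta(x, shape1 = 2, shape2 = 1), from = 0, to = 1,
      xlab = "Value", ylab = "Density")
```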

      Solution.

1. \(\alpha = 2, \beta=1\)

2. \(\alpha = 1, \beta=2\)

3. Possible solution: \(\alpha = \beta = 5\)

4. Possible solution: \(\alpha = \beta = 0.5\)

5. The correct parameters are \(\alpha = 1, \beta=1\); to prove the equality we insert them into the beta pdf:
\[\frac{x^{\alpha - 1} (1 - x)^{\beta - 1}}{\text{B}(\alpha, \beta)} =
\frac{x^{1 - 1} (1 - x)^{1 - 1}}{\text{B}(1, 1)} =
 \frac{1}{\frac{\Gamma(1)\Gamma(1)}{\Gamma(1+1)}}=
 \frac{1}{\frac{(1-1)!(1-1)!}{(2-1)!}} = 1.\]
A numerical check of this equality is sketched right below.
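A quick numerical check of the equality above, added as an illustrative sketch rather than part of the original solution:

```r
x <- seq(0, 1, by = 0.01)
# TRUE: both densities are identically 1 on [0, 1]
all.equal(dbeta(x, shape1 = 1, shape2 = 1), dunif(x, min = 0, max = 1))
```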


      Exercise 23.2 (Gamma intuition 1) a


      Exercise 23.3 (Exponential intuition 1) a

      + + + + + + + + + + + + + + + + + + + + + + + + diff --git a/docs/reference-keys.txt b/docs/reference-keys.txt index 3d75cdf..2c42ff2 100644 --- a/docs/reference-keys.txt +++ b/docs/reference-keys.txt @@ -203,3 +203,16 @@ writing-functions other-tips further-reading-and-references distributions +exr:unnamed-chunk-5 +exr:unnamed-chunk-22 +exr:unnamed-chunk-24 +exr:unnamed-chunk-25 +exr:unnamed-chunk-26 +exr:unnamed-chunk-27 +create-data +define-the-iq-distributions +convert-from-wide-to-long-format-for-ggplot2 +ensure-the-levels-of-the-distribution-factor-match-our-desired-order +plot +exr:unnamed-chunk-28 +exr:unnamed-chunk-29 diff --git a/docs/search_index.json b/docs/search_index.json index 41e37ec..1362239 100644 --- a/docs/search_index.json +++ b/docs/search_index.json @@ -1 +1 @@ -[["index.html", "Principles of Uncertainty – exercises Preface", " Principles of Uncertainty – exercises Gregor Pirš and Erik Štrumbelj 2023-09-28 Preface These are the exercises for the Principles of Uncertainty course of the Data Science Master’s at University of Ljubljana, Faculty of Computer and Information Science. This document will be extended each week as the course progresses. At the end of each exercise session, we will post the solutions to the exercises worked in class and select exercises for homework. Students are also encouraged to solve the remaining exercises to further extend their knowledge. Some exercises require the use of R. Those exercises (or parts of) are coloured blue. Students that are not familiar with R programming language should study A to learn the basics. As the course progresses, we will cover more relevant uses of R for data science. "],["introduction.html", "Chapter 1 Probability spaces 1.1 Measure and probability spaces 1.2 Properties of probability measures 1.3 Discrete probability spaces", " Chapter 1 Probability spaces This chapter deals with measures and probability spaces. At the end of the chapter, we look more closely at discrete probability spaces. The students are expected to acquire the following knowledge: Theoretical Use properties of probability to calculate probabilities. Combinatorics. Understanding of continuity of probability. R Vectors and vector operations. For loop. Estimating probability with simulation. sample function. Matrices and matrix operations. .fold-btn { float: right; margin: 5px 5px 0 0; } .fold { border: 1px solid black; min-height: 40px; } 1.1 Measure and probability spaces Exercise 1.1 (Completing a set to a sigma algebra) Let \\(\\Omega = \\{1,2,...,10\\}\\) and let \\(A = \\{\\emptyset, \\{1\\}, \\{2\\}, \\Omega \\}\\). Show that \\(A\\) is not a sigma algebra of \\(\\Omega\\). Find the minimum number of elements to complete A to a sigma algebra of \\(\\Omega\\). Solution. \\(1^c = \\{2,3,...,10\\} \\notin A \\implies\\) \\(A\\) is not sigma algebra. First we need the complements of all elements, so we need to add sets \\(\\{2,3,...,10\\}\\) and \\(\\{1,3,4,...,10\\}\\). Next we need unions of all sets – we add the set \\(\\{1,2\\}\\). Again we need the complement of this set, so we add \\(\\{3,4,...,10\\}\\). So the minimum number of elements we need to add is 4. Exercise 1.2 (Diversity of sigma algebras) Let \\(\\Omega\\) be a set. Find the smallest sigma algebra of \\(\\Omega\\). Find the largest sigma algebra of \\(\\Omega\\). Solution. \\(A = \\{\\emptyset, \\Omega\\}\\) \\(2^{\\Omega}\\) Exercise 1.3 Find all sigma algebras for \\(\\Omega = \\{0, 1, 2\\}\\). Solution. 
\\(A = \\{\\emptyset, \\Omega\\}\\) \\(A = 2^{\\Omega}\\) \\(A = \\{\\emptyset, \\{0\\}, \\{1,2\\}, \\Omega\\}\\) \\(A = \\{\\emptyset, \\{1\\}, \\{0,2\\}, \\Omega\\}\\) \\(A = \\{\\emptyset, \\{2\\}, \\{0,1\\}, \\Omega\\}\\) Exercise 1.4 (Difference between algebra and sigma algebra) Let \\(\\Omega = \\mathbb{N}\\) and \\(\\mathcal{A} = \\{A \\subseteq \\mathbb{N}: A \\text{ is finite or } A^c \\text{ is finite.} \\}\\). Show that \\(\\mathcal{A}\\) is an algebra but not a sigma algebra. Solution. \\(\\emptyset\\) is finite so \\(\\emptyset \\in \\mathcal{A}\\). Let \\(A \\in \\mathcal{A}\\) and \\(B \\in \\mathcal{A}\\). If both are finite, then their union is also finite and therefore in \\(\\mathcal{A}\\). Let at least one of them not be finite. Then their union is not finite. But \\((A \\cup B)^c = A^c \\cap B^c\\). And since at least one is infinite, then its complement is finite and the intersection is too. So finite unions are in \\(\\mathcal{A}\\). Let us look at numbers \\(2n\\). For any \\(n\\), \\(2n \\in \\mathcal{A}\\) as it is finite. But \\(\\bigcup_{k = 1}^{\\infty} 2n \\notin \\mathcal{A}\\). Exercise 1.5 We define \\(\\sigma(X) = \\cap_{\\lambda \\in I} S_\\lambda\\) to be a sigma algebra, generated by the set \\(X\\), where \\(S_\\lambda\\) are all sigma algebras such that \\(X \\subseteq S_\\lambda\\). \\(S_\\lambda\\) are indexed by \\(\\lambda \\in I\\). Let \\(A, B \\subseteq 2^{\\Omega}\\). Prove that \\(\\sigma(A) = \\sigma(B) \\iff A \\subseteq \\sigma(B) \\land B \\subseteq \\sigma(A)\\). Solution. To prove the equivalence, we need to prove that the left hand side implies the right hand side and vice versa. Proving \\(\\sigma(A) = \\sigma(B) \\Rightarrow A \\subseteq \\sigma(B) \\land B \\subseteq \\sigma(A)\\): we know \\(A \\subseteq \\sigma(A)\\) is always true, so by substituting in \\(\\sigma(B)\\) from the left hand side equality we obtain \\(A \\subseteq \\sigma(B)\\). We obtain \\(B \\subseteq \\sigma(A)\\) by symmetry. This proves the implication. Proving \\(A \\subseteq \\sigma(B) \\land B \\subseteq \\sigma(A) \\Rightarrow \\sigma(A) = \\sigma(B)\\): by definition of a sigma algebra, generated by a set, we have \\(\\sigma(B) = \\cap_{\\lambda \\in I} S_\\lambda\\) where \\(S_\\lambda\\) are all sigma algebras where \\(B \\subseteq S_\\lambda\\). But \\(\\sigma(A)\\) is one of \\(S_\\lambda\\), so we can write \\(\\sigma(B) = \\sigma(A) \\cap \\left(\\cap_{\\lambda \\in I} S_\\lambda \\right)\\), which implies \\(\\sigma(B) \\subseteq \\sigma(A)\\). By symmetry, we have \\(\\sigma(A) \\subseteq \\sigma(B)\\). Since \\(\\sigma(A) \\subseteq \\sigma(B)\\) and \\(\\sigma(B) \\subseteq \\sigma(A)\\), we obtain \\(\\sigma(A) = \\sigma(B)\\), which proves the implication and completes the equivalence proof. Exercise 1.6 (Intro to measure) Take the measurable space \\(\\Omega = \\{1,2\\}\\), \\(F = 2^{\\Omega}\\). Which of the following is a measure? Which is a probability measure? \\(\\mu(\\emptyset) = 0\\), \\(\\mu(\\{1\\}) = 5\\), \\(\\mu(\\{2\\}) = 6\\), \\(\\mu(\\{1,2\\}) = 11\\) \\(\\mu(\\emptyset) = 0\\), \\(\\mu(\\{1\\}) = 0\\), \\(\\mu(\\{2\\}) = 0\\), \\(\\mu(\\{1,2\\}) = 1\\) \\(\\mu(\\emptyset) = 0\\), \\(\\mu(\\{1\\}) = 0\\), \\(\\mu(\\{2\\}) = 0\\), \\(\\mu(\\{1,2\\}) = 0\\) \\(\\mu(\\emptyset) = 0\\), \\(\\mu(\\{1\\}) = 0\\), \\(\\mu(\\{2\\}) = 1\\), \\(\\mu(\\{1,2\\}) = 1\\) \\(\\mu(\\emptyset)=0\\), \\(\\mu(\\{1\\})=0\\), \\(\\mu(\\{2\\})=\\infty\\), \\(\\mu(\\{1,2\\})=\\infty\\) Solution. Measure. 
Not probability measure since \\(\\mu(\\Omega) > 1\\). Neither due to countable additivity. Measure. Not probability measure since \\(\\mu(\\Omega) = 0\\). Probability measure. Measure. Not probability measure since \\(\\mu(\\Omega) > 1\\). Exercise 1.7 Define a probability space that could be used to model the outcome of throwing two fair 6-sided dice. Solution. \\(\\Omega = \\{\\{i,j\\}, i = 1,...,6, j = 1,...,6\\}\\) \\(F = 2^{\\Omega}\\) \\(\\forall \\omega \\in \\Omega\\), \\(P(\\omega) = \\frac{1}{6} \\times \\frac{1}{6} = \\frac{1}{36}\\) 1.2 Properties of probability measures Exercise 1.8 A standard deck (52 cards) is distributed to two persons: 26 cards to each person. All partitions are equally likely. Find the probability that: The first person gets 4 Queens. The first person gets at least 2 Queens. R: Use simulation (sample) to check the above answers. Solution. \\(\\frac{\\binom{48}{22}}{\\binom{52}{26}}\\) 1 - \\(\\frac{\\binom{48}{26} + 4 \\times \\binom{48}{25}}{\\binom{52}{26}}\\) For the simulation, let us represent cards with numbers from 1 to 52, and let 1 through 4 represent Queens. set.seed(1) cards <- 1:52 n <- 10000 q4 <- vector(mode = "logical", length = n) q2 <- vector(mode = "logical", length = n) tmp <- vector(mode = "logical", length = n) for (i in 1:n) { p1 <- sample(1:52, 26) q4[i] <- sum(1:4 %in% p1) == 4 q2[i] <- sum(1:4 %in% p1) >= 2 } sum(q4) / n ## [1] 0.0572 sum(q2) / n ## [1] 0.6894 Exercise 1.9 Let \\(A\\) and \\(B\\) be events with probabilities \\(P(A) = \\frac{2}{3}\\) and \\(P(B) = \\frac{1}{2}\\). Show that \\(\\frac{1}{6} \\leq P(A\\cap B) \\leq \\frac{1}{2}\\), and give examples to show that both extremes are possible. Find corresponding bounds for \\(P(A\\cup B)\\). R: Draw samples from the examples and show the probability bounds of \\(P(A \\cap B)\\) . Solution. From the properties of probability we have \\[\\begin{equation} P(A \\cup B) = P(A) + P(B) - P(A \\cap B) \\leq 1. \\end{equation}\\] From this follows \\[\\begin{align} P(A \\cap B) &\\geq P(A) + P(B) - 1 \\\\ &= \\frac{2}{3} + \\frac{1}{2} - 1 \\\\ &= \\frac{1}{6}, \\end{align}\\] which is the lower bound for the intersection. Conversely, we have \\[\\begin{equation} P(A \\cup B) = P(A) + P(B) - P(A \\cap B) \\geq P(A). \\end{equation}\\] From this follows \\[\\begin{align} P(A \\cap B) &\\leq P(B) \\\\ &= \\frac{1}{2}, \\end{align}\\] which is the upper bound for the intersection. For an example take a fair die. To achieve the lower bound let \\(A = \\{3,4,5,6\\}\\) and \\(B = \\{1,2,3\\}\\), then their intersection is \\(A \\cap B = \\{3\\}\\). To achieve the upper bound take \\(A = \\{1,2,3,4\\}\\) and $B = {1,2,3} $. For the bounds of the union we will use the results from the first part. Again from the properties of probability we have \\[\\begin{align} P(A \\cup B) &= P(A) + P(B) - P(A \\cap B) \\\\ &\\geq P(A) + P(B) - \\frac{1}{2} \\\\ &= \\frac{2}{3}. \\end{align}\\] Conversely \\[\\begin{align} P(A \\cup B) &= P(A) + P(B) - P(A \\cap B) \\\\ &\\leq P(A) + P(B) - \\frac{1}{6} \\\\ &= 1. \\end{align}\\] Therefore \\(\\frac{2}{3} \\leq P(A \\cup B) \\leq 1\\). 
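A quick exact check of the bound-attaining examples above (a small sketch in base R; the event choices are the ones given in the solution):

```r
# Fair-die events from the solution: exact P(A ∩ B) at both extremes
A_low <- c(3, 4, 5, 6); B_low <- c(1, 2, 3)   # intersection {3}
A_up  <- c(1, 2, 3, 4); B_up  <- c(1, 2, 3)   # intersection {1, 2, 3}
length(intersect(A_low, B_low)) / 6           # 1/6, the lower bound
length(intersect(A_up,  B_up))  / 6           # 1/2, the upper bound
```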
We use sample in R: set.seed(1) n <- 10000 samps <- sample(1:6, n, replace = TRUE) # lower bound lb <- vector(mode = "logical", length = n) A <- c(1,2,3) B <- c(3,4,5,6) for (i in 1:n) { lb[i] <- samps[i] %in% A & samps[i] %in% B } sum(lb) / n ## [1] 0.1605 # upper bound ub <- vector(mode = "logical", length = n) A <- c(1,2,3) B <- c(1,2,3,4) for (i in 1:n) { ub[i] <- samps[i] %in% A & samps[i] %in% B } sum(ub) / n ## [1] 0.4913 Exercise 1.10 A fair coin is tossed repeatedly. Show that, with probability one, a head turns up sooner or later. Show similarly that any given finite sequence of heads and tails occurs eventually with probability one. Solution. \\[\\begin{align} P(\\text{no heads}) &= \\lim_{n \\rightarrow \\infty} P(\\text{no heads in first }n \\text{ tosses}) \\\\ &= \\lim_{n \\rightarrow \\infty} \\frac{1}{2^n} \\\\ &= 0. \\end{align}\\] For the second part, let us fix the given sequence of heads and tails of length \\(k\\) as \\(s\\). A probability that this happens in \\(k\\) tosses is \\(\\frac{1}{2^k}\\). \\[\\begin{align} P(s \\text{ occurs}) &= \\lim_{n \\rightarrow \\infty} P(s \\text{ occurs in first } nk \\text{ tosses}) \\end{align}\\] The right part of the upper equation is greater than if \\(s\\) occurs either in the first \\(k\\) tosses, second \\(k\\) tosses,…, \\(n\\)-th \\(k\\) tosses. Therefore \\[\\begin{align} P(s \\text{ occurs}) &\\geq \\lim_{n \\rightarrow \\infty} P(s \\text{ occurs in first } n \\text{ disjoint sequences of length } k) \\\\ &= \\lim_{n \\rightarrow \\infty} (1 - P(s \\text{ does not occur in first } n \\text{ disjoint sequences})) \\\\ &= 1 - \\lim_{n \\rightarrow \\infty} P(s \\text{ does not occur in first } n \\text{ disjoint sequences}) \\\\ &= 1 - \\lim_{n \\rightarrow \\infty} (1 - \\frac{1}{2^k})^n \\\\ &= 1. \\end{align}\\] Exercise 1.11 An Erdos-Renyi random graph \\(G(n,p)\\) is a model with \\(n\\) nodes, where each pair of nodes is connected with probability \\(p\\). Calculate the probability that there exists a node that is not connected to any other node in \\(G(4,0.6)\\). Show that the upper bound for the probability that there exist 2 nodes that are not connected to any other node for an arbitrary \\(G(n,p)\\) is \\(\\binom{n}{2} (1-p)^{2n - 3}\\). R: Estimate the probability from the first point using simulation. Solution. Let \\(A_i\\) be the event that the \\(i\\)-th node is not connected to any other node. Then our goal is to calculate \\(P(\\cup_{i=1}^n A_i)\\). Using the inclusion-exclusion principle, we get \\[\\begin{align} P(\\cup_{i=1}^n A_i) &= \\sum_i A_i - \\sum_{i<j} P(A_i \\cap A_j) + \\sum_{i<j<k} P(A_i \\cap A_j \\cap A_k) - P(A_1 \\cap A_2 \\cap A_3 \\cap A_4) \\\\ &=4 (1 - p)^3 - \\binom{4}{2} (1 - p)^5 + \\binom{4}{3} (1 - p)^6 - (1 - p)^6 \\\\ &\\approx 0.21. \\end{align}\\] Let \\(A_{ij}\\) be the event that nodes \\(i\\) and \\(j\\) are not connected to any other node. We are interested in \\(P(\\cup_{i<j}A_{ij})\\). By using Boole`s inequality, we get \\[\\begin{align} P(\\cup_{i<j}A_{ij}) \\leq \\sum_{i<j} P(A_{ij}). \\end{align}\\] What is the probability of \\(A_{ij}\\)? There need to be no connections to the \\(i\\)-th node to the remaining nodes (excluding \\(j\\)), the same for the \\(j\\)-th node, and there can be no connection between them. Therefore \\[\\begin{align} P(\\cup_{i<j}A_{ij}) &\\leq \\sum_{i<j} (1 - p)^{2(n-2) + 1} \\\\ &= \\binom{n}{2} (1 - p)^{2n - 3}. 
\\end{align}\\] set.seed(1) n_samp <- 100000 n <- 4 p <- 0.6 conn_samp <- vector(mode = "logical", length = n_samp) for (i in 1:n_samp) { tmp_mat <- matrix(data = 0, nrow = n, ncol = n) samp_conn <- sample(c(0,1), choose(4,2), replace = TRUE, prob = c(1 - p, p)) tmp_mat[lower.tri(tmp_mat)] <- samp_conn tmp_mat[upper.tri(tmp_mat)] <- t(tmp_mat)[upper.tri(t(tmp_mat))] not_conn <- apply(tmp_mat, 1, sum) if (any(not_conn == 0)) { conn_samp[i] <- TRUE } else { conn_samp[i] <- FALSE } } sum(conn_samp) / n_samp ## [1] 0.20565 1.3 Discrete probability spaces Exercise 1.12 Show that the standard measurable space on \\(\\Omega = \\{0,1,...,n\\}\\) equipped with binomial measure is a discrete probability space. Define another probability measure on this measurable space. Show that for \\(n=1\\) the binomial measure is the same as the Bernoulli measure. R: Draw 1000 samples from the binomial distribution \\(p=0.5\\), \\(n=20\\) (rbinom) and compare relative frequencies with theoretical probability measure. Solution. We need to show that the terms of \\(\\sum_{k=0}^n \\binom{n}{k} p^k (1 - p)^{n - k}\\) sum to 1. For that we use the binomial theorem \\(\\sum_{k=0}^n \\binom{n}{k} x^k y^{n-k} = (x + y)^n\\). So \\[\\begin{equation} \\sum_{k=0}^n \\binom{n}{k} p^k (1 - p)^{n - k} = (p + 1 - p)^n = 1. \\end{equation}\\] \\(P(\\{k\\}) = \\frac{1}{n + 1}\\). When \\(n=1\\) then \\(k \\in \\{0,1\\}\\). Inserting \\(n=1\\) into the binomial measure, we get \\(\\binom{1}{k}p^k (1-p)^{1 - k}\\). Now \\(\\binom{1}{1} = \\binom{1}{0} = 1\\), so the measure is \\(p^k (1-p)^{1 - k}\\), which is the Bernoulli measure. set.seed(1) library(ggplot2) library(dplyr) bin_samp <- rbinom(n = 1000, size = 20, prob = 0.5) bin_samp <- data.frame(x = bin_samp) %>% count(x) %>% mutate(n = n / 1000, type = "empirical_frequencies") %>% bind_rows(data.frame(x = 0:20, n = dbinom(0:20, size = 20, prob = 0.5), type = "theoretical_measure")) bin_plot <- ggplot(data = bin_samp, aes(x = x, y = n, fill = type)) + geom_bar(stat="identity", position = "dodge") plot(bin_plot) Exercise 1.13 Show that the standard measurable space on \\(\\Omega = \\{0,1,...,\\infty\\}\\) equipped with geometric measure is a discrete probability space, equipped with Poisson measure is a discrete probability space. Define another probability measure on this measurable space. R: Draw 1000 samples from the Poisson distribution \\(\\lambda = 10\\) (rpois) and compare relative frequencies with theoretical probability measure. Solution. \\(\\sum_{k = 0}^{\\infty} p(1 - p)^k = p \\sum_{k = 0}^{\\infty} (1 - p)^k = p \\frac{1}{1 - 1 + p} = 1\\). We used the formula for geometric series. \\(\\sum_{k = 0}^{\\infty} \\frac{\\lambda^k e^{-\\lambda}}{k!} = e^{-\\lambda} \\sum_{k = 0}^{\\infty} \\frac{\\lambda^k}{k!} = e^{-\\lambda} e^{\\lambda} = 1.\\) We used the Taylor expansion of the exponential function. Since we only have to define a probability measure, we could only assign probabilities that sum to one to a finite number of events in \\(\\Omega\\), and probability zero to the other infinite number of events. However to make this solution more educational, we will try to find a measure that assigns a non-zero probability to all events in \\(\\Omega\\). A good start for this would be to find a converging infinite series, as the probabilities will have to sum to one. One simple converging series is the geometric series \\(\\sum_{k=0}^{\\infty} p^k\\) for \\(|p| < 1\\). Let us choose an arbitrary \\(p = 0.5\\). 
Then \\(\\sum_{k=0}^{\\infty} p^k = \\frac{1}{1 - 0.5} = 2\\). To complete the measure, we have to normalize it, so it sums to one, therefore \\(P(\\{k\\}) = \\frac{0.5^k}{2}\\) is a probability measure on \\(\\Omega\\). We could make it even more difficult by making this measure dependent on some parameter \\(\\alpha\\), but this is out of the scope of this introductory chapter. set.seed(1) pois_samp <- rpois(n = 1000, lambda = 10) pois_samp <- data.frame(x = pois_samp) %>% count(x) %>% mutate(n = n / 1000, type = "empirical_frequencies") %>% bind_rows(data.frame(x = 0:25, n = dpois(0:25, lambda = 10), type = "theoretical_measure")) pois_plot <- ggplot(data = pois_samp, aes(x = x, y = n, fill = type)) + geom_bar(stat="identity", position = "dodge") plot(pois_plot) Exercise 1.14 Define a probability measure on \\((\\Omega = \\mathbb{Z}, 2^{\\mathbb{Z}})\\). Define a probability measure such that \\(P(\\omega) > 0, \\forall \\omega \\in \\Omega\\). R: Implement a random generator that will generate samples with the relative frequency that corresponds to your probability measure. Compare relative frequencies with theoretical probability measure . Solution. \\(P(0) = 1, P(\\omega) = 0, \\forall \\omega \\neq 0\\). \\(P(\\{k\\}) = \\sum_{k = -\\infty}^{\\infty} \\frac{p(1 - p)^{|k|}}{2^{1 - 1_0(k)}}\\), where \\(1_0(k)\\) is the indicator function, which equals to one if \\(k\\) is 0, and equals to zero in every other case. n <- 1000 geom_samps <- rgeom(n, prob = 0.5) sign_samps <- sample(c(FALSE, TRUE), size = n, replace = TRUE) geom_samps[sign_samps] <- -geom_samps[sign_samps] my_pmf <- function (k, p) { indic <- rep(1, length(k)) indic[k == 0] <- 0 return ((p * (1 - p)^(abs(k))) / 2^indic) } geom_samps <- data.frame(x = geom_samps) %>% count(x) %>% mutate(n = n / 1000, type = "empirical_frequencies") %>% bind_rows(data.frame(x = -10:10, n = my_pmf(-10:10, 0.5), type = "theoretical_measure")) geom_plot <- ggplot(data = geom_samps, aes(x = x, y = n, fill = type)) + geom_bar(stat="identity", position = "dodge") plot(geom_plot) Exercise 1.15 Define a probability measure on \\(\\Omega = \\{1,2,3,4,5,6\\}\\) with parameter \\(m \\in \\{1,2,3,4,5,6\\}\\), so that the probability of outcome at distance \\(1\\) from \\(m\\) is half of the probability at distance \\(0\\), at distance \\(2\\) is half of the probability at distance \\(1\\), etc. R: Implement a random generator that will generate samples with the relative frequency that corresponds to your probability measure. Compare relative frequencies with theoretical probability measure . Solution. \\(P(\\{k\\}) = \\frac{\\frac{1}{2}^{|m - k|}}{\\sum_{i=1}^6 \\frac{1}{2}^{|m - i|}}\\) n <- 10000 m <- 4 my_pmf <- function (k, m) { denom <- sum(0.5^abs(m - 1:6)) return (0.5^abs(m - k) / denom) } samps <- c() for (i in 1:n) { a <- sample(1:6, 1) a_val <- my_pmf(a, m) prob <- runif(1) if (prob < a_val) { samps <- c(samps, a) } } samps <- data.frame(x = samps) %>% count(x) %>% mutate(n = n / length(samps), type = "empirical_frequencies") %>% bind_rows(data.frame(x = 1:6, n = my_pmf(1:6, m), type = "theoretical_measure")) my_plot <- ggplot(data = samps, aes(x = x, y = n, fill = type)) + geom_bar(stat="identity", position = "dodge") plot(my_plot) "],["uprobspaces.html", "Chapter 2 Uncountable probability spaces 2.1 Borel sets 2.2 Lebesgue measure", " Chapter 2 Uncountable probability spaces This chapter deals with uncountable probability spaces. 
The students are expected to acquire the following knowledge: Theoretical Understand Borel sets and identify them. Estimate Lebesgue measure for different sets. Know when sets are Borel-measurable. Understanding of countable and uncountable sets. R Uniform sampling. .fold-btn { float: right; margin: 5px 5px 0 0; } .fold { border: 1px solid black; min-height: 40px; } 2.1 Borel sets Exercise 2.1 Prove that the intersection of two sigma algebras on \\(\\Omega\\) is a sigma algebra. Prove that the collection of all open subsets \\((a,b)\\) on \\((0,1]\\) is not a sigma algebra of \\((0,1]\\). Solution. Empty set: \\[\\begin{equation} \\emptyset \\in \\mathcal{A} \\wedge \\emptyset \\in \\mathcal{B} \\Rightarrow \\emptyset \\in \\mathcal{A} \\cap \\mathcal{B} \\end{equation}\\] Complement: \\[\\begin{equation} \\text{Let } A \\in \\mathcal{A} \\cap \\mathcal{B} \\Rightarrow A \\in \\mathcal{A} \\wedge A \\in \\mathcal{B} \\Rightarrow A^c \\in \\mathcal{A} \\wedge A^c \\in \\mathcal{B} \\Rightarrow A^c \\in \\mathcal{A} \\cap \\mathcal{B} \\end{equation}\\] Countable additivity: Let \\(\\{A_i\\}\\) be a countable sequence of subsets in \\(\\mathcal{A} \\cap \\mathcal{B}\\). \\[\\begin{equation} \\forall i: A_i \\in \\mathcal{A} \\cap \\mathcal{B} \\Rightarrow A_i \\in \\mathcal{A} \\wedge A_i \\in \\mathcal{B} \\Rightarrow \\cup A_i \\in \\mathcal{A} \\wedge \\cup A_i \\in \\mathcal{B} \\Rightarrow \\cup A_i \\in \\mathcal{A} \\cap \\mathcal{B} \\end{equation}\\] Let \\(A\\) denote the collection of all open subsets \\((a,b)\\) on \\((0,1]\\). Then \\((0,1) \\in A\\). But \\((0,1)^c = 1 \\notin A\\). Exercise 2.2 Show that \\(\\mathcal{C} = \\sigma(\\mathcal{C})\\) if and only if \\(\\mathcal{C}\\) is a sigma algebra. Solution. “\\(\\Rightarrow\\)” This follows from the definition of a generated sigma algebra. “\\(\\Leftarrow\\)” Let \\(\\mathcal{F} = \\cap_i F_i\\) be the intersection of all sigma algebras that contain \\(\\mathcal{C}\\). Then \\(\\sigma(\\mathcal{C}) = \\mathcal{F}\\). Additionally, \\(\\forall i: \\mathcal{C} \\in F_i\\). So each \\(F_i\\) can be written as \\(F_i = \\mathcal{C} \\cup D\\), where \\(D\\) are the rest of the elements in the sigma algebra. In other words, each sigma algebra in the collection contains at least \\(\\mathcal{C}\\), but can contain other elements. Now for some \\(j\\), \\(F_j = \\mathcal{C}\\) as \\(\\{F_i\\}\\) contains all sigma algebras that contain \\(\\mathcal{C}\\) and \\(\\mathcal{C}\\) is such a sigma algebra. Since this is the smallest subset in the intersection it follows that \\(\\sigma(\\mathcal{C}) = \\mathcal{F} = \\mathcal{C}\\). Exercise 2.3 Let \\(\\mathcal{C}\\) and \\(\\mathcal{D}\\) be two collections of subsets on \\(\\Omega\\) such that \\(\\mathcal{C} \\subset \\mathcal{D}\\). Prove that \\(\\sigma(\\mathcal{C}) \\subseteq \\sigma(\\mathcal{D})\\). Solution. \\(\\sigma(\\mathcal{D})\\) is a sigma algebra that contains \\(\\mathcal{D}\\). It follows that \\(\\sigma(\\mathcal{D})\\) is a sigma algebra that contains \\(\\mathcal{C}\\). Let us write \\(\\sigma(\\mathcal{C}) = \\cap_i F_i\\), where \\(\\{F_i\\}\\) is the collection of all sigma algebras that contain \\(\\mathcal{C}\\). Since \\(\\sigma(\\mathcal{D})\\) is such a sigma algebra, there exists an index \\(j\\), so that \\(F_j = \\sigma(\\mathcal{D})\\). Then we can write \\[\\begin{align} \\sigma(\\mathcal{C}) &= (\\cap_{i \\neq j} F_i) \\cap \\sigma(\\mathcal{D}) \\\\ &\\subseteq \\sigma(\\mathcal{D}). 
\\end{align}\\] Exercise 2.4 Prove that the following subsets of \\((0,1]\\) are Borel-measurable by finding their measure. Any countable set. The set of numbers in (0,1] whose decimal expansion does not contain 7. Solution. This follows directly from the fact that every countable set is a union of singletons, whose measure is 0. Let us first look at numbers which have a 7 as the first decimal numbers. Their measure is 0.1. Then we take all the numbers with a 7 as the second decimal number (excluding those who already have it as the first). These have the measure 0.01, and there are 9 of them, so their total measure is 0.09. We can continue to do so infinitely many times. At each \\(n\\), we have the measure of the intervals which is \\(10^n\\) and the number of those intervals is \\(9^{n-1}\\). Now \\[\\begin{align} \\lambda(A) &= 1 - \\sum_{n = 0}^{\\infty} \\frac{9^n}{10^{n+1}} \\\\ &= 1 - \\frac{1}{10} \\sum_{n = 0}^{\\infty} (\\frac{9}{10})^n \\\\ &= 1 - \\frac{1}{10} \\frac{10}{1} \\\\ &= 0. \\end{align}\\] Since we have shown that the measure of the set is \\(0\\), we have also shown that the set is measurable. Exercise 2.5 Let \\(\\Omega = [0,1]\\), and let \\(\\mathcal{F}_3\\) consist of all countable subsets of \\(\\Omega\\), and all subsets of \\(\\Omega\\) having a countable complement. Show that \\(\\mathcal{F}_3\\) is a sigma algebra. Let us define \\(P(A)=0\\) if \\(A\\) is countable, and \\(P(A) = 1\\) if \\(A\\) has a countable complement. Is \\((\\Omega, \\mathcal{F}_3, P)\\) a legitimate probability space? Solution. The empty set is countable, therefore it is in \\(\\mathcal{F}_3\\). For any \\(A \\in \\mathcal{F}_3\\). If \\(A\\) is countable, then \\(A^c\\) has a countable complement and is in \\(\\mathcal{F}_3\\). If \\(A\\) is uncountable, then it has a countable complement \\(A^c\\) which is therefore also in \\(\\mathcal{F}_3\\). We are left with showing countable additivity. Let \\(\\{A_i\\}\\) be an arbitrary collection of sets in \\(\\mathcal{F}_3\\). We will look at two possibilities. First let all \\(A_i\\) be countable. A countable union of countable sets is countable, and therefore in \\(\\mathcal{F}_3\\). Second, let at least one \\(A_i\\) be uncountable. It follows that it has a countable complement. We can write \\[\\begin{equation} (\\cup_{i=1}^{\\infty} A_i)^c = \\cap_{i=1}^{\\infty} A_i^c. \\end{equation}\\] Since at least one \\(A_i^c\\) on the right side is countable, the whole intersection is countable, and therefore the union has a countable complement. It follows that the union is in \\(\\mathcal{F}_3\\). The tuple \\((\\Omega, \\mathcal{F}_3)\\) is a measurable space. Therefore, we only need to check whether \\(P\\) is a probability measure. The measure of the empty set is zero as it is countable. We have to check for countable additivity. Let us look at three situations. Let \\(A_i\\) be disjoint sets. First, let all \\(A_i\\) be countable. \\[\\begin{equation} P(\\cup_{i=1}^{\\infty} A_i) = \\sum_{i=1}^{\\infty}P( A_i)) = 0. \\end{equation}\\] Since the union is countable, the above equation holds. Second, let exactly one \\(A_i\\) be uncountable. W.L.O.G. let that be \\(A_1\\). Then \\[\\begin{equation} P(\\cup_{i=1}^{\\infty} A_i) = 1 + \\sum_{i=2}^{\\infty}P( A_i)) = 1. \\end{equation}\\] Since the union is uncountable, the above equation holds. Third, let at least two \\(A_i\\) be uncountable. We have to check whether it is possible for two uncountable sets in \\(\\mathcal{F}_3\\) to be disjoint. 
If that is possible, then their measures would sum to more than one and \\(P\\) would not be a probability measure. W.L.O.G. let \\(A_1\\) and \\(A_2\\) be uncountable. Then we have \\[\\begin{equation} A_1 \\cap A_2 = (A_1^c \\cup A_2^c)^c. \\end{equation}\\] Now \\(A_1^c\\) and \\(A_2^c\\) are countable and their union is therefore countable. Let \\(B = A_1^c \\cup A_2^c\\). So the intersection of \\(A_1\\) and \\(A_2\\) equals the complement of \\(B\\), which is countable. For the intersection to be the empty set, \\(B\\) would have to equal to \\(\\Omega\\). But \\(\\Omega\\) is uncountable and therefore \\(B\\) can not equal to \\(\\Omega\\). It follows that two uncountable sets in \\(\\mathcal{F}_3\\) can not have an empty intersection. Therefore the tuple is a legitimate probability space. 2.2 Lebesgue measure Exercise 2.6 Show that the Lebesgue measure of rational numbers on \\([0,1]\\) is 0. R: Implement a random number generator, which generates uniform samples of irrational numbers in \\([0,1]\\) by uniformly sampling from \\([0,1]\\) and rejecting a sample if it is rational. Solution. There are a countable number of rational numbers. Therefore, we can write \\[\\begin{align} \\lambda(\\mathbb{Q}) &= \\lambda(\\cup_{i = 1}^{\\infty} q_i) &\\\\ &= \\sum_{i = 1}^{\\infty} \\lambda(q_i) &\\text{ (countable additivity)} \\\\ &= \\sum_{i = 1}^{\\infty} 0 &\\text{ (Lebesgue measure of a singleton)} \\\\ &= 0. \\end{align}\\] Exercise 2.7 Prove that the Lebesgue measure of \\(\\mathbb{R}\\) is infinity. Paradox. Show that the cardinality of \\(\\mathbb{R}\\) and \\((0,1)\\) is the same, while their Lebesgue measures are infinity and one respectively. Solution. Let \\(a_i\\) be the \\(i\\)-th integer for \\(i \\in \\mathbb{Z}\\). We can write \\(\\mathbb{R} = \\cup_{-\\infty}^{\\infty} (a_i, a_{i + 1}]\\). \\[\\begin{align} \\lambda(\\mathbb{R}) &= \\lambda(\\cup_{i = -\\infty}^{\\infty} (a_i, a_{i + 1}]) \\\\ &= \\lambda(\\lim_{n \\rightarrow \\infty} \\cup_{i = -n}^{n} (a_i, a_{i + 1}]) \\\\ &= \\lim_{n \\rightarrow \\infty} \\lambda(\\cup_{i = -n}^{n} (a_i, a_{i + 1}]) \\\\ &= \\lim_{n \\rightarrow \\infty} \\sum_{i = -n}^{n} \\lambda((a_i, a_{i + 1}]) \\\\ &= \\lim_{n \\rightarrow \\infty} \\sum_{i = -n}^{n} 1 \\\\ &= \\lim_{n \\rightarrow \\infty} 2n \\\\ &= \\infty. \\end{align}\\] We need to find a bijection between \\(\\mathbb{R}\\) and \\((0,1)\\). A well-known function that maps from a bounded interval to \\(\\mathbb{R}\\) is the tangent. To make the bijection easier to achieve, we will take the inverse, which maps from \\(\\mathbb{R}\\) to \\((-\\frac{\\pi}{2}, \\frac{\\pi}{2})\\). However, we need to change the function so it maps to \\((0,1)\\). First we add \\(\\frac{\\pi}{2}\\), so that we move the function above zero. Then we only have to divide by the max value, which in this case is \\(\\pi\\). So our bijection is \\[\\begin{equation} f(x) = \\frac{\\tan^{-1}(x) + \\frac{\\pi}{2}}{\\pi}. \\end{equation}\\] Exercise 2.8 Take the measure space \\((\\Omega_1 = (0,1], B_{(0,1]}, \\lambda)\\) (we know that this is a probability space on \\((0,1]\\)). Define a map (function) from \\(\\Omega_1\\) to \\(\\Omega_2 = \\{1,2,3,4,5,6\\}\\) such that the measure space \\((\\Omega_2, 2^{\\Omega_2}, \\lambda(f^{-1}()))\\) will be a discrete probability space with uniform probabilities (\\(P(\\omega) = \\frac{1}{6}, \\forall \\omega \\in \\Omega_2)\\). Is the map that you defined in (a) the only such map? 
How would you in the same fashion define a map that would result in a probability space that can be interpreted as a coin toss with probability \\(p\\) of heads? R: Use the map in (a) as a basis for a random generator for this fair die. Solution. In other words, we have to assign disjunct intervals of the same size to each element of \\(\\Omega_2\\). Therefore \\[\\begin{equation} f(x) = \\lceil 6x \\rceil. \\end{equation}\\] No, we could for example rearrange the order in which the intervals are mapped to integers. Additionally, we could have several disjoint intervals that mapped to the same integer, as long as the Lebesgue measure of their union would be \\(\\frac{1}{6}\\) and the function would remain injective. We have \\(\\Omega_3 = \\{0,1\\}\\), where zero represents heads and one represents tails. Then \\[\\begin{equation} f(x) = 0^{I_{A}(x)}, \\end{equation}\\] where \\(A = \\{y \\in (0,1] : y < p\\}\\). set.seed(1) unif_s <- runif(1000) die_s <- ceiling(6 * unif_s) summary(as.factor(die_s)) ## 1 2 3 4 5 6 ## 166 154 200 146 166 168 "],["condprob.html", "Chapter 3 Conditional probability 3.1 Calculating conditional probabilities 3.2 Conditional independence 3.3 Monty Hall problem", " Chapter 3 Conditional probability This chapter deals with conditional probability. The students are expected to acquire the following knowledge: Theoretical Identify whether variables are independent. Calculation of conditional probabilities. Understanding of conditional dependence and independence. How to apply Bayes’ theorem to solve difficult probabilistic questions. R Simulating conditional probabilities. cumsum. apply. .fold-btn { float: right; margin: 5px 5px 0 0; } .fold { border: 1px solid black; min-height: 40px; } 3.1 Calculating conditional probabilities Exercise 3.1 A military officer is in charge of identifying enemy aircraft and shooting them down. He is able to positively identify an enemy airplane 95% of the time and positively identify a friendly airplane 90% of the time. Furthermore, 99% of the airplanes are friendly. When the officer identifies an airplane as an enemy airplane, what is the probability that it is not and they will shoot at a friendly airplane? Solution. Let \\(E = 0\\) denote that the observed plane is friendly and \\(E=1\\) that it is an enemy. Let \\(I = 0\\) denote that the officer identified it as friendly and \\(I = 1\\) as enemy. Then \\[\\begin{align} P(E = 0 | I = 1) &= \\frac{P(I = 1 | E = 0)P(E = 0)}{P(I = 1)} \\\\ &= \\frac{P(I = 1 | E = 0)P(E = 0)}{P(I = 1 | E = 0)P(E = 0) + P(I = 1 | E = 1)P(E = 1)} \\\\ &= \\frac{0.1 \\times 0.99}{0.1 \\times 0.99 + 0.95 \\times 0.01} \\\\ &= 0.91. \\end{align}\\] Exercise 3.2 R: Consider tossing a fair die. Let \\(A = \\{2,4,6\\}\\) and \\(B = \\{1,2,3,4\\}\\). Then \\(P(A) = \\frac{1}{2}\\), \\(P(B) = \\frac{2}{3}\\) and \\(P(AB) = \\frac{1}{3}\\). Since \\(P(AB) = P(A)P(B)\\), the events \\(A\\) and \\(B\\) are independent. Simulate draws from the sample space and verify that the proportions are the same. Then find two events \\(C\\) and \\(D\\) that are not independent and repeat the simulation. 
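Before the simulation that follows, here is one possible pair of dependent events and their exact probabilities (a sketch; the choice C = {1,2}, D = {2,3} matches the comment in the code below):

```r
# Exact probabilities on a fair die for C = {1,2} and D = {2,3}
C <- c(1, 2); D <- c(2, 3)
pC  <- length(C) / 6                    # 1/3
pD  <- length(D) / 6                    # 1/3
pCD <- length(intersect(C, D)) / 6      # 1/6
c(product = pC * pD, joint = pCD)       # 1/9 vs 1/6, so C and D are dependent
```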
set.seed(1) nsamps <- 10000 tosses <- sample(1:6, nsamps, replace = TRUE) PA <- sum(tosses %in% c(2,4,6)) / nsamps PB <- sum(tosses %in% c(1,2,3,4)) / nsamps PA * PB ## [1] 0.3295095 sum(tosses %in% c(2,4)) / nsamps ## [1] 0.3323 # Let C = {1,2} and D = {2,3} PC <- sum(tosses %in% c(1,2)) / nsamps PD <- sum(tosses %in% c(2,3)) / nsamps PC * PD ## [1] 0.1067492 sum(tosses %in% c(2)) / nsamps ## [1] 0.1622 Exercise 3.3 A machine reports the true value of a thrown 12-sided die 5 out of 6 times. If the machine reports a 1 has been tossed, what is the probability that it is actually a 1? Now let the machine only report whether a 1 has been tossed or not. Does the probability change? R: Use simulation to check your answers to a) and b). Solution. Let \\(T = 1\\) denote that the toss is 1 and \\(M = 1\\) that the machine reports a 1. \\[\\begin{align} P(T = 1 | M = 1) &= \\frac{P(M = 1 | T = 1)P(T = 1)}{P(M = 1)} \\\\ &= \\frac{P(M = 1 | T = 1)P(T = 1)}{\\sum_{k=1}^{12} P(M = 1 | T = k)P(T = k)} \\\\ &= \\frac{\\frac{5}{6}\\frac{1}{12}}{\\frac{5}{6}\\frac{1}{12} + 11 \\frac{1}{6} \\frac{1}{11} \\frac{1}{12}} \\\\ &= \\frac{5}{6}. \\end{align}\\] Yes. \\[\\begin{align} P(T = 1 | M = 1) &= \\frac{P(M = 1 | T = 1)P(T = 1)}{P(M = 1)} \\\\ &= \\frac{P(M = 1 | T = 1)P(T = 1)}{\\sum_{k=1}^{12} P(M = 1 | T = k)P(T = k)} \\\\ &= \\frac{\\frac{5}{6}\\frac{1}{12}}{\\frac{5}{6}\\frac{1}{12} + 11 \\frac{1}{6} \\frac{1}{12}} \\\\ &= \\frac{5}{16}. \\end{align}\\] set.seed(1) nsamps <- 10000 report_a <- vector(mode = "numeric", length = nsamps) report_b <- vector(mode = "logical", length = nsamps) truths <- vector(mode = "logical", length = nsamps) for (i in 1:10000) { toss <- sample(1:12, size = 1) truth <- sample(c(TRUE, FALSE), size = 1, prob = c(5/6, 1/6)) truths[i] <- truth if (truth) { report_a[i] <- toss report_b[i] <- toss == 1 } else { remaining <- (1:12)[1:12 != toss] report_a[i] <- sample(remaining, size = 1) report_b[i] <- toss != 1 } } truth_a1 <- truths[report_a == 1] sum(truth_a1) / length(truth_a1) ## [1] 0.8300733 truth_b1 <- truths[report_b] sum(truth_b1) / length(truth_b1) ## [1] 0.3046209 Exercise 3.4 A coin is tossed independently \\(n\\) times. The probability of heads at each toss is \\(p\\). At each time \\(k\\), \\((k = 2,3,...,n)\\) we get a reward at time \\(k+1\\) if \\(k\\)-th toss was a head and the previous toss was a tail. Let \\(A_k\\) be the event that a reward is obtained at time \\(k\\). Are events \\(A_k\\) and \\(A_{k+1}\\) independent? Are events \\(A_k\\) and \\(A_{k+2}\\) independent? R: simulate 10 tosses 10000 times, where \\(p = 0.7\\). Check your answers to a) and b) by counting the frequencies of the events \\(A_5\\), \\(A_6\\), and \\(A_7\\). Solution. For \\(A_k\\) to happen, we need the tosses \\(k-2\\) and \\(k-1\\) be tails and heads respectively. For \\(A_{k+1}\\) to happen, we need tosses \\(k-1\\) and \\(k\\) be tails and heads respectively. As the toss \\(k-1\\) need to be heads for one and tails for the other, these two events can not happen simultaneously. Therefore the probability of their intersection is 0. But the probability of each of them separately is \\(p(1-p) > 0\\). Therefore, they are not independent. For \\(A_k\\) to happen, we need the tosses \\(k-2\\) and \\(k-1\\) be tails and heads respectively. For \\(A_{k+2}\\) to happen, we need tosses \\(k\\) and \\(k+1\\) be tails and heads respectively. So the probability of intersection is \\(p^2(1-p)^2\\). And the probability of each separately is again \\(p(1-p)\\). 
Therefore, they are independent. set.seed(1) nsamps <- 10000 p <- 0.7 rewardA_5 <- vector(mode = "logical", length = nsamps) rewardA_6 <- vector(mode = "logical", length = nsamps) rewardA_7 <- vector(mode = "logical", length = nsamps) rewardA_56 <- vector(mode = "logical", length = nsamps) rewardA_57 <- vector(mode = "logical", length = nsamps) for (i in 1:nsamps) { samps <- sample(c(0,1), size = 10, replace = TRUE, prob = c(0.7, 0.3)) rewardA_5[i] <- (samps[4] == 0 & samps[3] == 1) rewardA_6[i] <- (samps[5] == 0 & samps[4] == 1) rewardA_7[i] <- (samps[6] == 0 & samps[5] == 1) rewardA_56[i] <- (rewardA_5[i] & rewardA_6[i]) rewardA_57[i] <- (rewardA_5[i] & rewardA_7[i]) } sum(rewardA_5) / nsamps ## [1] 0.2141 sum(rewardA_6) / nsamps ## [1] 0.2122 sum(rewardA_7) / nsamps ## [1] 0.2107 sum(rewardA_56) / nsamps ## [1] 0 sum(rewardA_57) / nsamps ## [1] 0.0454 Exercise 3.5 A drawer contains two coins. One is an unbiased coin, the other is a biased coin, which will turn up heads with probability \\(p\\) and tails with probability \\(1-p\\). One coin is selected uniformly at random. The selected coin is tossed \\(n\\) times. The coin turns up heads \\(k\\) times and tails \\(n-k\\) times. What is the probability that the coin is biased? The selected coin is tossed repeatedly until it turns up heads \\(k\\) times. Given that it is tossed \\(n\\) times in total, what is the probability that the coin is biased? Solution. Let \\(B = 1\\) denote that the coin is biased and let \\(H = k\\) denote that we’ve seen \\(k\\) heads. \\[\\begin{align} P(B = 1 | H = k) &= \\frac{P(H = k | B = 1)P(B = 1)}{P(H = k)} \\\\ &= \\frac{P(H = k | B = 1)P(B = 1)}{P(H = k | B = 1)P(B = 1) + P(H = k | B = 0)P(B = 0)} \\\\ &= \\frac{p^k(1-p)^{n-k} 0.5}{p^k(1-p)^{n-k} 0.5 + 0.5^{n+1}} \\\\ &= \\frac{p^k(1-p)^{n-k}}{p^k(1-p)^{n-k} + 0.5^n}. \\end{align}\\] The same results as in a). The only difference between these two scenarios is that in b) the last throw must be heads. However, this holds for the biased and the unbiased coin and therefore does not affect the probability of the coin being biased. Exercise 3.6 Judy goes around the company for Women’s day and shares flowers. In every office she leaves a flower, if there is at least one woman inside. The probability that there’s a woman in the office is \\(\\frac{3}{5}\\). What is the probability that Judy leaves her first flower in the fourth office? Given that she has given away exactly three flowers in the first four offices, what is the probability that she gives her fourth flower in the eighth office? What is the probability that she leaves the second flower in the fifth office? What is the probability that she leaves the second flower in the fifth office, given that she did not leave the second flower in the second office? Judy needs a new supply of flowers immediately after the office, where she gives away her last flower. What is the probability that she visits at least five offices, if she starts with two flowers? R: simulate Judy’s walk 10000 times to check your answers a) - e). Solution. Let \\(X_i = k\\) denote the event that … \\(i\\)-th sample on the \\(k\\)-th run. Since the events are independent, we can multiply their probabilities to get \\[\\begin{equation} P(X_1 = 4) = 0.4^3 \\times 0.6 = 0.0384. \\end{equation}\\] Same as in a) as we have a fresh start after first four offices. For this to be possible, she had to leave the first flower in one of the first four offices. 
Therefore there are four possibilities, and for each of those the probability is \\(0.4^3 \\times 0.6\\). Additionally, the probability that she leaves a flower in the fifth office is \\(0.6\\). So \\[\\begin{equation} P(X_2 = 5) = \\binom{4}{1} \\times 0.4^3 \\times 0.6^2 = 0.09216. \\end{equation}\\] We use Bayes’ theorem. \\[\\begin{align} P(X_2 = 5 | X_2 \\neq 2) &= \\frac{P(X_2 \\neq 2 | X_2 = 5)P(X_2 = 5)}{P(X_2 \\neq 2)} \\\\ &= \\frac{0.09216}{0.64} \\\\ &= 0.144. \\end{align}\\] The denominator in the second equation can be calculated as follows. One of three things has to happen for the second not to be dealt in the second round. First, both are zero, so \\(0.4^2\\). Second, first is zero, and second is one, so \\(0.4 \\times 0.6\\). Third, the first is one and the second one zero, so \\(0.6 \\times 0.4\\). Summing these values we get \\(0.64\\). We will look at the complement, so the events that she gave away exactly two flowers after two, three and four offices. \\[\\begin{equation} P(X_2 \\geq 5) = 1 - 0.6^2 - 2 \\times 0.4 \\times 0.6^2 - 3 \\times 0.4^2 \\times 0.6^2 = 0.1792. \\end{equation}\\] The multiplying parts represent the possibilities of the first flower. set.seed(1) nsamps <- 100000 Judyswalks <- matrix(data = NA, nrow = nsamps, ncol = 8) for (i in 1:nsamps) { thiswalk <- sample(c(0,1), size = 8, replace = TRUE, prob = c(0.4, 0.6)) Judyswalks[i, ] <- thiswalk } csJudy <- t(apply(Judyswalks, 1, cumsum)) # a sum(csJudy[ ,4] == 1 & csJudy[ ,3] == 0) / nsamps ## [1] 0.03848 # b csJsubset <- csJudy[csJudy[ ,4] == 3 & csJudy[ ,3] == 2, ] sum(csJsubset[ ,8] == 4 & csJsubset[ ,7] == 3) / nrow(csJsubset) ## [1] 0.03665893 # c sum(csJudy[ ,5] == 2 & csJudy[ ,4] == 1) / nsamps ## [1] 0.09117 # d sum(csJudy[ ,5] == 2 & csJudy[ ,4] == 1) / sum(csJudy[ ,2] != 2) ## [1] 0.1422398 # e sum(csJudy[ ,4] < 2) / nsamps ## [1] 0.17818 3.2 Conditional independence Exercise 3.7 Describe: A real-world example of two events \\(A\\) and \\(B\\) that are dependent but become conditionally independent if conditioned on a third event \\(C\\). A real-world example of two events \\(A\\) and \\(B\\) that are independent, but become dependent if conditioned on some third event \\(C\\). Solution. Let \\(A\\) be the height of a person and let \\(B\\) be the person’s knowledge of the Dutch language. These events are dependent since the Dutch are known to be taller than average. However if \\(C\\) is the nationality of the person, then \\(A\\) and \\(B\\) are independent given \\(C\\). Let \\(A\\) be the event that Mary passes the exam and let \\(B\\) be the event that John passes the exam. These events are independent. However, if the event \\(C\\) is that Mary and John studied together, then \\(A\\) and \\(B\\) are conditionally dependent given \\(C\\). Exercise 3.8 We have two coins of identical appearance. We know that one is a fair coin and the other flips heads 80% of the time. We choose one of the two coins uniformly at random. We discard the coin that was not chosen. We now flip the chosen coin independently 10 times, producing a sequence \\(Y_1 = y_1\\), \\(Y_2 = y_2\\), …, \\(Y_{10} = y_{10}\\). Intuitively, without doing and computation, are these random variables independent? Compute the probability \\(P(Y_1 = 1)\\). Compute the probabilities \\(P(Y_2 = 1 | Y_1 = 1)\\) and \\(P(Y_{10} = 1 | Y_1 = 1,...,Y_9 = 1)\\). Given your answers to b) and c), would you now change your answer to a)? If so, discuss why your intuition had failed. Solution. \\(P(Y_1 = 1) = 0.5 * 0.8 + 0.5 * 0.5 = 0.65\\). 
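A quick Monte Carlo check of the value just derived, \(P(Y_1 = 1) = 0.65\) (a sketch in base R; the seed and sample size are arbitrary):

```r
set.seed(1)
n <- 100000
biased <- sample(c(TRUE, FALSE), n, replace = TRUE)   # choose one of the two coins uniformly
p_heads <- ifelse(biased, 0.8, 0.5)                   # 0.8 for the biased coin, 0.5 for the fair one
first_flip <- rbinom(n, size = 1, prob = p_heads)     # first toss of the chosen coin
mean(first_flip)                                      # should be close to 0.65
```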
Since we know that \\(Y_1 = 1\\) this should change our view of the probability of the coin being biased or not. Let \\(B = 1\\) denote the event that the coin is biased and let \\(B = 0\\) denote that the coin is unbiased. By using marginal probability, we can write \\[\\begin{align} P(Y_2 = 1 | Y_1 = 1) &= P(Y_2 = 1, B = 1 | Y_1 = 1) + P(Y_2 = 1, B = 0 | Y_1 = 1) \\\\ &= \\sum_{k=1}^2 P(Y_2 = 1 | B = k, Y_1 = 1)P(B = k | Y_1 = 1) \\\\ &= 0.8 \\frac{P(Y_1 = 1 | B = 1)P(B = 1)}{P(Y_1 = 1)} + 0.5 \\frac{P(Y_1 = 1 | B = 0)P(B = 0)}{P(Y_1 = 1)} \\\\ &= 0.8 \\frac{0.8 \\times 0.5}{0.65} + 0.5 \\frac{0.5 \\times 0.5}{0.65} \\\\ &\\approx 0.68. \\end{align}\\] For the other calculation we follow the same procedure. Let \\(X = 1\\) denote that first nine tosses are all heads (equivalent to \\(Y_1 = 1\\),…, \\(Y_9 = 1\\)). \\[\\begin{align} P(Y_{10} = 1 | X = 1) &= P(Y_2 = 1, B = 1 | X = 1) + P(Y_2 = 1, B = 0 | X = 1) \\\\ &= \\sum_{k=1}^2 P(Y_2 = 1 | B = k, X = 1)P(B = k | X = 1) \\\\ &= 0.8 \\frac{P(X = 1 | B = 1)P(B = 1)}{P(X = 1)} + 0.5 \\frac{P(X = 1 | B = 0)P(B = 0)}{P(X = 1)} \\\\ &= 0.8 \\frac{0.8^9 \\times 0.5}{0.5 \\times 0.8^9 + 0.5 \\times 0.5^9} + 0.5 \\frac{0.5^9 \\times 0.5}{0.5 \\times 0.8^9 + 0.5 \\times 0.5^9} \\\\ &\\approx 0.8. \\end{align}\\] 3.3 Monty Hall problem The Monty Hall problem is a famous probability puzzle with non-intuitive outcome. Many established mathematicians and statisticians had problems solving it and many even disregarded the correct solution until they’ve seen the proof by simulation. Here we will show how it can be solved relatively simply with the use of Bayes’ theorem if we select the variables in a smart way. Exercise 3.9 (Monty Hall problem) A prize is placed at random behind one of three doors. You pick a door. Now Monty Hall chooses one of the other two doors, opens it and shows you that it is empty. He then gives you the opportunity to keep your door or switch to the other unopened door. Should you stay or switch? Use Bayes’ theorem to calculate the probability of winning if you switch and if you do not. R: Check your answers in R. Solution. W.L.O.G. assume we always pick the first door. The host can only open door 2 or door 3, as he can not open the door we picked. Let \\(k \\in \\{2,3\\}\\). Let us first look at what happens if we do not change. Then we have \\[\\begin{align} P(\\text{car in 1} | \\text{open $k$}) &= \\frac{P(\\text{open $k$} | \\text{car in 1})P(\\text{car in 1})}{P(\\text{open $k$})} \\\\ &= \\frac{P(\\text{open $k$} | \\text{car in 1})P(\\text{car in 1})}{\\sum_{n=1}^3 P(\\text{open $k$} | \\text{car in $n$})P(\\text{car in $n$)}}. \\end{align}\\] The probability that he opened \\(k\\) if the car is in 1 is \\(\\frac{1}{2}\\), as he can choose between door 2 and 3 as both have a goat behind it. Let us look at the normalization constant. When \\(n = 1\\) we get the value in the nominator. When \\(n=k\\), we get 0, as he will not open the door if there’s a prize behind. The remaining option is that we select 1, the car is behind \\(k\\) and he opens the only door left. Since he can’t open 1 due to it being our pick and \\(k\\) due to having the prize, the probability of opening the remaining door is 1, and the prior probability of the car being behind this door is \\(\\frac{1}{3}\\). So we have \\[\\begin{align} P(\\text{car in 1} | \\text{open $k$}) &= \\frac{\\frac{1}{2}\\frac{1}{3}}{\\frac{1}{2}\\frac{1}{3} + \\frac{1}{3}} \\\\ &= \\frac{1}{3}. \\end{align}\\] Now let us look at what happens if we do change. 
Let \\(k' \\in \\{2,3\\}\\) be the door that is not opened. If we change, we select this door, so we have \\[\\begin{align} P(\\text{car in $k'$} | \\text{open $k$}) &= \\frac{P(\\text{open $k$} | \\text{car in $k'$})P(\\text{car in $k'$})}{P(\\text{open $k$})} \\\\ &= \\frac{P(\\text{open $k$} | \\text{car in $k'$})P(\\text{car in $k'$})}{\\sum_{n=1}^3 P(\\text{open $k$} | \\text{car in $n$})P(\\text{car in $n$)}}. \\end{align}\\] The denominator stays the same, the only thing that is different from before is \\(P(\\text{open $k$} | \\text{car in $k'$})\\). We have a situation where we initially selected door 1 and the car is in door \\(k'\\). The probability that the host will open door \\(k\\) is then 1, as he can not pick any other door. So we have \\[\\begin{align} P(\\text{car in $k'$} | \\text{open $k$}) &= \\frac{\\frac{1}{3}}{\\frac{1}{2}\\frac{1}{3} + \\frac{1}{3}} \\\\ &= \\frac{2}{3}. \\end{align}\\] Therefore it makes sense to change the door. set.seed(1) nsamps <- 1000 ifchange <- vector(mode = "logical", length = nsamps) ifstay <- vector(mode = "logical", length = nsamps) for (i in 1:nsamps) { where_car <- sample(c(1:3), 1) where_player <- sample(c(1:3), 1) open_samp <- (1:3)[where_car != (1:3) & where_player != (1:3)] if (length(open_samp) == 1) { where_open <- open_samp } else { where_open <- sample(open_samp, 1) } ifstay[i] <- where_car == where_player where_ifchange <- (1:3)[where_open != (1:3) & where_player != (1:3)] ifchange[i] <- where_ifchange == where_car } sum(ifstay) / nsamps ## [1] 0.328 sum(ifchange) / nsamps ## [1] 0.672 "],["rvs.html", "Chapter 4 Random variables 4.1 General properties and calculations 4.2 Discrete random variables 4.3 Continuous random variables 4.4 Singular random variables 4.5 Transformations", " Chapter 4 Random variables This chapter deals with random variables and their distributions. The students are expected to acquire the following knowledge: Theoretical Identification of random variables. Convolutions of random variables. Derivation of PDF, PMF, CDF, and quantile function. Definitions and properties of common discrete random variables. Definitions and properties of common continuous random variables. Transforming univariate random variables. R Familiarize with PDF, PMF, CDF, and quantile functions for several distributions. Visual inspection of probability distributions. Analytical and empirical calculation of probabilities based on distributions. New R functions for plotting (for example, facet_wrap). Creating random number generators based on the Uniform distribution. .fold-btn { float: right; margin: 5px 5px 0 0; } .fold { border: 1px solid black; min-height: 40px; } 4.1 General properties and calculations Exercise 4.1 Which of the functions below are valid CDFs? Find their respective densities. R: Plot the three functions. \\[\\begin{equation} F(x) = \\begin{cases} 1 - e^{-x^2} & x \\geq 0 \\\\ 0 & x < 0. \\end{cases} \\end{equation}\\] \\[\\begin{equation} F(x) = \\begin{cases} e^{-\\frac{1}{x}} & x > 0 \\\\ 0 & x \\leq 0. \\end{cases} \\end{equation}\\] \\[\\begin{equation} F(x) = \\begin{cases} 0 & x \\leq 0 \\\\ \\frac{1}{3} & 0 < x \\leq \\frac{1}{2} \\\\ 1 & x > \\frac{1}{2}. \\end{cases} \\end{equation}\\] Solution. Yes. First, let us check the limits. \\(\\lim_{x \\rightarrow -\\infty} (0) = 0\\). \\(\\lim_{x \\rightarrow \\infty} (1 - e^{-x^2}) = 1 - \\lim_{x \\rightarrow \\infty} e^{-x^2} = 1 - 0 = 1\\). Second, let us check whether the function is increasing. Let \\(x > y \\geq 0\\). 
Then \\(1 - e^{-x^2} \\geq 1 - e^{-y^2}\\). We only have to check right continuity for the point zero. \\(F(0) = 0\\) and \\(\\lim_{\\epsilon \\downarrow 0}F (0 + \\epsilon) = \\lim_{\\epsilon \\downarrow 0} 1 - e^{-\\epsilon^2} = 1 - \\lim_{\\epsilon \\downarrow 0} e^{-\\epsilon^2} = 1 - 1 = 0\\). We get the density by differentiating the CDF. \\(p(x) = \\frac{d}{dx} 1 - e^{-x^2} = 2xe^{-x^2}.\\) Students are encouraged to check that this is a proper PDF. Yes. First, let us check the limits. $_{x -} (0) = 0 and \\(\\lim_{x \\rightarrow \\infty} (e^{-\\frac{1}{x}}) = 1\\). Second, let us check whether the function is increasing. Let \\(x > y \\geq 0\\). Then \\(e^{-\\frac{1}{x}} \\geq e^{-\\frac{1}{y}}\\). We only have to check right continuity for the point zero. \\(F(0) = 0\\) and \\(\\lim_{\\epsilon \\downarrow 0}F (0 + \\epsilon) = \\lim_{\\epsilon \\downarrow 0} e^{-\\frac{1}{\\epsilon}} = 0\\). We get the density by differentiating the CDF. \\(p(x) = \\frac{d}{dx} e^{-\\frac{1}{x}} = \\frac{1}{x^2}e^{-\\frac{1}{x}}.\\) Students are encouraged to check that this is a proper PDF. No. The function is not right continuous as \\(F(\\frac{1}{2}) = \\frac{1}{3}\\), but \\(\\lim_{\\epsilon \\downarrow 0} F(\\frac{1}{2} + \\epsilon) = 1\\). f1 <- function (x) { tmp <- 1 - exp(-x^2) tmp[x < 0] <- 0 return(tmp) } f2 <- function (x) { tmp <- exp(-(1 / x)) tmp[x <= 0] <- 0 return(tmp) } f3 <- function (x) { tmp <- x tmp[x == x] <- 1 tmp[x <= 0.5] <- 1/3 tmp[x <= 0] <- 0 return(tmp) } cdf_data <- tibble(x = seq(-1, 20, by = 0.001), f1 = f1(x), f2 = f2(x), f3 = f3(x)) %>% melt(id.vars = "x") cdf_plot <- ggplot(data = cdf_data, aes(x = x, y = value, color = variable)) + geom_hline(yintercept = 1) + geom_line() plot(cdf_plot) Exercise 4.2 Let \\(X\\) be a random variable with CDF \\[\\begin{equation} F(x) = \\begin{cases} 0 & x < 0 \\\\ \\frac{x^2}{2} & 0 \\leq x < 1 \\\\ \\frac{1}{2} + \\frac{p}{2} & 1 \\leq x < 2 \\\\ \\frac{1}{2} + \\frac{p}{2} + \\frac{1 - p}{2} & x \\geq 2 \\end{cases} \\end{equation}\\] R: Plot this CDF for \\(p = 0.3\\). Is it a discrete, continuous, or mixed random varible? Find the probability density/mass of \\(X\\). f1 <- function (x, p) { tmp <- x tmp[x >= 2] <- 0.5 + (p * 0.5) + ((1-p) * 0.5) tmp[x < 2] <- 0.5 + (p * 0.5) tmp[x < 1] <- (x[x < 1])^2 / 2 tmp[x < 0] <- 0 return(tmp) } cdf_data <- tibble(x = seq(-1, 5, by = 0.001), y = f1(x, 0.3)) cdf_plot <- ggplot(data = cdf_data, aes(x = x, y = y)) + geom_hline(yintercept = 1) + geom_line(color = "blue") plot(cdf_plot) ::: {.solution} \\(X\\) is a mixed random variable. Since \\(X\\) is a mixed random variable, we have to find the PDF of the continuous part and the PMF of the discrete part. We get the continuous part by differentiating the corresponding CDF, \\(\\frac{d}{dx}\\frac{x^2}{2} = x\\). So the PDF, when \\(0 \\leq x < 1\\), is \\(p(x) = x\\). Let us look at the discrete part now. It has two steps, so this is a discrete distribution with two outcomes – numbers 1 and 2. The first happens with probability \\(\\frac{p}{2}\\), and the second with probability \\(\\frac{1 - p}{2}\\). This reminds us of the Bernoulli distribution. The PMF for the discrete part is \\(P(X = x) = (\\frac{p}{2})^{2 - x} (\\frac{1 - p}{2})^{x - 1}\\). ::: Exercise 4.3 (Convolutions) Convolutions are probability distributions that correspond to sums of independent random variables. Let \\(X\\) and \\(Y\\) be independent discrete variables. Find the PMF of \\(Z = X + Y\\). Hint: Use the law of total probability. 
Let \\(X\\) and \\(Y\\) be independent continuous variables. Find the PDF of \\(Z = X + Y\\). Hint: Start with the CDF. Solution. \\[\\begin{align} P(Z = z) &= P(X + Y = z) & \\\\ &= \\sum_{k = -\\infty}^\\infty P(X + Y = z | Y = k) P(Y = k) & \\text{ (law of total probability)} \\\\ &= \\sum_{k = -\\infty}^\\infty P(X + k = z | Y = k) P(Y = k) & \\\\ &= \\sum_{k = -\\infty}^\\infty P(X + k = z) P(Y = k) & \\text{ (independence of $X$ and $Y$)} \\\\ &= \\sum_{k = -\\infty}^\\infty P(X = z - k) P(Y = k). & \\end{align}\\] Let \\(f\\) and \\(g\\) be the PDFs of \\(X\\) and \\(Y\\) respectively. \\[\\begin{align} F(z) &= P(Z < z) \\\\ &= P(X + Y < z) \\\\ &= \\int_{-\\infty}^{\\infty} P(X + Y < z | Y = y)P(Y = y)dy \\\\ &= \\int_{-\\infty}^{\\infty} P(X + y < z | Y = y)P(Y = y)dy \\\\ &= \\int_{-\\infty}^{\\infty} P(X + y < z)P(Y = y)dy \\\\ &= \\int_{-\\infty}^{\\infty} P(X < z - y)P(Y = y)dy \\\\ &= \\int_{-\\infty}^{\\infty} (\\int_{-\\infty}^{z - y} f(x) dx) g(y) dy \\end{align}\\] Now \\[\\begin{align} p(z) &= \\frac{d}{dz} F(z) & \\\\ &= \\int_{-\\infty}^{\\infty} (\\frac{d}{dz}\\int_{-\\infty}^{z - y} f(x) dx) g(y) dy & \\\\ &= \\int_{-\\infty}^{\\infty} f(z - y) g(y) dy & \\text{ (fundamental theorem of calculus)}. \\end{align}\\] 4.2 Discrete random variables Exercise 4.4 (Binomial random variable) Let \\(X_k\\), \\(k = 1,...,n\\), be random variables with the Bernoulli measure as the PMF. Let \\(X = \\sum_{k=1}^n X_k\\). We call \\(X_k\\) a Bernoulli random variable with parameter \\(p \\in (0,1)\\). Find the CDF of \\(X_k\\). Find PMF of \\(X\\). This is a Binomial random variable with support in \\(\\{0,1,2,...,n\\}\\) and parameters \\(p \\in (0,1)\\) and \\(n \\in \\mathbb{N}_0\\). We denote \\[\\begin{equation} X | n,p \\sim \\text{binomial}(n,p). \\end{equation}\\] Find CDF of \\(X\\). R: Simulate from the binomial distribution with \\(n = 10\\) and \\(p = 0.5\\), and from \\(n\\) Bernoulli distributions with \\(p = 0.5\\). Visually compare the sum of Bernoullis and the binomial. Hint: there is no standard function like rpois for a Bernoulli random variable. Check exercise 1.12 to find out how to sample from a Bernoulli distribution. Solution. There are two outcomes – zero and one. Zero happens with probability \\(1 - p\\). Therefore \\[\\begin{equation} F(k) = \\begin{cases} 0 & k < 0 \\\\ 1 - p & 0 \\leq k < 1 \\\\ 1 & k \\geq 1. \\end{cases} \\end{equation}\\] For the probability of \\(X\\) to be equal to some \\(k \\leq n\\), exactly \\(k\\) Bernoulli variables need to be one, and the others zero. So \\(p^k(1-p)^{n-k}\\). There are \\(\\binom{n}{k}\\) such possible arrangements. Therefore \\[\\begin{align} P(X = k) = \\binom{n}{k} p^k (1 - p)^{n-k}. \\end{align}\\] \\[\\begin{equation} F(k) = \\sum_{i = 0}^{\\lfloor k \\rfloor} \\binom{n}{i} p^i (1 - p)^{n - i} \\end{equation}\\] set.seed(1) nsamps <- 10000 binom_samp <- rbinom(nsamps, size = 10, prob = 0.5) bernoulli_mat <- matrix(data = NA, nrow = nsamps, ncol = 10) for (i in 1:nsamps) { bernoulli_mat[i, ] <- rbinom(10, size = 1, prob = 0.5) } bern_samp <- apply(bernoulli_mat, 1, sum) b_data <- tibble(x = c(binom_samp, bern_samp), type = c(rep("binomial", 10000), rep("Bernoulli_sum", 10000))) b_plot <- ggplot(data = b_data, aes(x = x, fill = type)) + geom_bar(position = "dodge") plot(b_plot) Exercise 4.5 (Geometric random variable) A variable with PMF \\[\\begin{equation} P(k) = p(1-p)^k \\end{equation}\\] is a geometric random variable with support in non-negative integers. It has one parameter \\(p \\in (0,1]\\). 
We denote \\[\\begin{equation} X | p \\sim \\text{geometric}(p) \\end{equation}\\] Derive the CDF of a geometric random variable. R: Draw 1000 samples from the geometric distribution with \\(p = 0.3\\) and compare their frequencies to theoretical values. Solution. \\[\\begin{align} P(X \\leq k) &= \\sum_{i = 0}^k p(1-p)^i \\\\ &= p \\sum_{i = 0}^k (1-p)^i \\\\ &= p \\frac{1 - (1-p)^{k+1}}{1 - (1 - p)} \\\\ &= 1 - (1-p)^{k + 1} \\end{align}\\] set.seed(1) geo_samp <- rgeom(n = 1000, prob = 0.3) geo_samp <- data.frame(x = geo_samp) %>% count(x) %>% mutate(n = n / 1000, type = "empirical_frequencies") %>% bind_rows(data.frame(x = 0:20, n = dgeom(0:20, prob = 0.3), type = "theoretical_measure")) geo_plot <- ggplot(data = geo_samp, aes(x = x, y = n, fill = type)) + geom_bar(stat="identity", position = "dodge") plot(geo_plot) Exercise 4.6 (Poisson random variable) A variable with PMF \\[\\begin{equation} P(k) = \\frac{\\lambda^k e^{-\\lambda}}{k!} \\end{equation}\\] is a Poisson random variable with support in non-negative integers. It has one positive parameter \\(\\lambda\\), which also represents its mean value and variance (a measure of the deviation of the values from the mean – more on mean and variance in the next chapter). We denote \\[\\begin{equation} X | \\lambda \\sim \\text{Poisson}(\\lambda). \\end{equation}\\] This distribution is usually the default choice for modeling counts. We have already encountered a Poisson random variable in exercise 1.13, where we also sampled from this distribution. The CDF of a Poisson random variable is \\(P(X <= x) = e^{-\\lambda} \\sum_{i=0}^x \\frac{\\lambda^{i}}{i!}\\). R: Draw 1000 samples from the Poisson distribution with \\(\\lambda = 5\\) and compare their empirical cumulative distribution function with the theoretical CDF. set.seed(1) pois_samp <- rpois(n = 1000, lambda = 5) pois_samp <- data.frame(x = pois_samp) pois_plot <- ggplot(data = pois_samp, aes(x = x, colour = "ECDF")) + stat_ecdf(geom = "step") + geom_step(data = tibble(x = 0:17, y = ppois(x, 5)), aes(x = x, y = y, colour = "CDF")) + scale_colour_manual("Lgend title", values = c("black", "red")) plot(pois_plot) Exercise 4.7 (Negative binomial random variable) A variable with PMF \\[\\begin{equation} p(k) = \\binom{k + r - 1}{k}(1-p)^r p^k \\end{equation}\\] is a negative binomial random variable with support in non-negative integers. It has two parameters \\(r > 0\\) and \\(p \\in (0,1)\\). We denote \\[\\begin{equation} X | r,p \\sim \\text{NB}(r,p). \\end{equation}\\] Let us reparameterize the negative binomial distribution with \\(q = 1 - p\\). Find the PMF of \\(X \\sim \\text{NB}(1, q)\\). Do you recognize this distribution? Show that the sum of two negative binomial random variables with the same \\(p\\) is also a negative binomial random variable. Hint: Use the fact that the number of ways to place \\(n\\) indistinct balls into \\(k\\) boxes is \\(\\binom{n + k - 1}{n}\\). R: Draw samples from \\(X \\sim \\text{NB}(5, 0.4)\\) and \\(Y \\sim \\text{NB}(3, 0.4)\\). Draw samples from \\(Z = X + Y\\), where you use the parameters calculated in b). Plot both distributions, their sum, and \\(Z\\) using facet_wrap. Be careful, as R uses a different parameterization size=\\(r\\) and prob=\\(1 - p\\). Solution. \\[\\begin{align} P(X = k) &= \\binom{k + 1 - 1}{k}q^1 (1-q)^k \\\\ &= q(1-q)^k. \\end{align}\\] This is the geometric distribution. Let \\(X \\sim \\text{NB}(r_1, p)\\) and \\(Y \\sim \\text{NB}(r_2, p)\\). Let \\(Z = X + Y\\). 
\\[\\begin{align} P(Z = z) &= \\sum_{k = 0}^{\\infty} P(X = z - k)P(Y = k), \\text{ if k < 0, then the probabilities are 0} \\\\ &= \\sum_{k = 0}^{z} P(X = z - k)P(Y = k), \\text{ if k > z, then the probabilities are 0} \\\\ &= \\sum_{k = 0}^{z} \\binom{z - k + r_1 - 1}{z - k}(1 - p)^{r_1} p^{z - k} \\binom{k + r_2 - 1}{k}(1 - p)^{r_2} p^{k} & \\\\ &= \\sum_{k = 0}^{z} \\binom{z - k + r_1 - 1}{z - k} \\binom{k + r_2 - 1}{k}(1 - p)^{r_1 + r_2} p^{z} & \\\\ &= (1 - p)^{r_1 + r_2} p^{z} \\sum_{k = 0}^{z} \\binom{z - k + r_1 - 1}{z - k} \\binom{k + r_2 - 1}{k}& \\end{align}\\] The part before the sum reminds us of the negative binomial distribution with parameters \\(r_1 + r_2\\) and \\(p\\). To complete this term to the negative binomial PMF we need \\(\\binom{z + r_1 + r_2 -1}{z}\\). So the only thing we need to prove is that the sum equals this term. Both terms in the sum can be interpreted as numbers of ways to place a number of balls into boxes. For the left term it is \\(z-k\\) balls into \\(r_1\\) boxes, and for the right \\(k\\) balls into \\(r_2\\) boxes. For each \\(k\\) we are distributing \\(z\\) balls in total. By summing over all \\(k\\), we actually get all the possible placements of \\(z\\) balls into \\(r_1 + r_2\\) boxes. Therefore \\[\\begin{align} P(Z = z) &= (1 - p)^{r_1 + r_2} p^{z} \\sum_{k = 0}^{z} \\binom{z - k + r_1 - 1}{z - k} \\binom{k + r_2 - 1}{k}& \\\\ &= \\binom{z + r_1 + r_2 -1}{z} (1 - p)^{r_1 + r_2} p^{z}. \\end{align}\\] From this it also follows that the sum of geometric distributions with the same parameter is a negative binomial distribution. \\(Z \\sim \\text{NB}(8, 0.4)\\). set.seed(1) nsamps <- 10000 x <- rnbinom(nsamps, size = 5, prob = 0.6) y <- rnbinom(nsamps, size = 3, prob = 0.6) xpy <- x + y z <- rnbinom(nsamps, size = 8, prob = 0.6) samps <- tibble(x, y, xpy, z) samps <- melt(samps) ggplot(data = samps, aes(x = value)) + geom_bar() + facet_wrap(~ variable) 4.3 Continuous random variables Exercise 4.8 (Exponential random variable) A variable \\(X\\) with PDF \\(\\lambda e^{-\\lambda x}\\) is an exponential random variable with support in non-negative real numbers. It has one positive parameter \\(\\lambda\\). We denote \\[\\begin{equation} X | \\lambda \\sim \\text{Exp}(\\lambda). \\end{equation}\\] Find the CDF of an exponential random variable. Find the quantile function of an exponential random variable. Calculate the probability \\(P(1 \\leq X \\leq 3)\\), where \\(X \\sim \\text{Exp(1.5)}\\). R: Check your answer to c) with a simulation (rexp). Plot the probability in a meaningful way. R: Implement PDF, CDF, and the quantile function and compare their values with corresponding R functions visually. Hint: use the size parameter to make one of the curves wider. Solution. \\[\\begin{align} F(x) &= \\int_{0}^{x} \\lambda e^{-\\lambda t} dt \\\\ &= \\lambda \\int_{0}^{x} e^{-\\lambda t} dt \\\\ &= \\lambda (\\frac{1}{-\\lambda}e^{-\\lambda t} |_{0}^{x}) \\\\ &= \\lambda(\\frac{1}{\\lambda} - \\frac{1}{\\lambda} e^{-\\lambda x}) \\\\ &= 1 - e^{-\\lambda x}. \\end{align}\\] \\[\\begin{align} F(F^{-1}(x)) &= x \\\\ 1 - e^{-\\lambda F^{-1}(x)} &= x \\\\ e^{-\\lambda F^{-1}(x)} &= 1 - x \\\\ -\\lambda F^{-1}(x) &= \\ln(1 - x) \\\\ F^{-1}(x) &= - \\frac{ln(1 - x)}{\\lambda}. \\end{align}\\] \\[\\begin{align} P(1 \\leq X \\leq 3) &= P(X \\leq 3) - P(X \\leq 1) \\\\ &= P(X \\leq 3) - P(X \\leq 1) \\\\ &= 1 - e^{-1.5 \\times 3} - 1 + e^{-1.5 \\times 1} \\\\ &\\approx 0.212. 
\\end{align}\\] set.seed(1) nsamps <- 1000 samps <- rexp(nsamps, rate = 1.5) sum(samps >= 1 & samps <= 3) / nsamps ## [1] 0.212 exp_plot <- ggplot(data.frame(x = seq(0, 5, by = 0.01)), aes(x = x)) + stat_function(fun = dexp, args = list(rate = 1.5)) + stat_function(fun = dexp, args = list(rate = 1.5), xlim = c(1,3), geom = "area", fill = "red") plot(exp_plot) exp_pdf <- function(x, lambda) { return (lambda * exp(-lambda * x)) } exp_cdf <- function(x, lambda) { return (1 - exp(-lambda * x)) } exp_quant <- function(q, lambda) { return (-(log(1 - q) / lambda)) } ggplot(data = data.frame(x = seq(0, 5, by = 0.01)), aes(x = x)) + stat_function(fun = dexp, args = list(rate = 1.5), aes(color = "R"), size = 2.5) + stat_function(fun = exp_pdf, args = list(lambda = 1.5), aes(color = "Mine"), size = 1.2) + scale_color_manual(values = c("red", "black")) ggplot(data = data.frame(x = seq(0, 5, by = 0.01)), aes(x = x)) + stat_function(fun = pexp, args = list(rate = 1.5), aes(color = "R"), size = 2.5) + stat_function(fun = exp_cdf, args = list(lambda = 1.5), aes(color = "Mine"), size = 1.2) + scale_color_manual(values = c("red", "black")) ggplot(data = data.frame(x = seq(0, 1, by = 0.01)), aes(x = x)) + stat_function(fun = qexp, args = list(rate = 1.5), aes(color = "R"), size = 2.5) + stat_function(fun = exp_quant, args = list(lambda = 1.5), aes(color = "Mine"), size = 1.2) + scale_color_manual(values = c("red", "black")) Exercise 4.9 (Uniform random variable) Continuous uniform random variable with parameters \\(a\\) and \\(b\\) has the PDF \\[\\begin{equation} p(x) = \\begin{cases} \\frac{1}{b - a} & x \\in [a,b] \\\\ 0 & \\text{otherwise}. \\end{cases} \\end{equation}\\] Find the CDF of the uniform random variable. Find the quantile function of the uniform random variable. Let \\(X \\sim \\text{Uniform}(a,b)\\). Find the CDF of the variable \\(Y = \\frac{X - a}{b - a}\\). This is the standard uniform random variable. Let \\(X \\sim \\text{Uniform}(-1, 3)\\). Find such \\(z\\) that \\(P(X < z + \\mu_x) = \\frac{1}{5}\\). R: Check your result from d) using simulation. Solution. \\[\\begin{align} F(x) &= \\int_{a}^x \\frac{1}{b - a} dt \\\\ &= \\frac{1}{b - a} \\int_{a}^x dt \\\\ &= \\frac{x - a}{b - a}. \\end{align}\\] \\[\\begin{align} F(F^{-1}(p)) &= p \\\\ \\frac{F^{-1}(p) - a}{b - a} &= p \\\\ F^{-1}(p) &= p(b - a) + a. \\end{align}\\] \\[\\begin{align} F_Y(y) &= P(Y < y) \\\\ &= P(\\frac{X - a}{b - a} < y) \\\\ &= P(X < y(b - a) + a) \\\\ &= F_X(y(b - a) + a) \\\\ &= \\frac{(y(b - a) + a) - a}{b - a} \\\\ &= y. \\end{align}\\] \\[\\begin{align} P(X < z + 1) &= \\frac{1}{5} \\\\ F(z + 1) &= \\frac{1}{5} \\\\ z + 1 &= F^{-1}(\\frac{1}{5}) \\\\ z &= \\frac{1}{5}4 - 1 - 1 \\\\ z &= -1.2. \\end{align}\\] set.seed(1) a <- -1 b <- 3 nsamps <- 10000 unif_samp <- runif(nsamps, a, b) mu_x <- mean(unif_samp) new_samp <- unif_samp - mu_x quantile(new_samp, probs = 1/5) ## 20% ## -1.203192 punif(-0.2, -1, 3) ## [1] 0.2 Exercise 4.10 (Beta random variable) A variable \\(X\\) with PDF \\[\\begin{equation} p(x) = \\frac{x^{\\alpha - 1} (1 - x)^{\\beta - 1}}{\\text{B}(\\alpha, \\beta)}, \\end{equation}\\] where \\(\\text{B}(\\alpha, \\beta) = \\frac{\\Gamma(\\alpha) \\Gamma(\\beta)}{\\Gamma(\\alpha + \\beta)}\\) and \\(\\Gamma(x) = \\int_0^{\\infty} x^{z - 1} e^{-x} dx\\) is a Beta random variable with support on \\([0,1]\\). It has two positive parameters \\(\\alpha\\) and \\(\\beta\\). 
Notation: \\[\\begin{equation} X | \\alpha, \\beta \\sim \\text{Beta}(\\alpha, \\beta) \\end{equation}\\] It is often used in modeling rates. Calculate the PDF for \\(\\alpha = 1\\) and \\(\\beta = 1\\). What do you notice? R: Plot densities of the beta distribution for parameter pairs (2, 2), (4, 1), (1, 4), (2, 5), and (0.1, 0.1). R: Sample from \\(X \\sim \\text{Beta}(2, 5)\\) and compare the histogram with Beta PDF. Solution. \\[\\begin{equation} p(x) = \\frac{x^{1 - 1} (1 - x)^{1 - 1}}{\\text{B}(1, 1)} = 1. \\end{equation}\\] This is the standard uniform distribution. set.seed(1) ggplot(data = data.frame(x = seq(0, 1, by = 0.01)), aes(x = x)) + stat_function(fun = dbeta, args = list(shape1 = 2, shape2 = 2), aes(color = "alpha = 0.5")) + stat_function(fun = dbeta, args = list(shape1 = 4, shape2 = 1), aes(color = "alpha = 4")) + stat_function(fun = dbeta, args = list(shape1 = 1, shape2 = 4), aes(color = "alpha = 1")) + stat_function(fun = dbeta, args = list(shape1 = 2, shape2 = 5), aes(color = "alpha = 25")) + stat_function(fun = dbeta, args = list(shape1 = 0.1, shape2 = 0.1), aes(color = "alpha = 0.1")) set.seed(1) nsamps <- 1000 samps <- rbeta(nsamps, 2, 5) ggplot(data = data.frame(x = samps), aes(x = x)) + geom_histogram(aes(y = ..density..), color = "black") + stat_function(data = data.frame(x = seq(0, 1, by = 0.01)), aes(x = x), fun = dbeta, args = list(shape1 = 2, shape2 = 5), color = "red", size = 1.2) Exercise 4.11 (Gamma random variable) A random variable with PDF \\[\\begin{equation} p(x) = \\frac{\\beta^\\alpha}{\\Gamma(\\alpha)} x^{\\alpha - 1}e^{-\\beta x} \\end{equation}\\] is a Gamma random variable with support on the positive numbers and parameters shape \\(\\alpha > 0\\) and rate \\(\\beta > 0\\). We write \\[\\begin{equation} X | \\alpha, \\beta \\sim \\text{Gamma}(\\alpha, \\beta) \\end{equation}\\] and it’s CDF is \\[\\begin{equation} \\frac{\\gamma(\\alpha, \\beta x)}{\\Gamma(\\alpha)}, \\end{equation}\\] where \\(\\gamma(s, x) = \\int_0^x t^{s-1} e^{-t} dt\\). It is usually used in modeling positive phenomena (for example insurance claims and rainfalls). Let \\(X \\sim \\text{Gamma}(1, \\beta)\\). Find the PDF of \\(X\\). Do you recognize this PDF? Let \\(k = \\alpha\\) and \\(\\theta = \\frac{1}{\\beta}\\). Find the PDF of \\(X | k, \\theta \\sim \\text{Gamma}(k, \\theta)\\). Random variables can be reparameterized, and sometimes a reparameterized distribution is more suitable for certain calculations. The first parameterization is for example usually used in Bayesian statistics, while this parameterization is more common in econometrics and some other applied fields. Note that you also need to pay attention to the parameters in statistical software, so diligently read the help files when using functions like rgamma to see how the function is parameterized. R: Plot gamma CDF for random variables with shape and rate parameters (1,1), (10,1), (1,10). Solution. \\[\\begin{align} p(x) &= \\frac{\\beta^1}{\\Gamma(1)} x^{1 - 1}e^{-\\beta x} \\\\ &= \\beta e^{-\\beta x} \\end{align}\\] This is the PDF of the exponential distribution with parameter \\(\\beta\\). \\[\\begin{align} p(x) &= \\frac{1}{\\Gamma(k)\\beta^k} x^{k - 1}e^{-\\frac{x}{\\theta}}. 
\\end{align}\\] set.seed(1) ggplot(data = data.frame(x = seq(0, 25, by = 0.01)), aes(x = x)) + stat_function(fun = pgamma, args = list(shape = 1, rate = 1), aes(color = "Gamma(1,1)")) + stat_function(fun = pgamma, args = list(shape = 10, rate = 1), aes(color = "Gamma(10,1)")) + stat_function(fun = pgamma, args = list(shape = 1, rate = 10), aes(color = "Gamma(1,10)")) Exercise 4.12 (Normal random variable) A random variable with PDF \\[\\begin{equation} p(x) = \\frac{1}{\\sqrt{2 \\pi \\sigma^2}} e^{-\\frac{(x - \\mu)^2}{2 \\sigma^2}} \\end{equation}\\] is a normal random variable with support on the real axis and parameters \\(\\mu\\) in reals and \\(\\sigma^2 > 0\\). The first is the mean parameter and the second is the variance parameter. Many statistical methods assume a normal distribution. We denote \\[\\begin{equation} X | \\mu, \\sigma \\sim \\text{N}(\\mu, \\sigma^2), \\end{equation}\\] and it’s CDF is \\[\\begin{equation} F(x) = \\int_{-\\infty}^x \\frac{1}{\\sqrt{2 \\pi \\sigma^2}} e^{-\\frac{(t - \\mu)^2}{2 \\sigma^2}} dt, \\end{equation}\\] which is intractable and is usually approximated. Due to its flexibility it is also one of the most researched distributions. For that reason statisticians often use transformations of variables or approximate distributions with the normal distribution. Show that a variable \\(\\frac{X - \\mu}{\\sigma} \\sim \\text{N}(0,1)\\). This transformation is called standardization, and \\(\\text{N}(0,1)\\) is a standard normal distribution. R: Plot the normal distribution with \\(\\mu = 0\\) and different values for the \\(\\sigma\\) parameter. R: The normal distribution provides a good approximation for the Poisson distribution with a large \\(\\lambda\\). Let \\(X \\sim \\text{Poisson}(50)\\). Approximate \\(X\\) with the normal distribution and compare its density with the Poisson histogram. What are the values of \\(\\mu\\) and \\(\\sigma^2\\) that should provide the best approximation? Note that R function rnorm takes standard deviation (\\(\\sigma\\)) as a parameter and not variance. Solution. \\[\\begin{align} P(\\frac{X - \\mu}{\\sigma} < x) &= P(X < \\sigma x + \\mu) \\\\ &= F(\\sigma x + \\mu) \\\\ &= \\int_{-\\infty}^{\\sigma x + \\mu} \\frac{1}{\\sqrt{2 \\pi \\sigma^2}} e^{-\\frac{(t - \\mu)^2}{2\\sigma^2}} dt \\end{align}\\] Now let \\(s = f(t) = \\frac{t - \\mu}{\\sigma}\\), then \\(ds = \\frac{dt}{\\sigma}\\) and \\(f(\\sigma x + \\mu) = x\\), so \\[\\begin{align} P(\\frac{X - \\mu}{\\sigma} < x) &= \\int_{-\\infty}^{x} \\frac{1}{\\sqrt{2 \\pi}} e^{-\\frac{s^2}{2}} ds. \\end{align}\\] There is no need to evaluate this integral, as we recognize it as the CDF of a normal distribution with \\(\\mu = 0\\) and \\(\\sigma^2 = 1\\). 
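A quick numerical companion to a) (an added sketch, not part of the original solution): if standardization is correct, the empirical CDF of standardized draws from an arbitrary normal should coincide with the standard normal CDF.

library(ggplot2)
set.seed(1)
mu <- 3
sigma <- 2
x <- rnorm(10000, mean = mu, sd = sigma)
z <- (x - mu) / sigma                      # standardize the samples
ggplot(data.frame(z = z), aes(x = z)) +
  stat_ecdf(aes(colour = "standardized samples"), geom = "step") +
  stat_function(fun = pnorm, aes(colour = "N(0,1) CDF")) +
  scale_colour_manual(values = c("black", "red"))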
set.seed(1) # b ggplot(data = data.frame(x = seq(-15, 15, by = 0.01)), aes(x = x)) + stat_function(fun = dnorm, args = list(mean = 0, sd = 1), aes(color = "sd = 1")) + stat_function(fun = dnorm, args = list(mean = 0, sd = 0.4), aes(color = "sd = 0.1")) + stat_function(fun = dnorm, args = list(mean = 0, sd = 2), aes(color = "sd = 2")) + stat_function(fun = dnorm, args = list(mean = 0, sd = 5), aes(color = "sd = 5")) # c mean_par <- 50 nsamps <- 100000 pois_samps <- rpois(nsamps, lambda = mean_par) norm_samps <- rnorm(nsamps, mean = mean_par, sd = sqrt(mean_par)) my_plot <- ggplot() + geom_bar(data = tibble(x = pois_samps), aes(x = x, y = (..count..)/sum(..count..))) + geom_density(data = tibble(x = norm_samps), aes(x = x), color = "red") plot(my_plot) Exercise 4.13 (Logistic random variable) A logistic random variable has CDF \\[\\begin{equation} F(x) = \\frac{1}{1 + e^{-\\frac{x - \\mu}{s}}}, \\end{equation}\\] where \\(\\mu\\) is real and \\(s > 0\\). The support is on the real axis. We denote \\[\\begin{equation} X | \\mu, s \\sim \\text{Logistic}(\\mu, s). \\end{equation}\\] The distribution of the logistic random variable resembles a normal random variable, however it has heavier tails. Find the PDF of a logistic random variable. R: Implement logistic PDF and CDF and visually compare both for \\(X \\sim \\text{N}(0, 1)\\) and \\(Y \\sim \\text{logit}(0, \\sqrt{\\frac{3}{\\pi^2}})\\). These distributions have the same mean and variance. Additionally, plot the same plot on the interval [5,10], to better see the difference in the tails. R: For the distributions in b) find the probability \\(P(|X| > 4)\\) and interpret the result. Solution. \\[\\begin{align} p(x) &= \\frac{d}{dx} \\frac{1}{1 + e^{-\\frac{x - \\mu}{s}}} \\\\ &= \\frac{- \\frac{d}{dx} (1 + e^{-\\frac{x - \\mu}{s}})}{(1 + e{-\\frac{x - \\mu}{s}})^2} \\\\ &= \\frac{e^{-\\frac{x - \\mu}{s}}}{s(1 + e{-\\frac{x - \\mu}{s}})^2}. \\end{align}\\] # b set.seed(1) logit_pdf <- function (x, mu, s) { return ((exp(-(x - mu)/(s))) / (s * (1 + exp(-(x - mu)/(s)))^2)) } nl_plot <- ggplot(data = data.frame(x = seq(-12, 12, by = 0.01)), aes(x = x)) + stat_function(fun = dnorm, args = list(mean = 0, sd = 2), aes(color = "normal")) + stat_function(fun = logit_pdf, args = list(mu = 0, s = sqrt(12/pi^2)), aes(color = "logit")) plot(nl_plot) nl_plot <- ggplot(data = data.frame(x = seq(5, 10, by = 0.01)), aes(x = x)) + stat_function(fun = dnorm, args = list(mean = 0, sd = 2), aes(color = "normal")) + stat_function(fun = logit_pdf, args = list(mu = 0, s = sqrt(12/pi^2)), aes(color = "logit")) plot(nl_plot) # c logit_cdf <- function (x, mu, s) { return (1 / (1 + exp(-(x - mu) / s))) } p_logistic <- 1 - logit_cdf(4, 0, sqrt(12/pi^2)) + logit_cdf(-4, 0, sqrt(12/pi^2)) p_norm <- 1 - pnorm(4, 0, 2) + pnorm(-4, 0, 2) p_logistic ## [1] 0.05178347 p_norm ## [1] 0.04550026 # Logistic distribution has wider tails, therefore the probability of larger # absolute values is higher. 4.4 Singular random variables Exercise 4.14 (Cantor distribution) The Cantor set is a subset of \\([0,1]\\), which we create by iteratively deleting the middle third of the interval. For example, in the first iteration, we get the sets \\([0,\\frac{1}{3}]\\) and \\([\\frac{2}{3},1]\\). In the second iteration, we get \\([0,\\frac{1}{9}]\\), \\([\\frac{2}{9},\\frac{1}{3}]\\), \\([\\frac{2}{3}, \\frac{7}{9}]\\), and \\([\\frac{8}{9}, 1]\\). 
On the \\(n\\)-th iteration, we have \\[\\begin{equation} C_n = \\frac{C_{n-1}}{3} \\cup \\bigg(\\frac{2}{3} + \\frac{C_{n-1}}{3} \\bigg), \\end{equation}\\] where \\(C_0 = [0,1]\\). The Cantor set is then defined as the intersection of these sets \\[\\begin{equation} C = \\cap_{n=1}^{\\infty} C_n. \\end{equation}\\] It has the same cardinality as \\([0,1]\\). Another way to define the Cantor set is the set of all numbers on \\([0,1]\\), that do not have a 1 in the ternary representation \\(x = \\sum_{n=1}^\\infty \\frac{x_i}{3^i}, x_i \\in \\{0,1,2\\}\\). A random variable follows the Cantor distribution, if its CDF is the Cantor function (below). You can find the implementations of random number generator, CDF, and quantile functions for the Cantor distributions at https://github.com/Henrygb/CantorDist.R. Show that the Lebesgue measure of the Cantor set is 0. (Jagannathan) Let us look at an infinite sequence of independent fair-coin tosses. If the outcome is heads, let \\(x_i = 2\\) and \\(x_i = 0\\), when tails. Then use these to create \\(x = \\sum_{n=1}^\\infty \\frac{x_i}{3^i}\\). This is a random variable with the Cantor distribution. Show that \\(X\\) has a singular distribution. Solution. \\[\\begin{align} \\lambda(C) &= 1 - \\lambda(C^c) \\\\ &= 1 - \\frac{1}{3}\\sum_{k = 0}^\\infty (\\frac{2}{3})^k \\\\ &= 1 - \\frac{\\frac{1}{3}}{1 - \\frac{2}{3}} \\\\ &= 0. \\end{align}\\] First, for every \\(x\\), the probability of observing it is \\(\\lim_{n \\rightarrow \\infty} \\frac{1}{2^n} = 0\\). Second, the probability that we observe one of all the possible sequences is 1. Therefore \\(P(C) = 1\\). So this is a singular variable. The CDF only increments on the elements of the Cantor set. 4.5 Transformations Exercise 4.15 Let \\(X\\) be a random variable that is uniformly distributed on \\(\\{-2, -1, 0, 1, 2\\}\\). Find the PMF of \\(Y = X^2\\). Solution. \\[\\begin{align} P_Y(y) = \\sum_{x \\in \\sqrt(y)} P_X(x) = \\begin{cases} 0 & y \\notin \\{0,1,4\\} \\\\ \\frac{1}{5} & y = 0 \\\\ \\frac{2}{5} & y \\in \\{1,4\\} \\end{cases} \\end{align}\\] Exercise 4.16 (Lognormal random variable) A lognormal random variable is a variable whose logarithm is normally distributed. In practice, we often encounter skewed data. Usually using a log transformation on such data makes it more symmetric and therefore more suitable for modeling with the normal distribution (more on why we wish to model data with the normal distribution in the following chapters). Let \\(X \\sim \\text{N}(\\mu,\\sigma)\\). Find the PDF of \\(Y: \\log(Y) = X\\). R: Sample from the lognormal distribution with parameters \\(\\mu = 5\\) and \\(\\sigma = 2\\). Plot a histogram of the samples. Then log-transform the samples and plot a histogram along with the theoretical normal PDF. Solution. \\[\\begin{align} p_Y(y) &= p_X(\\log(y)) \\frac{d}{dy} \\log(y) \\\\ &= \\frac{1}{\\sqrt{2 \\pi \\sigma^2}} e^{\\frac{(\\log(y) - \\mu)^2}{2 \\sigma^2}} \\frac{1}{y} \\\\ &= \\frac{1}{y \\sqrt{2 \\pi \\sigma^2}} e^{\\frac{(\\log(y) - \\mu)^2}{2 \\sigma^2}}. 
\\end{align}\\] set.seed(1) nsamps <- 10000 mu <- 0.5 sigma <- 0.4 ln_samps <- rlnorm(nsamps, mu, sigma) ln_plot <- ggplot(data = data.frame(x = ln_samps), aes(x = x)) + geom_histogram(color = "black") plot(ln_plot) norm_samps <- log(ln_samps) n_plot <- ggplot(data = data.frame(x = norm_samps), aes(x = x)) + geom_histogram(aes(y = ..density..), color = "black") + stat_function(fun = dnorm, args = list(mean = mu, sd = sigma), color = "red") plot(n_plot) Exercise 4.17 (Probability integral transform) This exercise is borrowed from Wasserman. Let \\(X\\) have a continuous, strictly increasing CDF \\(F\\). Let \\(Y = F(X)\\). Find the density of \\(Y\\). This is called the probability integral transform. Let \\(U \\sim \\text{Uniform}(0,1)\\) and let \\(X = F^{-1}(U)\\). Show that \\(X \\sim F\\). R: Implement a program that takes Uniform(0,1) random variables and generates random variables from an exponential(\\(\\beta\\)) distribution. Compare your implemented function with function rexp in R. Solution. \\[\\begin{align} F_Y(y) &= P(Y < y) \\\\ &= P(F(X) < y) \\\\ &= P(X < F_X^{-1}(y)) \\\\ &= F_X(F_X^{-1}(y)) \\\\ &= y. \\end{align}\\] From the above it follows that \\(p(y) = 1\\). Note that we need to know the inverse CDF to be able to apply this procedure. \\[\\begin{align} P(X < x) &= P(F^{-1}(U) < x) \\\\ &= P(U < F(x)) \\\\ &= F_U(F(x)) \\\\ &= F(x). \\end{align}\\] set.seed(1) nsamps <- 10000 beta <- 4 generate_exp <- function (n, beta) { tmp <- runif(n) X <- qexp(tmp, beta) return (X) } df <- tibble("R" = rexp(nsamps, beta), "myGenerator" = generate_exp(nsamps, beta)) %>% gather() ggplot(data = df, aes(x = value, fill = key)) + geom_histogram(position = "dodge") "],["mrvs.html", "Chapter 5 Multiple random variables 5.1 General 5.2 Bivariate distribution examples 5.3 Transformations", " Chapter 5 Multiple random variables This chapter deals with multiple random variables and their distributions. The students are expected to acquire the following knowledge: Theoretical Calculation of PDF of transformed multiple random variables. Finding marginal and conditional distributions. R Scatterplots of bivariate random variables. New R functions (for example, expand.grid). .fold-btn { float: right; margin: 5px 5px 0 0; } .fold { border: 1px solid black; min-height: 40px; } 5.1 General Exercise 5.1 Let \\(X \\sim \\text{N}(0,1)\\) and \\(Y \\sim \\text{N}(0,1)\\) be independent random variables. Draw 1000 samples from \\((X,Y)\\) and plot a scatterplot. Now let \\(X \\sim \\text{N}(0,1)\\) and \\(Y | X = x \\sim N(ax, 1)\\). Draw 1000 samples from \\((X,Y)\\) for \\(a = 1\\), \\(a=0\\), and \\(a=-0.5\\). Plot the scatterplots. How would you interpret parameter \\(a\\)? Plot the marginal distribution of \\(Y\\) for cases \\(a=1\\), \\(a=0\\), and \\(a=-0.5\\). Can you guess which distribution it is? set.seed(1) nsamps <- 1000 x <- rnorm(nsamps) y <- rnorm(nsamps) ggplot(data.frame(x, y), aes(x = x, y = y)) + geom_point() y1 <- rnorm(nsamps, mean = 1 * x) y2 <- rnorm(nsamps, mean = 0 * x) y3 <- rnorm(nsamps, mean = -0.5 * x) df <- tibble(x = c(x,x,x), y = c(y1,y2,y3), a = c(rep(1, nsamps), rep(0, nsamps), rep(-0.5, nsamps))) ggplot(df, aes(x = x, y = y)) + geom_point() + facet_wrap(~a) + coord_equal(ratio=1) # Parameter a controls the scale of linear dependency between X and Y. 
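# Added note (not in the original solution): the conditional model is
# Y | X = x ~ N(a*x, 1), i.e. Y = a*X + e with e ~ N(0, 1) independent of X,
# so the marginal of Y is again normal, Y ~ N(0, 1 + a^2). The density plots
# below should therefore look like normal densities with variances 2, 1, and 1.25.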
ggplot(df, aes(x = y)) + geom_density() + facet_wrap(~a) 5.2 Bivariate distribution examples Exercise 5.2 (Discrete bivariate random variable) Let \\(X\\) represent the event that a die rolls an even number and let \\(Y\\) represent the event that a die rolls one, two, or a three. Find the marginal distributions of \\(X\\) and \\(Y\\). Find the PMF of \\((X,Y)\\). Find the CDF of \\((X,Y)\\). Find \\(P(X = 1 | Y = 1)\\). Solution. \\[\\begin{align} P(X = 1) = \\frac{1}{2} \\text{ and } P(X = 0) = \\frac{1}{2} \\\\ P(Y = 1) = \\frac{1}{2} \\text{ and } P(Y = 0) = \\frac{1}{2} \\\\ \\end{align}\\] \\[\\begin{align} P(X = 1, Y = 1) = \\frac{1}{6} \\\\ P(X = 1, Y = 0) = \\frac{2}{6} \\\\ P(X = 0, Y = 1) = \\frac{2}{6} \\\\ P(X = 0, Y = 0) = \\frac{1}{6} \\end{align}\\] \\[\\begin{align} P(X \\leq x, Y \\leq y) = \\begin{cases} \\frac{1}{6} & x = 0, y = 0 \\\\ \\frac{3}{6} & x \\neq y \\\\ 1 & x = 1, y = 1 \\end{cases} \\end{align}\\] \\[\\begin{align} P(X = 1 | Y = 1) = \\frac{1}{3} \\end{align}\\] Exercise 5.3 (Continuous bivariate random variable) Let \\(p(x,y) = 6 (x - y)^2\\) be the PDF of a bivariate random variable \\((X,Y)\\), where both variables range from zero to one. Find CDF. Find marginal distributions. Find conditional distributions. R: Plot a grid of points and colour them by value – this can help us visualize the PDF. R: Implement a random number generator, which will generate numbers from \\((X,Y)\\) and visually check the results. R: Plot the marginal distribution of \\(Y\\) and the conditional distributions of \\(X | Y = y\\), where \\(y \\in \\{0, 0.1, 0.5\\}\\). Solution. \\[\\begin{align} F(x,y) &= \\int_0^{x} \\int_0^{y} 6 (t - s)^2 ds dt\\\\ &= 6 \\int_0^{x} \\int_0^{y} t^2 - 2ts + s^2 ds dt\\\\ &= 6 \\int_0^{x} t^2y - ty^2 + \\frac{y^3}{3} dt \\\\ &= 6 (\\frac{x^3 y}{3} - \\frac{x^2y^2}{2} + \\frac{x y^3}{3}) \\\\ &= 2 x^3 y - 3 t^2y^2 + 2 x y^3 \\end{align}\\] \\[\\begin{align} p(x) &= \\int_0^{1} 6 (x - y)^2 dy\\\\ &= 6 (x^2 - x + \\frac{1}{3}) \\\\ &= 6x^2 - 6x + 2 \\end{align}\\] \\[\\begin{align} p(y) &= \\int_0^{1} 6 (x - y)^2 dx\\\\ &= 6 (y^2 - y + \\frac{1}{3}) \\\\ &= 6y^2 - 6y + 2 \\end{align}\\] \\[\\begin{align} p(x|y) &= \\frac{p(xy)}{p(y)} \\\\ &= \\frac{6 (x - y)^2}{6 (y^2 - y + \\frac{1}{3})} \\\\ &= \\frac{(x - y)^2}{y^2 - y + \\frac{1}{3}} \\end{align}\\] \\[\\begin{align} p(y|x) &= \\frac{p(xy)}{p(x)} \\\\ &= \\frac{6 (x - y)^2}{6 (x^2 - x + \\frac{1}{3})} \\\\ &= \\frac{(x - y)^2}{x^2 - x + \\frac{1}{3}} \\end{align}\\] set.seed(1) # d pxy <- function (x, y) { return ((x - y)^2) } x_axis <- seq(0, 1, length.out = 100) y_axis <- seq(0, 1, length.out = 100) df <- expand.grid(x_axis, y_axis) colnames(df) <- c("x", "y") df <- cbind(df, pdf = pxy(df$x, df$y)) ggplot(data = df, aes(x = x, y = y, color = pdf)) + geom_point() # e samps <- NULL for (i in 1:10000) { xt <- runif(1, 0, 1) yt <- runif(1, 0, 1) pdft <- pxy(xt, yt) acc <- runif(1, 0, 6) if (acc <= pdft) { samps <- rbind(samps, c(xt, yt)) } } colnames(samps) <- c("x", "y") ggplot(data = as.data.frame(samps), aes(x = x, y = y)) + geom_point() # f mar_pdf <- function (x) { return (6 * x^2 - 6 * x + 2) } cond_pdf <- function (x, y) { return (((x - y)^2) / (y^2 - y + 1/3)) } df <- tibble(x = x_axis, mar = mar_pdf(x), y0 = cond_pdf(x, 0), y0.1 = cond_pdf(x, 0.1), y0.5 = cond_pdf(x, 0.5)) %>% gather(dist, value, -x) ggplot(df, aes(x = x, y = value, color = dist)) + geom_line() Exercise 5.4 (Mixed bivariate random variable) Let \\(f(x,y) = \\frac{\\beta^{\\alpha}}{\\Gamma(\\alpha)y!} x^{y+ \\alpha -1} 
e^{-x(1 + \\beta)}\\) be the PDF of a bivariate random variable, where \\(x \\in (0, \\infty)\\) and \\(y \\in \\mathbb{N}_0\\). Find the marginal distribution of \\(X\\). Do you recognize this distribution? Find the conditional distribution of \\(Y | X\\). Do you recognize this distribution? Calculate the probability \\(P(Y = 2 | X = 2.5)\\) for \\((X,Y)\\). Find the marginal distribution of \\(Y\\). Do you recognize this distribution? R: Take 1000 random samples from \\((X,Y)\\) with parameters \\(\\beta = 1\\) and \\(\\alpha = 1\\). Plot a scatterplot. Plot a bar plot of the marginal distribution of \\(Y\\), and the theoretical PMF calculated from d) on the range from 0 to 10. Hint: Use the gamma function in R.? Solution. \\[\\begin{align} p(x) &= \\sum_{k = 0}^{\\infty} \\frac{\\beta^{\\alpha}}{\\Gamma(\\alpha)k!} x^{k + \\alpha -1} e^{-x(1 + \\beta)} & \\\\ &= \\sum_{k = 0}^{\\infty} \\frac{\\beta^{\\alpha}}{\\Gamma(\\alpha)k!} x^{k} x^{\\alpha -1} e^{-x} e^{-\\beta x} & \\\\ &= \\frac{\\beta^{\\alpha}}{\\Gamma(\\alpha)} x^{\\alpha -1} e^{-\\beta x} \\sum_{k = 0}^{\\infty} \\frac{1}{k!} x^{k} e^{-x} & \\\\ &= \\frac{\\beta^{\\alpha}}{\\Gamma(\\alpha)} x^{\\alpha -1} e^{-\\beta x} & \\text{the last term above sums to one} \\end{align}\\] This is the Gamma PDF. \\[\\begin{align} p(y|x) &= \\frac{p(x,y)}{p(x)} \\\\ &= \\frac{\\frac{\\beta^{\\alpha}}{\\Gamma(\\alpha)y!} x^{y+ \\alpha -1} e^{-x(1 + \\beta)}}{\\frac{\\beta^{\\alpha}}{\\Gamma(\\alpha)} x^{\\alpha -1} e^{-\\beta x}} \\\\ &= \\frac{x^y e^{-x}}{y!}. \\end{align}\\] This is the Poisson PMF. \\[\\begin{align} P(Y = 2 | X = 2.5) = \\frac{2.5^2 e^{-2.5}}{2!} \\approx 0.26. \\end{align}\\] \\[\\begin{align} p(y) &= \\int_{0}^{\\infty} \\frac{\\beta^{\\alpha}}{\\Gamma(\\alpha)y!} x^{y + \\alpha -1} e^{-x(1 + \\beta)} dx & \\\\ &= \\frac{1}{y!} \\int_{0}^{\\infty} \\frac{\\beta^{\\alpha}}{\\Gamma(\\alpha)} x^{(y + \\alpha) -1} e^{-(1 + \\beta)x} dx & \\\\ &= \\frac{1}{y!} \\frac{\\beta^{\\alpha}}{\\Gamma(\\alpha)} \\int_{0}^{\\infty} \\frac{\\Gamma(y + \\alpha)}{(1 + \\beta)^{y + \\alpha}} \\frac{(1 + \\beta)^{y + \\alpha}}{\\Gamma(y + \\alpha)} x^{(y + \\alpha) -1} e^{-(1 + \\beta)x} dx & \\text{complete to Gamma PDF} \\\\ &= \\frac{1}{y!} \\frac{\\beta^{\\alpha}}{\\Gamma(\\alpha)} \\frac{\\Gamma(y + \\alpha)}{(1 + \\beta)^{y + \\alpha}}. \\end{align}\\] We add the terms in the third equality to get a Gamma PDF inside the integral, which then integrates to one. We do not recognize this distribution. set.seed(1) px <- function (x, alpha, beta) { return((1 / factorial(x)) * (beta^alpha / gamma(alpha)) * (gamma(x + alpha) / (1 + beta)^(x + alpha))) } nsamps <- 1000 rx <- rgamma(nsamps, 1, 1) ryx <- rpois(nsamps, rx) ggplot(data = data.frame(x = rx, y = ryx), aes(x = x, y = y)) + geom_point() ggplot(data = data.frame(x = rx, y = ryx), aes(x = y)) + geom_bar(aes(y = (..count..)/sum(..count..))) + stat_function(fun = px, args = list(alpha = 1, beta = 1), color = "red") Exercise 5.5 Let \\(f(x,y) = cx^2y\\) for \\(x^2 \\leq y \\leq 1\\) and zero otherwise. Find such \\(c\\) that \\(f\\) is a PDF of a bivariate random variable. This exercise is borrowed from Wasserman. Solution. \\[\\begin{align} 1 &= \\int_{-1}^{1} \\int_{x^2}^1 cx^2y dy dx \\\\ &= \\int_{-1}^{1} cx^2 (\\frac{1}{2} - \\frac{x^4}{2}) dx \\\\ &= \\frac{c}{2} \\int_{-1}^{1} x^2 - x^6 dx \\\\ &= \\frac{c}{2} (\\frac{1}{3} + \\frac{1}{3} - \\frac{1}{7} - \\frac{1}{7}) \\\\ &= \\frac{c}{2} \\frac{8}{21} \\\\ &= \\frac{4c}{21} \\end{align}\\] It follows \\(c = \\frac{21}{4}\\). 
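An optional numerical check of this constant (an added sketch, not part of the original solution): estimate the integral of x^2 * y over the region x^2 <= y <= 1 by Monte Carlo and compare its reciprocal with 21/4 = 5.25.

set.seed(1)
n <- 100000
x <- runif(n, -1, 1)                     # bounding box [-1, 1] x [0, 1]
y <- runif(n, 0, 1)
inside <- y >= x^2                       # indicator of the support x^2 <= y <= 1
int_est <- 2 * mean(x^2 * y * inside)    # box area (2) times the average of x^2 * y over the box
1 / int_est                              # should be close to 21/4 = 5.25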
5.3 Transformations Exercise 5.6 Let \\((X,Y)\\) be uniformly distributed on the unit ball \\(\\{(x,y,z) : x^2 + y^2 + z^2 \\leq 1\\}\\). Let \\(R = \\sqrt{X^2 + Y^2 + Z^2}\\). Find the CDF and PDF of \\(R\\). Solution. \\[\\begin{align} P(R < r) &= P(\\sqrt{X^2 + Y^2 + Z^2} < r) \\\\ &= P(X^2 + Y^2 + Z^2 < r^2) \\\\ &= \\frac{\\frac{4}{3} \\pi r^3}{\\frac{4}{3}\\pi} \\\\ &= r^3. \\end{align}\\] The second line shows us that we are looking at the probability which is represented by a smaller ball with radius \\(r\\). To get the probability, we divide it by the radius of the whole ball. We get the PDF by differentiating the CDF, so \\(p(r) = 3r^2\\). "],["integ.html", "Chapter 6 Integration 6.1 Monte Carlo integration 6.2 Lebesgue integrals", " Chapter 6 Integration This chapter deals with abstract and Monte Carlo integration. The students are expected to acquire the following knowledge: Theoretical How to calculate Lebesgue integrals for non-simple functions. R Monte Carlo integration. .fold-btn { float: right; margin: 5px 5px 0 0; } .fold { border: 1px solid black; min-height: 40px; } 6.1 Monte Carlo integration Exercise 6.1 Let \\(X\\) and \\(Y\\) be continuous random variables on the unit interval and \\(p(x,y) = 6(x - y)^2\\). Use Monte Carlo integration to estimate the probability \\(P(0.2 \\leq X \\leq 0.5, \\: 0.1 \\leq Y \\leq 0.2)\\). Can you find the exact value? set.seed(1) nsamps <- 1000 V <- (0.5 - 0.2) * (0.2 - 0.1) x1 <- runif(nsamps, 0.2, 0.5) x2 <- runif(nsamps, 0.1, 0.2) f_1 <- function (x, y) { return (6 * (x - y)^2) } mcint <- V * (1 / nsamps) * sum(f_1(x1, x2)) sdm <- sqrt((V^2 / nsamps) * var(f_1(x1, x2))) mcint ## [1] 0.008793445 sdm ## [1] 0.0002197686 F_1 <- function (x, y) { return (2 * x^3 * y - 3 * x^2 * y^2 + 2 * x * y^3) } F_1(0.5, 0.2) - F_1(0.2, 0.2) - F_1(0.5, 0.1) + F_1(0.2, 0.1) ## [1] 0.0087 6.2 Lebesgue integrals Exercise 6.2 (borrowed from Jagannathan) Find the Lebesgue integral of the following functions on (\\(\\mathbb{R}\\), \\(\\mathcal{B}(\\mathbb{R})\\), \\(\\lambda\\)). \\[\\begin{align} f(\\omega) = \\begin{cases} \\omega, & \\text{for } \\omega = 0,1,...,n \\\\ 0, & \\text{elsewhere} \\end{cases} \\end{align}\\] \\[\\begin{align} f(\\omega) = \\begin{cases} 1, & \\text{for } \\omega = \\mathbb{Q}^c \\cap [0,1] \\\\ 0, & \\text{elsewhere} \\end{cases} \\end{align}\\] \\[\\begin{align} f(\\omega) = \\begin{cases} n, & \\text{for } \\omega = \\mathbb{Q}^c \\cap [0,n] \\\\ 0, & \\text{elsewhere} \\end{cases} \\end{align}\\] Solution. \\[\\begin{align} \\int f(\\omega) d\\lambda = \\sum_{\\omega = 0}^n \\omega \\lambda(\\omega) = 0. \\end{align}\\] \\[\\begin{align} \\int f(\\omega) d\\lambda = 1 \\times \\lambda(\\mathbb{Q}^c \\cap [0,1]) = 1. \\end{align}\\] \\[\\begin{align} \\int f(\\omega) d\\lambda = n \\times \\lambda(\\mathbb{Q}^c \\cap [0,n]) = n^2. \\end{align}\\] Exercise 6.3 (borrowed from Jagannathan) Let \\(c \\in \\mathbb{R}\\) be fixed and (\\(\\mathbb{R}\\), \\(\\mathcal{B}(\\mathbb{R})\\)) a measurable space. If for any Borel set \\(A\\), \\(\\delta_c (A) = 1\\) if \\(c \\in A\\), and \\(\\delta_c (A) = 0\\) otherwise, then \\(\\delta_c\\) is called a Dirac measure. Let \\(g\\) be a non-negative, measurable function. Show that \\(\\int g d \\delta_c = g(c)\\). Solution. 
\\[\\begin{align} \\int g d \\delta_c &= \\sup_{q \\in S(g)} \\int q d \\delta_c \\\\ &= \\sup_{q \\in S(g)} \\sum_{i = 1}^n a_i \\delta_c(A_i) \\\\ &= \\sup_{q \\in S(g)} \\sum_{i = 1}^n a_i \\text{I}_{A_i}(c) \\\\ &= \\sup_{q \\in S(g)} q(c) \\\\ &= g(c) \\end{align}\\] "],["ev.html", "Chapter 7 Expected value 7.1 Discrete random variables 7.2 Continuous random variables 7.3 Sums, functions, conditional expectations 7.4 Covariance", " Chapter 7 Expected value This chapter deals with expected values of random variables. The students are expected to acquire the following knowledge: Theoretical Calculation of the expected value. Calculation of variance and covariance. Cauchy distribution. R Estimation of expected value. Estimation of variance and covariance. .fold-btn { float: right; margin: 5px 5px 0 0; } .fold { border: 1px solid black; min-height: 40px; } 7.1 Discrete random variables Exercise 7.1 (Bernoulli) Let \\(X \\sim \\text{Bernoulli}(p)\\). Find \\(E[X]\\). Find \\(Var[X]\\). R: Let \\(p = 0.4\\). Check your answers to a) and b) with a simulation. Solution. \\[\\begin{align*} E[X] = \\sum_{k=0}^1 p^k (1-p)^{1-k} k = p. \\end{align*}\\] \\[\\begin{align*} Var[X] = E[X^2] - E[X]^2 = \\sum_{k=0}^1 (p^k (1-p)^{1-k} k^2) - p^2 = p(1-p). \\end{align*}\\] set.seed(1) nsamps <- 1000 x <- rbinom(nsamps, 1, 0.4) mean(x) ## [1] 0.394 var(x) ## [1] 0.239003 0.4 * (1 - 0.4) ## [1] 0.24 Exercise 7.2 (Binomial) Let \\(X \\sim \\text{Binomial}(n,p)\\). Find \\(E[X]\\). Find \\(Var[X]\\). Solution. Let \\(X = \\sum_{i=0}^n X_i\\), where \\(X_i \\sim \\text{Bernoulli}(p)\\). Then, due to linearity of expectation \\[\\begin{align*} E[X] = E[\\sum_{i=0}^n X_i] = \\sum_{i=0}^n E[X_i] = np. \\end{align*}\\] Again let \\(X = \\sum_{i=0}^n X_i\\), where \\(X_i \\sim \\text{Bernoulli}(p)\\). Since the Bernoulli variables \\(X_i\\) are independent we have \\[\\begin{align*} Var[X] = Var[\\sum_{i=0}^n X_i] = \\sum_{i=0}^n Var[X_i] = np(1-p). \\end{align*}\\] Exercise 7.3 (Poisson) Let \\(X \\sim \\text{Poisson}(\\lambda)\\). Find \\(E[X]\\). Find \\(Var[X]\\). Solution. \\[\\begin{align*} E[X] &= \\sum_{k=0}^\\infty \\frac{\\lambda^k e^{-\\lambda}}{k!} k & \\\\ &= \\sum_{k=1}^\\infty \\frac{\\lambda^k e^{-\\lambda}}{k!} k & \\text{term at $k=0$ is 0} \\\\ &= e^{-\\lambda} \\lambda \\sum_{k=1}^\\infty \\frac{\\lambda^{k-1}}{(k - 1)!} & \\\\ &= e^{-\\lambda} \\lambda \\sum_{k=0}^\\infty \\frac{\\lambda^{k}}{k!} & \\\\ &= e^{-\\lambda} \\lambda e^\\lambda & \\\\ &= \\lambda. \\end{align*}\\] \\[\\begin{align*} Var[X] &= E[X^2] - E[X]^2 & \\\\ &= e^{-\\lambda} \\lambda \\sum_{k=1}^\\infty k \\frac{\\lambda^{k-1}}{(k - 1)!} - \\lambda^2 & \\\\ &= e^{-\\lambda} \\lambda \\sum_{k=1}^\\infty (k - 1) + 1) \\frac{\\lambda^{k-1}}{(k - 1)!} - \\lambda^2 & \\\\ &= e^{-\\lambda} \\lambda \\big(\\sum_{k=1}^\\infty (k - 1) \\frac{\\lambda^{k-1}}{(k - 1)!} + \\sum_{k=1}^\\infty \\frac{\\lambda^{k-1}}{(k - 1)!}\\Big) - \\lambda^2 & \\\\ &= e^{-\\lambda} \\lambda \\big(\\lambda\\sum_{k=2}^\\infty \\frac{\\lambda^{k-2}}{(k - 2)!} + e^\\lambda\\Big) - \\lambda^2 & \\\\ &= e^{-\\lambda} \\lambda \\big(\\lambda e^\\lambda + e^\\lambda\\Big) - \\lambda^2 & \\\\ &= \\lambda^2 + \\lambda - \\lambda^2 & \\\\ &= \\lambda. \\end{align*}\\] Exercise 7.4 (Geometric) Let \\(X \\sim \\text{Geometric}(p)\\). Find \\(E[X]\\). Hint: \\(\\frac{d}{dx} x^k = k x^{(k - 1)}\\). Solution. 
\\[\\begin{align*} E[X] &= \\sum_{k=0}^\\infty (1 - p)^k p k & \\\\ &= p (1 - p) \\sum_{k=0}^\\infty (1 - p)^{k-1} k & \\\\ &= p (1 - p) \\sum_{k=0}^\\infty -\\frac{d}{dp}(1 - p)^k & \\\\ &= p (1 - p) \\Big(-\\frac{d}{dp}\\Big) \\sum_{k=0}^\\infty (1 - p)^k & \\\\ &= p (1 - p) \\Big(-\\frac{d}{dp}\\Big) \\frac{1}{1 - (1 - p)} & \\text{geometric series} \\\\ &= \\frac{1 - p}{p} \\end{align*}\\] 7.2 Continuous random variables Exercise 7.5 (Gamma) Let \\(X \\sim \\text{Gamma}(\\alpha, \\beta)\\). Hint: \\(\\Gamma(z) = \\int_0^\\infty t^{z-1}e^{-t} dt\\) and \\(\\Gamma(z + 1) = z \\Gamma(z)\\). Find \\(E[X]\\). Find \\(Var[X]\\). R: Let \\(\\alpha = 10\\) and \\(\\beta = 2\\). Plot the density of \\(X\\). Add a horizontal line at the expected value that touches the density curve (geom_segment). Shade the area within a standard deviation of the expected value. Solution. \\[\\begin{align*} E[X] &= \\int_0^\\infty \\frac{\\beta^\\alpha}{\\Gamma(\\alpha)}x^\\alpha e^{-\\beta x} dx & \\\\ &= \\frac{\\beta^\\alpha}{\\Gamma(\\alpha)} \\int_0^\\infty x^\\alpha e^{-\\beta x} dx & \\text{ (let $t = \\beta x$)} \\\\ &= \\frac{\\beta^\\alpha}{\\Gamma(\\alpha) }\\int_0^\\infty \\frac{t^\\alpha}{\\beta^\\alpha} e^{-t} \\frac{dt}{\\beta} & \\\\ &= \\frac{1}{\\beta \\Gamma(\\alpha) }\\int_0^\\infty t^\\alpha e^{-t} dt & \\\\ &= \\frac{\\Gamma(\\alpha + 1)}{\\beta \\Gamma(\\alpha)} & \\\\ &= \\frac{\\alpha \\Gamma(\\alpha)}{\\beta \\Gamma(\\alpha)} & \\\\ &= \\frac{\\alpha}{\\beta}. & \\end{align*}\\] \\[\\begin{align*} Var[X] &= E[X^2] - E[X]^2 \\\\ &= \\int_0^\\infty \\frac{\\beta^\\alpha}{\\Gamma(\\alpha)}x^{\\alpha+1} e^{-\\beta x} dx - \\frac{\\alpha^2}{\\beta^2} \\\\ &= \\frac{\\Gamma(\\alpha + 2)}{\\beta^2 \\Gamma(\\alpha)} - \\frac{\\alpha^2}{\\beta^2} \\\\ &= \\frac{(\\alpha + 1)\\alpha\\Gamma(\\alpha)}{\\beta^2 \\Gamma(\\alpha)} - \\frac{\\alpha^2}{\\beta^2} \\\\ &= \\frac{\\alpha^2 + \\alpha}{\\beta^2} - \\frac{\\alpha^2}{\\beta^2} \\\\ &= \\frac{\\alpha}{\\beta^2}. \\end{align*}\\] set.seed(1) x <- seq(0, 25, by = 0.01) y <- dgamma(x, shape = 10, rate = 2) df <- data.frame(x = x, y = y) ggplot(df, aes(x = x, y = y)) + geom_line() + geom_segment(aes(x = 5, y = 0, xend = 5, yend = dgamma(5, shape = 10, rate = 2)), color = "red") + stat_function(fun = dgamma, args = list(shape = 10, rate = 2), xlim = c(5 - sqrt(10/4), 5 + sqrt(10/4)), geom = "area", fill = "gray", alpha = 0.4) Exercise 7.6 (Beta) Let \\(X \\sim \\text{Beta}(\\alpha, \\beta)\\). Find \\(E[X]\\). Hint 1: \\(\\text{B}(x,y) = \\int_0^1 t^{x-1} (1 - t)^{y-1} dt\\). Hint 2: \\(\\text{B}(x + 1, y) = \\text{B}(x,y)\\frac{x}{x + y}\\). Find \\(Var[X]\\). Solution. \\[\\begin{align*} E[X] &= \\int_0^1 \\frac{x^{\\alpha - 1} (1 - x)^{\\beta - 1}}{\\text{B}(\\alpha, \\beta)} x dx \\\\ &= \\frac{1}{\\text{B}(\\alpha, \\beta)}\\int_0^1 x^{\\alpha} (1 - x)^{\\beta - 1} dx \\\\ &= \\frac{1}{\\text{B}(\\alpha, \\beta)} \\text{B}(\\alpha + 1, \\beta) \\\\ &= \\frac{1}{\\text{B}(\\alpha, \\beta)} \\text{B}(\\alpha, \\beta) \\frac{\\alpha}{\\alpha + \\beta} \\\\ &= \\frac{\\alpha}{\\alpha + \\beta}. 
\\\\ \\end{align*}\\] \\[\\begin{align*} Var[X] &= E[X^2] - E[X]^2 \\\\ &= \\int_0^1 \\frac{x^{\\alpha - 1} (1 - x)^{\\beta - 1}}{\\text{B}(\\alpha, \\beta)} x^2 dx - \\frac{\\alpha^2}{(\\alpha + \\beta)^2} \\\\ &= \\frac{1}{\\text{B}(\\alpha, \\beta)}\\int_0^1 x^{\\alpha + 1} (1 - x)^{\\beta - 1} dx - \\frac{\\alpha^2}{(\\alpha + \\beta)^2} \\\\ &= \\frac{1}{\\text{B}(\\alpha, \\beta)} \\text{B}(\\alpha + 2, \\beta) - \\frac{\\alpha^2}{(\\alpha + \\beta)^2} \\\\ &= \\frac{1}{\\text{B}(\\alpha, \\beta)} \\text{B}(\\alpha + 1, \\beta) \\frac{\\alpha + 1}{\\alpha + \\beta + 1} - \\frac{\\alpha^2}{(\\alpha + \\beta)^2} \\\\ &= \\frac{\\alpha + 1}{\\alpha + \\beta + 1} \\frac{\\alpha}{\\alpha + \\beta} - \\frac{\\alpha^2}{(\\alpha + \\beta)^2}\\\\ &= \\frac{\\alpha \\beta}{(\\alpha + \\beta)^2(\\alpha + \\beta + 1)}. \\end{align*}\\] Exercise 7.7 (Exponential) Let \\(X \\sim \\text{Exp}(\\lambda)\\). Find \\(E[X]\\). Hint: \\(\\Gamma(z + 1) = z\\Gamma(z)\\) and \\(\\Gamma(1) = 1\\). Find \\(Var[X]\\). Solution. \\[\\begin{align*} E[X] &= \\int_0^\\infty \\lambda e^{-\\lambda x} x dx & \\\\ &= \\lambda \\int_0^\\infty x e^{-\\lambda x} dx & \\\\ &= \\lambda \\int_0^\\infty \\frac{t}{\\lambda} e^{-t} \\frac{dt}{\\lambda} & \\text{$t = \\lambda x$}\\\\ &= \\lambda \\lambda^{-2} \\Gamma(2) & \\text{definition of gamma function} \\\\ &= \\lambda^{-1}. \\end{align*}\\] \\[\\begin{align*} Var[X] &= E[X^2] - E[X]^2 & \\\\ &= \\int_0^\\infty \\lambda e^{-\\lambda x} x^2 dx - \\lambda^{-2} & \\\\ &= \\lambda \\int_0^\\infty \\frac{t^2}{\\lambda^2} e^{-t} \\frac{dt}{\\lambda} - \\lambda^{-2} & \\text{$t = \\lambda x$} \\\\ &= \\lambda \\lambda^{-3} \\Gamma(3) - \\lambda^{-2} & \\text{definition of gamma function} & \\\\ &= \\lambda^{-2} 2 \\Gamma(2) - \\lambda^{-2} & \\\\ &= 2 \\lambda^{-2} - \\lambda^{-2} & \\\\ &= \\lambda^{-2}. & \\\\ \\end{align*}\\] Exercise 7.8 (Normal) Let \\(X \\sim \\text{N}(\\mu, \\sigma)\\). Show that \\(E[X] = \\mu\\). Hint: Use the error function \\(\\text{erf}(x) = \\frac{1}{\\sqrt(\\pi)} \\int_{-x}^x e^{-t^2} dt\\). The statistical interpretation of this function is that if \\(Y \\sim \\text{N}(0, 0.5)\\), then the error function describes the probability of \\(Y\\) falling between \\(-x\\) and \\(x\\). Also, \\(\\text{erf}(\\infty) = 1\\). Show that \\(Var[X] = \\sigma^2\\). Hint: Start with the definition of variance. Solution. \\[\\begin{align*} E[X] &= \\int_{-\\infty}^\\infty \\frac{1}{\\sqrt{2\\pi \\sigma^2}} e^{-\\frac{(x - \\mu)^2}{2\\sigma^2}} x dx & \\\\ &= \\frac{1}{\\sqrt{2\\pi \\sigma^2}} \\int_{-\\infty}^\\infty x e^{-\\frac{(x - \\mu)^2}{2\\sigma^2}} dx & \\\\ &= \\frac{1}{\\sqrt{2\\pi \\sigma^2}} \\int_{-\\infty}^\\infty \\Big(t \\sqrt{2\\sigma^2} + \\mu\\Big)e^{-t^2} \\sqrt{2 \\sigma^2} dt & t = \\frac{x - \\mu}{\\sqrt{2}\\sigma} \\\\ &= \\frac{\\sqrt{2\\sigma^2}}{\\sqrt{\\pi}} \\int_{-\\infty}^\\infty t e^{-t^2} dt + \\frac{1}{\\sqrt{\\pi}} \\int_{-\\infty}^\\infty \\mu e^{-t^2} dt & \\\\ \\end{align*}\\] Let us calculate these integrals separately. \\[\\begin{align*} \\int t e^{-t^2} dt &= -\\frac{1}{2}\\int e^{s} ds & s = -t^2 \\\\ &= -\\frac{e^s}{2} + C \\\\ &= -\\frac{e^{-t^2}}{2} + C & \\text{undoing substitution}. \\end{align*}\\] Inserting the integration limits we get \\[\\begin{align*} \\int_{-\\infty}^\\infty t e^{-t^2} dt &= 0, \\end{align*}\\] due to the integrated function being symmetric. 
Reordering the second integral we get \\[\\begin{align*} \\mu \\frac{1}{\\sqrt{\\pi}} \\int_{-\\infty}^\\infty e^{-t^2} dt &= \\mu \\text{erf}(\\infty) & \\text{definition of error function} \\\\ &= \\mu & \\text{probability of $Y$ falling between $-\\infty$ and $\\infty$}. \\end{align*}\\] Combining all of the above we get \\[\\begin{align*} E[X] &= \\frac{\\sqrt{2\\sigma^2}}{\\sqrt{\\pi}} \\times 0 + \\mu &= \\mu.\\\\ \\end{align*}\\] \\[\\begin{align*} Var[X] &= E[(X - E[X])^2] \\\\ &= E[(X - \\mu)^2] \\\\ &= \\frac{1}{\\sqrt{2\\pi \\sigma^2}} \\int_{-\\infty}^\\infty (x - \\mu)^2 e^{-\\frac{(x - \\mu)^2}{2\\sigma^2}} dx \\\\ &= \\frac{\\sigma^2}{\\sqrt{2\\pi}} \\int_{-\\infty}^\\infty t^2 e^{-\\frac{t^2}{2}} dt \\\\ &= \\frac{\\sigma^2}{\\sqrt{2\\pi}} \\bigg(\\Big(- t e^{-\\frac{t^2}{2}} |_{-\\infty}^\\infty \\Big) + \\int_{-\\infty}^\\infty e^{-\\frac{t^2}{2}} \\bigg) dt & \\text{integration by parts} \\\\ &= \\frac{\\sigma^2}{\\sqrt{2\\pi}} \\sqrt{2 \\pi} \\int_{-\\infty}^\\infty \\frac{1}{\\sqrt(\\pi)}e^{-s^2} \\bigg) & s = \\frac{t}{\\sqrt{2}} \\text{ and evaluating the left expression at the bounds} \\\\ &= \\frac{\\sigma^2}{\\sqrt{2\\pi}} \\sqrt{2 \\pi} \\Big(\\text{erf}(\\infty) & \\text{definition of error function} \\\\ &= \\sigma^2. \\end{align*}\\] 7.3 Sums, functions, conditional expectations Exercise 7.9 (Expectation of transformations) Let \\(X\\) follow a normal distribution with mean \\(\\mu\\) and variance \\(\\sigma^2\\). Find \\(E[2X + 4]\\). Find \\(E[X^2]\\). Find \\(E[\\exp(X)]\\). Hint: Use the error function \\(\\text{erf}(x) = \\frac{1}{\\sqrt(\\pi)} \\int_{-x}^x e^{-t^2} dt\\). Also, \\(\\text{erf}(\\infty) = 1\\). R: Check your results numerically for \\(\\mu = 0.4\\) and \\(\\sigma^2 = 0.25\\) and plot the densities of all four distributions. Solution. \\[\\begin{align} E[2X + 4] &= 2E[X] + 4 & \\text{linearity of expectation} \\\\ &= 2\\mu + 4. \\\\ \\end{align}\\] \\[\\begin{align} E[X^2] &= E[X]^2 + Var[X] & \\text{definition of variance} \\\\ &= \\mu^2 + \\sigma^2. \\end{align}\\] \\[\\begin{align} E[\\exp(X)] &= \\int_{-\\infty}^\\infty \\frac{1}{\\sqrt{2\\pi \\sigma^2}} e^{-\\frac{(x - \\mu)^2}{2\\sigma^2}} e^x dx & \\\\ &= \\frac{1}{\\sqrt{2\\pi \\sigma^2}} \\int_{-\\infty}^\\infty e^{\\frac{2 \\sigma^2 x}{2\\sigma^2} -\\frac{(x - \\mu)^2}{2\\sigma^2}} dx & \\\\ &= \\frac{1}{\\sqrt{2\\pi \\sigma^2}} \\int_{-\\infty}^\\infty e^{-\\frac{x^2 - 2x(\\mu + \\sigma^2) + \\mu^2}{2\\sigma^2}} dx & \\\\ &= \\frac{1}{\\sqrt{2\\pi \\sigma^2}} \\int_{-\\infty}^\\infty e^{-\\frac{(x - (\\mu + \\sigma^2))^2 + \\mu^2 - (\\mu + \\sigma^2)^2}{2\\sigma^2}} dx & \\text{complete the square} \\\\ &= \\frac{1}{\\sqrt{2\\pi \\sigma^2}} e^{\\frac{- \\mu^2 + (\\mu + \\sigma^2)^2}{2\\sigma^2}} \\int_{-\\infty}^\\infty e^{-\\frac{(x - (\\mu + \\sigma^2))^2}{2\\sigma^2}} dx & \\\\ &= \\frac{1}{\\sqrt{2\\pi \\sigma^2}} e^{\\frac{- \\mu^2 + (\\mu + \\sigma^2)^2}{2\\sigma^2}} \\sigma \\sqrt{2 \\pi} \\text{erf}(\\infty) & \\\\ &= e^{\\frac{2\\mu + \\sigma^2}{2}}. \\end{align}\\] set.seed(1) mu <- 0.4 sigma <- 0.5 x <- rnorm(100000, mean = mu, sd = sigma) mean(2*x + 4) ## [1] 4.797756 2 * mu + 4 ## [1] 4.8 mean(x^2) ## [1] 0.4108658 mu^2 + sigma^2 ## [1] 0.41 mean(exp(x)) ## [1] 1.689794 exp((2 * mu + sigma^2) / 2) ## [1] 1.690459 Exercise 7.10 (Sum of independent random variables) Borrowed from Wasserman. Let \\(X_1, X_2,...,X_n\\) be IID random variables with expected value \\(E[X_i] = \\mu\\) and variance \\(Var[X_i] = \\sigma^2\\). 
Find the expected value and variance of \\(\\bar{X} = \\frac{1}{n} \\sum_{i=1}^n X_i\\). \\(\\bar{X}\\) is called a statistic (a function of the values in a sample). It is itself a random variable and its distribution is called a sampling distribution. R: Take \\(n = 5, 10, 100, 1000\\) samples from the N(\\(2\\), \\(6\\)) distribution 10000 times. Plot the theoretical density and the densities of \\(\\bar{X}\\) statistic for each \\(n\\). Intuitively, are the results in correspondence with your calculations? Check them numerically. Solution. Let us start with the expectation of \\(\\bar{X}\\). \\[\\begin{align} E[\\bar{X}] &= E[\\frac{1}{n} \\sum_{i=1}^n X_i] & \\\\ &= \\frac{1}{n} E[\\sum_{i=1}^n X_i] & \\text{ (multiplication with a scalar)} \\\\ &= \\frac{1}{n} \\sum_{i=1}^n E[X_i] & \\text{ (linearity)} \\\\ &= \\frac{1}{n} n \\mu & \\\\ &= \\mu. \\end{align}\\] Now the variance \\[\\begin{align} Var[\\bar{X}] &= Var[\\frac{1}{n} \\sum_{i=1}^n X_i] & \\\\ &= \\frac{1}{n^2} Var[\\sum_{i=1}^n X_i] & \\text{ (multiplication with a scalar)} \\\\ &= \\frac{1}{n^2} \\sum_{i=1}^n Var[X_i] & \\text{ (independence of samples)} \\\\ &= \\frac{1}{n^2} n \\sigma^2 & \\\\ &= \\frac{1}{n} \\sigma^2. \\end{align}\\] set.seed(1) nsamps <- 10000 mu <- 2 sigma <- sqrt(6) N <- c(5, 10, 100, 500) X <- matrix(data = NA, nrow = nsamps, ncol = length(N)) ind <- 1 for (n in N) { for (i in 1:nsamps) { X[i,ind] <- mean(rnorm(n, mu, sigma)) } ind <- ind + 1 } colnames(X) <- N X <- melt(as.data.frame(X)) ggplot(data = X, aes(x = value, colour = variable)) + geom_density() + stat_function(data = data.frame(x = seq(-2, 6, by = 0.01)), aes(x = x), fun = dnorm, args = list(mean = mu, sd = sigma), color = "black") Exercise 7.11 (Conditional expectation) Let \\(X \\in \\mathbb{R}_0^+\\) and \\(Y \\in \\mathbb{N}_0\\) be random variables with joint distribution \\(p_{XY}(X,Y) = \\frac{1}{y + 1} e^{-\\frac{x}{y + 1}} 0.5^{y + 1}\\). Find \\(E[X | Y = y]\\) by first finding \\(p_Y\\) and then \\(p_{X|Y}\\). Find \\(E[X]\\). R: check your answers to a) and b) by drawing 10000 samples from \\(p_Y\\) and \\(p_{X|Y}\\). Solution. \\[\\begin{align} p(y) &= \\int_0^\\infty \\frac{1}{y + 1} e^{-\\frac{x}{y + 1}} 0.5^{y + 1} dx \\\\ &= \\frac{0.5^{y + 1}}{y + 1} \\int_0^\\infty e^{-\\frac{x}{y + 1}} dx \\\\ &= \\frac{0.5^{y + 1}}{y + 1} (y + 1) \\\\ &= 0.5^{y + 1} \\\\ &= 0.5(1 - 0.5)^y. \\end{align}\\] We recognize this as the geometric distribution. \\[\\begin{align} p(x|y) &= \\frac{p(x,y)}{p(y)} \\\\ &= \\frac{1}{y + 1} e^{-\\frac{x}{y + 1}}. \\end{align}\\] We recognize this as the exponential distribution. \\[\\begin{align} E[X | Y = y] &= \\int_0^\\infty x \\frac{1}{y + 1} e^{-\\frac{x}{y + 1}} dx \\\\ &= y + 1 & \\text{expected value of the exponential distribution} \\end{align}\\] Use the law of iterated expectation. \\[\\begin{align} E[X] &= E[E[X | Y]] \\\\ &= E[Y + 1] \\\\ &= E[Y] + 1 \\\\ &= \\frac{1 - 0.5}{0.5} + 1 \\\\ &= 2. \\end{align}\\] set.seed(1) y <- rgeom(100000, 0.5) x <- rexp(100000, rate = 1 / (y + 1)) x2 <- x[y == 3] mean(x2) ## [1] 4.048501 3 + 1 ## [1] 4 mean(x) ## [1] 2.007639 (1 - 0.5) / 0.5 + 1 ## [1] 2 Exercise 7.12 (Cauchy distribution) Let \\(p(x | x_0, \\gamma) = \\frac{1}{\\pi \\gamma \\Big(1 + \\big(\\frac{x - x_0}{\\gamma}\\big)^2\\Big)}\\). A random variable with this PDF follows a Cauchy distribution. This distribution is symmetric and has wider tails than the normal distribution. R: Draw \\(n = 1,...,1000\\) samples from a standard normal and \\(\\text{Cauchy}(0, 1)\\). 
For each \\(n\\) plot the mean and the median of the sample using facets. Interpret the results. To get a mathematical explanation of the results in a), evaluate the integral \\(\\int_0^\\infty \\frac{x}{1 + x^2} dx\\) and consider that \\(E[X] = \\int_{-\\infty}^\\infty \\frac{x}{1 + x^2}dx\\). set.seed(1) n <- 1000 means_n <- vector(mode = "numeric", length = n) means_c <- vector(mode = "numeric", length = n) medians_n <- vector(mode = "numeric", length = n) medians_c <- vector(mode = "numeric", length = n) for (i in 1:n) { tmp_n <- rnorm(i) tmp_c <- rcauchy(i) means_n[i] <- mean(tmp_n) means_c[i] <- mean(tmp_c) medians_n[i] <- median(tmp_n) medians_c[i] <- median(tmp_c) } df <- data.frame("distribution" = c(rep("normal", 2 * n), rep("Cauchy", 2 * n)), "type" = c(rep("mean", n), rep("median", n), rep("mean", n), rep("median", n)), "value" = c(means_n, medians_n, means_c, medians_c), "n" = rep(1:n, times = 4)) ggplot(df, aes(x = n, y = value)) + geom_line(alpha = 0.5) + facet_wrap(~ type + distribution , scales = "free") Solution. \\[\\begin{align} \\int_0^\\infty \\frac{x}{1 + x^2} dx &= \\frac{1}{2} \\int_1^\\infty \\frac{1}{u} du & u = 1 + x^2 \\\\ &= \\frac{1}{2} \\ln(x) |_0^\\infty. \\end{align}\\] This integral is not finite. The same holds for the negative part. Therefore, the expectation is undefined, as \\(E[|X|] = \\infty\\). Why can we not just claim that \\(f(x) = x / (1 + x^2)\\) is odd and \\(\\int_{-\\infty}^\\infty f(x) = 0\\)? By definition of the Lebesgue integral \\(\\int_{-\\infty}^{\\infty} f= \\int_{-\\infty}^{\\infty} f_+-\\int_{-\\infty}^{\\infty} f_-\\). At least one of the two integrals needs to be finite for \\(\\int_{-\\infty}^{\\infty} f\\) to be well-defined. However \\(\\int_{-\\infty}^{\\infty} f_+=\\int_0^{\\infty} x/(1+x^2)\\) and \\(\\int_{-\\infty}^{\\infty} f_-=\\int_{-\\infty}^{0} |x|/(1+x^2)\\). We have just shown that both of these integrals are infinite, which implies that their sum is also infinite. 7.4 Covariance Exercise 7.13 Below is a table of values for random variables \\(X\\) and \\(Y\\). X Y 2.1 8 -0.5 11 1 10 -2 12 4 9 Find sample covariance of \\(X\\) and \\(Y\\). Find sample variances of \\(X\\) and \\(Y\\). Find sample correlation of \\(X\\) and \\(Y\\). Find sample variance of \\(Z = 2X - 3Y\\). Solution. \\(\\bar{X} = 0.92\\) and \\(\\bar{Y} = 10\\). \\[\\begin{align} s(X, Y) &= \\frac{1}{n - 1} \\sum_{i=1}^5 (X_i - 0.92) (Y_i - 10) \\\\ &= -3.175. \\end{align}\\] \\[\\begin{align} s(X) &= \\frac{\\sum_{i=1}^5(X_i - 0.92)^2}{5 - 1} \\\\ &= 5.357. \\end{align}\\] \\[\\begin{align} s(Y) &= \\frac{\\sum_{i=1}^5(Y_i - 10)^2}{5 - 1} \\\\ &= 2.5. \\end{align}\\] \\[\\begin{align} r(X,Y) &= \\frac{Cov(X,Y)}{\\sqrt{Var[X]Var[Y]}} \\\\ &= \\frac{-3.175}{\\sqrt{5.357 \\times 2.5}} \\\\ &= -8.68. \\end{align}\\] \\[\\begin{align} s(Z) &= 2^2 s(X) + 3^2 s(Y) + 2 \\times 2 \\times 3 s(X, Y) \\\\ &= 4 \\times 5.357 + 9 \\times 2.5 + 12 \\times 3.175 \\\\ &= 82.028. \\end{align}\\] Exercise 7.14 Let \\(X \\sim \\text{Uniform}(0,1)\\) and \\(Y | X = x \\sim \\text{Uniform(0,x)}\\). Find the covariance of \\(X\\) and \\(Y\\). Find the correlation of \\(X\\) and \\(Y\\). R: check your answers to a) and b) with simulation. Plot \\(X\\) against \\(Y\\) on a scatterplot. Solution. The joint PDF is \\(p(x,y) = p(x)p(y|x) = \\frac{1}{x}\\). 
\\[\\begin{align} Cov(X,Y) &= E[XY] - E[X]E[Y] \\\\ \\end{align}\\] Let us first evaluate the first term: \\[\\begin{align} E[XY] &= \\int_0^1 \\int_0^x x y \\frac{1}{x} dy dx \\\\ &= \\int_0^1 \\int_0^x y dy dx \\\\ &= \\int_0^1 \\frac{x^2}{2} dx \\\\ &= \\frac{1}{6}. \\end{align}\\] Now let us find \\(E[Y]\\), \\(E[X]\\) is trivial. \\[\\begin{align} E[Y] = E[E[Y | X]] = E[\\frac{X}{2}] = \\frac{1}{2} \\int_0^1 x dx = \\frac{1}{4}. \\end{align}\\] Combining all: \\[\\begin{align} Cov(X,Y) &= \\frac{1}{6} - \\frac{1}{2} \\frac{1}{4} = \\frac{1}{24}. \\end{align}\\] \\[\\begin{align} \\rho(X,Y) &= \\frac{Cov(X,Y)}{\\sqrt{Var[X]Var[Y]}} \\\\ \\end{align}\\] Let us calculate \\(Var[X]\\). \\[\\begin{align} Var[X] &= E[X^2] - \\frac{1}{4} \\\\ &= \\int_0^1 x^2 - \\frac{1}{4} \\\\ &= \\frac{1}{3} - \\frac{1}{4} \\\\ &= \\frac{1}{12}. \\end{align}\\] Let us calculate \\(E[E[Y^2|X]]\\). \\[\\begin{align} E[E[Y^2|X]] &= E[\\frac{x^2}{3}] \\\\ &= \\frac{1}{9}. \\end{align}\\] Then \\(Var[Y] = \\frac{1}{9} - \\frac{1}{16} = \\frac{7}{144}\\). Combining all \\[\\begin{align} \\rho(X,Y) &= \\frac{\\frac{1}{24}}{\\sqrt{\\frac{1}{12}\\frac{7}{144}}} \\\\ &= 0.65. \\end{align}\\] set.seed(1) nsamps <- 10000 x <- runif(nsamps) y <- runif(nsamps, 0, x) cov(x, y) ## [1] 0.04274061 1/24 ## [1] 0.04166667 cor(x, y) ## [1] 0.6629567 (1 / 24) / (sqrt(7 / (12 * 144))) ## [1] 0.6546537 ggplot(data.frame(x = x, y = y), aes(x = x, y = y)) + geom_point(alpha = 0.2) + geom_smooth(method = "lm") "],["mrv.html", "Chapter 8 Multivariate random variables 8.1 Multinomial random variables 8.2 Multivariate normal random variables 8.3 Transformations", " Chapter 8 Multivariate random variables This chapter deals with multivariate random variables. The students are expected to acquire the following knowledge: Theoretical Multinomial distribution. Multivariate normal distribution. Cholesky decomposition. Eigendecomposition. R Sampling from the multinomial distribution. Sampling from the multivariate normal distribution. Matrix decompositions. .fold-btn { float: right; margin: 5px 5px 0 0; } .fold { border: 1px solid black; min-height: 40px; } 8.1 Multinomial random variables Exercise 8.1 Let \\(X_i\\), \\(i = 1,...,k\\) represent \\(k\\) events, and \\(p_i\\) the probabilities of these events happening in a trial. Let \\(n\\) be the number of trials, and \\(X\\) a multivariate random variable, the collection of \\(X_i\\). Then \\(p(x) = \\frac{n!}{x_1!x_2!...x_k!} p_1^{x_1} p_2^{x_2}...p_k^{x_k}\\) is the PMF of a multinomial distribution, where \\(n = \\sum_{i = 1}^k x_i\\). Show that the marginal distribution of \\(X_i\\) is a binomial distribution. Take 1000 samples from the multinomial distribution with \\(n=4\\) and probabilities \\(p = (0.2, 0.2, 0.5, 0.1)\\). Then take 1000 samples from four binomial distributions with the same parameters. Inspect the results visually. Solution. We will approach this proof from the probabilistic point of view. W.L.O.G. let \\(x_1\\) be the marginal distribution we are interested in. The term \\(p^{x_1}\\) denotes the probability that event 1 happened \\(x_1\\) times. For this event not to happen, one of the other events needs to happen. So for each of the remaining trials, the probability of another event is \\(\\sum_{i=2}^k p_i = 1 - p_1\\), and there were \\(n - x_1\\) such trials. What is left to do is to calculate the number of permutations of event 1 happening and event 1 not happening. We choose \\(x_1\\) trials, from \\(n\\) trials. 
Therefore \\(p(x_1) = \\binom{n}{x_1} p_1^{x_1} (1 - p_1)^{n - x_1}\\), which is the binomial PMF. Interested students are encouraged to prove this mathematically. set.seed(1) nsamps <- 1000 samps_mult <- rmultinom(nsamps, 4, prob = c(0.2, 0.2, 0.5, 0.1)) samps_mult <- as_tibble(t(samps_mult)) %>% gather() samps <- tibble( V1 = rbinom(nsamps, 4, 0.2), V2 = rbinom(nsamps, 4, 0.2), V3 = rbinom(nsamps, 4, 0.5), V4 = rbinom(nsamps, 4, 0.1) ) %>% gather() %>% bind_rows(samps_mult) %>% bind_cols("dist" = c(rep("binomial", 4*nsamps), rep("multinomial", 4*nsamps))) ggplot(samps, aes(x = value, fill = dist)) + geom_bar(position = "dodge") + facet_wrap(~ key) Exercise 8.2 (Multinomial expected value) Find the expected value, variance and covariance of the multinomial distribution. Hint: First find the expected value for \\(n = 1\\) and then use the fact that the trials are independent. Solution. Let us first calculate the expected value of \\(X_1\\), when \\(n = 1\\). \\[\\begin{align} E[X_1] &= \\sum_{n_1 = 0}^1 \\sum_{n_2 = 0}^1 ... \\sum_{n_k = 0}^1 \\frac{1}{n_1!n_2!...n_k!}p_1^{n_1}p_2^{n_2}...p_k^{n_k}n_1 \\\\ &= \\sum_{n_1 = 0}^1 \\frac{p_1^{n_1} n_1}{n_1!} \\sum_{n_2 = 0}^1 ... \\sum_{n_k = 0}^1 \\frac{1}{n_2!...n_k!}p_2^{n_2}...p_k^{n_k} \\end{align}\\] When \\(n_1 = 0\\) then the whole terms is zero, so we do not need to evaluate other sums. When \\(n_1 = 1\\), all other \\(n_i\\) must be zero, as we have \\(1 = \\sum_{i=1}^k n_i\\). Therefore the other sums equal \\(1\\). So \\(E[X_1] = p_1\\) and \\(E[X_i] = p_i\\) for \\(i = 1,...,k\\). Now let \\(Y_j\\), \\(j = 1,...,n\\), have a multinomial distribution with \\(n = 1\\), and let \\(X\\) have a multinomial distribution with an arbitrary \\(n\\). Then we can write \\(X = \\sum_{j=1}^n Y_j\\). And due to independence \\[\\begin{align} E[X] &= E[\\sum_{j=1}^n Y_j] \\\\ &= \\sum_{j=1}^n E[Y_j] \\\\ &= np. \\end{align}\\] For the variance, we need \\(E[X^2]\\). Let us follow the same procedure as above and first calculate \\(E[X_i]\\) for \\(n = 1\\). The only thing that changes is that the term \\(n_i\\) becomes \\(n_i^2\\). Since we only have \\(0\\) and \\(1\\) this does not change the outcome. So \\[\\begin{align} Var[X_i] &= E[X_i^2] - E[X_i]^2\\\\ &= p_i(1 - p_i). \\end{align}\\] Analogous to above for arbitrary \\(n\\) \\[\\begin{align} Var[X] &= E[X^2] - E[X]^2 \\\\ &= \\sum_{j=1}^n E[Y_j^2] - \\sum_{j=1}^n E[Y_j]^2 \\\\ &= \\sum_{j=1}^n E[Y_j^2] - E[Y_j]^2 \\\\ &= \\sum_{j=1}^n p(1-p) \\\\ &= np(1-p). \\end{align}\\] To calculate the covariance, we need \\(E[X_i X_j]\\). Again, let us start with \\(n = 1\\). Without loss of generality, let us assume \\(i = 1\\) and \\(j = 2\\). \\[\\begin{align} E[X_1 X_2] = \\sum_{n_1 = 0}^1 \\sum_{n_2 = 0}^1 \\frac{p_1^{n_1} n_1}{n_1!} \\frac{p_2^{n_2} n_2}{n_2!} \\sum_{n_3 = 0}^1 ... \\sum_{n_k = 0}^1 \\frac{1}{n_3!...n_k!}p_3^{n_3}...p_k^{n_k}. \\end{align}\\] In the above expression, at each iteration we multiply with \\(n_1\\) and \\(n_2\\). Since \\(n = 1\\), one of these always has to be zero. Therefore \\(E[X_1 X_2] = 0\\) and \\[\\begin{align} Cov(X_i, X_j) &= E[X_i X_j] - E[X_i]E[X_j] \\\\ &= - p_i p_j. \\end{align}\\] For arbitrary \\(n\\), let \\(X = \\sum_{t = 1}^n Y_t\\) be the sum of independent multinomial random variables \\(Y_t = [X_{1t}, X_{2t},...,X_{kt}]^T\\) with \\(n=1\\). Then \\(X_1 = \\sum_{t = 1}^n X_{1t}\\) and \\(X_2 = \\sum_{l = 1}^n X_{2l}\\). 
\\[\\begin{align} Cov(X_1, X_2) &= E[X_1 X_2] - E[X_1] E[X_2] \\\\ &= E[\\sum_{t = 1}^n X_{1t} \\sum_{l = 1}^n X_{2l}] - n^2 p_1 p_2 \\\\ &= \\sum_{t = 1}^n \\sum_{l = 1}^n E[X_{1t} X_{2l}] - n^2 p_1 p_2. \\end{align}\\] For \\(X_{1t}\\) and \\(X_{2l}\\) the expected value is zero when \\(t = l\\). When \\(t \\neq l\\) then they are independent, so the expected value is the product \\(p_1 p_2\\). There are \\(n^2\\) total terms, and for \\(n\\) of them \\(t = l\\) holds. So \\(E[X_1 X_2] = (n^2 - n) p_1 p_2\\). Inserting into the above \\[\\begin{align} Cov(X_1, X_2) &= (n^2 - n) p_1 p_2 - n^2 p_1 p_2 \\\\ &= - n p_1 p_2. \\end{align}\\] 8.2 Multivariate normal random variables Exercise 8.3 (Cholesky decomposition) Let \\(X\\) be a random vector of length \\(k\\) with \\(X_i \\sim \\text{N}(0, 1)\\) and \\(LL^*\\) the Cholesky decomposition of a Hermitian positive-definite matrix \\(A\\). Let \\(\\mu\\) be a vector of length \\(k\\). Find the distribution of the random vector \\(Y = \\mu + L X\\). Find the Cholesky decomposition of \\(A = \\begin{bmatrix} 2 & 1.2 \\\\ 1.2 & 1 \\end{bmatrix}\\). R: Use the results from a) and b) to sample from the MVN distribution \\(\\text{N}(\\mu, A)\\), where \\(\\mu = [1.5, -1]^T\\). Plot a scatterplot and compare it to direct samples from the multivariate normal distribution (rmvnorm). Solution. \\(X\\) has an independent normal distribution of dimension \\(k\\). Then \\[\\begin{align} Y = \\mu + L X &\\sim \\text{N}(\\mu, LL^T) \\\\ &\\sim \\text{N}(\\mu, A). \\end{align}\\] Solve \\[\\begin{align} \\begin{bmatrix} a & 0 \\\\ b & c \\end{bmatrix} \\begin{bmatrix} a & b \\\\ 0 & c \\end{bmatrix} = \\begin{bmatrix} 2 & 1.2 \\\\ 1.2 & 1 \\end{bmatrix} \\end{align}\\] # a set.seed(1) nsamps <- 1000 X <- matrix(data = rnorm(nsamps * 2), ncol = 2) mu <- c(1.5, -1) L <- matrix(data = c(sqrt(2), 0, 1.2 / sqrt(2), sqrt(1 - 1.2^2/2)), ncol = 2, byrow = TRUE) Y <- t(mu + L %*% t(X)) plot_df <- data.frame(rbind(X, Y), c(rep("X", nsamps), rep("Y", nsamps))) colnames(plot_df) <- c("D1", "D2", "var") ggplot(data = plot_df, aes(x = D1, y = D2, colour = as.factor(var))) + geom_point() Exercise 8.4 (Eigendecomposition) R: Let \\(\\Sigma = U \\Lambda U^T\\) be the eigendecomposition of covariance matrix \\(\\Sigma\\). Follow the procedure below, to sample from a multivariate normal with \\(\\mu = [-2, 1]^T\\) and \\(\\Sigma = \\begin{bmatrix} 0.3, -0.5 \\\\ -0.5, 1.6 \\end{bmatrix}\\): Sample from two independent standardized normal distributions to get \\(X\\). Find the eigendecomposition of \\(X\\) (eigen). Multiply \\(X\\) by \\(\\Lambda^{\\frac{1}{2}}\\) to get \\(X2\\). Consider how the eigendecomposition for \\(X2\\) changes compared to \\(X\\). Multiply \\(X2\\) by \\(U\\) to get \\(X3\\). Consider how the eigendecomposition for \\(X3\\) changes compared to \\(X2\\). Add \\(\\mu\\) to \\(X3\\). Consider how the eigendecomposition for \\(X4\\) changes compared to \\(X3\\). Plot the data and the eigenvectors (scaled with \\(\\Lambda^{\\frac{1}{2}}\\)) at each step. Hint: Use geom_segment for the eigenvectors. 
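Before stepping through the procedure, it helps to see why it yields the target distribution (a short sketch in the notation of the exercise): if \\(X\\) has identity covariance, then \\[\\begin{align} Cov[U \\Lambda^{\\frac{1}{2}} X] &= U \\Lambda^{\\frac{1}{2}} Cov[X] \\Lambda^{\\frac{1}{2}} U^T \\\\ &= U \\Lambda^{\\frac{1}{2}} \\Lambda^{\\frac{1}{2}} U^T \\\\ &= U \\Lambda U^T \\\\ &= \\Sigma, \\end{align}\\] so scaling by \\(\\Lambda^{\\frac{1}{2}}\\) and then rotating by \\(U\\) produces samples with covariance \\(\\Sigma\\), while adding \\(\\mu\\) only shifts the mean.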
# a set.seed(1) sigma <- matrix(data = c(0.3, -0.5, -0.5, 1.6), nrow = 2, byrow = TRUE) ed <- eigen(sigma) e_val <- ed$values e_vec <- ed$vectors # b set.seed(1) nsamps <- 1000 X <- matrix(data = rnorm(nsamps * 2), ncol = 2) vec1 <- matrix(c(1,0,0,1), nrow = 2) X2 <- t(sqrt(diag(e_val)) %*% t(X)) vec2 <- sqrt(diag(e_val)) %*% vec1 X3 <- t(e_vec %*% t(X2)) vec3 <- e_vec %*% vec2 X4 <- t(c(-2, 1) + t(X3)) vec4 <- c(-2, 1) + vec3 vec_mat <- data.frame(matrix(c(0,0,0,0,0,0,0,0,0,0,0,0,-2,1,-2,1), ncol = 2, byrow = TRUE), t(cbind(vec1, vec2, vec3, vec4)), c(1,1,2,2,3,3,4,4)) df <- data.frame(rbind(X, X2, X3, X4), c(rep(1, nsamps), rep(2, nsamps), rep(3, nsamps), rep(4, nsamps))) colnames(df) <- c("D1", "D2", "wh") colnames(vec_mat) <- c("D1", "D2", "E1", "E2", "wh") ggplot(data = df, aes(x = D1, y = D2)) + geom_point() + geom_segment(data = vec_mat, aes(xend = E1, yend = E2), color = "red") + facet_wrap(~ wh) + coord_fixed() Exercise 8.5 (Marginal and conditional distributions) Let \\(X \\sim \\text{N}(\\mu, \\Sigma)\\), where \\(\\mu = [2, 0, -1]^T\\) and \\(\\Sigma = \\begin{bmatrix} 1 & -0.2 & 0.5 \\\\ -0.2 & 1.4 & -1.2 \\\\ 0.5 & -1.2 & 2 \\\\ \\end{bmatrix}\\). Let \\(A\\) represent the first two random variables and \\(B\\) the third random variable. R: For the calculation in the following points, you can use R. Find the marginal distribution of \\(B\\). Find the conditional distribution of \\(B | A = [a_1, a_2]^T\\). Find the marginal distribution of \\(A\\). Find the conditional distribution of \\(A | B = b\\). R: Visually compare the distributions of a) and b), and c) and d) at three different conditional values. mu <- c(2, 0, -1) Sigma <- matrix(c(1, -0.2, 0.5, -0.2, 1.4, -1.2, 0.5, -1.2, 2), nrow = 3, byrow = TRUE) mu_A <- c(2, 0) mu_B <- -1 Sigma_A <- Sigma[1:2, 1:2] Sigma_B <- Sigma[3, 3] Sigma_AB <- Sigma[1:2, 3] # b tmp_b <- t(Sigma_AB) %*% solve(Sigma_A) mu_b <- mu_B - tmp_b %*% mu_A Sigma_b <- Sigma_B - t(Sigma_AB) %*% solve(Sigma_A) %*% Sigma_AB mu_b ## [,1] ## [1,] -1.676471 tmp_b ## [,1] [,2] ## [1,] 0.3382353 -0.8088235 Sigma_b ## [,1] ## [1,] 0.8602941 # d tmp_a <- Sigma_AB * (1 / Sigma_B) mu_a <- mu_A - tmp_a * mu_B Sigma_d <- Sigma_A - (Sigma_AB * (1 / Sigma_B)) %*% t(Sigma_AB) mu_a ## [1] 2.25 -0.60 tmp_a ## [1] 0.25 -0.60 Sigma_d ## [,1] [,2] ## [1,] 0.875 0.10 ## [2,] 0.100 0.68 Solution. \\(B \\sim \\text{N}(-1, 2)\\). \\(B | A = a \\sim \\text{N}(-1.68 + [0.34, -0.81] a, 0.86)\\). \\(\\mu_A = [2, 0]^T\\) and \\(\\Sigma_A = \\begin{bmatrix} 1 & -0.2 & \\\\ -0.2 & 1.4 \\\\ \\end{bmatrix}\\). 
\\[\\begin{align} A | B = b &\\sim \\text{N}(\\mu_t, \\Sigma_t), \\\\ \\mu_t &= [2.25, -0.6]^T + [0.25, -0.6]^T b, \\\\ \\Sigma_t &= \\begin{bmatrix} 0.875 & 0.1 \\\\ 0.1 & 0.68 \\\\ \\end{bmatrix} \\end{align}\\] library(mvtnorm) set.seed(1) nsamps <- 1000 # a and b samps <- as.data.frame(matrix(data = NA, nrow = 4 * nsamps, ncol = 2)) samps[1:nsamps,1] <- rnorm(nsamps, mu_B, Sigma_B) samps[1:nsamps,2] <- "marginal" for (i in 1:3) { a <- rmvnorm(1, mu_A, Sigma_A) samps[(i*nsamps + 1):((i + 1) * nsamps), 1] <- rnorm(nsamps, mu_b + tmp_b %*% t(a), Sigma_b) samps[(i*nsamps + 1):((i + 1) * nsamps), 2] <- paste0(# "cond", round(a, digits = 2), collapse = "-") } colnames(samps) <- c("x", "dist") ggplot(samps, aes(x = x)) + geom_density() + facet_wrap(~ dist) # c and d samps <- as.data.frame(matrix(data = NA, nrow = 4 * nsamps, ncol = 3)) samps[1:nsamps,1:2] <- rmvnorm(nsamps, mu_A, Sigma_A) samps[1:nsamps,3] <- "marginal" for (i in 1:3) { b <- rnorm(1, mu_B, Sigma_B) samps[(i*nsamps + 1):((i + 1) * nsamps), 1:2] <- rmvnorm(nsamps, mu_a + tmp_a * b, Sigma_d) samps[(i*nsamps + 1):((i + 1) * nsamps), 3] <- b } colnames(samps) <- c("x", "y", "dist") ggplot(samps, aes(x = x, y = y)) + geom_point() + geom_smooth(method = "lm") + facet_wrap(~ dist) 8.3 Transformations Exercise 8.6 Let \\((U,V)\\) be a random variable with PDF \\(p(u,v) = \\frac{1}{4 \\sqrt{u}}\\), \\(U \\in [0,4]\\) and \\(V \\in [\\sqrt{U}, \\sqrt{U} + 1]\\). Let \\(X = \\sqrt{U}\\) and \\(Y = V - \\sqrt{U}\\). Find PDF of \\((X,Y)\\). What can you tell about distributions of \\(X\\) and \\(Y\\)? This exercise shows how we can simplify a probabilistic problem with a clever use of transformations. R: Take 1000 samples from \\((X,Y)\\) and transform them with inverses of the above functions to get samples from \\((U,V)\\). Plot both sets of samples. Solution. First we need to find the inverse functions. Since \\(x = \\sqrt{u}\\) it follows that \\(u = x^2\\), and that \\(x \\in [0,2]\\). Similarly \\(v = y + x\\) and \\(y \\in [0,1]\\). Let us first find the Jacobian. \\[\\renewcommand\\arraystretch{1.6} J(x,y) = \\begin{bmatrix} \\frac{\\partial u}{\\partial x} & \\frac{\\partial v}{\\partial x} \\\\%[1ex] % <-- 1ex more space between rows of matrix \\frac{\\partial u}{\\partial y} & \\frac{\\partial v}{\\partial y} \\end{bmatrix} = \\begin{bmatrix} 2x & 1 \\\\%[1ex] % <-- 1ex more space between rows of matrix 0 & 1 \\end{bmatrix}, \\] and the determinant is \\(|J(x,y)| = 2x\\). Putting everything together, we get \\[\\begin{align} p_{X,Y}(x,y) = p_{U,V}(x^2, y + x) |J(x,y)| = \\frac{1}{4 \\sqrt{x^2}} 2x = \\frac{1}{2}. \\end{align}\\] This reminds us of the Uniform distribution. Indeed we can see that \\(p_X(x) = \\frac{1}{2}\\) and \\(p_Y(y) = 1\\). So instead of dealing with an awkward PDF of \\((U,V)\\) and the corresponding dynamic bounds, we are now looking at two independent Uniform random variables. In practice, this could make modeling much easier. set.seed(1) nsamps <- 2000 x <- runif(nsamps, min = 0, max = 2) y <- runif(nsamps) orig <- tibble(x = x, y = y, vrs = "original") u <- x^2 v <- y + x transf <- tibble(x = u, y = v, vrs = "transformed") df <- bind_rows(orig, transf) ggplot(df, aes(x = x, y = y, color = vrs)) + geom_point(alpha = 0.3) Exercise 8.7 R: Write a function that will calculate the probability density of an arbitraty multivariate normal distribution, based on independent standardized normal PDFs. Compare with dmvnorm from the mvtnorm package. 
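The reasoning behind such a function (a brief sketch of one possible approach, based on the Cholesky factor): if \\(\\Sigma = L L^T\\) for a lower triangular \\(L\\) (note that chol() in R returns the upper triangular factor, so its transpose plays the role of \\(L\\)), then \\(Z = L^{-1}(Y - \\mu)\\) has independent standard normal components, and the transformation formula gives \\[\\begin{align} p_Y(y) &= p_Z\\big(L^{-1}(y - \\mu)\\big) |\\det L^{-1}| \\\\ &= |\\det L^{-1}| \\prod_{i=1}^k \\phi\\Big( \\big(L^{-1}(y - \\mu)\\big)_i \\Big), \\end{align}\\] where \\(\\phi\\) denotes the standard normal PDF. This product of univariate densities multiplied by the Jacobian determinant is what the function below computes.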
library(mvtnorm) set.seed(1) mvn_dens <- function (y, mu, Sigma) { L <- chol(Sigma) L_inv <- solve(t(L)) g_inv <- L_inv %*% t(y - mu) J <- L_inv J_det <- det(J) return(prod(dnorm(g_inv)) * J_det) } mu_v <- c(-2, 0, 1) cov_m <- matrix(c(1, -0.2, 0.5, -0.2, 2, 0.3, 0.5, 0.3, 1.6), ncol = 3, byrow = TRUE) n_comp <- 20 for (i in 1:n_comp) { x <- rmvnorm(1, mean = mu_v, sigma = cov_m) print(paste0("My function: ", mvn_dens(x, mu_v, cov_m), ", dmvnorm: ", dmvnorm(x, mu_v, cov_m))) } ## [1] "My function: 0.0229514237156383, dmvnorm: 0.0229514237156383" ## [1] "My function: 0.00763138915406231, dmvnorm: 0.00763138915406231" ## [1] "My function: 0.0230688881105741, dmvnorm: 0.0230688881105741" ## [1] "My function: 0.0113616213114731, dmvnorm: 0.0113616213114731" ## [1] "My function: 0.00151808500121907, dmvnorm: 0.00151808500121907" ## [1] "My function: 0.0257658045974509, dmvnorm: 0.0257658045974509" ## [1] "My function: 0.0157963825730805, dmvnorm: 0.0157963825730805" ## [1] "My function: 0.00408856287529248, dmvnorm: 0.00408856287529248" ## [1] "My function: 0.0327793540101256, dmvnorm: 0.0327793540101256" ## [1] "My function: 0.0111606542967978, dmvnorm: 0.0111606542967978" ## [1] "My function: 0.0147636757585684, dmvnorm: 0.0147636757585684" ## [1] "My function: 0.0142948300412207, dmvnorm: 0.0142948300412207" ## [1] "My function: 0.0203093820657542, dmvnorm: 0.0203093820657542" ## [1] "My function: 0.0287533273357481, dmvnorm: 0.0287533273357481" ## [1] "My function: 0.0213402305128623, dmvnorm: 0.0213402305128623" ## [1] "My function: 0.0218356957993885, dmvnorm: 0.0218356957993885" ## [1] "My function: 0.0250750113961771, dmvnorm: 0.0250750113961771" ## [1] "My function: 0.0166498666348048, dmvnorm: 0.0166498666348048" ## [1] "My function: 0.00189725106874659, dmvnorm: 0.00189725106874659" ## [1] "My function: 0.0196697814975113, dmvnorm: 0.0196697814975113" "],["ard.html", "Chapter 9 Alternative representation of distributions 9.1 Probability generating functions (PGFs) 9.2 Moment generating functions (MGFs)", " Chapter 9 Alternative representation of distributions This chapter deals with alternative representation of distributions. The students are expected to acquire the following knowledge: Theoretical Probability generating functions. Moment generating functions. .fold-btn { float: right; margin: 5px 5px 0 0; } .fold { border: 1px solid black; min-height: 40px; } 9.1 Probability generating functions (PGFs) Exercise 9.1 Show that the sum of independent Poisson random variables is itself a Poisson random variable. R: Let \\(X\\) be a sum of three Poisson distributions with \\(\\lambda_i \\in \\{2, 5.2, 10\\}\\). Take 1000 samples and plot the three distributions and the sum. Then take 1000 samples from the theoretical distribution of \\(X\\) and compare them to the sum. Solution. Let \\(X_i \\sim \\text{Poisson}(\\lambda_i)\\) for \\(i = 1,...,n\\), and let \\(X = \\sum_{i=1}^n X_i\\). 
\\[\\begin{align} \\alpha_X(t) &= \\prod_{i=1}^n \\alpha_{X_i}(t) \\\\ &= \\prod_{i=1}^n \\bigg( \\sum_{j=0}^\\infty t^j \\frac{\\lambda_i^j e^{-\\lambda_i}}{j!} \\bigg) \\\\ &= \\prod_{i=1}^n \\bigg( e^{-\\lambda_i} \\sum_{j=0}^\\infty \\frac{(t\\lambda_i)^j }{j!} \\bigg) \\\\ &= \\prod_{i=1}^n \\bigg( e^{-\\lambda_i} e^{t \\lambda_i} \\bigg) & \\text{power series} \\\\ &= \\prod_{i=1}^n \\bigg( e^{\\lambda_i(t - 1)} \\bigg) \\\\ &= e^{\\sum_{i=1}^n \\lambda_i(t - 1)} \\\\ &= e^{t \\sum_{i=1}^n \\lambda_i - \\sum_{i=1}^n \\lambda_i} \\\\ &= e^{-\\sum_{i=1}^n \\lambda_i} \\sum_{j=0}^\\infty \\frac{(t \\sum_{i=1}^n \\lambda_i)^j}{j!}\\\\ &= \\sum_{j=0}^\\infty \\frac{e^{-\\sum_{i=1}^n \\lambda_i} (t \\sum_{i=1}^n \\lambda_i)^j}{j!}\\\\ \\end{align}\\] The last term is the PGF of a Poisson random variable with parameter \\(\\sum_{i=1}^n \\lambda_i\\). Because the PGF is unique, \\(X\\) is a Poisson random variable. set.seed(1) library(tidyr) nsamps <- 1000 samps <- matrix(data = NA, nrow = nsamps, ncol = 4) samps[ ,1] <- rpois(nsamps, 2) samps[ ,2] <- rpois(nsamps, 5.2) samps[ ,3] <- rpois(nsamps, 10) samps[ ,4] <- samps[ ,1] + samps[ ,2] + samps[ ,3] colnames(samps) <- c(2, 2.5, 10, "sum") gsamps <- as_tibble(samps) gsamps <- gather(gsamps, key = "dist", value = "value") ggplot(gsamps, aes(x = value)) + geom_bar() + facet_wrap(~ dist) samps <- cbind(samps, "theoretical" = rpois(nsamps, 2 + 5.2 + 10)) gsamps <- as_tibble(samps[ ,4:5]) gsamps <- gather(gsamps, key = "dist", value = "value") ggplot(gsamps, aes(x = value, fill = dist)) + geom_bar(position = "dodge") Exercise 9.2 Find the expected value and variance of the negative binomial distribution. Hint: Find the Taylor series of \\((1 - y)^{-r}\\) at point 0. Solution. Let \\(X \\sim \\text{NB}(r, p)\\). \\[\\begin{align} \\alpha_X(t) &= E[t^X] \\\\ &= \\sum_{j=0}^\\infty t^j \\binom{j + r - 1}{j} (1 - p)^r p^j \\\\ &= (1 - p)^r \\sum_{j=0}^\\infty \\binom{j + r - 1}{j} (tp)^j \\\\ &= (1 - p)^r \\sum_{j=0}^\\infty \\frac{(j + r - 1)(j + r - 2)...r}{j!} (tp)^j. \\\\ \\end{align}\\] Let us look at the Taylor series of \\((1 - y)^{-r}\\) at 0 \\[\\begin{align} (1 - y)^{-r} = &1 + \\frac{-r(-1)}{1!}y + \\frac{-r(-r - 1)(-1)^2}{2!}y^2 + \\\\ &\\frac{-r(-r - 1)(-r - 2)(-1)^3}{3!}y^3 + ... \\\\ \\end{align}\\] How does the \\(k\\)-th term look like? We have \\(k\\) derivatives of our function so \\[\\begin{align} \\frac{d^k}{d^k y} (1 - y)^{-r} &= \\frac{-r(-r - 1)...(-r - k + 1)(-1)^k}{k!}y^k \\\\ &= \\frac{r(r + 1)...(r + k - 1)}{k!}y^k. \\end{align}\\] We observe that this equals to the \\(j\\)-th term in the sum of NB PGF. Therefore \\[\\begin{align} \\alpha_X(t) &= (1 - p)^r (1 - tp)^{-r} \\\\ &= \\Big(\\frac{1 - p}{1 - tp}\\Big)^r \\end{align}\\] To find the expected value, we need to differentiate \\[\\begin{align} \\frac{d}{dt} \\Big(\\frac{1 - p}{1 - tp}\\Big)^r &= r \\Big(\\frac{1 - p}{1 - tp}\\Big)^{r-1} \\frac{d}{dt} \\frac{1 - p}{1 - tp} \\\\ &= r \\Big(\\frac{1 - p}{1 - tp}\\Big)^{r-1} \\frac{p(1 - p)}{(1 - tp)^2}. \\\\ \\end{align}\\] Evaluating this at 1, we get: \\[\\begin{align} E[X] = \\frac{rp}{1 - p}. \\end{align}\\] For the variance we need the second derivative. 
\\[\\begin{align} \\frac{d^2}{d^2t} \\Big(\\frac{1 - p}{1 - tp}\\Big)^r &= \\frac{p^2 r (r + 1) (\\frac{1 - p}{1 - tp})^r}{(tp - 1)^2} \\end{align}\\] Evaluating this at 1 and inserting the first derivatives, we get: \\[\\begin{align} Var[X] &= \\frac{d^2}{dt^2} \\alpha_X(1) + \\frac{d}{dt}\\alpha_X(1) - \\Big(\\frac{d}{dt}\\alpha_X(t) \\Big)^2 \\\\ &= \\frac{p^2 r (r + 1)}{(1 - p)^2} + \\frac{rp}{1 - p} - \\frac{r^2p^2}{(1 - p)^2} \\\\ &= \\frac{rp}{(1 - p)^2}. \\end{align}\\] library(tidyr) set.seed(1) nsamps <- 100000 find_p <- function (mu, r) { return (10 / (r + 10)) } r <- c(1,2,10,20) p <- find_p(10, r) sigma <- rep(sqrt(p*r / (1 - p)^2), each = nsamps) samps <- cbind("r=1" = rnbinom(nsamps, size = r[1], prob = 1 - p[1]), "r=2" = rnbinom(nsamps, size = r[2], prob = 1 - p[2]), "r=4" = rnbinom(nsamps, size = r[3], prob = 1 - p[3]), "r=20" = rnbinom(nsamps, size = r[4], prob = 1 - p[4])) gsamps <- gather(as.data.frame(samps)) iw <- (gsamps$value > sigma + 10) | (gsamps$value < sigma - 10) ggplot(gsamps, aes(x = value, fill = iw)) + geom_bar() + # geom_density() + facet_wrap(~ key) 9.2 Moment generating functions (MGFs) Exercise 9.3 Find the variance of the geometric distribution. Solution. Let \\(X \\sim \\text{Geometric}(p)\\). The MGF of the geometric distribution is \\[\\begin{align} M_X(t) &= E[e^{tX}] \\\\ &= \\sum_{k=0}^\\infty p(1 - p)^k e^{tk} \\\\ &= p \\sum_{k=0}^\\infty ((1 - p)e^t)^k. \\end{align}\\] Let us assume that \\((1 - p)e^t < 1\\). Then, by using the geometric series we get \\[\\begin{align} M_X(t) &= \\frac{p}{1 - e^t + pe^t}. \\end{align}\\] The first derivative of the above expression is \\[\\begin{align} \\frac{d}{dt}M_X(t) &= \\frac{-p(-e^t + pe^t)}{(1 - e^t + pe^t)^2}, \\end{align}\\] and evaluating at \\(t = 0\\), we get \\(\\frac{1 - p}{p}\\), which we already recognize as the expected value of the geometric distribution. The second derivative is \\[\\begin{align} \\frac{d^2}{dt^2}M_X(t) &= \\frac{(p-1)pe^t((p-1)e^t - 1)}{((p - 1)e^t + 1)^3}, \\end{align}\\] and evaluating at \\(t = 0\\), we get \\(\\frac{(p - 1)(p - 2)}{p^2}\\). Combining we get the variance \\[\\begin{align} Var(X) &= \\frac{(p - 1)(p - 2)}{p^2} - \\frac{(1 - p)^2}{p^2} \\\\ &= \\frac{(p-1)(p-2) - (1-p)^2}{p^2} \\\\ &= \\frac{1 - p}{p^2}. \\end{align}\\] Exercise 9.4 Find the distribution of sum of two normal random variables \\(X\\) and \\(Y\\), by comparing \\(M_{X+Y}(t)\\) to \\(M_X(t)\\). R: To illustrate the result draw random samples from N\\((-3, 1)\\) and N\\((5, 1.2)\\) and calculate the empirical mean and variance of \\(X+Y\\). Plot all three histograms in one plot. Solution. Let \\(X \\sim \\text{N}(\\mu_X, 1)\\) and \\(Y \\sim \\text{N}(\\mu_Y, 1)\\). The MGF of the sum is \\[\\begin{align} M_{X+Y}(t) &= M_X(t) M_Y(t). \\end{align}\\] Let us calculate \\(M_X(t)\\), the MGF for \\(Y\\) then follows analogously. 
\\[\\begin{align} M_X(t) &= \\int_{-\\infty}^\\infty e^{tx} \\frac{1}{\\sqrt{2 \\pi \\sigma_X^2}} e^{-\\frac{(x - \\mu_X)^2}{2\\sigma_X^2}} dx \\\\ &= \\int_{-\\infty}^\\infty \\frac{1}{\\sqrt{2 \\pi \\sigma_X^2}} e^{-\\frac{(x - \\mu_X)^2 - 2\\sigma_X^2 tx}{2\\sigma_X^2}} dx \\\\ &= \\int_{-\\infty}^\\infty \\frac{1}{\\sqrt{2 \\pi \\sigma_X^2}} e^{-\\frac{x^2 - 2\\mu_X x + \\mu_X^2 - 2\\sigma_X^2 tx}{2\\sigma_X^2}} dx \\\\ &= \\int_{-\\infty}^\\infty \\frac{1}{\\sqrt{2 \\pi \\sigma_X^2}} e^{-\\frac{(x - (\\mu_X + \\sigma_X^2 t))^2 + \\mu_X^2 - (\\mu_X + \\sigma_X^2 t)^2}{2\\sigma_X^2}} dx & \\text{complete the square}\\\\ &= e^{-\\frac{\\mu_X^2 - (\\mu_X + \\sigma_X^2 t)^2}{2\\sigma_X^2}} \\int_{-\\infty}^\\infty \\frac{1}{\\sqrt{2 \\pi \\sigma_X^2}} e^{-\\frac{(x - (\\mu_X + \\sigma_X^2 t))^2}{2\\sigma_X^2}} dx & \\\\ &= e^{-\\frac{\\mu_X^2 - (\\mu_X + \\sigma_X^2 t)^2}{2\\sigma_X^2}} & \\text{normal PDF} \\\\ &= e^{-\\frac{\\mu_X^2 - \\mu_X^2 - 2\\mu_X \\sigma_X^2 t - \\sigma_X^4 t^2}{2\\sigma_X^2}} \\\\ &= e^{\\mu_X t + \\frac{\\sigma_X^2 t^2}{2}}. \\\\ \\end{align}\\] The MGF of the sum is then \\[\\begin{align} M_{X+Y}(t) &= e^{\\mu_X t + 0.5\\sigma_X^2 t^2} e^{\\mu_Y t + 0.5\\sigma_Y^2 t^2} \\\\ &= e^{t(\\mu_X + \\mu_Y) + 0.5 t^2(\\sigma_X^2 + \\sigma_Y^2)}. \\end{align}\\] By comparing \\(M_{X+Y}(t)\\) and \\(M_X(t)\\) we observe that both have the same form: the exponent is \\(t\\) multiplied by the mean plus \\(\\frac{t^2}{2}\\) multiplied by the variance. Since MGFs are unique, we conclude that \\(Z = X + Y \\sim \\text{N}(\\mu_X + \\mu_Y, \\sigma_X^2 + \\sigma_Y^2)\\). library(tidyr) library(ggplot2) set.seed(1) nsamps <- 1000 x <- rnorm(nsamps, -3, 1) y <- rnorm(nsamps, 5, 1.2) z <- x + y mean(z) ## [1] 1.968838 var(z) ## [1] 2.645034 df <- data.frame(x = x, y = y, z = z) %>% gather() ggplot(df, aes(x = value, fill = key)) + geom_histogram(position = "dodge") "],["ci.html", "Chapter 10 Concentration inequalities 10.1 Comparison 10.2 Practical", " Chapter 10 Concentration inequalities This chapter deals with concentration inequalities. The students are expected to acquire the following knowledge: Theoretical More assumptions produce closer bounds. R Optimization. Estimating probability inequalities. .fold-btn { float: right; margin: 5px 5px 0 0; } .fold { border: 1px solid black; min-height: 40px; } 10.1 Comparison Exercise 10.1 R: Let \\(X\\) be a geometric random variable with \\(p = 0.7\\). Visually compare the Markov bound, Chernoff bound, and the theoretical probabilities for \\(x = 1,...,12\\). To get the best fitting Chernoff bound, you will need to optimize the bound depending on \\(t\\). Use either analytical or numerical optimization. 
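Recall from Exercise 9.3 that the MGF of this geometric distribution is \\(M_X(t) = \\frac{p}{1 - e^t + pe^t}\\) whenever \\((1-p)e^t < 1\\). Applying Markov's inequality to \\(e^{tX}\\) gives the Chernoff bound used in the solution below (a sketch of the quantities being compared): \\[\\begin{align} P(X \\geq a) \\leq \\frac{E[e^{tX}]}{e^{ta}} = \\frac{p}{1 - e^t + pe^t} e^{-ta}, \\end{align}\\] which is then minimized over \\(t \\in (0, \\ln \\frac{1}{1-p})\\). The Markov bound itself is \\(P(X \\geq a) \\leq \\frac{E[X]}{a} = \\frac{1-p}{pa}\\).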
bound_chernoff <- function (t, p, a) { return ((p / (1 - exp(t) + p * exp(t))) / exp(a * t)) } set.seed(1) p <- 0.7 a <- seq(1, 12, by = 1) ci_markov <- (1 - p) / p / a t <- vector(mode = "numeric", length = length(a)) for (i in 1:length(t)) { t[i] <- optimize(bound_chernoff, interval = c(0, log(1 / (1 - p))), p = p, a = a[i])$minimum } t ## [1] 0.5108267 0.7984981 0.9162927 0.9808238 1.0216635 1.0498233 1.0704327 ## [8] 1.0861944 1.0986159 1.1086800 1.1169653 1.1239426 ci_chernoff <- (p / (1 - exp(t) + p * exp(t))) / exp(a * t) actual <- 1 - pgeom(a, 0.7) plot_df <- rbind( data.frame(x = a, y = ci_markov, type = "Markov"), data.frame(x = a, y = ci_chernoff, type = "Chernoff"), data.frame(x = a, y = actual, type = "Actual") ) ggplot(plot_df, aes(x = x, y = y, color = type)) + geom_line() Exercise 10.2 R: Let \\(X\\) be a sum of 100 Beta distributions with random parameters. Take 1000 samples and plot the Chebyshev bound, Hoeffding bound, and the empirical probabilities. set.seed(1) nvars <- 100 nsamps <- 1000 samps <- matrix(data = NA, nrow = nsamps, ncol = nvars) Sn_mean <- 0 Sn_var <- 0 for (i in 1:nvars) { alpha1 <- rgamma(1, 10, 1) beta1 <- rgamma(1, 10, 1) X <- rbeta(nsamps, alpha1, beta1) Sn_mean <- Sn_mean + alpha1 / (alpha1 + beta1) Sn_var <- Sn_var + alpha1 * beta1 / ((alpha1 + beta1)^2 * (alpha1 + beta1 + 1)) samps[ ,i] <- X } mean(apply(samps, 1, sum)) ## [1] 51.12511 Sn_mean ## [1] 51.15723 var(apply(samps, 1, sum)) ## [1] 1.170652 Sn_var ## [1] 1.166183 a <- 1:30 b <- a / sqrt(Sn_var) ci_chebyshev <- 1 / b^2 ci_hoeffding <- 2 * exp(- 2 * a^2 / nvars) empirical <- NULL for (i in 1:length(a)) { empirical[i] <- sum(abs((apply(samps, 1, sum)) - Sn_mean) >= a[i])/ nsamps } plot_df <- rbind( data.frame(x = a, y = ci_chebyshev, type = "Chebyshev"), data.frame(x = a, y = ci_hoeffding, type = "Hoeffding"), data.frame(x = a, y = empirical, type = "Empirical") ) ggplot(plot_df, aes(x = x, y = y, color = type)) + geom_line() ggplot(plot_df, aes(x = x, y = y, color = type)) + geom_line() + coord_cartesian(xlim = c(15, 25), ylim = c(0, 0.05)) 10.2 Practical Exercise 10.3 From Jagannathan. Let \\(X_i\\), \\(i = 1,...n\\), be a random sample of size \\(n\\) of a random variable \\(X\\). Let \\(X\\) have mean \\(\\mu\\) and variance \\(\\sigma^2\\). Find the size of the sample \\(n\\) required so that the probability that the difference between sample mean and true mean is smaller than \\(\\frac{\\sigma}{10}\\) is at least 0.95. Hint: Derive a version of the Chebyshev inequality for \\(P(|X - \\mu| \\geq a)\\) using Markov inequality. Solution. Let \\(\\bar{X} = \\sum_{i=1}^n X_i\\). Then \\(E[\\bar{X}] = \\mu\\) and \\(Var[\\bar{X}] = \\frac{\\sigma^2}{n}\\). Let us first derive another representation of Chebyshev inequality. \\[\\begin{align} P(|X - \\mu| \\geq a) = P(|X - \\mu|^2 \\geq a^2) \\leq \\frac{E[|X - \\mu|^2]}{a^2} = \\frac{Var[X]}{a^2}. \\end{align}\\] Let us use that on our sampling distribution: \\[\\begin{align} P(|\\bar{X} - \\mu| \\geq \\frac{\\sigma}{10}) \\leq \\frac{100 Var[\\bar{X}]}{\\sigma^2} = \\frac{100 Var[X]}{n \\sigma^2} = \\frac{100}{n}. \\end{align}\\] We are interested in the difference being smaller, therefore \\[\\begin{align} P(|\\bar{X} - \\mu| < \\frac{\\sigma}{10}) = 1 - P(|\\bar{X} - \\mu| \\geq \\frac{\\sigma}{10}) \\geq 1 - \\frac{100}{n} \\geq 0.95. \\end{align}\\] It follows that we need a sample size of \\(n \\geq \\frac{100}{0.05} = 2000\\). 
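A quick simulation check of this sample size (a sketch; the exponential distribution and the constants below are illustrative choices, not prescribed by the exercise):

set.seed(1)
n <- 2000     # sample size implied by the Chebyshev bound
nreps <- 1000 # number of simulated samples
mu <- 1       # true mean of Exp(1)
sigma <- 1    # true standard deviation of Exp(1)
within <- replicate(nreps, {
  x_bar <- mean(rexp(n, rate = 1))
  abs(x_bar - mu) < sigma / 10
})
mean(within)  # proportion of samples with |x_bar - mu| < sigma/10

Since Chebyshev's inequality is conservative, the observed proportion is typically much higher than the guaranteed 0.95.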
"],["crv.html", "Chapter 11 Convergence of random variables", " Chapter 11 Convergence of random variables This chapter deals with convergence of random variables. The students are expected to acquire the following knowledge: Theoretical Finding convergences of random variables. .fold-btn { float: right; margin: 5px 5px 0 0; } .fold { border: 1px solid black; min-height: 40px; } Exercise 11.1 Let \\(X_1\\), \\(X_2\\),…, \\(X_n\\) be a sequence of Bernoulli random variables. Let \\(Y_n = \\frac{X_1 + X_2 + ... + X_n}{n^2}\\). Show that this sequence converges point-wise to the zero random variable. R: Use a simulation to check your answer. Solution. Let \\(\\epsilon\\) be arbitrary. We need to find such \\(n_0\\), that for every \\(n\\) greater than \\(n_0\\) \\(|Y_n| < \\epsilon\\) holds. \\[\\begin{align} |Y_n| &= |\\frac{X_1 + X_2 + ... + X_n}{n^2}| \\\\ &\\leq |\\frac{n}{n^2}| \\\\ &= \\frac{1}{n}. \\end{align}\\] So we need to find such \\(n_0\\), that for every \\(n > n_0\\) we will have \\(\\frac{1}{n} < \\epsilon\\). So \\(n_0 > \\frac{1}{\\epsilon}\\). x <- 1:1000 X <- matrix(data = NA, nrow = length(x), ncol = 100) y <- vector(mode = "numeric", length = length(x)) for (i in 1:length(x)) { X[i, ] <- rbinom(100, size = 1, prob = 0.5) } X <- apply(X, 2, cumsum) tmp_mat <- matrix(data = (1:1000)^2, nrow = 1000, ncol = 100) X <- X / tmp_mat y <- apply(X, 1, mean) ggplot(data.frame(x = x, y = y), aes(x = x, y = y)) + geom_line() Exercise 11.2 Let \\(\\Omega = [0,1]\\) and let \\(X_n\\) be a sequence of random variables, defined as \\[\\begin{align} X_n(\\omega) = \\begin{cases} \\omega^3, &\\omega = \\frac{i}{n}, &0 \\leq i \\leq 1 \\\\ 1, & \\text{otherwise.} \\end{cases} \\end{align}\\] Show that \\(X_n\\) converges almost surely to \\(X \\sim \\text{Uniform}(0,1)\\). Solution. We need to show \\(P(\\{\\omega: X_n(\\omega) \\rightarrow X(\\omega)\\}) = 1\\). Let \\(\\omega \\neq \\frac{i}{n}\\). Then for any \\(\\omega\\), \\(X_n\\) converges pointwise to \\(X\\): \\[\\begin{align} X_n(\\omega) = 1 \\implies |X_n(\\omega) - X(s)| = |1 - 1| < \\epsilon. \\end{align}\\] The above is independent of \\(n\\). Since there are countably infinite number of elements in the complement (\\(\\frac{i}{n}\\)), the probability of this set is 1. Exercise 11.3 Borrowed from Wasserman. Let \\(X_n \\sim \\text{N}(0, \\frac{1}{n})\\) and let \\(X\\) be a random variable with CDF \\[\\begin{align} F_X(x) = \\begin{cases} 0, &x < 0 \\\\ 1, &x \\geq 0. \\end{cases} \\end{align}\\] Does \\(X_n\\) converge to \\(X\\) in distribution? How about in probability? Prove or disprove these statement. R: Plot the CDF of \\(X_n\\) for \\(n = 1, 2, 5, 10, 100, 1000\\). Solution. Let us first check convergence in distribution. \\[\\begin{align} \\lim_{n \\rightarrow \\infty} F_{X_n}(x) &= \\lim_{n \\rightarrow \\infty} \\phi (\\sqrt(n) x). \\end{align}\\] We have two cases, for \\(x < 0\\) and \\(x > 0\\). We do not need to check for \\(x = 0\\), since \\(F_X\\) is not continuous in that point. \\[\\begin{align} \\lim_{n \\rightarrow \\infty} \\phi (\\sqrt(n) x) = \\begin{cases} 0, & x < 0 \\\\ 1, & x > 0. \\end{cases} \\end{align}\\] This is the same as \\(F_X\\). Let us now check convergence in probability. 
Since \\(X\\) is a point-mass distribution at zero, we have \\[\\begin{align} \\lim_{n \\rightarrow \\infty} P(|X_n| > \\epsilon) &= \\lim_{n \\rightarrow \\infty} (P(X_n > \\epsilon) + P(X_n < -\\epsilon)) \\\\ &= \\lim_{n \\rightarrow \\infty} (1 - P(X_n < \\epsilon) + P(X_n < -\\epsilon)) \\\\ &= \\lim_{n \\rightarrow \\infty} (1 - \\phi(\\sqrt{n} \\epsilon) + \\phi(- \\sqrt{n} \\epsilon)) \\\\ &= 0. \\end{align}\\] n <- c(1,2,5,10,100,1000) ggplot(data = data.frame(x = seq(-5, 5, by = 0.01)), aes(x = x)) + stat_function(fun = pnorm, args = list(mean = 0, sd = 1/1), aes(color = "sd = 1/1")) + stat_function(fun = pnorm, args = list(mean = 0, sd = 1/2), aes(color = "sd = 1/2")) + stat_function(fun = pnorm, args = list(mean = 0, sd = 1/5), aes(color = "sd = 1/5")) + stat_function(fun = pnorm, args = list(mean = 0, sd = 1/10), aes(color = "sd = 1/10")) + stat_function(fun = pnorm, args = list(mean = 0, sd = 1/100), aes(color = "sd = 1/100")) + stat_function(fun = pnorm, args = list(mean = 0, sd = 1/1000), aes(color = "sd = 1/1000")) + stat_function(fun = pnorm, args = list(mean = 0, sd = 1/10000), aes(color = "sd = 1/10000")) Exercise 11.4 Let \\(X_i\\) be i.i.d. and \\(\\mu = E(X_1)\\). Let variance of \\(X_1\\) be finite. Show that the mean of \\(X_i\\), \\(\\bar{X}_n = \\frac{1}{n}\\sum_{i=1}^n X_i\\) converges in quadratic mean to \\(\\mu\\). Solution. \\[\\begin{align} \\lim_{n \\rightarrow \\infty} E(|\\bar{X_n} - \\mu|^2) &= \\lim_{n \\rightarrow \\infty} E(\\bar{X_n}^2 - 2 \\bar{X_n} \\mu + \\mu^2) \\\\ &= \\lim_{n \\rightarrow \\infty} (E(\\bar{X_n}^2) - 2 \\mu E(\\frac{\\sum_{i=1}^n X_i}{n}) + \\mu^2) \\\\ &= \\lim_{n \\rightarrow \\infty} E(\\bar{X_n})^2 + \\lim_{n \\rightarrow \\infty} Var(\\bar{X_n}) - 2 \\mu^2 + \\mu^2 \\\\ &= \\lim_{n \\rightarrow \\infty} \\frac{n^2 \\mu^2}{n^2} + \\lim_{n \\rightarrow \\infty} \\frac{\\sigma^2}{n} - \\mu^2 \\\\ &= \\mu^2 - \\mu^2 + \\lim_{n \\rightarrow \\infty} \\frac{\\sigma^2}{n} \\\\ &= 0. \\end{align}\\] "],["lt.html", "Chapter 12 Limit theorems", " Chapter 12 Limit theorems This chapter deals with limit theorems. The students are expected to acquire the following knowledge: Theoretical Monte Carlo integration convergence. Difference between weak and strong law of large numbers. .fold-btn { float: right; margin: 5px 5px 0 0; } .fold { border: 1px solid black; min-height: 40px; } Exercise 12.1 Show that Monte Carlo integration converges almost surely to the true integral of a bounded function. Solution. Let \\(g\\) be a function defined on \\(\\Omega\\). Let \\(X_i\\), \\(i = 1,...,n\\) be i.i.d. (multivariate) uniform random variables with bounds defined on \\(\\Omega\\). Let \\(Y_i\\) = \\(g(X_i)\\). Then it follows that \\(Y_i\\) are also i.i.d. random variables and their expected value is \\(E[g(X)] = \\int_{\\Omega} g(x) f_X(x) dx = \\frac{1}{V_{\\Omega}} \\int_{\\Omega} g(x) dx\\). By the strong law of large numbers, we have \\[\\begin{equation} \\frac{1}{n}\\sum_{i=1}^n Y_i \\xrightarrow{\\text{a.s.}} E[g(X)]. \\end{equation}\\] It follows that \\[\\begin{equation} V_{\\Omega} \\frac{1}{n}\\sum_{i=1}^n Y_i \\xrightarrow{\\text{a.s.}} \\int_{\\Omega} g(x) dx. \\end{equation}\\] Exercise 12.2 Let \\(X\\) be a geometric random variable with probability 0.5 and support in positive integers. Let \\(Y = 2^X (-1)^X X^{-1}\\). Find the expected value of \\(Y\\) by using conditional convergence (this variable does not have an expected value in the conventional sense – the series is not absolutely convergent). 
R: Draw \\(10000\\) samples from a geometric distribution with probability 0.5 and support in positive integers to get \\(X\\). Then calculate \\(Y\\) and plot the means at each iteration (sample). Additionally, plot the expected value calculated in a. Try it with different seeds. What do you notice? Solution. \\[\\begin{align*} E[Y] &= \\sum_{x=1}^{\\infty} \\frac{2^x (-1)^x}{x} 0.5^x \\\\ &= \\sum_{x=1}^{\\infty} \\frac{(-1)^x}{x} \\\\ &= - \\sum_{x=1}^{\\infty} \\frac{(-1)^{x+1}}{x} \\\\ &= - \\ln(2) \\end{align*}\\] set.seed(3) x <- rgeom(100000, prob = 0.5) + 1 y <- 2^x * (-1)^x * x^{-1} y_means <- cumsum(y) / seq_along(y) df <- data.frame(x = 1:length(y_means), y = y_means) ggplot(data = df, aes(x = x, y = y)) + geom_line() + geom_hline(yintercept = -log(2)) "],["eb.html", "Chapter 13 Estimation basics 13.1 ECDF 13.2 Properties of estimators", " Chapter 13 Estimation basics This chapter deals with estimation basics. The students are expected to acquire the following knowledge: Biased and unbiased estimators. Consistent estimators. Empirical cumulative distribution function. .fold-btn { float: right; margin: 5px 5px 0 0; } .fold { border: 1px solid black; min-height: 40px; } 13.1 ECDF Exercise 13.1 (ECDF intuition) Take any univariate continuous distribution that is readily available in R and plot its CDF (\\(F\\)). Draw one sample (\\(n = 1\\)) from the chosen distribution and draw the ECDF (\\(F_n\\)) of that one sample. Use the definition of the ECDF, not an existing function in R. Implementation hint: ECDFs are always piecewise constant - they only jump at the sampled values and by \\(1/n\\). Repeat (b) for \\(n = 5, 10, 100, 1000...\\) Theory says that \\(F_n\\) should converge to \\(F\\). Can you observe that? For \\(n = 100\\) repeat the process \\(m = 20\\) times and plot every \\(F_n^{(m)}\\). Theory says that \\(F_n\\) will converge to \\(F\\) the slowest where \\(F\\) is close to 0.5 (where the variance is largest). Can you observe that? library(ggplot2) set.seed(1) ggplot(data = data.frame(x = seq(-5, 5, by = 0.01))) + # stat_function(aes(x = x), fun = pbeta, args = list(shape1 = 1, shape2 = 2)) stat_function(aes(x = x), fun = pnorm, args = list(mean = 0, sd = 1)) one_samp <- rnorm(1) X <- data.frame(x = c(-5, one_samp, 5), y = c(0,1,1)) ggplot(data = data.frame(x = seq(-5, 5, by = 0.01))) + # stat_function(aes(x = x), fun = pbeta, args = list(shape1 = 1, shape2 = 2)) stat_function(aes(x = x), fun = pnorm, args = list(mean = 0, sd = 1)) + geom_step(data = X, aes(x = x, y = y)) N <- c(5, 10, 100, 1000) X <- NULL for (n in N) { tmp <- rnorm(n) tmp_X <- data.frame(x = c(-5, sort(tmp), 5), y = c(0, seq(1/n, 1, by = 1/n), 1), n = n) X <- rbind(X, tmp_X) } ggplot(data = data.frame(x = seq(-5, 5, by = 0.01))) + # stat_function(aes(x = x), fun = pbeta, args = list(shape1 = 1, shape2 = 2)) stat_function(aes(x = x), fun = pnorm, args = list(mean = 0, sd = 1)) + geom_step(data = X, aes(x = x, y = y, color = as.factor(n))) + labs(color = "N") 13.2 Properties of estimators Exercise 13.2 Show that the sample average is, as an estimator of the mean: unbiased, consistent, asymptotically normal. Solution. \\[\\begin{align*} E[\\frac{1}{n} \\sum_{i=1}^n X_i] &= \\frac{1}{n} \\sum_{i=i}^n E[X_i] \\\\ &= E[X]. 
\\end{align*}\\] \\[\\begin{align*} \\lim_{n \\rightarrow \\infty} P(|\\frac{1}{n} \\sum_{i=1}^n X_i - E[X]| > \\epsilon) &= \\lim_{n \\rightarrow \\infty} P((\\frac{1}{n} \\sum_{i=1}^n X_i - E[X])^2 > \\epsilon^2) \\\\ & \\leq \\lim_{n \\rightarrow \\infty} \\frac{E[(\\frac{1}{n} \\sum_{i=1}^n X_i - E[X])^2]}{\\epsilon^2} & \\text{Markov inequality} \\\\ & = \\lim_{n \\rightarrow \\infty} \\frac{E[(\\frac{1}{n} \\sum_{i=1}^n X_i)^2 - 2 \\frac{1}{n} \\sum_{i=1}^n X_i E[X] + E[X]^2]}{\\epsilon^2} \\\\ & = \\lim_{n \\rightarrow \\infty} \\frac{E[(\\frac{1}{n} \\sum_{i=1}^n X_i)^2] - 2 E[X]^2 + E[X]^2}{\\epsilon^2} \\\\ &= 0 \\end{align*}\\] For the last equality see the solution to ??. Follows directly from the CLT. Exercise 13.3 (Consistent but biased estimator) Show that sample variance (the plug-in estimator of variance) is a biased estimator of variance. Show that sample variance is a consistent estimator of variance. Show that the estimator with (\\(N-1\\)) (Bessel correction) is unbiased. Solution. \\[\\begin{align*} E[\\frac{1}{n} \\sum_{i=1}^n (Y_i - \\bar{Y})^2] &= \\frac{1}{n} \\sum_{i=1}^n E[(Y_i - \\bar{Y})^2] \\\\ &= \\frac{1}{n} \\sum_{i=1}^n E[Y_i^2] - 2 E[Y_i \\bar{Y}] + \\bar{Y}^2)] \\\\ &= \\frac{1}{n} \\sum_{i=1}^n E[Y_i^2 - 2 Y_i \\bar{Y} + \\bar{Y}^2] \\\\ &= \\frac{1}{n} \\sum_{i=1}^n E[Y_i^2 - \\frac{2}{n} Y_i^2 - \\frac{2}{n} \\sum_{i \\neq j} Y_i Y_j + \\frac{1}{n^2}\\sum_j \\sum_{k \\neq j} Y_j Y_k + \\frac{1}{n^2} \\sum_j Y_j^2] \\\\ &= \\frac{1}{n} \\sum_{i=1}^n \\frac{n - 2}{n} (\\sigma^2 + \\mu^2) - \\frac{2}{n} (n - 1) \\mu^2 + \\frac{1}{n^2}n(n-1)\\mu^2 + \\frac{1}{n^2}n(\\sigma^2 + \\mu^2) \\\\ &= \\frac{n-1}{n}\\sigma^2 \\\\ < \\sigma^2. \\end{align*}\\] Let \\(S_n\\) denote the sample variance. Then we can write it as \\[\\begin{align*} S_n &= \\frac{1}{n} \\sum_{i=1}^n (X_i - \\bar{X})^2 = \\frac{1}{n} \\sum_{i=1}^n (X_i - \\mu)^2 + 2(X_i - \\mu)(\\mu - \\bar{X}) + (\\mu - \\bar{X})^2. \\end{align*}\\] Now \\(\\bar{X}\\) converges in probability (by WLLN) to \\(\\mu\\) therefore the right terms converge in probability to zero. The left term converges in probability to \\(\\sigma^2\\), also by WLLN. Therefore the sample variance is a consistent estimatior of the variance. The denominator changes in the second-to-last line of a., therefore the last line is now equality. Exercise 13.4 (Estimating the median) Show that the sample median is an unbiased estimator of the median for N\\((\\mu, \\sigma^2)\\). Show that the sample median is an unbiased estimator of the mean for any distribution with symmetric density. Hint 1: The pdf of an order statistic is \\(f_{X_{(k)}}(x) = \\frac{n!}{(n - k)!(k - 1)!}f_X(x)\\Big(F_X(x)^{k-1} (1 - F_X(x)^{n - k}) \\Big)\\). Hint 2: A distribution is symmetric when \\(X\\) and \\(2a - X\\) have the same distribution for some \\(a\\). Solution. Let \\(Z_i\\), \\(i = 1,...,n\\) be i.i.d. variables with a symmetric distribution and let \\(Z_{k:n}\\) denote the \\(k\\)-th order statistic. We will distinguish two cases, when \\(n\\) is odd and when \\(n\\) is even. Let first \\(n = 2m + 1\\) be odd. Then the sample median is \\(M = Z_{m+1:2m+1}\\). Its PDF is \\[\\begin{align*} f_M(x) = (m+1)\\binom{2m + 1}{m}f_Z(x)\\Big(F_Z(x)^m (1 - F_Z(x)^m) \\Big). \\end{align*}\\] For every symmetric distribution, it holds that \\(F_X(x) = 1 - F(2a - x)\\). Let \\(a = \\mu\\), the population mean. Plugging this into the PDF, we get that \\(f_M(x) = f_M(2\\mu -x)\\). 
It follows that \\[\\begin{align*} E[M] &= E[2\\mu - M] \\\\ 2E[M] &= 2\\mu \\\\ E[M] &= \\mu. \\end{align*}\\] Now let \\(n = 2m\\) be even. Then the sample median is \\(M = \\frac{Z_{m:2m} + Z_{m+1:2m}}{2}\\). It can be shown, that the joint PDF of these terms is also symmetric. Therefore, similar to the above \\[\\begin{align*} E[M] &= E[\\frac{Z_{m:2m} + Z_{m+1:2m}}{2}] \\\\ &= E[\\frac{2\\mu - M + 2\\mu - M}{2}] \\\\ &= E[2\\mu - M]. \\end{align*}\\] The above also proves point a. as the median and the mean are the same in normal distribution. Exercise 13.5 (Matrix trace estimation) The Hutchinson trace estimator [1] is an estimator of the trace of a symmetric positive semidefinite matrix A that relies on Monte Carlo sampling. The estimator is defined as \\[\\begin{align*} \\textrm{tr}(A) \\approx \\frac{1}{n} \\Sigma_{i=1}^n z_i^T A z_i, &\\\\ z_i \\sim_{\\mathrm{IID}} \\textrm{Uniform}(\\{-1, 1\\}^m), & \\end{align*}\\] where \\(A \\in \\mathbb{R}^{m \\times m}\\) is a symmetric positive semidefinite matrix. Elements of each vector \\(z_i\\) are either \\(-1\\) or \\(1\\) with equal probability. This is also called a Rademacher distribution. Data scientists often want the trace of a Hessian to obtain valuable curvature information for a loss function. Per [2], an example is classifying ten digits based on \\((28,28)\\) grayscale images (i.e. MNIST data) using logistic regression. The number of parameters is \\(m = 28^2 \\cdot 10 = 7840\\) and the size of the Hessian is \\(m^2\\), roughly \\(6 \\cdot 10^6\\). The diagonal average is equal to the average eigenvalue, which may be useful for optimization; in MCMC contexts, this would be useful for preconditioners and step size optimization. Computing Hessians (as a means of getting eigenvalue information) is often intractable, but Hessian-vector products can be computed faster by autodifferentiation (with e.g. Tensorflow, Pytorch, Jax). This is one motivation for the use of a stochastic trace estimator as outlined above. References: A stochastic estimator of the trace of the influence matrix for laplacian smoothing splines (Hutchinson, 1990) A Modern Analysis of Hutchinson’s Trace Estimator (Skorski, 2020) Prove that the Hutchinson trace estimator is an unbiased estimator of the trace. Solution. We first simplify our task: \\[\\begin{align} \\mathbb{E}\\left[\\frac{1}{n} \\Sigma_{i=1}^n z_i^T A z_i \\right] &= \\frac{1}{n} \\Sigma_{i=1}^n \\mathbb{E}\\left[z_i^T A z_i \\right] \\\\ &= \\mathbb{E}\\left[z_i^T A z_i \\right], \\end{align}\\] where the second equality is due to having \\(n\\) IID vectors \\(z_i\\). We now only need to show that \\(\\mathbb{E}\\left[z^T A z \\right] = \\mathrm{tr}(A)\\). We omit the index due to all vectors being IID: \\[\\begin{align} \\mathrm{tr}(A) &= \\mathrm{tr}(AI) \\\\ &= \\mathrm{tr}(A\\mathbb{E}[zz^T]) \\\\ &= \\mathbb{E}[\\mathrm{tr}(Azz^T)] \\\\ &= \\mathbb{E}[\\mathrm{tr}(z^TAz)] \\\\ &= \\mathbb{E}[z^TAz]. \\end{align}\\] This concludes the proof. We clarify some equalities below. The second equality assumes that \\(\\mathbb{E}[zz^T] = I\\). By noting that the mean of the Rademacher distribution is 0, we have \\[\\begin{align} \\mathrm{Cov}[z, z] &= \\mathbb{E}[(z - \\mathbb{E}[z])(z - \\mathbb{E}[z])^T] \\\\ &= \\mathbb{E}[zz^T]. \\end{align}\\] Dimensions of \\(z\\) are independent, so \\(\\mathrm{Cov}[z, z]_{ij} = 0\\) for \\(i \\neq j\\). 
The diagonal will contain variances, which are equal to \\(1\\) for all dimensions \\(k = 1 \\dots m\\): \\(\\mathrm{Var}[z^{(k)}] = \\mathbb{E}[z^{(k)}z^{(k)}] - \\mathbb{E}[z^{(k)}]^2 = 1 - 0 = 1\\). It follows that the covariance is an identity matrix. Note that this is a general result for vectors with IID dimensions sampled from a distribution with mean 0 and variance 1. We could therefore use something else instead of the Rademacher, e.g. \\(z ~ N(0, I)\\). The third equality uses the fact that the expectation of a trace equals the trace of an expectation. If \\(X\\) is a random matrix, then \\(\\mathbb{E}[X]_{ij} = \\mathbb{E}[X_{ij}]\\). Therefore: \\[\\begin{align} \\mathrm{tr}(\\mathbb{E}[X]) &= \\Sigma_{i=1}^m(\\mathbb{E}[X]_{ii}) \\\\ &= \\Sigma_{i=1}^m(\\mathbb{E}[X_{ii}]) \\\\ &= \\mathbb{E}[\\Sigma_{i=1}^m(X_{ii})] \\\\ &= \\mathbb{E}[\\mathrm{tr}(X)], \\end{align}\\] where we used the linearity of the expectation in the third step. The fourth equality uses the fact that \\(\\mathrm{tr}(AB) = \\mathrm{tr}(BA)\\) for any matrices \\(A \\in \\mathbb{R}^{n \\times m}, B \\in \\mathbb{R}^{m \\times n}\\). The last inequality uses the fact that the trace of a \\(1 \\times 1\\) matrix is just its element. "],["boot.html", "Chapter 14 Bootstrap", " Chapter 14 Bootstrap This chapter deals with bootstrap. The students are expected to acquire the following knowledge: How to use bootstrap to generate coverage intervals. .fold-btn { float: right; margin: 5px 5px 0 0; } .fold { border: 1px solid black; min-height: 40px; } Exercise 14.1 Ideally, a \\(1-\\alpha\\) CI would have \\(1-\\alpha\\) coverage. That is, say a 95% CI should, in the long run, contain the true value of the parameter 95% of the time. In practice, it is impossible to assess the coverage of our CI method, because we rarely know the true parameter. In simulation, however, we can. Let’s assess the coverage of bootstrap percentile intervals. Pick a univariate distribution with readily available mean and one that you can easily sample from. Draw \\(n = 30\\) random samples from the chosen distribution and use the bootstrap (with large enough m) and percentile CI method to construct 95% CI. Repeat the process many times and count how many times the CI contains the true mean. That is, compute the actual coverage probability (don’t forget to include the standard error of the coverage probability!). What can you observe? Try one or two different distributions. What can you observe? Repeat (b) and (c) using BCa intervals (R package boot). How does the coverage compare to percentile intervals? As (d) but using intervals based on asymptotic normality (+/- 1.96 SE). How do results from (b), (d), and (e) change if we increase the sample size to n = 200? What about n = 5? library(boot) set.seed(0) nit <- 1000 # Repeat the process "many times" alpha <- 0.05 # CI parameter nboot <- 100 # m parameter for bootstrap ("large enough m") # f: change this to 200 or 5. nsample <- 30 # n = 30 random samples from the chosen distribution. Comment out BCa code if it breaks. 
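# Overview of the simulation that follows: for each of nit repetitions, draw a fresh sample of
# size nsample from each of the three distributions, build nboot bootstrap resamples, compute
# the percentile bootstrap, BCa, and asymptotic-normality 95% CIs for the mean, and record
# whether each CI contains the true mean (0 for the standard normal, 1/3 for Beta(1,2),
# 5 for Poisson(5)).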
covers <- matrix(nrow = nit, ncol = 3) covers_BCa <- matrix(nrow = nit, ncol = 3) covers_asymp_norm <- matrix(nrow = nit, ncol = 3) isin <- function (x, lower, upper) { (x > lower) & (x < upper) } for (j in 1:nit) { # Repeating many times # a: pick a univariate distribution - standard normal x1 <- rnorm(nsample) # c: one or two different distributions - beta and poisson x2 <- rbeta(nsample, 1, 2) x3 <- rpois(nsample, 5) X1 <- matrix(data = NA, nrow = nsample, ncol = nboot) X2 <- matrix(data = NA, nrow = nsample, ncol = nboot) X3 <- matrix(data = NA, nrow = nsample, ncol = nboot) for (i in 1:nboot) { X1[ ,i] <- sample(x1, nsample, replace = T) X2[ ,i] <- sample(x2, nsample, T) X3[ ,i] <- sample(x3, nsample, T) } X1_func <- apply(X1, 2, mean) X2_func <- apply(X2, 2, mean) X3_func <- apply(X3, 2, mean) X1_quant <- quantile(X1_func, probs = c(alpha / 2, 1 - alpha / 2)) X2_quant <- quantile(X2_func, probs = c(alpha / 2, 1 - alpha / 2)) X3_quant <- quantile(X3_func, probs = c(alpha / 2, 1 - alpha / 2)) covers[j,1] <- (0 > X1_quant[1]) & (0 < X1_quant[2]) covers[j,2] <- ((1 / 3) > X2_quant[1]) & ((1 / 3) < X2_quant[2]) covers[j,3] <- (5 > X3_quant[1]) & (5 < X3_quant[2]) mf <- function (x, i) return(mean(x[i])) bootX1 <- boot(x1, statistic = mf, R = nboot) bootX2 <- boot(x2, statistic = mf, R = nboot) bootX3 <- boot(x3, statistic = mf, R = nboot) X1_quant_BCa <- boot.ci(bootX1, type = "bca")$bca X2_quant_BCa <- boot.ci(bootX2, type = "bca")$bca X3_quant_BCa <- boot.ci(bootX3, type = "bca")$bca covers_BCa[j,1] <- (0 > X1_quant_BCa[4]) & (0 < X1_quant_BCa[5]) covers_BCa[j,2] <- ((1 / 3) > X2_quant_BCa[4]) & ((1 / 3) < X2_quant_BCa[5]) covers_BCa[j,3] <- (5 > X3_quant_BCa[4]) & (5 < X3_quant_BCa[5]) # e: estimate mean and standard error # sample mean: x1_bar <- mean(x1) x2_bar <- mean(x2) x3_bar <- mean(x3) # standard error (of the sample mean) estimate: sample standard deviation / sqrt(n) x1_bar_SE <- sd(x1) / sqrt(nsample) x2_bar_SE <- sd(x2) / sqrt(nsample) x3_bar_SE <- sd(x3) / sqrt(nsample) covers_asymp_norm[j,1] <- isin(0, x1_bar - 1.96 * x1_bar_SE, x1_bar + 1.96 * x1_bar_SE) covers_asymp_norm[j,2] <- isin(1/3, x2_bar - 1.96 * x2_bar_SE, x2_bar + 1.96 * x2_bar_SE) covers_asymp_norm[j,3] <- isin(5, x3_bar - 1.96 * x3_bar_SE, x3_bar + 1.96 * x3_bar_SE) } apply(covers, 2, mean) ## [1] 0.918 0.925 0.905 apply(covers, 2, sd) / sqrt(nit) ## [1] 0.008680516 0.008333333 0.009276910 apply(covers_BCa, 2, mean) ## [1] 0.927 0.944 0.927 apply(covers_BCa, 2, sd) / sqrt(nit) ## [1] 0.008230355 0.007274401 0.008230355 apply(covers_asymp_norm, 2, mean) ## [1] 0.939 0.937 0.930 apply(covers_asymp_norm, 2, sd) / sqrt(nit) ## [1] 0.007572076 0.007687008 0.008072494 Exercise 14.2 You are given a sample of independent observations from a process of interest: Index 1 2 3 4 5 6 7 8 X 7 2 4 6 4 5 9 10 Compute the plug-in estimate of mean and 95% symmetric CI based on asymptotic normality. Use the plug-in estimate of SE. Same as (a), but use the unbiased estimate of SE. Apply nonparametric bootstrap with 1000 bootstrap replications and estimate the 95% CI for the mean with percentile-based CI. 
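For parts a) and b), the two standard error estimates differ only in the variance estimator used (a brief reminder in the notation of the exercise): the plug-in estimate uses the biased sample variance, while part b) applies the Bessel correction, \\[\\begin{align} \\hat{SE}_{\\text{plug-in}} &= \\frac{1}{\\sqrt{n}} \\sqrt{\\frac{1}{n}\\sum_{i=1}^n (x_i - \\bar{x})^2}, \\\\ \\hat{SE}_{\\text{unbiased}} &= \\frac{1}{\\sqrt{n}} \\sqrt{\\frac{1}{n-1}\\sum_{i=1}^n (x_i - \\bar{x})^2}. \\end{align}\\]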
# a x <- c(7, 2, 4, 6, 4, 5, 9, 10) n <- length(x) mu <- mean(x) SE <- sqrt(mean((x - mu)^2)) / sqrt(n) SE ## [1] 0.8915839 z <- qnorm(1 - 0.05 / 2) c(mu - z * SE, mu + z * SE) ## [1] 4.127528 7.622472 # b SE <- sd(x) / sqrt(n) SE ## [1] 0.9531433 c(mu - z * SE, mu + z * SE) ## [1] 4.006873 7.743127 # c set.seed(0) m <- 1000 T_mean <- function(x) {mean(x)} est_boot <- array(NA, m) for (i in 1:m) { x_boot <- x[sample(1:n, n, rep = T)] est_boot[i] <- T_mean(x_boot) } quantile(est_boot, p = c(0.025, 0.975)) ## 2.5% 97.5% ## 4.250 7.625 Exercise 14.3 We are given a sample of 10 independent paired (bivariate) observations: Index 1 2 3 4 5 6 7 8 9 10 X 1.26 -0.33 1.33 1.27 0.41 -1.54 -0.93 -0.29 -0.01 2.40 Y 2.64 0.33 0.48 0.06 -0.88 -2.14 -2.21 0.95 0.83 1.45 Compute Pearson correlation between X and Y. Use the cor.test() from R to estimate a 95% CI for the estimate from (a). Apply nonparametric bootstrap with 1000 bootstrap replications and estimate the 95% CI for the Pearson correlation with percentile-based CI. Compare CI from (b) and (c). Are they similar? How would the bootstrap estimation of CI change if we were interested in Spearman or Kendall correlation instead? x <- c(1.26, -0.33, 1.33, 1.27, 0.41, -1.54, -0.93, -0.29, -0.01, 2.40) y <- c(2.64, 0.33, 0.48, 0.06, -0.88, -2.14, -2.21, 0.95, 0.83, 1.45) # a cor(x, y) ## [1] 0.6991247 # b res <- cor.test(x, y) res$conf.int[1:2] ## [1] 0.1241458 0.9226238 # c set.seed(0) m <- 1000 n <- length(x) T_cor <- function(x, y) {cor(x, y)} est_boot <- array(NA, m) for (i in 1:m) { idx <- sample(1:n, n, rep = T) # !!! important to use same indices to keep dependency between x and y est_boot[i] <- T_cor(x[idx], y[idx]) } quantile(est_boot, p = c(0.025, 0.975)) ## 2.5% 97.5% ## 0.2565537 0.9057664 # d # Yes, but the bootstrap CI is more narrow. # e # We just use the functions for Kendall/Spearman coefficients instead: T_kendall <- function(x, y) {cor(x, y, method = "kendall")} T_spearman <- function(x, y) {cor(x, y, method = "spearman")} # Put this in a function that returns the CI bootstrap_95_ci <- function(x, y, t, m = 1000) { n <- length(x) est_boot <- array(NA, m) for (i in 1:m) { idx <- sample(1:n, n, rep = T) # !!! important to use same indices to keep dependency between x and y est_boot[i] <- t(x[idx], y[idx]) } quantile(est_boot, p = c(0.025, 0.975)) } bootstrap_95_ci(x, y, T_kendall) ## 2.5% 97.5% ## -0.08108108 0.78378378 bootstrap_95_ci(x, y, T_spearman) ## 2.5% 97.5% ## -0.1701115 0.8867925 Exercise 14.4 In this problem we will illustrate the use of the nonparametric bootstrap for estimating CIs of regression model coefficients. Load the longley dataset from base R with data(longley). Use lm() to apply linear regression using “Employed” as the target (dependent) variable and all other variables as the predictors (independent). Using lm() results, print the estimated regression coefficients and standard errors. Estimate 95% CI for the coefficients using +/- 1.96 * SE. Use nonparametric bootstrap with 100 replications to estimate the SE of the coefficients from (b). Compare the SE from (c) with those from (b). # a data(longley) # b res <- lm(Employed ~ . 
, longley) tmp <- data.frame(summary(res)$coefficients[,1:2]) tmp$LB <- tmp[,1] - 1.96 * tmp[,2] tmp$UB <- tmp[,1] + 1.96 * tmp[,2] tmp ## Estimate Std..Error LB UB ## (Intercept) -3.482259e+03 8.904204e+02 -5.227483e+03 -1.737035e+03 ## GNP.deflator 1.506187e-02 8.491493e-02 -1.513714e-01 1.814951e-01 ## GNP -3.581918e-02 3.349101e-02 -1.014616e-01 2.982320e-02 ## Unemployed -2.020230e-02 4.883997e-03 -2.977493e-02 -1.062966e-02 ## Armed.Forces -1.033227e-02 2.142742e-03 -1.453204e-02 -6.132495e-03 ## Population -5.110411e-02 2.260732e-01 -4.942076e-01 3.919994e-01 ## Year 1.829151e+00 4.554785e-01 9.364136e-01 2.721889e+00 # c set.seed(0) m <- 100 n <- nrow(longley) T_coef <- function(x) { lm(Employed ~ . , x)$coefficients } est_boot <- array(NA, c(m, ncol(longley))) for (i in 1:m) { idx <- sample(1:n, n, rep = T) est_boot[i,] <- T_coef(longley[idx,]) } SE <- apply(est_boot, 2, sd) SE ## [1] 1.826011e+03 1.605981e-01 5.693746e-02 8.204892e-03 3.802225e-03 ## [6] 3.907527e-01 9.414436e-01 # Show the standard errors around coefficients library(ggplot2) library(reshape2) df <- data.frame(index = 1:7, bootstrap_SE = SE, lm_SE = tmp$Std..Error) melted_df <- melt(df[2:nrow(df), ], id.vars = "index") # Ignore bias which has a really large magnitude ggplot(melted_df, aes(x = index, y = value, fill = variable)) + geom_bar(stat="identity", position="dodge") + xlab("Coefficient") + ylab("Standard error") # + scale_y_continuous(trans = "log") # If you want to also plot bias Exercise 14.5 This exercise shows a shortcoming of the bootstrap method when using the plug in estimator for the maximum. Compute the 95% bootstrap CI for the maximum of a standard normal distribution. Compute the 95% bootstrap CI for the maximum of a binomial distribution with n = 15 and p = 0.2. Repeat (b) using p = 0.9. Why is the result different? # bootstrap CI for maximum alpha <- 0.05 T_max <- function(x) {max(x)} # Equal to T_max = max bootstrap <- function(x, t, m = 1000) { n <- length(x) values <- rep(0, m) for (i in 1:m) { values[i] <- t(sample(x, n, replace = T)) } quantile(values, probs = c(alpha / 2, 1 - alpha / 2)) } # a # Meaningless, as the normal distribution can yield arbitrarily large values. x <- rnorm(100) bootstrap(x, T_max) ## 2.5% 97.5% ## 1.819425 2.961743 # b x <- rbinom(100, size = 15, prob = 0.2) # min = 0, max = 15 bootstrap(x, T_max) ## 2.5% 97.5% ## 6 7 # c x <- rbinom(100, size = 15, prob = 0.9) # min = 0, max = 15 bootstrap(x, T_max) ## 2.5% 97.5% ## 15 15 # Observation: to estimate the maximum, we need sufficient probability mass near the maximum value the distribution can yield. # Using bootstrap is pointless when there is too little mass near the true maximum. # In general, bootstrap will fail when estimating the CI for the maximum. "],["ml.html", "Chapter 15 Maximum likelihood 15.1 Deriving MLE 15.2 Fisher information 15.3 The German tank problem", " Chapter 15 Maximum likelihood This chapter deals with maximum likelihood estimation. The students are expected to acquire the following knowledge: How to derive MLE. Applying MLE in R. Calculating and interpreting Fisher information. Practical use of MLE. .fold-btn { float: right; margin: 5px 5px 0 0; } .fold { border: 1px solid black; min-height: 40px; } 15.1 Deriving MLE Exercise 15.1 Derive the maximum likelihood estimator of variance for N\\((\\mu, \\sigma^2)\\). Compare with results from 13.3. What does that say about the MLE estimator? Solution. 
The mean is assumed constant, so we have the likelihood \\[\\begin{align} L(\\sigma^2; y) &= \\prod_{i=1}^n \\frac{1}{\\sqrt{2 \\pi \\sigma^2}} e^{-\\frac{(y_i - \\mu)^2}{2 \\sigma^2}} \\\\ &= \\frac{1}{\\sqrt{2 \\pi \\sigma^2}^n} e^{\\frac{-\\sum_{i=1}^n (y_i - \\mu)^2}{2 \\sigma^2}} \\end{align}\\] We need to find the maximum of this function. We first observe that we can replace \\(\\frac{-\\sum_{i=1}^n (y_i - \\mu)^2}{2}\\) with a constant \\(c\\), since none of the terms are dependent on \\(\\sigma^2\\). Additionally, the term \\(\\frac{1}{\\sqrt{2 \\pi}^n}\\) does not affect the calculation of the maximum. So now we have \\[\\begin{align} L(\\sigma^2; y) &= (\\sigma^2)^{-\\frac{n}{2}} e^{\\frac{c}{\\sigma^2}}. \\end{align}\\] Differentiating we get \\[\\begin{align} \\frac{d}{d \\sigma^2} L(\\sigma^2; y) &= (\\sigma^2)^{-\\frac{n}{2}} \\frac{d}{d \\sigma^2} e^{\\frac{c}{\\sigma^2}} + e^{\\frac{c}{\\sigma^2}} \\frac{d}{d \\sigma^2} (\\sigma^2)^{-\\frac{n}{2}} \\\\ &= - (\\sigma^2)^{-\\frac{n}{2}} e^{\\frac{c}{\\sigma^2}} \\frac{c}{(\\sigma^2)^2} - e^{\\frac{c}{\\sigma^2}} \\frac{n}{2} (\\sigma^2)^{-\\frac{n + 2}{2}} \\\\ &= - (\\sigma^2)^{-\\frac{n + 4}{2}} e^{\\frac{c}{\\sigma^2}} c - e^{\\frac{c}{\\sigma^2}} \\frac{n}{2} (\\sigma^2)^{-\\frac{n + 2}{2}} \\\\ &= - e^{\\frac{c}{\\sigma^2}} (\\sigma^2)^{-\\frac{n + 4}{2}} \\Big(c + \\frac{n}{2}\\sigma^2 \\Big). \\end{align}\\] To get the maximum, this has to equal to 0, so \\[\\begin{align} c + \\frac{n}{2}\\sigma^2 &= 0 \\\\ \\sigma^2 &= -\\frac{2c}{n} \\\\ \\sigma^2 &= \\frac{\\sum_{i=1}^n (Y_i - \\mu)^2}{n}. \\end{align}\\] The MLE estimator is biased. Exercise 15.2 (Multivariate normal distribution) Derive the maximum likelihood estimate for the mean and covariance matrix of the multivariate normal. Simulate \\(n = 40\\) samples from a bivariate normal distribution (choose non-trivial parameters, that is, mean \\(\\neq 0\\) and covariance \\(\\neq 0\\)). Compute the MLE for the sample. Overlay the data with an ellipse that is determined by the MLE and an ellipse that is determined by the chosen true parameters. Repeat b. several times and observe how the estimates (ellipses) vary around the true value. Hint: For the derivation of MLE, these identities will be helpful: \\(\\frac{\\partial b^T a}{\\partial a} = \\frac{\\partial a^T b}{\\partial a} = b\\), \\(\\frac{\\partial a^T A a}{\\partial a} = (A + A^T)a\\), \\(\\frac{\\partial \\text{tr}(BA)}{\\partial A} = B^T\\), \\(\\frac{\\partial \\ln |A|}{\\partial A} = (A^{-1})^T\\), \\(a^T A a = \\text{tr}(a^T A a) = \\text{tr}(a a^T A) = \\text{tr}(Aaa^T)\\). Solution. The log likelihood of the MVN distribution is \\[\\begin{align*} l(\\mu, \\Sigma ; x) &= -\\frac{1}{2}\\Big(\\sum_{i=1}^n k\\ln(2\\pi) + |\\Sigma| + (x_i - \\mu)^T \\Sigma^{-1} (x_i - \\mu)\\Big) \\\\ &= -\\frac{n}{2}\\ln|\\Sigma| + -\\frac{1}{2}\\Big(\\sum_{i=1}^n(x_i - \\mu)^T \\Sigma^{-1} (x_i - \\mu)\\Big) + c, \\end{align*}\\] where \\(c\\) is a constant with respect to \\(\\mu\\) and \\(\\Sigma\\). To find the MLE we first need to find partial derivatives. Let us start with \\(\\mu\\). \\[\\begin{align*} \\frac{\\partial}{\\partial \\mu}l(\\mu, \\Sigma ; x) &= \\frac{\\partial}{\\partial \\mu} -\\frac{1}{2}\\Big(\\sum_{i=1}^n x_i^T \\Sigma^{-1} x_i - x_i^T \\Sigma^{-1} \\mu - \\mu^T \\Sigma^{-1} x_i + \\mu^T \\Sigma^{-1} \\mu \\Big) \\\\ &= -\\frac{1}{2}\\Big(\\sum_{i=1}^n - \\Sigma^{-1} x_i - \\Sigma^{-1} x_i + 2 \\Sigma^{-1} \\mu \\Big) \\\\ &= -\\Sigma^{-1}\\Big(\\sum_{i=1}^n - x_i + \\mu \\Big). 
\\end{align*}\\] Equating above with zero, we get \\[\\begin{align*} \\sum_{i=1}^n - x_i + \\mu &= 0 \\\\ \\hat{\\mu} = \\frac{1}{n} \\sum_{i=1}^n x_i, \\end{align*}\\] which is the dimension-wise empirical mean. Now for the covariance matrix \\[\\begin{align*} \\frac{\\partial}{\\partial \\Sigma^{-1}}l(\\mu, \\Sigma ; x) &= \\frac{\\partial}{\\partial \\Sigma^{-1}} -\\frac{n}{2}\\ln|\\Sigma| + -\\frac{1}{2}\\Big(\\sum_{i=1}^n(x_i - \\mu)^T \\Sigma^{-1} (x_i - \\mu)\\Big) \\\\ &= \\frac{\\partial}{\\partial \\Sigma^{-1}} -\\frac{n}{2}\\ln|\\Sigma| + -\\frac{1}{2}\\Big(\\sum_{i=1}^n \\text{tr}((x_i - \\mu)^T \\Sigma^{-1} (x_i - \\mu))\\Big) \\\\ &= \\frac{\\partial}{\\partial \\Sigma^{-1}} -\\frac{n}{2}\\ln|\\Sigma| + -\\frac{1}{2}\\Big(\\sum_{i=1}^n \\text{tr}((\\Sigma^{-1} (x_i - \\mu) (x_i - \\mu)^T )\\Big) \\\\ &= \\frac{n}{2}\\Sigma + -\\frac{1}{2}\\Big(\\sum_{i=1}^n (x_i - \\mu) (x_i - \\mu)^T \\Big). \\end{align*}\\] Equating above with zero, we get \\[\\begin{align*} \\hat{\\Sigma} = \\frac{1}{n}\\sum_{i=1}^n (x_i - \\mu) (x_i - \\mu)^T. \\end{align*}\\] set.seed(1) n <- 40 mu <- c(1, -2) Sigma <- matrix(data = c(2, -1.6, -1.6, 1.8), ncol = 2) X <- mvrnorm(n = n, mu = mu, Sigma = Sigma) colnames(X) <- c("X1", "X2") X <- as.data.frame(X) # plot.new() tru_ellip <- ellipse(mu, Sigma, draw = FALSE) colnames(tru_ellip) <- c("X1", "X2") tru_ellip <- as.data.frame(tru_ellip) mu_est <- apply(X, 2, mean) tmp <- as.matrix(sweep(X, 2, mu_est)) Sigma_est <- (1 / n) * t(tmp) %*% tmp est_ellip <- ellipse(mu_est, Sigma_est, draw = FALSE) colnames(est_ellip) <- c("X1", "X2") est_ellip <- as.data.frame(est_ellip) ggplot(data = X, aes(x = X1, y = X2)) + geom_point() + geom_path(data = tru_ellip, aes(x = X1, y = X2, color = "truth")) + geom_path(data = est_ellip, aes(x = X1, y = X2, color = "estimated")) + labs(color = "type") Exercise 15.3 (Logistic regression) Logistic regression is a popular discriminative model when our target variable is binary (categorical with 2 values). One of the ways of looking at logistic regression is that it is linear regression but instead of using the linear term as the mean of a normal RV, we use it as the mean of a Bernoulli RV. Of course, the mean of a Bernoulli is bounded on \\([0,1]\\), so, to avoid non-sensical values, we squeeze the linear between 0 and 1 with the inverse logit function inv_logit\\((z) = 1 / (1 + e^{-z})\\). This leads to the following model: \\(y_i | \\beta, x_i \\sim \\text{Bernoulli}(\\text{inv_logit}(\\beta x_i))\\). Explicitly write the likelihood function of beta. Implement the likelihood function in R. Use black-box box-constraint optimization (for example, optim() with L-BFGS) to find the maximum likelihood estimate for beta for \\(x\\) and \\(y\\) defined below. Plot the estimated probability as a function of the independent variable. Compare with the truth. Let \\(y2\\) be a response defined below. Will logistic regression work well on this dataset? Why not? How can we still use the model, without changing it? inv_log <- function (z) { return (1 / (1 + exp(-z))) } set.seed(1) x <- rnorm(100) y <- rbinom(100, size = 1, prob = inv_log(1.2 * x)) y2 <- rbinom(100, size = 1, prob = inv_log(1.2 * x + 1.4 * x^2)) Solution. \\[\\begin{align*} l(\\beta; x, y) &= p(y | x, \\beta) \\\\ &= \\ln(\\prod_{i=1}^n \\text{inv_logit}(\\beta x_i)^{y_i} (1 - \\text{inv_logit}(\\beta x_i))^{1 - y_i}) \\\\ &= \\sum_{i=1}^n y_i \\ln(\\text{inv_logit}(\\beta x_i)) + (1 - y_i) \\ln(1 - \\text{inv_logit}(\\beta x_i)). 
\\end{align*}\\] set.seed(1) inv_log <- function (z) { return (1 / (1 + exp(-z))) } x <- rnorm(100) y <- x y <- rbinom(100, size = 1, prob = inv_log(1.2 * x)) l_logistic <- function (beta, X, y) { logl <- -sum(y * log(inv_log(as.vector(beta %*% X))) + (1 - y) * log((1 - inv_log(as.vector(beta %*% X))))) return(logl) } my_optim <- optim(par = 0.5, fn = l_logistic, method = "L-BFGS-B", lower = 0, upper = 10, X = x, y = y) my_optim$par ## [1] 1.166558 truth_p <- data.frame(x = x, prob = inv_log(1.2 * x), type = "truth") est_p <- data.frame(x = x, prob = inv_log(my_optim$par * x), type = "estimated") plot_df <- rbind(truth_p, est_p) ggplot(data = plot_df, aes(x = x, y = prob, color = type)) + geom_point(alpha = 0.3) y2 <- rbinom(2000, size = 1, prob = inv_log(1.2 * x + 1.4 * x^2)) X2 <- cbind(x, x^2) my_optim2 <- optim(par = c(0, 0), fn = l_logistic, method = "L-BFGS-B", lower = c(0, 0), upper = c(2, 2), X = t(X2), y = y2) my_optim2$par ## [1] 1.153656 1.257649 tmp <- sweep(data.frame(x = x, x2 = x^2), 2, my_optim2$par, FUN = "*") tmp <- tmp[ ,1] + tmp[ ,2] truth_p <- data.frame(x = x, prob = inv_log(1.2 * x + 1.4 * x^2), type = "truth") est_p <- data.frame(x = x, prob = inv_log(tmp), type = "estimated") plot_df <- rbind(truth_p, est_p) ggplot(data = plot_df, aes(x = x, y = prob, color = type)) + geom_point(alpha = 0.3) Exercise 15.4 (Linear regression) For the data generated below, do the following: Compute the least squares (MLE) estimate of coefficients beta using the matrix exact solution. Compute the MLE by minimizing the sum of squared residuals using black-box optimization (optim()). Compute the MLE by using the output built-in linear regression (lm() ). Compare (a-c and the true coefficients). Compute 95% CI on the beta coefficients using the output of built-in linear regression. Compute 95% CI on the beta coefficients by using (a or b) and the bootstrap with percentile method for CI. Compare with d. 
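For reference in part (a), the closed-form least-squares solution the exercise refers to is \\[\\hat{\\beta} = (X^T X)^{-1} X^T y,\\] which is exactly what the matrix computation in the solution code below evaluates.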
set.seed(1) n <- 100 x1 <- rnorm(n) x2 <- rnorm(n) x3 <- rnorm(n) X <- cbind(x1, x2, x3) beta <- c(0.2, 0.6, -1.2) y <- as.vector(t(beta %*% t(X))) + rnorm(n, sd = 0.2) set.seed(1) n <- 100 x1 <- rnorm(n) x2 <- rnorm(n) x3 <- rnorm(n) X <- cbind(x1, x2, x3) beta <- c(0.2, 0.6, -1.2) y <- as.vector(t(beta %*% t(X))) + rnorm(n, sd = 0.2) LS_fun <- function (beta, X, y) { return(sum((y - beta %*% t(X))^2)) } my_optim <- optim(par = c(0, 0, 0), fn = LS_fun, lower = -5, upper = 5, X = X, y = y, method = "L-BFGS-B") my_optim$par ## [1] 0.1898162 0.5885946 -1.1788264 df <- data.frame(y = y, x1 = x1, x2 = x2, x3 = x3) my_lm <- lm(y ~ x1 + x2 + x3 - 1, data = df) my_lm ## ## Call: ## lm(formula = y ~ x1 + x2 + x3 - 1, data = df) ## ## Coefficients: ## x1 x2 x3 ## 0.1898 0.5886 -1.1788 # matrix solution beta_hat <- solve(t(X) %*% X) %*% t(X) %*% y beta_hat ## [,1] ## x1 0.1898162 ## x2 0.5885946 ## x3 -1.1788264 out <- summary(my_lm) out$coefficients[ ,2] ## x1 x2 x3 ## 0.02209328 0.02087542 0.01934506 # bootstrap CI nboot <- 1000 beta_boot <- matrix(data = NA, ncol = length(beta), nrow = nboot) for (i in 1:nboot) { inds <- sample(1:n, n, replace = T) new_df <- df[inds, ] X_tmp <- as.matrix(new_df[ ,-1]) y_tmp <- new_df[ ,1] # print(nrow(new_df)) tmp_beta <- solve(t(X_tmp) %*% X_tmp) %*% t(X_tmp) %*% y_tmp beta_boot[i, ] <- tmp_beta } apply(beta_boot, 2, mean) ## [1] 0.1893281 0.5887068 -1.1800738 apply(beta_boot, 2, quantile, probs = c(0.025, 0.975)) ## [,1] [,2] [,3] ## 2.5% 0.1389441 0.5436911 -1.221560 ## 97.5% 0.2386295 0.6363102 -1.140416 out$coefficients[ ,2] ## x1 x2 x3 ## 0.02209328 0.02087542 0.01934506 Exercise 15.5 (Principal component analysis) Load the olympic data set from package ade4. The data show decathlon results for 33 men in 1988 Olympic Games. This data set serves as a great example of finding the latent structure in the data, as there are certain characteristics of the athletes that make them excel at different events. For example an explosive athlete will do particulary well in sprints and long jumps. Perform PCA (prcomp) on the data set and interpret the first 2 latent dimensions. Hint: Standardize the data first to get meaningful results. Use MLE to estimate the covariance of the standardized multivariate distribution. Decompose the estimated covariance matrix with the eigendecomposition. Compare the eigenvectors to the output of PCA. 
data(olympic) X <- olympic$tab X_scaled <- scale(X) my_pca <- prcomp(X_scaled) summary(my_pca) ## Importance of components: ## PC1 PC2 PC3 PC4 PC5 PC6 PC7 ## Standard deviation 1.8488 1.6144 0.97123 0.9370 0.74607 0.70088 0.65620 ## Proportion of Variance 0.3418 0.2606 0.09433 0.0878 0.05566 0.04912 0.04306 ## Cumulative Proportion 0.3418 0.6025 0.69679 0.7846 0.84026 0.88938 0.93244 ## PC8 PC9 PC10 ## Standard deviation 0.55389 0.51667 0.31915 ## Proportion of Variance 0.03068 0.02669 0.01019 ## Cumulative Proportion 0.96312 0.98981 1.00000 autoplot(my_pca, data = X, loadings = TRUE, loadings.colour = 'blue', loadings.label = TRUE, loadings.label.size = 3) Sigma_est <- (1 / nrow(X_scaled)) * t(X_scaled) %*% X_scaled Sigma_dec <- eigen(Sigma_est) Sigma_dec$vectors ## [,1] [,2] [,3] [,4] [,5] [,6] ## [1,] 0.4158823 0.1488081 -0.26747198 -0.08833244 -0.442314456 0.03071237 ## [2,] -0.3940515 -0.1520815 -0.16894945 -0.24424963 0.368913901 -0.09378242 ## [3,] -0.2691057 0.4835374 0.09853273 -0.10776276 -0.009754680 0.23002054 ## [4,] -0.2122818 0.0278985 -0.85498656 0.38794393 -0.001876311 0.07454380 ## [5,] 0.3558474 0.3521598 -0.18949642 0.08057457 0.146965351 -0.32692886 ## [6,] 0.4334816 0.0695682 -0.12616012 -0.38229029 -0.088802794 0.21049130 ## [7,] -0.1757923 0.5033347 0.04609969 0.02558404 0.019358607 0.61491241 ## [8,] -0.3840821 0.1495820 0.13687235 0.14396548 -0.716743474 -0.34776037 ## [9,] -0.1799436 0.3719570 -0.19232803 -0.60046566 0.095582043 -0.43744387 ## [10,] 0.1701426 0.4209653 0.22255233 0.48564231 0.339772188 -0.30032419 ## [,7] [,8] [,9] [,10] ## [1,] 0.2543985 0.663712826 -0.10839531 0.10948045 ## [2,] 0.7505343 0.141264141 0.04613910 0.05580431 ## [3,] -0.1106637 0.072505560 0.42247611 0.65073655 ## [4,] -0.1351242 -0.155435871 -0.10206505 0.11941181 ## [5,] 0.1413388 -0.146839303 0.65076229 -0.33681395 ## [6,] 0.2725296 -0.639003579 -0.20723854 0.25971800 ## [7,] 0.1439726 0.009400445 -0.16724055 -0.53450315 ## [8,] 0.2732665 -0.276873049 -0.01766443 -0.06589572 ## [9,] -0.3419099 0.058519366 -0.30619617 -0.13093187 ## [10,] 0.1868704 0.007310045 -0.45688227 0.24311846 my_pca$rotation ## PC1 PC2 PC3 PC4 PC5 PC6 ## 100 -0.4158823 0.1488081 0.26747198 -0.08833244 -0.442314456 0.03071237 ## long 0.3940515 -0.1520815 0.16894945 -0.24424963 0.368913901 -0.09378242 ## poid 0.2691057 0.4835374 -0.09853273 -0.10776276 -0.009754680 0.23002054 ## haut 0.2122818 0.0278985 0.85498656 0.38794393 -0.001876311 0.07454380 ## 400 -0.3558474 0.3521598 0.18949642 0.08057457 0.146965351 -0.32692886 ## 110 -0.4334816 0.0695682 0.12616012 -0.38229029 -0.088802794 0.21049130 ## disq 0.1757923 0.5033347 -0.04609969 0.02558404 0.019358607 0.61491241 ## perc 0.3840821 0.1495820 -0.13687235 0.14396548 -0.716743474 -0.34776037 ## jave 0.1799436 0.3719570 0.19232803 -0.60046566 0.095582043 -0.43744387 ## 1500 -0.1701426 0.4209653 -0.22255233 0.48564231 0.339772188 -0.30032419 ## PC7 PC8 PC9 PC10 ## 100 0.2543985 -0.663712826 0.10839531 -0.10948045 ## long 0.7505343 -0.141264141 -0.04613910 -0.05580431 ## poid -0.1106637 -0.072505560 -0.42247611 -0.65073655 ## haut -0.1351242 0.155435871 0.10206505 -0.11941181 ## 400 0.1413388 0.146839303 -0.65076229 0.33681395 ## 110 0.2725296 0.639003579 0.20723854 -0.25971800 ## disq 0.1439726 -0.009400445 0.16724055 0.53450315 ## perc 0.2732665 0.276873049 0.01766443 0.06589572 ## jave -0.3419099 -0.058519366 0.30619617 0.13093187 ## 1500 0.1868704 -0.007310045 0.45688227 -0.24311846 15.2 Fisher information Exercise 15.6 Let us assume a Poisson likelihood. 
Derive the MLE estimate of the mean. Derive the Fisher information. For the data below compute the MLE and construct confidence intervals. Use bootstrap to construct the CI for the mean. Compare with c) and discuss. x <- c(2, 5, 3, 1, 2, 1, 0, 3, 0, 2) Solution. The log likelihood of the Poisson is \\[\\begin{align*} l(\\lambda; x) = \\sum_{i=1}^n x_i \\ln \\lambda - n \\lambda - \\sum_{i=1}^n \\ln x_i! \\end{align*}\\] Taking the derivative and equating with 0 we get \\[\\begin{align*} \\frac{1}{\\hat{\\lambda}}\\sum_{i=1}^n x_i - n &= 0 \\\\ \\hat{\\lambda} &= \\frac{1}{n} \\sum_{i=1}^n x_i. \\end{align*}\\] Since \\(\\lambda\\) is the mean parameter, this was expected. For the Fischer information, we first need the second derivative, which is \\[\\begin{align*} - \\lambda^{-2} \\sum_{i=1}^n x_i. \\\\ \\end{align*}\\] Now taking the expectation of the negative of the above, we get \\[\\begin{align*} E[\\lambda^{-2} \\sum_{i=1}^n x_i] &= \\lambda^{-2} E[\\sum_{i=1}^n x_i] \\\\ &= \\lambda^{-2} n \\lambda \\\\ &= \\frac{n}{\\lambda}. \\end{align*}\\] set.seed(1) x <- c(2, 5, 3, 1, 2, 1, 0, 3, 0, 2) lambda_hat <- mean(x) finfo <- length(x) / lambda_hat mle_CI <- c(lambda_hat - 1.96 * sqrt(1 / finfo), lambda_hat + 1.96 * sqrt(1 / finfo)) boot_lambda <- c() nboot <- 1000 for (i in 1:nboot) { tmp_x <- sample(x, length(x), replace = T) boot_lambda[i] <- mean(tmp_x) } boot_CI <- c(quantile(boot_lambda, 0.025), quantile(boot_lambda, 0.975)) mle_CI ## [1] 1.045656 2.754344 boot_CI ## 2.5% 97.5% ## 1.0 2.7 Exercise 15.7 Find the Fisher information matrix for the Gamma distribution. Generate 20 samples from a Gamma distribution and plot a confidence ellipse of the inverse of Fisher information matrix around the ML estimates of the parameters. Also plot the theoretical values. Repeat the sampling several times. What do you observe? Discuss what a non-diagonal Fisher matrix implies. Hint: The digamma function is defined as \\(\\psi(x) = \\frac{\\frac{d}{dx} \\Gamma(x)}{\\Gamma(x)}\\). Additionally, you do not need to evaluate \\(\\frac{d}{dx} \\psi(x)\\). To calculate its value in R, use package numDeriv. Solution. The log likelihood of the Gamma is \\[\\begin{equation*} l(\\alpha, \\beta; x) = n \\alpha \\ln \\beta - n \\ln \\Gamma(\\alpha) + (\\alpha - 1) \\sum_{i=1}^n \\ln x_i - \\beta \\sum_{i=1}^n x_i. \\end{equation*}\\] Let us calculate the derivatives. \\[\\begin{align*} \\frac{\\partial}{\\partial \\alpha} l(\\alpha, \\beta; x) &= n \\ln \\beta - n \\psi(\\alpha) + \\sum_{i=1}^n \\ln x_i, \\\\ \\frac{\\partial}{\\partial \\beta} l(\\alpha, \\beta; x) &= \\frac{n \\alpha}{\\beta} - \\sum_{i=1}^n x_i, \\\\ \\frac{\\partial^2}{\\partial \\alpha \\beta} l(\\alpha, \\beta; x) &= \\frac{n}{\\beta}, \\\\ \\frac{\\partial^2}{\\partial \\alpha^2} l(\\alpha, \\beta; x) &= - n \\frac{\\partial}{\\partial \\alpha} \\psi(\\alpha), \\\\ \\frac{\\partial^2}{\\partial \\beta^2} l(\\alpha, \\beta; x) &= - \\frac{n \\alpha}{\\beta^2}. \\end{align*}\\] The Fisher information matrix is then \\[\\begin{align*} I(\\alpha, \\beta) = - E[ \\begin{bmatrix} - n \\psi'(\\alpha) & \\frac{n}{\\beta} \\\\ \\frac{n}{\\beta} & - \\frac{n \\alpha}{\\beta^2} \\end{bmatrix} ] = \\begin{bmatrix} n \\psi'(\\alpha) & - \\frac{n}{\\beta} \\\\ - \\frac{n}{\\beta} & \\frac{n \\alpha}{\\beta^2} \\end{bmatrix} \\end{align*}\\] A non-diagonal Fisher matrix implies that the parameter estimates are linearly dependent. 
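To make the last point concrete, here is a small illustrative sketch (not part of the original solution). Asymptotically, the inverse of the Fisher information is the covariance matrix of the ML estimates, so a non-zero off-diagonal entry means \\(\\hat{\\alpha}\\) and \\(\\hat{\\beta}\\) are correlated. The sketch uses the theoretical values \\(\\alpha = 5\\), \\(\\beta = 2\\) and \\(n = 20\\) from the code below, and base R's trigamma() for \\(\\psi'(\\alpha)\\) instead of the numerical derivative from numDeriv.
n <- 20; alpha <- 5; beta <- 2
# Fisher information from the derivation above (trigamma is the derivative of digamma)
I_mat <- matrix(c(n * trigamma(alpha), -n / beta,
                  -n / beta,           n * alpha / beta^2), nrow = 2)
V <- solve(I_mat)                   # asymptotic covariance of (alpha_hat, beta_hat)
V[1, 2] / sqrt(V[1, 1] * V[2, 2])   # correlation between the estimates, clearly non-zero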
set.seed(1) n <- 20 pars_theor <- c(5, 2) x <- rgamma(n, 5, 2) # MLE for alpha and beta log_lik <- function (pars, x) { n <- length(x) return (- (n * pars[1] * log(pars[2]) - n * log(gamma(pars[1])) + (pars[1] - 1) * sum(log(x)) - pars[2] * sum(x))) } my_optim <- optim(par = c(1,1), fn = log_lik, method = "L-BFGS-B", lower = c(0.001, 0.001), upper = c(8, 8), x = x) pars_mle <- my_optim$par fish_mat <- matrix(data = NA, nrow = 2, ncol = 2) fish_mat[1,2] <- - n / pars_mle[2] fish_mat[2,1] <- - n / pars_mle[2] fish_mat[2,2] <- (n * pars_mle[1]) / (pars_mle[2]^2) fish_mat[1,1] <- n * grad(digamma, pars_mle[1]) fish_mat_inv <- solve(fish_mat) est_ellip <- ellipse(pars_mle, fish_mat_inv, draw = FALSE) colnames(est_ellip) <- c("X1", "X2") est_ellip <- as.data.frame(est_ellip) ggplot() + geom_point(data = data.frame(x = pars_mle[1], y = pars_mle[2]), aes(x = x, y = y)) + geom_path(data = est_ellip, aes(x = X1, y = X2)) + geom_point(aes(x = pars_theor[1], y = pars_theor[2]), color = "red") + geom_text(aes(x = pars_theor[1], y = pars_theor[2], label = "Theoretical parameters"), color = "red", nudge_y = -0.2) 15.3 The German tank problem Exercise 15.8 (The German tank problem) During WWII the allied intelligence were faced with an important problem of estimating the total production of certain German tanks, such as the Panther. What turned out to be a successful approach was to estimate the maximum from the serial numbers of the small sample of captured or destroyed tanks (describe the statistical model used). What assumptions were made by using the above model? Do you think they are reasonable assumptions in practice? Show that the plug-in estimate for the maximum (i.e. the maximum of the sample) is a biased estimator. Derive the maximum likelihood estimate of the maximum. Check that the following estimator is not biased: \\(\\hat{n} = \\frac{k + 1}{k}m - 1\\). Solution. The data are the serial numbers of the tanks. The parameter is \\(n\\), the total production of the tank. The distribution of the serial numbers is a discrete uniform distribution over all serial numbers. One of the assumptions is that we have i.i.d samples, however in practice this might not be true, as some tanks produced later could be sent to the field later, therefore already in theory we would not be able to recover some values from the population. To find the expected value we first need to find the distribution of \\(m\\). Let us start with the CDF. \\[\\begin{align*} F_m(x) = P(Y_1 < x,...,Y_k < x). \\end{align*}\\] If \\(x < k\\) then \\(F_m(x) = 0\\) and if \\(x \\geq 1\\) then \\(F_m(x) = 1\\). What about between those values. So the probability that the maximum value is less than or equal to \\(m\\) is just the number of possible draws from \\(Y\\) that are all smaller than \\(m\\), divided by all possible draws. This is \\(\\frac{{x}\\choose{k}}{{n}\\choose{k}}\\). The PDF on the suitable bounds is then \\[\\begin{align*} P(m = x) = F_m(x) - F_m(x - 1) = \\frac{\\binom{x}{k} - \\binom{x - 1}{k}}{\\binom{n}{k}} = \\frac{\\binom{x - 1}{k - 1}}{\\binom{n}{k}}. \\end{align*}\\] Now we can calculate the expected value of \\(m\\) using some combinatorial identities. \\[\\begin{align*} E[m] &= \\sum_{i = k}^n i \\frac{{i - 1}\\choose{k - 1}}{{n}\\choose{k}} \\\\ &= \\sum_{i = k}^n i \\frac{\\frac{(i - 1)!}{(k - 1)!(i - k)!}}{{n}\\choose{k}} \\\\ &= \\frac{k}{\\binom{n}{k}}\\sum_{i = k}^n \\binom{i}{k} \\\\ &= \\frac{k}{\\binom{n}{k}} \\binom{n + 1}{k + 1} \\\\ &= \\frac{k(n + 1)}{k + 1}. 
\\end{align*}\\] The bias of this estimator is then \\[\\begin{align*} E[m] - n = \\frac{k(n + 1)}{k + 1} - n = \\frac{k - n}{k + 1}. \\end{align*}\\] The probability that we observed our sample \\(Y = {Y_1, Y_2,...,,Y_k}\\) given \\(n\\) is \\(\\frac{1}{{n}\\choose{k}}\\). We need to find such \\(n^*\\) that this function is maximized. Additionally, we have a constraint that \\(n^* \\geq m = \\max{(Y)}\\). Let us plot this function for \\(m = 10\\) and \\(k = 4\\). library(ggplot2) my_fun <- function (x, m, k) { tmp <- 1 / (choose(x, k)) tmp[x < m] <- 0 return (tmp) } x <- 1:20 y <- my_fun(x, 10, 4) df <- data.frame(x = x, y = y) ggplot(data = df, aes(x = x, y = y)) + geom_line() ::: {.solution} (continued) We observe that the maximum of this function lies at the maximum value of the sample. Therefore \\(n^* = m\\) and ML estimate equals the plug-in estimate. \\[\\begin{align*} E[\\hat{n}] &= \\frac{k + 1}{k} E[m] - 1 \\\\ &= \\frac{k + 1}{k} \\frac{k(n + 1)}{k + 1} - 1 \\\\ &= n. \\end{align*}\\] ::: "],["nhst.html", "Chapter 16 Null hypothesis significance testing", " Chapter 16 Null hypothesis significance testing This chapter deals with null hypothesis significance testing. The students are expected to acquire the following knowledge: Binomial test. t-test. Chi-squared test. .fold-btn { float: right; margin: 5px 5px 0 0; } .fold { border: 1px solid black; min-height: 40px; } Exercise 16.1 (Binomial test) We assume \\(y_i \\in \\{0,1\\}\\), \\(i = 1,...,n\\) and \\(y_i | \\theta = 0.5 \\sim i.i.d.\\) Bernoulli\\((\\theta)\\). The test statistic is \\(X = \\sum_{i=1}^n\\) and the rejection region R is defined as the region where the probability of obtaining such or more extreme \\(X\\) given \\(\\theta = 0.5\\) is less than 0.05. Derive and plot the power function of the test for \\(n=100\\). What is the significance level of this test if \\(H0: \\theta = 0.5\\)? At which values of X will we reject the null hypothesis? # a # First we need the rejection region, so we need to find X_min and X_max n <- 100 qbinom(0.025, n, 0.5) ## [1] 40 qbinom(0.975, n, 0.5) ## [1] 60 pbinom(40, n, 0.5) ## [1] 0.02844397 pbinom(60, n, 0.5) ## [1] 0.9823999 X_min <- 39 X_max <- 60 thetas <- seq(0, 1, by = 0.01) beta_t <- 1 - pbinom(X_max, size = n, prob = thetas) + pbinom(X_min, size = n, prob = thetas) plot(beta_t) # b # The significance level is beta_t[51] ## [1] 0.0352002 # We will reject the null hypothesis at X values below X_min and above X_max. Exercise 16.2 (Long-run guarantees of the t-test) Generate a sample of size \\(n = 10\\) from the standard normal. Use the two-sided t-test with \\(H0: \\mu = 0\\) and record the p-value. Can you reject H0 at 0.05 significance level? (before simulating) If we repeated (b) many times, what would be the relative frequency of false positives/Type I errors (rejecting the null that is true)? What would be the relative frequency of false negatives /Type II errors (retaining the null when the null is false)? (now simulate b and check if the simulation results match your answer in b) Similar to (a-c) but now we generate data from N(-0.5, 1). Similar to (a-c) but now we generate data from N(\\(\\mu\\), 1) where we every time pick a different \\(\\mu < 0\\) and use a one-sided test \\(H0: \\mu <= 0\\). 
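As a supplementary analytic check for part (d) (a sketch, not part of the original solution): base R's power.t.test gives the power of a one-sample t-test directly, so the long-run Type II error rate when the data come from N(-0.5, 1) with \\(n = 10\\) can be read off without simulation.
pw <- power.t.test(n = 10, delta = 0.5, sd = 1, sig.level = 0.05,
                   type = "one.sample", alternative = "two.sided")
1 - pw$power   # Type II error rate, roughly 0.7, in line with the simulation below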
set.seed(2) # a x <- rnorm(10) my_test <- t.test(x, alternative = "two.sided", mu = 0) my_test ## ## One Sample t-test ## ## data: x ## t = 0.6779, df = 9, p-value = 0.5149 ## alternative hypothesis: true mean is not equal to 0 ## 95 percent confidence interval: ## -0.4934661 0.9157694 ## sample estimates: ## mean of x ## 0.2111516 # we can not reject the null hypothesis # b # The expected value of false positives would be 0.05. The expected value of # true negatives would be 0, as there are no negatives (the null hypothesis is # always the truth). nit <- 1000 typeIerr <- vector(mode = "logical", length = nit) typeIIerr <- vector(mode = "logical", length = nit) for (i in 1:nit) { x <- rnorm(10) my_test <- t.test(x, alternative = "two.sided", mu = 0) if (my_test$p.value < 0.05) { typeIerr[i] <- T } else { typeIerr[i] <- F } } mean(typeIerr) ## [1] 0.052 sd(typeIerr) / sqrt(nit) ## [1] 0.007024624 # d # We can not estimate the percentage of true negatives, but it will probably be # higher than 0.05. There will be no false positives as the null hypothesis is # always false. typeIIerr <- vector(mode = "logical", length = nit) for (i in 1:nit) { x <- rnorm(10, -0.5) my_test <- t.test(x, alternative = "two.sided", mu = 0) if (my_test$p.value < 0.05) { typeIIerr[i] <- F } else { typeIIerr[i] <- T } } mean(typeIIerr) ## [1] 0.719 sd(typeIIerr) / sqrt(nit) ## [1] 0.01422115 # e # The expected value of false positives would be lower than 0.05. The expected # value of true negatives would be 0, as there are no negatives (the null # hypothesis is always the truth). typeIerr <- vector(mode = "logical", length = nit) for (i in 1:nit) { u <- runif(1, -1, 0) x <- rnorm(10, u) my_test <- t.test(x, alternative = "greater", mu = 0) if (my_test$p.value < 0.05) { typeIerr[i] <- T } else { typeIerr[i] <- F } } mean(typeIerr) ## [1] 0.012 sd(typeIerr) / sqrt(nit) ## [1] 0.003444977 Exercise 16.3 (T-test, confidence intervals, and bootstrap) Sample \\(n=20\\) from a standard normal distribution and calculate the p-value using t-test, confidence intervals based on normal distribution, and bootstrap. Repeat this several times and check how many times we rejected the null hypothesis (made a type I error). Hint: For the confidence intervals you can use function CI from the Rmisc package. set.seed(1) library(Rmisc) nit <- 1000 n_boot <- 100 t_logic <- rep(F, nit) boot_logic <- rep(F, nit) norm_logic <- rep(F, nit) for (i in 1:nit) { x <- rnorm(20) my_test <- t.test(x) my_CI <- CI(x) if (my_test$p.value <= 0.05) t_logic[i] <- T boot_tmp <- vector(mode = "numeric", length = n_boot) for (j in 1:n_boot) { tmp_samp <- sample(x, size = 20, replace = T) boot_tmp[j] <- mean(tmp_samp) } if ((quantile(boot_tmp, 0.025) >= 0) | (quantile(boot_tmp, 0.975) <= 0)) { boot_logic[i] <- T } if ((my_CI[3] >= 0) | (my_CI[1] <= 0)) { norm_logic[i] <- T } } mean(t_logic) ## [1] 0.053 sd(t_logic) / sqrt(nit) ## [1] 0.007088106 mean(boot_logic) ## [1] 0.093 sd(boot_logic) / sqrt(nit) ## [1] 0.009188876 mean(norm_logic) ## [1] 0.053 sd(norm_logic) / sqrt(nit) ## [1] 0.007088106 Exercise 16.4 (Chi-squared test) Show that the \\(\\chi^2 = \\sum_{i=1}^k \\frac{(O_i - E_i)^2}{E_i}\\) test statistic is approximately \\(\\chi^2\\) distributed when we have two categories. Let us look at the US voting data here. Compare the number of voters who voted for Trump or Hillary depending on their income (less or more than 100.000 dollars per year). Manually calculate the chi-squared statistic, compare to the chisq.test in R, and discuss the results. 
Visualize the test. Solution. Let \\(X_i\\) be binary variables, \\(i = 1,...,n\\). We can then express the test statistic as \\[\\begin{align} \\chi^2 = &\\frac{(O_i - np)^2}{np} + \\frac{(n - O_i - n(1 - p))^2}{n(1 - p)} \\\\ &= \\frac{(O_i - np)^2}{np(1 - p)} \\\\ &= (\\frac{O_i - np}{\\sqrt{np(1 - p)}})^2. \\end{align}\\] When \\(n\\) is large, this distrbution is approximately normal with \\(\\mu = np\\) and \\(\\sigma^2 = np(1 - p)\\) (binomial converges in distribution to standard normal). By definition, the chi-squared distribution with \\(k\\) degrees of freedom is a sum of squares of \\(k\\) independent standard normal random variables. n <- 24588 less100 <- round(0.66 * n * c(0.49, 0.45, 0.06)) # some rounding, but it should not affect results more100 <- round(0.34 * n * c(0.47, 0.47, 0.06)) x <- rbind(less100, more100) colnames(x) <- c("Clinton", "Trump", "other/no answer") print(x) ## Clinton Trump other/no answer ## less100 7952 7303 974 ## more100 3929 3929 502 chisq.test(x) ## ## Pearson's Chi-squared test ## ## data: x ## X-squared = 9.3945, df = 2, p-value = 0.00912 x ## Clinton Trump other/no answer ## less100 7952 7303 974 ## more100 3929 3929 502 csum <- apply(x, 2, sum) rsum <- apply(x, 1, sum) chi2 <- (x[1,1] - csum[1] * rsum[1] / sum(x))^2 / (csum[1] * rsum[1] / sum(x)) + (x[1,2] - csum[2] * rsum[1] / sum(x))^2 / (csum[2] * rsum[1] / sum(x)) + (x[1,3] - csum[3] * rsum[1] / sum(x))^2 / (csum[3] * rsum[1] / sum(x)) + (x[2,1] - csum[1] * rsum[2] / sum(x))^2 / (csum[1] * rsum[2] / sum(x)) + (x[2,2] - csum[2] * rsum[2] / sum(x))^2 / (csum[2] * rsum[2] / sum(x)) + (x[2,3] - csum[3] * rsum[2] / sum(x))^2 / (csum[3] * rsum[2] / sum(x)) chi2 ## Clinton ## 9.394536 1 - pchisq(chi2, df = 2) ## Clinton ## 0.009120161 x <- seq(0, 15, by = 0.01) df <- data.frame(x = x) ggplot(data = df, aes(x = x)) + stat_function(fun = dchisq, args = list(df = 2)) + geom_segment(aes(x = chi2, y = 0, xend = chi2, yend = dchisq(chi2, df = 2))) + stat_function(fun = dchisq, args = list(df = 2), xlim = c(chi2, 15), geom = "area", fill = "red") "],["bi.html", "Chapter 17 Bayesian inference 17.1 Conjugate priors 17.2 Posterior sampling", " Chapter 17 Bayesian inference This chapter deals with Bayesian inference. The students are expected to acquire the following knowledge: How to set prior distribution. Compute posterior distribution. Compute posterior predictive distribution. Use sampling for inference. .fold-btn { float: right; margin: 5px 5px 0 0; } .fold { border: 1px solid black; min-height: 40px; } 17.1 Conjugate priors Exercise 17.1 (Poisson-gamma model) Let us assume a Poisson likelihood and a gamma prior on the Poisson mean parameter (this is a conjugate prior). Derive posterior Below we have some data, which represents number of goals in a football match. Choose sensible prior for this data (draw the gamma density if necessary), justify it. Compute the posterior. Compute an interval such that the probability that the true mean is in there is 95%. What is the probability that the true mean is greater than 2.5? Back to theory: Compute prior predictive and posterior predictive. Discuss why the posterior predictive is overdispersed and not Poisson? Draw a histogram of the prior predictive and posterior predictive for the data from (b). Discuss. Generate 10 and 100 random samples from a Poisson distribution and compare the posteriors with a flat prior, and a prior concentrated away from the truth. x <- c(3, 2, 1, 1, 5, 4, 0, 0, 4, 3) Solution. 
\\[\\begin{align*} p(\\lambda | X) &= \\frac{p(X | \\lambda) p(\\lambda)}{\\int_0^\\infty p(X | \\lambda) p(\\lambda) d\\lambda} \\\\ &\\propto p(X | \\lambda) p(\\lambda) \\\\ &= \\Big(\\prod_{i=1}^n \\frac{1}{x_i!} \\lambda^{x_i} e^{-\\lambda}\\Big) \\frac{\\beta^\\alpha}{\\Gamma(\\alpha)} \\lambda^{\\alpha - 1} e^{-\\beta \\lambda} \\\\ &\\propto \\lambda^{\\sum_{i=1}^n x_i + \\alpha - 1} e^{- \\lambda (n + \\beta)} \\\\ \\end{align*}\\] We recognize this as the shape of a gamma distribution, therefore \\[\\begin{align*} \\lambda | X \\sim \\text{gamma}(\\alpha + \\sum_{i=1}^n x_i, \\beta + n) \\end{align*}\\] For the prior predictive, we have \\[\\begin{align*} p(x^*) &= \\int_0^\\infty p(x^*, \\lambda) d\\lambda \\\\ &= \\int_0^\\infty p(x^* | \\lambda) p(\\lambda) d\\lambda \\\\ &= \\int_0^\\infty \\frac{1}{x^*!} \\lambda^{x^*} e^{-\\lambda} \\frac{\\beta^\\alpha}{\\Gamma(\\alpha)} \\lambda^{\\alpha - 1} e^{-\\beta \\lambda} d\\lambda \\\\ &= \\frac{\\beta^\\alpha}{\\Gamma(x^* + 1)\\Gamma(\\alpha)} \\int_0^\\infty \\lambda^{x^* + \\alpha - 1} e^{-\\lambda (1 + \\beta)} d\\lambda \\\\ &= \\frac{\\beta^\\alpha}{\\Gamma(x^* + 1)\\Gamma(\\alpha)} \\frac{\\Gamma(x^* + \\alpha)}{(1 + \\beta)^{x^* + \\alpha}} \\int_0^\\infty \\frac{(1 + \\beta)^{x^* + \\alpha}}{\\Gamma(x^* + \\alpha)} \\lambda^{x^* + \\alpha - 1} e^{-\\lambda (1 + \\beta)} d\\lambda \\\\ &= \\frac{\\beta^\\alpha}{\\Gamma(x^* + 1)\\Gamma(\\alpha)} \\frac{\\Gamma(x^* + \\alpha)}{(1 + \\beta)^{x^* + \\alpha}} \\\\ &= \\frac{\\Gamma(x^* + \\alpha)}{\\Gamma(x^* + 1)\\Gamma(\\alpha)} (\\frac{\\beta}{1 + \\beta})^\\alpha (\\frac{1}{1 + \\beta})^{x^*}, \\end{align*}\\] which we recognize as the negative binomial distribution with \\(r = \\alpha\\) and \\(p = \\frac{1}{\\beta + 1}\\). For the posterior predictive, the calculation is the same, only now the parameters are \\(r = \\alpha + \\sum_{i=1}^n x_i\\) and \\(p = \\frac{1}{\\beta + n + 1}\\). There are two sources of uncertainty in the predictive distribution. First is the uncertainty about the population. Second is the variability in sampling from the population. When \\(n\\) is large, the latter is going to be very small. But when \\(n\\) is small, the latter is going to be higher, resulting in an overdispersed predictive distribution. 
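The overdispersion argument can be checked numerically. The sketch below (not part of the original solution; the seed and the number of draws are arbitrary) compares direct draws from the derived negative binomial posterior predictive with two-stage sampling, where \\(\\lambda\\) is first drawn from the gamma posterior and a new observation is then drawn from a Poisson with that mean. It uses the same gamma(1, 1) prior as part (b) of the solution code below.
set.seed(42)
x     <- c(3, 2, 1, 1, 5, 4, 0, 0, 4, 3)
n     <- length(x)
alpha <- 1; beta <- 1                                      # prior parameters, as in the solution below
lambda_draws <- rgamma(10000, alpha + sum(x), beta + n)    # draws from the gamma posterior
two_stage    <- rpois(10000, lambda_draws)                 # posterior predictive via two-stage sampling
direct       <- rnbinom(10000, size = alpha + sum(x), prob = 1 - 1 / (beta + n + 1))
c(mean(two_stage), mean(direct))                           # the two approaches should roughly agree
c(var(two_stage), var(direct))                             # variance exceeds the mean: overdispersed relative to a Poisson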
x <- c(3, 2, 1, 1, 5, 4, 0, 0, 4, 3) # b # quick visual check of the prior ggplot(data = data.frame(x = seq(0, 5, by = 0.01)), aes(x = x)) + stat_function(fun = dgamma, args = list(shape = 1, rate = 1)) palpha <- 1 pbeta <- 1 alpha_post <- palpha + sum(x) beta_post <- pbeta + length(x) ggplot(data = data.frame(x = seq(0, 5, by = 0.01)), aes(x = x)) + stat_function(fun = dgamma, args = list(shape = alpha_post, rate = beta_post)) # probability of being higher than 2.5 1 - pgamma(2.5, alpha_post, beta_post) ## [1] 0.2267148 # interval qgamma(c(0.025, 0.975), alpha_post, beta_post) ## [1] 1.397932 3.137390 # d prior_pred <- rnbinom(1000, size = palpha, prob = 1 - 1 / (pbeta + 1)) post_pred <- rnbinom(1000, size = palpha + sum(x), prob = 1 - 1 / (pbeta + 10 + 1)) df <- data.frame(prior = prior_pred, posterior = post_pred) df <- gather(df) ggplot(df, aes(x = value, fill = key)) + geom_histogram(position = "dodge") # e set.seed(1) x1 <- rpois(10, 2.5) x2 <- rpois(100, 2.5) alpha_flat <- 1 beta_flat <- 0.1 alpha_conc <- 50 beta_conc <- 10 n <- 10000 df_flat <- data.frame(x1 = rgamma(n, alpha_flat + sum(x1), beta_flat + 10), x2 = rgamma(n, alpha_flat + sum(x2), beta_flat + 100), type = "flat") df_flat <- tidyr::gather(df_flat, key = "key", value = "value", - type) df_conc <- data.frame(x1 = rgamma(n, alpha_conc + sum(x1), beta_conc + 10), x2 = rgamma(n, alpha_conc + sum(x2), beta_conc + 100), type = "conc") df_conc <- tidyr::gather(df_conc, key = "key", value = "value", - type) df <- rbind(df_flat, df_conc) ggplot(data = df, aes(x = value, color = type)) + facet_wrap(~ key) + geom_density() 17.2 Posterior sampling Exercise 17.2 (Bayesian logistic regression) In Chapter 15 we implemented a MLE for logistic regression (see the code below). For this model, conjugate priors do not exist, which complicates the calculation of the posterior. However, we can use sampling from the numerator of the posterior, using rejection sampling. Set a sensible prior distribution on \\(\\beta\\) and use rejection sampling to find the posterior distribution. In a) you will get a distribution of parameter \\(\\beta\\). Plot the probabilities (as in exercise 15.3) for each sample of \\(\\beta\\) and compare to the truth. Hint: We can use rejection sampling even for functions which are not PDFs – they do not have to sum/integrate to 1. We just need to use a suitable envelope that we know how to sample from. For example, here we could use a uniform distribution and scale it suitably. set.seed(1) inv_log <- function (z) { return (1 / (1 + exp(-z))) } x <- rnorm(100) y <- x y <- rbinom(100, size = 1, prob = inv_log(1.2 * x)) l_logistic <- function (beta, X, y) { logl <- -sum(y * log(inv_log(as.vector(beta %*% X))) + (1 - y) * log((1 - inv_log(as.vector(beta %*% X))))) return(logl) } my_optim <- optim(par = 0.5, fn = l_logistic, method = "L-BFGS-B", lower = 0, upper = 10, X = x, y = y) my_optim$par # Let's say we believe that the mean of beta is 0.5. Since we are not very sure # about this, we will give it a relatively high variance. So a normal prior with # mean 0.5 and standard deviation 5. But there is no right solution to this, # this is basically us expressing our prior belief in the parameter values. 
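A side note before the full solution (an illustrative sketch, not the approach taken below): rather than finding the envelope scaling constant by trial and error, we can numerically maximize the unnormalized posterior, which is unimodal here, and scale a uniform envelope by that maximum; inv_log, x and y are as defined above.
post_unnorm <- function (beta) {
  prod(inv_log(beta * x)^y * (1 - inv_log(beta * x))^(1 - y)) * dnorm(beta, 0.5, 5)
}
opt <- optimize(post_unnorm, interval = c(-5, 5), maximum = TRUE)
M   <- opt$objective                # maximum of the unnormalized posterior on [-5, 5]
# a Uniform(-5, 5) envelope has density 0.1, so scaling by M / 0.1 makes it dominate post_unnorm
envelope_const <- M / dunif(0, -5, 5)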
set.seed(1) inv_log <- function (z) { return (1 / (1 + exp(-z))) } x <- rnorm(100) y <- x y <- rbinom(100, size = 1, prob = inv_log(1.2 * x)) l_logistic <- function (beta, X, y) { logl <- -sum(y * log(inv_log(as.vector(beta %*% X))) + (1 - y) * log((1 - inv_log(as.vector(beta %*% X))))) if (is.nan(logl)) logl <- Inf return(logl) } my_optim <- optim(par = 0.5, fn = l_logistic, method = "L-BFGS-B", lower = 0, upper = 10, X = x, y = y) my_optim$par ## [1] 1.166558 f_logistic <- function (beta, X, y) { logl <- prod(inv_log(as.vector(beta %*% X))^y * (1 - inv_log(as.vector(beta %*% X)))^(1 - y)) return(logl) } a <- seq(0, 3, by = 0.01) my_l <- c() for (i in a) { my_l <- c(my_l, f_logistic(i, x, y) * dnorm(i, 0.5, 5)) } plot(my_l) envlp <- 10^(-25.8) * dunif(a, -5, 5) # found by trial and error tmp <- data.frame(envel = envlp, l = my_l, t = a) tmp <- gather(tmp, key = "key", value = "value", - t) ggplot(tmp, aes(x = t, y = value, color = key)) + geom_line() # envelope OK set.seed(1) nsamps <- 1000 samps <- c() for (i in 1:nsamps) { tmp <- runif(1, -5, 5) u <- runif(1, 0, 1) if (u < (f_logistic(tmp, x, y) * dnorm(tmp, 0.5, 5)) / (10^(-25.8) * dunif(tmp, -5, 5))) { samps <- c(samps, tmp) } } plot(density(samps)) mean(samps) ## [1] 1.211578 median(samps) ## [1] 1.204279 truth_p <- data.frame(x = x, prob = inv_log(1.2 * x), type = "truth") preds <- inv_log(x %*% t(samps)) preds <- gather(cbind(as.data.frame(preds), x = x), key = "key", "value" = value, - x) ggplot(preds, aes(x = x, y = value)) + geom_line(aes(group = key), color = "gray", alpha = 0.7) + geom_point(data = truth_p, aes(y = prob), color = "red", alpha = 0.7) + theme_bw() "],["distributions-intutition.html", "Chapter 18 Distributions intutition 18.1 Discrete distributions 18.2 Continuous distributions", " Chapter 18 Distributions intutition This chapter is intended to help you familiarize yourself with the different probability distributions you will encounter in this course. Use Appendix B as a reference for the basic properties of distributions. .fold-btn { float: right; margin: 5px 5px 0 0; } .fold { border: 1px solid black; min-height: 40px; } 18.1 Discrete distributions Exercise 18.1 (Bernoulli intuition 1) Arguably the simplest distribution you will enocounter is the Bernoulli distribution. It is a discrete probability distribution used to represent the outcome of a yes/no question. It has one parameter \\(p\\) which is the probability of success. The probability of failure is \\(1-p\\), sometimes denoted as \\(q\\). A classic way to think about a Bernoulli trial (a yes/no experiment) is a coin flip. Real coins are fair, meaning the probability of either heads (1) or tails (0) are the same i.e. \\(p=0.5\\), shown below in figure a. Alternatively we may want to represent a process that doesn’t have equal probabilities of outcomes like “Will a throw of a fair die result in a 6?”. In this case \\(p=\\frac{1}{6}\\), shown in figure b. Using your knowledge of the Bernoulli distribution use the throw of a fair die to think of events, such that: \\(p = 0.5\\) \\(p = \\frac{5}{6}\\) \\(q = \\frac{2}{3}\\) Solution. An event that is equally likely to happen or not happen i.e. \\(p = 0.5\\) would be throwing an even number. More formally we can name this event \\(A\\) and write: \\(A = \\{2,4,6\\}\\), its probability being \\(P(A) = 0.5\\) An example of an event with \\(p = \\frac{5}{6}\\) would be throwing a number greater than 1. Defined as \\(B = \\{2,3,4,5,6\\}\\). We need an event that fails \\(\\frac{2}{3}\\) of the time. 
Alternatively we can reverse the problem and find an event that succeeds \\(\\frac{1}{3}\\) of the time, since: \\(q = 1 - p \\implies p = 1 - q = \\frac{1}{3}\\). The event that our outcome is divisible by 3: \\(C = \\{3, 6\\}\\) satisfies this condition. Exercise 18.2 (Binomial intuition 1) The binomial distribution is a generalization of the Bernoulli distribution. Instead of considering a single Bernoulli trial, we now consider a sequence of \\(n\\) trials, which are independent and have the same parameter \\(p\\). So the binomial distribution has two parameters \\(n\\) - the number of trials and \\(p\\) - the probability of success for each trial. If we return to our coin flip representation, we now flip a coin several times. The binomial distribution will give us the probabilities of all possible outcomes. Below we show the distribution for a series of 10 coin flips with a fair coin (left) and a biased coin (right). The numbers on the x axis represent the number of times the coin landed heads. Using your knowledge of the binomial distribution: Take the pmf of the binomial distribution and plug in \\(n=1\\), check that it is in fact equivalent to a Bernoulli distribution. In our examples we show the graph of a binomial distribution over 10 trials with \\(p=0.8\\). If we take a look at the graph, it appears as though the probabilities of getting 0,1,2 or 3 heads in 10 flips are zero. Is it actually zero? Check by plugging in the values into the pmf. Solution. The pmf of a binomial distribution is \\(\\binom{n}{k} p^k (1 - p)^{n - k}\\), now we insert \\(n=1\\) to get: \\[\\binom{1}{k} p^k (1 - p)^{1 - k}\\]. Not quite equivalent to a Bernoulli, however note that the support of the binomial distribution is defined as \\(k \\in \\{0,1,\\dots,n\\}\\), so in our case \\(k = \\{0,1\\}\\), then: \\[\\binom{1}{0} = \\binom{1}{1} = 1\\] we get: \\(p^k (1 - p)^{1 - k}\\) ,the Bernoulli distribution. As we already know \\(p=0.8, n=10\\), so: \\[\\binom{10}{0} 0.8^0 (1 - 0.8)^{10 - 0} = 1.024 \\cdot 10^{-7}\\] \\[\\binom{10}{1} 0.8^1 (1 - 0.8)^{10 - 1} = 4.096 \\cdot 10^{-6}\\] \\[\\binom{10}{2} 0.8^2 (1 - 0.8)^{10 - 2} = 7.3728 \\cdot 10^{-5}\\] \\[\\binom{10}{3} 0.8^3 (1 - 0.8)^{10 - 3} = 7.86432\\cdot 10^{-4}\\] So the probabilities are not zero, just very small. Exercise 18.3 (Poisson intuition 1) Below are shown 3 different graphs of the Poisson distribution. Your task is to replicate them on your own in R by varying the \\(\\lambda\\) parameter. Hint: You can use dpois() to get the probabilities. library(ggplot2) library(gridExtra) x = 0:15 # Create Poisson data data1 <- data.frame(x = x, y = dpois(x, lambda = 0.1)) data2 <- data.frame(x = x, y = dpois(x, lambda = 1)) data3 <- data.frame(x = x, y = dpois(x, lambda = 7.5)) # Create individual ggplot objects plot1 <- ggplot(data1, aes(x, y)) + geom_col() + xlab("x") + ylab("Probability") + ylim(0,1) plot2 <- ggplot(data2, aes(x, y)) + geom_col() + xlab("x") + ylab(NULL) + ylim(0,1) plot3 <- ggplot(data3, aes(x, y)) + geom_col() + xlab("x") + ylab(NULL) + ylim(0,1) # Combine the plots grid.arrange(plot1, plot2, plot3, ncol = 3) Exercise 18.4 (Poisson intuition 2) The Poisson distribution is a discrete probability distribution that models processes where events occur at a constant mean rate and are independent of each other. It has a single parameter \\(\\lambda\\), which represents the constant mean rate. 
A classic example of a scenario that can be modeled using the Poisson distribution is the number of calls received by a call center in a day (or in fact any other time interval). Suppose you work in a call center and have some understanding of probability distributions. You overhear your supervisor mentioning that the call center receives an average of 2.5 calls per day. Using your knowledge of the Poisson distribution, calculate: The probability you will get no calls today. The probability you will get more than 5 calls today. Solution. First recall the Poisson pmf: \\[p(k) = \\frac{\\lambda^k e^{-\\lambda}}{k!}\\] as stated previously our parameter \\(\\lambda = 2.5\\) To get the probability of no calls we simply plug in \\(k = 0\\), so: \\[p(0) = \\frac{2.5^0 e^{-2.5}}{0!} = e^{-2.5} \\approx 0.082\\] The support of the Poisson distribution is non-negative integers. So if we wanted to calculate the probability of getting more than 5 calls we would need to add up the probabilities of getting 6 calls and 7 calls and so on up to infinity. Let us instead remember that the sum of all probabilties will be 1, we will reverse the problem and instead ask “What is the probability we get 5 calls or less?”. We can subtract the probability of the opposite outcome (the complement) from 1 to get the probability of our original question. \\[P(k > 5) = 1 - P(k \\leq 5)\\] \\[P(k \\leq 5) = \\sum_{i=0}^{5} p(i) = p(0) + p(1) + p(2) + p(3) + p(4) + p(5) =\\] \\[= \\frac{2.5^0 e^{-2.5}}{0!} + \\frac{2.5^1 e^{-2.5}}{1!} + \\dots =\\] \\[=0.957979\\] So the probability of geting more than 5 calls will be \\(1 - 0.957979 = 0.042021\\) Exercise 18.5 (Geometric intuition 1) The geometric distribution is a discrete distribution that models the number of failures before the first success in a sequence of independent Bernoulli trials. It has a single parameter \\(p\\), representing the probability of success. Disclaimer: There are two forms of this distribution, the one we just described and another version that models the number of trials before the first success. The difference is subtle yet significant and you are likely to encounter both forms, though here we will limit ourselves to the former. In the graph below we show the pmf of a geometric distribution with \\(p=0.5\\). This can be thought of as the number of successive failures (tails) in the flip of a fair coin. You can see that there’s a 50% chance you will have zero failures i.e. you will flip a heads on your very first attempt. But there is some smaller chance that you will flip a sequence of tails in a row, with longer sequences having ever lower probability. Create an equivalent graph that represents the probability of rolling a 6 with a fair 6-sided die. Use the formula for the mean of the geometric distribution and determine the average number of failures before you roll a 6. Look up the alternative form of the geometric distribtuion and again use the formula for the mean to determine the average number of trials before you roll a 6. Solution. Parameter p (the probability of success) for rolling a 6 is \\(p=\\frac{1}{6}\\). 
library(ggplot2) # Parameters p <- 1/6 x_vals <- 0:9 # Starting from 0 probs <- dgeom(x_vals, p) # Data data <- data.frame(x_vals, probs) # Plot ggplot(data, aes(x=x_vals, y=probs)) + geom_segment(aes(xend=x_vals, yend=0), color="black", size=1) + geom_point(color="red", size=2) + labs(x = "Number of trials", y = "Probability") + theme_minimal() + scale_x_continuous(breaks = x_vals) # This line ensures integer x-axis labels ::: {.solution} b) The expected value of a random variable (the mean) is denoted as \\(E[X]\\). \\[E[X] = \\frac{1-p}{p}= \\frac{1- \\frac{1}{6}}{\\frac{1}{6}} = \\frac{5}{6}\\cdot 6 = 5\\] On average we will fail 5 times before we roll our first 6. The alternative form of this distribution (with support on all positive integers) has a slightly different formula for the mean. This change reflects the difference in the way we posed our question: \\[E[X] = \\frac{1}{p} = \\frac{1}{\\frac{1}{6}} = 6\\] On average we will have to throw the die 6 times before we roll a 6. ::: 18.2 Continuous distributions Exercise 18.6 (Uniform intuition 1) The need for a randomness is a common problem. A practical solution are so-called random number generators (RNGs). The simplest RNG one would think of is choosing a set of numbers and having the generator return a number at random, where the probability of returning any number from this set is the same. If this set is an interval of real numbers, then we’ve basically described the continuous uniform distribution. It has two parameters \\(a\\) and \\(b\\), which define the beginning and end of its support respectively. Let’s think of the mean intuitively. The expected value or mean of a distribution is the pivot point on our x-axis, which “balances” the graph. Given parameters \\(a\\) and \\(b\\) what is your intuitive guess of the mean for this distribution? A special case of the uniform distribution is the standard uniform distribution with \\(a=0\\),\\(b=1\\). Write the pdf of this particular case. Solution. It’s the midpoint between \\(a\\) and \\(b\\), so \\(\\frac{b-a}{2}\\) Inserting the parameter values we get:\\[f(x) = \\begin{cases} 1 & \\text{if } 0 \\leq x \\leq 1 \\\\ 0 & \\text{otherwise} \\end{cases} \\] Notice how the pdf is just a constant \\(1\\) across all values of \\(x \\in [0,1]\\). Here it is important to distinguish between probability and probability density. The density may be 1, but the probability is not and while discreet distributions never exceeded 1 on the y-axis, continuous distributions can go as high as you like. Exercise 18.7 (Normal intuition 1) a Exercise 18.8 (Beta intuition 1) The beta distribution is a continuous distribution defined on the unit interval \\([0,1]\\). It has two strictly positive paramters \\(\\alpha\\) and \\(\\beta\\), which determine its shape. Its support makes it especially suitable to model distribtuions of percentages and proportions. Below you’ve been provided with some code that you can copy into Rstudio. Once you run the code, an interactive Shiny app will appear and you will be able to manipulate the graph of the beta distribution. Play around with the parameters to get: A straight line from (0,0) to (1,2) A straight line from (0,2) to (1,0) A symmetric bell curve A bowl-shaped curve The standard uniform distribution is actually a special case of the beta distribution. Find the exact parameters \\(\\alpha\\) and \\(\\beta\\). Once you do, prove the equality by inserting the values into our pdf. 
Hint: The beta function is evaluated as \\(\\text{B}(a,b) = \\frac{\\Gamma(a)\\Gamma(b)}{\\Gamma(a+b)}\\), the gamma function for positive integers \\(n\\) is evaluated as \\(\\Gamma(n)= (n-1)!\\) # Install and load necessary packages install.packages(c("shiny", "ggplot2")) library(shiny) library(ggplot2) # The Shiny App ui <- fluidPage( titlePanel("Beta Distribution Viewer"), sidebarLayout( sidebarPanel( sliderInput("alpha", "Alpha:", min = 0.1, max = 10, value = 2, step = 0.1), sliderInput("beta", "Beta:", min = 0.1, max = 10, value = 2, step = 0.1) ), mainPanel( plotOutput("betaPlot") ) ) ) server <- function(input, output) { output$betaPlot <- renderPlot({ x <- seq(0, 1, by = 0.01) y <- dbeta(x, shape1 = input$alpha, shape2 = input$beta) ggplot(data.frame(x = x, y = y), aes(x = x, y = y)) + geom_line() + labs(x = "Value", y = "Density") + theme_minimal() }) } shinyApp(ui = ui, server = server) Solution. \\(\\alpha = 2, \\beta=1\\) \\(\\alpha = 1, \\beta=2\\) Possible solution \\(\\alpha = \\beta= 5\\) Possible solution \\(\\alpha = \\beta= 0.5\\) The correct parameters are \\(\\alpha = 1, \\beta=1\\), to prove the equality we insert them into the beta pdf: \\[\\frac{x^{\\alpha - 1} (1 - x)^{\\beta - 1}}{\\text{B}(\\alpha, \\beta)} = \\frac{x^{1 - 1} (1 - x)^{1 - 1}}{\\text{B}(1, 1)} = \\frac{1}{\\frac{\\Gamma(1)\\Gamma(1)}{\\Gamma(1+1)}}= \\frac{1}{\\frac{(1-1)!(1-1)!}{(2-1)!}} = 1\\] Exercise 18.9 (Gamma intuition 1) a Exercise 18.10 (Exponential intuition 1) a "],["A1.html", "A R programming language A.1 Basic characteristics A.2 Why R? A.3 Setting up A.4 R basics A.5 Functions A.6 Other tips A.7 Further reading and references", " A R programming language A.1 Basic characteristics R is free software for statistical computing and graphics. It is widely used by statisticians, scientists, and other professionals for software development and data analysis. It is an interpreted language and therefore the programs do not need compilation. A.2 Why R? R is one of the main two languages used for statistics and machine learning (the other being Python). Pros Libraries. Comprehensive collection of statistical and machine learning packages. Easy to code. Open source. Anyone can access R and develop new methods. Additionally, it is relatively simple to get source code of established methods. Large community. The use of R has been rising for some time, in industry and academia. Therefore a large collection of blogs and tutorials exists, along with people offering help on pages like StackExchange and CrossValidated. Integration with other languages and LaTeX. New methods. Many researchers develop R packages based on their research, therefore new methods are available soon after development. Cons Slow. Programs run slower than in other programming languages, however this can be somewhat ammended by effective coding or integration with other languages. Memory intensive. This can become a problem with large data sets, as they need to be stored in the memory, along with all the information the models produce. Some packages are not as good as they should be, or have poor documentation. Object oriented programming in R can be very confusing and complex. A.3 Setting up https://www.r-project.org/. A.3.1 RStudio RStudio is the most widely used IDE for R. It is free, you can download it from https://rstudio.com/. While console R is sufficient for the requirements of this course, we recommend the students install RStudio for its better user interface. 
A.3.2 Libraries for data science Listed below are some of the more useful libraries (packages) for data science. Students are also encouraged to find other useful packages. dplyr Efficient data manipulation. Part of the wider package collection called tidyverse. ggplot2 Plotting based on grammar of graphics. stats Several statistical models. rstan Bayesian inference using Hamiltonian Monte Carlo. Very flexible model building. MCMCpack Bayesian inference. rmarkdown, knitr, and bookdown Dynamic reports (for example such as this one). devtools Package development. A.4 R basics A.4.1 Variables and types Important information and tips: no type declaration define variables with <- instead of = (although both work, there is a slight difference, additionally most of the packages use the arrow) for strings use \"\" for comments use # change types with as.type() functions no special type for single character like C++ for example n <- 20 x <- 2.7 m <- n # m gets value 20 my_flag <- TRUE student_name <- "Luka" typeof(n) ## [1] "double" typeof(student_name) ## [1] "character" typeof(my_flag) ## [1] "logical" typeof(as.integer(n)) ## [1] "integer" typeof(as.character(n)) ## [1] "character" A.4.2 Basic operations n + x ## [1] 22.7 n - x ## [1] 17.3 diff <- n - x # variable diff gets the difference between n and x diff ## [1] 17.3 n * x ## [1] 54 n / x ## [1] 7.407407 x^2 ## [1] 7.29 sqrt(x) ## [1] 1.643168 n > 2 * n ## [1] FALSE n == n ## [1] TRUE n == 2 * n ## [1] FALSE n != n ## [1] FALSE paste(student_name, "is", n, "years old") ## [1] "Luka is 20 years old" A.4.3 Vectors use c() to combine elements into vectors can only contain one type of variable if different types are provided, all are transformed to the most basic type in the vector access elements by indexes or logical vectors of the same length a scalar value is regarded as a vector of length 1 1:4 # creates a vector of integers from 1 to 4 ## [1] 1 2 3 4 student_ages <- c(20, 23, 21) student_names <- c("Luke", "Jen", "Mike") passed <- c(TRUE, TRUE, FALSE) length(student_ages) ## [1] 3 # access by index student_ages[2] ## [1] 23 student_ages[1:2] ## [1] 20 23 student_ages[2] <- 24 # change values # access by logical vectors student_ages[passed == TRUE] # same as student_ages[passed] ## [1] 20 24 student_ages[student_names %in% c("Luke", "Mike")] ## [1] 20 21 student_names[student_ages > 20] ## [1] "Jen" "Mike" A.4.3.1 Operations with vectors most operations are element-wise if we operate on vectors of different lengths, the shorter vector periodically repeats its elements until it reaches the length of the longer one a <- c(1, 3, 5) b <- c(2, 2, 1) d <- c(6, 7) a + b ## [1] 3 5 6 a * b ## [1] 2 6 5 a + d ## Warning in a + d: longer object length is not a multiple of shorter object ## length ## [1] 7 10 11 a + 2 * b ## [1] 5 7 7 a > b ## [1] FALSE TRUE TRUE b == a ## [1] FALSE FALSE FALSE a %*% b # vector multiplication, not element-wise ## [,1] ## [1,] 13 A.4.4 Factors vectors of finite predetermined classes suitable for categorical variables ordinal (ordered) or nominal (unordered) car_brand <- factor(c("Audi", "BMW", "Mercedes", "BMW"), ordered = FALSE) car_brand ## [1] Audi BMW Mercedes BMW ## Levels: Audi BMW Mercedes freq <- factor(x = NA, levels = c("never","rarely","sometimes","often","always"), ordered = TRUE) freq[1:3] <- c("rarely", "sometimes", "rarely") freq ## [1] rarely sometimes rarely ## Levels: never < rarely < sometimes < often < always freq[4] <- "quite_often" # non-existing level, returns NA ## Warning in 
`[<-.factor`(`*tmp*`, 4, value = "quite_often"): invalid factor ## level, NA generated freq ## [1] rarely sometimes rarely <NA> ## Levels: never < rarely < sometimes < often < always A.4.5 Matrices two-dimensional generalizations of vectors my_matrix <- matrix(c(1, 2, 1, 5, 4, 2), nrow = 2, byrow = TRUE) my_matrix ## [,1] [,2] [,3] ## [1,] 1 2 1 ## [2,] 5 4 2 my_square_matrix <- matrix(c(1, 3, 2, 3), nrow = 2) my_square_matrix ## [,1] [,2] ## [1,] 1 2 ## [2,] 3 3 my_matrix[1,2] # first row, second column ## [1] 2 my_matrix[2, ] # second row ## [1] 5 4 2 my_matrix[ ,3] # third column ## [1] 1 2 A.4.5.1 Matrix functions and operations most operation element-wise mind the dimensions when using matrix multiplication %*% nrow(my_matrix) # number of matrix rows ## [1] 2 ncol(my_matrix) # number of matrix columns ## [1] 3 dim(my_matrix) # matrix dimension ## [1] 2 3 t(my_matrix) # transpose ## [,1] [,2] ## [1,] 1 5 ## [2,] 2 4 ## [3,] 1 2 diag(my_matrix) # the diagonal of the matrix as vector ## [1] 1 4 diag(1, nrow = 3) # creates a diagonal matrix ## [,1] [,2] [,3] ## [1,] 1 0 0 ## [2,] 0 1 0 ## [3,] 0 0 1 det(my_square_matrix) # matrix determinant ## [1] -3 my_matrix + 2 * my_matrix ## [,1] [,2] [,3] ## [1,] 3 6 3 ## [2,] 15 12 6 my_matrix * my_matrix # element-wise multiplication ## [,1] [,2] [,3] ## [1,] 1 4 1 ## [2,] 25 16 4 my_matrix %*% t(my_matrix) # matrix multiplication ## [,1] [,2] ## [1,] 6 15 ## [2,] 15 45 my_vec <- as.vector(my_matrix) # transform to vector my_vec ## [1] 1 5 2 4 1 2 A.4.6 Arrays multi-dimensional generalizations of matrices my_array <- array(c(1, 2, 3, 4, 5, 6, 7, 8), dim = c(2, 2, 2)) my_array[1, 1, 1] ## [1] 1 my_array[2, 2, 1] ## [1] 4 my_array[1, , ] ## [,1] [,2] ## [1,] 1 5 ## [2,] 3 7 dim(my_array) ## [1] 2 2 2 A.4.7 Data frames basic data structure for analysis differ from matrices as columns can be of different types student_data <- data.frame("Name" = student_names, "Age" = student_ages, "Pass" = passed) student_data ## Name Age Pass ## 1 Luke 20 TRUE ## 2 Jen 24 TRUE ## 3 Mike 21 FALSE colnames(student_data) <- c("name", "age", "pass") # change column names student_data[1, ] ## name age pass ## 1 Luke 20 TRUE student_data[ ,colnames(student_data) %in% c("name", "pass")] ## name pass ## 1 Luke TRUE ## 2 Jen TRUE ## 3 Mike FALSE student_data$pass # access column by name ## [1] TRUE TRUE FALSE student_data[student_data$pass == TRUE, ] ## name age pass ## 1 Luke 20 TRUE ## 2 Jen 24 TRUE A.4.8 Lists useful for storing different data structures access elements with double square brackets elements can be named first_list <- list(student_ages, my_matrix, student_data) second_list <- list(student_ages, my_matrix, student_data, first_list) first_list[[1]] ## [1] 20 24 21 second_list[[4]] ## [[1]] ## [1] 20 24 21 ## ## [[2]] ## [,1] [,2] [,3] ## [1,] 1 2 1 ## [2,] 5 4 2 ## ## [[3]] ## name age pass ## 1 Luke 20 TRUE ## 2 Jen 24 TRUE ## 3 Mike 21 FALSE second_list[[4]][[1]] # first element of the fourth element of second_list ## [1] 20 24 21 length(second_list) ## [1] 4 second_list[[length(second_list) + 1]] <- "add_me" # append an element names(first_list) <- c("Age", "Matrix", "Data") first_list$Age ## [1] 20 24 21 A.4.9 Loops mostly for loop for loop can iterate over an arbitrary vector # iterate over consecutive natural numbers my_sum <- 0 for (i in 1:10) { my_sum <- my_sum + i } my_sum ## [1] 55 # iterate over an arbirary vector my_sum <- 0 some_numbers <- c(2, 3.5, 6, 100) for (i in some_numbers) { my_sum <- my_sum + i } my_sum ## [1] 111.5 A.5 Functions for 
help use ?function_name A.5.1 Writing functions We can write our own functions with function(). In the brackets, we define the parameters the function gets, and in curly brackets we define what the function does. We use return() to return values. sum_first_n_elements <- function (n) { my_sum <- 0 for (i in 1:n) { my_sum <- my_sum + i } return (my_sum) } sum_first_n_elements(10) ## [1] 55 A.6 Other tips Use set.seed(arbitrary_number) at the beginning of a script to set the seed and ensure replication. To dynamically set the working directory in R Studio to the parent folder of a R script use setwd(dirname(rstudioapi::getSourceEditorContext()$path)). To avoid slow R loops use the apply family of functions. See ?apply and ?lapply. To make your data manipulation (and therefore your life) a whole lot easier, use the dplyr package. Use getAnywhere(function_name) to get the source code of any function. Use browser for debugging. See ?browser. A.7 Further reading and references Getting started with R Studio: https://www.youtube.com/watch?v=lVKMsaWju8w Official R manuals: https://cran.r-project.org/manuals.html Cheatsheets: https://www.rstudio.com/resources/cheatsheets/ Workshop on R, dplyr, ggplot2, and R Markdown: https://github.com/bstatcomp/Rworkshop "],["distributions.html", "B Probability distributions", " B Probability distributions Name parameters support pdf/pmf mean variance Bernoulli \\(p \\in [0,1]\\) \\(k \\in \\{0,1\\}\\) \\(p^k (1 - p)^{1 - k}\\) 1.12 \\(p\\) 7.1 \\(p(1-p)\\) 7.1 binomial \\(n \\in \\mathbb{N}\\), \\(p \\in [0,1]\\) \\(k \\in \\{0,1,\\dots,n\\}\\) \\(\\binom{n}{k} p^k (1 - p)^{n - k}\\) 4.4 \\(np\\) 7.2 \\(np(1-p)\\) 7.2 Poisson \\(\\lambda > 0\\) \\(k \\in \\mathbb{N}_0\\) \\(\\frac{\\lambda^k e^{-\\lambda}}{k!}\\) 4.6 \\(\\lambda\\) 7.3 \\(\\lambda\\) 7.3 geometric \\(p \\in (0,1]\\) \\(k \\in \\mathbb{N}_0\\) \\(p(1-p)^k\\) 4.5 \\(\\frac{1 - p}{p}\\) 7.4 \\(\\frac{1 - p}{p^2}\\) 9.3 normal \\(\\mu \\in \\mathbb{R}\\), \\(\\sigma^2 > 0\\) \\(x \\in \\mathbb{R}\\) \\(\\frac{1}{\\sqrt{2 \\pi \\sigma^2}} e^{-\\frac{(x - \\mu)^2}{2 \\sigma^2}}\\) 4.12 \\(\\mu\\) 7.8 \\(\\sigma^2\\) 7.8 uniform \\(a,b \\in \\mathbb{R}\\), \\(a < b\\) \\(x \\in [a,b]\\) \\(\\frac{1}{b-a}\\) 4.9 \\(\\frac{a+b}{2}\\) \\(\\frac{(b-a)^2}{12}\\) beta \\(\\alpha,\\beta > 0\\) \\(x \\in [0,1]\\) \\(\\frac{x^{\\alpha - 1} (1 - x)^{\\beta - 1}}{\\text{B}(\\alpha, \\beta)}\\) 4.10 \\(\\frac{\\alpha}{\\alpha + \\beta}\\) 7.6 \\(\\frac{\\alpha \\beta}{(\\alpha + \\beta)^2(\\alpha + \\beta + 1)}\\) 7.6 gamma \\(\\alpha,\\beta > 0\\) \\(x \\in (0, \\infty)\\) \\(\\frac{\\beta^\\alpha}{\\Gamma(\\alpha)} x^{\\alpha - 1}e^{-\\beta x}\\) 4.11 \\(\\frac{\\alpha}{\\beta}\\) 7.5 \\(\\frac{\\alpha}{\\beta^2}\\) 7.5 exponential \\(\\lambda > 0\\) \\(x \\in [0, \\infty)\\) \\(\\lambda e^{-\\lambda x}\\) 4.8 \\(\\frac{1}{\\lambda}\\) 7.7 \\(\\frac{1}{\\lambda^2}\\) 7.7 logistic \\(\\mu \\in \\mathbb{R}\\), \\(s > 0\\) \\(x \\in \\mathbb{R}\\) \\(\\frac{e^{-\\frac{x - \\mu}{s}}}{s(1 + e^{-\\frac{x - \\mu}{s}})^2}\\) 4.13 \\(\\mu\\) \\(\\frac{s^2 \\pi^2}{3}\\) negative binomial \\(r \\in \\mathbb{N}\\), \\(p \\in [0,1]\\) \\(k \\in \\mathbb{N}_0\\) \\(\\binom{k + r - 1}{k}(1-p)^r p^k\\) 4.7 \\(\\frac{rp}{1 - p}\\) 9.2 \\(\\frac{rp}{(1 - p)^2}\\) 9.2 multinomial \\(n \\in \\mathbb{N}\\), \\(k \\in \\mathbb{N}\\) \\(p_i \\in [0,1]\\), \\(\\sum p_i = 1\\) \\(x_i \\in \\{0,..., n\\}\\), \\(i \\in \\{1,...,k\\}\\), \\(\\sum{x_i} = n\\) \\(\\frac{n!}{x_1!x_2!...x_k!} p_1^{x_1} p_2^{x_2}...p_k^{x_k}\\) 8.1 \\(np_i\\) 
\\(np_i(1-p_i)\\) "],["references.html", "References", " References "],["404.html", "Page not found", " Page not found The page you requested cannot be found (perhaps it was moved or renamed). You may want to try searching to find the page's new location, or use the table of contents to find the page you are looking for. "]] +[["distributions-intutition.html", "Chapter 18 Distributions intutition 18.1 Discrete distributions 18.2 Continuous distributions", " Chapter 18 Distributions intutition This chapter is intended to help you familiarize yourself with the different probability distributions you will encounter in this course. You will need to use Appendix B extensively as a reference for the basic properties of distributions, so keep it close! .fold-btn { float: right; margin: 5px 5px 0 0; } .fold { border: 1px solid black; min-height: 40px; } 18.1 Discrete distributions Exercise 18.1 (Bernoulli intuition 1) Arguably the simplest distribution you will enocounter is the Bernoulli distribution. It is a discrete probability distribution used to represent the outcome of a yes/no question. It has one parameter \\(p\\) which is the probability of success. The probability of failure is \\((1-p)\\), sometimes denoted as \\(q\\). A classic way to think about a Bernoulli trial (a yes/no experiment) is a coin flip. Real coins are fair, meaning the probability of either heads (1) or tails (0) are the same, so \\(p=0.5\\) as shown below in figure a. Alternatively we may want to represent a process that doesn’t have equal probabilities of outcomes like “Will a throw of a fair die result in a 6?”. In this case \\(p=\\frac{1}{6}\\), shown in figure b. Using your knowledge of the Bernoulli distribution use the throw of a fair die to think of events, such that: \\(p = 0.5\\) \\(p = \\frac{5}{6}\\) \\(q = \\frac{2}{3}\\) Solution. An event that is equally likely to happen or not happen i.e. \\(p = 0.5\\) would be throwing an even number. More formally we can name this event \\(A\\) and write: \\(A = \\{2,4,6\\}\\), its probability being \\(P(A) = 0.5\\) An example of an event with \\(p = \\frac{5}{6}\\) would be throwing a number greater than 1. Defined as \\(B = \\{2,3,4,5,6\\}\\). We need an event that fails \\(\\frac{2}{3}\\) of the time. Alternatively we can reverse the problem and find an event that succeeds \\(\\frac{1}{3}\\) of the time, since: \\(q = 1 - p \\implies p = 1 - q = \\frac{1}{3}\\). The event that our outcome is divisible by 3: \\(C = \\{3, 6\\}\\) satisfies this condition. Exercise 18.2 (Binomial intuition 1) The binomial distribution is a generalization of the Bernoulli distribution. Instead of considering a single Bernoulli trial, we now consider a sequence of \\(n\\) trials, which are independent and have the same parameter \\(p\\). So the binomial distribution has two parameters \\(n\\) - the number of trials and \\(p\\) - the probability of success for each trial. If we return to our coin flip representation, we now flip a coin several times. The binomial distribution will give us the probabilities of all possible outcomes. Below we show the distribution for a series of 10 coin flips with a fair coin (left) and a biased coin (right). The numbers on the x axis represent the number of times the coin landed heads. Using your knowledge of the binomial distribution: Take the pmf of the binomial distribution and plug in \\(n=1\\), check that it is in fact equivalent to a Bernoulli distribution. In our examples we show the graph of a binomial distribution over 10 trials with \\(p=0.8\\). 
If we take a look at the graph, it appears as though the probabilities of getting 0,1,2 or 3 heads in 10 flips are zero. Is it actually zero? Check by plugging in the values into the pmf. Solution. The pmf of a binomial distribution is \\(\\binom{n}{k} p^k (1 - p)^{n - k}\\), now we insert \\(n=1\\) to get: \\[\\binom{1}{k} p^k (1 - p)^{1 - k}\\]. Not quite equivalent to a Bernoulli, however note that the support of the binomial distribution is defined as \\(k \\in \\{0,1,\\dots,n\\}\\), so in our case \\(k = \\{0,1\\}\\), then: \\[\\binom{1}{0} = \\binom{1}{1} = 1\\] we get: \\(p^k (1 - p)^{1 - k}\\) ,the Bernoulli distribution. As we already know \\(p=0.8, n=10\\), so: \\[\\binom{10}{0} 0.8^0 (1 - 0.8)^{10 - 0} = 1.024 \\cdot 10^{-7}\\] \\[\\binom{10}{1} 0.8^1 (1 - 0.8)^{10 - 1} = 4.096 \\cdot 10^{-6}\\] \\[\\binom{10}{2} 0.8^2 (1 - 0.8)^{10 - 2} = 7.3728 \\cdot 10^{-5}\\] \\[\\binom{10}{3} 0.8^3 (1 - 0.8)^{10 - 3} = 7.86432\\cdot 10^{-4}\\] So the probabilities are not zero, just very small. Exercise 18.3 (Poisson intuition 1) Below are shown 3 different graphs of the Poisson distribution. Your task is to replicate them on your own in R by varying the \\(\\lambda\\) parameter. Hint: You can use dpois() to get the probabilities. library(ggplot2) library(gridExtra) x = 0:15 # Create Poisson data data1 <- data.frame(x = x, y = dpois(x, lambda = 0.1)) data2 <- data.frame(x = x, y = dpois(x, lambda = 1)) data3 <- data.frame(x = x, y = dpois(x, lambda = 7.5)) # Create individual ggplot objects plot1 <- ggplot(data1, aes(x, y)) + geom_col() + xlab("x") + ylab("Probability") + ylim(0,1) plot2 <- ggplot(data2, aes(x, y)) + geom_col() + xlab("x") + ylab(NULL) + ylim(0,1) plot3 <- ggplot(data3, aes(x, y)) + geom_col() + xlab("x") + ylab(NULL) + ylim(0,1) # Combine the plots grid.arrange(plot1, plot2, plot3, ncol = 3) Exercise 18.4 (Poisson intuition 2) The Poisson distribution is a discrete probability distribution that models the probability of a given number of events occuring within processes where events occur at a constant mean rate and independently of each other - a Poisson process. It has a single parameter \\(\\lambda\\), which represents the constant mean rate. A classic example of a scenario that can be modeled using the Poisson distribution is the number of calls received by a call center in a day (or in fact any other time interval). Suppose you work in a call center and have some understanding of probability distributions. You overhear your supervisor mentioning that the call center receives an average of 2.5 calls per day. Using your knowledge of the Poisson distribution, calculate: The probability you will get no calls today. The probability you will get more than 5 calls today. Solution. First recall the Poisson pmf: \\[p(k) = \\frac{\\lambda^k e^{-\\lambda}}{k!}\\] as stated previously our parameter \\(\\lambda = 2.5\\) To get the probability of no calls we simply plug in \\(k = 0\\), so: \\[p(0) = \\frac{2.5^0 e^{-2.5}}{0!} = e^{-2.5} \\approx 0.082\\] The support of the Poisson distribution is non-negative integers. So if we wanted to calculate the probability of getting more than 5 calls we would need to add up the probabilities of getting 6 calls and 7 calls and so on up to infinity. Let us instead remember that the sum of all probabilties will be 1, we will reverse the problem and instead ask “What is the probability we get 5 calls or less?”. We can subtract the probability of the opposite outcome (the complement) from 1 to get the probability of our original question. 
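The same numbers are easy to check in R before working them out by hand (a quick sketch, not part of the original solution): dpois() evaluates the Poisson pmf and ppois() its cdf.

```r
# Numerical check of the call center example with base R (sketch only):
lambda <- 2.5
dpois(0, lambda)                      # P(X = 0), approximately 0.082
1 - ppois(5, lambda)                  # P(X > 5) via the complement
ppois(5, lambda, lower.tail = FALSE)  # the same upper-tail probability directly
```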
\\[P(k > 5) = 1 - P(k \\leq 5)\\] \\[P(k \\leq 5) = \\sum_{i=0}^{5} p(i) = p(0) + p(1) + p(2) + p(3) + p(4) + p(5) =\\] \\[= \\frac{2.5^0 e^{-2.5}}{0!} + \\frac{2.5^1 e^{-2.5}}{1!} + \\dots =\\] \\[=0.957979\\] So the probability of geting more than 5 calls will be \\(1 - 0.957979 = 0.042021\\) Exercise 18.5 (Geometric intuition 1) The geometric distribution is a discrete distribution that models the number of failures before the first success in a sequence of independent Bernoulli trials. It has a single parameter \\(p\\), representing the probability of success. NOTE: There are two forms of this distribution, the one we just described and another that models the number of trials before the first success. The difference is subtle yet significant and you are likely to encounter both forms. In the graph below we show the pmf of a geometric distribution with \\(p=0.5\\). This can be thought of as the number of successive failures (tails) in the flip of a fair coin. You can see that there’s a 50% chance you will have zero failures i.e. you will flip a heads on your very first attempt. But there is some smaller chance that you will flip a sequence of tails in a row, with longer sequences having ever lower probability. Create an equivalent graph that represents the probability of rolling a 6 with a fair 6-sided die. Use the formula for the mean of the geometric distribution and determine the average number of failures before you roll a 6. Look up the alternative form of the geometric distribtuion and again use the formula for the mean to determine the average number of trials before you roll a 6. Solution. Parameter p (the probability of success) for rolling a 6 is \\(p=\\frac{1}{6}\\). library(ggplot2) # Parameters p <- 1/6 x_vals <- 0:9 # Starting from 0 probs <- dgeom(x_vals, p) # Data data <- data.frame(x_vals, probs) # Plot ggplot(data, aes(x=x_vals, y=probs)) + geom_segment(aes(xend=x_vals, yend=0), color="black", size=1) + geom_point(color="red", size=2) + labs(x = "Number of trials", y = "Probability") + theme_minimal() + scale_x_continuous(breaks = x_vals) # This line ensures integer x-axis labels ::: {.solution} b) The expected value of a random variable (the mean) is denoted as \\(E[X]\\). \\[E[X] = \\frac{1-p}{p}= \\frac{1- \\frac{1}{6}}{\\frac{1}{6}} = \\frac{5}{6}\\cdot 6 = 5\\] On average we will fail 5 times before we roll our first 6. The alternative form of this distribution (with support on all positive integers) has a slightly different formula for the mean. This change reflects the difference in the way we posed our question: \\[E[X] = \\frac{1}{p} = \\frac{1}{\\frac{1}{6}} = 6\\] On average we will have to throw the die 6 times before we roll a 6. ::: 18.2 Continuous distributions Exercise 18.6 (Uniform intuition 1) The need for a randomness is a common problem. A practical solution are so-called random number generators (RNGs). The simplest RNG one would think of is choosing a set of numbers and having the generator return a number at random, where the probability of returning any number from this set is the same. If this set is an interval of real numbers, then we’ve basically described the continuous uniform distribution. It has two parameters \\(a\\) and \\(b\\), which define the beginning and end of its support respectively. Let’s think of the mean intuitively. The expected value or mean of a distribution is the pivot point on our x-axis, which “balances” the graph. Given parameters \\(a\\) and \\(b\\) what is your intuitive guess of the mean for this distribution? 
A special case of the uniform distribution is the standard uniform distribution with \\(a=0\\) and \\(b=1\\). Write the pdf \\(f(x)\\) of this particular distribution. Solution. It’s the midpoint between \\(a\\) and \\(b\\), so \\(\\frac{a+b}{2}\\) Inserting the parameter values we get:\\[f(x) = \\begin{cases} 1 & \\text{if } 0 \\leq x \\leq 1 \\\\ 0 & \\text{otherwise} \\end{cases} \\] Notice how the pdf is just a constant \\(1\\) across all values of \\(x \\in [0,1]\\). Here it is important to distinguish between probability and probability density. The density may be 1, but the probability is not and while discreet distributions never exceed 1 on the y-axis, continuous distributions can go as high as you like. Exercise 18.7 (Normal intuition 1) The normal distribution, also known as the Gaussian distribution, is a continuous distribution that encompasses the entire real number line. It has two parameters: the mean, denoted by \\(\\mu\\), and the variance, represented by \\(\\sigma^2\\). Its shape resembles the iconic bell curve. The position of its peak is determined by the parameter \\(\\mu\\), while the variance determines the spread or width of the curve. A smaller variance results in a sharper, narrower peak, while a larger variance leads to a broader, more spread-out curve. Below, we graph the distribution of IQ scores for two different populations. We aim to identify individuals with an IQ at or above 140 for an experiment. We can identify them reliably; however, we only have time to examine one of the two groups. Which group should we investigate to have the best chance? NOTE: The graph below displays the parameter \\(\\sigma\\), which is the square root of the variance, more commonly referred to as the standard deviation. Keep this in mind when solving the problems. Insert the values of either population into the pdf of a normal distribution and determine which one has a higher density at \\(x=140\\). Generate the graph yourself and zoom into the relevant area to graphically verify your answer. To determine probability density, we can use the pdf. However, if we wish to know the proportion of the population that falls within certain parameters, we would need to integrate the pdf. Fortunately, the integrals of common distributions are well-established. This integral gives us the cumulative distribution function \\(F(x)\\) (CDF). BONUS: Look up the CDF of the normal distribution and input the appropriate values to determine the percentage of each population that comprises individuals with an IQ of 140 or higher. Solution. Group 1: \\(\\mu = 100, \\sigma=10 \\rightarrow \\sigma^2 = 100\\) \\[\\frac{1}{\\sqrt{2 \\pi \\sigma^2}} e^{-\\frac{(x - \\mu)^2}{2 \\sigma^2}} = \\frac{1}{\\sqrt{2 \\pi 100}} e^{-\\frac{(140 - 100)^2}{2 \\cdot 100}} \\approx 1.34e-05\\] Group 2: \\(\\mu = 105, \\sigma=8 \\rightarrow \\sigma^2 = 64\\) \\[\\frac{1}{\\sqrt{2 \\pi \\sigma^2}} e^{-\\frac{(x - \\mu)^2}{2 \\sigma^2}} = \\frac{1}{\\sqrt{2 \\pi 64}} e^{-\\frac{(140 - 105)^2}{2 \\cdot 64}} \\approx 3.48e-06\\] So despite the fact that group 1 has a lower average IQ, we are more likely to find 140 IQ individuals in group 2. 
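Both densities can also be verified numerically (a quick check, not part of the original solution) with dnorm(), which expects the standard deviation rather than the variance.

```r
# Evaluate the two normal densities at an IQ of 140 (sketch only):
dnorm(140, mean = 100, sd = 10)  # group 1, roughly 1.34e-05
dnorm(140, mean = 105, sd = 8)   # group 2, roughly 3.48e-06
```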
library(ggplot2) library(tidyr) # Create data x <- seq(135, 145, by = 0.01) # Adjusting the x range to account for the larger standard deviations df <- data.frame(x = x) # Define the IQ distributions df$IQ_mu100_sd10 <- dnorm(df$x, mean = 100, sd = 10) df$IQ_mu105_sd8 <- dnorm(df$x, mean = 105, sd = 8) # Convert from wide to long format for ggplot2 df_long <- gather(df, distribution, density, -x) # Ensure the levels of the 'distribution' factor match our desired order df_long$distribution <- factor(df_long$distribution, levels = c("IQ_mu100_sd10", "IQ_mu105_sd8")) # Plot ggplot(df_long, aes(x = x, y = density, color = distribution)) + geom_line() + labs(x = "IQ Score", y = "Density") + scale_color_manual( name = "IQ Distribution", values = c(IQ_mu100_sd10 = "red", IQ_mu105_sd8 = "blue"), labels = c("Group 1 (µ=100, σ=10)", "Group 2 (µ=105, σ=8)") ) + theme_minimal() ::: {.solution} c. The CDF of the normal distribution is \\(\\Phi(x) = \\frac{1}{2} \\left[ 1 + \\text{erf} \\left( \\frac{x - \\mu}{\\sigma \\sqrt{2}} \\right) \\right]\\). The CDF is defined as the integral of the distribution density up to x. So to get the total percentage of individuals with IQ at 140 or higher we will need to subtract the value from 1. Group 1: \\[1 - \\Phi(140) = \\frac{1}{2} \\left[ 1 + \\text{erf} \\left( \\frac{140 - 100}{10 \\sqrt{2}} \\right) \\right] \\approx 3.17e-05 \\] Group 2 : \\[1 - \\Phi(140) = \\frac{1}{2} \\left[ 1 + \\text{erf} \\left( \\frac{140 - 105}{8 \\sqrt{2}} \\right) \\right] \\approx 6.07e-06 \\] So roughly 0.003% and 0.0006% of individuals in groups 1 and 2 respectively have an IQ at or above 140. ::: Exercise 18.8 (Beta intuition 1) The beta distribution is a continuous distribution defined on the unit interval \\([0,1]\\). It has two strictly positive paramters \\(\\alpha\\) and \\(\\beta\\), which determine its shape. Its support makes it especially suitable to model distribtuions of percentages and proportions. Below you’ve been provided with some code that you can copy into Rstudio. Once you run the code, an interactive Shiny app will appear and you will be able to manipulate the graph of the beta distribution. Play around with the parameters to get: A straight line from (0,0) to (1,2) A straight line from (0,2) to (1,0) A symmetric bell curve A bowl-shaped curve The standard uniform distribution is actually a special case of the beta distribution. Find the exact parameters \\(\\alpha\\) and \\(\\beta\\). Once you do, prove the equality by inserting the values into our pdf. Hint: The beta function is evaluated as \\(\\text{B}(a,b) = \\frac{\\Gamma(a)\\Gamma(b)}{\\Gamma(a+b)}\\), the gamma function for positive integers \\(n\\) is evaluated as \\(\\Gamma(n)= (n-1)!\\) # Install and load necessary packages install.packages(c("shiny", "ggplot2")) library(shiny) library(ggplot2) # The Shiny App ui <- fluidPage( titlePanel("Beta Distribution Viewer"), sidebarLayout( sidebarPanel( sliderInput("alpha", "Alpha:", min = 0.1, max = 10, value = 2, step = 0.1), sliderInput("beta", "Beta:", min = 0.1, max = 10, value = 2, step = 0.1) ), mainPanel( plotOutput("betaPlot") ) ) ) server <- function(input, output) { output$betaPlot <- renderPlot({ x <- seq(0, 1, by = 0.01) y <- dbeta(x, shape1 = input$alpha, shape2 = input$beta) ggplot(data.frame(x = x, y = y), aes(x = x, y = y)) + geom_line() + labs(x = "Value", y = "Density") + theme_minimal() }) } shinyApp(ui = ui, server = server) Solution. 
\\(\\alpha = 2, \\beta=1\\) \\(\\alpha = 1, \\beta=2\\) Possible solution \\(\\alpha = \\beta= 5\\) Possible solution \\(\\alpha = \\beta= 0.5\\) The correct parameters are \\(\\alpha = 1, \\beta=1\\), to prove the equality we insert them into the beta pdf: \\[\\frac{x^{\\alpha - 1} (1 - x)^{\\beta - 1}}{\\text{B}(\\alpha, \\beta)} = \\frac{x^{1 - 1} (1 - x)^{1 - 1}}{\\text{B}(1, 1)} = \\frac{1}{\\frac{\\Gamma(1)\\Gamma(1)}{\\Gamma(1+1)}}= \\frac{1}{\\frac{(1-1)!(1-1)!}{(2-1)!}} = 1\\] Exercise 18.9 (Exponential intuition 1) The exponential distribution represents the distributon of time between events in a Poisson process. It is the continuous analogue of the geometric distribution. It has a single parameter \\(\\lambda\\), which is strictly positive and represents the constant rate of the corresponding Poisson process. The support is all positive reals, since time between events is non-negative, but not bound upwards. Let’s revisit the call center from our Poisson problem. We get 2.5 calls per day on average, this is our rate parameter \\(\\lambda\\). A work day is 8 hours. What is the mean time between phone calls? The cdf \\(F(x)\\) tells us what percentage of calls occur within x amount of time of each other. You want to take an hour long lunch break but are worried about missing calls. Calculate the percentage of calls you are likely to miss if you’re gone for an hour. Hint: The cdf is \\(F(x) = \\int_{-\\infty}^{x} f(x) dx\\) Solution. Taking \\(\\lambda = \\frac{2.5 \\text{ calls}}{8 \\text{ hours}} = \\frac{1 \\text{ call}}{3.2 \\text{ hours}}\\) \\[E[X] = \\frac{1}{\\lambda} = \\frac{3.2 \\text{ hours}}{\\text{call}}\\] First we derive the CDF, we can integrate from 0 instead of \\(-\\infty\\), since we have no support in the negatives: \\[\\begin{align} F(x) &= \\int_{0}^{x} \\lambda e^{-\\lambda t} dt \\\\ &= \\lambda \\int_{0}^{x} e^{-\\lambda t} dt \\\\ &= \\lambda (\\frac{1}{-\\lambda}e^{-\\lambda t} |_{0}^{x}) \\\\ &= \\lambda(\\frac{1}{\\lambda} - \\frac{1}{\\lambda} e^{-\\lambda x}) \\\\ &= 1 - e^{-\\lambda x}. \\end{align}\\] Then we just evaluate it for a time of 1 hour: \\[F(1 \\text{ hour}) = 1 - e^{-\\frac{1 \\text{ call}}{3.2 \\text{ hours}} \\cdot 1 \\text{ hour}}= 1 - e^{-\\frac{1 \\text{ call}}{3.2 \\text{ hours}}} \\approx 0.268\\] So we have about a 27% chance of missing a call if we’re gone for an hour. Exercise 18.10 (Gamma intuition 1) The gamma distribution is a continuous distribution characterized by two parameters, \\(\\alpha\\) and \\(\\beta\\), both greater than 0. These parameters afford the distribution a broad range of shapes, leading to it being commonly referred to as a family of distributions. Given its support over the positive real numbers, it is well suited for modeling a diverse range of positive-valued phenomena. The exponential distribution is actually just a particular form of the gamma distribution. What are the values of \\(\\alpha\\) and \\(\\beta\\)? Copy the code from our beta distribution Shiny app and modify it to simulate the gamma distribution. Then get it to show the exponential. Solution. Let’s start by taking a look at the pdfs of the two distributions side by side: \\[\\frac{\\beta^\\alpha}{\\Gamma(\\alpha)} x^{\\alpha - 1}e^{-\\beta x} = \\lambda e^{-\\lambda x}\\] The \\(x^{\\alpha - 1}\\) term is not found anywhere in the pdf of the exponential so we need to eliminate it by setting \\(\\alpha = 1\\). 
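The effect of setting \\(\\alpha = 1\\) can also be seen numerically (a small sketch with an arbitrarily chosen rate, not part of the original solution): with the shape fixed at 1, dgamma() returns the same values as dexp() for any rate.

```r
# Compare the gamma density with shape = 1 to the exponential density (sketch only):
x      <- seq(0, 5, by = 0.5)
lambda <- 2  # arbitrary rate, chosen only for illustration
dgamma(x, shape = 1, rate = lambda)
dexp(x, rate = lambda)
```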
This also makes the fraction evaluate to \\(\\frac{\\beta^1}{\\Gamma(1)} = \\beta\\), which leaves us with \\[\\beta \\cdot e^{-\\beta x}\\] Now we can see that \\(\\beta = \\lambda\\) and \\(\\alpha = 1\\). # Install and load necessary packages install.packages(c("shiny", "ggplot2")) library(shiny) library(ggplot2) # The Shiny App ui <- fluidPage( titlePanel("Gamma Distribution Viewer"), sidebarLayout( sidebarPanel( sliderInput("shape", "Shape (α):", min = 0.1, max = 10, value = 2, step = 0.1), sliderInput("scale", "Scale (β):", min = 0.1, max = 10, value = 2, step = 0.1) ), mainPanel( plotOutput("gammaPlot") ) ) ) server <- function(input, output) { output$gammaPlot <- renderPlot({ x <- seq(0, 25, by = 0.1) y <- dgamma(x, shape = input$shape, scale = input$scale) ggplot(data.frame(x = x, y = y), aes(x = x, y = y)) + geom_line() + labs(x = "Value", y = "Density") + theme_minimal() }) } shinyApp(ui = ui, server = server) "],["404.html", "Page not found", " Page not found The page you requested cannot be found (perhaps it was moved or renamed). You may want to try searching to find the page's new location, or use the table of contents to find the page you are looking for. "]] From d952bef8bfc6d1db31187606bbf052d6dd786cac Mon Sep 17 00:00:00 2001 From: LeonHvastja Date: Sun, 1 Oct 2023 00:34:15 +0200 Subject: [PATCH 5/7] built files --- docs/404.html | 2 +- docs/A1.html | 250 ++++++++--------- docs/ard.html | 2 +- docs/bi.html | 2 +- .../figure-html/unnamed-chunk-297-1.png | Bin 0 -> 109778 bytes .../figure-html/unnamed-chunk-299-1.png | Bin 0 -> 92696 bytes docs/boot.html | 2 +- docs/ci.html | 2 +- docs/condprob.html | 2 +- docs/crv.html | 2 +- docs/distributions-intutition.html | 256 +++++++++--------- docs/distributions.html | 2 +- docs/eb.html | 2 +- docs/ev.html | 2 +- docs/index.html | 4 +- docs/integ.html | 2 +- docs/introduction.html | 2 +- docs/lt.html | 2 +- docs/ml.html | 2 +- docs/mrv.html | 2 +- docs/mrvs.html | 2 +- docs/nhst.html | 2 +- docs/reference-keys.txt | 19 +- docs/references.html | 2 +- docs/rvs.html | 2 +- docs/search_index.json | 2 +- docs/uprobspaces.html | 2 +- 27 files changed, 279 insertions(+), 292 deletions(-) create mode 100644 docs/bookdown-pou_files/figure-html/unnamed-chunk-297-1.png create mode 100644 docs/bookdown-pou_files/figure-html/unnamed-chunk-299-1.png diff --git a/docs/404.html b/docs/404.html index 47877bf..fd15a7d 100644 --- a/docs/404.html +++ b/docs/404.html @@ -23,7 +23,7 @@ - + diff --git a/docs/A1.html b/docs/A1.html index 9ee3176..f3b6fd8 100644 --- a/docs/A1.html +++ b/docs/A1.html @@ -23,7 +23,7 @@ - + @@ -357,48 +357,48 @@

[docs/A1.html hunks: re-rendered code listings for sections A.4.1 Variables and types through A.5.1 Writing functions of the R appendix. Each R code listing is removed and re-added with identical visible text; the differences are confined to the regenerated HTML markup, which is not preserved in this extract, so the hunks reduce to duplicated listings with stray "-"/"+" markers. The visible text of the appendix is unchanged by this commit, and the same content appears above in the docs/search_index.json hunk.]
diff --git a/docs/ard.html b/docs/ard.html
index 385143d..82eb67b 100644
--- a/docs/ard.html
+++ b/docs/ard.html
@@ -23,7 +23,7 @@
[single-line change; content not recoverable from this extract]
diff --git a/docs/bi.html b/docs/bi.html
index a1b79a0..a164a4e 100644
--- a/docs/bi.html
+++ b/docs/bi.html
@@ -23,7 +23,7 @@
[single-line change; content not recoverable from this extract]
diff --git a/docs/bookdown-pou_files/figure-html/unnamed-chunk-297-1.png b/docs/bookdown-pou_files/figure-html/unnamed-chunk-297-1.png
new file mode 100644
index 0000000000000000000000000000000000000000..8b9d4fa0d53b00ff914aa48330a50f7cc88f57a2
GIT binary patch
literal 109778
[base85-encoded binary data for the new figure PNG; the extract is truncated partway through this blob]
zb+Vo;Ch(k9M0O>xlncmd;vK8?Oph&m0xfl^(AL7qh(Sieh;hbS2v0~L@b@` zt&Y{$=&MSHHmaWy@;a@@x_K{lU@krGaWx%(75|Gi!`J5n626s;Tpr|-yqkFjY6WtK zRCBGYuMQXE0n6E`4f*{MI#DsY-n6Wb?$Mc;x)8E`d1NG`DcrtHE?={N2*Ap%#Zgv; zVBa#z7UZF`?srGj3|g@4CO8{CvgL}cVFqFa{cW3VD+9et$Jdo~MW_R8xr>))Y~AlR zCPC7Z$*>!{AZ$?iccS`bw*Wz^m&;GeUo11Z(1nC++hmEd2$RMfdZ0V3P9~Y95&Swe zMvvA%T67U{P%RTAf2EqSBnES0D`e|oFYTs7zkpCgeX;28rifOy@U@Pe*PAlvM-?m7 ztHS%OU>vFAdnC^dCDFuZtMU2sT~jW)N-`{63H^SH`JHwv|D~NDA=knQ zPrf-cDKT@jAr6=mrnAcnc1XVFI+gr2KOA+&M#(i}Z>?GOD5CG9c`wX4O$=lZEu-Gv z3ceB8WkW$xP3gifc`SB7cl^S%-qjx!Tk_B9)aay5U>cp_`R41wwl#doG5` z1*}`6_e}ugo9ORW*x%@~%%N=xT&MS9(%Uq`PdYD&+uCuRCzr*4Y>bxI{F~|A5&-Xi z$cJ6Ek{pf0A}Kj5#2h=w8&y&?yx)gY{c2_EOCo(DJnEXN!b<8ZhMP(Gtny}LcqBnj ze7?ryXCOKYSia52rx(diCW8k>%KzDt=PbZ!ThuK!6!oL3bY+GCYqsNtMXXv<;r)AVEI5s%twPpI_pH6+L{QnHn&@-4zb-r1k*7Bx(hhv zZ})~q^9j4|UhH-UUC2BnaLuaME~lpp5x0ILb}a|&{7clTK^L^pxr&< z7U%co{?2nFz*eLpQzsqr2y&t*_EMBW=Dv${V{jr8g!9yXNgDSnN8d>xn)rQm#FHs8 zhrce@w+#!>7$3-dbOX78=0HEkl)VO#kU~+Z0em^iG~&m14~Qj7rd)how!-qf!;m>q zjAG#Z9dRSJI`o<>KvDvwu5>T?S^(#Ca9`mVIHC2&*UyA2< zbgHa3i78}JSVE=OPeq#lkE^c?tE$`D1_Y%+Iwd5Ol3k!?c)w!LecY7@WLWTaw8)Dch6Y3FWZ6ubtdTKGe(i86* z(4gZ{+q~RUvm%c{gwvCgyWQd8s7vO3ANh4(x`VYmWemLdfk2%8(>jp+?<4rrh4H~% z>ApZwx&Wt<^5X%Z>54luC1jszu--$tusK^Z2U~{1`^V;?ez=wF=znzvc`?ETNm)u5 zbMiUTNcvvg=qdZIg^l77c;cPm)UG^o&y3)%@A8%`{Uk;U&|*uyi&M{HzE?DfGXMM0 zjX(X+|ESh!wd=HxzM_Zrd})SgAcODI7yCI$-|N$ySp*MwsL&HIm=bO~lld#{p{jr{ z-}U^*V7F`lC))o|jPKx7(QzbGaF_CY!IZI8n}rBo6cg(FlXS}^ZF}l_#(`w*$pukI z85tq_%6+3eRv=!YiQUVaT5E}+yw-O=jHdG!`S|;5L#VJGkW;OKjYv}{NtOg}=Mi7Ztj34u&yO4z*Z^cjmSu%sWI^zl@crYM%b^nS>Q^zz zH0{fd_cbU~uhS8I6suqph?i`S33>n5zrig_UXFRG6c-67Sx}aWf#Ma}jo{|*p3biz zY_)PAa(MrHEZ;kTQopX7*$CrlVB*GlKthGEpFQ?7^mb z4}zIr{1;5ke+^!TPhka^b22N2rRdg`JC9-3-Yi)BJSOQv$9$6z0A!uHY;;(6RnZ-m zS@9qd9F?pPX|x4m4C8jH^26-ED+UHm8{UL7fJvwJ18D2A3mU<2h+sGVo5O`3zmz>v0%iO%DMNr4*S^i}g%gk(Dwqv^jg}-pEL8{#(EkqXWl6ER4@= zAg9QG6cpDAaQ-i_uD?3M*unIQa(aj0{gdfAzs1UC zA#EuEwW%RqvHEfBv^J`-}eCRHiL- zU(ma5N4U#$$;ie;{P*b;4{ve~)ADeCvsM0eUq!ZNqhNwYF;i}9Yin|@$l%N~m*~Ub z-$#H%OaAtNM&~%cQDK4(_%~!PQysK~xw?QGBZLhLl@Rr6p|2s2D-44cI|De&_kCEI zNPp?BT`hX8;i?L>Zal^~FM}wd{_Eo82+-$=mv|(58~-)&g)-%T7Yr;ci!Y6Gfl?{~ zSk|%Y59IF#@_2Xec_CViO3CGqen!WWH z*?*RwCN}01j5w{+;+Lku$7Zg3zid1+uoN!D#V)_9s<9~M!GnsccXOhB*KhMxSg`hm zQ69EOQ5J)bqWxSd92vK8t<&ewpwOa!f1h&N7|cg6HefP1?TOGxxMP5RE#c7*K=hcK z52d<-c~WCV%AbpXcImWG=uSp(tNZgkN%rAm=uOuX4`>pOs`uuRNZp${*)B|6c$)e6zdL*B7q7G)ah`T z^pa(h7_3V`e?_pir7hjXvjAqO)Y{FkUiTZ`=YFg)PFVW1k~dy6R^q^RLUvd*84v!L!U*ukg!pecy?19`*Z%JJ@$$aGM9p3l^pAP= z-{m?l_st31`ZKeN!S$bMV*pYSe=hY=?b>mnDKUvjAMzy>l`uCl&)&Sc5Eul)yiPZB zP(Qy;VsK)k@A)Hr45T6lucPZ}^60{*?^8l*S@?I4h=mk0+0`e|*aDTF$u;~qWeLZbqF-~;)eBQBUfU{Z~ylLZaoLKvROXzJ0RWlMyu57Jf* zv|&V^dCRvwIY9_||L;Xkt^@-sg_aEfz3}*gUX21dLT~HE3xJtHE;f1`zW}7MGWw_s zq@YdQfUmhR9nJOkr;Dkf<+9b1A9c`33o#+=b^2r!bCWVnoh5w9lSE;iemrrhk922j zb-8AKt2s-y*8-bd0y%$!1cZMfk?=TE9Yw&q1_MNTbg_k%i-!m?h~gW1ocH-|4v@`+ zZpGd%d?XSE;>6OLK@}aX%D6QLch*rpn6<0CNc4BY?NP1E=iz z#)vS4IEJQeY-F~spxSL4QyqeX*_l&div&)C*=i)R4cK7ValxPelvn^xy*}I7y1zLn z1ek{kkRT?gYDlE$HGz}PbBOqG^`ql3InsqPIjiB1Icrt_&PdJNOfl@OvoTg8O!_i9_^G7!L8!Jq?fx z5w*4KbNl2OHR3=mD-ni8z3V&rf^HFv_=1;Sj(XX4d;D6t)=IfkjJLRx;S{JIJ~sfx zY6Z*C%5UNOEOQ`Qk*xs2v^Bo72=YQCdd42IB>io>w;fDXE2H~1 z1yQG`v~5X6Z2NnxU@01H)r+_7u}|^TV-u>E(D=!JDm^SyTd--8N`)S7G01QtM{s+? 
z-}Ah$*+=>ki^-$@X#HT|o5h#G2dA}$Q8%#AH1MRZ3MKM}cUWxJZgDPqdVaZDxkv2K z5NP1JhRa;M@u8F?kv!_EM~;*&oF>L2U+i_)5c)D_ z8fOaNqExla)qo8=gg?DeY7=UYK3snaPJ2IQFI8e`w7&rd z(r~JU3hL)i(ChD(-(!qNM+#GKT@@#-vHOx>J?wuy7*$mi$UhgjI~}8Hk;HzPWV>!5 zb_m9b$&IqbndN<;lO>IQ109b_H2Bf3*Z~EPAsg)T@sU&?r7KuWyIwrM^1WD)0)iTv zR*UJ!Zr!BrML0;bikW%PVO^m*Rgt|cfe}mM-u4}Rzo0Av1{n~M^(KtHnnvSzs`3Us z80^u*NsPKhAT+XLF$<{9Yd-xdWC0V;o}rQQCJ=l`$p-=^_`4+ zMyP2i7z}KD(pBF==<#P5zLf|dlXd#e>ql;0cN+5Z1E-yeBZ|F_m}HCBB*ju&-?;Rs zZNLk8L7D`sB3En2k^!0Rs~jq>j{`9jCW{&`zKs@6SxeWt5y##VFxA>WFGfC!d$QDi z+j%#@Hgw9Lp(sevv`MLowpIE3!TPLO)`>pAo7*9%JZpJ4M+~9_?=J~GT*VcZcWZ{g z_QW{?$_?p|`VGFAHk95v-WMr7dVrlZf-wZTmdWT3LPMPHRIy4{4Eer zDqJsTE^pTy22vv&YOW$G@LSPOyA0a(D%17|eLl|QFekBGr&rmXJj?@}Sb5_^MC1vF zg+AG}&uu-W!Q5;1DamGr%hctYj}@OICHpJ z9Pl_Y*HsZ(gITCo?D0`v_nj_^Ei=jHg37>8U$4SA2V6g}i;0CIF#vbyns0(6<`HPd z#M${actMYnZ{!^`@y!3$i6a^{3LilJ)f)*PVMU3^)VT09mf9nH4+`@89R70li|+#UFr_%b&izSHT~Y4_Tjt9Q(F+x>3LNvBps zZg$xg^U)d-%03O@FzJQ$0;3cmRRb^GS9O>fEW!1h$y5%s6$J(5UY9eYSfUV$kc>n# z6l9nh5QZ=V;E;CFYlqQ|$z1wts*oPKl69qecm4=-H~%%UwX23NFp8GF?PitCVxF}Z zq-Dr*Yd!H!TuiX>N_odh={JxKmi45c-SfrB$?>X;&P;8=^>~boP&H^1yw7LojP7} zohL=I=O%nsYUdcVZxp#dTB)=tcOA?*-G!wqWAD>V5I_Mo4+TIK^Hp7tcyBx1OQRP* zXwbxLtSU@%=ZR=|<~duPXN$Z~(A+eQ7LWw94mBFN9xmD*f>kMo{H%BLX8~wDL)MQ@ zKSZ;E1uzG`w~nmRD=0}6 zjY$?4`+j4X*MIOm6C|@!LSffaEff4Oi*q_lm6auPWc(*->T3A_rujI~*QL1STa22{ zPZ65w;tezhiB2`G1F;OZ&*a$__{lqA3?WmM%sTc@p6@x*qevn&)3Ef9%uXLqP zCG|PMwBOVR=clK?{S3=npvxjo@syz%6f)ru?7j{yuOq8ir1{1Zd&ytot+kHN@a5N& z{Fox$(2o=GJkEO8Gx3PqhQ6JRnFB)6j5;KrAeM#?VOq?DF56-ESFf@5SIm?Y=ZS)E z^yGpUJ+odsk5`bp{=h1wm*47G6&h{`jHov1$z4&a-mzA_ajM5z{bON|wgXdK9!Mph z8DgZ@zL5Cc*6!C7`dMwsZDD4IS!IZI3W|Zl-O=0KFCdPk4rQEjmnW_o=}4tUKNSN;V5!5 z?a=Q-YSm7 z+n)&W(|$1f`Q7L-Dna&q2x&T=stnw%h`sjW0&>URabsfIr{C3rx6M>+W>M)dOYNPO zKPf(odDAr;KhJ1rz$g9bWkPhtu(*G9cVY1~g;F92m5`TQ$i1aZ(P7I8V}s^b%Y}sZ z+d^kMm_$aGK387$%C@o~aG$vx^u)K#h=kj&Xxh(~{CBZqg3!)!L{6At+veL1gv9I+ zk3b9{0kyzW#Z2d+;upGIPTHlvUhOs#LXP~-^H3*vd-$aq(X#CV`tD+qNqtMN%%Als zsm9i2-5iNomh+P-?#$F!_Us%0U98erxmFcr=9Vc@zd`YQI(tADm2~9$Qrlov;vm~e zmi|N1{^U0<8}(VFh>d`!p#NRho+Ysh%ivOhg-<+%mVNbe=Qj(j)?r=yW2GG(XO&jz z4QIf)C`xc!!f~NnZrmcUK6MaJ{M`5hSH}bH)_4H7VPRScVKK_GlP87S@IzwPJNm)3 zdz0mP+2oh%@C~fby|dnYkm{04QQVq4Fi*68FB{dI+ShRELi@Dzcf027*C{1c7k2f% z+5OvrC=K_!XX}-W9?=r7hxP9|D{~S&BTysaF)|{2tAZdsh1;R|Pq;LVbYGi%oFX_@QV5fXEMK6^8%fGd3SjXS`iM@(Q_VCG2 zp22J+%AL&A+e4mzRb5Kl`lrS$3dCuEynckQ*+x);B2a(7eu-&a4;-N!U3 z!SD`ro;U<0=+n@((A4g2;$B7|&T1})`w_(wX>7c*k-!WjSO_QoXk+HHq=-6g%WcgLQ!9pKf!?0v2nZ-fo4Ni?u%1M!iHyryJd~Rjxp1QoNkG zO=zx6?8&`e&61Z_yI`9N%6xmKV4|$Ydlc5wJ$aKKR9g!!aaQ-VqxRXMo&&V__Puf9 zj5?UTvuLcE2^1{|(nKOH{`={}Fc+2BI17BC_nvzOFDVES8{Wge;?88c`IZWL%anS3|&tL7; zPwQatik+FS^48B!;TO_;NLT^3Los>M5kXolE&nA3-{(5hD+@Nr(E^jCDu>i;nQ+7`YX6n;i&d=?=S|-IBDN zyH1-9MIYn_mwLLC9Wy>E&OS0C=R zCOCNDf3=2R$5$Ww0FV`pM&pU^aDJVuDZB6}JF`{g`EVWbJ;nfq@u^CnQ*op)F{y;Ws9xk^PMRD>qY{o^#FU>tC;V7&l z)`*xZ)deI}k-d)m11Sn&8Abo}K4{S-3}}FP3W5lMu?n?>4nIM+#mjAaH3z*i>OH-d z%jSL%61FH_@ygvicSTX6&?sFIHr)~hK0__4PR?XHkc%I4|I7$|d;Gf>;te#WqG+!- z1^W-bf{+)VDdSa2HjC$Fi>(uFW|pKzbIOGZ8K(|NbLG1AOc~GH=4-ltH+!C_EfD!u z$;8vVTn80Yttv_fC@$GX_bdSHUe0DYn75JXJ=hvjdX&B&fMV!W11Z++=61AaO~qy? 
zwWUpbJj`BboZ-FSXw!G=U{yHP;q&+Ypku~dbp!L#&#=Bv&jEG&gyw7z=a`R$QP-kR z_X~saYS&^cY24%J0E!boLT%#%*UeYBagD-vldjCWrN3(YRGoGU3#>EDBYCW${xB6-hfYDZoUeYv(P{6B}=mgN7^kPL!tQ%x?{EBBjNlC`x zQ~JGEdeubUQU<_@_0|``7AR+c6~=3~s{2$Q;(QKVxOtJyQ+h7)A5D8b+PlOYN>Pa8 zPHchr+!XyD_>Z4C<)5h4x83bI$_4>gR~&b>>h}x?%((-YfQqAwrg5}6YN08CkRLxZ zxn&aTs@2(wmpIaWcUlCPtqjpy({)w``3}@F>+RB`I6^I3;lw>&;o(h=@4^lk4BZA~ zd!DRM?&r1fZo~GCo`hoPmP1S25p_-dMAzWTzYMdm9eg;07FA2n041?x+v}d5K2MxE zyG7zUhYZNe&+v{*nJ(sDich3T7~p^#F6s0>p=_8zySen_suk`V`qOlj(`lBtXT+@I zhLZTdT2CilhOj~(67Yr&%JeQe=`EY@PFF!xI5qn%4u*j`skqWjpf)*TtMbze;w?&h z(C`<4iuS2DQgt^a@hcu2&%N0?sIwXpV^hvqF(=dz$9J(3vxCCitVCGem+(?7%jY<3 zz}`0iJR~ufja2`3mwsiH`yI!RzWAVrZ=*Pgk1&-}dl&z5C5HuEvA8!Fxpys(%%|%D-dX-+?P3L`h z50uP(^sbkOD?F3nd1S3Zahz$CHbR~VWI|rh47FyEE8z$&8aLMmw_q+(~vKH zJtv!_6rl`v=txUTYsZ(`R_;wIVk^z%wSsoMDEDk$unsCAYY?^(;uQsz89JiWMzp9$iGZ(1zUXrz-;&nSqI$0nAmsWO zNqTQNFlZtewHe>TkN5D58>W z2BChG?U(n!f>Tm8I-O=&${r3eb=w|)S{lzTTyimBNMwhQMZB0K+0Ad7F%8vWa`*#j z+i`{ho+;LV<|joW`{2_>TviT|o~n_LAW`Z5vsvup;h2-)xtC}Q+g`DcmD4ih-jnC8 zC`&U>ylE$yBkQeu%82s10qRoRIt6({+FwpE&WycGfIue7jHU>_~&9# zLf<5s-A76Sjp2uAn9w#C?ke+WBSpFMY8Qx1J5815OK)+>Uq8EeNM>xS{qC->?%M`D z<#+}>fDPufi`cQqMAXi31&-a}QYh`rBFQq#_sg;DxMy`C?M8LNW2LpKzKsABGZ6F? zUyA(8>p{)&I2VR8@P>UtKuD0@Zt0@Tdb@c_ZEYX)@`_Z`&_b1HFk~jZ|Sx zXx84aGfXRVOugEv0R)#i*Nc~;AzB?rn3q7roV7j>YpkUmg+`lh#u?Wo<7W&L6@>F= zs}87vNA|sLw7yLkBf>vJO+Sja-sX9HHvfJwuFH;mnIVu}8%>a{A4bP)(1$XMQ25L{ zc83;z0%m(x{r$`zwqOaK`!<8#pj-8fS4fm(kyIXCd8=0@y^UqI>+y3g+)ox6JW^}G zy7%H_ow=V2Cu;~6KiJuJRp|~2qfr~Mn@-c<*#^>DY!sY4LsnsD07sYv>yN-Z%Qv!7 zS*4+?5;3CRzyGmr!@#;@D_uRR5oZrUb*`^Oict`>N>nWV8MfxmcU8*^8(}1iDGdW_ zdI;k^nd?I7y=;eZ_$}nVP4xp@^|N6Z%74;E1N;$tlq_%;`X~BEM6s9o%k=CsqdKj` z)5$(XtpHHetC=4 z2yyvTczN&59dkIzy;D?PvcE#r&1!A7%oCk~Ta&4`Qs`vyH)Rv6Nfew$ z7UL6)f2c;7Ks79*G&D`Q4N)bM@}Y_T5ofGr$bO60831wUe)D5~a|bXAO^Rbu17aNU z=q3-4$6q-7!_{Iu)jFmoyg7dv7{!Dir(BgA8ssiH?$}Ck{R_o~DQKFv9jAsWIsp0o zr^iYTDXN946lB0sLwgRlwku1k5HpZ3!D<$Dk^2u#7tn^tZZrO&0lAT3`nOGIgBS0{ zi^R^)A6iHoozUfzTC$s%%QXflR=9&Q{NF7$%wukdkn%X^xw6M*(lA*8TZVo%_W#9W zzY|_huzu?(kUqIc7=R{FtAWvo7O<@}@&q0=CkEXGpF zs{o6d?|*4j@Q_Y-rlK#qQTBjidys+#pYNBI5j$ojVrBORkl&WpmwU*=*A_>^{F4P> zgFX!SFDy-AAR7K7`2hs3eF2ta&r(VByoLjM(i( zQ_gl$RLk$fItt_!6ta#RPeZX@e=5);LHth|1JrsU0*AyA;Edogh%%D@8Xp2ow@x0n z?HEF(uk2^4gL&Qe<$w^VU|>KFBlmp|^q5}ug)^CPrcB|A(UqK=G#f8k@y*X2H+t~P zMI#;%K_kXWWF}jK-$aLFB9B>Yt6&UH=51d*Zx3+*b-%`?2=nH@FCW7Xhr|-#Tq>mL zW%J@8-e*0<(#)Wa*$U8QJhfm*gJA+6!$vd#i%F224Rg)xKRhy3BFqL#?QHSuYpt|E zuS5wVnfn@|H#`8Y|K>HQ`e=% zFvH)8Sc-Z{vM(3`-f_?-#Xd+6+$!6v2X9iU^fRs^;!N2)dzDV07o+7t{rwd2$4@I)9n&W36UsN>hcOpVl8nL&$Jhw0x0XkqfVaCAaO8fz}5N)OYt$CuIq2^vEz zn@)A)egS_t=-C}>CXjHtI83Ipc~Ks!{%-Sc7xVX58A_-Oz}#U1l8EICmh97#Z=S-9 z9FB|p|Nc@yhbX`;&z3yR_ox~cV>lpv6QDBUWb&($FId7>yL~^mIKWIC=r6rM zl=%C6|Dil3!K1S3X24ICBJ)k)bePx5UIt-J`He0=G0UuetfQn8kpBIbqF9(W=dqV2 zRi35J)VU~w7&Ai$eCCsQI$s9HtKZ3`O=kx#afJAZkQ?=AIeos6D1b;s$6k5_I*w1a z)ULp8r;Q>1>&oOWp~~F*rH4B*C$|eIZ4P{>TcBP}M4+jrle40;PfVnR2Ic_Nuw~Z! 
zI|BZ2424+jv+2JLJb?dMTsW1RYVUh3P&cJQ!+8IY;1QP=h@i**0@k20*wnak)Q=^4M2Mdj1H&kX^sUD^%6R< z7xb?{TopvGk#(Y2H56LBVu>pC#!-g}hDt&qJ8K6jvp@ZP%viD20WYo@oroj11_`X{*qCa?n&Q!3gbv?J-IyyVb+n0C*j znjbLRf${K9O(~5M=%pm<%h#uUZ)AM#CFD*ToMP@PMcYCbUeKHUkJ1$bc1(aA_Bj3d zDpDIDouRp|UJI4RkB%_BeK_r=mDY7^h3#IDzA_Sq)96n4yE1bS-zeko8Vkm-GHuCA zR+;29A(&(th)Xb+V>TN-?g0V*#H9w^pG&6L6@ zVbCo-8wSo~q!37&iw)9m#$vL5ir6=85P-n=7UbIs_t_YcgsX?$45W0NiMj^GpOuce z*)R5;+DTz05X@QNki%c>cj?C`H#alTFh6_}_YI?IrD6_ak^CH{tEPu30{>J`7-#-&S^)a_O_*P+u>zz3f#EIC>@XfCYHAH1D#&EoA(FT4#d*iWOXyTgEl zOYCsmj7U+QKQH8eXzno%Rqw3Uf`=!EfkhOCfx`?CMnb}ohrRBl&FTT?G?V+hQz(Ps zGoSB+)b$ntZ9oZXKVRUI5016QR3{Sf?_uD>rN7A~Cp=^9oms+#LmocW4V1;jKJ1$2 z-obyEH_Rcs|J6qiF{y*V)KRS8u$s9M^=y|5XThrodWtHg*)<8OlaiMmBuGOxbGH;nb zJK1kbp&Mb_4+h@54RR{LAWnZU;1A?++;bk9%qR z8=O|n0G_eMQuOz>#Uc&VIaFpbBR?kP7H)>Oyi=Z<2}XH@b92BvTCN|f%|-%GDV}XH zU?uaj{yVF^GQE=HK?^07zH#p99zAD*$}(IuT@{TfhoSd7Kq8YLZC=uBp?Yw?DJ2vK(jIQYhk@ck{e0RKW5#`nmp<4ogZuHH+lzoL zt=%c5rA86=`xl*@-s@^AHsigC*WWPQoo3eV;W*H(epcM`ok#2qi|krT#^M{Ingnh? zLX-a}9{8V+0hR<49HVpK1K7U}wSl8c=v?Uj2;&MMi45{Dh3bxgQ;`}&;56oT&(Z(n z2Et*bb%Ak+@^@*-ew%mq_u>~~GP+hjrh6eAU%w$^P*tBbfwOO-zf#wHRFEfl`KqY< zw*KbuM@xZeD?>*p^vZ#hSeHO!6TwX+)(EdruD3*3PZdBX{jwZw!dn>{w)lfwhw$bMq)091W-Hl_QXDy5O6|y34z}M7zx?=Au#B)-k zRu>dVd`UR#V%k9|>18aM_kXzSLf(FmOJU8=W~m(mYMf8cHb4>*od*f8M$$LX%nU#S zEwm|;Tg>2hA7PvTJndSjSPf!!^6HH>qTS_$xkXFunLnq2lli0!kZjtxW|g#*8am2_ zqMtu4AgH-o?^+LmoZ~*(8M|@uK%2v8?bUMqp;0__2=q>wZWF~kE@Kw~XOBY7?6K?L zSF{r8b$c;lJR{-SN@Hf8dUKM}S=J3iC<6eIz~icl#fOU}&(5-k+Mi4lS*;gD?d{8E zfLK)?lo?~)fQ+`XgWHhq(L<*w%^d|u8&!+nPMPf~?f5F{mj~gpM7tV^t1U(UZI?QV zr)dlIIHadH#TDeEup!J?mlx(~C|yL4E+rUycH8BU;UQVyzMubW z{WczZj>4an@s%?)kqmD@%6|kH@7bXH0H_*g&&Q!H*17ef<)9aYUQVT#Q(F`77jq2TI^P1x)ra-6s&K)S zSH|yrKrE06899&%ALEa_L6RbMOK{|wf0K>RvDv^ls~{%CdE59q?lsZduv^}yGsh>7&j35$n? z(O&ULxZ0V^P3U*dR)$;Oa~C)8WV9Nybw4`aov^yG5emEUt~+?O(=RE!kkq{T5ltaq zZW%qR%sBGnR(r%S-)phvnS(Dle_my2a@ZZ^?XN9654>ms_0{yvXYM)-fy`u@LwaEtmLUuVFsG4wk$eL zdi`g9m&Qx>`cv6Eb`U+_{cZ&zzCsU;k%VkrM4}!bh>Q4Y=;2mTi2nyzTx;OyE8!aG zCi7>2<~bd#m+!HknzmhQWKiX*lK##b{# ze(OTg4aa@-#F^opJK#t?#yQ>`L=<+V8=(cGiYHPGNHut=cI@=;-*sqNyQpc+T( z2Liu6-8Ya4yH-=%0l76+t6}mPuN~f@xTe6w#J+MT$r@b!!ptcS+|NaD2R+ZDPcKw( z-$IF&&+QP2N}699`Gq=HURTq5_r?G$G}bP)hFy`6gouM3g12foWsA8_m8L6)l9RzhC%7>zf+n>13{B zc2bf_(|WfoE{jm6=|*QzzwYEe zE+V**=s@x9F%9o(2{~H{IkU<8TSdG>e1mhN5a?2I zUNUep72RI=Khz%(RUjGLIbb~`c$Lg@f6A5A<=hCq-PUY7W`yU6Rni(Kmn)-yo{Tqp zKUSf!J-+ombwp}YTOognD!EMl8BK?fJ}&uXSD-h)LGl7>urm1suha%eII0?dw8@-+rOh-YZ!C>UH)Q$~@KZ$4cmd(Z-) z>x7!iLvyuCza}H3Lu0L+riT}dZMZXCsd$zC4baW)VrTc;_YeE-Z6_7@Ei}K>kTn9j ztYdqS!Gj_XgOBV|_gkMa+TycgHfTHMZ;Yg~y-`~78%E9>gU>!^Lq~uMqgd8}IIPI%(Z(F!fi{=VSj*5Wgl0hW7j`Grxb zD55o8@}XoFF%XTj;XcJt0;d^aQ9hN60ds7=sbiPvcfI8dB#3=^GV!;=n2?~Ui$*~h zCGBx)!B5UQiA4~YiHk?bO8&%`2;lz1z$MZc(aahx2UfLum~Q`iP}eC!9;`r)h(tyiEd%r?2yjSYy5t$5rzreA*qu}ff{&3b8zt-RBXT(geLh~MMXqeh? 
zPe`dIubFv&6@S-5cRYL-d<&}h+twEE3nmZL`p<(DtpAprF|u;i=jtqNfgB&HMR+8SQ| z=6n}p6hY_Xf~#(0gEVI&pcX(!TSu<(T3J1$73|s0Q~K^!k zen%(|={j6HJo>tvueI%K3oUYMC2%WoC8G}Cu`bs~`%r}G2T&sd*Pa>93-an7zL%2A!*8H!6rP?*#B z&z78HMbmI7N$&Ud>fxsCfj%5g4Yw@Se1Q>QALliWwxz0d&Gc%p5ZQrY0fSW`gBZ(O z0ym1OYEVsoHK{8qNBVGFH*Y^8{-vf1W&(?j=MEpuFF3qesVzTzcsRpw;ew+yFhp@U zw|he6#QFw7EHS|bmSp{l^((u0W)K~_NzXOWq#BRO!9ZjpvXRN>UhbGtJegx`J&?;` z7r#@{G+%;HOsD2XHeRhRCb&7P#*_LQb`~HshEL0lDb$d7T{nmIRK@o_B5<>b9#QyU zi#xa>!b5kI9QeqnEmI4DXQh6xZ}MD>mT6*@Nbai_;H5#=bE_b|55|mbMG?@fFNRT; zXhe8gNz(B7$W?9D>3!-M_wNaWV=o{Mn+DHh1j*`Id{c9$1liXZ*spIH^E6j|IZGnU zGw6>#KGS^NXSYUfOF5?m3qgJ{)ueD!-1?kENcRVUJ@&Jyw>BbE!#c`y?1esK_iX(I zaww*(Ax$DcX`o#HphwM*PuHqG zQaHeVGyzz5gM*fwK9RM;$(SaxR3n)(SCp5=1B!0NL+6Vl7wHyFZHD-MCc;e#MUG5av&!(#47nwl&Uu<1#h~E??*2J>4e)aB-&4ueHTm+tkm#B z^q`eRoZPqhhk%|g^22zpYl0)Yfb+ah!*FV7i|Gv)GTlF}u_eE`zt==4Vw8-}BgH>B zByiwHxuB5x@g;Fh6twZvK>-uiYQ8fLU-xb~1OX}C8f+=HmfsXvvX-vP&Ud|6Kj$&6 z9ZoZ0tu-#Hz__YLxZZ^Mb#EM5ZQ=6-ht9%rP1NcaM@VKr9jlsPtF1oSxV?^fUp+tI zELpuyt%SPr>9jaf;<`Ks>a8@q8FNq@uZw?h>pEde1#h^K5-Dn{x6ue5K>}4k6T*w|1u*A5I zmLF=Bw-K?aC!VFU%Oqws4c_d3RUAL&VMryBV~pf@3<>e$20cVg)5XrLx;9ZO^N)Mw z2i;0IHRkQ-AGvdjcKXdtbjM`~P@(TL9Aff@eV1t3rL^F!wAxDSsflUI9EJ}p0`btI zpt>UO)5$C~xW%5&;sg3K{f-{;u-(jrdR4ImuJ^X&l2R{GCJL^m6)VGdnJoH?%NK~Xw%5S04CVt)q#W;=n(Cvh0yeT z%AxyKSM;?t3FLH5srp?%*<)7kfYep#)D30|@Rh)^H=~Qrw6}u0b*L#E#iMLBA66tL zA_zTwoM?>fB8g9ih|46u-GAmc@;~}WfVHuPBk@*~<=u_9aty1q9FoVmb=xBL!22WX}O~iQo^wbE&cL4K?^h=J3WTj&Ik@-$h6&;Dt+{KVt>FhHpC+ON}4t0{7DeX z#SG^DOiM5xsBPti3%FhUOghqy9}8lUY)_8Np?6`*Ptp-ur{gvFg7s`FPjTgPX11|g zZN?KADz1;MOF=$e%eX*>yFe|LD_fZ5PN@@HO)%`R84p?nT`5PoxnO>N_?Ka4b++ab z)~Xc`!`VY)NQYYa6OuOp%%fmP0W*f+WwzC=Iu}JEy<-Dy4oV}a!?u;%y;l< zB5aC&L;VE@-620G3sb*tb!{{@gbiYNG-L zl!Pq2fO<>jqwUOTn#9m@Vg#!=_pCZDs^zcWOW?-wgtT=cakHBq`xZU*^B=V(MJQ6q zfmXLgesGX)e_9p#0Cw0|9f|vOuD0IB!-Sv7h2?!eUqUd)xTClCy@653LSE%_cb%2q z$SXgyWF=bN$PSQHCR;SzM-!p5SS(@P&YcZ>f){BQ>h1;neW2&~J-2lK#jy?&J`8fg zxuOkb(t^U5KwOi#qmsK?Vnm--Vyy#G8Tx{0>g)AoR&AQ1`YH=9%?Oz_XCTpP%IyJm zv{kmL2Z)EH#6zMI6t5N#N^K5p9gQv1m@M1Agx+$yZa~Em6qX1&7Oqr+5v>$jT}Z&kV=!MlV0- ztQ8Xvv8u^!HK!``G1!oL0_^gp@5`Z9ds}MWW;lI$k-Juny{#yb`Y~wtXU+!Ay>pVh zQ)LViwIMhTVaKZ7nrk_D`Vk*zw3H)MKRvWP+&K!Zo0GQ&?tcNSIB^&Pa z6F_?>3Ei$>+`r~R`AQ?G;(i%pOt3CYY{LzN&<)>)m@4|lK?4@tezigXEh4+M<;m>w za?7_M3AexF6@P=9>M795y8-Pkch;(>6}50^=55RxXcV`a#;ZL6Bi8zkzt#YqZ~I2y zGz!wW833C`GpcTDP-moJ-qwX2rCt@0FPz;8Upg09r6{e#kN!wJKEU^fsogo>L-+lo z4+SyOYuM6d2KJNgvp-5deX;Pp`6;)(Q*lCR`VdRFuwlbMuSqlpNkp49%kcj)S5Blk zeVTfDPh!x1{@#6z)aof8oEMtz>!9W}1EI~d@4i0(`CF6K^VJZC*zZu}jqeV}Vuc|N zwOyl}UOS^D!P!$g27c28=Fdzt$0Fo$@TI}&JfO7PwG@zf(ArTlWn_&~Kh2BY^4yuV zwKJhI#eKb1r`1`o)YXL|PiCvd0}i1(P7PS=Qn*noGGeS@Elrz;hM$~<6%l@x)%$(- zJUgQHAOdEm-wKj`Quqd|CZs>b@mz@qHCKDetAfeKn?*rP1!NZ5Z_hF&US^!X4N1$-?Ef(*j;OBd4fax&V^GifVmJ&e!0RE z#0?!sUu=%FKa`NXR;H0lUXB(r>xoHk-^twZ7ERf$?q}+vBmN02ucoG}@2|T{DaY7v z)9tWH2yH30vZgq!yh~iUChZr5EN}z5n#!lrPEu{lkA}4Am%X})y^qFK6|W9Z66MW7 zzw5u%Z@F-ES9L1D?tg#V6=F5nn8IdBv_PLRL}&O@(f?Z26gR7jj{nZ|y7fy~P4!1> z(~sBob{5yA0PDoW_xQT{2hW)h{WjpyaV=AN`e_wSNQHk#C+^gQY+$Z%7Zzbkg<(ri z%fA|FeEPRlv389w;>$F&*0;+AikHAllVNk)Na5D%`)5j80_$dpc>&cq`@^XZHAofL(gyd`R^P@^*T7Vs7k&<)}xG! 
zMEGjhvfAdBv*wFJ%FE1dEmOm=skOEsd|>OMJVbx+lNHJM+iTTs^eI?@YP+h>mBwN*(iUk z`_=JFX5-2FAmb06&Rm%;fyrqPd;K~tJ8gOEx`i8nWHKI47BF5uZ|3dH8+o{|&~lFx z)*^l;>vy}FjA$UvGGR}H>fw-O;CQ@J{y~*REf{f=qjcZm%7Tf5%$JJT9r4|MzXG3Z z$;D!6%A%6ov-3#2tZKjGU zDULuzoL0-7ro?=jTM*Z4(Bjs1K5Ls>#o%5|*f9}oc=1h#A-v9`4&&vXYTpOkJH1k5 zDfRp9)F#)h(Y&D47c%iLisAHc*dsHx(BA;F>m~BUdRO`cOG3u+_(f?KseN)~Z$=@3 zo3DTCQ5W&OyHAE9*^A$F5bWWL&wb~BzUG|eoFYi%|A=5ky~W1$mprL#)Cnu25cE`vnUji}q|6}YeqpIA#H&EDaO1cG29Q(O?%gi=coSfxF7D9GsaPTSuxi$=X_@2VfGzw(M6HI_c9Tx@AT#i{94u< zT>r~Ej&wP2gsI&{V=}&zJ3(jSBTT92#7xmE)}OZQN^|4D5^hQ@kou_?1vjvSQjLGO zk(QwJgzY>*Q>lZ=aft>-GugW1D~Ue zlJvwZ@NNmeVX>2TU7O_TKp;7baKZbii&Vv0=Nn|OS9GaXJ1^wwh(lX1)9J)zVfxv4F!HPmK$1i??$VCA$XEc(Y*s)u4C&{pPS6Uj$`xvj!DK~*rTf2U0<7ii1WR+nke~T|Cl4>vYpe)83HSxSX{9~?k zU$tM0D6&Q)0+U+=*hb7`$k;0ujMG#hx(k7hlO+s^tg_eYg77A17Rv3~HE>|lTYRWL z1zk}T8+K#kMrN45l{3VPMpLee;;b}6Q$cU1zm8N){0 zHNib{ccI+!AGD@eGBH!esL!dMnu%+VVoH@08|j>CD@Kb!6K$ORkpt#k0+7jDwM7`A zBkW<|7(s2&R5Mw-H!HU*nKnykul+)sbZmJ(;}JYJ#!JIHK~-n}y-7YrJtFwH)_heo zY$X}1(I!SD-0O%?<~t9So<0{%PtPQ)wMxPy=5C-XRNZDoM)ohr41Ms=g?u z8&we~kfh;B-QUWy()yfUCW4%B;E1$t9pvTu*=Czg<}i5P8A0Rky7@$2#qe*C6p#5X zGEwT%QjZ06{?d=HLecM!Q;>sr&0E?SudiuZv+?PX!#?og1AKuXSS_ z0<@T;;O|Dc-ZyFmrhP;9`<2rZg(r-xL6DCH5rwI4@NUd~vFTsbamBh+yE|tG>QL1D z^bnb3ge%`op!peI02(slW8fJF1Lc58`nliGf?vPT^4P@uctyP3;GeKM)WHS`#k3TW zG6a5;vM{yL!qMA|Wna!0yt|&-m>fi@ft~_Lsraxh_vVHao&ko^tjDZ<4(nn%PBf+Z zv^RbqCbUN3c#KpeHg20ti20JUfVBnw%E6FfMh3@FPe>iXZ!2#`=MWm_zsJ5NL8bxut?7dUnwh9I(?R}H% zbJHn}bn6+(pa40%0622b#%d_pt#*B}Zuj`OuaS7gWRFvkEK%_Z?p?75H+_QCs zB2@}t_$%Fk<=NW1??Uvl2D4cpiGN}R>F z!ei)IH>pd9VlVBNcgOTe?y^q|wtmPKMT@*Z;7TYxzr+=~v-}fuCi)Ew;InwD;Ij+0 zcJ0{6==8~6&IoM~{l0#0qcYg(Y%AJ{k;bdCA5$qUYqAKCgRUSv<*DRwtqc%XVEm|2LP@`^as>Rt$XaER&eJGt621EU-% zV`EF%yI03x;85rB?!ppC5sYSd8VgYsj188kA^3C>k_8UiW>(IC#a)EL$+#90p<%ZK zIzi40E9iI=W9_y;ao-=RX)!$77~2C>^HY|XQr$+!V)i)yb`WQdFT-VmZi&a=s{}#n zygZiLg)vFH@8g4tF&KE(->N_ybNgV1NZ~}^d5)hOPV6uyc zt8yF^1mJR5Q__t*^`Sa%0uU7oI;E(A+1h0;3)*%)yp4NyOy({-pu#~h^I81-uSrm* z01>B}@4HEtK;XA<#2@Sf34Wa>g53J^*hox5$v1!HOAS~dj0jV1ngGv~64`o`U;KgN zoLn5K(6F5TWHoFg>YiN!XgHcBEz>hI! 
zyFT0ny6yikC@2+K8T?w7nJM7`B3NCNt{ahpBOr5DH!v4Xi8IggDvEiXK~M0P)?&wf z5rk!BWu`0|=>sL@Bh2XJa!b}>h0iSS+taA!xL`VUujfM_vfS|O(co0fim@nEz^lO3*EOd5%-r*eV;7CyV40Nar|Ud*lcv7|WdgECGe-1FE^;-P9$1Sh>w1=kQ?tE%ttZq+$3+(ZgeyaWA-|FattK0Ik2>VxZ<7Kly zEhhn${6A?+LC;dn8hJtLUY;PigJ-V_;L4XW-s^O}uAMFc`IA$sO1Tp*&t+&PQ?tcQ zK)$t6d{5XZ;B!w+9Na39E}E$>`%)ct6o@m%b)JqCiueu9VqIAdJmFqs)(d+>#M{0L zUs*iakSAxHDTGs(;Ps!LjOL4cX*T3u65&h`XEtuFK>9V083_t$$6$D*e_p0An6Hq_ zh%o@-xAQ(78CqE3{uoi@pGF7O##O&5I=b9?&iyqn_#7Cl$#$bq))GnfT0)W4$=g`D zKn>Gh3Wq=7^5k69xqC5YQ&MEA1&PgRuKC-J(r?k-4rA=&PO5+6+%TVNOha7i->oOX z{4`R$Wj9$g7thVxp%t-V_3J`){BSVC$C~q_6Yi`|GngK?LLP>AZ5*RhG^fO(Ty9>y z$?q?n*P$q`QWzCO$k^E?|HLGTzZ323$dbu!WnAq><-_wGmi0gbZravz8|-iP3kfu@ zON!=c|?pK%+cTryA8$tZuj9IMkm3^Kf zkehtjn9V9(NDwxn;`3~YnVjkXa89w%I7E2!znH1tog+n{N5xI@UiNy?!G za~So*W-sLa23-=P=7js`BR1ULt;Np$MK3PuUrf!m8haO&{934rn%Ne8{a0@uG0VQb?vPAlOuTx>keVTYM{eVn;Vj+vvZ%^G%OrU97AqK8g(+5lcLV&RpU)^`hrhmVas=`>6!SO@U915lw&O;8#*S5!V6|dPUT7?yf4YEt=-2y)1!sL^cS}TV& zeRZ01sacnKKRj54OR1w|$Erj#KHYq^yRT2iA)F5zYguF4$ocQeU)8~MI4Pi=G)ormYmDS8yhw?>ICr@ z=R5vF2Qewln@pBHSh`qCCe6INS__{uJF zK>lL!cidR_XVq&;uFDw|v@EFLJmR6-jrK-%Uq%8((H~Q^!-$)6{RaX}@hzBM1*4I=0G6&H1OQLB`2O>G#_VHx zT>QJ>*TT>N_*i8M-8|SBP_L&8XV!bSwwdPqG1;RQ=Q(L)1EO%cqLd5$U9eXc&GWEg zs~wHlJH-1x4MZMEe4g^Bv0HADv=Ms?>4N_Za2D@ec`wRWVyRe8$8!%*ynUpi>2s7b zV?qA6fJvz;XoYth6jTv}nxEJohmOBIg>thBeaFGf%V%wKS;;P)RYyt)Vh2wl5cNvZ zQGzrCk!;hm%H>dVpXwIKv@)l^w3#bwsakQPNbZ|1y|CT+!Y3v?bnY0uolNcQfVa~Y zhCf!B)t>6DPZ>{EC%#!0E=hVOCl>6H{==W)cV5(QvwjgjJE155ZNj=?0tuB%CND=C!CXzdQ#KBeTqN&!0<@^M(#QLt zXPZ!smqr{gY2NcSP9l*#us5x}fJ|r5lJZa&A_CKdYJwQW$nV+c75=yh<*mwvdzFxl z_mr=GDUrR^O2L^w?W_|}U2-AkwHM|kyo)|)zs_j9L(Pe+`8=UjHz@u->jH^x{Y&Ww zQPYH(BU6tLbzU_gGbDxm&UF4Q%d}uacZ6d$fevRkds}ljtQlOJ0|p`~0Gb)bfDE!uqxae>cdxCu9e&PdO?X*V zLgmnr1wEGb_g?Y(1>>qkyNRxJ;RImdwObuyrgk<`MU|ELYD%0@nH>7Nw_e+*y8O{( zp1SJGcb4htAG0vZwEgwLcj;%zH0N)-EGXo~WHNFN_)6sB4f%?{jm4(mYkz}+>4rE0 zjp>p@w$z1l_0{e>Y)?PxQpagc@)CsiT@0m-L~Cd%c0SQt2+Wesw$H9v!O6KTg6br* z1POl#1=z`L1jRT>Qpkk#Vg#&V6 z?-l&2$r{Dv<~#3gOBNVIN;pE^Qqh#&{EFLR{$(XsziVXBY3+2n!W{3{508*&Tp^3x zCkNl{u%B2z4`S;Fby`Lsy*LeFdi{TIwUtS(TBZ27l*pmQYS`#0A?3@*pJJJXl$m?x z8v)aD92f~wOx2X`lU;>i4dC6m%K9KmmSAsscezkrZ z;1lB*ajsi3cl!l)eeIK^{k-zJZDMZ5lK`T^JCx?)mt-8m1{% ziKjJvRDX3fH6>imtIZ%Kf&gGOHL`($cRX4qDy#L3EDth4hM+ewtC)_S*}8pOL0g@8WJV4cVC?($Vy zR@IYhu1~N@!*8@C3VW>OP?rQI+YsMOp1g{rP^w~sb~`5*4be=F@$6^wHqmko*HBz`mVCi%_C1MLoSnZr`&;geYIVdZIq~D+U zU$tO}H@mLSbri1*2p~?{*}BjEV>7Qjdu!|pQQohQCzM?SY=Acj;f~8uNC8E?Abr0- zgT}#JN4>L1(#${P#L#m^=^~w9jtjyung)ore#MZmf z4f9?k$iBbpdoZtvXz2pNt~Ng{I)It>&x4VR0XtS%sw`y&1j%>!z}EI_{AnqJh5C0< zos^86F9D+^BoCABO!z`GOR&>6Zn4AZ;NwUYn~LP%81vwNPaZGZ3eZD&uAb_*09C{d zYdkw!yzM<${dr}w!EqY@WTR{ra)>u4RxCO(fV8&5=QmJ2`$T9h$Kk?qUL+@>Uu^#D zkoqfh2~Ku+*x(WD6&J@>JI29Fo+kqVCfVjV37=fvMSZ|(F1Q`06lIf zuhn;KW}=D^BtA-sMj%{7G39#KbhV5mrmywkfp7`Fw~N&drscyRreuqLr=<8S%;>#f<>Zbrpk*qogdSuK<-1gWOz# zaO#LqVXdiIUc!38hy8C*Oq;y_3J8F#RHTBVTgL8Sffne4cAiXUuHW75h)D_=RKMeV z06Jg#acVNfc>|)g4cjA_P3&l#Rd}9Ji4}t$^&g81FqlNCk)x=-;2%WEHsxzD7n$ts?_20`^qQmuJ5OSK*W4S^)K4BgwJrT}Mg1GZe~I$(Y%E6AyxrI6ATL`-|P@y5*w zD}l(Q*SVg_s^B1;F^;WvD!zP^`ed~OKu6UJW&*+q)c0zkXu@x+Esg-d{*O@5&rCSg z1wF3lGl>v#6nl;Vnmnc)h=zcSec#iVX58Lszfa(UC+JR(TmtyBxAe6R-p0k3b@Dr7 z8jpLpM&5G7LUmeYh&1Zb;JBgvXT?ZxTIk&_R-q#UuG5T34^VjO4Q;xaj@^>G5A3s# z7)a{{M=qu+W?7ts9{_}4W1LNT$oI&v#@wxz@=)3dS1Lp7VD36l5m(EFK;H$r>FEX| zBe0%l|K0SbIzL#s{KF5v!s6&o->8@da)(nUp2aa6H~R&(_fJXzuIh`hp4d!o-9eUGIbA`5vjgSdpE-#akbwI#HTYpv}5>@Z-yksoKDe3e~Q* zPatw8cN3D`dnY3PkhX>J_fA}XSEWh6-(fa0kS5~R-bL^W z1~B5JqQ!>`_c@(Dh2sMgDxI#Q=F%jkB0HZzDcIN(QJ?zV8f@Kqn$+4>W5>Wc0=7O_R=!wVY!!KbT8Zv@qF5 
zY+20>aKNz|G0cXNu?^J52xZk+N5wgO@kuJZF5hJMk)l3`LL-ip155iK>zGy++*u1h z`Yu4`H3q%dJNVoSk$T?R{Dgq`%X2IdS0lULM=NZIj{YvdtLkJq-gR`qNS`u6J}o{- zApeY`wc#6)BZ&k=PXD*mY#wkBG1JJ);BW|@3P-* zewCm+SwNMZ5o2Rtn{xAi%LOO9hMtPFQ~)hM6$bB46W6CqB^w_b&q;_vKm5V5K zYPkN>CJKjg$!|D~=K$OE8!sR1nS*#hVTJP|$`j!~-$VkoVk%M(1u^!O*XZwca%_;- z@5*%}e}!-UmWSFa)WQLI9S`-zyrMU)x7>;)C#YNWVhUr8M5w3X^*Fa$1dI)f1BT4S z47?>84%V6UTaSVTEu>J|PUJ}`PX0OR1QBo`+I=n|>5rxFX zG(>pqd)<#LbJ<7twX>p;gVxyuZUpsw6h@5cEEvKmwvwwi}Pu$m{fg!X*Gfb zksMvYiS`eK=|2Gr+^EFz24Z=SkOP!@KGb|~4C?<{);;AuyjLU{6UFS~v*c3IFp7_p z0yppm-HA{3Ex)mSpCc#;KuoLl-CNr5g_VCOkN;iq2<`q|3eX69+$Iu0hQGkEI>Hkw z&Wb*>mzv7=iAPgaSkL5d@c{FWMGV}2%R%>+`(1Yq{g}%Oah{F@68E2J9V5*DKpBV^ z0`q8tffbS}VQSv3!t!_3pR?o_^Q#N93*V~zR)Mr4O#=Ga?JAn5rE%YkgMfu1*&g#0 zU8XQoT8=Hjelte(&7TM{^zmFDu*S8~Cn}&0x5dlLdCcRJ$4~-~*;w9tr?X9p!IPxu zrtbK7MP^~{JJ}-CYw-GUjomg%s(^T|c_z=`>b|W8mu*=9BFPjC%n>9SOqA?`BmXk&Ix$=t-H?1Ecxf}m? zxzTTEdBq4Y{Cl1M1`wWn^y*hT^>UnedBpEjNBL)nFdB@{4hdIBOX~Nz!Q1G247-Ly z3^=cJMU%)^j=|2`4J0^f<6CG{chWkB%d=tc?xiAQdi+t_-e%O0eeybz7Z_EMOnQT- zs||>2_s&NUwVs1t?*V_v^9Hx{0ms#oIiXAddpSakwH&24~ zK1CnAr?QoCu))&L{X;a z+XgBoAeD)h^m2>{WV@51CVC(C3lAobWy4) zU!87w`s~fHrA9FS(eXo9#=7`*mj{NDrRv2l=q*WJDF>Uv=A8vg7q~6m@f3q}rxWf! z=cj{dS=jaed=q>K!XwHE(<={C4hAriwoCPRssZ>K6K?yD{iT&Y%dMvF3ZY?14i2c_ zf#*gEyYnAEQ>bQUA6EAI9$Hz*7jzu)*S`{r0`e|th4~o#4v=zoZ+Gh<6OK+rei6~S z(*ky6uOOy`;EwRTXY3nJgU6|p@bg7$K8z0}GS^M`ESRg%oqMxh3H##^j~S6huKp;f zXWTC20Y1?G05kwHG4Q8uv;d}M{&~|KO~yu)Z1xO0rroex#J|L2UZAx6x^vAi6Uqp1 zny{do?yGeo(Y-?gG&XX3;}pwrU<^N= zXe!5J#z5Zo<)qylKQ%}ULDDe~DGS=nu$}9YG6R@PZ1~O>(mr)%40u4ad zXdom`TNDc5dEzrg*fx%i+|xa#V}IS+ia9l-j5mT(G5>}JGWdvKfG-l`+g=MdqD-KX zD8mvjuSQT$&eGLN1^i82*aMyw&vQ8*2zMjyBJYo{4{Mov^tn}Ow+Q@f0_4X;KL-}v%BnFD8G z3oNh%LhP$ZK9K~{dhgfjoZ!{3Z)rjJJgp;Q5G4k(x~RMpxVfx$awfJzHek5ry@d}P zuD2vw?vGiH;;GByLsZy*lAMYGZZ8q31nn=~@W(sDyjx9T$FeLeMXe>HKDZ+#>GmwX$}QB?gb{6&k4VVJr`mt zwKcMl)s`Ll?~}-&A%vLAHtC(6O%(m)uuRVrV8&XMh*K|rxKC?7thAk6%=Pz8^gtZI z{#Gd;Fk|!K-S5zVZrWctkia6e28`(g&ksy?+E2BE^X?U4K<}mvW6*8=!2a5Zi=aCP zEQb6{%DFv481F3PX>z%|?7#NLE|kLJ;p*1{@Lmif&T}R?w%cw};0IdHBK=M~UA*Fx zn~^z+K>BjudpWL2(N=tUB5$`crO!hm0S`WtYim#yms1k^_nsis#o@RK>7Txvu;sj> zmS;&MAp23h-rxyumTTHXp$5a=pd#{{tts)Ny!{{<9wHK4Z#dpg<(Nc0ogXTz_IXC! 
z%O7+#W{8hi5a8lNfcsm(Ks*v8AratXLlVp*(=_Q2-d@|}(m~-`@3c9=KNUzKh%Dn# zZl#SR%2r_U24Zz1@Lsj@D+;3n7@p%XV0gCNl{-pVI@AFmAIXQ;sVS_n_Yjj zh0=1HFupKRqx*xp+3c{zvCuJ(LeP&biohBSkX6WeC#Kve`8YUIuD0q$|Dz+0e#}>J zPttnUf|YTe<;u9t%IFHTTg*rL=-*ox9T|$|R)0)rNPpR{M!q(zX3xktI8T@9id|Il zGQY-eq0I3_I(MX)lhNaszh&&laf9&4gjbGZMC;~8o0bbLA6!)$yVy1BnsM@CG_5Tc zdcn+lR5HH&`d#TMbG2W%Bc`tzrIqmmHhHy_>mt;jfzsm)W!6R(lK?pgZSXu{lS2mO z{->?Rs{;Y#L0D@`D&8LuYZu9U*9Q8%y39T$_o=xQE82nuPK@vj0_w7P>lKI&L=bG$zzZ~W#b`)KK1Q`Bnxeu$RmqhC?dyTd1YU3V{B%flL$Vm0+2 zsa$U-rc^ANbKr}!a_pfeDbvq??>+X)F%kra7!cFWGWAK0d+YM26^QBN`dtaU^y6k+ z#ntN6rHEbDMAPTn{TkJNT_W}8cKK>0j|U&?BHk-F!`5zlSBWhlkTD=>CW=NPNo`G6 z_bW+T+avOj49Ime8coPe{(6Cw?Yhxexb}9X8#>7Hj1o&-?&cool9_UGt)Y5iu_ZL|3!X9!xy@mtOC6HFmZn$b--JxLp2O6u^>VWuS zst=5Ygq;XfM_k8jIrtw!0I+Zd@s$LA7IMff&3@i=Z)-bwzqP@GRcDEo!$4v=WL+cS zzBEMs5+H+(h~@J(kk}!dIQeLv4FZSXcKd~VG@`|KeS8awU)ICY!dBiRDI8)uv|k2(`MRLby4o3V&T1< zq2z(~xp8r@!TsZg7VFu;m+PuHf@L^twqJf zq1UKUFqA#8)a4#W$jOP1_6OUAn`pdXgWe)gB8spe zT!CL;I3=vkSkV0?-Mog3!#bmF6S5pI^{69g4<1dRnIwA(687HH)C~Kh>X6aJe2{bnK9%4~ zLQcD{7pN37>`>6vux{blt(<}Ivl0Eyiu9l`UrESHNe~V-&`znn?V^(Uk;1KW z`K#+{jb$S9RYdlp>KeWg$xc;K7H zrMoyErhJ>8PzBasy4X1NKH|1Y{~g8y?wC$EC9TN^My{V6ks5jENlu>WPStY)z!*0( z$3y*dUxp0e>#QUY>4`V`&a39|HGX1@TCn5z?XIjtQX|J-#m>k0oMw+HKh94#Pr7RQii!Oj8G8UsUT0S5 zQMCJbIopP{Yi`5Y5HqUv*IZWmgBjV8Wk~jHwBg>BmIWs3)QCF{(2X2oIDtN9j;FW3 ze0hg&*JXJ~p>#}qz3;j;(vQEqg)w*s<5+4ufFyt*_mjbe^VV1gO8=V}q5Y8dD26Xc z*U*AMk@}l2KS8{c{*7>`-RB{Towm64{IdvAu?LPgE%87SH^wjB(RBuzI^#6G%Qw;dRN=w#Sl zZpL8Sg71AVPbILsa!&epqZuGmU|;?8;NB9>-Y9`*=8JyZ*;}M~O5hBb^5`^`Mr)f0n8amf8b*dzFW$1x&K3 z4<4)cbz6Xe$|EJLJu`%&^Rzda_G}A~SdiJ4dK(6iAh^8D+sBe~fhj<}u|9|0NMM^q zI7^0gqWsxEWNY6u$W_4zSq3X9MUUg3A#k>}D*m}yNo+kD*#VPFANn3gQ~Jt)BQB4F zd0+<%h`tgha*?^}ZkQ0e+T0qie%lmr{Lof?S^HKU+I;<9Q2RHkClWkx2t|G$2A&OA zy@Q?{MDp}S2%CKs!m=1o-ID>vu7NA52qaL3Hp54$4Y^19hE$9s8;h15<=TkXk z4NX>CdWrHLG)Y`^`VU^9l%}0Oza|4T^d{XO=n{DZ`>c)f(+5{Vm|E=WLHvgEQD5M< z%0$>e&qP?b=^=;m;=VX-S8TA!kBi}razfMV)Eu6T?O)%Dd;`=a-%P!i$WH(|V0g!h zvx43C_?M{QYTgy9)^a%&yUPxpJM~wqe^A>D5g+KG{ z?sp+3?W}=^utM>7=C_CHwV+1pYnjpimNcE^G^G_2)FA$nywuVE}r;Z-?To9kz(#6I&xQ83#UTOJ~+yVf%pYl(5r(???xw6II zsjLY6DS?G4>scWd(q(O1>c^cF|tCf5RR9m1Yh}nQwEzvY-pC_Z4y!YJM zzSkOCp?w3hs8YpHP5WnKKK6ZduvqHnFy56U%bpIOw&i8d z;6TZ72Lw&{*$z{$QUAZxfG_xFEFQZkz_(pGJrXto?w7XIP9RE`WUFvaf{!1#($iJ# zeYAbIv=#gx7AH6*HNu_OQ_A&U9A1tHooRs%29avLvtMWv@O8RE$Y|F`(EbnkqLd7d zpmQ6U0Gu~IZR)UCsfW+qNXNB~d51M3n$>xRAai71Tf=(8Cm_FudK_Ml+4~gJwD6Z| ztxiV}i;zDS2EAG%$TDYQy^U@62ZfQ*tHHu+d9uNu1TxMt3E57~B;6eiL{*%f&$=g} zUTALW-P{i>K0y9@@f(**BCIaWx!wKZaM$MfvNV-4=cm^vUG~5WvnNz)~aN zp1@geA?yqJx+kkGF?a2Vw8%Q8h5I&9(|g^ZhZpvJbtcavryDSD1$sbN!^bdNVaYnJ zZUSdgV1#^Hb4Q}o#4U;^f8SUJcw+@b&dUICXrDBlFQmMTH*0kFsFlq)f%BZmvk)KH zv85#l5j=wai6N7mo%aH&=K@=#q#x++M2mKl+IH4^Sh%9D9rU$%p9Dh1f3R!`DCF|` z`gpKWz*P#J^tYT{TYca_v|X-9%kuJJi!EDC_%ir|BA4eg-!2%8+z_{FztGOB6hvS z3z=5C4z;Z3!)?IW)EFC6{&Q(sKoFK`n_q#?a>Bbq{KZM?De|jysiLSVY*w^`v*bT% z+`BOa;Wf|Wrssb4-{VCv4GeH~e-%k+ykAkuQ=x74N3(;1(Zv^}qW|cL>^lH%VwQTL z=m#9V+A6wGzz8lDoPi$AWYQw>+s?&f#`aFYwB(|{dt<_s3*t)nN4)S5B()|`$JRRa75)3+{h_QtEdJ9322`9 z^;;TL^$dtGr@AecX?#(iOpvW+HdmbhnLhJj3(lg?{oVLMGv{#zTn4p>w*OnTwL}Y6t(6`R ze1m|S$acT3(WbbOw>yj??H*0D+H$o?rGjvzaW4-2*7PHT3!*GuYF41Ip2L!R9xf(% z%4!@lt9hp($ylQi|B(#2R}g32h=LAoUMa}yj}$N z$G0P^v23JvXVrVpCfn~65@)-HTcl%VLC@&F7bqawJEfZJ@=p}QVm$8~*{s-7p?*Yi=#cvxzWo<1 zkdVvFVnF5hElK?;26KQ$@bsW6^l%~eQu~+1el);NbqGSZM1YM2Fd=UTe57^VAVC-U zWyJaXd(O|tsGz{nZC}_IIH0k2i?Z)M!}+stf~R2N`c=tqMX77xZdL6Wd}9p>wC`ul7_uiON=bxa;S^~yaaoQt z%G~>VDAaPO;7+@c;%|d@VTy0_8P9d*Ot#!NIEUw>dDhUQ0E&I~%q|g-ZbRJ$uP^(| 
zFaY>KnBBI!zTmI=w7geSvs3=!#Kb`zf33H#0Gnk-{mQf;3&(k03V)qs_5GA1@pj@9 z`dd=LBVh0rfL*j^&>~oWKjI4LK)yrOnDKm)N3Nu~xyM*hQTkSOnSSZd7CH%dNo0kf zLoTRJ)%f%WN&)mi0SMaxV|HLf%)=Et>-gA2dk~GRYwS`l$-Dyzxi7*6*cG#&j=`x4 z@cLiXk}WKcjlL`DUv9H-Cgc=6&ag9y{3zDuTD(h`||Gw-T?MhW@ov~9-Tha zOTAjI+xLN`c2?MYofISXV~pYTT6=>1l9CnC{1qk zYfOJ>;IBn=$A)*K|3ND#y%OL*F2gw3hO!8?!xZ)~f}uELtn_t^ zoXRE9QtT9gJMMN~(r(wwJDq%GM*jDHS|P1i{gCXpR~4Y-QG&8tVp`}*LdYP<&JY0XDa_d?3?QO4mkM`?7_ zQP2l$&~5I>3(JZm0k-*sF<$_t))nWd-F1OGMWzZC2BH$z643wl1wjb&!QQ}>rG>6hmt#TrMS3t_+n>JofJ_?IgYgQA1&mL~mplBmcdJ^g2M zp7aL&MruoCib_zxm-PRwS(sWLm?(^9bs+8@B}&aZW&G25GgR-3>-KHl=CEhq=bzRO>@IK< zhTKZLiat@}8NtY}jSHN03%49n&>vQrv(grth?e@Qq~^Rsx7r~dLuG`VYzV)pUqGL> zXjW&3M?G#jzr8T6=5uJ$U#U$2@n5xE)tnx1ADA6PBvG61Q%3R~hESC0eshOm2z!Q(66bWJz`t za5Ia+cPGmR|3ED0uMag2C<)}E1XuJ5W@U@HT}K)OLLjY1xg^M2HB4m$lQU3~4zTQK zW-pF~etgZm+~en5^VeC{x8ur~n&r1@!EWq6jt6n=s|Vx_CoomEG}ke|;Qpqmn+7H? z%EsPNr>)zpOgz|{nlQdiNmFt9vMhcnlz%$Hb%XyBv=S2rH4e<6ptT4KjKyo*4Ygk6 z9e8(JaS`-Jj_G2rdv*l+lh>d2p#v-9gO_@_X^qG>?`X3On+E_cV%U6p|MTW>m(D5vW< z3@Ll8gHaD156(7!fkBQD&{Vg=`aNk-g8KXu9aLPs_f6*HqmZ?I` zoY$c;d%dn|NF^hMhlV@*tgfNwbV)K%(X6b8pwuQ71K4%9yvfZ`5*{qv^wiGzbrw9R zB#ivPFhCNqMH@^S-2H*0R_u z6QNbx`M_0{E8Db$ZZ^rV=Wf*mU&LA9ef?yQ%nft7Z|R%{Z(eSe*~W8x+@R&4`77dB zP?1_AXlJtr21QGPcIrzY238d-<^6J?a>9z$^YllG8v=nwB#tMxrPybWkT@|_Z zs5g4JK(j0GNE0-?Dud=7#ov21&u9Q63zU-c6}?g9PfBd{F;Da7wB7*uI=N?w00e&>7$*nZ{)%)bF&E)V{$b7aH$%l z_IKhm;=ycF&F@0L+-CO5LBj*v4u<#b-vR%YoCyp}kPpjPB6)G!U};nP{1ZXXyY3fM zE|P|0rFxqQ7q<`y!$w>+UU+Vxm*bo>t@eYnr(6gbg8i*JBi&?B)IBzP_$eRTK;|C@ zj~V&uaG^qO%KxuI?yAd4P}h}V(wltveKi=&3XlTT+yz{E@ovX+{)gYqZz?E9fGIUS zw&|!Hkhy0>ld7v@Br^40HE}Y{xP7*?JMo>=RxNAUK=kwPgE;^mpsbdnr=+{=_xSNV z4KDmmmo%blzoghaPD1B{(&uB%jhFu}u!B~PS1LmAWRcKwrS|n^f&*If!izWQ`Jrn# zpo)|4$L?T~&Nn;muQT>PWCp+_vo=5L{+gw(_f^@RISU&}#bRU!%;3B%R8dF{4Yn}c znI*2&9*zF|iW<~&pMGWy8nL+v*cFKezZ0R!^ILo01)hv+JA^t7e`0c2U@P5Y;(Kjh zWz|EKjq*q$EI($waQ$bv4M=8OJouq%$wfZ0!zh`%g*0MQUa8ca`||idCQ=C(x`#ro znt_=Zx&ASm;M&owa}6;6v$w9OnXjg8pt7coHDgcFAj+?_W=|rA(%~eng{836G$0^5 zX4ozz3a7VszjfR+TH_AYYhRrWc|vZIJ;OS`tY8Lby6-Zz>h{~nmhB5Q;F#DGHiBoM|!?&?hhy^0|5R2Hq-q4^h~3Cv@ltKOJ2 zn8SPCY(7%h1N8>j@i8haDSvfO=bm)~vJ(g(y^9e(GhgHSJfW`Ye&$v}UR(NIcQaZh zVUWF{rT>#^*18ge&`Q_5m1i2L8$yl}o>?n9Z`&o@-H$VP(_7N!#N}+eoSaA!0#S$m zIbg!U=t^%OoR)@>Ml-m%#@HyVA z^ZWr(e@#rJ#Op$%80j^UF6QfE{hFW#z2>hK!6>fVZ?G}b0s?}V^+iz95i-k< zm`tKK*e#z%$($swbQp8R-(TWRoLWp#| zbhJl5EBX>rF<$oK5IvKd+V3PhX4IthYE>^IZbySgGroT7xUj_jpU04 z*Cc{TpKjMLTz((QCyOHv##AaZ7CGTs`N=|BtVq$~*JnIwUftcI~$xrkLYSRtFwK$qvN(*K0weUlWhE%iXu1 zcCsO#kyoZX8s>$uRcU+}EBJ^oG1jFr^GAZAD~VAv!BQ*dLMiyezJGf;LtS{g_2}E1 zMDquXmA%VGKN6O|2H#RlKG$ENUuZnIj_7FTCUYK5E`A;0kvV*q%(8ps;l{S>9VD}& zSBxma&2(Eza`vwt`#e`Y^uyjiQBbblJbHSIf|~V$yb{xv$f0;d1kKrOG$L7wSfcPN zA!(xIYw&MBSJOEuyd|X*7oQj@#*T>bz|P99{?Nr&?G*5p5y?#P=TV*EuN0@D0R*2j zVk)DDFIrto5zKScEi)eKTTXnEM|klbs(KS8I1}Aj6~R@VetZ&UR#J8)vGfIkx>` zU^1``!~|7@Zi-K+l17a&k>Pml49G>_tQ1C)w4*(e;*V#dG9|_rK|ft50Px64#Xy|sGlvb{*Sl;ONI&BI&P85A#|>axzJTA$?p>3IrE-gOi@ z@LO(^?tr8@9f`$SicfT!^8y3K^5;((VW}^Xve4mp!bs4PS-n4Ccx}~WsUMD|gZ?2V z6_}>}TaAcm*ZzwM%c?1}-uKfbpzm8A7)fC~w9tAAn&?Xl6(D9jg&w@?qg0HpK~>#< zfd*T}L%%$E^5q(COZZ&SioDtoP6NY6xNXI~4UQZvMP^lpnA=p~*Jc-c zIrbqBsBBK1C}y%aoeW)HQH~_s)Epc|9L#vIj!|HMAG%tgF!g=deL^HK{v=PM!Yt$E zkIYr5we3e}47PdUYUdyrsq6rZu@*Zj&-0RLk2A{)U?VUbQE4CWo`Wf>+4Ie7zy2Sl zt}-gBt_vfA5)w)YNGJ^gN=l@Y|-5v9t0rmaL z!X7zIyaB|q1k!50g~7u&o7`E!-c~hS@#%7J$V`vrx^haa9eOV-&$;_GjEJy0eb*g!GbX~5?(bhCN_#Fk=J2=akWyt%e*uAX&(J7)5kU>=E2MS)tWfG zIMYU8VwY7<(PKGna_^x($p+Id$ZD(Iw>}ts&H+Ojl)w;J*=gERVK@&~WCT3|zxhI^ zX*t&}jC^CxM3*JJV$@I0hap`3)_&vLby0j6eX!G%N;YrkQ~rKp8=S2S2RC_ItCO 
z8ajQQ1%?)^){u*dBSi7C1}*mnU`N*Y=0usUQwZ!s8)7_%zg?5p!3leV?S+ z%robGQ@xJIob88B%xW@}{d4kqDo^obL>YY^fd6_!Zx3#)UHCA|5=doKrg~c^gMrT7 zR96b#20UC`BQV}_uxnQ}`a@-f(P+0|saQqVy{`pw3qJzsA~2}*Gu^KqHEU?bZEkcO zR~P1DKUYq*$g3QBQuZcDG=;IFXR6UkL4iuYyTZS&K&?hrGM2@+(r}|7gGNs6fKn=) zQy_kH+_5j7V^8oZmep7>2kNVg7Sb9~RrCvN+kQKDvS`6PJXEfAX6c2&%=3EAm7VjU zI12@US{>zS@F+ioQlq(I*ebT?JKCZefL7GFqvSQXc{E{a~&d$ z{C&r3hB}I$D^ZCj5CdtkF3T53CCuGzwQx_%86^cdT~iy z3TQL_N+kO%v`_*zA4PWiB^yLdX))DNQ{l_m2K7(2!~;s2lB$E)U;Zk-#eBFs1Mv;z zEDt=Uv5E@4|E`R=_o|8LilQrsV5diRzolApmN@YALK-~d|E)}uM#4yc^7s#n`7_5VF( zLM}a@U?&ma5ZTk8PnQ)*9~Q;V0n^44ba^^DogVAR=r||u zTF=(!#yN4B24?;zB3|Gnaz5f?;3eE1^C`mVbFOyy+d~SjMLeNV931V4COiG_IhUh< z;-0{0bB%0&w9&(Lz%p)8renbA@iOxZ%IitS6XR+8+_CNSn)E+`qzEn##im1*YFFP5)U zEMIRK?x#Mz_AX~#_^FL0$a6-1XW+{{)mw7|PI88bLXH3V{dH(!ye66`oF3 zMA**CxbCF$@h{9w`Rm?vZZmD}WOl3-WU@6w-?C!)BIc_#1i&Cs^qy(jWiNQF{4dJ}v=9O5NvAN>R> zi=>rDr5U}=&i0}n1ZC|r23x0|(E7K()RB(?tWTSkrGByggXsXSI2P|b^{syBPjx*sH9vl%HKx- z*n_6#*=6|7&`>Vfw_p^MUUmh_hCLxPNK#VdM;e5`($z@=E|$ZF!|(xo4H@P%ln*Q7 zVum$a74OW&aYbr@DW3v^J2-p5mx{9T#&%+gAlU~J_7FPPPJIC9c2)S@g>5#PLq=pj z?`MEar)Upst(R8_>p$L419D|`_|#6N>ju_HZv>d^@ev!N zNTKm9+t|#sJeCtUOB@Z~JGM*3Q&cLS2PE=+O3e}tqm9R*dIrODFi__}!w<*8$jB8a zL~4cTDd>+~v;D&!Q9w!a;hY}X89bDk8LdqgdhJUrRN^uX!)Hk0*t;fhecKIp_3p&QEn?<(kESFt2EgpOdzPa4go6cT4{MgOuv{W7np(?e0$ z3~RfIPo_#cNS$J;)#paB7rWls34n&hvilQ14!GhEshU%Uibim!Ivn^o9$ET7g~UtQ~Hh(WtFbt5RvYmV&6w| zMX$~adM7L2eS+Ql47>&;cAv4~g@#G3BdE+sdVEjwQas;Y@c%D74O}>DX))Fvgduv& z%j-H%F5%jb0*z}|4!YpNm6mv}%FDGpjQ2h6i$P**K14Wp@oU5MoER`<55n2Zmn@@S z$b{%8=Dju3<+ADLQ8GvioSxhGw2%&#qfI$H@%nS&NQ-sgP0moXbK(Y2KB~_9IKZOtRmuO~w;yd;O=zIry6&(?XWf?)clN&k<*(+_5 zbgQz8augt0A_?yL*Y4-z8bplE-}I0+$$Mxd3RFZdHWyH$nZ4VCWkn=`oxBl`U;!9Y zt$rNB1}2O2RHxn(Ps3KBNJ;6sU7)N4Us-O2LOi{L*lM&wfJT~=#yD3>5OyO0*t?)? zalIrjZz`z_d)ijGd&6Np$M+csJC{Ik|8lH*b1X1!g%e&4HxAM+B3`@rK4U$pYe^sk!Y=m=wpQ zx7Vz3r>LiIADyo)q0%DcnGywj>JlXHd58BK**(NfKd32V&)JjD|JW!F#zu9&%kexO zJVa6~kX`&P{<~ri>Qqzj>3OQn7Rva-wf^vEoMEogG=t|v!WDnf2mE$1&zdO88NONC- zVTCoRVedz1LU$V_SA{Q{Czx6c6wS?_vFG4T5Qa@bI1g+TgoBly!f-%j5u!gO)5 zgl0adi1v@_OXIO{saH>JbVHcIH3k>KUl;pRlqDxsZl!CjpP=wwQfFXiB_(!VZ`WZ> z{h#9;j!pXe-Oy)Y@SA%coulJoR?xu6#iI$~YhI_U6xiXfFES$N5?(jC%3htHZN1)r z@sa$EHELQGGaO{vwe#^L6%mvfYkhhO_9Z)3W2lI-}f%Z6?Njp67k~kYe4><13cbq9TX<0A`6j$WiU@N{6dd zSl38JXU~@miZZ0=*tR=)0Kd9x#r_`#CY=Y$Z$AzQh9QoRnnlQ(Up1{trro~iJ}0h? 
zx~MVzeqQlPQj{xnrov%d9n#rrdXRTjjti5xEcdEkU-ygZI5l(@Q+u$Vs}s1zZw6`y zMXblloUk?QdLB-ho5dylMSXDa$iQwRqk_|sw!?>%EMRj(L+RRkLzXpJpSM$lHsDlW z9e-(^Z}By~g%Yx4^CS~WlNotEmqC0uHP@LV*gBR4>Rd02tX@6G(cERw^$PtDu<8P# zF5C2!QAa3MAoY2R&2e#Qe{?0z7&@LVdijzt=VA0Kne%J|v~QB^5um1qbt#w*CbRsa z9s{P7t7|nbErC_k(Hv!qcSN0ave@?gx=XEi8C4_Odb?oy<%Xj=@4%3U#Y01Q_z6-> z?&~is3rb3(W2)VRY9?!wCIYbl=3f*;kaPU7W z`bf(d&8SS8^F^IC=>@h3lSs)JVIKA$hGBv$$KX?LJJJM8F-5k|VlXQX_DE=6Xfg{mD3)Kh$N>yK-x^-A zv{{xR$rj>x94-_Xs_eq6D;V-0*=%5B+YXiafKVz;%Ct#q*r&EO=ddq9-p7{=SnQaU zBG1VAWJa$cC!(;s;|Uq7QMkfrRw;~fW2p9orsU92)P*f13|=LVgluL+!=i@ixies$ z7#o9U9qACgf+_xL#^4Ots}8@*m?D%4lAh|7dWM|Tt}7N%y0t+jWRLO-vl5`Diz4F) zT#WHs_-|nz1*S-oyWO^OXv{f9EM*MT_hdS%R18~lbvsg3R`CJ;4Q_$ybY&`lb~YIB zafC?g5Wedtl6jF0icLB2H5_l1=G{?NQRRY8#d8$yH#mtCaq1WfWV6BKd+Y3vtP~+Flck0rMtF*&jOpG^0)p>|UDOLBFR2uFp@eAYN(} z&w2;uvnb8ju2D!#?S#V|vFlIBU8SN#>sbemVHN@yXx7PaZ-O@NqyHQqb(wV*WRQSB zTM1an>*m#;&-5nyKSqT4X0W>tUIFwuxWsc`CJUzZ(E9a8uurug|9<2DHIsvGa^#mQn@=i-ka6VW8@6V~G68VAcVB zRXF>hv(lYmIX>sf`S9mHBk8`@)1L|-*I~~3JX9Fh6Z@y7m;`~A%2Y#M_uxlIM$mLA z0O27u|6DKZT7PK}yR2X65GTQba2Bs+4#@~-KJz_*36%QhkS;=NV3@-9G+n4K(lpvS7|lNG6> zC_lKfI*rZA8$gI%gvY7Y#FT1Sm{`>Dp$^Ox-*A&oAozm(8hLqak6t{VcwI^&)#4W_ zGCRpvOC+)^*9-@FiASvQ9CD7ApT+|`;n(uZN*hI4U?SJ4e5YSxXj9Y1q2A?T9qCXg z#a>6(Wc6z5Td8qPbGeWkS$efbW!M{? zj`#ZOAE|KXS46^ugZTBlrFhtB;?ew^(b%b|EfhLOw^(8D(5chK5Aph1m4LMy=OZ3Z z?Yt*8p2+ddPa|-5-#qo&CZX|~p4!uDOF(Iif4}A|u5Yo|?#t8aO_eDox~cmwV$^@OU!4+r7!i&q}{oj4q@}>Y2|?tnx7KY ze#$P=?l`8<@0ZVaDm2-5@1sX(NgT6B8XI~n_ZQnfI*8JL>HUjGJRSfD9s3KmL z8hlGf5_NUYp;k#w>>7k=o!EibP34Kmx?5iiS>W^E>t==^r3;b1m_au+$H6J! z-bUlm)W|ZGm3v%MC^L^+7HnZM6F?ReK3T53@L?0?nPGOmyHlhmd3G{`Q>4C_Ym6;% z#e4k8aaj;dPHGfKuss5>eAHYc4qI82iqN+fH5E-# zO9Jvtw-r|B+%mX!thY*eoEaKeP9{bCVc^6WbN6IYZq~P>CbRH;&aQ73wa?qou|uO# zv19lKrn3!tih@Eg0`Lzrr@sO6LMz{e<-v`K#0!1X0uiKZxJY~hty7x)O1S*GhdIla zJx(T6?<&V&eM)J4;HlNRkmL=~yQ${qoEysbghKWEn4!{SdNS!cnE;}&%}p;a20Ph`4&)cX)s#i)-b+eb>Uzl}*TQ+BvPt#O z&@Dt2=eFkJ4|+tcsa_AlMn3#MkVfiTok`=a%krc<&>y%IdD1akWdiJ5 zs5KSlJ)5Lp@eu6u0KlQdy-AbA;jXkrM!s5xN>&0zfm%$stiaPXKe|WC6B@~)Pjr(y z{|mK|eg-gDZ=gMy7D=bH`QA=rk&3OL%G5&H2LUf)2X}?Jym`+FjDnK>W~u~o(;8hI zdCzAzE}1OD{G8=y<;n(y^7sKeIfaedxKPDq8DY@`&YDfv7jlkZMAuL$CD}jeBm?fD z-lA@bERs&7JG*}2dOUNRLU-?-bu^y-=R*S=WwbH(E6(}?+?^L6j=$sDGTL*P-cLSb zc?c6wP_!ULs=b+6S;NKV8QsOw7WM_*a_O#~+SH}N!NcdIT4^g@g0=2vEgILFl#P| zO?Qv$22li!Mnj=LKksiMd`vG#0X3pvXn7$h%FXa+5HsGx{tA=ucY^TO3Q>8*!()R( z0%g~-FFKv?My6X|9r*?nHYke8ny`k@{EI^He}!3tVAR*Fn4NDu=BH@<;@Md=D%x9% z!mAwefPXqfWzmA@Tgp)Xj>#V(?1p{z`J3}J)Wvri#9zNg8=urqQqmc@D{6RFSSMCV zl`YA3>HuN=KR<=P5E;Cpf=Px=+MuKy#q2XK;;llGV z`ssUu!;KXNPZM-;?}9ZcFg*~&pMDgn;+&ye*g!4VJHpa3vZIYbXX7?wBWt47k)T)5 ziqA|NK}7WryZ$Fc$?J%l-gJ^N1O+1YWwK_^pbUYfSq3Wq#H;Myk5f+GY zu3s<5_c%g&o9L|4awXWhk*hTPi0iN~Ur1c6$z47C^}Nxx9H0CR>tM7`i?hW;-S6cmQe3{+zBOMS1%gU=!~lMI%x`LPOK@ zAHGlgVty_+_!BHwp?-p%XEsxt)0xOUOtU*|?NxgeUS>_NgYJ>iD+GY4#d&H&rt2ndxKB@epZUYE!XkNdqSI5!zJk9s`n>J8%2#vk;*h)42s9L(#}i({zDnYtvd3W#}P*Ms=p z44Q0LYd4~E^=^_s4fcj-!&HCwF4&tss|ygCe;Wm`XsNNWld|5&(guH%{x4tA2=zW5 zWy|3sdU`;qcn*EI9K!-Le3QD3M?|bv`T|dgjfc{adH8!)m19~dB}2dQj6r1g7lMWs zyB=1UsmDAScPVs1o0`=Q>^-7fN^EX|UK8%{j8%WqKI3JcUz}8^v~ zv>Zb*xix1Hw`u%}^U=b5YsUDj#eHiQEQfC-RQ(77bYWm9yBzd-d#6>15L5-SrcPGS zg^{g9DrlA9J7^0#PHc$hI^MZ(3wOwU!hW^#cld^lG*yCr%1ZCIq zb+(1VQn^NiPHFwduN1y3#F_7n=QsR7;LkgDOiw-h$qHurqs;VnGanP6_aX&atS}7q z&h_#I>?|g*y-yJ)$(Pr?I0_T`7S4Mv63f~%Y%}g9U9g{-E3SP@D5CGff-nS zM8s8i+1Ib61W2P0Jbg83EYFqqD9_l{=!#UwVVa2Ug=6_)%l*RMWk?2dpNRQc_{|T4 z&)^gBZM2*6NA8);_RW`2Vb@$FiYZ+lPw6(2f0~%K^k*?u$tWZL2k4T!fyq!jqeuq9 z)B=7j?8L*1cZxI~pzg61eWp*{F5(+U=ExoH8}P`V9PJKu`@Bdr 
zHMga#kiGj$n9k>ljK)^Yfo*Gqu7oCK&RNd!cw#>uQu0q#EWnFW$b2_D$jgxXy06maOt!i&S}1ZK9ODX zy(2Nb*>)~ptF)ybq-qjfDs{$IpWf(W7*jL+K5*eV z4z9N~rHi7`S9(thK3FDkRK6C>LrBBJm>%gZ&**EY;SVT#=1 zNVg;@k45rGdwz+|FffL#U;p3Bo}dQrBP=4Y5dsP97X2hT%YA6O)n3sPN@(-dLePXr zQ!|g=jGr{0&5~KEuk}*ITG3sKo8YKUk^6XQ2FQcGoHK|uXO%=Dc77W&wViG8nL8}- zYEDO({HKmEHP`_&wM+#&9E=vhrm(QemY<9!0xb>Z<#F71=o(#tg}_36D@KUjnqWk4 z#4sJ%?n=^-U^|1yRdCmL;fRMzYA`#`hj__j8|AhUxc5W2k0iH~Q14McN+y`H)5qD9 zU?NQYvpwV^84hhUMb$dD;N7ogYA*~5UCeIBV-uzR>SRTi?&#u>I?%aow&99 zI+w{~SUL*#xkWpPb+Gknih?2z%1w-*)A9~9V0baXGSpb`)g$%I7t#CbUaFufIdn~p zKvL6{X2#LjtL*pNzgQ``8*rl{J`W;Uu!aZeM_z;@F2$Cq9T9a%x&|bU=5VNN-)Kz2 zT#&lpOLO$gAymd{^~!N7RFa7KUoedVBoyJ2%RH<~SlVv+yC4dQg^#D(bq>FjY=O#9 zXNa%dmRj;~*8jV+AlP$fz@%Di*S(%U{>&6b`b(H8i^m&+rjTd59 z7j<%`xKuIdnYMepkN({rJ?!?D#AKuqXrE$&ISz<+(_5A=(q^uNiG!P3k0Uxm zEXT;^3rGbKo8g@5EQ%A~UcEgUA?@M1ziir9^o0XtGjbpNhq56oAHR&0=Wtm~zgpW) zt-`_S^8YK_a30nmn=D!NYUydPo=0c2J`RVuNWDC?)_9Smk38&CFzuxMHpd$BFh>U^ z7itUinu-}OGX_BKMFf;l1u0|(OLgl3o!2K!&Sf_-$1PZE>=EGLfqQzI!WG4>~i zbM;8A&F`0ga2~lhXbc!%_4KAa52dfRN`E#^q3Fjb&Eaw(cC0rXN-?z8A&#^8jsWtR z7gOeoBeIQpdQ zY$z?<6uWtJqnxfJk?m}@bvuQGL8U{YG-&K90|~`XW};qgmG-0>(^1+M-(+uKGOm*h z*z!R=wLG#%Hxfdb9!>qp^^m4pok^upQLy@6Lvk|Hgr(UdgiOh>K3=Yn5cPMd3VChX ze-iQ{8SZX9f&1q~Ykt%Z=qz>I1b=*^vkSRTw5aG=MODahSMvNQjJCSjxu$0s;ok@T zD`3&d=6_*Vi0ryeb#>{V_e+l++wlNrB0@`4fv(sn;=0EVH5aY^GVYM6U3<1mFqp0qGgQe7Oel|n} zG_}iE_!s_e1kd*6)}BT^?#U?48m9Gexzz%tr3%xqQl_rzP57b)${FV}u32_B8xj|J zqmk^*6)W;v8ScW~H5!v&9mPxUo2oT|CS>huT+l|JyVUrBK}ogy-EA~cnhcQm82jFT zIl^EXm|MOi_6`GqmKrnq<>;>~^vEA>%a+-4swwOmkyWNyD9fFOUD&5rc5BrO>7*7l z8rC7@9mHMFJ?d}2m%0h;PIOVGWmd9hfh49`6wRWzo|dS^sN1KMuO!N{52qaBG2iVE z(esnx+Cu$TQq@Tg+_V^L)@)SxC2XJje0-Nb6O_xE>Fq~!S1gy?e&t|NKer}9Sn2&# zvs0C4(4H-fopC*cT2B%Fr zrDQr)V(^=kfNX+@n`8?=DeXL~dEtPQx!9B$S%ncz;KjRNJjXJ-R~WjuOv-Op?pCzC zi`#?;(Sqv#i*d)mKsr_-ybXVHJP64K9ZzWb9bG<0aXoK%tg0^69ZqESIRPwqOp1v? 
zmbf2hush7)dYf$ECmlxMKEUX9YtPpzH>t^+%x}Wpu8H#Onv1N-wS9bvqePC*xA(TI z@c%-&Os|2Sqf|p&cjxCocxJUCq+}e}e`ff#VG4u(IwivOm<{19xAN+{8mA99{4i+EwUOhuQgg;%_47N?X_~Thjwatp zr7lm_KFdEdB@6h5W{JWIA(RSJBB{{Aj=9aHTYl{TS&wEf5vAdP)5?`HsCtNFLctt^ zPc8OktL|u@vn%M+?s6q1zeNwhXz;T_gfWT9=5-veO}+3KWUEi^_|=~~#^lZ=dmMg8 z1!j4BcU&WEQ*7_XOM(e$GA#`IX7K8gy%X}7-w#0jZQ!wNEG+~Izb9hcx@!}IYKuLrh`fOvkRgpM6S3zHX8-l=RXnnrpg)g zrJf{;dFUsGd_|WGx7a9<-2nmAKf#{hK;Tq#nBl1$eq%YP`-vTW#_g?9Gk&IKCli zq%seaoYr>L1M{l%)P1zSy-Y%ec)y7QYb!YqB&r~P&_Sts^OV-zGih#G^WvD53~EHX znY7NQE;!GXA5B&KaN@dONpwD25a;AL%WXL4LQ$4U_TT&S!28QBTx#v?3Sc{KC^#H#bexpP9Kp+Fir7Y+EcK|XqjSHj-Kxwj!ejlTC}Q9feRaMH zO+SPFFov`xGOWyJDB*0EWO?#kzt@*xH4-G# z4(k@xLz0&wjpkAP#dASzzqn^FG<&As zXxMFhnxEmumP2U{h%x`{3x*Y48#}$9-Zcl^Bo+(Pq%&&uHytO?trja^SIK1nu%HiF z`uFcP@ojSd@KPu+x0J}CCYad){EJ;#II|g9AQ=-S`T2Z*&R+RCIgBohekGG%Yg-NP zGM){hYnG+svO(W>`bcUKF9#L6iZu z>h_kW4j%6Jm=SL=?!Zs=Rj{1h6^I-y5zmDT!TF zqPQG~T=F<&WXqQpWH(=)4e?%xgi+N!F&Q}6s>+C@r;8<9`J+`5;1kbQ@XmGO>!H!e z#imTHzdvUs$)um$L0RtobTPI&{4M!059dqWfCS#F)&30<_laL@(BMFJ z4|ZFJnmrI+kE8s+`g0xc41x4VPy(O7n-LSAtIzq>Dy@sw*Psaq$j4({7QjTj%;K%kl1R^VcYnr*A2E4K5G2&>pI}B&^M1Mvu<&McSRFptu~xT)H?#u@Zh0Pir=B66J9eDUqrmu{c zN!(~l#~V>FYD8MPBl1`~`k-e60=`}$8jhmt+V^^%t5r{?Y~E@$VD4$jfw%p}B>oHk z1j1%;k;PmZ4#)D_0|wU+t{lU#)!>r>$?={%jRq6^Px2~=KZ5L9`eT^YOCWLL>~kh# zc0ZpMj3&Rh6}VExW|9Hv6kolk=|p~%y=-Zko`mKK_F3xK&dd%Kwi6t$lN`k+weFsb zRDUwV{*mNXXiLp?%TvIfS;Fd&+3~4J-?4dgqB7DJcitFl>3QVwtg0i_CC5;%YWUla z)!7+H3kX=S{Sg;H5fHkf$HTi%R&3_cNftyi{XrE``kUyJ>J0`I$@4eYIa%b#B@r_4 zbc!NW@uX<@AIGx%s&luRnpkX2e&2mBq!8WxCw`*~W*k;y8^u(ch;`isl-k#|vJndi6#r5tg z4#eOKubk5WPJ>d3(P(W|E zL>6gIx2g;X3E9HI-ssZ2wvM6C7-C>|dpj;|z}Cu54h!bty^n$miLqJReFDFM=~I7N z`hAOw=CSK%o#j%$Ov=C|zI{HJYe|eHTqsavl!OA`)pr6`ejJW#7BT==+&tzRRaQ*@ z$G=ciq`+NY5s=5#|4mZ#(rnO+JQ9~Ph4SY43B1&uqGw8)dCdrd5bV@is4M&BjH`^{ zFrH>L;|onE2+x1Px?*0y!W&q>9T3Ej5wIHNf0;F4bEIni#xE<)VK+#AGv`JIjlTB1 zGSLN-1JcCqKbc>?2%zT$Q>ur)AP*5c>8RP{VBYE$4*GeU3FVXyl1MlIvo|$TgZn$; zQ8bnl5e%5;BhdyIpqs$)M4hHGa&fxG7rK(?*XZ46NKJ4Xse*%#cti@PLc=T7N9CdQ zv%4<%!khE_yAI4{}olip?4}%+-hy=G9BugF= z)onfl>*59u1XN*3jUPKac^-LFaT{BVB81 zcuAa|@1mT@L(bM`#qC}k0;s&_VkJ);AwX{4_ zRn1~TMl)sqe5)41CSLEgyX)YGy=%1AjC)53{`kIF+#Eu`zD?z<-0j#n=*Tj>*uv7W zO&l<-{Nzt5_zdvqEdT=>VSsUf>}*JH_#-2L`0DPN0FK#&Pz&q0xIb>4{uaEjFiFb5 zh$u>6y?}-l^9~hF8e*cu4@B-ACL+77OD#6~_oNg2s}V}I{Ml1Ie;f+TPMteNLTG1r z@p%LNb3JuB+46O(sH+xlTNb2qHBO`)-&`G@?rM%^hZuK#Jiy(~FH0{}&j16QYHUW( z{!na)0yxX+L*Zd1@GW+7IJ}0z-{Y#}EDn%p#ukPHOis>;a|hTpX<5!6Iuxktn-FG4 zW1U_*<=C0$m}{;D@l7|bkIry^IKKOCkN8>i-;X^;0!>tO)<+R2SNosAR?3jx@VA55 z$k-B4L1DFNsX6P-qE@F%BJ&YE&FGxBcA-*Sl1$8mA%83>X%KK{m-pKHS2FKss%kA3 zRN?TN7JpZ@Q`~KC_2u9dXx!N2a7cY49TYkKDbG2x+pn@ejw(T$7u9B+d?I32J4clr zR)>>Uq?eKwjr^x{P^Zj&yt?EKyV?eP3SJ`PC~F=8F0N4A$l+)Gqqh6>N6tNEqp2x` zUE1C)K~HEZ3>Ut(7pX)*3bMBULLM2QmT)2#>hrOYdYMu}xONon^GtEpT#`Ls_KS_O zTbSmQq}0nv$*x|Qs|88JC*ivVMlnX2)pkKv9HvZVHw6OEOwuoQqIj=C{bS@FT8ii2 zCo{l$oxBPgRIYoI7D+xukCcmEUfy+W5DtURNZ?^lXRJ4(ew!ptwtes1@hvog*86YO z&l&9YnhNb^E7?vzdVbG}&)oe{fX#8Bwst|1&>ml-=ewJ@&(QEL1)q=BdPn_%O~2>d zBf{1C9p3S{1y5A+skijKRD}S>li<$14MTwWV!4K}FVs!^0c9RfU?p4&c(0P;@Oop? 
zI(@i6-}pC_o%%e_3lfpu9m5Ztru5M3IvmGsL$7NmuX;VmeheMmWTR!q#>B*!b0t2D z$obXD+X@Wscd^_PXy(QPq$NpzO?sz`$LdGXtOxR|b)x&4ZZe>srk6cMR`n`9>WHMF z@TwxPm(VMf<%8!)R#15l!w}a_+p*=}mbV&OKry8vlf+887uu<0{oGl*Zs>LmJpF7| z0xk!pZ?r$J?|)m*8>p^QFdafT1J&W65Xe5UAOO2KuV(V;mZ}*|f~6mewoq#9Or1E1 zGik$N=-S~R@BbZ-?@s>%=TsW}9M|aqkw(qzZS)`^sx;^I__7y!sr99XV!=@^*SYTYkN%DM^ z%#fWyUr}9&_O^hUpA(^;tpKO??4H;4so=FXM)yul-aY%Yql7h7$Edz{sE`pUuLV~9 z!XjU+46<)8hBwN;8MGl84yIa<1pn>rBUOh_R7T?xQTh-bRtJBu($9`+fty>R?m~&y zb$s|>)>Wy+ijn#h67bpebEuel|&lxM{JA9db|2NRr`x;E8flSBd z#HkHm3p&oJftH#`4f_-w=>TFL^19d!im2)j#yDn}RE3nqS}pJeELIO#Uiym#TS{wv zi+QA@Fi86W(rZo9*ng5#V;iw$Y~SvOV>DTASjZZ9{P>PWYu5!UQl_-hVu~;hH>V1n zTS5kvsz$Q0>sC>c?$;P>@e1Sop^_^Uo5>o)L9Z}Bo`_J0mtIPrZck*_b>MBm3doY} z1D!-U$HHhOAq>ZIxq7sji1WSe|r9K)mKfGc{rs zNB!Xp_MbDqqylGeq8;QVC2`L7agbKFKcVJfcf**;w>rw`z37%Tj8b$nn5TJ4^s>6f zXq5IyCzSXRoa7_&KecDYxPXt0JwDi-A;><2Lz4)Sb-DI>W?rAsJk8RAz%f3eSySmi zRok(>kHH}_`}BTKo@SGC#LDUsN;)lE?MPcO7$xF(D6>Xy-D0fE#OV(Bo=1mpPUtRJhbD|#E>D5zEi ziPDR6BdDt*tKNE7CZuZLLs)06PJ3fx9A)3ggaM+LLP1uy5c_=-c=8e9r?>bBp%38# zU9tF&s1Su+A51ewI{CTd)?2Zme&0wv==TLe_U-&@N-lqhfmos#$ADLQQAdHF2) zwF=-{=G^E`ojpCJ`NDXw)If3`FRsI?WAt+pB5P`1g|L(3qytaYdhuAj!?QAk%{PKr zl+_=$iki$APO{AQQcWvq@Gr~d%84!9M~aflq1QwA2z3;>ve9lyg|t}Yu6|fD$-uS2 z@5KB7Ds?3;tqJu{v@dAnjh~Skm3(zLrK@V-Yh*0{Ymb0cls39tzD|=zs^Ln!LfU(7 z#5PEB$%eIMAi*wF#XM%ni~Qh3Y|!#ncaAZFMy4yFCo*?Dn)3~?oONdrU5_d}8L484 z6R{}jR4yh%k!`!5Ympx;4@{+H)oaxCZ0I%e%oQ(IZ5neWTq)3LDVZg4D--DJ(HGNI z2NBUh;;I}I8m|;5M*^Nr0RUXwI{jV9Wlx)EG zr2pkXwA1Be}|b>*%y)mb!V4s&YF>f{q#9bXfg3@dtVO&JQtjpdg4ufYv`qH~apl}{Cl z_q)mXg9&MfmJD#UAKOuQ4tkj>Uop2F^2cR?5h(^>gq&OwuM0McmspisLlCH^atg#} z7HYaa&odg$%`oi+QqtgP=ucnyexnA5L%$$Kw+rh4dL~@KHmjzwrdzLlRHV1L6~{5v zYNe00%Hkm%^K>h5r;$F0=!eTP^3TNhg&z>m+myJrq^wS-h`lMb zie`JfO@Bb3HWjkjj}5eCj3tGv9EK-V&L>E^8nV0I(5Q&3x;+}z9E$=+EaKA9UdK?J zd2Eh52-p-g7MRU47u0y`pVZE+`g?7Sh=x@(Q0A63j@d4c$;{;>hs5xe3phDYzA!R3 zni5Jx4mLtPaE;^mENgSti3da9q~3^@v~Wgvzd5$bhd~=ylOUtAKv`}#tg38gdoazL zq3P0+xnq8^DTLl7!Iadd&FA6ZOa(-CgbtD$e5|NdTcmC`V>1b+eu*bNLTijXi63R^ z`c5>~nA~HAa`j3w_ruo$^BI|0TfO%|M3dUyrzTUrvkCaF>ZTZdB(A}kQW1?8Ql+h< zTLr62*hMZu#2&n-$%}*zQ)y-XzJ_s8F}8E+>tm2hf@o4|8Yd76*Xysy6_G?vlz&TK z*Y^}IR((9LzG8`0O{1ijW7Z{PGZUGBo#BNh^_0Wl%tf|30W?{9IeK0%XOS*{EVi)V z$du*rq{dv&?BpsH&M94Q^*QeHl9)N`hS?FZ9w1%V#YWHa_rZeU$m&;4(#=q2?`^%QxO~pCtdFpRY+m$+a|QVE1qs z3@ln7P4gI@0ri(&US5vByzs0qwyK0rz_1h*NF006ic*NxZz}z>ub>U6J#f9sh$6;3 zs{!S(5@P8YP}I0pVwkBn8i2p8r}1dn=IPa87z?DWK$2>fDFZCHme?_3lm5n5M1INdzPx$2Rdk z(WSlwIUqKUjVVdpBsknWx5uFG!;J?~^DLOzpVURaHh*G1YbyaP98!v#)i1o5pj;-o zrmvb#6~N$;FN#~rzK=$b&Z+y}X6ctV30tcIJlf+I&4)(JGm{dfEdz}%7wi3e(17E^ z@|FQP>5AdxI{6T*k1NaHUF6s8lh`bbp`GaBM_9Toe1(cLA|VsXNTs z5>DG|W+if4PURpcn#8sX|2 z4gEZ%W!+iYx;PU;o4Gjuil#Gn>Y#giDWA&3ee1V?!oFPH_b#+*2_@PcLj7A4W%}Hw z_&vSoUUL)Axp-!sM~yxw3qSevb~A@{mQlTyg+QW0`nVG7aio}S5)m=9o36-UzS$#J zu^^V#QYy#9!N>iKClu6pv2&5n%x(?=kCTy5hvS$$$no-SE@vv_>EQtmZL_BaG zV;Po+S+h)Q={w$n9OkRm#DP?bGMQe#2KLq+(;fq@+55MZ+BbG5l&m;sExQJK?WLFC zp-V|@_$-|gv_csADGSa@iR)xhG4(upJ9hz+2)#~BIoFDrR4Z+vON>XoeRYv0FuLb0)&W{$i zs?(eVj4C0)TvmB>MP@Ur4QqzaCbl>$)CWd{oQ?n2-ggE?xi#wof)dPt2#Bx^fGD6u z0SUSlL~_nSGLlqs8iI<-77@u&OynSGkPIR@XNF-06a<7Bg5)r8dKBDxk9F>?y1!1< z=MT#o3g7jv>F)LP)4iVV;}dpt=v6bW>T=dUn(&>HfhLt!a;fBgS4wu&rx}|YIV2Op zV`qC7DWsBPTp5wemNZ=o_NFCX&Z*tltB-7XRmBEvL68&fi2c~VsGDSm=HI@Lug79- z+ZsL#=k-SD?5!^&g*o5idmYlOYoMFYc_+`os_A)ij3 zJg4(Wq8L}r5aD_;Uti+RQMjnQf~v-)H4ERf>y+Hk=@%&2n6IFG9g1N$=i?W)cp7Bo>RN76MDw=o{wfqtK>i3c1s7=oDB_n zI!>SrlM7lhH9Inss1ka+S-IL(9rY~in_e#qHkXz4*cPUywZ&!aqw%u4Q+0D-;=@Rz z?yjfX@Jp2_nN{HEJ)mBJAPD+@T&g>AU3cX2HLsvWcqw9B3;zcW&~V74$)haNV?9V=n+JcfRXL2C#? 
[base85-encoded binary image data omitted]
literal 0
HcmV?d00001

diff --git a/docs/bookdown-pou_files/figure-html/unnamed-chunk-299-1.png b/docs/bookdown-pou_files/figure-html/unnamed-chunk-299-1.png
new file mode 100644
index 0000000000000000000000000000000000000000..1ee82b6191254365844e6d1ec1163b091195cb54
GIT binary patch
literal 92696
[base85-encoded binary image data omitted]
literal 0
HcmV?d00001

diff --git a/docs/boot.html b/docs/boot.html
index aaeef60..80f30f8 100644
--- a/docs/boot.html
+++ b/docs/boot.html
@@ -23,7 +23,7 @@

diff --git a/docs/ci.html b/docs/ci.html
index 98f7579..4c0e9ac 100644
--- a/docs/ci.html
+++ b/docs/ci.html
@@ -23,7 +23,7 @@

diff --git a/docs/condprob.html b/docs/condprob.html
index a654765..a3e195d 100644
--- a/docs/condprob.html
+++ b/docs/condprob.html
@@ -23,7 +23,7 @@

diff --git a/docs/crv.html b/docs/crv.html
index 2341547..4d043c0 100644
--- a/docs/crv.html
+++ b/docs/crv.html
@@ -23,7 +23,7 @@

diff --git a/docs/distributions-intutition.html b/docs/distributions-intutition.html
index 5a4f4d9..c25a59e 100644
--- a/docs/distributions-intutition.html
+++ b/docs/distributions-intutition.html
@@ -23,7 +23,7 @@
@@ -327,7 +327,7 @@

Chapter 18 Distributions intuition

      18.1 Discrete distributions

      -

Exercise 18.1 (Bernoulli intuition 1) Arguably the simplest distribution you will encounter is the Bernoulli distribution. +

Exercise 18.1 (Bernoulli intuition 1) Arguably the simplest distribution you will encounter is the Bernoulli distribution. It is a discrete probability distribution used to represent the outcome of a yes/no question. It has one parameter \(p\), which is the probability of success. The probability of failure is \((1-p)\), sometimes denoted as \(q\).
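One quick way to check any candidate die event is to estimate its probability by simulation. The snippet below is an added sketch, not part of the original exercise; it assumes a fair six-sided die and uses only base R's sample().

set.seed(1)
throws <- sample(1:6, 100000, replace = TRUE)  # simulated throws of a fair die

# estimated probabilities of some candidate events
mean(throws %in% c(2, 4, 6))        # even number, close to 0.5
mean(throws %in% c(2, 3, 4, 5, 6))  # greater than 1, close to 5/6
mean(throws %in% c(3, 6))           # divisible by 3, close to 1/3 (so q is about 2/3)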

      @@ -345,10 +345,10 @@

      18.1 Discrete distributions\(q = \frac{2}{3}\)

    -

    +

    -

    Exercise 18.2 (Binomial intuition 1) The binomial distribution is a generalization of the Bernoulli distribution. +

    Exercise 18.2 (Binomial intuition 1) The binomial distribution is a generalization of the Bernoulli distribution. Instead of considering a single Bernoulli trial, we now consider a sequence of \(n\) trials, which are independent and have the same parameter \(p\). So the binomial distribution has two parameters \(n\) - the number of trials and \(p\) - the probability of success @@ -385,7 +385,7 @@
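Both parts can be checked numerically with dbinom(). The lines below are an added sketch (assuming R), not part of the original exercise.

# part a: with n = 1 the binomial pmf collapses to the Bernoulli pmf
p <- 0.5
dbinom(0:1, size = 1, prob = p)   # returns c(1 - p, p)

# part b: P(X = k) for k = 0..3 heads in 10 flips with p = 0.8
dbinom(0:3, size = 10, prob = 0.8)
# roughly 1.0e-07, 4.1e-06, 7.4e-05, 7.9e-04: all tiny, none of them zero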

    18.1 Discrete distributions -

    +

    -

    Exercise 18.3 (Poisson intuition 1) Below are shown 3 different graphs of the Poisson distribution. Your task +

    Exercise 18.3 (Poisson intuition 1) Below are shown 3 different graphs of the Poisson distribution. Your task is to replicate them on your own in R by varying the \(\lambda\) parameter.

    Hint: You can use dpois() to get the probabilities.

    -

    +

    -
    library(ggplot2)
    -library(gridExtra)
    -
    -x = 0:15
    -
    -# Create Poisson data
    -data1 <- data.frame(x = x, y = dpois(x, lambda = 0.1))
    -data2 <- data.frame(x = x, y = dpois(x, lambda = 1))
    -data3 <- data.frame(x = x, y = dpois(x, lambda = 7.5))
    -
    -# Create individual ggplot objects
    -plot1 <- ggplot(data1, aes(x, y)) + geom_col() +
    -  xlab("x") + ylab("Probability") + ylim(0,1)
    -
    -plot2 <- ggplot(data2, aes(x, y)) + geom_col() +
    -  xlab("x") + ylab(NULL) + ylim(0,1)
    -
    -plot3 <- ggplot(data3, aes(x, y)) + geom_col() +
    -  xlab("x") + ylab(NULL) + ylim(0,1)
    -
    -# Combine the plots
    -grid.arrange(plot1, plot2, plot3, ncol = 3)
    +
    library(ggplot2)
    +library(gridExtra)
    +
    +x = 0:15
    +
    +# Create Poisson data
    +data1 <- data.frame(x = x, y = dpois(x, lambda = 0.1))
    +data2 <- data.frame(x = x, y = dpois(x, lambda = 1))
    +data3 <- data.frame(x = x, y = dpois(x, lambda = 7.5))
    +
    +# Create individual ggplot objects
    +plot1 <- ggplot(data1, aes(x, y)) + geom_col() +
    +  xlab("x") + ylab("Probability") + ylim(0,1)
    +
    +plot2 <- ggplot(data2, aes(x, y)) + geom_col() +
    +  xlab("x") + ylab(NULL) + ylim(0,1)
    +
    +plot3 <- ggplot(data3, aes(x, y)) + geom_col() +
    +  xlab("x") + ylab(NULL) + ylim(0,1)
    +
    +# Combine the plots
    +grid.arrange(plot1, plot2, plot3, ncol = 3)
    -

    Exercise 18.4 (Poisson intuition 2) The Poisson distribution is a discrete probability distribution that models the +

Exercise 18.4 (Poisson intuition 2) The Poisson distribution is a discrete probability distribution that models the probability of a given number of events occurring within processes where events occur at a constant mean rate and independently of each other - a Poisson process.

    It has a single parameter \(\lambda\), which represents the constant mean rate.

    @@ -454,7 +454,7 @@

    18.1 Discrete distributions
    -

    +

    -

    Solution. Parameter p (the probability of success) for rolling a 6 is \(p=\frac{1}{6}\).

    +

    Solution. Parameter p (the probability of success) for rolling a 6 is \(p=\frac{1}{6}\).

    -
    library(ggplot2)
    -
    -# Parameters
    -p <- 1/6
    -x_vals <- 0:9  # Starting from 0
    -probs <- dgeom(x_vals, p)
    -
    -# Data
    -data <- data.frame(x_vals, probs)
    -
    -# Plot
    -ggplot(data, aes(x=x_vals, y=probs)) +
    -  geom_segment(aes(xend=x_vals, yend=0), color="black", size=1) +
    -  geom_point(color="red", size=2) + 
    -  labs(x = "Number of trials", y = "Probability") +
    -  theme_minimal() +
    -  scale_x_continuous(breaks = x_vals) # This line ensures integer x-axis labels
    -

    +

    library(ggplot2)
    +
    +# Parameters
    +p <- 1/6
    +x_vals <- 0:9  # Starting from 0
    +probs <- dgeom(x_vals, p)
    +
    +# Data
    +data <- data.frame(x_vals, probs)
    +
    +# Plot
    +ggplot(data, aes(x=x_vals, y=probs)) +
    +  geom_segment(aes(xend=x_vals, yend=0), color="black", size=1) +
    +  geom_point(color="red", size=2) + 
    +  labs(x = "Number of trials", y = "Probability") +
    +  theme_minimal() +
    +  scale_x_continuous(breaks = x_vals) # This line ensures integer x-axis labels
    +

    ::: {.solution} b) The expected value of a random variable (the mean) is denoted as \(E[X]\).

    \[E[X] = \frac{1-p}{p}= \frac{1- \frac{1}{6}}{\frac{1}{6}} = \frac{5}{6}\cdot 6 = 5\] @@ -533,17 +533,17 @@
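Both means are easy to confirm by simulation. The sketch below is an addition; it uses R's rgeom(), which, like the first form above, counts failures before the first success.

set.seed(1)
p <- 1/6
fails <- rgeom(100000, p)  # failures before the first 6

mean(fails)      # close to (1 - p) / p = 5
mean(fails + 1)  # number of trials including the success, close to 1 / p = 6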

    18.1 Discrete distributions

    18.2 Continuous distributions

    -

Exercise 18.6 (Uniform intuition 1) The need for randomness is a common problem. A practical solution is provided by so-called random number generators (RNGs). The simplest RNG one could think of is choosing a set of numbers and having the generator return a number at random, where the probability of returning any number from this set is the same. If this set is an interval of real numbers, then we've basically described the continuous uniform distribution.

    +

Exercise 18.6 (Uniform intuition 1) The need for randomness is a common problem. A practical solution is provided by so-called random number generators (RNGs). The simplest RNG one could think of is choosing a set of numbers and having the generator return a number at random, where the probability of returning any number from this set is the same. If this set is an interval of real numbers, then we've basically described the continuous uniform distribution.

    It has two parameters \(a\) and \(b\), which define the beginning and end of its support respectively.

    1. Let’s think of the mean intuitively. The expected value or mean of a distribution is the pivot point on our x-axis, which “balances” the graph. Given parameters \(a\) and \(b\) what is your intuitive guess of the mean for this distribution?
    2. A special case of the uniform distribution is the standard uniform distribution with \(a=0\) and \(b=1\). Write the pdf \(f(x)\) of this particular distribution.
    - +
    -

    Solution.

    +

    Solution.

    1. It’s the midpoint between \(a\) and \(b\), so \(\frac{a+b}{2}\)
2. Inserting the parameter values we get:\[f(x) = \begin{cases} 1 & \text{if } 0 \leq x \leq 1 \\ 0 & \text{otherwise} \end{cases} \] Notice how the pdf is just a constant \(1\) across all values of \(x \in [0,1]\). Here it is important to distinguish between probability and probability density. The density may be 1, but the probability is not; and while discrete distributions never exceed 1 on the y-axis, continuous densities can go as high as you like.
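The two answers can also be checked numerically in R; the snippet below is an added sketch, not part of the original solution.

# mean of a Uniform(a, b) via numerical integration of x * f(x)
a <- 2
b <- 10
integrate(function(x) x * dunif(x, min = a, max = b), lower = a, upper = b)  # about 6, i.e. (a + b) / 2

# the standard uniform density is the constant 1 on [0, 1]
dunif(c(0.1, 0.5, 0.9), min = 0, max = 1)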

      18.2 Continuous distributions

    -

    Exercise 18.7 (Normal intuition 1) The normal distribution, also known as the Gaussian distribution, is a continuous distribution that encompasses the entire real number line. It has two parameters: the mean, denoted by \(\mu\), and the variance, represented by \(\sigma^2\). Its shape resembles the iconic bell curve. The position of its peak is determined by the parameter \(\mu\), while the variance determines the spread or width of the curve. A smaller variance results in a sharper, narrower peak, while a larger variance leads to a broader, more spread-out curve.

    +

    Exercise 18.7 (Normal intuition 1) The normal distribution, also known as the Gaussian distribution, is a continuous distribution that encompasses the entire real number line. It has two parameters: the mean, denoted by \(\mu\), and the variance, represented by \(\sigma^2\). Its shape resembles the iconic bell curve. The position of its peak is determined by the parameter \(\mu\), while the variance determines the spread or width of the curve. A smaller variance results in a sharper, narrower peak, while a larger variance leads to a broader, more spread-out curve.

    Below, we graph the distribution of IQ scores for two different populations.

    We aim to identify individuals with an IQ at or above 140 for an experiment. We can identify them reliably; however, we only have time to examine one of the two groups. Which group should we investigate to have the best chance?

    NOTE: The graph below displays the parameter \(\sigma\), which is the square root of the variance, more commonly referred to as the standard deviation. Keep this in mind when solving the problems.

    @@ -570,10 +570,10 @@

    18.2 Continuous distributionsBONUS: Look up the CDF of the normal distribution and input the appropriate values to determine the percentage of each population that comprises individuals with an IQ of 140 or higher.

    -

    +

    -

    Solution.

    +

    Solution.

1. Group 1: \(\mu = 100, \sigma=10 \rightarrow \sigma^2 = 100\) \[\frac{1}{\sqrt{2 \pi \sigma^2}} e^{-\frac{(x - \mu)^2}{2 \sigma^2}} = \frac{1}{\sqrt{2 \pi \cdot 100}} e^{-\frac{(140 - 100)^2}{2 \cdot 100}} \approx 1.34e-05\] Group 2: \(\mu = 105, \sigma=8 \rightarrow \sigma^2 = 64\) \[\frac{1}{\sqrt{2 \pi \sigma^2}} e^{-\frac{(x - \mu)^2}{2 \sigma^2}} = \frac{1}{\sqrt{2 \pi \cdot 64}} e^{-\frac{(140 - 105)^2}{2 \cdot 64}} \approx 3.48e-06\] So despite the fact that group 1 has a lower average IQ, its larger standard deviation means we are more likely to find individuals with an IQ of 140 or above in group 1.
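The same two densities are returned directly by dnorm(); this short check is an addition to the original solution.

dnorm(140, mean = 100, sd = 10)  # group 1, about 1.34e-05
dnorm(140, mean = 105, sd = 8)   # group 2, about 3.48e-06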

      18.2 Continuous distributions
    -
    library(ggplot2)
    -library(tidyr)
    -
    -# Create data
    -x <- seq(135, 145, by = 0.01)  # Adjusting the x range to account for the larger standard deviations
    -df <- data.frame(x = x)
    -
    -# Define the IQ distributions
    -df$IQ_mu100_sd10 <- dnorm(df$x, mean = 100, sd = 10)
    -df$IQ_mu105_sd8 <- dnorm(df$x, mean = 105, sd = 8)
    -
    -# Convert from wide to long format for ggplot2
    -df_long <- gather(df, distribution, density, -x)
    -
    -# Ensure the levels of the 'distribution' factor match our desired order
    -df_long$distribution <- factor(df_long$distribution, levels = c("IQ_mu100_sd10", "IQ_mu105_sd8"))
    -
    -# Plot
    -ggplot(df_long, aes(x = x, y = density, color = distribution)) +
    -  geom_line() +
    -  labs(x = "IQ Score", y = "Density") +
    -  scale_color_manual(
    -    name = "IQ Distribution",
    -    values = c(IQ_mu100_sd10 = "red", IQ_mu105_sd8 = "blue"),
    -    labels = c("Group 1 (µ=100, σ=10)", "Group 2 (µ=105, σ=8)")
    -  ) +
    -  theme_minimal()
    -

    +

    library(ggplot2)
    +library(tidyr)
    +
    +# Create data
    +x <- seq(135, 145, by = 0.01)  # Adjusting the x range to account for the larger standard deviations
    +df <- data.frame(x = x)
    +
    +# Define the IQ distributions
    +df$IQ_mu100_sd10 <- dnorm(df$x, mean = 100, sd = 10)
    +df$IQ_mu105_sd8 <- dnorm(df$x, mean = 105, sd = 8)
    +
    +# Convert from wide to long format for ggplot2
    +df_long <- gather(df, distribution, density, -x)
    +
    +# Ensure the levels of the 'distribution' factor match our desired order
    +df_long$distribution <- factor(df_long$distribution, levels = c("IQ_mu100_sd10", "IQ_mu105_sd8"))
    +
    +# Plot
    +ggplot(df_long, aes(x = x, y = density, color = distribution)) +
    +  geom_line() +
    +  labs(x = "IQ Score", y = "Density") +
    +  scale_color_manual(
    +    name = "IQ Distribution",
    +    values = c(IQ_mu100_sd10 = "red", IQ_mu105_sd8 = "blue"),
    +    labels = c("Group 1 (µ=100, σ=10)", "Group 2 (µ=105, σ=8)")
    +  ) +
    +  theme_minimal()
    +

    ::: {.solution} c. The CDF of the normal distribution is \(\Phi(x) = \frac{1}{2} \left[ 1 + \text{erf} \left( \frac{x - \mu}{\sigma \sqrt{2}} \right) \right]\). The CDF is defined as the integral of the distribution density up to x. So to get the total percentage of individuals with IQ at 140 or higher we will need to subtract the value from 1.

Group 1: \[1 - \Phi(140) = 1 - \frac{1}{2} \left[ 1 + \text{erf} \left( \frac{140 - 100}{10 \sqrt{2}} \right) \right] \approx 3.17e-05 \] Group 2: \[1 - \Phi(140) = 1 - \frac{1}{2} \left[ 1 + \text{erf} \left( \frac{140 - 105}{8 \sqrt{2}} \right) \right] \approx 6.07e-06 \] So roughly 0.003% and 0.0006% of individuals in groups 1 and 2 respectively have an IQ at or above 140.
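The same tail probabilities follow from pnorm() without writing out the error function; the lines below are an added check.

pnorm(140, mean = 100, sd = 10, lower.tail = FALSE)  # group 1, about 3.17e-05
pnorm(140, mean = 105, sd = 8, lower.tail = FALSE)   # group 2, about 6.07e-06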

    18.2 Continuous distributions

    -

Exercise 18.8 (Beta intuition 1) The beta distribution is a continuous distribution defined on the unit interval \([0,1]\). It has two strictly positive parameters \(\alpha\) and \(\beta\), which determine its shape. Its support makes it especially suitable for modeling distributions of percentages and proportions.

    +

Exercise 18.8 (Beta intuition 1) The beta distribution is a continuous distribution defined on the unit interval \([0,1]\). It has two strictly positive parameters \(\alpha\) and \(\beta\), which determine its shape. Its support makes it especially suitable for modeling distributions of percentages and proportions.

    Below you’ve been provided with some code that you can copy into Rstudio. Once you run the code, an interactive Shiny app will appear and you will be able to manipulate the graph of the beta distribution.

    Play around with the parameters to get:

      @@ -673,7 +673,7 @@

      18.2 Continuous distributions
      -

Exercise 18.9 (Exponential intuition 1) The exponential distribution represents the distribution of time between events in a Poisson process. It is the continuous analogue of the geometric distribution.

      +

Exercise 18.9 (Exponential intuition 1) The exponential distribution represents the distribution of time between events in a Poisson process. It is the continuous analogue of the geometric distribution.

      It has a single parameter \(\lambda\), which is strictly positive and represents the constant rate of the corresponding Poisson process. The support is all positive reals, since time between events is non-negative, but not bound upwards.

Let's revisit the call center from our Poisson problem. We get 2.5 calls per day on average; this is our rate parameter \(\lambda\). A work day is 8 hours.

        @@ -701,7 +701,7 @@

        18.2 Continuous distributions
        -

        Exercise 18.10 (Gamma intuition 1) The gamma distribution is a continuous distribution characterized by two parameters, \(\alpha\) and \(\beta\), both greater than 0. These parameters afford the distribution a broad range of shapes, leading to it being commonly referred to as a family of distributions. Given its support over the positive real numbers, it is well suited for modeling a diverse range of positive-valued phenomena.

        +

        Exercise 18.10 (Gamma intuition 1) The gamma distribution is a continuous distribution characterized by two parameters, \(\alpha\) and \(\beta\), both greater than 0. These parameters afford the distribution a broad range of shapes, leading to it being commonly referred to as a family of distributions. Given its support over the positive real numbers, it is well suited for modeling a diverse range of positive-valued phenomena.

        1. The exponential distribution is actually just a particular form of the gamma distribution. What are the values of \(\alpha\) and \(\beta\)?
        2. Copy the code from our beta distribution Shiny app and modify it to simulate the gamma distribution. Then get it to show the exponential.
        3. @@ -729,7 +729,7 @@

          18.2 Continuous distributions
          -
          # Install and load necessary packages
          -install.packages(c("shiny", "ggplot2"))
          -library(shiny)
          -library(ggplot2)
          -
          -# The Shiny App
          -ui <- fluidPage(
          -  titlePanel("Gamma Distribution Viewer"),
          -  
          -  sidebarLayout(
          -    sidebarPanel(
          -      sliderInput("shape", "Shape (α):", min = 0.1, max = 10, value = 2, step = 0.1),
          -      sliderInput("scale", "Scale (β):", min = 0.1, max = 10, value = 2, step = 0.1)
          -    ),
          -    
          -    mainPanel(
          -      plotOutput("gammaPlot")
          -    )
          -  )
          -)
          -
          -server <- function(input, output) {
          -  output$gammaPlot <- renderPlot({
          -    x <- seq(0, 25, by = 0.1)
          -    y <- dgamma(x, shape = input$shape, scale = input$scale)
          -    
          -    ggplot(data.frame(x = x, y = y), aes(x = x, y = y)) +
          -      geom_line() +
          -      labs(x = "Value", y = "Density") +
          -      theme_minimal()
          -  })
          -}
          -
          -shinyApp(ui = ui, server = server)
          +
          # Install and load necessary packages
          +install.packages(c("shiny", "ggplot2"))
          +library(shiny)
          +library(ggplot2)
          +
          +# The Shiny App
          +ui <- fluidPage(
          +  titlePanel("Gamma Distribution Viewer"),
          +  
          +  sidebarLayout(
          +    sidebarPanel(
          +      sliderInput("shape", "Shape (α):", min = 0.1, max = 10, value = 2, step = 0.1),
          +      sliderInput("scale", "Scale (β):", min = 0.1, max = 10, value = 2, step = 0.1)
          +    ),
          +    
          +    mainPanel(
          +      plotOutput("gammaPlot")
          +    )
          +  )
          +)
          +
          +server <- function(input, output) {
          +  output$gammaPlot <- renderPlot({
          +    x <- seq(0, 25, by = 0.1)
          +    y <- dgamma(x, shape = input$shape, scale = input$scale)
          +    
          +    ggplot(data.frame(x = x, y = y), aes(x = x, y = y)) +
          +      geom_line() +
          +      labs(x = "Value", y = "Density") +
          +      theme_minimal()
          +  })
          +}
          +
          +shinyApp(ui = ui, server = server)
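The claim that the exponential is the \(\alpha = 1\) case can also be verified numerically. The check below is an addition; note that R's dgamma() accepts the rate \(\beta\) through its rate argument, whereas the app above passes scale, which equals \(1/\beta\) when \(\beta\) is a rate.

lambda <- 0.5
x <- seq(0, 10, by = 0.1)

# a gamma with shape alpha = 1 and rate beta = lambda reduces to the exponential
all.equal(dgamma(x, shape = 1, rate = lambda), dexp(x, rate = lambda))  # TRUE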

        diff --git a/docs/distributions.html b/docs/distributions.html index 5c0ce22..24ad8ae 100644 --- a/docs/distributions.html +++ b/docs/distributions.html @@ -23,7 +23,7 @@ - + diff --git a/docs/eb.html b/docs/eb.html index 268dc13..09cd76b 100644 --- a/docs/eb.html +++ b/docs/eb.html @@ -23,7 +23,7 @@ - + diff --git a/docs/ev.html b/docs/ev.html index 7da2fcf..c3e2dec 100644 --- a/docs/ev.html +++ b/docs/ev.html @@ -23,7 +23,7 @@ - + diff --git a/docs/index.html b/docs/index.html index 63ba5a9..589e5e1 100644 --- a/docs/index.html +++ b/docs/index.html @@ -23,7 +23,7 @@ - + @@ -300,7 +300,7 @@

        Preface

        diff --git a/docs/integ.html b/docs/integ.html index d0232ef..2d3e14f 100644 --- a/docs/integ.html +++ b/docs/integ.html @@ -23,7 +23,7 @@ - + diff --git a/docs/introduction.html b/docs/introduction.html index b738b9a..f472b1d 100644 --- a/docs/introduction.html +++ b/docs/introduction.html @@ -23,7 +23,7 @@ - + diff --git a/docs/lt.html b/docs/lt.html index fa2d250..f70caad 100644 --- a/docs/lt.html +++ b/docs/lt.html @@ -23,7 +23,7 @@ - + diff --git a/docs/ml.html b/docs/ml.html index 0275d52..01a8d4c 100644 --- a/docs/ml.html +++ b/docs/ml.html @@ -23,7 +23,7 @@ - + diff --git a/docs/mrv.html b/docs/mrv.html index c48af5f..e52e8ce 100644 --- a/docs/mrv.html +++ b/docs/mrv.html @@ -23,7 +23,7 @@ - + diff --git a/docs/mrvs.html b/docs/mrvs.html index e68b63f..d08c574 100644 --- a/docs/mrvs.html +++ b/docs/mrvs.html @@ -23,7 +23,7 @@ - + diff --git a/docs/nhst.html b/docs/nhst.html index a5ce27f..bc849af 100644 --- a/docs/nhst.html +++ b/docs/nhst.html @@ -23,7 +23,7 @@ - + diff --git a/docs/reference-keys.txt b/docs/reference-keys.txt index 2c42ff2..11934ae 100644 --- a/docs/reference-keys.txt +++ b/docs/reference-keys.txt @@ -121,9 +121,9 @@ exr:unnamed-chunk-286 exr:unnamed-chunk-288 exr:unnamed-chunk-293 exr:unnamed-chunk-296 -exr:unnamed-chunk-297 -exr:unnamed-chunk-299 -exr:unnamed-chunk-300 +exr:unnamed-chunk-301 +exr:unnamed-chunk-303 +exr:unnamed-chunk-305 introduction measure-and-probability-spaces properties-of-probability-measures @@ -203,16 +203,3 @@ writing-functions other-tips further-reading-and-references distributions -exr:unnamed-chunk-5 -exr:unnamed-chunk-22 -exr:unnamed-chunk-24 -exr:unnamed-chunk-25 -exr:unnamed-chunk-26 -exr:unnamed-chunk-27 -create-data -define-the-iq-distributions -convert-from-wide-to-long-format-for-ggplot2 -ensure-the-levels-of-the-distribution-factor-match-our-desired-order -plot -exr:unnamed-chunk-28 -exr:unnamed-chunk-29 diff --git a/docs/references.html b/docs/references.html index 01a85c5..5415593 100644 --- a/docs/references.html +++ b/docs/references.html @@ -23,7 +23,7 @@ - + diff --git a/docs/rvs.html b/docs/rvs.html index 875aeda..a9abfa1 100644 --- a/docs/rvs.html +++ b/docs/rvs.html @@ -23,7 +23,7 @@ - + diff --git a/docs/search_index.json b/docs/search_index.json index 1362239..955f399 100644 --- a/docs/search_index.json +++ b/docs/search_index.json @@ -1 +1 @@ -[["distributions-intutition.html", "Chapter 18 Distributions intutition 18.1 Discrete distributions 18.2 Continuous distributions", " Chapter 18 Distributions intutition This chapter is intended to help you familiarize yourself with the different probability distributions you will encounter in this course. You will need to use Appendix B extensively as a reference for the basic properties of distributions, so keep it close! .fold-btn { float: right; margin: 5px 5px 0 0; } .fold { border: 1px solid black; min-height: 40px; } 18.1 Discrete distributions Exercise 18.1 (Bernoulli intuition 1) Arguably the simplest distribution you will enocounter is the Bernoulli distribution. It is a discrete probability distribution used to represent the outcome of a yes/no question. It has one parameter \\(p\\) which is the probability of success. The probability of failure is \\((1-p)\\), sometimes denoted as \\(q\\). A classic way to think about a Bernoulli trial (a yes/no experiment) is a coin flip. Real coins are fair, meaning the probability of either heads (1) or tails (0) are the same, so \\(p=0.5\\) as shown below in figure a. 
Alternatively we may want to represent a process that doesn’t have equal probabilities of outcomes like “Will a throw of a fair die result in a 6?”. In this case \\(p=\\frac{1}{6}\\), shown in figure b. Using your knowledge of the Bernoulli distribution use the throw of a fair die to think of events, such that: \\(p = 0.5\\) \\(p = \\frac{5}{6}\\) \\(q = \\frac{2}{3}\\) Solution. An event that is equally likely to happen or not happen i.e. \\(p = 0.5\\) would be throwing an even number. More formally we can name this event \\(A\\) and write: \\(A = \\{2,4,6\\}\\), its probability being \\(P(A) = 0.5\\) An example of an event with \\(p = \\frac{5}{6}\\) would be throwing a number greater than 1. Defined as \\(B = \\{2,3,4,5,6\\}\\). We need an event that fails \\(\\frac{2}{3}\\) of the time. Alternatively we can reverse the problem and find an event that succeeds \\(\\frac{1}{3}\\) of the time, since: \\(q = 1 - p \\implies p = 1 - q = \\frac{1}{3}\\). The event that our outcome is divisible by 3: \\(C = \\{3, 6\\}\\) satisfies this condition. Exercise 18.2 (Binomial intuition 1) The binomial distribution is a generalization of the Bernoulli distribution. Instead of considering a single Bernoulli trial, we now consider a sequence of \\(n\\) trials, which are independent and have the same parameter \\(p\\). So the binomial distribution has two parameters \\(n\\) - the number of trials and \\(p\\) - the probability of success for each trial. If we return to our coin flip representation, we now flip a coin several times. The binomial distribution will give us the probabilities of all possible outcomes. Below we show the distribution for a series of 10 coin flips with a fair coin (left) and a biased coin (right). The numbers on the x axis represent the number of times the coin landed heads. Using your knowledge of the binomial distribution: Take the pmf of the binomial distribution and plug in \\(n=1\\), check that it is in fact equivalent to a Bernoulli distribution. In our examples we show the graph of a binomial distribution over 10 trials with \\(p=0.8\\). If we take a look at the graph, it appears as though the probabilities of getting 0,1,2 or 3 heads in 10 flips are zero. Is it actually zero? Check by plugging in the values into the pmf. Solution. The pmf of a binomial distribution is \\(\\binom{n}{k} p^k (1 - p)^{n - k}\\), now we insert \\(n=1\\) to get: \\[\\binom{1}{k} p^k (1 - p)^{1 - k}\\]. Not quite equivalent to a Bernoulli, however note that the support of the binomial distribution is defined as \\(k \\in \\{0,1,\\dots,n\\}\\), so in our case \\(k = \\{0,1\\}\\), then: \\[\\binom{1}{0} = \\binom{1}{1} = 1\\] we get: \\(p^k (1 - p)^{1 - k}\\) ,the Bernoulli distribution. As we already know \\(p=0.8, n=10\\), so: \\[\\binom{10}{0} 0.8^0 (1 - 0.8)^{10 - 0} = 1.024 \\cdot 10^{-7}\\] \\[\\binom{10}{1} 0.8^1 (1 - 0.8)^{10 - 1} = 4.096 \\cdot 10^{-6}\\] \\[\\binom{10}{2} 0.8^2 (1 - 0.8)^{10 - 2} = 7.3728 \\cdot 10^{-5}\\] \\[\\binom{10}{3} 0.8^3 (1 - 0.8)^{10 - 3} = 7.86432\\cdot 10^{-4}\\] So the probabilities are not zero, just very small. Exercise 18.3 (Poisson intuition 1) Below are shown 3 different graphs of the Poisson distribution. Your task is to replicate them on your own in R by varying the \\(\\lambda\\) parameter. Hint: You can use dpois() to get the probabilities. 
library(ggplot2) library(gridExtra) x = 0:15 # Create Poisson data data1 <- data.frame(x = x, y = dpois(x, lambda = 0.1)) data2 <- data.frame(x = x, y = dpois(x, lambda = 1)) data3 <- data.frame(x = x, y = dpois(x, lambda = 7.5)) # Create individual ggplot objects plot1 <- ggplot(data1, aes(x, y)) + geom_col() + xlab("x") + ylab("Probability") + ylim(0,1) plot2 <- ggplot(data2, aes(x, y)) + geom_col() + xlab("x") + ylab(NULL) + ylim(0,1) plot3 <- ggplot(data3, aes(x, y)) + geom_col() + xlab("x") + ylab(NULL) + ylim(0,1) # Combine the plots grid.arrange(plot1, plot2, plot3, ncol = 3) Exercise 18.4 (Poisson intuition 2) The Poisson distribution is a discrete probability distribution that models the probability of a given number of events occuring within processes where events occur at a constant mean rate and independently of each other - a Poisson process. It has a single parameter \\(\\lambda\\), which represents the constant mean rate. A classic example of a scenario that can be modeled using the Poisson distribution is the number of calls received by a call center in a day (or in fact any other time interval). Suppose you work in a call center and have some understanding of probability distributions. You overhear your supervisor mentioning that the call center receives an average of 2.5 calls per day. Using your knowledge of the Poisson distribution, calculate: The probability you will get no calls today. The probability you will get more than 5 calls today. Solution. First recall the Poisson pmf: \\[p(k) = \\frac{\\lambda^k e^{-\\lambda}}{k!}\\] as stated previously our parameter \\(\\lambda = 2.5\\) To get the probability of no calls we simply plug in \\(k = 0\\), so: \\[p(0) = \\frac{2.5^0 e^{-2.5}}{0!} = e^{-2.5} \\approx 0.082\\] The support of the Poisson distribution is non-negative integers. So if we wanted to calculate the probability of getting more than 5 calls we would need to add up the probabilities of getting 6 calls and 7 calls and so on up to infinity. Let us instead remember that the sum of all probabilties will be 1, we will reverse the problem and instead ask “What is the probability we get 5 calls or less?”. We can subtract the probability of the opposite outcome (the complement) from 1 to get the probability of our original question. \\[P(k > 5) = 1 - P(k \\leq 5)\\] \\[P(k \\leq 5) = \\sum_{i=0}^{5} p(i) = p(0) + p(1) + p(2) + p(3) + p(4) + p(5) =\\] \\[= \\frac{2.5^0 e^{-2.5}}{0!} + \\frac{2.5^1 e^{-2.5}}{1!} + \\dots =\\] \\[=0.957979\\] So the probability of geting more than 5 calls will be \\(1 - 0.957979 = 0.042021\\) Exercise 18.5 (Geometric intuition 1) The geometric distribution is a discrete distribution that models the number of failures before the first success in a sequence of independent Bernoulli trials. It has a single parameter \\(p\\), representing the probability of success. NOTE: There are two forms of this distribution, the one we just described and another that models the number of trials before the first success. The difference is subtle yet significant and you are likely to encounter both forms. In the graph below we show the pmf of a geometric distribution with \\(p=0.5\\). This can be thought of as the number of successive failures (tails) in the flip of a fair coin. You can see that there’s a 50% chance you will have zero failures i.e. you will flip a heads on your very first attempt. But there is some smaller chance that you will flip a sequence of tails in a row, with longer sequences having ever lower probability. 
Create an equivalent graph that represents the probability of rolling a 6 with a fair 6-sided die. Use the formula for the mean of the geometric distribution and determine the average number of failures before you roll a 6. Look up the alternative form of the geometric distribtuion and again use the formula for the mean to determine the average number of trials before you roll a 6. Solution. Parameter p (the probability of success) for rolling a 6 is \\(p=\\frac{1}{6}\\). library(ggplot2) # Parameters p <- 1/6 x_vals <- 0:9 # Starting from 0 probs <- dgeom(x_vals, p) # Data data <- data.frame(x_vals, probs) # Plot ggplot(data, aes(x=x_vals, y=probs)) + geom_segment(aes(xend=x_vals, yend=0), color="black", size=1) + geom_point(color="red", size=2) + labs(x = "Number of trials", y = "Probability") + theme_minimal() + scale_x_continuous(breaks = x_vals) # This line ensures integer x-axis labels ::: {.solution} b) The expected value of a random variable (the mean) is denoted as \\(E[X]\\). \\[E[X] = \\frac{1-p}{p}= \\frac{1- \\frac{1}{6}}{\\frac{1}{6}} = \\frac{5}{6}\\cdot 6 = 5\\] On average we will fail 5 times before we roll our first 6. The alternative form of this distribution (with support on all positive integers) has a slightly different formula for the mean. This change reflects the difference in the way we posed our question: \\[E[X] = \\frac{1}{p} = \\frac{1}{\\frac{1}{6}} = 6\\] On average we will have to throw the die 6 times before we roll a 6. ::: 18.2 Continuous distributions Exercise 18.6 (Uniform intuition 1) The need for a randomness is a common problem. A practical solution are so-called random number generators (RNGs). The simplest RNG one would think of is choosing a set of numbers and having the generator return a number at random, where the probability of returning any number from this set is the same. If this set is an interval of real numbers, then we’ve basically described the continuous uniform distribution. It has two parameters \\(a\\) and \\(b\\), which define the beginning and end of its support respectively. Let’s think of the mean intuitively. The expected value or mean of a distribution is the pivot point on our x-axis, which “balances” the graph. Given parameters \\(a\\) and \\(b\\) what is your intuitive guess of the mean for this distribution? A special case of the uniform distribution is the standard uniform distribution with \\(a=0\\) and \\(b=1\\). Write the pdf \\(f(x)\\) of this particular distribution. Solution. It’s the midpoint between \\(a\\) and \\(b\\), so \\(\\frac{a+b}{2}\\) Inserting the parameter values we get:\\[f(x) = \\begin{cases} 1 & \\text{if } 0 \\leq x \\leq 1 \\\\ 0 & \\text{otherwise} \\end{cases} \\] Notice how the pdf is just a constant \\(1\\) across all values of \\(x \\in [0,1]\\). Here it is important to distinguish between probability and probability density. The density may be 1, but the probability is not and while discreet distributions never exceed 1 on the y-axis, continuous distributions can go as high as you like. Exercise 18.7 (Normal intuition 1) The normal distribution, also known as the Gaussian distribution, is a continuous distribution that encompasses the entire real number line. It has two parameters: the mean, denoted by \\(\\mu\\), and the variance, represented by \\(\\sigma^2\\). Its shape resembles the iconic bell curve. The position of its peak is determined by the parameter \\(\\mu\\), while the variance determines the spread or width of the curve. 
A smaller variance results in a sharper, narrower peak, while a larger variance leads to a broader, more spread-out curve. Below, we graph the distribution of IQ scores for two different populations. We aim to identify individuals with an IQ at or above 140 for an experiment. We can identify them reliably; however, we only have time to examine one of the two groups. Which group should we investigate to have the best chance? NOTE: The graph below displays the parameter \\(\\sigma\\), which is the square root of the variance, more commonly referred to as the standard deviation. Keep this in mind when solving the problems. Insert the values of either population into the pdf of a normal distribution and determine which one has a higher density at \\(x=140\\). Generate the graph yourself and zoom into the relevant area to graphically verify your answer. To determine probability density, we can use the pdf. However, if we wish to know the proportion of the population that falls within certain parameters, we would need to integrate the pdf. Fortunately, the integrals of common distributions are well-established. This integral gives us the cumulative distribution function \\(F(x)\\) (CDF). BONUS: Look up the CDF of the normal distribution and input the appropriate values to determine the percentage of each population that comprises individuals with an IQ of 140 or higher. Solution. Group 1: \\(\\mu = 100, \\sigma=10 \\rightarrow \\sigma^2 = 100\\) \\[\\frac{1}{\\sqrt{2 \\pi \\sigma^2}} e^{-\\frac{(x - \\mu)^2}{2 \\sigma^2}} = \\frac{1}{\\sqrt{2 \\pi 100}} e^{-\\frac{(140 - 100)^2}{2 \\cdot 100}} \\approx 1.34e-05\\] Group 2: \\(\\mu = 105, \\sigma=8 \\rightarrow \\sigma^2 = 64\\) \\[\\frac{1}{\\sqrt{2 \\pi \\sigma^2}} e^{-\\frac{(x - \\mu)^2}{2 \\sigma^2}} = \\frac{1}{\\sqrt{2 \\pi 64}} e^{-\\frac{(140 - 105)^2}{2 \\cdot 64}} \\approx 3.48e-06\\] So despite the fact that group 1 has a lower average IQ, we are more likely to find 140 IQ individuals in group 2. library(ggplot2) library(tidyr) # Create data x <- seq(135, 145, by = 0.01) # Adjusting the x range to account for the larger standard deviations df <- data.frame(x = x) # Define the IQ distributions df$IQ_mu100_sd10 <- dnorm(df$x, mean = 100, sd = 10) df$IQ_mu105_sd8 <- dnorm(df$x, mean = 105, sd = 8) # Convert from wide to long format for ggplot2 df_long <- gather(df, distribution, density, -x) # Ensure the levels of the 'distribution' factor match our desired order df_long$distribution <- factor(df_long$distribution, levels = c("IQ_mu100_sd10", "IQ_mu105_sd8")) # Plot ggplot(df_long, aes(x = x, y = density, color = distribution)) + geom_line() + labs(x = "IQ Score", y = "Density") + scale_color_manual( name = "IQ Distribution", values = c(IQ_mu100_sd10 = "red", IQ_mu105_sd8 = "blue"), labels = c("Group 1 (µ=100, σ=10)", "Group 2 (µ=105, σ=8)") ) + theme_minimal() ::: {.solution} c. The CDF of the normal distribution is \\(\\Phi(x) = \\frac{1}{2} \\left[ 1 + \\text{erf} \\left( \\frac{x - \\mu}{\\sigma \\sqrt{2}} \\right) \\right]\\). The CDF is defined as the integral of the distribution density up to x. So to get the total percentage of individuals with IQ at 140 or higher we will need to subtract the value from 1. 
Group 1: \\[1 - \\Phi(140) = \\frac{1}{2} \\left[ 1 + \\text{erf} \\left( \\frac{140 - 100}{10 \\sqrt{2}} \\right) \\right] \\approx 3.17e-05 \\] Group 2 : \\[1 - \\Phi(140) = \\frac{1}{2} \\left[ 1 + \\text{erf} \\left( \\frac{140 - 105}{8 \\sqrt{2}} \\right) \\right] \\approx 6.07e-06 \\] So roughly 0.003% and 0.0006% of individuals in groups 1 and 2 respectively have an IQ at or above 140. ::: Exercise 18.8 (Beta intuition 1) The beta distribution is a continuous distribution defined on the unit interval \\([0,1]\\). It has two strictly positive paramters \\(\\alpha\\) and \\(\\beta\\), which determine its shape. Its support makes it especially suitable to model distribtuions of percentages and proportions. Below you’ve been provided with some code that you can copy into Rstudio. Once you run the code, an interactive Shiny app will appear and you will be able to manipulate the graph of the beta distribution. Play around with the parameters to get: A straight line from (0,0) to (1,2) A straight line from (0,2) to (1,0) A symmetric bell curve A bowl-shaped curve The standard uniform distribution is actually a special case of the beta distribution. Find the exact parameters \\(\\alpha\\) and \\(\\beta\\). Once you do, prove the equality by inserting the values into our pdf. Hint: The beta function is evaluated as \\(\\text{B}(a,b) = \\frac{\\Gamma(a)\\Gamma(b)}{\\Gamma(a+b)}\\), the gamma function for positive integers \\(n\\) is evaluated as \\(\\Gamma(n)= (n-1)!\\) # Install and load necessary packages install.packages(c("shiny", "ggplot2")) library(shiny) library(ggplot2) # The Shiny App ui <- fluidPage( titlePanel("Beta Distribution Viewer"), sidebarLayout( sidebarPanel( sliderInput("alpha", "Alpha:", min = 0.1, max = 10, value = 2, step = 0.1), sliderInput("beta", "Beta:", min = 0.1, max = 10, value = 2, step = 0.1) ), mainPanel( plotOutput("betaPlot") ) ) ) server <- function(input, output) { output$betaPlot <- renderPlot({ x <- seq(0, 1, by = 0.01) y <- dbeta(x, shape1 = input$alpha, shape2 = input$beta) ggplot(data.frame(x = x, y = y), aes(x = x, y = y)) + geom_line() + labs(x = "Value", y = "Density") + theme_minimal() }) } shinyApp(ui = ui, server = server) Solution. \\(\\alpha = 2, \\beta=1\\) \\(\\alpha = 1, \\beta=2\\) Possible solution \\(\\alpha = \\beta= 5\\) Possible solution \\(\\alpha = \\beta= 0.5\\) The correct parameters are \\(\\alpha = 1, \\beta=1\\), to prove the equality we insert them into the beta pdf: \\[\\frac{x^{\\alpha - 1} (1 - x)^{\\beta - 1}}{\\text{B}(\\alpha, \\beta)} = \\frac{x^{1 - 1} (1 - x)^{1 - 1}}{\\text{B}(1, 1)} = \\frac{1}{\\frac{\\Gamma(1)\\Gamma(1)}{\\Gamma(1+1)}}= \\frac{1}{\\frac{(1-1)!(1-1)!}{(2-1)!}} = 1\\] Exercise 18.9 (Exponential intuition 1) The exponential distribution represents the distributon of time between events in a Poisson process. It is the continuous analogue of the geometric distribution. It has a single parameter \\(\\lambda\\), which is strictly positive and represents the constant rate of the corresponding Poisson process. The support is all positive reals, since time between events is non-negative, but not bound upwards. Let’s revisit the call center from our Poisson problem. We get 2.5 calls per day on average, this is our rate parameter \\(\\lambda\\). A work day is 8 hours. What is the mean time between phone calls? The cdf \\(F(x)\\) tells us what percentage of calls occur within x amount of time of each other. You want to take an hour long lunch break but are worried about missing calls. 
Calculate the percentage of calls you are likely to miss if you’re gone for an hour. Hint: The cdf is \\(F(x) = \\int_{-\\infty}^{x} f(x) dx\\) Solution. Taking \\(\\lambda = \\frac{2.5 \\text{ calls}}{8 \\text{ hours}} = \\frac{1 \\text{ call}}{3.2 \\text{ hours}}\\) \\[E[X] = \\frac{1}{\\lambda} = \\frac{3.2 \\text{ hours}}{\\text{call}}\\] First we derive the CDF, we can integrate from 0 instead of \\(-\\infty\\), since we have no support in the negatives: \\[\\begin{align} F(x) &= \\int_{0}^{x} \\lambda e^{-\\lambda t} dt \\\\ &= \\lambda \\int_{0}^{x} e^{-\\lambda t} dt \\\\ &= \\lambda (\\frac{1}{-\\lambda}e^{-\\lambda t} |_{0}^{x}) \\\\ &= \\lambda(\\frac{1}{\\lambda} - \\frac{1}{\\lambda} e^{-\\lambda x}) \\\\ &= 1 - e^{-\\lambda x}. \\end{align}\\] Then we just evaluate it for a time of 1 hour: \\[F(1 \\text{ hour}) = 1 - e^{-\\frac{1 \\text{ call}}{3.2 \\text{ hours}} \\cdot 1 \\text{ hour}}= 1 - e^{-\\frac{1 \\text{ call}}{3.2 \\text{ hours}}} \\approx 0.268\\] So we have about a 27% chance of missing a call if we’re gone for an hour. Exercise 18.10 (Gamma intuition 1) The gamma distribution is a continuous distribution characterized by two parameters, \\(\\alpha\\) and \\(\\beta\\), both greater than 0. These parameters afford the distribution a broad range of shapes, leading to it being commonly referred to as a family of distributions. Given its support over the positive real numbers, it is well suited for modeling a diverse range of positive-valued phenomena. The exponential distribution is actually just a particular form of the gamma distribution. What are the values of \\(\\alpha\\) and \\(\\beta\\)? Copy the code from our beta distribution Shiny app and modify it to simulate the gamma distribution. Then get it to show the exponential. Solution. Let’s start by taking a look at the pdfs of the two distributions side by side: \\[\\frac{\\beta^\\alpha}{\\Gamma(\\alpha)} x^{\\alpha - 1}e^{-\\beta x} = \\lambda e^{-\\lambda x}\\] The \\(x^{\\alpha - 1}\\) term is not found anywhere in the pdf of the exponential so we need to eliminate it by setting \\(\\alpha = 1\\). This also makes the fraction evaluate to \\(\\frac{\\beta^1}{\\Gamma(1)} = \\beta\\), which leaves us with \\[\\beta \\cdot e^{-\\beta x}\\] Now we can see that \\(\\beta = \\lambda\\) and \\(\\alpha = 1\\). # Install and load necessary packages install.packages(c("shiny", "ggplot2")) library(shiny) library(ggplot2) # The Shiny App ui <- fluidPage( titlePanel("Gamma Distribution Viewer"), sidebarLayout( sidebarPanel( sliderInput("shape", "Shape (α):", min = 0.1, max = 10, value = 2, step = 0.1), sliderInput("scale", "Scale (β):", min = 0.1, max = 10, value = 2, step = 0.1) ), mainPanel( plotOutput("gammaPlot") ) ) ) server <- function(input, output) { output$gammaPlot <- renderPlot({ x <- seq(0, 25, by = 0.1) y <- dgamma(x, shape = input$shape, scale = input$scale) ggplot(data.frame(x = x, y = y), aes(x = x, y = y)) + geom_line() + labs(x = "Value", y = "Density") + theme_minimal() }) } shinyApp(ui = ui, server = server) "],["404.html", "Page not found", " Page not found The page you requested cannot be found (perhaps it was moved or renamed). You may want to try searching to find the page's new location, or use the table of contents to find the page you are looking for. 
"]] +[["index.html", "Principles of Uncertainty – exercises Preface", " Principles of Uncertainty – exercises Gregor Pirš and Erik Štrumbelj 2023-10-01 Preface These are the exercises for the Principles of Uncertainty course of the Data Science Master’s at University of Ljubljana, Faculty of Computer and Information Science. This document will be extended each week as the course progresses. At the end of each exercise session, we will post the solutions to the exercises worked in class and select exercises for homework. Students are also encouraged to solve the remaining exercises to further extend their knowledge. Some exercises require the use of R. Those exercises (or parts of) are coloured blue. Students that are not familiar with R programming language should study A to learn the basics. As the course progresses, we will cover more relevant uses of R for data science. "],["introduction.html", "Chapter 1 Probability spaces 1.1 Measure and probability spaces 1.2 Properties of probability measures 1.3 Discrete probability spaces", " Chapter 1 Probability spaces This chapter deals with measures and probability spaces. At the end of the chapter, we look more closely at discrete probability spaces. The students are expected to acquire the following knowledge: Theoretical Use properties of probability to calculate probabilities. Combinatorics. Understanding of continuity of probability. R Vectors and vector operations. For loop. Estimating probability with simulation. sample function. Matrices and matrix operations. .fold-btn { float: right; margin: 5px 5px 0 0; } .fold { border: 1px solid black; min-height: 40px; } 1.1 Measure and probability spaces Exercise 1.1 (Completing a set to a sigma algebra) Let \\(\\Omega = \\{1,2,...,10\\}\\) and let \\(A = \\{\\emptyset, \\{1\\}, \\{2\\}, \\Omega \\}\\). Show that \\(A\\) is not a sigma algebra of \\(\\Omega\\). Find the minimum number of elements to complete A to a sigma algebra of \\(\\Omega\\). Solution. \\(1^c = \\{2,3,...,10\\} \\notin A \\implies\\) \\(A\\) is not sigma algebra. First we need the complements of all elements, so we need to add sets \\(\\{2,3,...,10\\}\\) and \\(\\{1,3,4,...,10\\}\\). Next we need unions of all sets – we add the set \\(\\{1,2\\}\\). Again we need the complement of this set, so we add \\(\\{3,4,...,10\\}\\). So the minimum number of elements we need to add is 4. Exercise 1.2 (Diversity of sigma algebras) Let \\(\\Omega\\) be a set. Find the smallest sigma algebra of \\(\\Omega\\). Find the largest sigma algebra of \\(\\Omega\\). Solution. \\(A = \\{\\emptyset, \\Omega\\}\\) \\(2^{\\Omega}\\) Exercise 1.3 Find all sigma algebras for \\(\\Omega = \\{0, 1, 2\\}\\). Solution. \\(A = \\{\\emptyset, \\Omega\\}\\) \\(A = 2^{\\Omega}\\) \\(A = \\{\\emptyset, \\{0\\}, \\{1,2\\}, \\Omega\\}\\) \\(A = \\{\\emptyset, \\{1\\}, \\{0,2\\}, \\Omega\\}\\) \\(A = \\{\\emptyset, \\{2\\}, \\{0,1\\}, \\Omega\\}\\) Exercise 1.4 (Difference between algebra and sigma algebra) Let \\(\\Omega = \\mathbb{N}\\) and \\(\\mathcal{A} = \\{A \\subseteq \\mathbb{N}: A \\text{ is finite or } A^c \\text{ is finite.} \\}\\). Show that \\(\\mathcal{A}\\) is an algebra but not a sigma algebra. Solution. \\(\\emptyset\\) is finite so \\(\\emptyset \\in \\mathcal{A}\\). Let \\(A \\in \\mathcal{A}\\) and \\(B \\in \\mathcal{A}\\). If both are finite, then their union is also finite and therefore in \\(\\mathcal{A}\\). Let at least one of them not be finite. Then their union is not finite. But \\((A \\cup B)^c = A^c \\cap B^c\\). 
And since at least one is infinite, then its complement is finite and the intersection is too. So finite unions are in \\(\\mathcal{A}\\). Let us look at numbers \\(2n\\). For any \\(n\\), \\(2n \\in \\mathcal{A}\\) as it is finite. But \\(\\bigcup_{k = 1}^{\\infty} 2n \\notin \\mathcal{A}\\). Exercise 1.5 We define \\(\\sigma(X) = \\cap_{\\lambda \\in I} S_\\lambda\\) to be a sigma algebra, generated by the set \\(X\\), where \\(S_\\lambda\\) are all sigma algebras such that \\(X \\subseteq S_\\lambda\\). \\(S_\\lambda\\) are indexed by \\(\\lambda \\in I\\). Let \\(A, B \\subseteq 2^{\\Omega}\\). Prove that \\(\\sigma(A) = \\sigma(B) \\iff A \\subseteq \\sigma(B) \\land B \\subseteq \\sigma(A)\\). Solution. To prove the equivalence, we need to prove that the left hand side implies the right hand side and vice versa. Proving \\(\\sigma(A) = \\sigma(B) \\Rightarrow A \\subseteq \\sigma(B) \\land B \\subseteq \\sigma(A)\\): we know \\(A \\subseteq \\sigma(A)\\) is always true, so by substituting in \\(\\sigma(B)\\) from the left hand side equality we obtain \\(A \\subseteq \\sigma(B)\\). We obtain \\(B \\subseteq \\sigma(A)\\) by symmetry. This proves the implication. Proving \\(A \\subseteq \\sigma(B) \\land B \\subseteq \\sigma(A) \\Rightarrow \\sigma(A) = \\sigma(B)\\): by definition of a sigma algebra, generated by a set, we have \\(\\sigma(B) = \\cap_{\\lambda \\in I} S_\\lambda\\) where \\(S_\\lambda\\) are all sigma algebras where \\(B \\subseteq S_\\lambda\\). But \\(\\sigma(A)\\) is one of \\(S_\\lambda\\), so we can write \\(\\sigma(B) = \\sigma(A) \\cap \\left(\\cap_{\\lambda \\in I} S_\\lambda \\right)\\), which implies \\(\\sigma(B) \\subseteq \\sigma(A)\\). By symmetry, we have \\(\\sigma(A) \\subseteq \\sigma(B)\\). Since \\(\\sigma(A) \\subseteq \\sigma(B)\\) and \\(\\sigma(B) \\subseteq \\sigma(A)\\), we obtain \\(\\sigma(A) = \\sigma(B)\\), which proves the implication and completes the equivalence proof. Exercise 1.6 (Intro to measure) Take the measurable space \\(\\Omega = \\{1,2\\}\\), \\(F = 2^{\\Omega}\\). Which of the following is a measure? Which is a probability measure? \\(\\mu(\\emptyset) = 0\\), \\(\\mu(\\{1\\}) = 5\\), \\(\\mu(\\{2\\}) = 6\\), \\(\\mu(\\{1,2\\}) = 11\\) \\(\\mu(\\emptyset) = 0\\), \\(\\mu(\\{1\\}) = 0\\), \\(\\mu(\\{2\\}) = 0\\), \\(\\mu(\\{1,2\\}) = 1\\) \\(\\mu(\\emptyset) = 0\\), \\(\\mu(\\{1\\}) = 0\\), \\(\\mu(\\{2\\}) = 0\\), \\(\\mu(\\{1,2\\}) = 0\\) \\(\\mu(\\emptyset) = 0\\), \\(\\mu(\\{1\\}) = 0\\), \\(\\mu(\\{2\\}) = 1\\), \\(\\mu(\\{1,2\\}) = 1\\) \\(\\mu(\\emptyset)=0\\), \\(\\mu(\\{1\\})=0\\), \\(\\mu(\\{2\\})=\\infty\\), \\(\\mu(\\{1,2\\})=\\infty\\) Solution. Measure. Not probability measure since \\(\\mu(\\Omega) > 1\\). Neither due to countable additivity. Measure. Not probability measure since \\(\\mu(\\Omega) = 0\\). Probability measure. Measure. Not probability measure since \\(\\mu(\\Omega) > 1\\). Exercise 1.7 Define a probability space that could be used to model the outcome of throwing two fair 6-sided dice. Solution. \\(\\Omega = \\{\\{i,j\\}, i = 1,...,6, j = 1,...,6\\}\\) \\(F = 2^{\\Omega}\\) \\(\\forall \\omega \\in \\Omega\\), \\(P(\\omega) = \\frac{1}{6} \\times \\frac{1}{6} = \\frac{1}{36}\\) 1.2 Properties of probability measures Exercise 1.8 A standard deck (52 cards) is distributed to two persons: 26 cards to each person. All partitions are equally likely. Find the probability that: The first person gets 4 Queens. The first person gets at least 2 Queens. R: Use simulation (sample) to check the above answers. 
Solution. \\(\\frac{\\binom{48}{22}}{\\binom{52}{26}}\\) 1 - \\(\\frac{\\binom{48}{26} + 4 \\times \\binom{48}{25}}{\\binom{52}{26}}\\) For the simulation, let us represent cards with numbers from 1 to 52, and let 1 through 4 represent Queens. set.seed(1) cards <- 1:52 n <- 10000 q4 <- vector(mode = "logical", length = n) q2 <- vector(mode = "logical", length = n) tmp <- vector(mode = "logical", length = n) for (i in 1:n) { p1 <- sample(1:52, 26) q4[i] <- sum(1:4 %in% p1) == 4 q2[i] <- sum(1:4 %in% p1) >= 2 } sum(q4) / n ## [1] 0.0572 sum(q2) / n ## [1] 0.6894 Exercise 1.9 Let \\(A\\) and \\(B\\) be events with probabilities \\(P(A) = \\frac{2}{3}\\) and \\(P(B) = \\frac{1}{2}\\). Show that \\(\\frac{1}{6} \\leq P(A\\cap B) \\leq \\frac{1}{2}\\), and give examples to show that both extremes are possible. Find corresponding bounds for \\(P(A\\cup B)\\). R: Draw samples from the examples and show the probability bounds of \\(P(A \\cap B)\\) . Solution. From the properties of probability we have \\[\\begin{equation} P(A \\cup B) = P(A) + P(B) - P(A \\cap B) \\leq 1. \\end{equation}\\] From this follows \\[\\begin{align} P(A \\cap B) &\\geq P(A) + P(B) - 1 \\\\ &= \\frac{2}{3} + \\frac{1}{2} - 1 \\\\ &= \\frac{1}{6}, \\end{align}\\] which is the lower bound for the intersection. Conversely, we have \\[\\begin{equation} P(A \\cup B) = P(A) + P(B) - P(A \\cap B) \\geq P(A). \\end{equation}\\] From this follows \\[\\begin{align} P(A \\cap B) &\\leq P(B) \\\\ &= \\frac{1}{2}, \\end{align}\\] which is the upper bound for the intersection. For an example take a fair die. To achieve the lower bound let \\(A = \\{3,4,5,6\\}\\) and \\(B = \\{1,2,3\\}\\), then their intersection is \\(A \\cap B = \\{3\\}\\). To achieve the upper bound take \\(A = \\{1,2,3,4\\}\\) and $B = {1,2,3} $. For the bounds of the union we will use the results from the first part. Again from the properties of probability we have \\[\\begin{align} P(A \\cup B) &= P(A) + P(B) - P(A \\cap B) \\\\ &\\geq P(A) + P(B) - \\frac{1}{2} \\\\ &= \\frac{2}{3}. \\end{align}\\] Conversely \\[\\begin{align} P(A \\cup B) &= P(A) + P(B) - P(A \\cap B) \\\\ &\\leq P(A) + P(B) - \\frac{1}{6} \\\\ &= 1. \\end{align}\\] Therefore \\(\\frac{2}{3} \\leq P(A \\cup B) \\leq 1\\). We use sample in R: set.seed(1) n <- 10000 samps <- sample(1:6, n, replace = TRUE) # lower bound lb <- vector(mode = "logical", length = n) A <- c(1,2,3) B <- c(3,4,5,6) for (i in 1:n) { lb[i] <- samps[i] %in% A & samps[i] %in% B } sum(lb) / n ## [1] 0.1605 # upper bound ub <- vector(mode = "logical", length = n) A <- c(1,2,3) B <- c(1,2,3,4) for (i in 1:n) { ub[i] <- samps[i] %in% A & samps[i] %in% B } sum(ub) / n ## [1] 0.4913 Exercise 1.10 A fair coin is tossed repeatedly. Show that, with probability one, a head turns up sooner or later. Show similarly that any given finite sequence of heads and tails occurs eventually with probability one. Solution. \\[\\begin{align} P(\\text{no heads}) &= \\lim_{n \\rightarrow \\infty} P(\\text{no heads in first }n \\text{ tosses}) \\\\ &= \\lim_{n \\rightarrow \\infty} \\frac{1}{2^n} \\\\ &= 0. \\end{align}\\] For the second part, let us fix the given sequence of heads and tails of length \\(k\\) as \\(s\\). A probability that this happens in \\(k\\) tosses is \\(\\frac{1}{2^k}\\). 
\\[\\begin{align} P(s \\text{ occurs}) &= \\lim_{n \\rightarrow \\infty} P(s \\text{ occurs in first } nk \\text{ tosses}) \\end{align}\\] The right part of the upper equation is greater than if \\(s\\) occurs either in the first \\(k\\) tosses, second \\(k\\) tosses,…, \\(n\\)-th \\(k\\) tosses. Therefore \\[\\begin{align} P(s \\text{ occurs}) &\\geq \\lim_{n \\rightarrow \\infty} P(s \\text{ occurs in first } n \\text{ disjoint sequences of length } k) \\\\ &= \\lim_{n \\rightarrow \\infty} (1 - P(s \\text{ does not occur in first } n \\text{ disjoint sequences})) \\\\ &= 1 - \\lim_{n \\rightarrow \\infty} P(s \\text{ does not occur in first } n \\text{ disjoint sequences}) \\\\ &= 1 - \\lim_{n \\rightarrow \\infty} (1 - \\frac{1}{2^k})^n \\\\ &= 1. \\end{align}\\] Exercise 1.11 An Erdos-Renyi random graph \\(G(n,p)\\) is a model with \\(n\\) nodes, where each pair of nodes is connected with probability \\(p\\). Calculate the probability that there exists a node that is not connected to any other node in \\(G(4,0.6)\\). Show that the upper bound for the probability that there exist 2 nodes that are not connected to any other node for an arbitrary \\(G(n,p)\\) is \\(\\binom{n}{2} (1-p)^{2n - 3}\\). R: Estimate the probability from the first point using simulation. Solution. Let \\(A_i\\) be the event that the \\(i\\)-th node is not connected to any other node. Then our goal is to calculate \\(P(\\cup_{i=1}^n A_i)\\). Using the inclusion-exclusion principle, we get \\[\\begin{align} P(\\cup_{i=1}^n A_i) &= \\sum_i A_i - \\sum_{i<j} P(A_i \\cap A_j) + \\sum_{i<j<k} P(A_i \\cap A_j \\cap A_k) - P(A_1 \\cap A_2 \\cap A_3 \\cap A_4) \\\\ &=4 (1 - p)^3 - \\binom{4}{2} (1 - p)^5 + \\binom{4}{3} (1 - p)^6 - (1 - p)^6 \\\\ &\\approx 0.21. \\end{align}\\] Let \\(A_{ij}\\) be the event that nodes \\(i\\) and \\(j\\) are not connected to any other node. We are interested in \\(P(\\cup_{i<j}A_{ij})\\). By using Boole`s inequality, we get \\[\\begin{align} P(\\cup_{i<j}A_{ij}) \\leq \\sum_{i<j} P(A_{ij}). \\end{align}\\] What is the probability of \\(A_{ij}\\)? There need to be no connections to the \\(i\\)-th node to the remaining nodes (excluding \\(j\\)), the same for the \\(j\\)-th node, and there can be no connection between them. Therefore \\[\\begin{align} P(\\cup_{i<j}A_{ij}) &\\leq \\sum_{i<j} (1 - p)^{2(n-2) + 1} \\\\ &= \\binom{n}{2} (1 - p)^{2n - 3}. \\end{align}\\] set.seed(1) n_samp <- 100000 n <- 4 p <- 0.6 conn_samp <- vector(mode = "logical", length = n_samp) for (i in 1:n_samp) { tmp_mat <- matrix(data = 0, nrow = n, ncol = n) samp_conn <- sample(c(0,1), choose(4,2), replace = TRUE, prob = c(1 - p, p)) tmp_mat[lower.tri(tmp_mat)] <- samp_conn tmp_mat[upper.tri(tmp_mat)] <- t(tmp_mat)[upper.tri(t(tmp_mat))] not_conn <- apply(tmp_mat, 1, sum) if (any(not_conn == 0)) { conn_samp[i] <- TRUE } else { conn_samp[i] <- FALSE } } sum(conn_samp) / n_samp ## [1] 0.20565 1.3 Discrete probability spaces Exercise 1.12 Show that the standard measurable space on \\(\\Omega = \\{0,1,...,n\\}\\) equipped with binomial measure is a discrete probability space. Define another probability measure on this measurable space. Show that for \\(n=1\\) the binomial measure is the same as the Bernoulli measure. R: Draw 1000 samples from the binomial distribution \\(p=0.5\\), \\(n=20\\) (rbinom) and compare relative frequencies with theoretical probability measure. Solution. We need to show that the terms of \\(\\sum_{k=0}^n \\binom{n}{k} p^k (1 - p)^{n - k}\\) sum to 1. 
For that we use the binomial theorem \\(\\sum_{k=0}^n \\binom{n}{k} x^k y^{n-k} = (x + y)^n\\). So \\[\\begin{equation} \\sum_{k=0}^n \\binom{n}{k} p^k (1 - p)^{n - k} = (p + 1 - p)^n = 1. \\end{equation}\\] \\(P(\\{k\\}) = \\frac{1}{n + 1}\\). When \\(n=1\\) then \\(k \\in \\{0,1\\}\\). Inserting \\(n=1\\) into the binomial measure, we get \\(\\binom{1}{k}p^k (1-p)^{1 - k}\\). Now \\(\\binom{1}{1} = \\binom{1}{0} = 1\\), so the measure is \\(p^k (1-p)^{1 - k}\\), which is the Bernoulli measure. set.seed(1) library(ggplot2) library(dplyr) bin_samp <- rbinom(n = 1000, size = 20, prob = 0.5) bin_samp <- data.frame(x = bin_samp) %>% count(x) %>% mutate(n = n / 1000, type = "empirical_frequencies") %>% bind_rows(data.frame(x = 0:20, n = dbinom(0:20, size = 20, prob = 0.5), type = "theoretical_measure")) bin_plot <- ggplot(data = bin_samp, aes(x = x, y = n, fill = type)) + geom_bar(stat="identity", position = "dodge") plot(bin_plot) Exercise 1.13 Show that the standard measurable space on \\(\\Omega = \\{0,1,...,\\infty\\}\\) equipped with geometric measure is a discrete probability space, equipped with Poisson measure is a discrete probability space. Define another probability measure on this measurable space. R: Draw 1000 samples from the Poisson distribution \\(\\lambda = 10\\) (rpois) and compare relative frequencies with theoretical probability measure. Solution. \\(\\sum_{k = 0}^{\\infty} p(1 - p)^k = p \\sum_{k = 0}^{\\infty} (1 - p)^k = p \\frac{1}{1 - 1 + p} = 1\\). We used the formula for geometric series. \\(\\sum_{k = 0}^{\\infty} \\frac{\\lambda^k e^{-\\lambda}}{k!} = e^{-\\lambda} \\sum_{k = 0}^{\\infty} \\frac{\\lambda^k}{k!} = e^{-\\lambda} e^{\\lambda} = 1.\\) We used the Taylor expansion of the exponential function. Since we only have to define a probability measure, we could only assign probabilities that sum to one to a finite number of events in \\(\\Omega\\), and probability zero to the other infinite number of events. However to make this solution more educational, we will try to find a measure that assigns a non-zero probability to all events in \\(\\Omega\\). A good start for this would be to find a converging infinite series, as the probabilities will have to sum to one. One simple converging series is the geometric series \\(\\sum_{k=0}^{\\infty} p^k\\) for \\(|p| < 1\\). Let us choose an arbitrary \\(p = 0.5\\). Then \\(\\sum_{k=0}^{\\infty} p^k = \\frac{1}{1 - 0.5} = 2\\). To complete the measure, we have to normalize it, so it sums to one, therefore \\(P(\\{k\\}) = \\frac{0.5^k}{2}\\) is a probability measure on \\(\\Omega\\). We could make it even more difficult by making this measure dependent on some parameter \\(\\alpha\\), but this is out of the scope of this introductory chapter. set.seed(1) pois_samp <- rpois(n = 1000, lambda = 10) pois_samp <- data.frame(x = pois_samp) %>% count(x) %>% mutate(n = n / 1000, type = "empirical_frequencies") %>% bind_rows(data.frame(x = 0:25, n = dpois(0:25, lambda = 10), type = "theoretical_measure")) pois_plot <- ggplot(data = pois_samp, aes(x = x, y = n, fill = type)) + geom_bar(stat="identity", position = "dodge") plot(pois_plot) Exercise 1.14 Define a probability measure on \\((\\Omega = \\mathbb{Z}, 2^{\\mathbb{Z}})\\). Define a probability measure such that \\(P(\\omega) > 0, \\forall \\omega \\in \\Omega\\). R: Implement a random generator that will generate samples with the relative frequency that corresponds to your probability measure. Compare relative frequencies with theoretical probability measure . Solution. 
\\(P(0) = 1, P(\\omega) = 0, \\forall \\omega \\neq 0\\). \\(P(\\{k\\}) = \\sum_{k = -\\infty}^{\\infty} \\frac{p(1 - p)^{|k|}}{2^{1 - 1_0(k)}}\\), where \\(1_0(k)\\) is the indicator function, which equals to one if \\(k\\) is 0, and equals to zero in every other case. n <- 1000 geom_samps <- rgeom(n, prob = 0.5) sign_samps <- sample(c(FALSE, TRUE), size = n, replace = TRUE) geom_samps[sign_samps] <- -geom_samps[sign_samps] my_pmf <- function (k, p) { indic <- rep(1, length(k)) indic[k == 0] <- 0 return ((p * (1 - p)^(abs(k))) / 2^indic) } geom_samps <- data.frame(x = geom_samps) %>% count(x) %>% mutate(n = n / 1000, type = "empirical_frequencies") %>% bind_rows(data.frame(x = -10:10, n = my_pmf(-10:10, 0.5), type = "theoretical_measure")) geom_plot <- ggplot(data = geom_samps, aes(x = x, y = n, fill = type)) + geom_bar(stat="identity", position = "dodge") plot(geom_plot) Exercise 1.15 Define a probability measure on \\(\\Omega = \\{1,2,3,4,5,6\\}\\) with parameter \\(m \\in \\{1,2,3,4,5,6\\}\\), so that the probability of outcome at distance \\(1\\) from \\(m\\) is half of the probability at distance \\(0\\), at distance \\(2\\) is half of the probability at distance \\(1\\), etc. R: Implement a random generator that will generate samples with the relative frequency that corresponds to your probability measure. Compare relative frequencies with theoretical probability measure . Solution. \\(P(\\{k\\}) = \\frac{\\frac{1}{2}^{|m - k|}}{\\sum_{i=1}^6 \\frac{1}{2}^{|m - i|}}\\) n <- 10000 m <- 4 my_pmf <- function (k, m) { denom <- sum(0.5^abs(m - 1:6)) return (0.5^abs(m - k) / denom) } samps <- c() for (i in 1:n) { a <- sample(1:6, 1) a_val <- my_pmf(a, m) prob <- runif(1) if (prob < a_val) { samps <- c(samps, a) } } samps <- data.frame(x = samps) %>% count(x) %>% mutate(n = n / length(samps), type = "empirical_frequencies") %>% bind_rows(data.frame(x = 1:6, n = my_pmf(1:6, m), type = "theoretical_measure")) my_plot <- ggplot(data = samps, aes(x = x, y = n, fill = type)) + geom_bar(stat="identity", position = "dodge") plot(my_plot) "],["uprobspaces.html", "Chapter 2 Uncountable probability spaces 2.1 Borel sets 2.2 Lebesgue measure", " Chapter 2 Uncountable probability spaces This chapter deals with uncountable probability spaces. The students are expected to acquire the following knowledge: Theoretical Understand Borel sets and identify them. Estimate Lebesgue measure for different sets. Know when sets are Borel-measurable. Understanding of countable and uncountable sets. R Uniform sampling. .fold-btn { float: right; margin: 5px 5px 0 0; } .fold { border: 1px solid black; min-height: 40px; } 2.1 Borel sets Exercise 2.1 Prove that the intersection of two sigma algebras on \\(\\Omega\\) is a sigma algebra. Prove that the collection of all open subsets \\((a,b)\\) on \\((0,1]\\) is not a sigma algebra of \\((0,1]\\). Solution. Empty set: \\[\\begin{equation} \\emptyset \\in \\mathcal{A} \\wedge \\emptyset \\in \\mathcal{B} \\Rightarrow \\emptyset \\in \\mathcal{A} \\cap \\mathcal{B} \\end{equation}\\] Complement: \\[\\begin{equation} \\text{Let } A \\in \\mathcal{A} \\cap \\mathcal{B} \\Rightarrow A \\in \\mathcal{A} \\wedge A \\in \\mathcal{B} \\Rightarrow A^c \\in \\mathcal{A} \\wedge A^c \\in \\mathcal{B} \\Rightarrow A^c \\in \\mathcal{A} \\cap \\mathcal{B} \\end{equation}\\] Countable additivity: Let \\(\\{A_i\\}\\) be a countable sequence of subsets in \\(\\mathcal{A} \\cap \\mathcal{B}\\). 
\\[\\begin{equation} \\forall i: A_i \\in \\mathcal{A} \\cap \\mathcal{B} \\Rightarrow A_i \\in \\mathcal{A} \\wedge A_i \\in \\mathcal{B} \\Rightarrow \\cup A_i \\in \\mathcal{A} \\wedge \\cup A_i \\in \\mathcal{B} \\Rightarrow \\cup A_i \\in \\mathcal{A} \\cap \\mathcal{B} \\end{equation}\\] Let \\(A\\) denote the collection of all open subsets \\((a,b)\\) on \\((0,1]\\). Then \\((0,1) \\in A\\). But \\((0,1)^c = 1 \\notin A\\). Exercise 2.2 Show that \\(\\mathcal{C} = \\sigma(\\mathcal{C})\\) if and only if \\(\\mathcal{C}\\) is a sigma algebra. Solution. “\\(\\Rightarrow\\)” This follows from the definition of a generated sigma algebra. “\\(\\Leftarrow\\)” Let \\(\\mathcal{F} = \\cap_i F_i\\) be the intersection of all sigma algebras that contain \\(\\mathcal{C}\\). Then \\(\\sigma(\\mathcal{C}) = \\mathcal{F}\\). Additionally, \\(\\forall i: \\mathcal{C} \\in F_i\\). So each \\(F_i\\) can be written as \\(F_i = \\mathcal{C} \\cup D\\), where \\(D\\) are the rest of the elements in the sigma algebra. In other words, each sigma algebra in the collection contains at least \\(\\mathcal{C}\\), but can contain other elements. Now for some \\(j\\), \\(F_j = \\mathcal{C}\\) as \\(\\{F_i\\}\\) contains all sigma algebras that contain \\(\\mathcal{C}\\) and \\(\\mathcal{C}\\) is such a sigma algebra. Since this is the smallest subset in the intersection it follows that \\(\\sigma(\\mathcal{C}) = \\mathcal{F} = \\mathcal{C}\\). Exercise 2.3 Let \\(\\mathcal{C}\\) and \\(\\mathcal{D}\\) be two collections of subsets on \\(\\Omega\\) such that \\(\\mathcal{C} \\subset \\mathcal{D}\\). Prove that \\(\\sigma(\\mathcal{C}) \\subseteq \\sigma(\\mathcal{D})\\). Solution. \\(\\sigma(\\mathcal{D})\\) is a sigma algebra that contains \\(\\mathcal{D}\\). It follows that \\(\\sigma(\\mathcal{D})\\) is a sigma algebra that contains \\(\\mathcal{C}\\). Let us write \\(\\sigma(\\mathcal{C}) = \\cap_i F_i\\), where \\(\\{F_i\\}\\) is the collection of all sigma algebras that contain \\(\\mathcal{C}\\). Since \\(\\sigma(\\mathcal{D})\\) is such a sigma algebra, there exists an index \\(j\\), so that \\(F_j = \\sigma(\\mathcal{D})\\). Then we can write \\[\\begin{align} \\sigma(\\mathcal{C}) &= (\\cap_{i \\neq j} F_i) \\cap \\sigma(\\mathcal{D}) \\\\ &\\subseteq \\sigma(\\mathcal{D}). \\end{align}\\] Exercise 2.4 Prove that the following subsets of \\((0,1]\\) are Borel-measurable by finding their measure. Any countable set. The set of numbers in (0,1] whose decimal expansion does not contain 7. Solution. This follows directly from the fact that every countable set is a union of singletons, whose measure is 0. Let us first look at numbers which have a 7 as the first decimal numbers. Their measure is 0.1. Then we take all the numbers with a 7 as the second decimal number (excluding those who already have it as the first). These have the measure 0.01, and there are 9 of them, so their total measure is 0.09. We can continue to do so infinitely many times. At each \\(n\\), we have the measure of the intervals which is \\(10^n\\) and the number of those intervals is \\(9^{n-1}\\). Now \\[\\begin{align} \\lambda(A) &= 1 - \\sum_{n = 0}^{\\infty} \\frac{9^n}{10^{n+1}} \\\\ &= 1 - \\frac{1}{10} \\sum_{n = 0}^{\\infty} (\\frac{9}{10})^n \\\\ &= 1 - \\frac{1}{10} \\frac{10}{1} \\\\ &= 0. \\end{align}\\] Since we have shown that the measure of the set is \\(0\\), we have also shown that the set is measurable. 
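The measure-zero result in the second part can also be explored numerically. The following sketch is not part of the original solution and is limited by double precision to roughly the first 15 decimal digits: it estimates the probability that a uniformly drawn number contains no 7 among its first \\(k\\) decimal digits and compares it with the theoretical value \\((\\frac{9}{10})^k\\), which tends to \\(0\\) as \\(k\\) grows, consistent with the set having Lebesgue measure \\(0\\).
set.seed(1)
n <- 10000
x <- runif(n)
# matrix of the first 15 decimal digits (rows = digit positions, columns = samples)
digits <- sapply(x, function (y) floor(y * 10^(1:15)) %% 10)
# for each sample: is there no 7 among the first 1, 2, ..., 15 digits?
no7 <- apply(digits, 2, function (d) cumsum(d == 7) == 0)
emp <- rowMeans(no7)
round(rbind(empirical = emp[c(1, 5, 10, 15)], theoretical = 0.9^c(1, 5, 10, 15)), 3)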
Exercise 2.5 Let \\(\\Omega = [0,1]\\), and let \\(\\mathcal{F}_3\\) consist of all countable subsets of \\(\\Omega\\), and all subsets of \\(\\Omega\\) having a countable complement. Show that \\(\\mathcal{F}_3\\) is a sigma algebra. Let us define \\(P(A)=0\\) if \\(A\\) is countable, and \\(P(A) = 1\\) if \\(A\\) has a countable complement. Is \\((\\Omega, \\mathcal{F}_3, P)\\) a legitimate probability space? Solution. The empty set is countable, therefore it is in \\(\\mathcal{F}_3\\). For any \\(A \\in \\mathcal{F}_3\\). If \\(A\\) is countable, then \\(A^c\\) has a countable complement and is in \\(\\mathcal{F}_3\\). If \\(A\\) is uncountable, then it has a countable complement \\(A^c\\) which is therefore also in \\(\\mathcal{F}_3\\). We are left with showing countable additivity. Let \\(\\{A_i\\}\\) be an arbitrary collection of sets in \\(\\mathcal{F}_3\\). We will look at two possibilities. First let all \\(A_i\\) be countable. A countable union of countable sets is countable, and therefore in \\(\\mathcal{F}_3\\). Second, let at least one \\(A_i\\) be uncountable. It follows that it has a countable complement. We can write \\[\\begin{equation} (\\cup_{i=1}^{\\infty} A_i)^c = \\cap_{i=1}^{\\infty} A_i^c. \\end{equation}\\] Since at least one \\(A_i^c\\) on the right side is countable, the whole intersection is countable, and therefore the union has a countable complement. It follows that the union is in \\(\\mathcal{F}_3\\). The tuple \\((\\Omega, \\mathcal{F}_3)\\) is a measurable space. Therefore, we only need to check whether \\(P\\) is a probability measure. The measure of the empty set is zero as it is countable. We have to check for countable additivity. Let us look at three situations. Let \\(A_i\\) be disjoint sets. First, let all \\(A_i\\) be countable. \\[\\begin{equation} P(\\cup_{i=1}^{\\infty} A_i) = \\sum_{i=1}^{\\infty}P( A_i)) = 0. \\end{equation}\\] Since the union is countable, the above equation holds. Second, let exactly one \\(A_i\\) be uncountable. W.L.O.G. let that be \\(A_1\\). Then \\[\\begin{equation} P(\\cup_{i=1}^{\\infty} A_i) = 1 + \\sum_{i=2}^{\\infty}P( A_i)) = 1. \\end{equation}\\] Since the union is uncountable, the above equation holds. Third, let at least two \\(A_i\\) be uncountable. We have to check whether it is possible for two uncountable sets in \\(\\mathcal{F}_3\\) to be disjoint. If that is possible, then their measures would sum to more than one and \\(P\\) would not be a probability measure. W.L.O.G. let \\(A_1\\) and \\(A_2\\) be uncountable. Then we have \\[\\begin{equation} A_1 \\cap A_2 = (A_1^c \\cup A_2^c)^c. \\end{equation}\\] Now \\(A_1^c\\) and \\(A_2^c\\) are countable and their union is therefore countable. Let \\(B = A_1^c \\cup A_2^c\\). So the intersection of \\(A_1\\) and \\(A_2\\) equals the complement of \\(B\\), which is countable. For the intersection to be the empty set, \\(B\\) would have to equal to \\(\\Omega\\). But \\(\\Omega\\) is uncountable and therefore \\(B\\) can not equal to \\(\\Omega\\). It follows that two uncountable sets in \\(\\mathcal{F}_3\\) can not have an empty intersection. Therefore the tuple is a legitimate probability space. 2.2 Lebesgue measure Exercise 2.6 Show that the Lebesgue measure of rational numbers on \\([0,1]\\) is 0. R: Implement a random number generator, which generates uniform samples of irrational numbers in \\([0,1]\\) by uniformly sampling from \\([0,1]\\) and rejecting a sample if it is rational. Solution. There are a countable number of rational numbers. 
Therefore, we can write \\[\\begin{align} \\lambda(\\mathbb{Q}) &= \\lambda(\\cup_{i = 1}^{\\infty} q_i) &\\\\ &= \\sum_{i = 1}^{\\infty} \\lambda(q_i) &\\text{ (countable additivity)} \\\\ &= \\sum_{i = 1}^{\\infty} 0 &\\text{ (Lebesgue measure of a singleton)} \\\\ &= 0. \\end{align}\\] Exercise 2.7 Prove that the Lebesgue measure of \\(\\mathbb{R}\\) is infinity. Paradox. Show that the cardinality of \\(\\mathbb{R}\\) and \\((0,1)\\) is the same, while their Lebesgue measures are infinity and one respectively. Solution. Let \\(a_i\\) be the \\(i\\)-th integer for \\(i \\in \\mathbb{Z}\\). We can write \\(\\mathbb{R} = \\cup_{-\\infty}^{\\infty} (a_i, a_{i + 1}]\\). \\[\\begin{align} \\lambda(\\mathbb{R}) &= \\lambda(\\cup_{i = -\\infty}^{\\infty} (a_i, a_{i + 1}]) \\\\ &= \\lambda(\\lim_{n \\rightarrow \\infty} \\cup_{i = -n}^{n} (a_i, a_{i + 1}]) \\\\ &= \\lim_{n \\rightarrow \\infty} \\lambda(\\cup_{i = -n}^{n} (a_i, a_{i + 1}]) \\\\ &= \\lim_{n \\rightarrow \\infty} \\sum_{i = -n}^{n} \\lambda((a_i, a_{i + 1}]) \\\\ &= \\lim_{n \\rightarrow \\infty} \\sum_{i = -n}^{n} 1 \\\\ &= \\lim_{n \\rightarrow \\infty} 2n \\\\ &= \\infty. \\end{align}\\] We need to find a bijection between \\(\\mathbb{R}\\) and \\((0,1)\\). A well-known function that maps from a bounded interval to \\(\\mathbb{R}\\) is the tangent. To make the bijection easier to achieve, we will take the inverse, which maps from \\(\\mathbb{R}\\) to \\((-\\frac{\\pi}{2}, \\frac{\\pi}{2})\\). However, we need to change the function so it maps to \\((0,1)\\). First we add \\(\\frac{\\pi}{2}\\), so that we move the function above zero. Then we only have to divide by the max value, which in this case is \\(\\pi\\). So our bijection is \\[\\begin{equation} f(x) = \\frac{\\tan^{-1}(x) + \\frac{\\pi}{2}}{\\pi}. \\end{equation}\\] Exercise 2.8 Take the measure space \\((\\Omega_1 = (0,1], B_{(0,1]}, \\lambda)\\) (we know that this is a probability space on \\((0,1]\\)). Define a map (function) from \\(\\Omega_1\\) to \\(\\Omega_2 = \\{1,2,3,4,5,6\\}\\) such that the measure space \\((\\Omega_2, 2^{\\Omega_2}, \\lambda(f^{-1}()))\\) will be a discrete probability space with uniform probabilities (\\(P(\\omega) = \\frac{1}{6}, \\forall \\omega \\in \\Omega_2)\\). Is the map that you defined in (a) the only such map? How would you in the same fashion define a map that would result in a probability space that can be interpreted as a coin toss with probability \\(p\\) of heads? R: Use the map in (a) as a basis for a random generator for this fair die. Solution. In other words, we have to assign disjunct intervals of the same size to each element of \\(\\Omega_2\\). Therefore \\[\\begin{equation} f(x) = \\lceil 6x \\rceil. \\end{equation}\\] No, we could for example rearrange the order in which the intervals are mapped to integers. Additionally, we could have several disjoint intervals that mapped to the same integer, as long as the Lebesgue measure of their union would be \\(\\frac{1}{6}\\) and the function would remain injective. We have \\(\\Omega_3 = \\{0,1\\}\\), where zero represents heads and one represents tails. Then \\[\\begin{equation} f(x) = 0^{I_{A}(x)}, \\end{equation}\\] where \\(A = \\{y \\in (0,1] : y < p\\}\\). 
set.seed(1) unif_s <- runif(1000) die_s <- ceiling(6 * unif_s) summary(as.factor(die_s)) ## 1 2 3 4 5 6 ## 166 154 200 146 166 168 "],["condprob.html", "Chapter 3 Conditional probability 3.1 Calculating conditional probabilities 3.2 Conditional independence 3.3 Monty Hall problem", " Chapter 3 Conditional probability This chapter deals with conditional probability. The students are expected to acquire the following knowledge: Theoretical Identify whether variables are independent. Calculation of conditional probabilities. Understanding of conditional dependence and independence. How to apply Bayes’ theorem to solve difficult probabilistic questions. R Simulating conditional probabilities. cumsum. apply. .fold-btn { float: right; margin: 5px 5px 0 0; } .fold { border: 1px solid black; min-height: 40px; } 3.1 Calculating conditional probabilities Exercise 3.1 A military officer is in charge of identifying enemy aircraft and shooting them down. He is able to positively identify an enemy airplane 95% of the time and positively identify a friendly airplane 90% of the time. Furthermore, 99% of the airplanes are friendly. When the officer identifies an airplane as an enemy airplane, what is the probability that it is not and they will shoot at a friendly airplane? Solution. Let \\(E = 0\\) denote that the observed plane is friendly and \\(E=1\\) that it is an enemy. Let \\(I = 0\\) denote that the officer identified it as friendly and \\(I = 1\\) as enemy. Then \\[\\begin{align} P(E = 0 | I = 1) &= \\frac{P(I = 1 | E = 0)P(E = 0)}{P(I = 1)} \\\\ &= \\frac{P(I = 1 | E = 0)P(E = 0)}{P(I = 1 | E = 0)P(E = 0) + P(I = 1 | E = 1)P(E = 1)} \\\\ &= \\frac{0.1 \\times 0.99}{0.1 \\times 0.99 + 0.95 \\times 0.01} \\\\ &= 0.91. \\end{align}\\] Exercise 3.2 R: Consider tossing a fair die. Let \\(A = \\{2,4,6\\}\\) and \\(B = \\{1,2,3,4\\}\\). Then \\(P(A) = \\frac{1}{2}\\), \\(P(B) = \\frac{2}{3}\\) and \\(P(AB) = \\frac{1}{3}\\). Since \\(P(AB) = P(A)P(B)\\), the events \\(A\\) and \\(B\\) are independent. Simulate draws from the sample space and verify that the proportions are the same. Then find two events \\(C\\) and \\(D\\) that are not independent and repeat the simulation. set.seed(1) nsamps <- 10000 tosses <- sample(1:6, nsamps, replace = TRUE) PA <- sum(tosses %in% c(2,4,6)) / nsamps PB <- sum(tosses %in% c(1,2,3,4)) / nsamps PA * PB ## [1] 0.3295095 sum(tosses %in% c(2,4)) / nsamps ## [1] 0.3323 # Let C = {1,2} and D = {2,3} PC <- sum(tosses %in% c(1,2)) / nsamps PD <- sum(tosses %in% c(2,3)) / nsamps PC * PD ## [1] 0.1067492 sum(tosses %in% c(2)) / nsamps ## [1] 0.1622 Exercise 3.3 A machine reports the true value of a thrown 12-sided die 5 out of 6 times. If the machine reports a 1 has been tossed, what is the probability that it is actually a 1? Now let the machine only report whether a 1 has been tossed or not. Does the probability change? R: Use simulation to check your answers to a) and b). Solution. Let \\(T = 1\\) denote that the toss is 1 and \\(M = 1\\) that the machine reports a 1. \\[\\begin{align} P(T = 1 | M = 1) &= \\frac{P(M = 1 | T = 1)P(T = 1)}{P(M = 1)} \\\\ &= \\frac{P(M = 1 | T = 1)P(T = 1)}{\\sum_{k=1}^{12} P(M = 1 | T = k)P(T = k)} \\\\ &= \\frac{\\frac{5}{6}\\frac{1}{12}}{\\frac{5}{6}\\frac{1}{12} + 11 \\frac{1}{6} \\frac{1}{11} \\frac{1}{12}} \\\\ &= \\frac{5}{6}. \\end{align}\\] Yes. 
\\[\\begin{align} P(T = 1 | M = 1) &= \\frac{P(M = 1 | T = 1)P(T = 1)}{P(M = 1)} \\\\ &= \\frac{P(M = 1 | T = 1)P(T = 1)}{\\sum_{k=1}^{12} P(M = 1 | T = k)P(T = k)} \\\\ &= \\frac{\\frac{5}{6}\\frac{1}{12}}{\\frac{5}{6}\\frac{1}{12} + 11 \\frac{1}{6} \\frac{1}{12}} \\\\ &= \\frac{5}{16}. \\end{align}\\] set.seed(1) nsamps <- 10000 report_a <- vector(mode = "numeric", length = nsamps) report_b <- vector(mode = "logical", length = nsamps) truths <- vector(mode = "logical", length = nsamps) for (i in 1:10000) { toss <- sample(1:12, size = 1) truth <- sample(c(TRUE, FALSE), size = 1, prob = c(5/6, 1/6)) truths[i] <- truth if (truth) { report_a[i] <- toss report_b[i] <- toss == 1 } else { remaining <- (1:12)[1:12 != toss] report_a[i] <- sample(remaining, size = 1) report_b[i] <- toss != 1 } } truth_a1 <- truths[report_a == 1] sum(truth_a1) / length(truth_a1) ## [1] 0.8300733 truth_b1 <- truths[report_b] sum(truth_b1) / length(truth_b1) ## [1] 0.3046209 Exercise 3.4 A coin is tossed independently \\(n\\) times. The probability of heads at each toss is \\(p\\). At each time \\(k\\), \\((k = 2,3,...,n)\\) we get a reward at time \\(k+1\\) if \\(k\\)-th toss was a head and the previous toss was a tail. Let \\(A_k\\) be the event that a reward is obtained at time \\(k\\). Are events \\(A_k\\) and \\(A_{k+1}\\) independent? Are events \\(A_k\\) and \\(A_{k+2}\\) independent? R: simulate 10 tosses 10000 times, where \\(p = 0.7\\). Check your answers to a) and b) by counting the frequencies of the events \\(A_5\\), \\(A_6\\), and \\(A_7\\). Solution. For \\(A_k\\) to happen, we need the tosses \\(k-2\\) and \\(k-1\\) be tails and heads respectively. For \\(A_{k+1}\\) to happen, we need tosses \\(k-1\\) and \\(k\\) be tails and heads respectively. As the toss \\(k-1\\) need to be heads for one and tails for the other, these two events can not happen simultaneously. Therefore the probability of their intersection is 0. But the probability of each of them separately is \\(p(1-p) > 0\\). Therefore, they are not independent. For \\(A_k\\) to happen, we need the tosses \\(k-2\\) and \\(k-1\\) be tails and heads respectively. For \\(A_{k+2}\\) to happen, we need tosses \\(k\\) and \\(k+1\\) be tails and heads respectively. So the probability of intersection is \\(p^2(1-p)^2\\). And the probability of each separately is again \\(p(1-p)\\). Therefore, they are independent. set.seed(1) nsamps <- 10000 p <- 0.7 rewardA_5 <- vector(mode = "logical", length = nsamps) rewardA_6 <- vector(mode = "logical", length = nsamps) rewardA_7 <- vector(mode = "logical", length = nsamps) rewardA_56 <- vector(mode = "logical", length = nsamps) rewardA_57 <- vector(mode = "logical", length = nsamps) for (i in 1:nsamps) { samps <- sample(c(0,1), size = 10, replace = TRUE, prob = c(0.7, 0.3)) rewardA_5[i] <- (samps[4] == 0 & samps[3] == 1) rewardA_6[i] <- (samps[5] == 0 & samps[4] == 1) rewardA_7[i] <- (samps[6] == 0 & samps[5] == 1) rewardA_56[i] <- (rewardA_5[i] & rewardA_6[i]) rewardA_57[i] <- (rewardA_5[i] & rewardA_7[i]) } sum(rewardA_5) / nsamps ## [1] 0.2141 sum(rewardA_6) / nsamps ## [1] 0.2122 sum(rewardA_7) / nsamps ## [1] 0.2107 sum(rewardA_56) / nsamps ## [1] 0 sum(rewardA_57) / nsamps ## [1] 0.0454 Exercise 3.5 A drawer contains two coins. One is an unbiased coin, the other is a biased coin, which will turn up heads with probability \\(p\\) and tails with probability \\(1-p\\). One coin is selected uniformly at random. The selected coin is tossed \\(n\\) times. 
The coin turns up heads \\(k\\) times and tails \\(n-k\\) times. What is the probability that the coin is biased? The selected coin is tossed repeatedly until it turns up heads \\(k\\) times. Given that it is tossed \\(n\\) times in total, what is the probability that the coin is biased? Solution. Let \\(B = 1\\) denote that the coin is biased and let \\(H = k\\) denote that we’ve seen \\(k\\) heads. \\[\\begin{align} P(B = 1 | H = k) &= \\frac{P(H = k | B = 1)P(B = 1)}{P(H = k)} \\\\ &= \\frac{P(H = k | B = 1)P(B = 1)}{P(H = k | B = 1)P(B = 1) + P(H = k | B = 0)P(B = 0)} \\\\ &= \\frac{p^k(1-p)^{n-k} 0.5}{p^k(1-p)^{n-k} 0.5 + 0.5^{n+1}} \\\\ &= \\frac{p^k(1-p)^{n-k}}{p^k(1-p)^{n-k} + 0.5^n}. \\end{align}\\] The same results as in a). The only difference between these two scenarios is that in b) the last throw must be heads. However, this holds for the biased and the unbiased coin and therefore does not affect the probability of the coin being biased. Exercise 3.6 Judy goes around the company for Women’s day and shares flowers. In every office she leaves a flower, if there is at least one woman inside. The probability that there’s a woman in the office is \\(\\frac{3}{5}\\). What is the probability that Judy leaves her first flower in the fourth office? Given that she has given away exactly three flowers in the first four offices, what is the probability that she gives her fourth flower in the eighth office? What is the probability that she leaves the second flower in the fifth office? What is the probability that she leaves the second flower in the fifth office, given that she did not leave the second flower in the second office? Judy needs a new supply of flowers immediately after the office, where she gives away her last flower. What is the probability that she visits at least five offices, if she starts with two flowers? R: simulate Judy’s walk 10000 times to check your answers a) - e). Solution. Let \\(X_i = k\\) denote the event that … \\(i\\)-th sample on the \\(k\\)-th run. Since the events are independent, we can multiply their probabilities to get \\[\\begin{equation} P(X_1 = 4) = 0.4^3 \\times 0.6 = 0.0384. \\end{equation}\\] Same as in a) as we have a fresh start after first four offices. For this to be possible, she had to leave the first flower in one of the first four offices. Therefore there are four possibilities, and for each of those the probability is \\(0.4^3 \\times 0.6\\). Additionally, the probability that she leaves a flower in the fifth office is \\(0.6\\). So \\[\\begin{equation} P(X_2 = 5) = \\binom{4}{1} \\times 0.4^3 \\times 0.6^2 = 0.09216. \\end{equation}\\] We use Bayes’ theorem. \\[\\begin{align} P(X_2 = 5 | X_2 \\neq 2) &= \\frac{P(X_2 \\neq 2 | X_2 = 5)P(X_2 = 5)}{P(X_2 \\neq 2)} \\\\ &= \\frac{0.09216}{0.64} \\\\ &= 0.144. \\end{align}\\] The denominator in the second equation can be calculated as follows. One of three things has to happen for the second not to be dealt in the second round. First, both are zero, so \\(0.4^2\\). Second, first is zero, and second is one, so \\(0.4 \\times 0.6\\). Third, the first is one and the second one zero, so \\(0.6 \\times 0.4\\). Summing these values we get \\(0.64\\). We will look at the complement, so the events that she gave away exactly two flowers after two, three and four offices. \\[\\begin{equation} P(X_2 \\geq 5) = 1 - 0.6^2 - 2 \\times 0.4 \\times 0.6^2 - 3 \\times 0.4^2 \\times 0.6^2 = 0.1792. \\end{equation}\\] The multiplying parts represent the possibilities of the first flower. 
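Parts a) and c) can also be checked directly with the built-in geometric and negative binomial PMFs, since both count the number of offices without a flower (failures) before the required number of flowers (successes). This quick check complements the full simulation below:
dgeom(3, prob = 0.6)              # a) three offices without a flower before the first flower
## [1] 0.0384
dnbinom(3, size = 2, prob = 0.6)  # c) three offices without a flower before the second flower
## [1] 0.09216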
set.seed(1) nsamps <- 100000 Judyswalks <- matrix(data = NA, nrow = nsamps, ncol = 8) for (i in 1:nsamps) { thiswalk <- sample(c(0,1), size = 8, replace = TRUE, prob = c(0.4, 0.6)) Judyswalks[i, ] <- thiswalk } csJudy <- t(apply(Judyswalks, 1, cumsum)) # a sum(csJudy[ ,4] == 1 & csJudy[ ,3] == 0) / nsamps ## [1] 0.03848 # b csJsubset <- csJudy[csJudy[ ,4] == 3 & csJudy[ ,3] == 2, ] sum(csJsubset[ ,8] == 4 & csJsubset[ ,7] == 3) / nrow(csJsubset) ## [1] 0.03665893 # c sum(csJudy[ ,5] == 2 & csJudy[ ,4] == 1) / nsamps ## [1] 0.09117 # d sum(csJudy[ ,5] == 2 & csJudy[ ,4] == 1) / sum(csJudy[ ,2] != 2) ## [1] 0.1422398 # e sum(csJudy[ ,4] < 2) / nsamps ## [1] 0.17818 3.2 Conditional independence Exercise 3.7 Describe: A real-world example of two events \\(A\\) and \\(B\\) that are dependent but become conditionally independent if conditioned on a third event \\(C\\). A real-world example of two events \\(A\\) and \\(B\\) that are independent, but become dependent if conditioned on some third event \\(C\\). Solution. Let \\(A\\) be the height of a person and let \\(B\\) be the person’s knowledge of the Dutch language. These events are dependent since the Dutch are known to be taller than average. However if \\(C\\) is the nationality of the person, then \\(A\\) and \\(B\\) are independent given \\(C\\). Let \\(A\\) be the event that Mary passes the exam and let \\(B\\) be the event that John passes the exam. These events are independent. However, if the event \\(C\\) is that Mary and John studied together, then \\(A\\) and \\(B\\) are conditionally dependent given \\(C\\). Exercise 3.8 We have two coins of identical appearance. We know that one is a fair coin and the other flips heads 80% of the time. We choose one of the two coins uniformly at random. We discard the coin that was not chosen. We now flip the chosen coin independently 10 times, producing a sequence \\(Y_1 = y_1\\), \\(Y_2 = y_2\\), …, \\(Y_{10} = y_{10}\\). Intuitively, without doing and computation, are these random variables independent? Compute the probability \\(P(Y_1 = 1)\\). Compute the probabilities \\(P(Y_2 = 1 | Y_1 = 1)\\) and \\(P(Y_{10} = 1 | Y_1 = 1,...,Y_9 = 1)\\). Given your answers to b) and c), would you now change your answer to a)? If so, discuss why your intuition had failed. Solution. \\(P(Y_1 = 1) = 0.5 * 0.8 + 0.5 * 0.5 = 0.65\\). Since we know that \\(Y_1 = 1\\) this should change our view of the probability of the coin being biased or not. Let \\(B = 1\\) denote the event that the coin is biased and let \\(B = 0\\) denote that the coin is unbiased. By using marginal probability, we can write \\[\\begin{align} P(Y_2 = 1 | Y_1 = 1) &= P(Y_2 = 1, B = 1 | Y_1 = 1) + P(Y_2 = 1, B = 0 | Y_1 = 1) \\\\ &= \\sum_{k=1}^2 P(Y_2 = 1 | B = k, Y_1 = 1)P(B = k | Y_1 = 1) \\\\ &= 0.8 \\frac{P(Y_1 = 1 | B = 1)P(B = 1)}{P(Y_1 = 1)} + 0.5 \\frac{P(Y_1 = 1 | B = 0)P(B = 0)}{P(Y_1 = 1)} \\\\ &= 0.8 \\frac{0.8 \\times 0.5}{0.65} + 0.5 \\frac{0.5 \\times 0.5}{0.65} \\\\ &\\approx 0.68. \\end{align}\\] For the other calculation we follow the same procedure. Let \\(X = 1\\) denote that first nine tosses are all heads (equivalent to \\(Y_1 = 1\\),…, \\(Y_9 = 1\\)). 
\\[\\begin{align} P(Y_{10} = 1 | X = 1) &= P(Y_2 = 1, B = 1 | X = 1) + P(Y_2 = 1, B = 0 | X = 1) \\\\ &= \\sum_{k=1}^2 P(Y_2 = 1 | B = k, X = 1)P(B = k | X = 1) \\\\ &= 0.8 \\frac{P(X = 1 | B = 1)P(B = 1)}{P(X = 1)} + 0.5 \\frac{P(X = 1 | B = 0)P(B = 0)}{P(X = 1)} \\\\ &= 0.8 \\frac{0.8^9 \\times 0.5}{0.5 \\times 0.8^9 + 0.5 \\times 0.5^9} + 0.5 \\frac{0.5^9 \\times 0.5}{0.5 \\times 0.8^9 + 0.5 \\times 0.5^9} \\\\ &\\approx 0.8. \\end{align}\\] 3.3 Monty Hall problem The Monty Hall problem is a famous probability puzzle with non-intuitive outcome. Many established mathematicians and statisticians had problems solving it and many even disregarded the correct solution until they’ve seen the proof by simulation. Here we will show how it can be solved relatively simply with the use of Bayes’ theorem if we select the variables in a smart way. Exercise 3.9 (Monty Hall problem) A prize is placed at random behind one of three doors. You pick a door. Now Monty Hall chooses one of the other two doors, opens it and shows you that it is empty. He then gives you the opportunity to keep your door or switch to the other unopened door. Should you stay or switch? Use Bayes’ theorem to calculate the probability of winning if you switch and if you do not. R: Check your answers in R. Solution. W.L.O.G. assume we always pick the first door. The host can only open door 2 or door 3, as he can not open the door we picked. Let \\(k \\in \\{2,3\\}\\). Let us first look at what happens if we do not change. Then we have \\[\\begin{align} P(\\text{car in 1} | \\text{open $k$}) &= \\frac{P(\\text{open $k$} | \\text{car in 1})P(\\text{car in 1})}{P(\\text{open $k$})} \\\\ &= \\frac{P(\\text{open $k$} | \\text{car in 1})P(\\text{car in 1})}{\\sum_{n=1}^3 P(\\text{open $k$} | \\text{car in $n$})P(\\text{car in $n$)}}. \\end{align}\\] The probability that he opened \\(k\\) if the car is in 1 is \\(\\frac{1}{2}\\), as he can choose between door 2 and 3 as both have a goat behind it. Let us look at the normalization constant. When \\(n = 1\\) we get the value in the nominator. When \\(n=k\\), we get 0, as he will not open the door if there’s a prize behind. The remaining option is that we select 1, the car is behind \\(k\\) and he opens the only door left. Since he can’t open 1 due to it being our pick and \\(k\\) due to having the prize, the probability of opening the remaining door is 1, and the prior probability of the car being behind this door is \\(\\frac{1}{3}\\). So we have \\[\\begin{align} P(\\text{car in 1} | \\text{open $k$}) &= \\frac{\\frac{1}{2}\\frac{1}{3}}{\\frac{1}{2}\\frac{1}{3} + \\frac{1}{3}} \\\\ &= \\frac{1}{3}. \\end{align}\\] Now let us look at what happens if we do change. Let \\(k' \\in \\{2,3\\}\\) be the door that is not opened. If we change, we select this door, so we have \\[\\begin{align} P(\\text{car in $k'$} | \\text{open $k$}) &= \\frac{P(\\text{open $k$} | \\text{car in $k'$})P(\\text{car in $k'$})}{P(\\text{open $k$})} \\\\ &= \\frac{P(\\text{open $k$} | \\text{car in $k'$})P(\\text{car in $k'$})}{\\sum_{n=1}^3 P(\\text{open $k$} | \\text{car in $n$})P(\\text{car in $n$)}}. \\end{align}\\] The denominator stays the same, the only thing that is different from before is \\(P(\\text{open $k$} | \\text{car in $k'$})\\). We have a situation where we initially selected door 1 and the car is in door \\(k'\\). The probability that the host will open door \\(k\\) is then 1, as he can not pick any other door. 
So we have \\[\\begin{align} P(\\text{car in $k'$} | \\text{open $k$}) &= \\frac{\\frac{1}{3}}{\\frac{1}{2}\\frac{1}{3} + \\frac{1}{3}} \\\\ &= \\frac{2}{3}. \\end{align}\\] Therefore it makes sense to change the door. set.seed(1) nsamps <- 1000 ifchange <- vector(mode = "logical", length = nsamps) ifstay <- vector(mode = "logical", length = nsamps) for (i in 1:nsamps) { where_car <- sample(c(1:3), 1) where_player <- sample(c(1:3), 1) open_samp <- (1:3)[where_car != (1:3) & where_player != (1:3)] if (length(open_samp) == 1) { where_open <- open_samp } else { where_open <- sample(open_samp, 1) } ifstay[i] <- where_car == where_player where_ifchange <- (1:3)[where_open != (1:3) & where_player != (1:3)] ifchange[i] <- where_ifchange == where_car } sum(ifstay) / nsamps ## [1] 0.328 sum(ifchange) / nsamps ## [1] 0.672 "],["rvs.html", "Chapter 4 Random variables 4.1 General properties and calculations 4.2 Discrete random variables 4.3 Continuous random variables 4.4 Singular random variables 4.5 Transformations", " Chapter 4 Random variables This chapter deals with random variables and their distributions. The students are expected to acquire the following knowledge: Theoretical Identification of random variables. Convolutions of random variables. Derivation of PDF, PMF, CDF, and quantile function. Definitions and properties of common discrete random variables. Definitions and properties of common continuous random variables. Transforming univariate random variables. R Familiarize with PDF, PMF, CDF, and quantile functions for several distributions. Visual inspection of probability distributions. Analytical and empirical calculation of probabilities based on distributions. New R functions for plotting (for example, facet_wrap). Creating random number generators based on the Uniform distribution. .fold-btn { float: right; margin: 5px 5px 0 0; } .fold { border: 1px solid black; min-height: 40px; } 4.1 General properties and calculations Exercise 4.1 Which of the functions below are valid CDFs? Find their respective densities. R: Plot the three functions. \\[\\begin{equation} F(x) = \\begin{cases} 1 - e^{-x^2} & x \\geq 0 \\\\ 0 & x < 0. \\end{cases} \\end{equation}\\] \\[\\begin{equation} F(x) = \\begin{cases} e^{-\\frac{1}{x}} & x > 0 \\\\ 0 & x \\leq 0. \\end{cases} \\end{equation}\\] \\[\\begin{equation} F(x) = \\begin{cases} 0 & x \\leq 0 \\\\ \\frac{1}{3} & 0 < x \\leq \\frac{1}{2} \\\\ 1 & x > \\frac{1}{2}. \\end{cases} \\end{equation}\\] Solution. Yes. First, let us check the limits. \\(\\lim_{x \\rightarrow -\\infty} (0) = 0\\). \\(\\lim_{x \\rightarrow \\infty} (1 - e^{-x^2}) = 1 - \\lim_{x \\rightarrow \\infty} e^{-x^2} = 1 - 0 = 1\\). Second, let us check whether the function is increasing. Let \\(x > y \\geq 0\\). Then \\(1 - e^{-x^2} \\geq 1 - e^{-y^2}\\). We only have to check right continuity for the point zero. \\(F(0) = 0\\) and \\(\\lim_{\\epsilon \\downarrow 0}F (0 + \\epsilon) = \\lim_{\\epsilon \\downarrow 0} 1 - e^{-\\epsilon^2} = 1 - \\lim_{\\epsilon \\downarrow 0} e^{-\\epsilon^2} = 1 - 1 = 0\\). We get the density by differentiating the CDF. \\(p(x) = \\frac{d}{dx} 1 - e^{-x^2} = 2xe^{-x^2}.\\) Students are encouraged to check that this is a proper PDF. Yes. First, let us check the limits. $_{x -} (0) = 0 and \\(\\lim_{x \\rightarrow \\infty} (e^{-\\frac{1}{x}}) = 1\\). Second, let us check whether the function is increasing. Let \\(x > y \\geq 0\\). Then \\(e^{-\\frac{1}{x}} \\geq e^{-\\frac{1}{y}}\\). We only have to check right continuity for the point zero. 
\\(F(0) = 0\\) and \\(\\lim_{\\epsilon \\downarrow 0}F (0 + \\epsilon) = \\lim_{\\epsilon \\downarrow 0} e^{-\\frac{1}{\\epsilon}} = 0\\). We get the density by differentiating the CDF. \\(p(x) = \\frac{d}{dx} e^{-\\frac{1}{x}} = \\frac{1}{x^2}e^{-\\frac{1}{x}}.\\) Students are encouraged to check that this is a proper PDF. No. The function is not right continuous as \\(F(\\frac{1}{2}) = \\frac{1}{3}\\), but \\(\\lim_{\\epsilon \\downarrow 0} F(\\frac{1}{2} + \\epsilon) = 1\\). f1 <- function (x) { tmp <- 1 - exp(-x^2) tmp[x < 0] <- 0 return(tmp) } f2 <- function (x) { tmp <- exp(-(1 / x)) tmp[x <= 0] <- 0 return(tmp) } f3 <- function (x) { tmp <- x tmp[x == x] <- 1 tmp[x <= 0.5] <- 1/3 tmp[x <= 0] <- 0 return(tmp) } cdf_data <- tibble(x = seq(-1, 20, by = 0.001), f1 = f1(x), f2 = f2(x), f3 = f3(x)) %>% melt(id.vars = "x") cdf_plot <- ggplot(data = cdf_data, aes(x = x, y = value, color = variable)) + geom_hline(yintercept = 1) + geom_line() plot(cdf_plot) Exercise 4.2 Let \\(X\\) be a random variable with CDF \\[\\begin{equation} F(x) = \\begin{cases} 0 & x < 0 \\\\ \\frac{x^2}{2} & 0 \\leq x < 1 \\\\ \\frac{1}{2} + \\frac{p}{2} & 1 \\leq x < 2 \\\\ \\frac{1}{2} + \\frac{p}{2} + \\frac{1 - p}{2} & x \\geq 2 \\end{cases} \\end{equation}\\] R: Plot this CDF for \\(p = 0.3\\). Is it a discrete, continuous, or mixed random varible? Find the probability density/mass of \\(X\\). f1 <- function (x, p) { tmp <- x tmp[x >= 2] <- 0.5 + (p * 0.5) + ((1-p) * 0.5) tmp[x < 2] <- 0.5 + (p * 0.5) tmp[x < 1] <- (x[x < 1])^2 / 2 tmp[x < 0] <- 0 return(tmp) } cdf_data <- tibble(x = seq(-1, 5, by = 0.001), y = f1(x, 0.3)) cdf_plot <- ggplot(data = cdf_data, aes(x = x, y = y)) + geom_hline(yintercept = 1) + geom_line(color = "blue") plot(cdf_plot) ::: {.solution} \\(X\\) is a mixed random variable. Since \\(X\\) is a mixed random variable, we have to find the PDF of the continuous part and the PMF of the discrete part. We get the continuous part by differentiating the corresponding CDF, \\(\\frac{d}{dx}\\frac{x^2}{2} = x\\). So the PDF, when \\(0 \\leq x < 1\\), is \\(p(x) = x\\). Let us look at the discrete part now. It has two steps, so this is a discrete distribution with two outcomes – numbers 1 and 2. The first happens with probability \\(\\frac{p}{2}\\), and the second with probability \\(\\frac{1 - p}{2}\\). This reminds us of the Bernoulli distribution. The PMF for the discrete part is \\(P(X = x) = (\\frac{p}{2})^{2 - x} (\\frac{1 - p}{2})^{x - 1}\\). ::: Exercise 4.3 (Convolutions) Convolutions are probability distributions that correspond to sums of independent random variables. Let \\(X\\) and \\(Y\\) be independent discrete variables. Find the PMF of \\(Z = X + Y\\). Hint: Use the law of total probability. Let \\(X\\) and \\(Y\\) be independent continuous variables. Find the PDF of \\(Z = X + Y\\). Hint: Start with the CDF. Solution. \\[\\begin{align} P(Z = z) &= P(X + Y = z) & \\\\ &= \\sum_{k = -\\infty}^\\infty P(X + Y = z | Y = k) P(Y = k) & \\text{ (law of total probability)} \\\\ &= \\sum_{k = -\\infty}^\\infty P(X + k = z | Y = k) P(Y = k) & \\\\ &= \\sum_{k = -\\infty}^\\infty P(X + k = z) P(Y = k) & \\text{ (independence of $X$ and $Y$)} \\\\ &= \\sum_{k = -\\infty}^\\infty P(X = z - k) P(Y = k). & \\end{align}\\] Let \\(f\\) and \\(g\\) be the PDFs of \\(X\\) and \\(Y\\) respectively. 
\\[\\begin{align} F(z) &= P(Z < z) \\\\ &= P(X + Y < z) \\\\ &= \\int_{-\\infty}^{\\infty} P(X + Y < z | Y = y)P(Y = y)dy \\\\ &= \\int_{-\\infty}^{\\infty} P(X + y < z | Y = y)P(Y = y)dy \\\\ &= \\int_{-\\infty}^{\\infty} P(X + y < z)P(Y = y)dy \\\\ &= \\int_{-\\infty}^{\\infty} P(X < z - y)P(Y = y)dy \\\\ &= \\int_{-\\infty}^{\\infty} (\\int_{-\\infty}^{z - y} f(x) dx) g(y) dy \\end{align}\\] Now \\[\\begin{align} p(z) &= \\frac{d}{dz} F(z) & \\\\ &= \\int_{-\\infty}^{\\infty} (\\frac{d}{dz}\\int_{-\\infty}^{z - y} f(x) dx) g(y) dy & \\\\ &= \\int_{-\\infty}^{\\infty} f(z - y) g(y) dy & \\text{ (fundamental theorem of calculus)}. \\end{align}\\] 4.2 Discrete random variables Exercise 4.4 (Binomial random variable) Let \\(X_k\\), \\(k = 1,...,n\\), be random variables with the Bernoulli measure as the PMF. Let \\(X = \\sum_{k=1}^n X_k\\). We call \\(X_k\\) a Bernoulli random variable with parameter \\(p \\in (0,1)\\). Find the CDF of \\(X_k\\). Find PMF of \\(X\\). This is a Binomial random variable with support in \\(\\{0,1,2,...,n\\}\\) and parameters \\(p \\in (0,1)\\) and \\(n \\in \\mathbb{N}_0\\). We denote \\[\\begin{equation} X | n,p \\sim \\text{binomial}(n,p). \\end{equation}\\] Find CDF of \\(X\\). R: Simulate from the binomial distribution with \\(n = 10\\) and \\(p = 0.5\\), and from \\(n\\) Bernoulli distributions with \\(p = 0.5\\). Visually compare the sum of Bernoullis and the binomial. Hint: there is no standard function like rpois for a Bernoulli random variable. Check exercise 1.12 to find out how to sample from a Bernoulli distribution. Solution. There are two outcomes – zero and one. Zero happens with probability \\(1 - p\\). Therefore \\[\\begin{equation} F(k) = \\begin{cases} 0 & k < 0 \\\\ 1 - p & 0 \\leq k < 1 \\\\ 1 & k \\geq 1. \\end{cases} \\end{equation}\\] For the probability of \\(X\\) to be equal to some \\(k \\leq n\\), exactly \\(k\\) Bernoulli variables need to be one, and the others zero. So \\(p^k(1-p)^{n-k}\\). There are \\(\\binom{n}{k}\\) such possible arrangements. Therefore \\[\\begin{align} P(X = k) = \\binom{n}{k} p^k (1 - p)^{n-k}. \\end{align}\\] \\[\\begin{equation} F(k) = \\sum_{i = 0}^{\\lfloor k \\rfloor} \\binom{n}{i} p^i (1 - p)^{n - i} \\end{equation}\\] set.seed(1) nsamps <- 10000 binom_samp <- rbinom(nsamps, size = 10, prob = 0.5) bernoulli_mat <- matrix(data = NA, nrow = nsamps, ncol = 10) for (i in 1:nsamps) { bernoulli_mat[i, ] <- rbinom(10, size = 1, prob = 0.5) } bern_samp <- apply(bernoulli_mat, 1, sum) b_data <- tibble(x = c(binom_samp, bern_samp), type = c(rep("binomial", 10000), rep("Bernoulli_sum", 10000))) b_plot <- ggplot(data = b_data, aes(x = x, fill = type)) + geom_bar(position = "dodge") plot(b_plot) Exercise 4.5 (Geometric random variable) A variable with PMF \\[\\begin{equation} P(k) = p(1-p)^k \\end{equation}\\] is a geometric random variable with support in non-negative integers. It has one parameter \\(p \\in (0,1]\\). We denote \\[\\begin{equation} X | p \\sim \\text{geometric}(p) \\end{equation}\\] Derive the CDF of a geometric random variable. R: Draw 1000 samples from the geometric distribution with \\(p = 0.3\\) and compare their frequencies to theoretical values. Solution. 
\\[\\begin{align} P(X \\leq k) &= \\sum_{i = 0}^k p(1-p)^i \\\\ &= p \\sum_{i = 0}^k (1-p)^i \\\\ &= p \\frac{1 - (1-p)^{k+1}}{1 - (1 - p)} \\\\ &= 1 - (1-p)^{k + 1} \\end{align}\\] set.seed(1) geo_samp <- rgeom(n = 1000, prob = 0.3) geo_samp <- data.frame(x = geo_samp) %>% count(x) %>% mutate(n = n / 1000, type = "empirical_frequencies") %>% bind_rows(data.frame(x = 0:20, n = dgeom(0:20, prob = 0.3), type = "theoretical_measure")) geo_plot <- ggplot(data = geo_samp, aes(x = x, y = n, fill = type)) + geom_bar(stat="identity", position = "dodge") plot(geo_plot) Exercise 4.6 (Poisson random variable) A variable with PMF \\[\\begin{equation} P(k) = \\frac{\\lambda^k e^{-\\lambda}}{k!} \\end{equation}\\] is a Poisson random variable with support in non-negative integers. It has one positive parameter \\(\\lambda\\), which also represents its mean value and variance (a measure of the deviation of the values from the mean – more on mean and variance in the next chapter). We denote \\[\\begin{equation} X | \\lambda \\sim \\text{Poisson}(\\lambda). \\end{equation}\\] This distribution is usually the default choice for modeling counts. We have already encountered a Poisson random variable in exercise 1.13, where we also sampled from this distribution. The CDF of a Poisson random variable is \\(P(X <= x) = e^{-\\lambda} \\sum_{i=0}^x \\frac{\\lambda^{i}}{i!}\\). R: Draw 1000 samples from the Poisson distribution with \\(\\lambda = 5\\) and compare their empirical cumulative distribution function with the theoretical CDF. set.seed(1) pois_samp <- rpois(n = 1000, lambda = 5) pois_samp <- data.frame(x = pois_samp) pois_plot <- ggplot(data = pois_samp, aes(x = x, colour = "ECDF")) + stat_ecdf(geom = "step") + geom_step(data = tibble(x = 0:17, y = ppois(x, 5)), aes(x = x, y = y, colour = "CDF")) + scale_colour_manual("Lgend title", values = c("black", "red")) plot(pois_plot) Exercise 4.7 (Negative binomial random variable) A variable with PMF \\[\\begin{equation} p(k) = \\binom{k + r - 1}{k}(1-p)^r p^k \\end{equation}\\] is a negative binomial random variable with support in non-negative integers. It has two parameters \\(r > 0\\) and \\(p \\in (0,1)\\). We denote \\[\\begin{equation} X | r,p \\sim \\text{NB}(r,p). \\end{equation}\\] Let us reparameterize the negative binomial distribution with \\(q = 1 - p\\). Find the PMF of \\(X \\sim \\text{NB}(1, q)\\). Do you recognize this distribution? Show that the sum of two negative binomial random variables with the same \\(p\\) is also a negative binomial random variable. Hint: Use the fact that the number of ways to place \\(n\\) indistinct balls into \\(k\\) boxes is \\(\\binom{n + k - 1}{n}\\). R: Draw samples from \\(X \\sim \\text{NB}(5, 0.4)\\) and \\(Y \\sim \\text{NB}(3, 0.4)\\). Draw samples from \\(Z = X + Y\\), where you use the parameters calculated in b). Plot both distributions, their sum, and \\(Z\\) using facet_wrap. Be careful, as R uses a different parameterization size=\\(r\\) and prob=\\(1 - p\\). Solution. \\[\\begin{align} P(X = k) &= \\binom{k + 1 - 1}{k}q^1 (1-q)^k \\\\ &= q(1-q)^k. \\end{align}\\] This is the geometric distribution. Let \\(X \\sim \\text{NB}(r_1, p)\\) and \\(Y \\sim \\text{NB}(r_2, p)\\). Let \\(Z = X + Y\\). 
\\[\\begin{align} P(Z = z) &= \\sum_{k = 0}^{\\infty} P(X = z - k)P(Y = k), \\text{ if k < 0, then the probabilities are 0} \\\\ &= \\sum_{k = 0}^{z} P(X = z - k)P(Y = k), \\text{ if k > z, then the probabilities are 0} \\\\ &= \\sum_{k = 0}^{z} \\binom{z - k + r_1 - 1}{z - k}(1 - p)^{r_1} p^{z - k} \\binom{k + r_2 - 1}{k}(1 - p)^{r_2} p^{k} & \\\\ &= \\sum_{k = 0}^{z} \\binom{z - k + r_1 - 1}{z - k} \\binom{k + r_2 - 1}{k}(1 - p)^{r_1 + r_2} p^{z} & \\\\ &= (1 - p)^{r_1 + r_2} p^{z} \\sum_{k = 0}^{z} \\binom{z - k + r_1 - 1}{z - k} \\binom{k + r_2 - 1}{k}& \\end{align}\\] The part before the sum reminds us of the negative binomial distribution with parameters \\(r_1 + r_2\\) and \\(p\\). To complete this term to the negative binomial PMF we need \\(\\binom{z + r_1 + r_2 -1}{z}\\). So the only thing we need to prove is that the sum equals this term. Both terms in the sum can be interpreted as numbers of ways to place a number of balls into boxes. For the left term it is \\(z-k\\) balls into \\(r_1\\) boxes, and for the right \\(k\\) balls into \\(r_2\\) boxes. For each \\(k\\) we are distributing \\(z\\) balls in total. By summing over all \\(k\\), we actually get all the possible placements of \\(z\\) balls into \\(r_1 + r_2\\) boxes. Therefore \\[\\begin{align} P(Z = z) &= (1 - p)^{r_1 + r_2} p^{z} \\sum_{k = 0}^{z} \\binom{z - k + r_1 - 1}{z - k} \\binom{k + r_2 - 1}{k}& \\\\ &= \\binom{z + r_1 + r_2 -1}{z} (1 - p)^{r_1 + r_2} p^{z}. \\end{align}\\] From this it also follows that the sum of geometric distributions with the same parameter is a negative binomial distribution. \\(Z \\sim \\text{NB}(8, 0.4)\\). set.seed(1) nsamps <- 10000 x <- rnbinom(nsamps, size = 5, prob = 0.6) y <- rnbinom(nsamps, size = 3, prob = 0.6) xpy <- x + y z <- rnbinom(nsamps, size = 8, prob = 0.6) samps <- tibble(x, y, xpy, z) samps <- melt(samps) ggplot(data = samps, aes(x = value)) + geom_bar() + facet_wrap(~ variable) 4.3 Continuous random variables Exercise 4.8 (Exponential random variable) A variable \\(X\\) with PDF \\(\\lambda e^{-\\lambda x}\\) is an exponential random variable with support in non-negative real numbers. It has one positive parameter \\(\\lambda\\). We denote \\[\\begin{equation} X | \\lambda \\sim \\text{Exp}(\\lambda). \\end{equation}\\] Find the CDF of an exponential random variable. Find the quantile function of an exponential random variable. Calculate the probability \\(P(1 \\leq X \\leq 3)\\), where \\(X \\sim \\text{Exp(1.5)}\\). R: Check your answer to c) with a simulation (rexp). Plot the probability in a meaningful way. R: Implement PDF, CDF, and the quantile function and compare their values with corresponding R functions visually. Hint: use the size parameter to make one of the curves wider. Solution. \\[\\begin{align} F(x) &= \\int_{0}^{x} \\lambda e^{-\\lambda t} dt \\\\ &= \\lambda \\int_{0}^{x} e^{-\\lambda t} dt \\\\ &= \\lambda (\\frac{1}{-\\lambda}e^{-\\lambda t} |_{0}^{x}) \\\\ &= \\lambda(\\frac{1}{\\lambda} - \\frac{1}{\\lambda} e^{-\\lambda x}) \\\\ &= 1 - e^{-\\lambda x}. \\end{align}\\] \\[\\begin{align} F(F^{-1}(x)) &= x \\\\ 1 - e^{-\\lambda F^{-1}(x)} &= x \\\\ e^{-\\lambda F^{-1}(x)} &= 1 - x \\\\ -\\lambda F^{-1}(x) &= \\ln(1 - x) \\\\ F^{-1}(x) &= - \\frac{ln(1 - x)}{\\lambda}. \\end{align}\\] \\[\\begin{align} P(1 \\leq X \\leq 3) &= P(X \\leq 3) - P(X \\leq 1) \\\\ &= P(X \\leq 3) - P(X \\leq 1) \\\\ &= 1 - e^{-1.5 \\times 3} - 1 + e^{-1.5 \\times 1} \\\\ &\\approx 0.212. 
\\end{align}\\] set.seed(1) nsamps <- 1000 samps <- rexp(nsamps, rate = 1.5) sum(samps >= 1 & samps <= 3) / nsamps ## [1] 0.212 exp_plot <- ggplot(data.frame(x = seq(0, 5, by = 0.01)), aes(x = x)) + stat_function(fun = dexp, args = list(rate = 1.5)) + stat_function(fun = dexp, args = list(rate = 1.5), xlim = c(1,3), geom = "area", fill = "red") plot(exp_plot) exp_pdf <- function(x, lambda) { return (lambda * exp(-lambda * x)) } exp_cdf <- function(x, lambda) { return (1 - exp(-lambda * x)) } exp_quant <- function(q, lambda) { return (-(log(1 - q) / lambda)) } ggplot(data = data.frame(x = seq(0, 5, by = 0.01)), aes(x = x)) + stat_function(fun = dexp, args = list(rate = 1.5), aes(color = "R"), size = 2.5) + stat_function(fun = exp_pdf, args = list(lambda = 1.5), aes(color = "Mine"), size = 1.2) + scale_color_manual(values = c("red", "black")) ggplot(data = data.frame(x = seq(0, 5, by = 0.01)), aes(x = x)) + stat_function(fun = pexp, args = list(rate = 1.5), aes(color = "R"), size = 2.5) + stat_function(fun = exp_cdf, args = list(lambda = 1.5), aes(color = "Mine"), size = 1.2) + scale_color_manual(values = c("red", "black")) ggplot(data = data.frame(x = seq(0, 1, by = 0.01)), aes(x = x)) + stat_function(fun = qexp, args = list(rate = 1.5), aes(color = "R"), size = 2.5) + stat_function(fun = exp_quant, args = list(lambda = 1.5), aes(color = "Mine"), size = 1.2) + scale_color_manual(values = c("red", "black")) Exercise 4.9 (Uniform random variable) Continuous uniform random variable with parameters \\(a\\) and \\(b\\) has the PDF \\[\\begin{equation} p(x) = \\begin{cases} \\frac{1}{b - a} & x \\in [a,b] \\\\ 0 & \\text{otherwise}. \\end{cases} \\end{equation}\\] Find the CDF of the uniform random variable. Find the quantile function of the uniform random variable. Let \\(X \\sim \\text{Uniform}(a,b)\\). Find the CDF of the variable \\(Y = \\frac{X - a}{b - a}\\). This is the standard uniform random variable. Let \\(X \\sim \\text{Uniform}(-1, 3)\\). Find such \\(z\\) that \\(P(X < z + \\mu_x) = \\frac{1}{5}\\). R: Check your result from d) using simulation. Solution. \\[\\begin{align} F(x) &= \\int_{a}^x \\frac{1}{b - a} dt \\\\ &= \\frac{1}{b - a} \\int_{a}^x dt \\\\ &= \\frac{x - a}{b - a}. \\end{align}\\] \\[\\begin{align} F(F^{-1}(p)) &= p \\\\ \\frac{F^{-1}(p) - a}{b - a} &= p \\\\ F^{-1}(p) &= p(b - a) + a. \\end{align}\\] \\[\\begin{align} F_Y(y) &= P(Y < y) \\\\ &= P(\\frac{X - a}{b - a} < y) \\\\ &= P(X < y(b - a) + a) \\\\ &= F_X(y(b - a) + a) \\\\ &= \\frac{(y(b - a) + a) - a}{b - a} \\\\ &= y. \\end{align}\\] \\[\\begin{align} P(X < z + 1) &= \\frac{1}{5} \\\\ F(z + 1) &= \\frac{1}{5} \\\\ z + 1 &= F^{-1}(\\frac{1}{5}) \\\\ z &= \\frac{1}{5}4 - 1 - 1 \\\\ z &= -1.2. \\end{align}\\] set.seed(1) a <- -1 b <- 3 nsamps <- 10000 unif_samp <- runif(nsamps, a, b) mu_x <- mean(unif_samp) new_samp <- unif_samp - mu_x quantile(new_samp, probs = 1/5) ## 20% ## -1.203192 punif(-0.2, -1, 3) ## [1] 0.2 Exercise 4.10 (Beta random variable) A variable \\(X\\) with PDF \\[\\begin{equation} p(x) = \\frac{x^{\\alpha - 1} (1 - x)^{\\beta - 1}}{\\text{B}(\\alpha, \\beta)}, \\end{equation}\\] where \\(\\text{B}(\\alpha, \\beta) = \\frac{\\Gamma(\\alpha) \\Gamma(\\beta)}{\\Gamma(\\alpha + \\beta)}\\) and \\(\\Gamma(x) = \\int_0^{\\infty} x^{z - 1} e^{-x} dx\\) is a Beta random variable with support on \\([0,1]\\). It has two positive parameters \\(\\alpha\\) and \\(\\beta\\). 
Notation: \\[\\begin{equation} X | \\alpha, \\beta \\sim \\text{Beta}(\\alpha, \\beta) \\end{equation}\\] It is often used in modeling rates. Calculate the PDF for \\(\\alpha = 1\\) and \\(\\beta = 1\\). What do you notice? R: Plot densities of the beta distribution for parameter pairs (2, 2), (4, 1), (1, 4), (2, 5), and (0.1, 0.1). R: Sample from \\(X \\sim \\text{Beta}(2, 5)\\) and compare the histogram with Beta PDF. Solution. \\[\\begin{equation} p(x) = \\frac{x^{1 - 1} (1 - x)^{1 - 1}}{\\text{B}(1, 1)} = 1. \\end{equation}\\] This is the standard uniform distribution. set.seed(1) ggplot(data = data.frame(x = seq(0, 1, by = 0.01)), aes(x = x)) + stat_function(fun = dbeta, args = list(shape1 = 2, shape2 = 2), aes(color = "alpha = 0.5")) + stat_function(fun = dbeta, args = list(shape1 = 4, shape2 = 1), aes(color = "alpha = 4")) + stat_function(fun = dbeta, args = list(shape1 = 1, shape2 = 4), aes(color = "alpha = 1")) + stat_function(fun = dbeta, args = list(shape1 = 2, shape2 = 5), aes(color = "alpha = 25")) + stat_function(fun = dbeta, args = list(shape1 = 0.1, shape2 = 0.1), aes(color = "alpha = 0.1")) set.seed(1) nsamps <- 1000 samps <- rbeta(nsamps, 2, 5) ggplot(data = data.frame(x = samps), aes(x = x)) + geom_histogram(aes(y = ..density..), color = "black") + stat_function(data = data.frame(x = seq(0, 1, by = 0.01)), aes(x = x), fun = dbeta, args = list(shape1 = 2, shape2 = 5), color = "red", size = 1.2) Exercise 4.11 (Gamma random variable) A random variable with PDF \\[\\begin{equation} p(x) = \\frac{\\beta^\\alpha}{\\Gamma(\\alpha)} x^{\\alpha - 1}e^{-\\beta x} \\end{equation}\\] is a Gamma random variable with support on the positive numbers and parameters shape \\(\\alpha > 0\\) and rate \\(\\beta > 0\\). We write \\[\\begin{equation} X | \\alpha, \\beta \\sim \\text{Gamma}(\\alpha, \\beta) \\end{equation}\\] and it’s CDF is \\[\\begin{equation} \\frac{\\gamma(\\alpha, \\beta x)}{\\Gamma(\\alpha)}, \\end{equation}\\] where \\(\\gamma(s, x) = \\int_0^x t^{s-1} e^{-t} dt\\). It is usually used in modeling positive phenomena (for example insurance claims and rainfalls). Let \\(X \\sim \\text{Gamma}(1, \\beta)\\). Find the PDF of \\(X\\). Do you recognize this PDF? Let \\(k = \\alpha\\) and \\(\\theta = \\frac{1}{\\beta}\\). Find the PDF of \\(X | k, \\theta \\sim \\text{Gamma}(k, \\theta)\\). Random variables can be reparameterized, and sometimes a reparameterized distribution is more suitable for certain calculations. The first parameterization is for example usually used in Bayesian statistics, while this parameterization is more common in econometrics and some other applied fields. Note that you also need to pay attention to the parameters in statistical software, so diligently read the help files when using functions like rgamma to see how the function is parameterized. R: Plot gamma CDF for random variables with shape and rate parameters (1,1), (10,1), (1,10). Solution. \\[\\begin{align} p(x) &= \\frac{\\beta^1}{\\Gamma(1)} x^{1 - 1}e^{-\\beta x} \\\\ &= \\beta e^{-\\beta x} \\end{align}\\] This is the PDF of the exponential distribution with parameter \\(\\beta\\). \\[\\begin{align} p(x) &= \\frac{1}{\\Gamma(k)\\beta^k} x^{k - 1}e^{-\\frac{x}{\\theta}}. 
\\end{align}\\] set.seed(1) ggplot(data = data.frame(x = seq(0, 25, by = 0.01)), aes(x = x)) + stat_function(fun = pgamma, args = list(shape = 1, rate = 1), aes(color = "Gamma(1,1)")) + stat_function(fun = pgamma, args = list(shape = 10, rate = 1), aes(color = "Gamma(10,1)")) + stat_function(fun = pgamma, args = list(shape = 1, rate = 10), aes(color = "Gamma(1,10)")) Exercise 4.12 (Normal random variable) A random variable with PDF \\[\\begin{equation} p(x) = \\frac{1}{\\sqrt{2 \\pi \\sigma^2}} e^{-\\frac{(x - \\mu)^2}{2 \\sigma^2}} \\end{equation}\\] is a normal random variable with support on the real axis and parameters \\(\\mu\\) in reals and \\(\\sigma^2 > 0\\). The first is the mean parameter and the second is the variance parameter. Many statistical methods assume a normal distribution. We denote \\[\\begin{equation} X | \\mu, \\sigma \\sim \\text{N}(\\mu, \\sigma^2), \\end{equation}\\] and it’s CDF is \\[\\begin{equation} F(x) = \\int_{-\\infty}^x \\frac{1}{\\sqrt{2 \\pi \\sigma^2}} e^{-\\frac{(t - \\mu)^2}{2 \\sigma^2}} dt, \\end{equation}\\] which is intractable and is usually approximated. Due to its flexibility it is also one of the most researched distributions. For that reason statisticians often use transformations of variables or approximate distributions with the normal distribution. Show that a variable \\(\\frac{X - \\mu}{\\sigma} \\sim \\text{N}(0,1)\\). This transformation is called standardization, and \\(\\text{N}(0,1)\\) is a standard normal distribution. R: Plot the normal distribution with \\(\\mu = 0\\) and different values for the \\(\\sigma\\) parameter. R: The normal distribution provides a good approximation for the Poisson distribution with a large \\(\\lambda\\). Let \\(X \\sim \\text{Poisson}(50)\\). Approximate \\(X\\) with the normal distribution and compare its density with the Poisson histogram. What are the values of \\(\\mu\\) and \\(\\sigma^2\\) that should provide the best approximation? Note that R function rnorm takes standard deviation (\\(\\sigma\\)) as a parameter and not variance. Solution. \\[\\begin{align} P(\\frac{X - \\mu}{\\sigma} < x) &= P(X < \\sigma x + \\mu) \\\\ &= F(\\sigma x + \\mu) \\\\ &= \\int_{-\\infty}^{\\sigma x + \\mu} \\frac{1}{\\sqrt{2 \\pi \\sigma^2}} e^{-\\frac{(t - \\mu)^2}{2\\sigma^2}} dt \\end{align}\\] Now let \\(s = f(t) = \\frac{t - \\mu}{\\sigma}\\), then \\(ds = \\frac{dt}{\\sigma}\\) and \\(f(\\sigma x + \\mu) = x\\), so \\[\\begin{align} P(\\frac{X - \\mu}{\\sigma} < x) &= \\int_{-\\infty}^{x} \\frac{1}{\\sqrt{2 \\pi}} e^{-\\frac{s^2}{2}} ds. \\end{align}\\] There is no need to evaluate this integral, as we recognize it as the CDF of a normal distribution with \\(\\mu = 0\\) and \\(\\sigma^2 = 1\\). 
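As an additional check of a), we can verify the standardization claim with a short simulation; this is a minimal sketch, with \\(\\mu = 3\\) and \\(\\sigma = 2\\) chosen arbitrarily for illustration (they are not part of the exercise).
set.seed(1)
x <- rnorm(100000, mean = 3, sd = 2) # X ~ N(3, 4), arbitrary illustrative values
z <- (x - 3) / 2 # standardized samples
c(mean(z), sd(z)) # should be close to 0 and 1
ks.test(z, "pnorm") # compare the empirical distribution with the standard normal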
set.seed(1) # b ggplot(data = data.frame(x = seq(-15, 15, by = 0.01)), aes(x = x)) + stat_function(fun = dnorm, args = list(mean = 0, sd = 1), aes(color = "sd = 1")) + stat_function(fun = dnorm, args = list(mean = 0, sd = 0.4), aes(color = "sd = 0.4")) + stat_function(fun = dnorm, args = list(mean = 0, sd = 2), aes(color = "sd = 2")) + stat_function(fun = dnorm, args = list(mean = 0, sd = 5), aes(color = "sd = 5")) # c mean_par <- 50 nsamps <- 100000 pois_samps <- rpois(nsamps, lambda = mean_par) norm_samps <- rnorm(nsamps, mean = mean_par, sd = sqrt(mean_par)) my_plot <- ggplot() + geom_bar(data = tibble(x = pois_samps), aes(x = x, y = (..count..)/sum(..count..))) + geom_density(data = tibble(x = norm_samps), aes(x = x), color = "red") plot(my_plot) Exercise 4.13 (Logistic random variable) A logistic random variable has CDF \\[\\begin{equation} F(x) = \\frac{1}{1 + e^{-\\frac{x - \\mu}{s}}}, \\end{equation}\\] where \\(\\mu\\) is real and \\(s > 0\\). The support is on the real axis. We denote \\[\\begin{equation} X | \\mu, s \\sim \\text{Logistic}(\\mu, s). \\end{equation}\\] The distribution of the logistic random variable resembles a normal random variable; however, it has heavier tails. Find the PDF of a logistic random variable. R: Implement the logistic PDF and CDF and visually compare both for \\(X \\sim \\text{N}(0, 4)\\) and \\(Y \\sim \\text{logit}(0, \\sqrt{\\frac{12}{\\pi^2}})\\). These distributions have the same mean and variance. Additionally, plot the same plot on the interval [5,10], to better see the difference in the tails. R: For the distributions in b) find the probability \\(P(|X| > 4)\\) and interpret the result. Solution. \\[\\begin{align} p(x) &= \\frac{d}{dx} \\frac{1}{1 + e^{-\\frac{x - \\mu}{s}}} \\\\ &= \\frac{- \\frac{d}{dx} (1 + e^{-\\frac{x - \\mu}{s}})}{(1 + e^{-\\frac{x - \\mu}{s}})^2} \\\\ &= \\frac{e^{-\\frac{x - \\mu}{s}}}{s(1 + e^{-\\frac{x - \\mu}{s}})^2}. \\end{align}\\] # b set.seed(1) logit_pdf <- function (x, mu, s) { return ((exp(-(x - mu)/(s))) / (s * (1 + exp(-(x - mu)/(s)))^2)) } nl_plot <- ggplot(data = data.frame(x = seq(-12, 12, by = 0.01)), aes(x = x)) + stat_function(fun = dnorm, args = list(mean = 0, sd = 2), aes(color = "normal")) + stat_function(fun = logit_pdf, args = list(mu = 0, s = sqrt(12/pi^2)), aes(color = "logit")) plot(nl_plot) nl_plot <- ggplot(data = data.frame(x = seq(5, 10, by = 0.01)), aes(x = x)) + stat_function(fun = dnorm, args = list(mean = 0, sd = 2), aes(color = "normal")) + stat_function(fun = logit_pdf, args = list(mu = 0, s = sqrt(12/pi^2)), aes(color = "logit")) plot(nl_plot) # c logit_cdf <- function (x, mu, s) { return (1 / (1 + exp(-(x - mu) / s))) } p_logistic <- 1 - logit_cdf(4, 0, sqrt(12/pi^2)) + logit_cdf(-4, 0, sqrt(12/pi^2)) p_norm <- 1 - pnorm(4, 0, 2) + pnorm(-4, 0, 2) p_logistic ## [1] 0.05178347 p_norm ## [1] 0.04550026 # The logistic distribution has wider tails, therefore the probability of larger # absolute values is higher. 4.4 Singular random variables Exercise 4.14 (Cantor distribution) The Cantor set is a subset of \\([0,1]\\), which we create by iteratively deleting the middle third of the interval. For example, in the first iteration, we get the sets \\([0,\\frac{1}{3}]\\) and \\([\\frac{2}{3},1]\\). In the second iteration, we get \\([0,\\frac{1}{9}]\\), \\([\\frac{2}{9},\\frac{1}{3}]\\), \\([\\frac{2}{3}, \\frac{7}{9}]\\), and \\([\\frac{8}{9}, 1]\\). 
On the \\(n\\)-th iteration, we have \\[\\begin{equation} C_n = \\frac{C_{n-1}}{3} \\cup \\bigg(\\frac{2}{3} + \\frac{C_{n-1}}{3} \\bigg), \\end{equation}\\] where \\(C_0 = [0,1]\\). The Cantor set is then defined as the intersection of these sets \\[\\begin{equation} C = \\cap_{n=1}^{\\infty} C_n. \\end{equation}\\] It has the same cardinality as \\([0,1]\\). Another way to define the Cantor set is as the set of all numbers on \\([0,1]\\) that do not have a 1 in the ternary representation \\(x = \\sum_{i=1}^\\infty \\frac{x_i}{3^i}, x_i \\in \\{0,1,2\\}\\). A random variable follows the Cantor distribution if its CDF is the Cantor function (below). You can find the implementations of the random number generator, CDF, and quantile functions for the Cantor distribution at https://github.com/Henrygb/CantorDist.R. Show that the Lebesgue measure of the Cantor set is 0. (Jagannathan) Let us look at an infinite sequence of independent fair-coin tosses. If the \\(i\\)-th outcome is heads, let \\(x_i = 2\\), and let \\(x_i = 0\\) when it is tails. Then use these to create \\(x = \\sum_{i=1}^\\infty \\frac{x_i}{3^i}\\). This is a random variable with the Cantor distribution. Show that \\(X\\) has a singular distribution. Solution. \\[\\begin{align} \\lambda(C) &= 1 - \\lambda(C^c) \\\\ &= 1 - \\frac{1}{3}\\sum_{k = 0}^\\infty (\\frac{2}{3})^k \\\\ &= 1 - \\frac{\\frac{1}{3}}{1 - \\frac{2}{3}} \\\\ &= 0. \\end{align}\\] First, for every \\(x\\), the probability of observing it is \\(\\lim_{n \\rightarrow \\infty} \\frac{1}{2^n} = 0\\). Second, the probability that we observe one of all the possible sequences is 1. Therefore \\(P(C) = 1\\). So this is a singular variable. The CDF only increments on the elements of the Cantor set. 4.5 Transformations Exercise 4.15 Let \\(X\\) be a random variable that is uniformly distributed on \\(\\{-2, -1, 0, 1, 2\\}\\). Find the PMF of \\(Y = X^2\\). Solution. \\[\\begin{align} P_Y(y) = \\sum_{x : x^2 = y} P_X(x) = \\begin{cases} 0 & y \\notin \\{0,1,4\\} \\\\ \\frac{1}{5} & y = 0 \\\\ \\frac{2}{5} & y \\in \\{1,4\\} \\end{cases} \\end{align}\\] Exercise 4.16 (Lognormal random variable) A lognormal random variable is a variable whose logarithm is normally distributed. In practice, we often encounter skewed data. Using a log transformation on such data usually makes it more symmetric and therefore more suitable for modeling with the normal distribution (more on why we wish to model data with the normal distribution in the following chapters). Let \\(X \\sim \\text{N}(\\mu,\\sigma)\\). Find the PDF of \\(Y: \\log(Y) = X\\). R: Sample from the lognormal distribution with parameters \\(\\mu = 0.5\\) and \\(\\sigma = 0.4\\). Plot a histogram of the samples. Then log-transform the samples and plot a histogram along with the theoretical normal PDF. Solution. \\[\\begin{align} p_Y(y) &= p_X(\\log(y)) \\frac{d}{dy} \\log(y) \\\\ &= \\frac{1}{\\sqrt{2 \\pi \\sigma^2}} e^{-\\frac{(\\log(y) - \\mu)^2}{2 \\sigma^2}} \\frac{1}{y} \\\\ &= \\frac{1}{y \\sqrt{2 \\pi \\sigma^2}} e^{-\\frac{(\\log(y) - \\mu)^2}{2 \\sigma^2}}. 
\\end{align}\\] set.seed(1) nsamps <- 10000 mu <- 0.5 sigma <- 0.4 ln_samps <- rlnorm(nsamps, mu, sigma) ln_plot <- ggplot(data = data.frame(x = ln_samps), aes(x = x)) + geom_histogram(color = "black") plot(ln_plot) norm_samps <- log(ln_samps) n_plot <- ggplot(data = data.frame(x = norm_samps), aes(x = x)) + geom_histogram(aes(y = ..density..), color = "black") + stat_function(fun = dnorm, args = list(mean = mu, sd = sigma), color = "red") plot(n_plot) Exercise 4.17 (Probability integral transform) This exercise is borrowed from Wasserman. Let \\(X\\) have a continuous, strictly increasing CDF \\(F\\). Let \\(Y = F(X)\\). Find the density of \\(Y\\). This is called the probability integral transform. Let \\(U \\sim \\text{Uniform}(0,1)\\) and let \\(X = F^{-1}(U)\\). Show that \\(X \\sim F\\). R: Implement a program that takes Uniform(0,1) random variables and generates random variables from an exponential(\\(\\beta\\)) distribution. Compare your implemented function with function rexp in R. Solution. \\[\\begin{align} F_Y(y) &= P(Y < y) \\\\ &= P(F(X) < y) \\\\ &= P(X < F_X^{-1}(y)) \\\\ &= F_X(F_X^{-1}(y)) \\\\ &= y. \\end{align}\\] From the above it follows that \\(p(y) = 1\\). Note that we need to know the inverse CDF to be able to apply this procedure. \\[\\begin{align} P(X < x) &= P(F^{-1}(U) < x) \\\\ &= P(U < F(x)) \\\\ &= F_U(F(x)) \\\\ &= F(x). \\end{align}\\] set.seed(1) nsamps <- 10000 beta <- 4 generate_exp <- function (n, beta) { tmp <- runif(n) X <- qexp(tmp, beta) return (X) } df <- tibble("R" = rexp(nsamps, beta), "myGenerator" = generate_exp(nsamps, beta)) %>% gather() ggplot(data = df, aes(x = value, fill = key)) + geom_histogram(position = "dodge") "],["mrvs.html", "Chapter 5 Multiple random variables 5.1 General 5.2 Bivariate distribution examples 5.3 Transformations", " Chapter 5 Multiple random variables This chapter deals with multiple random variables and their distributions. The students are expected to acquire the following knowledge: Theoretical Calculation of PDF of transformed multiple random variables. Finding marginal and conditional distributions. R Scatterplots of bivariate random variables. New R functions (for example, expand.grid). .fold-btn { float: right; margin: 5px 5px 0 0; } .fold { border: 1px solid black; min-height: 40px; } 5.1 General Exercise 5.1 Let \\(X \\sim \\text{N}(0,1)\\) and \\(Y \\sim \\text{N}(0,1)\\) be independent random variables. Draw 1000 samples from \\((X,Y)\\) and plot a scatterplot. Now let \\(X \\sim \\text{N}(0,1)\\) and \\(Y | X = x \\sim N(ax, 1)\\). Draw 1000 samples from \\((X,Y)\\) for \\(a = 1\\), \\(a=0\\), and \\(a=-0.5\\). Plot the scatterplots. How would you interpret parameter \\(a\\)? Plot the marginal distribution of \\(Y\\) for cases \\(a=1\\), \\(a=0\\), and \\(a=-0.5\\). Can you guess which distribution it is? set.seed(1) nsamps <- 1000 x <- rnorm(nsamps) y <- rnorm(nsamps) ggplot(data.frame(x, y), aes(x = x, y = y)) + geom_point() y1 <- rnorm(nsamps, mean = 1 * x) y2 <- rnorm(nsamps, mean = 0 * x) y3 <- rnorm(nsamps, mean = -0.5 * x) df <- tibble(x = c(x,x,x), y = c(y1,y2,y3), a = c(rep(1, nsamps), rep(0, nsamps), rep(-0.5, nsamps))) ggplot(df, aes(x = x, y = y)) + geom_point() + facet_wrap(~a) + coord_equal(ratio=1) # Parameter a controls the scale of linear dependency between X and Y. 
ggplot(df, aes(x = y)) + geom_density() + facet_wrap(~a) 5.2 Bivariate distribution examples Exercise 5.2 (Discrete bivariate random variable) Let \\(X\\) represent the event that a die rolls an even number and let \\(Y\\) represent the event that a die rolls one, two, or a three. Find the marginal distributions of \\(X\\) and \\(Y\\). Find the PMF of \\((X,Y)\\). Find the CDF of \\((X,Y)\\). Find \\(P(X = 1 | Y = 1)\\). Solution. \\[\\begin{align} P(X = 1) = \\frac{1}{2} \\text{ and } P(X = 0) = \\frac{1}{2} \\\\ P(Y = 1) = \\frac{1}{2} \\text{ and } P(Y = 0) = \\frac{1}{2} \\\\ \\end{align}\\] \\[\\begin{align} P(X = 1, Y = 1) = \\frac{1}{6} \\\\ P(X = 1, Y = 0) = \\frac{2}{6} \\\\ P(X = 0, Y = 1) = \\frac{2}{6} \\\\ P(X = 0, Y = 0) = \\frac{1}{6} \\end{align}\\] \\[\\begin{align} P(X \\leq x, Y \\leq y) = \\begin{cases} \\frac{1}{6} & x = 0, y = 0 \\\\ \\frac{3}{6} & x \\neq y \\\\ 1 & x = 1, y = 1 \\end{cases} \\end{align}\\] \\[\\begin{align} P(X = 1 | Y = 1) = \\frac{1}{3} \\end{align}\\] Exercise 5.3 (Continuous bivariate random variable) Let \\(p(x,y) = 6 (x - y)^2\\) be the PDF of a bivariate random variable \\((X,Y)\\), where both variables range from zero to one. Find CDF. Find marginal distributions. Find conditional distributions. R: Plot a grid of points and colour them by value – this can help us visualize the PDF. R: Implement a random number generator, which will generate numbers from \\((X,Y)\\) and visually check the results. R: Plot the marginal distribution of \\(Y\\) and the conditional distributions of \\(X | Y = y\\), where \\(y \\in \\{0, 0.1, 0.5\\}\\). Solution. \\[\\begin{align} F(x,y) &= \\int_0^{x} \\int_0^{y} 6 (t - s)^2 ds dt\\\\ &= 6 \\int_0^{x} \\int_0^{y} t^2 - 2ts + s^2 ds dt\\\\ &= 6 \\int_0^{x} t^2y - ty^2 + \\frac{y^3}{3} dt \\\\ &= 6 (\\frac{x^3 y}{3} - \\frac{x^2y^2}{2} + \\frac{x y^3}{3}) \\\\ &= 2 x^3 y - 3 t^2y^2 + 2 x y^3 \\end{align}\\] \\[\\begin{align} p(x) &= \\int_0^{1} 6 (x - y)^2 dy\\\\ &= 6 (x^2 - x + \\frac{1}{3}) \\\\ &= 6x^2 - 6x + 2 \\end{align}\\] \\[\\begin{align} p(y) &= \\int_0^{1} 6 (x - y)^2 dx\\\\ &= 6 (y^2 - y + \\frac{1}{3}) \\\\ &= 6y^2 - 6y + 2 \\end{align}\\] \\[\\begin{align} p(x|y) &= \\frac{p(xy)}{p(y)} \\\\ &= \\frac{6 (x - y)^2}{6 (y^2 - y + \\frac{1}{3})} \\\\ &= \\frac{(x - y)^2}{y^2 - y + \\frac{1}{3}} \\end{align}\\] \\[\\begin{align} p(y|x) &= \\frac{p(xy)}{p(x)} \\\\ &= \\frac{6 (x - y)^2}{6 (x^2 - x + \\frac{1}{3})} \\\\ &= \\frac{(x - y)^2}{x^2 - x + \\frac{1}{3}} \\end{align}\\] set.seed(1) # d pxy <- function (x, y) { return ((x - y)^2) } x_axis <- seq(0, 1, length.out = 100) y_axis <- seq(0, 1, length.out = 100) df <- expand.grid(x_axis, y_axis) colnames(df) <- c("x", "y") df <- cbind(df, pdf = pxy(df$x, df$y)) ggplot(data = df, aes(x = x, y = y, color = pdf)) + geom_point() # e samps <- NULL for (i in 1:10000) { xt <- runif(1, 0, 1) yt <- runif(1, 0, 1) pdft <- pxy(xt, yt) acc <- runif(1, 0, 6) if (acc <= pdft) { samps <- rbind(samps, c(xt, yt)) } } colnames(samps) <- c("x", "y") ggplot(data = as.data.frame(samps), aes(x = x, y = y)) + geom_point() # f mar_pdf <- function (x) { return (6 * x^2 - 6 * x + 2) } cond_pdf <- function (x, y) { return (((x - y)^2) / (y^2 - y + 1/3)) } df <- tibble(x = x_axis, mar = mar_pdf(x), y0 = cond_pdf(x, 0), y0.1 = cond_pdf(x, 0.1), y0.5 = cond_pdf(x, 0.5)) %>% gather(dist, value, -x) ggplot(df, aes(x = x, y = value, color = dist)) + geom_line() Exercise 5.4 (Mixed bivariate random variable) Let \\(f(x,y) = \\frac{\\beta^{\\alpha}}{\\Gamma(\\alpha)y!} x^{y+ \\alpha -1} 
e^{-x(1 + \\beta)}\\) be the PDF of a bivariate random variable, where \\(x \\in (0, \\infty)\\) and \\(y \\in \\mathbb{N}_0\\). Find the marginal distribution of \\(X\\). Do you recognize this distribution? Find the conditional distribution of \\(Y | X\\). Do you recognize this distribution? Calculate the probability \\(P(Y = 2 | X = 2.5)\\) for \\((X,Y)\\). Find the marginal distribution of \\(Y\\). Do you recognize this distribution? R: Take 1000 random samples from \\((X,Y)\\) with parameters \\(\\beta = 1\\) and \\(\\alpha = 1\\). Plot a scatterplot. Plot a bar plot of the marginal distribution of \\(Y\\), and the theoretical PMF calculated from d) on the range from 0 to 10. Hint: Use the gamma function in R.? Solution. \\[\\begin{align} p(x) &= \\sum_{k = 0}^{\\infty} \\frac{\\beta^{\\alpha}}{\\Gamma(\\alpha)k!} x^{k + \\alpha -1} e^{-x(1 + \\beta)} & \\\\ &= \\sum_{k = 0}^{\\infty} \\frac{\\beta^{\\alpha}}{\\Gamma(\\alpha)k!} x^{k} x^{\\alpha -1} e^{-x} e^{-\\beta x} & \\\\ &= \\frac{\\beta^{\\alpha}}{\\Gamma(\\alpha)} x^{\\alpha -1} e^{-\\beta x} \\sum_{k = 0}^{\\infty} \\frac{1}{k!} x^{k} e^{-x} & \\\\ &= \\frac{\\beta^{\\alpha}}{\\Gamma(\\alpha)} x^{\\alpha -1} e^{-\\beta x} & \\text{the last term above sums to one} \\end{align}\\] This is the Gamma PDF. \\[\\begin{align} p(y|x) &= \\frac{p(x,y)}{p(x)} \\\\ &= \\frac{\\frac{\\beta^{\\alpha}}{\\Gamma(\\alpha)y!} x^{y+ \\alpha -1} e^{-x(1 + \\beta)}}{\\frac{\\beta^{\\alpha}}{\\Gamma(\\alpha)} x^{\\alpha -1} e^{-\\beta x}} \\\\ &= \\frac{x^y e^{-x}}{y!}. \\end{align}\\] This is the Poisson PMF. \\[\\begin{align} P(Y = 2 | X = 2.5) = \\frac{2.5^2 e^{-2.5}}{2!} \\approx 0.26. \\end{align}\\] \\[\\begin{align} p(y) &= \\int_{0}^{\\infty} \\frac{\\beta^{\\alpha}}{\\Gamma(\\alpha)y!} x^{y + \\alpha -1} e^{-x(1 + \\beta)} dx & \\\\ &= \\frac{1}{y!} \\int_{0}^{\\infty} \\frac{\\beta^{\\alpha}}{\\Gamma(\\alpha)} x^{(y + \\alpha) -1} e^{-(1 + \\beta)x} dx & \\\\ &= \\frac{1}{y!} \\frac{\\beta^{\\alpha}}{\\Gamma(\\alpha)} \\int_{0}^{\\infty} \\frac{\\Gamma(y + \\alpha)}{(1 + \\beta)^{y + \\alpha}} \\frac{(1 + \\beta)^{y + \\alpha}}{\\Gamma(y + \\alpha)} x^{(y + \\alpha) -1} e^{-(1 + \\beta)x} dx & \\text{complete to Gamma PDF} \\\\ &= \\frac{1}{y!} \\frac{\\beta^{\\alpha}}{\\Gamma(\\alpha)} \\frac{\\Gamma(y + \\alpha)}{(1 + \\beta)^{y + \\alpha}}. \\end{align}\\] We add the terms in the third equality to get a Gamma PDF inside the integral, which then integrates to one. We do not recognize this distribution. set.seed(1) px <- function (x, alpha, beta) { return((1 / factorial(x)) * (beta^alpha / gamma(alpha)) * (gamma(x + alpha) / (1 + beta)^(x + alpha))) } nsamps <- 1000 rx <- rgamma(nsamps, 1, 1) ryx <- rpois(nsamps, rx) ggplot(data = data.frame(x = rx, y = ryx), aes(x = x, y = y)) + geom_point() ggplot(data = data.frame(x = rx, y = ryx), aes(x = y)) + geom_bar(aes(y = (..count..)/sum(..count..))) + stat_function(fun = px, args = list(alpha = 1, beta = 1), color = "red") Exercise 5.5 Let \\(f(x,y) = cx^2y\\) for \\(x^2 \\leq y \\leq 1\\) and zero otherwise. Find such \\(c\\) that \\(f\\) is a PDF of a bivariate random variable. This exercise is borrowed from Wasserman. Solution. \\[\\begin{align} 1 &= \\int_{-1}^{1} \\int_{x^2}^1 cx^2y dy dx \\\\ &= \\int_{-1}^{1} cx^2 (\\frac{1}{2} - \\frac{x^4}{2}) dx \\\\ &= \\frac{c}{2} \\int_{-1}^{1} x^2 - x^6 dx \\\\ &= \\frac{c}{2} (\\frac{1}{3} + \\frac{1}{3} - \\frac{1}{7} - \\frac{1}{7}) \\\\ &= \\frac{c}{2} \\frac{8}{21} \\\\ &= \\frac{4c}{21} \\end{align}\\] It follows \\(c = \\frac{21}{4}\\). 
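As a quick numerical check of the result in Exercise 5.5, we can verify by Monte Carlo integration that with \\(c = \\frac{21}{4}\\) the function integrates to one. This is a minimal sketch; the sample size is arbitrary and not part of the exercise.
set.seed(1)
nsamps <- 100000
x <- runif(nsamps, -1, 1) # sample uniformly on the bounding box [-1,1] x [0,1]
y <- runif(nsamps, 0, 1)
f <- (21/4) * x^2 * y * (y >= x^2) # the density, set to zero outside the support x^2 <= y <= 1
2 * mean(f) # bounding-box area (2) times the average value; should be close to 1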
5.3 Transformations Exercise 5.6 Let \\((X,Y,Z)\\) be uniformly distributed on the unit ball \\(\\{(x,y,z) : x^2 + y^2 + z^2 \\leq 1\\}\\). Let \\(R = \\sqrt{X^2 + Y^2 + Z^2}\\). Find the CDF and PDF of \\(R\\). Solution. \\[\\begin{align} P(R < r) &= P(\\sqrt{X^2 + Y^2 + Z^2} < r) \\\\ &= P(X^2 + Y^2 + Z^2 < r^2) \\\\ &= \\frac{\\frac{4}{3} \\pi r^3}{\\frac{4}{3}\\pi} \\\\ &= r^3. \\end{align}\\] The second line shows us that we are looking at the probability represented by the volume of a smaller ball with radius \\(r\\). To get the probability, we divide this volume by the volume of the whole unit ball. We get the PDF by differentiating the CDF, so \\(p(r) = 3r^2\\). "],["integ.html", "Chapter 6 Integration 6.1 Monte Carlo integration 6.2 Lebesgue integrals", " Chapter 6 Integration This chapter deals with abstract and Monte Carlo integration. The students are expected to acquire the following knowledge: Theoretical How to calculate Lebesgue integrals for non-simple functions. R Monte Carlo integration. .fold-btn { float: right; margin: 5px 5px 0 0; } .fold { border: 1px solid black; min-height: 40px; } 6.1 Monte Carlo integration Exercise 6.1 Let \\(X\\) and \\(Y\\) be continuous random variables on the unit interval and \\(p(x,y) = 6(x - y)^2\\). Use Monte Carlo integration to estimate the probability \\(P(0.2 \\leq X \\leq 0.5, \\: 0.1 \\leq Y \\leq 0.2)\\). Can you find the exact value? set.seed(1) nsamps <- 1000 V <- (0.5 - 0.2) * (0.2 - 0.1) x1 <- runif(nsamps, 0.2, 0.5) x2 <- runif(nsamps, 0.1, 0.2) f_1 <- function (x, y) { return (6 * (x - y)^2) } mcint <- V * (1 / nsamps) * sum(f_1(x1, x2)) sdm <- sqrt((V^2 / nsamps) * var(f_1(x1, x2))) mcint ## [1] 0.008793445 sdm ## [1] 0.0002197686 F_1 <- function (x, y) { return (2 * x^3 * y - 3 * x^2 * y^2 + 2 * x * y^3) } F_1(0.5, 0.2) - F_1(0.2, 0.2) - F_1(0.5, 0.1) + F_1(0.2, 0.1) ## [1] 0.0087 6.2 Lebesgue integrals Exercise 6.2 (borrowed from Jagannathan) Find the Lebesgue integral of the following functions on (\\(\\mathbb{R}\\), \\(\\mathcal{B}(\\mathbb{R})\\), \\(\\lambda\\)). \\[\\begin{align} f(\\omega) = \\begin{cases} \\omega, & \\text{for } \\omega = 0,1,...,n \\\\ 0, & \\text{elsewhere} \\end{cases} \\end{align}\\] \\[\\begin{align} f(\\omega) = \\begin{cases} 1, & \\text{for } \\omega \\in \\mathbb{Q}^c \\cap [0,1] \\\\ 0, & \\text{elsewhere} \\end{cases} \\end{align}\\] \\[\\begin{align} f(\\omega) = \\begin{cases} n, & \\text{for } \\omega \\in \\mathbb{Q}^c \\cap [0,n] \\\\ 0, & \\text{elsewhere} \\end{cases} \\end{align}\\] Solution. \\[\\begin{align} \\int f(\\omega) d\\lambda = \\sum_{\\omega = 0}^n \\omega \\lambda(\\omega) = 0. \\end{align}\\] \\[\\begin{align} \\int f(\\omega) d\\lambda = 1 \\times \\lambda(\\mathbb{Q}^c \\cap [0,1]) = 1. \\end{align}\\] \\[\\begin{align} \\int f(\\omega) d\\lambda = n \\times \\lambda(\\mathbb{Q}^c \\cap [0,n]) = n^2. \\end{align}\\] Exercise 6.3 (borrowed from Jagannathan) Let \\(c \\in \\mathbb{R}\\) be fixed and (\\(\\mathbb{R}\\), \\(\\mathcal{B}(\\mathbb{R})\\)) a measurable space. If for any Borel set \\(A\\), \\(\\delta_c (A) = 1\\) if \\(c \\in A\\), and \\(\\delta_c (A) = 0\\) otherwise, then \\(\\delta_c\\) is called a Dirac measure. Let \\(g\\) be a non-negative, measurable function. Show that \\(\\int g d \\delta_c = g(c)\\). Solution. 
\\[\\begin{align} \\int g d \\delta_c &= \\sup_{q \\in S(g)} \\int q d \\delta_c \\\\ &= \\sup_{q \\in S(g)} \\sum_{i = 1}^n a_i \\delta_c(A_i) \\\\ &= \\sup_{q \\in S(g)} \\sum_{i = 1}^n a_i \\text{I}_{A_i}(c) \\\\ &= \\sup_{q \\in S(g)} q(c) \\\\ &= g(c) \\end{align}\\] "],["ev.html", "Chapter 7 Expected value 7.1 Discrete random variables 7.2 Continuous random variables 7.3 Sums, functions, conditional expectations 7.4 Covariance", " Chapter 7 Expected value This chapter deals with expected values of random variables. The students are expected to acquire the following knowledge: Theoretical Calculation of the expected value. Calculation of variance and covariance. Cauchy distribution. R Estimation of expected value. Estimation of variance and covariance. .fold-btn { float: right; margin: 5px 5px 0 0; } .fold { border: 1px solid black; min-height: 40px; } 7.1 Discrete random variables Exercise 7.1 (Bernoulli) Let \\(X \\sim \\text{Bernoulli}(p)\\). Find \\(E[X]\\). Find \\(Var[X]\\). R: Let \\(p = 0.4\\). Check your answers to a) and b) with a simulation. Solution. \\[\\begin{align*} E[X] = \\sum_{k=0}^1 p^k (1-p)^{1-k} k = p. \\end{align*}\\] \\[\\begin{align*} Var[X] = E[X^2] - E[X]^2 = \\sum_{k=0}^1 (p^k (1-p)^{1-k} k^2) - p^2 = p(1-p). \\end{align*}\\] set.seed(1) nsamps <- 1000 x <- rbinom(nsamps, 1, 0.4) mean(x) ## [1] 0.394 var(x) ## [1] 0.239003 0.4 * (1 - 0.4) ## [1] 0.24 Exercise 7.2 (Binomial) Let \\(X \\sim \\text{Binomial}(n,p)\\). Find \\(E[X]\\). Find \\(Var[X]\\). Solution. Let \\(X = \\sum_{i=0}^n X_i\\), where \\(X_i \\sim \\text{Bernoulli}(p)\\). Then, due to linearity of expectation \\[\\begin{align*} E[X] = E[\\sum_{i=0}^n X_i] = \\sum_{i=0}^n E[X_i] = np. \\end{align*}\\] Again let \\(X = \\sum_{i=0}^n X_i\\), where \\(X_i \\sim \\text{Bernoulli}(p)\\). Since the Bernoulli variables \\(X_i\\) are independent we have \\[\\begin{align*} Var[X] = Var[\\sum_{i=0}^n X_i] = \\sum_{i=0}^n Var[X_i] = np(1-p). \\end{align*}\\] Exercise 7.3 (Poisson) Let \\(X \\sim \\text{Poisson}(\\lambda)\\). Find \\(E[X]\\). Find \\(Var[X]\\). Solution. \\[\\begin{align*} E[X] &= \\sum_{k=0}^\\infty \\frac{\\lambda^k e^{-\\lambda}}{k!} k & \\\\ &= \\sum_{k=1}^\\infty \\frac{\\lambda^k e^{-\\lambda}}{k!} k & \\text{term at $k=0$ is 0} \\\\ &= e^{-\\lambda} \\lambda \\sum_{k=1}^\\infty \\frac{\\lambda^{k-1}}{(k - 1)!} & \\\\ &= e^{-\\lambda} \\lambda \\sum_{k=0}^\\infty \\frac{\\lambda^{k}}{k!} & \\\\ &= e^{-\\lambda} \\lambda e^\\lambda & \\\\ &= \\lambda. \\end{align*}\\] \\[\\begin{align*} Var[X] &= E[X^2] - E[X]^2 & \\\\ &= e^{-\\lambda} \\lambda \\sum_{k=1}^\\infty k \\frac{\\lambda^{k-1}}{(k - 1)!} - \\lambda^2 & \\\\ &= e^{-\\lambda} \\lambda \\sum_{k=1}^\\infty (k - 1) + 1) \\frac{\\lambda^{k-1}}{(k - 1)!} - \\lambda^2 & \\\\ &= e^{-\\lambda} \\lambda \\big(\\sum_{k=1}^\\infty (k - 1) \\frac{\\lambda^{k-1}}{(k - 1)!} + \\sum_{k=1}^\\infty \\frac{\\lambda^{k-1}}{(k - 1)!}\\Big) - \\lambda^2 & \\\\ &= e^{-\\lambda} \\lambda \\big(\\lambda\\sum_{k=2}^\\infty \\frac{\\lambda^{k-2}}{(k - 2)!} + e^\\lambda\\Big) - \\lambda^2 & \\\\ &= e^{-\\lambda} \\lambda \\big(\\lambda e^\\lambda + e^\\lambda\\Big) - \\lambda^2 & \\\\ &= \\lambda^2 + \\lambda - \\lambda^2 & \\\\ &= \\lambda. \\end{align*}\\] Exercise 7.4 (Geometric) Let \\(X \\sim \\text{Geometric}(p)\\). Find \\(E[X]\\). Hint: \\(\\frac{d}{dx} x^k = k x^{(k - 1)}\\). Solution. 
\\[\\begin{align*} E[X] &= \\sum_{k=0}^\\infty (1 - p)^k p k & \\\\ &= p (1 - p) \\sum_{k=0}^\\infty (1 - p)^{k-1} k & \\\\ &= p (1 - p) \\sum_{k=0}^\\infty -\\frac{d}{dp}(1 - p)^k & \\\\ &= p (1 - p) \\Big(-\\frac{d}{dp}\\Big) \\sum_{k=0}^\\infty (1 - p)^k & \\\\ &= p (1 - p) \\Big(-\\frac{d}{dp}\\Big) \\frac{1}{1 - (1 - p)} & \\text{geometric series} \\\\ &= \\frac{1 - p}{p} \\end{align*}\\] 7.2 Continuous random variables Exercise 7.5 (Gamma) Let \\(X \\sim \\text{Gamma}(\\alpha, \\beta)\\). Hint: \\(\\Gamma(z) = \\int_0^\\infty t^{z-1}e^{-t} dt\\) and \\(\\Gamma(z + 1) = z \\Gamma(z)\\). Find \\(E[X]\\). Find \\(Var[X]\\). R: Let \\(\\alpha = 10\\) and \\(\\beta = 2\\). Plot the density of \\(X\\). Add a horizontal line at the expected value that touches the density curve (geom_segment). Shade the area within a standard deviation of the expected value. Solution. \\[\\begin{align*} E[X] &= \\int_0^\\infty \\frac{\\beta^\\alpha}{\\Gamma(\\alpha)}x^\\alpha e^{-\\beta x} dx & \\\\ &= \\frac{\\beta^\\alpha}{\\Gamma(\\alpha)} \\int_0^\\infty x^\\alpha e^{-\\beta x} dx & \\text{ (let $t = \\beta x$)} \\\\ &= \\frac{\\beta^\\alpha}{\\Gamma(\\alpha) }\\int_0^\\infty \\frac{t^\\alpha}{\\beta^\\alpha} e^{-t} \\frac{dt}{\\beta} & \\\\ &= \\frac{1}{\\beta \\Gamma(\\alpha) }\\int_0^\\infty t^\\alpha e^{-t} dt & \\\\ &= \\frac{\\Gamma(\\alpha + 1)}{\\beta \\Gamma(\\alpha)} & \\\\ &= \\frac{\\alpha \\Gamma(\\alpha)}{\\beta \\Gamma(\\alpha)} & \\\\ &= \\frac{\\alpha}{\\beta}. & \\end{align*}\\] \\[\\begin{align*} Var[X] &= E[X^2] - E[X]^2 \\\\ &= \\int_0^\\infty \\frac{\\beta^\\alpha}{\\Gamma(\\alpha)}x^{\\alpha+1} e^{-\\beta x} dx - \\frac{\\alpha^2}{\\beta^2} \\\\ &= \\frac{\\Gamma(\\alpha + 2)}{\\beta^2 \\Gamma(\\alpha)} - \\frac{\\alpha^2}{\\beta^2} \\\\ &= \\frac{(\\alpha + 1)\\alpha\\Gamma(\\alpha)}{\\beta^2 \\Gamma(\\alpha)} - \\frac{\\alpha^2}{\\beta^2} \\\\ &= \\frac{\\alpha^2 + \\alpha}{\\beta^2} - \\frac{\\alpha^2}{\\beta^2} \\\\ &= \\frac{\\alpha}{\\beta^2}. \\end{align*}\\] set.seed(1) x <- seq(0, 25, by = 0.01) y <- dgamma(x, shape = 10, rate = 2) df <- data.frame(x = x, y = y) ggplot(df, aes(x = x, y = y)) + geom_line() + geom_segment(aes(x = 5, y = 0, xend = 5, yend = dgamma(5, shape = 10, rate = 2)), color = "red") + stat_function(fun = dgamma, args = list(shape = 10, rate = 2), xlim = c(5 - sqrt(10/4), 5 + sqrt(10/4)), geom = "area", fill = "gray", alpha = 0.4) Exercise 7.6 (Beta) Let \\(X \\sim \\text{Beta}(\\alpha, \\beta)\\). Find \\(E[X]\\). Hint 1: \\(\\text{B}(x,y) = \\int_0^1 t^{x-1} (1 - t)^{y-1} dt\\). Hint 2: \\(\\text{B}(x + 1, y) = \\text{B}(x,y)\\frac{x}{x + y}\\). Find \\(Var[X]\\). Solution. \\[\\begin{align*} E[X] &= \\int_0^1 \\frac{x^{\\alpha - 1} (1 - x)^{\\beta - 1}}{\\text{B}(\\alpha, \\beta)} x dx \\\\ &= \\frac{1}{\\text{B}(\\alpha, \\beta)}\\int_0^1 x^{\\alpha} (1 - x)^{\\beta - 1} dx \\\\ &= \\frac{1}{\\text{B}(\\alpha, \\beta)} \\text{B}(\\alpha + 1, \\beta) \\\\ &= \\frac{1}{\\text{B}(\\alpha, \\beta)} \\text{B}(\\alpha, \\beta) \\frac{\\alpha}{\\alpha + \\beta} \\\\ &= \\frac{\\alpha}{\\alpha + \\beta}. 
\\\\ \\end{align*}\\] \\[\\begin{align*} Var[X] &= E[X^2] - E[X]^2 \\\\ &= \\int_0^1 \\frac{x^{\\alpha - 1} (1 - x)^{\\beta - 1}}{\\text{B}(\\alpha, \\beta)} x^2 dx - \\frac{\\alpha^2}{(\\alpha + \\beta)^2} \\\\ &= \\frac{1}{\\text{B}(\\alpha, \\beta)}\\int_0^1 x^{\\alpha + 1} (1 - x)^{\\beta - 1} dx - \\frac{\\alpha^2}{(\\alpha + \\beta)^2} \\\\ &= \\frac{1}{\\text{B}(\\alpha, \\beta)} \\text{B}(\\alpha + 2, \\beta) - \\frac{\\alpha^2}{(\\alpha + \\beta)^2} \\\\ &= \\frac{1}{\\text{B}(\\alpha, \\beta)} \\text{B}(\\alpha + 1, \\beta) \\frac{\\alpha + 1}{\\alpha + \\beta + 1} - \\frac{\\alpha^2}{(\\alpha + \\beta)^2} \\\\ &= \\frac{\\alpha + 1}{\\alpha + \\beta + 1} \\frac{\\alpha}{\\alpha + \\beta} - \\frac{\\alpha^2}{(\\alpha + \\beta)^2}\\\\ &= \\frac{\\alpha \\beta}{(\\alpha + \\beta)^2(\\alpha + \\beta + 1)}. \\end{align*}\\] Exercise 7.7 (Exponential) Let \\(X \\sim \\text{Exp}(\\lambda)\\). Find \\(E[X]\\). Hint: \\(\\Gamma(z + 1) = z\\Gamma(z)\\) and \\(\\Gamma(1) = 1\\). Find \\(Var[X]\\). Solution. \\[\\begin{align*} E[X] &= \\int_0^\\infty \\lambda e^{-\\lambda x} x dx & \\\\ &= \\lambda \\int_0^\\infty x e^{-\\lambda x} dx & \\\\ &= \\lambda \\int_0^\\infty \\frac{t}{\\lambda} e^{-t} \\frac{dt}{\\lambda} & \\text{$t = \\lambda x$}\\\\ &= \\lambda \\lambda^{-2} \\Gamma(2) & \\text{definition of gamma function} \\\\ &= \\lambda^{-1}. \\end{align*}\\] \\[\\begin{align*} Var[X] &= E[X^2] - E[X]^2 & \\\\ &= \\int_0^\\infty \\lambda e^{-\\lambda x} x^2 dx - \\lambda^{-2} & \\\\ &= \\lambda \\int_0^\\infty \\frac{t^2}{\\lambda^2} e^{-t} \\frac{dt}{\\lambda} - \\lambda^{-2} & \\text{$t = \\lambda x$} \\\\ &= \\lambda \\lambda^{-3} \\Gamma(3) - \\lambda^{-2} & \\text{definition of gamma function} & \\\\ &= \\lambda^{-2} 2 \\Gamma(2) - \\lambda^{-2} & \\\\ &= 2 \\lambda^{-2} - \\lambda^{-2} & \\\\ &= \\lambda^{-2}. & \\\\ \\end{align*}\\] Exercise 7.8 (Normal) Let \\(X \\sim \\text{N}(\\mu, \\sigma)\\). Show that \\(E[X] = \\mu\\). Hint: Use the error function \\(\\text{erf}(x) = \\frac{1}{\\sqrt(\\pi)} \\int_{-x}^x e^{-t^2} dt\\). The statistical interpretation of this function is that if \\(Y \\sim \\text{N}(0, 0.5)\\), then the error function describes the probability of \\(Y\\) falling between \\(-x\\) and \\(x\\). Also, \\(\\text{erf}(\\infty) = 1\\). Show that \\(Var[X] = \\sigma^2\\). Hint: Start with the definition of variance. Solution. \\[\\begin{align*} E[X] &= \\int_{-\\infty}^\\infty \\frac{1}{\\sqrt{2\\pi \\sigma^2}} e^{-\\frac{(x - \\mu)^2}{2\\sigma^2}} x dx & \\\\ &= \\frac{1}{\\sqrt{2\\pi \\sigma^2}} \\int_{-\\infty}^\\infty x e^{-\\frac{(x - \\mu)^2}{2\\sigma^2}} dx & \\\\ &= \\frac{1}{\\sqrt{2\\pi \\sigma^2}} \\int_{-\\infty}^\\infty \\Big(t \\sqrt{2\\sigma^2} + \\mu\\Big)e^{-t^2} \\sqrt{2 \\sigma^2} dt & t = \\frac{x - \\mu}{\\sqrt{2}\\sigma} \\\\ &= \\frac{\\sqrt{2\\sigma^2}}{\\sqrt{\\pi}} \\int_{-\\infty}^\\infty t e^{-t^2} dt + \\frac{1}{\\sqrt{\\pi}} \\int_{-\\infty}^\\infty \\mu e^{-t^2} dt & \\\\ \\end{align*}\\] Let us calculate these integrals separately. \\[\\begin{align*} \\int t e^{-t^2} dt &= -\\frac{1}{2}\\int e^{s} ds & s = -t^2 \\\\ &= -\\frac{e^s}{2} + C \\\\ &= -\\frac{e^{-t^2}}{2} + C & \\text{undoing substitution}. \\end{align*}\\] Inserting the integration limits we get \\[\\begin{align*} \\int_{-\\infty}^\\infty t e^{-t^2} dt &= 0, \\end{align*}\\] due to the integrated function being symmetric. 
Reordering the second integral we get \\[\\begin{align*} \\mu \\frac{1}{\\sqrt{\\pi}} \\int_{-\\infty}^\\infty e^{-t^2} dt &= \\mu \\text{erf}(\\infty) & \\text{definition of error function} \\\\ &= \\mu & \\text{probability of $Y$ falling between $-\\infty$ and $\\infty$}. \\end{align*}\\] Combining all of the above we get \\[\\begin{align*} E[X] &= \\frac{\\sqrt{2\\sigma^2}}{\\sqrt{\\pi}} \\times 0 + \\mu &= \\mu.\\\\ \\end{align*}\\] \\[\\begin{align*} Var[X] &= E[(X - E[X])^2] \\\\ &= E[(X - \\mu)^2] \\\\ &= \\frac{1}{\\sqrt{2\\pi \\sigma^2}} \\int_{-\\infty}^\\infty (x - \\mu)^2 e^{-\\frac{(x - \\mu)^2}{2\\sigma^2}} dx \\\\ &= \\frac{\\sigma^2}{\\sqrt{2\\pi}} \\int_{-\\infty}^\\infty t^2 e^{-\\frac{t^2}{2}} dt \\\\ &= \\frac{\\sigma^2}{\\sqrt{2\\pi}} \\bigg(\\Big(- t e^{-\\frac{t^2}{2}} |_{-\\infty}^\\infty \\Big) + \\int_{-\\infty}^\\infty e^{-\\frac{t^2}{2}} \\bigg) dt & \\text{integration by parts} \\\\ &= \\frac{\\sigma^2}{\\sqrt{2\\pi}} \\sqrt{2 \\pi} \\int_{-\\infty}^\\infty \\frac{1}{\\sqrt(\\pi)}e^{-s^2} \\bigg) & s = \\frac{t}{\\sqrt{2}} \\text{ and evaluating the left expression at the bounds} \\\\ &= \\frac{\\sigma^2}{\\sqrt{2\\pi}} \\sqrt{2 \\pi} \\Big(\\text{erf}(\\infty) & \\text{definition of error function} \\\\ &= \\sigma^2. \\end{align*}\\] 7.3 Sums, functions, conditional expectations Exercise 7.9 (Expectation of transformations) Let \\(X\\) follow a normal distribution with mean \\(\\mu\\) and variance \\(\\sigma^2\\). Find \\(E[2X + 4]\\). Find \\(E[X^2]\\). Find \\(E[\\exp(X)]\\). Hint: Use the error function \\(\\text{erf}(x) = \\frac{1}{\\sqrt(\\pi)} \\int_{-x}^x e^{-t^2} dt\\). Also, \\(\\text{erf}(\\infty) = 1\\). R: Check your results numerically for \\(\\mu = 0.4\\) and \\(\\sigma^2 = 0.25\\) and plot the densities of all four distributions. Solution. \\[\\begin{align} E[2X + 4] &= 2E[X] + 4 & \\text{linearity of expectation} \\\\ &= 2\\mu + 4. \\\\ \\end{align}\\] \\[\\begin{align} E[X^2] &= E[X]^2 + Var[X] & \\text{definition of variance} \\\\ &= \\mu^2 + \\sigma^2. \\end{align}\\] \\[\\begin{align} E[\\exp(X)] &= \\int_{-\\infty}^\\infty \\frac{1}{\\sqrt{2\\pi \\sigma^2}} e^{-\\frac{(x - \\mu)^2}{2\\sigma^2}} e^x dx & \\\\ &= \\frac{1}{\\sqrt{2\\pi \\sigma^2}} \\int_{-\\infty}^\\infty e^{\\frac{2 \\sigma^2 x}{2\\sigma^2} -\\frac{(x - \\mu)^2}{2\\sigma^2}} dx & \\\\ &= \\frac{1}{\\sqrt{2\\pi \\sigma^2}} \\int_{-\\infty}^\\infty e^{-\\frac{x^2 - 2x(\\mu + \\sigma^2) + \\mu^2}{2\\sigma^2}} dx & \\\\ &= \\frac{1}{\\sqrt{2\\pi \\sigma^2}} \\int_{-\\infty}^\\infty e^{-\\frac{(x - (\\mu + \\sigma^2))^2 + \\mu^2 - (\\mu + \\sigma^2)^2}{2\\sigma^2}} dx & \\text{complete the square} \\\\ &= \\frac{1}{\\sqrt{2\\pi \\sigma^2}} e^{\\frac{- \\mu^2 + (\\mu + \\sigma^2)^2}{2\\sigma^2}} \\int_{-\\infty}^\\infty e^{-\\frac{(x - (\\mu + \\sigma^2))^2}{2\\sigma^2}} dx & \\\\ &= \\frac{1}{\\sqrt{2\\pi \\sigma^2}} e^{\\frac{- \\mu^2 + (\\mu + \\sigma^2)^2}{2\\sigma^2}} \\sigma \\sqrt{2 \\pi} \\text{erf}(\\infty) & \\\\ &= e^{\\frac{2\\mu + \\sigma^2}{2}}. \\end{align}\\] set.seed(1) mu <- 0.4 sigma <- 0.5 x <- rnorm(100000, mean = mu, sd = sigma) mean(2*x + 4) ## [1] 4.797756 2 * mu + 4 ## [1] 4.8 mean(x^2) ## [1] 0.4108658 mu^2 + sigma^2 ## [1] 0.41 mean(exp(x)) ## [1] 1.689794 exp((2 * mu + sigma^2) / 2) ## [1] 1.690459 Exercise 7.10 (Sum of independent random variables) Borrowed from Wasserman. Let \\(X_1, X_2,...,X_n\\) be IID random variables with expected value \\(E[X_i] = \\mu\\) and variance \\(Var[X_i] = \\sigma^2\\). 
Find the expected value and variance of \\(\\bar{X} = \\frac{1}{n} \\sum_{i=1}^n X_i\\). \\(\\bar{X}\\) is called a statistic (a function of the values in a sample). It is itself a random variable and its distribution is called a sampling distribution. R: Take \\(n = 5, 10, 100, 1000\\) samples from the N(\\(2\\), \\(6\\)) distribution 10000 times. Plot the theoretical density and the densities of \\(\\bar{X}\\) statistic for each \\(n\\). Intuitively, are the results in correspondence with your calculations? Check them numerically. Solution. Let us start with the expectation of \\(\\bar{X}\\). \\[\\begin{align} E[\\bar{X}] &= E[\\frac{1}{n} \\sum_{i=1}^n X_i] & \\\\ &= \\frac{1}{n} E[\\sum_{i=1}^n X_i] & \\text{ (multiplication with a scalar)} \\\\ &= \\frac{1}{n} \\sum_{i=1}^n E[X_i] & \\text{ (linearity)} \\\\ &= \\frac{1}{n} n \\mu & \\\\ &= \\mu. \\end{align}\\] Now the variance \\[\\begin{align} Var[\\bar{X}] &= Var[\\frac{1}{n} \\sum_{i=1}^n X_i] & \\\\ &= \\frac{1}{n^2} Var[\\sum_{i=1}^n X_i] & \\text{ (multiplication with a scalar)} \\\\ &= \\frac{1}{n^2} \\sum_{i=1}^n Var[X_i] & \\text{ (independence of samples)} \\\\ &= \\frac{1}{n^2} n \\sigma^2 & \\\\ &= \\frac{1}{n} \\sigma^2. \\end{align}\\] set.seed(1) nsamps <- 10000 mu <- 2 sigma <- sqrt(6) N <- c(5, 10, 100, 500) X <- matrix(data = NA, nrow = nsamps, ncol = length(N)) ind <- 1 for (n in N) { for (i in 1:nsamps) { X[i,ind] <- mean(rnorm(n, mu, sigma)) } ind <- ind + 1 } colnames(X) <- N X <- melt(as.data.frame(X)) ggplot(data = X, aes(x = value, colour = variable)) + geom_density() + stat_function(data = data.frame(x = seq(-2, 6, by = 0.01)), aes(x = x), fun = dnorm, args = list(mean = mu, sd = sigma), color = "black") Exercise 7.11 (Conditional expectation) Let \\(X \\in \\mathbb{R}_0^+\\) and \\(Y \\in \\mathbb{N}_0\\) be random variables with joint distribution \\(p_{XY}(X,Y) = \\frac{1}{y + 1} e^{-\\frac{x}{y + 1}} 0.5^{y + 1}\\). Find \\(E[X | Y = y]\\) by first finding \\(p_Y\\) and then \\(p_{X|Y}\\). Find \\(E[X]\\). R: check your answers to a) and b) by drawing 10000 samples from \\(p_Y\\) and \\(p_{X|Y}\\). Solution. \\[\\begin{align} p(y) &= \\int_0^\\infty \\frac{1}{y + 1} e^{-\\frac{x}{y + 1}} 0.5^{y + 1} dx \\\\ &= \\frac{0.5^{y + 1}}{y + 1} \\int_0^\\infty e^{-\\frac{x}{y + 1}} dx \\\\ &= \\frac{0.5^{y + 1}}{y + 1} (y + 1) \\\\ &= 0.5^{y + 1} \\\\ &= 0.5(1 - 0.5)^y. \\end{align}\\] We recognize this as the geometric distribution. \\[\\begin{align} p(x|y) &= \\frac{p(x,y)}{p(y)} \\\\ &= \\frac{1}{y + 1} e^{-\\frac{x}{y + 1}}. \\end{align}\\] We recognize this as the exponential distribution. \\[\\begin{align} E[X | Y = y] &= \\int_0^\\infty x \\frac{1}{y + 1} e^{-\\frac{x}{y + 1}} dx \\\\ &= y + 1 & \\text{expected value of the exponential distribution} \\end{align}\\] Use the law of iterated expectation. \\[\\begin{align} E[X] &= E[E[X | Y]] \\\\ &= E[Y + 1] \\\\ &= E[Y] + 1 \\\\ &= \\frac{1 - 0.5}{0.5} + 1 \\\\ &= 2. \\end{align}\\] set.seed(1) y <- rgeom(100000, 0.5) x <- rexp(100000, rate = 1 / (y + 1)) x2 <- x[y == 3] mean(x2) ## [1] 4.048501 3 + 1 ## [1] 4 mean(x) ## [1] 2.007639 (1 - 0.5) / 0.5 + 1 ## [1] 2 Exercise 7.12 (Cauchy distribution) Let \\(p(x | x_0, \\gamma) = \\frac{1}{\\pi \\gamma \\Big(1 + \\big(\\frac{x - x_0}{\\gamma}\\big)^2\\Big)}\\). A random variable with this PDF follows a Cauchy distribution. This distribution is symmetric and has wider tails than the normal distribution. R: Draw \\(n = 1,...,1000\\) samples from a standard normal and \\(\\text{Cauchy}(0, 1)\\). 
For each \\(n\\) plot the mean and the median of the sample using facets. Interpret the results. To get a mathematical explanation of the results in a), evaluate the integral \\(\\int_0^\\infty \\frac{x}{1 + x^2} dx\\) and consider that \\(E[X] = \\int_{-\\infty}^\\infty \\frac{x}{1 + x^2}dx\\). set.seed(1) n <- 1000 means_n <- vector(mode = "numeric", length = n) means_c <- vector(mode = "numeric", length = n) medians_n <- vector(mode = "numeric", length = n) medians_c <- vector(mode = "numeric", length = n) for (i in 1:n) { tmp_n <- rnorm(i) tmp_c <- rcauchy(i) means_n[i] <- mean(tmp_n) means_c[i] <- mean(tmp_c) medians_n[i] <- median(tmp_n) medians_c[i] <- median(tmp_c) } df <- data.frame("distribution" = c(rep("normal", 2 * n), rep("Cauchy", 2 * n)), "type" = c(rep("mean", n), rep("median", n), rep("mean", n), rep("median", n)), "value" = c(means_n, medians_n, means_c, medians_c), "n" = rep(1:n, times = 4)) ggplot(df, aes(x = n, y = value)) + geom_line(alpha = 0.5) + facet_wrap(~ type + distribution , scales = "free") Solution. \\[\\begin{align} \\int_0^\\infty \\frac{x}{1 + x^2} dx &= \\frac{1}{2} \\int_1^\\infty \\frac{1}{u} du & u = 1 + x^2 \\\\ &= \\frac{1}{2} \\ln(x) |_0^\\infty. \\end{align}\\] This integral is not finite. The same holds for the negative part. Therefore, the expectation is undefined, as \\(E[|X|] = \\infty\\). Why can we not just claim that \\(f(x) = x / (1 + x^2)\\) is odd and \\(\\int_{-\\infty}^\\infty f(x) = 0\\)? By definition of the Lebesgue integral \\(\\int_{-\\infty}^{\\infty} f= \\int_{-\\infty}^{\\infty} f_+-\\int_{-\\infty}^{\\infty} f_-\\). At least one of the two integrals needs to be finite for \\(\\int_{-\\infty}^{\\infty} f\\) to be well-defined. However \\(\\int_{-\\infty}^{\\infty} f_+=\\int_0^{\\infty} x/(1+x^2)\\) and \\(\\int_{-\\infty}^{\\infty} f_-=\\int_{-\\infty}^{0} |x|/(1+x^2)\\). We have just shown that both of these integrals are infinite, which implies that their sum is also infinite. 7.4 Covariance Exercise 7.13 Below is a table of values for random variables \\(X\\) and \\(Y\\). X Y 2.1 8 -0.5 11 1 10 -2 12 4 9 Find sample covariance of \\(X\\) and \\(Y\\). Find sample variances of \\(X\\) and \\(Y\\). Find sample correlation of \\(X\\) and \\(Y\\). Find sample variance of \\(Z = 2X - 3Y\\). Solution. \\(\\bar{X} = 0.92\\) and \\(\\bar{Y} = 10\\). \\[\\begin{align} s(X, Y) &= \\frac{1}{n - 1} \\sum_{i=1}^5 (X_i - 0.92) (Y_i - 10) \\\\ &= -3.175. \\end{align}\\] \\[\\begin{align} s(X) &= \\frac{\\sum_{i=1}^5(X_i - 0.92)^2}{5 - 1} \\\\ &= 5.357. \\end{align}\\] \\[\\begin{align} s(Y) &= \\frac{\\sum_{i=1}^5(Y_i - 10)^2}{5 - 1} \\\\ &= 2.5. \\end{align}\\] \\[\\begin{align} r(X,Y) &= \\frac{Cov(X,Y)}{\\sqrt{Var[X]Var[Y]}} \\\\ &= \\frac{-3.175}{\\sqrt{5.357 \\times 2.5}} \\\\ &= -8.68. \\end{align}\\] \\[\\begin{align} s(Z) &= 2^2 s(X) + 3^2 s(Y) + 2 \\times 2 \\times 3 s(X, Y) \\\\ &= 4 \\times 5.357 + 9 \\times 2.5 + 12 \\times 3.175 \\\\ &= 82.028. \\end{align}\\] Exercise 7.14 Let \\(X \\sim \\text{Uniform}(0,1)\\) and \\(Y | X = x \\sim \\text{Uniform(0,x)}\\). Find the covariance of \\(X\\) and \\(Y\\). Find the correlation of \\(X\\) and \\(Y\\). R: check your answers to a) and b) with simulation. Plot \\(X\\) against \\(Y\\) on a scatterplot. Solution. The joint PDF is \\(p(x,y) = p(x)p(y|x) = \\frac{1}{x}\\). 
\\[\\begin{align} Cov(X,Y) &= E[XY] - E[X]E[Y] \\\\ \\end{align}\\] Let us first evaluate the first term: \\[\\begin{align} E[XY] &= \\int_0^1 \\int_0^x x y \\frac{1}{x} dy dx \\\\ &= \\int_0^1 \\int_0^x y dy dx \\\\ &= \\int_0^1 \\frac{x^2}{2} dx \\\\ &= \\frac{1}{6}. \\end{align}\\] Now let us find \\(E[Y]\\), \\(E[X]\\) is trivial. \\[\\begin{align} E[Y] = E[E[Y | X]] = E[\\frac{X}{2}] = \\frac{1}{2} \\int_0^1 x dx = \\frac{1}{4}. \\end{align}\\] Combining all: \\[\\begin{align} Cov(X,Y) &= \\frac{1}{6} - \\frac{1}{2} \\frac{1}{4} = \\frac{1}{24}. \\end{align}\\] \\[\\begin{align} \\rho(X,Y) &= \\frac{Cov(X,Y)}{\\sqrt{Var[X]Var[Y]}} \\\\ \\end{align}\\] Let us calculate \\(Var[X]\\). \\[\\begin{align} Var[X] &= E[X^2] - \\frac{1}{4} \\\\ &= \\int_0^1 x^2 - \\frac{1}{4} \\\\ &= \\frac{1}{3} - \\frac{1}{4} \\\\ &= \\frac{1}{12}. \\end{align}\\] Let us calculate \\(E[E[Y^2|X]]\\). \\[\\begin{align} E[E[Y^2|X]] &= E[\\frac{x^2}{3}] \\\\ &= \\frac{1}{9}. \\end{align}\\] Then \\(Var[Y] = \\frac{1}{9} - \\frac{1}{16} = \\frac{7}{144}\\). Combining all \\[\\begin{align} \\rho(X,Y) &= \\frac{\\frac{1}{24}}{\\sqrt{\\frac{1}{12}\\frac{7}{144}}} \\\\ &= 0.65. \\end{align}\\] set.seed(1) nsamps <- 10000 x <- runif(nsamps) y <- runif(nsamps, 0, x) cov(x, y) ## [1] 0.04274061 1/24 ## [1] 0.04166667 cor(x, y) ## [1] 0.6629567 (1 / 24) / (sqrt(7 / (12 * 144))) ## [1] 0.6546537 ggplot(data.frame(x = x, y = y), aes(x = x, y = y)) + geom_point(alpha = 0.2) + geom_smooth(method = "lm") "],["mrv.html", "Chapter 8 Multivariate random variables 8.1 Multinomial random variables 8.2 Multivariate normal random variables 8.3 Transformations", " Chapter 8 Multivariate random variables This chapter deals with multivariate random variables. The students are expected to acquire the following knowledge: Theoretical Multinomial distribution. Multivariate normal distribution. Cholesky decomposition. Eigendecomposition. R Sampling from the multinomial distribution. Sampling from the multivariate normal distribution. Matrix decompositions. .fold-btn { float: right; margin: 5px 5px 0 0; } .fold { border: 1px solid black; min-height: 40px; } 8.1 Multinomial random variables Exercise 8.1 Let \\(X_i\\), \\(i = 1,...,k\\) represent \\(k\\) events, and \\(p_i\\) the probabilities of these events happening in a trial. Let \\(n\\) be the number of trials, and \\(X\\) a multivariate random variable, the collection of \\(X_i\\). Then \\(p(x) = \\frac{n!}{x_1!x_2!...x_k!} p_1^{x_1} p_2^{x_2}...p_k^{x_k}\\) is the PMF of a multinomial distribution, where \\(n = \\sum_{i = 1}^k x_i\\). Show that the marginal distribution of \\(X_i\\) is a binomial distribution. Take 1000 samples from the multinomial distribution with \\(n=4\\) and probabilities \\(p = (0.2, 0.2, 0.5, 0.1)\\). Then take 1000 samples from four binomial distributions with the same parameters. Inspect the results visually. Solution. We will approach this proof from the probabilistic point of view. W.L.O.G. let \\(x_1\\) be the marginal distribution we are interested in. The term \\(p^{x_1}\\) denotes the probability that event 1 happened \\(x_1\\) times. For this event not to happen, one of the other events needs to happen. So for each of the remaining trials, the probability of another event is \\(\\sum_{i=2}^k p_i = 1 - p_1\\), and there were \\(n - x_1\\) such trials. What is left to do is to calculate the number of permutations of event 1 happening and event 1 not happening. We choose \\(x_1\\) trials, from \\(n\\) trials. 
Therefore \\(p(x_1) = \\binom{n}{x_1} p_1^{x_1} (1 - p_1)^{n - x_1}\\), which is the binomial PMF. Interested students are encouraged to prove this mathematically. set.seed(1) nsamps <- 1000 samps_mult <- rmultinom(nsamps, 4, prob = c(0.2, 0.2, 0.5, 0.1)) samps_mult <- as_tibble(t(samps_mult)) %>% gather() samps <- tibble( V1 = rbinom(nsamps, 4, 0.2), V2 = rbinom(nsamps, 4, 0.2), V3 = rbinom(nsamps, 4, 0.5), V4 = rbinom(nsamps, 4, 0.1) ) %>% gather() %>% bind_rows(samps_mult) %>% bind_cols("dist" = c(rep("binomial", 4*nsamps), rep("multinomial", 4*nsamps))) ggplot(samps, aes(x = value, fill = dist)) + geom_bar(position = "dodge") + facet_wrap(~ key) Exercise 8.2 (Multinomial expected value) Find the expected value, variance and covariance of the multinomial distribution. Hint: First find the expected value for \\(n = 1\\) and then use the fact that the trials are independent. Solution. Let us first calculate the expected value of \\(X_1\\), when \\(n = 1\\). \\[\\begin{align} E[X_1] &= \\sum_{n_1 = 0}^1 \\sum_{n_2 = 0}^1 ... \\sum_{n_k = 0}^1 \\frac{1}{n_1!n_2!...n_k!}p_1^{n_1}p_2^{n_2}...p_k^{n_k}n_1 \\\\ &= \\sum_{n_1 = 0}^1 \\frac{p_1^{n_1} n_1}{n_1!} \\sum_{n_2 = 0}^1 ... \\sum_{n_k = 0}^1 \\frac{1}{n_2!...n_k!}p_2^{n_2}...p_k^{n_k} \\end{align}\\] When \\(n_1 = 0\\) then the whole terms is zero, so we do not need to evaluate other sums. When \\(n_1 = 1\\), all other \\(n_i\\) must be zero, as we have \\(1 = \\sum_{i=1}^k n_i\\). Therefore the other sums equal \\(1\\). So \\(E[X_1] = p_1\\) and \\(E[X_i] = p_i\\) for \\(i = 1,...,k\\). Now let \\(Y_j\\), \\(j = 1,...,n\\), have a multinomial distribution with \\(n = 1\\), and let \\(X\\) have a multinomial distribution with an arbitrary \\(n\\). Then we can write \\(X = \\sum_{j=1}^n Y_j\\). And due to independence \\[\\begin{align} E[X] &= E[\\sum_{j=1}^n Y_j] \\\\ &= \\sum_{j=1}^n E[Y_j] \\\\ &= np. \\end{align}\\] For the variance, we need \\(E[X^2]\\). Let us follow the same procedure as above and first calculate \\(E[X_i]\\) for \\(n = 1\\). The only thing that changes is that the term \\(n_i\\) becomes \\(n_i^2\\). Since we only have \\(0\\) and \\(1\\) this does not change the outcome. So \\[\\begin{align} Var[X_i] &= E[X_i^2] - E[X_i]^2\\\\ &= p_i(1 - p_i). \\end{align}\\] Analogous to above for arbitrary \\(n\\) \\[\\begin{align} Var[X] &= E[X^2] - E[X]^2 \\\\ &= \\sum_{j=1}^n E[Y_j^2] - \\sum_{j=1}^n E[Y_j]^2 \\\\ &= \\sum_{j=1}^n E[Y_j^2] - E[Y_j]^2 \\\\ &= \\sum_{j=1}^n p(1-p) \\\\ &= np(1-p). \\end{align}\\] To calculate the covariance, we need \\(E[X_i X_j]\\). Again, let us start with \\(n = 1\\). Without loss of generality, let us assume \\(i = 1\\) and \\(j = 2\\). \\[\\begin{align} E[X_1 X_2] = \\sum_{n_1 = 0}^1 \\sum_{n_2 = 0}^1 \\frac{p_1^{n_1} n_1}{n_1!} \\frac{p_2^{n_2} n_2}{n_2!} \\sum_{n_3 = 0}^1 ... \\sum_{n_k = 0}^1 \\frac{1}{n_3!...n_k!}p_3^{n_3}...p_k^{n_k}. \\end{align}\\] In the above expression, at each iteration we multiply with \\(n_1\\) and \\(n_2\\). Since \\(n = 1\\), one of these always has to be zero. Therefore \\(E[X_1 X_2] = 0\\) and \\[\\begin{align} Cov(X_i, X_j) &= E[X_i X_j] - E[X_i]E[X_j] \\\\ &= - p_i p_j. \\end{align}\\] For arbitrary \\(n\\), let \\(X = \\sum_{t = 1}^n Y_t\\) be the sum of independent multinomial random variables \\(Y_t = [X_{1t}, X_{2t},...,X_{kt}]^T\\) with \\(n=1\\). Then \\(X_1 = \\sum_{t = 1}^n X_{1t}\\) and \\(X_2 = \\sum_{l = 1}^n X_{2l}\\). 
\\[\\begin{align} Cov(X_1, X_2) &= E[X_1 X_2] - E[X_1] E[X_2] \\\\ &= E[\\sum_{t = 1}^n X_{1t} \\sum_{l = 1}^n X_{2l}] - n^2 p_1 p_2 \\\\ &= \\sum_{t = 1}^n \\sum_{l = 1}^n E[X_{1t} X_{2l}] - n^2 p_1 p_2. \\end{align}\\] For \\(X_{1t}\\) and \\(X_{2l}\\) the expected value is zero when \\(t = l\\). When \\(t \\neq l\\) then they are independent, so the expected value is the product \\(p_1 p_2\\). There are \\(n^2\\) total terms, and for \\(n\\) of them \\(t = l\\) holds. So \\(E[X_1 X_2] = (n^2 - n) p_1 p_2\\). Inserting into the above \\[\\begin{align} Cov(X_1, X_2) &= (n^2 - n) p_1 p_2 - n^2 p_1 p_2 \\\\ &= - n p_1 p_2. \\end{align}\\] 8.2 Multivariate normal random variables Exercise 8.3 (Cholesky decomposition) Let \\(X\\) be a random vector of length \\(k\\) with \\(X_i \\sim \\text{N}(0, 1)\\) and \\(LL^*\\) the Cholesky decomposition of a Hermitian positive-definite matrix \\(A\\). Let \\(\\mu\\) be a vector of length \\(k\\). Find the distribution of the random vector \\(Y = \\mu + L X\\). Find the Cholesky decomposition of \\(A = \\begin{bmatrix} 2 & 1.2 \\\\ 1.2 & 1 \\end{bmatrix}\\). R: Use the results from a) and b) to sample from the MVN distribution \\(\\text{N}(\\mu, A)\\), where \\(\\mu = [1.5, -1]^T\\). Plot a scatterplot and compare it to direct samples from the multivariate normal distribution (rmvnorm). Solution. \\(X\\) has an independent normal distribution of dimension \\(k\\). Then \\[\\begin{align} Y = \\mu + L X &\\sim \\text{N}(\\mu, LL^T) \\\\ &\\sim \\text{N}(\\mu, A). \\end{align}\\] Solve \\[\\begin{align} \\begin{bmatrix} a & 0 \\\\ b & c \\end{bmatrix} \\begin{bmatrix} a & b \\\\ 0 & c \\end{bmatrix} = \\begin{bmatrix} 2 & 1.2 \\\\ 1.2 & 1 \\end{bmatrix} \\end{align}\\] # a set.seed(1) nsamps <- 1000 X <- matrix(data = rnorm(nsamps * 2), ncol = 2) mu <- c(1.5, -1) L <- matrix(data = c(sqrt(2), 0, 1.2 / sqrt(2), sqrt(1 - 1.2^2/2)), ncol = 2, byrow = TRUE) Y <- t(mu + L %*% t(X)) plot_df <- data.frame(rbind(X, Y), c(rep("X", nsamps), rep("Y", nsamps))) colnames(plot_df) <- c("D1", "D2", "var") ggplot(data = plot_df, aes(x = D1, y = D2, colour = as.factor(var))) + geom_point() Exercise 8.4 (Eigendecomposition) R: Let \\(\\Sigma = U \\Lambda U^T\\) be the eigendecomposition of covariance matrix \\(\\Sigma\\). Follow the procedure below, to sample from a multivariate normal with \\(\\mu = [-2, 1]^T\\) and \\(\\Sigma = \\begin{bmatrix} 0.3, -0.5 \\\\ -0.5, 1.6 \\end{bmatrix}\\): Sample from two independent standardized normal distributions to get \\(X\\). Find the eigendecomposition of \\(X\\) (eigen). Multiply \\(X\\) by \\(\\Lambda^{\\frac{1}{2}}\\) to get \\(X2\\). Consider how the eigendecomposition for \\(X2\\) changes compared to \\(X\\). Multiply \\(X2\\) by \\(U\\) to get \\(X3\\). Consider how the eigendecomposition for \\(X3\\) changes compared to \\(X2\\). Add \\(\\mu\\) to \\(X3\\). Consider how the eigendecomposition for \\(X4\\) changes compared to \\(X3\\). Plot the data and the eigenvectors (scaled with \\(\\Lambda^{\\frac{1}{2}}\\)) at each step. Hint: Use geom_segment for the eigenvectors. 
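Before following the procedure, it may help to see why it works: if \\(\\Sigma = U \\Lambda U^T\\) and \\(Z\\) is a vector of independent standard normal variables, then \\(Y = \\mu + U \\Lambda^{\\frac{1}{2}} Z\\) has covariance \\(U \\Lambda^{\\frac{1}{2}} \\Lambda^{\\frac{1}{2}} U^T = \\Sigma\\). The sketch below checks this numerically; the sample size is arbitrary and not part of the exercise.
set.seed(1)
sigma <- matrix(c(0.3, -0.5, -0.5, 1.6), nrow = 2, byrow = TRUE)
ed <- eigen(sigma)
ed$vectors %*% diag(ed$values) %*% t(ed$vectors) # reconstructs sigma from U and Lambda
Z <- matrix(rnorm(2 * 100000), ncol = 2)
Y <- t(c(-2, 1) + ed$vectors %*% diag(sqrt(ed$values)) %*% t(Z))
cov(Y) # should be close to sigma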
# a set.seed(1) sigma <- matrix(data = c(0.3, -0.5, -0.5, 1.6), nrow = 2, byrow = TRUE) ed <- eigen(sigma) e_val <- ed$values e_vec <- ed$vectors # b set.seed(1) nsamps <- 1000 X <- matrix(data = rnorm(nsamps * 2), ncol = 2) vec1 <- matrix(c(1,0,0,1), nrow = 2) X2 <- t(sqrt(diag(e_val)) %*% t(X)) vec2 <- sqrt(diag(e_val)) %*% vec1 X3 <- t(e_vec %*% t(X2)) vec3 <- e_vec %*% vec2 X4 <- t(c(-2, 1) + t(X3)) vec4 <- c(-2, 1) + vec3 vec_mat <- data.frame(matrix(c(0,0,0,0,0,0,0,0,0,0,0,0,-2,1,-2,1), ncol = 2, byrow = TRUE), t(cbind(vec1, vec2, vec3, vec4)), c(1,1,2,2,3,3,4,4)) df <- data.frame(rbind(X, X2, X3, X4), c(rep(1, nsamps), rep(2, nsamps), rep(3, nsamps), rep(4, nsamps))) colnames(df) <- c("D1", "D2", "wh") colnames(vec_mat) <- c("D1", "D2", "E1", "E2", "wh") ggplot(data = df, aes(x = D1, y = D2)) + geom_point() + geom_segment(data = vec_mat, aes(xend = E1, yend = E2), color = "red") + facet_wrap(~ wh) + coord_fixed() Exercise 8.5 (Marginal and conditional distributions) Let \\(X \\sim \\text{N}(\\mu, \\Sigma)\\), where \\(\\mu = [2, 0, -1]^T\\) and \\(\\Sigma = \\begin{bmatrix} 1 & -0.2 & 0.5 \\\\ -0.2 & 1.4 & -1.2 \\\\ 0.5 & -1.2 & 2 \\\\ \\end{bmatrix}\\). Let \\(A\\) represent the first two random variables and \\(B\\) the third random variable. R: For the calculation in the following points, you can use R. Find the marginal distribution of \\(B\\). Find the conditional distribution of \\(B | A = [a_1, a_2]^T\\). Find the marginal distribution of \\(A\\). Find the conditional distribution of \\(A | B = b\\). R: Visually compare the distributions of a) and b), and c) and d) at three different conditional values. mu <- c(2, 0, -1) Sigma <- matrix(c(1, -0.2, 0.5, -0.2, 1.4, -1.2, 0.5, -1.2, 2), nrow = 3, byrow = TRUE) mu_A <- c(2, 0) mu_B <- -1 Sigma_A <- Sigma[1:2, 1:2] Sigma_B <- Sigma[3, 3] Sigma_AB <- Sigma[1:2, 3] # b tmp_b <- t(Sigma_AB) %*% solve(Sigma_A) mu_b <- mu_B - tmp_b %*% mu_A Sigma_b <- Sigma_B - t(Sigma_AB) %*% solve(Sigma_A) %*% Sigma_AB mu_b ## [,1] ## [1,] -1.676471 tmp_b ## [,1] [,2] ## [1,] 0.3382353 -0.8088235 Sigma_b ## [,1] ## [1,] 0.8602941 # d tmp_a <- Sigma_AB * (1 / Sigma_B) mu_a <- mu_A - tmp_a * mu_B Sigma_d <- Sigma_A - (Sigma_AB * (1 / Sigma_B)) %*% t(Sigma_AB) mu_a ## [1] 2.25 -0.60 tmp_a ## [1] 0.25 -0.60 Sigma_d ## [,1] [,2] ## [1,] 0.875 0.10 ## [2,] 0.100 0.68 Solution. \\(B \\sim \\text{N}(-1, 2)\\). \\(B | A = a \\sim \\text{N}(-1.68 + [0.34, -0.81] a, 0.86)\\). \\(\\mu_A = [2, 0]^T\\) and \\(\\Sigma_A = \\begin{bmatrix} 1 & -0.2 & \\\\ -0.2 & 1.4 \\\\ \\end{bmatrix}\\). 
\\[\\begin{align} A | B = b &\\sim \\text{N}(\\mu_t, \\Sigma_t), \\\\ \\mu_t &= [2.25, -0.6]^T + [0.25, -0.6]^T b, \\\\ \\Sigma_t &= \\begin{bmatrix} 0.875 & 0.1 \\\\ 0.1 & 0.68 \\\\ \\end{bmatrix} \\end{align}\\] library(mvtnorm) set.seed(1) nsamps <- 1000 # a and b samps <- as.data.frame(matrix(data = NA, nrow = 4 * nsamps, ncol = 2)) samps[1:nsamps,1] <- rnorm(nsamps, mu_B, Sigma_B) samps[1:nsamps,2] <- "marginal" for (i in 1:3) { a <- rmvnorm(1, mu_A, Sigma_A) samps[(i*nsamps + 1):((i + 1) * nsamps), 1] <- rnorm(nsamps, mu_b + tmp_b %*% t(a), Sigma_b) samps[(i*nsamps + 1):((i + 1) * nsamps), 2] <- paste0(# "cond", round(a, digits = 2), collapse = "-") } colnames(samps) <- c("x", "dist") ggplot(samps, aes(x = x)) + geom_density() + facet_wrap(~ dist) # c and d samps <- as.data.frame(matrix(data = NA, nrow = 4 * nsamps, ncol = 3)) samps[1:nsamps,1:2] <- rmvnorm(nsamps, mu_A, Sigma_A) samps[1:nsamps,3] <- "marginal" for (i in 1:3) { b <- rnorm(1, mu_B, Sigma_B) samps[(i*nsamps + 1):((i + 1) * nsamps), 1:2] <- rmvnorm(nsamps, mu_a + tmp_a * b, Sigma_d) samps[(i*nsamps + 1):((i + 1) * nsamps), 3] <- b } colnames(samps) <- c("x", "y", "dist") ggplot(samps, aes(x = x, y = y)) + geom_point() + geom_smooth(method = "lm") + facet_wrap(~ dist) 8.3 Transformations Exercise 8.6 Let \\((U,V)\\) be a random variable with PDF \\(p(u,v) = \\frac{1}{4 \\sqrt{u}}\\), \\(U \\in [0,4]\\) and \\(V \\in [\\sqrt{U}, \\sqrt{U} + 1]\\). Let \\(X = \\sqrt{U}\\) and \\(Y = V - \\sqrt{U}\\). Find PDF of \\((X,Y)\\). What can you tell about distributions of \\(X\\) and \\(Y\\)? This exercise shows how we can simplify a probabilistic problem with a clever use of transformations. R: Take 1000 samples from \\((X,Y)\\) and transform them with inverses of the above functions to get samples from \\((U,V)\\). Plot both sets of samples. Solution. First we need to find the inverse functions. Since \\(x = \\sqrt{u}\\) it follows that \\(u = x^2\\), and that \\(x \\in [0,2]\\). Similarly \\(v = y + x\\) and \\(y \\in [0,1]\\). Let us first find the Jacobian. \\[\\renewcommand\\arraystretch{1.6} J(x,y) = \\begin{bmatrix} \\frac{\\partial u}{\\partial x} & \\frac{\\partial v}{\\partial x} \\\\%[1ex] % <-- 1ex more space between rows of matrix \\frac{\\partial u}{\\partial y} & \\frac{\\partial v}{\\partial y} \\end{bmatrix} = \\begin{bmatrix} 2x & 1 \\\\%[1ex] % <-- 1ex more space between rows of matrix 0 & 1 \\end{bmatrix}, \\] and the determinant is \\(|J(x,y)| = 2x\\). Putting everything together, we get \\[\\begin{align} p_{X,Y}(x,y) = p_{U,V}(x^2, y + x) |J(x,y)| = \\frac{1}{4 \\sqrt{x^2}} 2x = \\frac{1}{2}. \\end{align}\\] This reminds us of the Uniform distribution. Indeed we can see that \\(p_X(x) = \\frac{1}{2}\\) and \\(p_Y(y) = 1\\). So instead of dealing with an awkward PDF of \\((U,V)\\) and the corresponding dynamic bounds, we are now looking at two independent Uniform random variables. In practice, this could make modeling much easier. set.seed(1) nsamps <- 2000 x <- runif(nsamps, min = 0, max = 2) y <- runif(nsamps) orig <- tibble(x = x, y = y, vrs = "original") u <- x^2 v <- y + x transf <- tibble(x = u, y = v, vrs = "transformed") df <- bind_rows(orig, transf) ggplot(df, aes(x = x, y = y, color = vrs)) + geom_point(alpha = 0.3) Exercise 8.7 R: Write a function that will calculate the probability density of an arbitraty multivariate normal distribution, based on independent standardized normal PDFs. Compare with dmvnorm from the mvtnorm package. 
library(mvtnorm) set.seed(1) mvn_dens <- function (y, mu, Sigma) { L <- chol(Sigma) L_inv <- solve(t(L)) g_inv <- L_inv %*% t(y - mu) J <- L_inv J_det <- det(J) return(prod(dnorm(g_inv)) * J_det) } mu_v <- c(-2, 0, 1) cov_m <- matrix(c(1, -0.2, 0.5, -0.2, 2, 0.3, 0.5, 0.3, 1.6), ncol = 3, byrow = TRUE) n_comp <- 20 for (i in 1:n_comp) { x <- rmvnorm(1, mean = mu_v, sigma = cov_m) print(paste0("My function: ", mvn_dens(x, mu_v, cov_m), ", dmvnorm: ", dmvnorm(x, mu_v, cov_m))) } ## [1] "My function: 0.0229514237156383, dmvnorm: 0.0229514237156383" ## [1] "My function: 0.00763138915406231, dmvnorm: 0.00763138915406231" ## [1] "My function: 0.0230688881105741, dmvnorm: 0.0230688881105741" ## [1] "My function: 0.0113616213114731, dmvnorm: 0.0113616213114731" ## [1] "My function: 0.00151808500121907, dmvnorm: 0.00151808500121907" ## [1] "My function: 0.0257658045974509, dmvnorm: 0.0257658045974509" ## [1] "My function: 0.0157963825730805, dmvnorm: 0.0157963825730805" ## [1] "My function: 0.00408856287529248, dmvnorm: 0.00408856287529248" ## [1] "My function: 0.0327793540101256, dmvnorm: 0.0327793540101256" ## [1] "My function: 0.0111606542967978, dmvnorm: 0.0111606542967978" ## [1] "My function: 0.0147636757585684, dmvnorm: 0.0147636757585684" ## [1] "My function: 0.0142948300412207, dmvnorm: 0.0142948300412207" ## [1] "My function: 0.0203093820657542, dmvnorm: 0.0203093820657542" ## [1] "My function: 0.0287533273357481, dmvnorm: 0.0287533273357481" ## [1] "My function: 0.0213402305128623, dmvnorm: 0.0213402305128623" ## [1] "My function: 0.0218356957993885, dmvnorm: 0.0218356957993885" ## [1] "My function: 0.0250750113961771, dmvnorm: 0.0250750113961771" ## [1] "My function: 0.0166498666348048, dmvnorm: 0.0166498666348048" ## [1] "My function: 0.00189725106874659, dmvnorm: 0.00189725106874659" ## [1] "My function: 0.0196697814975113, dmvnorm: 0.0196697814975113" "],["ard.html", "Chapter 9 Alternative representation of distributions 9.1 Probability generating functions (PGFs) 9.2 Moment generating functions (MGFs)", " Chapter 9 Alternative representation of distributions This chapter deals with alternative representation of distributions. The students are expected to acquire the following knowledge: Theoretical Probability generating functions. Moment generating functions. .fold-btn { float: right; margin: 5px 5px 0 0; } .fold { border: 1px solid black; min-height: 40px; } 9.1 Probability generating functions (PGFs) Exercise 9.1 Show that the sum of independent Poisson random variables is itself a Poisson random variable. R: Let \\(X\\) be a sum of three Poisson distributions with \\(\\lambda_i \\in \\{2, 5.2, 10\\}\\). Take 1000 samples and plot the three distributions and the sum. Then take 1000 samples from the theoretical distribution of \\(X\\) and compare them to the sum. Solution. Let \\(X_i \\sim \\text{Poisson}(\\lambda_i)\\) for \\(i = 1,...,n\\), and let \\(X = \\sum_{i=1}^n X_i\\). 
\\[\\begin{align} \\alpha_X(t) &= \\prod_{i=1}^n \\alpha_{X_i}(t) \\\\ &= \\prod_{i=1}^n \\bigg( \\sum_{j=0}^\\infty t^j \\frac{\\lambda_i^j e^{-\\lambda_i}}{j!} \\bigg) \\\\ &= \\prod_{i=1}^n \\bigg( e^{-\\lambda_i} \\sum_{j=0}^\\infty \\frac{(t\\lambda_i)^j }{j!} \\bigg) \\\\ &= \\prod_{i=1}^n \\bigg( e^{-\\lambda_i} e^{t \\lambda_i} \\bigg) & \\text{power series} \\\\ &= \\prod_{i=1}^n \\bigg( e^{\\lambda_i(t - 1)} \\bigg) \\\\ &= e^{\\sum_{i=1}^n \\lambda_i(t - 1)} \\\\ &= e^{t \\sum_{i=1}^n \\lambda_i - \\sum_{i=1}^n \\lambda_i} \\\\ &= e^{-\\sum_{i=1}^n \\lambda_i} \\sum_{j=0}^\\infty \\frac{(t \\sum_{i=1}^n \\lambda_i)^j}{j!}\\\\ &= \\sum_{j=0}^\\infty \\frac{e^{-\\sum_{i=1}^n \\lambda_i} (t \\sum_{i=1}^n \\lambda_i)^j}{j!}\\\\ \\end{align}\\] The last term is the PGF of a Poisson random variable with parameter \\(\\sum_{i=1}^n \\lambda_i\\). Because the PGF is unique, \\(X\\) is a Poisson random variable. set.seed(1) library(tidyr) nsamps <- 1000 samps <- matrix(data = NA, nrow = nsamps, ncol = 4) samps[ ,1] <- rpois(nsamps, 2) samps[ ,2] <- rpois(nsamps, 5.2) samps[ ,3] <- rpois(nsamps, 10) samps[ ,4] <- samps[ ,1] + samps[ ,2] + samps[ ,3] colnames(samps) <- c(2, 2.5, 10, "sum") gsamps <- as_tibble(samps) gsamps <- gather(gsamps, key = "dist", value = "value") ggplot(gsamps, aes(x = value)) + geom_bar() + facet_wrap(~ dist) samps <- cbind(samps, "theoretical" = rpois(nsamps, 2 + 5.2 + 10)) gsamps <- as_tibble(samps[ ,4:5]) gsamps <- gather(gsamps, key = "dist", value = "value") ggplot(gsamps, aes(x = value, fill = dist)) + geom_bar(position = "dodge") Exercise 9.2 Find the expected value and variance of the negative binomial distribution. Hint: Find the Taylor series of \\((1 - y)^{-r}\\) at point 0. Solution. Let \\(X \\sim \\text{NB}(r, p)\\). \\[\\begin{align} \\alpha_X(t) &= E[t^X] \\\\ &= \\sum_{j=0}^\\infty t^j \\binom{j + r - 1}{j} (1 - p)^r p^j \\\\ &= (1 - p)^r \\sum_{j=0}^\\infty \\binom{j + r - 1}{j} (tp)^j \\\\ &= (1 - p)^r \\sum_{j=0}^\\infty \\frac{(j + r - 1)(j + r - 2)...r}{j!} (tp)^j. \\\\ \\end{align}\\] Let us look at the Taylor series of \\((1 - y)^{-r}\\) at 0 \\[\\begin{align} (1 - y)^{-r} = &1 + \\frac{-r(-1)}{1!}y + \\frac{-r(-r - 1)(-1)^2}{2!}y^2 + \\\\ &\\frac{-r(-r - 1)(-r - 2)(-1)^3}{3!}y^3 + ... \\\\ \\end{align}\\] How does the \\(k\\)-th term look like? We have \\(k\\) derivatives of our function so \\[\\begin{align} \\frac{d^k}{d^k y} (1 - y)^{-r} &= \\frac{-r(-r - 1)...(-r - k + 1)(-1)^k}{k!}y^k \\\\ &= \\frac{r(r + 1)...(r + k - 1)}{k!}y^k. \\end{align}\\] We observe that this equals to the \\(j\\)-th term in the sum of NB PGF. Therefore \\[\\begin{align} \\alpha_X(t) &= (1 - p)^r (1 - tp)^{-r} \\\\ &= \\Big(\\frac{1 - p}{1 - tp}\\Big)^r \\end{align}\\] To find the expected value, we need to differentiate \\[\\begin{align} \\frac{d}{dt} \\Big(\\frac{1 - p}{1 - tp}\\Big)^r &= r \\Big(\\frac{1 - p}{1 - tp}\\Big)^{r-1} \\frac{d}{dt} \\frac{1 - p}{1 - tp} \\\\ &= r \\Big(\\frac{1 - p}{1 - tp}\\Big)^{r-1} \\frac{p(1 - p)}{(1 - tp)^2}. \\\\ \\end{align}\\] Evaluating this at 1, we get: \\[\\begin{align} E[X] = \\frac{rp}{1 - p}. \\end{align}\\] For the variance we need the second derivative. 
\\[\\begin{align} \\frac{d^2}{d^2t} \\Big(\\frac{1 - p}{1 - tp}\\Big)^r &= \\frac{p^2 r (r + 1) (\\frac{1 - p}{1 - tp})^r}{(tp - 1)^2} \\end{align}\\] Evaluating this at 1 and inserting the first derivatives, we get: \\[\\begin{align} Var[X] &= \\frac{d^2}{dt^2} \\alpha_X(1) + \\frac{d}{dt}\\alpha_X(1) - \\Big(\\frac{d}{dt}\\alpha_X(t) \\Big)^2 \\\\ &= \\frac{p^2 r (r + 1)}{(1 - p)^2} + \\frac{rp}{1 - p} - \\frac{r^2p^2}{(1 - p)^2} \\\\ &= \\frac{rp}{(1 - p)^2}. \\end{align}\\] library(tidyr) set.seed(1) nsamps <- 100000 find_p <- function (mu, r) { return (10 / (r + 10)) } r <- c(1,2,10,20) p <- find_p(10, r) sigma <- rep(sqrt(p*r / (1 - p)^2), each = nsamps) samps <- cbind("r=1" = rnbinom(nsamps, size = r[1], prob = 1 - p[1]), "r=2" = rnbinom(nsamps, size = r[2], prob = 1 - p[2]), "r=4" = rnbinom(nsamps, size = r[3], prob = 1 - p[3]), "r=20" = rnbinom(nsamps, size = r[4], prob = 1 - p[4])) gsamps <- gather(as.data.frame(samps)) iw <- (gsamps$value > sigma + 10) | (gsamps$value < sigma - 10) ggplot(gsamps, aes(x = value, fill = iw)) + geom_bar() + # geom_density() + facet_wrap(~ key) 9.2 Moment generating functions (MGFs) Exercise 9.3 Find the variance of the geometric distribution. Solution. Let \\(X \\sim \\text{Geometric}(p)\\). The MGF of the geometric distribution is \\[\\begin{align} M_X(t) &= E[e^{tX}] \\\\ &= \\sum_{k=0}^\\infty p(1 - p)^k e^{tk} \\\\ &= p \\sum_{k=0}^\\infty ((1 - p)e^t)^k. \\end{align}\\] Let us assume that \\((1 - p)e^t < 1\\). Then, by using the geometric series we get \\[\\begin{align} M_X(t) &= \\frac{p}{1 - e^t + pe^t}. \\end{align}\\] The first derivative of the above expression is \\[\\begin{align} \\frac{d}{dt}M_X(t) &= \\frac{-p(-e^t + pe^t)}{(1 - e^t + pe^t)^2}, \\end{align}\\] and evaluating at \\(t = 0\\), we get \\(\\frac{1 - p}{p}\\), which we already recognize as the expected value of the geometric distribution. The second derivative is \\[\\begin{align} \\frac{d^2}{dt^2}M_X(t) &= \\frac{(p-1)pe^t((p-1)e^t - 1)}{((p - 1)e^t + 1)^3}, \\end{align}\\] and evaluating at \\(t = 0\\), we get \\(\\frac{(p - 1)(p - 2)}{p^2}\\). Combining we get the variance \\[\\begin{align} Var(X) &= \\frac{(p - 1)(p - 2)}{p^2} - \\frac{(1 - p)^2}{p^2} \\\\ &= \\frac{(p-1)(p-2) - (1-p)^2}{p^2} \\\\ &= \\frac{1 - p}{p^2}. \\end{align}\\] Exercise 9.4 Find the distribution of sum of two normal random variables \\(X\\) and \\(Y\\), by comparing \\(M_{X+Y}(t)\\) to \\(M_X(t)\\). R: To illustrate the result draw random samples from N\\((-3, 1)\\) and N\\((5, 1.2)\\) and calculate the empirical mean and variance of \\(X+Y\\). Plot all three histograms in one plot. Solution. Let \\(X \\sim \\text{N}(\\mu_X, 1)\\) and \\(Y \\sim \\text{N}(\\mu_Y, 1)\\). The MGF of the sum is \\[\\begin{align} M_{X+Y}(t) &= M_X(t) M_Y(t). \\end{align}\\] Let us calculate \\(M_X(t)\\), the MGF for \\(Y\\) then follows analogously. 
\\[\\begin{align} M_X(t) &= \\int_{-\\infty}^\\infty e^{tx} \\frac{1}{\\sqrt{2 \\pi \\sigma_X^2}} e^{-\\frac{(x - \\mu_X)^2}{2\\sigma_X^2}} dx \\\\ &= \\int_{-\\infty}^\\infty \\frac{1}{\\sqrt{2 \\pi \\sigma_X^2}} e^{-\\frac{(x - \\mu_X)^2 - 2\\sigma_X^2 tx}{2\\sigma_X^2}} dx \\\\ &= \\int_{-\\infty}^\\infty \\frac{1}{\\sqrt{2 \\pi \\sigma_X^2}} e^{-\\frac{x^2 - 2\\mu_X x + \\mu_X^2 - 2\\sigma_X^2 tx}{2\\sigma_X^2}} dx \\\\ &= \\int_{-\\infty}^\\infty \\frac{1}{\\sqrt{2 \\pi \\sigma_X^2}} e^{-\\frac{(x - (\\mu_X + \\sigma_X^2 t))^2 + \\mu_X^2 - (\\mu_X + \\sigma_X^2 t)^2}{2\\sigma_X^2}} dx & \\text{complete the square}\\\\ &= e^{-\\frac{\\mu_X^2 - (\\mu_X + \\sigma_X^2 t)^2}{2\\sigma_X^2}} \\int_{-\\infty}^\\infty \\frac{1}{\\sqrt{2 \\pi \\sigma_X^2}} e^{-\\frac{(x - (\\mu_X + \\sigma_X^2 t))^2}{2\\sigma_X^2}} dx & \\\\ &= e^{-\\frac{\\mu_X^2 - (\\mu_X + \\sigma_X^2 t)^2}{2\\sigma_X^2}} & \\text{normal PDF} \\\\ &= e^{-\\frac{\\mu_X^2 - \\mu_X^2 - 2\\mu_X \\sigma_X^2 t - \\sigma_X^4 t^2}{2\\sigma_X^2}} \\\\ &= e^{\\mu_X t + \\frac{\\sigma_X^2 t^2}{2}}. \\\\ \\end{align}\\] The MGF of the sum is then \\[\\begin{align} M_{X+Y}(t) &= e^{\\mu_X t + 0.5 \\sigma_X^2 t^2} e^{\\mu_Y t + 0.5 \\sigma_Y^2 t^2} \\\\ &= e^{t(\\mu_X + \\mu_Y) + 0.5 t^2(\\sigma_X^2 + \\sigma_Y^2)}. \\end{align}\\] By comparing \\(M_{X+Y}(t)\\) and \\(M_X(t)\\) we observe that both have two terms. The first is the mean multiplied by \\(t\\), and the second is the variance multiplied by \\(\\frac{t^2}{2}\\). Since MGFs are unique, we conclude that \\(Z = X + Y \\sim \\text{N}(\\mu_X + \\mu_Y, \\sigma_X^2 + \\sigma_Y^2)\\). library(tidyr) library(ggplot2) set.seed(1) nsamps <- 1000 x <- rnorm(nsamps, -3, 1) y <- rnorm(nsamps, 5, 1.2) z <- x + y mean(z) ## [1] 1.968838 var(z) ## [1] 2.645034 df <- data.frame(x = x, y = y, z = z) %>% gather() ggplot(df, aes(x = value, fill = key)) + geom_histogram(position = "dodge") "],["ci.html", "Chapter 10 Concentration inequalities 10.1 Comparison 10.2 Practical", " Chapter 10 Concentration inequalities This chapter deals with concentration inequalities. The students are expected to acquire the following knowledge: Theoretical More assumptions produce closer bounds. R Optimization. Estimating probability inequalities. .fold-btn { float: right; margin: 5px 5px 0 0; } .fold { border: 1px solid black; min-height: 40px; } 10.1 Comparison Exercise 10.1 R: Let \\(X\\) be a geometric random variable with \\(p = 0.7\\). Visually compare the Markov bound, the Chernoff bound, and the theoretical probabilities for \\(x = 1,...,12\\). To get the best fitting Chernoff bound, you will need to optimize the bound depending on \\(t\\). Use either analytical or numerical optimization.
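A short analytical route is also possible (a sketch, not part of the original solution): writing the bound as \\(\\frac{M_X(t)}{e^{at}}\\) with the geometric MGF \\(M_X(t) = \\frac{p}{1 - e^t + pe^t}\\) from Exercise 9.3 and minimizing over \\(t \\in (0, -\\ln(1-p))\\) gives \\(t^* = \\ln\\big(\\frac{a}{(a+1)(1-p)}\\big)\\), which lies inside the allowed interval for \\(p = 0.7\\) and all \\(a\\) considered here. This closed form can be checked against the numerically optimized values of \\(t\\) in the solution below:
p <- 0.7
a <- 1:12
t_analytic <- log(a / ((a + 1) * (1 - p)))  # closed-form minimizer of the Chernoff bound (sketch)
round(t_analytic, 4)                        # compare with the optimize() output in the solution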
bound_chernoff <- function (t, p, a) { return ((p / (1 - exp(t) + p * exp(t))) / exp(a * t)) } set.seed(1) p <- 0.7 a <- seq(1, 12, by = 1) ci_markov <- (1 - p) / p / a t <- vector(mode = "numeric", length = length(a)) for (i in 1:length(t)) { t[i] <- optimize(bound_chernoff, interval = c(0, log(1 / (1 - p))), p = p, a = a[i])$minimum } t ## [1] 0.5108267 0.7984981 0.9162927 0.9808238 1.0216635 1.0498233 1.0704327 ## [8] 1.0861944 1.0986159 1.1086800 1.1169653 1.1239426 ci_chernoff <- (p / (1 - exp(t) + p * exp(t))) / exp(a * t) actual <- 1 - pgeom(a, 0.7) plot_df <- rbind( data.frame(x = a, y = ci_markov, type = "Markov"), data.frame(x = a, y = ci_chernoff, type = "Chernoff"), data.frame(x = a, y = actual, type = "Actual") ) ggplot(plot_df, aes(x = x, y = y, color = type)) + geom_line() Exercise 10.2 R: Let \\(X\\) be a sum of 100 Beta distributions with random parameters. Take 1000 samples and plot the Chebyshev bound, Hoeffding bound, and the empirical probabilities. set.seed(1) nvars <- 100 nsamps <- 1000 samps <- matrix(data = NA, nrow = nsamps, ncol = nvars) Sn_mean <- 0 Sn_var <- 0 for (i in 1:nvars) { alpha1 <- rgamma(1, 10, 1) beta1 <- rgamma(1, 10, 1) X <- rbeta(nsamps, alpha1, beta1) Sn_mean <- Sn_mean + alpha1 / (alpha1 + beta1) Sn_var <- Sn_var + alpha1 * beta1 / ((alpha1 + beta1)^2 * (alpha1 + beta1 + 1)) samps[ ,i] <- X } mean(apply(samps, 1, sum)) ## [1] 51.12511 Sn_mean ## [1] 51.15723 var(apply(samps, 1, sum)) ## [1] 1.170652 Sn_var ## [1] 1.166183 a <- 1:30 b <- a / sqrt(Sn_var) ci_chebyshev <- 1 / b^2 ci_hoeffding <- 2 * exp(- 2 * a^2 / nvars) empirical <- NULL for (i in 1:length(a)) { empirical[i] <- sum(abs((apply(samps, 1, sum)) - Sn_mean) >= a[i])/ nsamps } plot_df <- rbind( data.frame(x = a, y = ci_chebyshev, type = "Chebyshev"), data.frame(x = a, y = ci_hoeffding, type = "Hoeffding"), data.frame(x = a, y = empirical, type = "Empirical") ) ggplot(plot_df, aes(x = x, y = y, color = type)) + geom_line() ggplot(plot_df, aes(x = x, y = y, color = type)) + geom_line() + coord_cartesian(xlim = c(15, 25), ylim = c(0, 0.05)) 10.2 Practical Exercise 10.3 From Jagannathan. Let \\(X_i\\), \\(i = 1,...n\\), be a random sample of size \\(n\\) of a random variable \\(X\\). Let \\(X\\) have mean \\(\\mu\\) and variance \\(\\sigma^2\\). Find the size of the sample \\(n\\) required so that the probability that the difference between sample mean and true mean is smaller than \\(\\frac{\\sigma}{10}\\) is at least 0.95. Hint: Derive a version of the Chebyshev inequality for \\(P(|X - \\mu| \\geq a)\\) using Markov inequality. Solution. Let \\(\\bar{X} = \\sum_{i=1}^n X_i\\). Then \\(E[\\bar{X}] = \\mu\\) and \\(Var[\\bar{X}] = \\frac{\\sigma^2}{n}\\). Let us first derive another representation of Chebyshev inequality. \\[\\begin{align} P(|X - \\mu| \\geq a) = P(|X - \\mu|^2 \\geq a^2) \\leq \\frac{E[|X - \\mu|^2]}{a^2} = \\frac{Var[X]}{a^2}. \\end{align}\\] Let us use that on our sampling distribution: \\[\\begin{align} P(|\\bar{X} - \\mu| \\geq \\frac{\\sigma}{10}) \\leq \\frac{100 Var[\\bar{X}]}{\\sigma^2} = \\frac{100 Var[X]}{n \\sigma^2} = \\frac{100}{n}. \\end{align}\\] We are interested in the difference being smaller, therefore \\[\\begin{align} P(|\\bar{X} - \\mu| < \\frac{\\sigma}{10}) = 1 - P(|\\bar{X} - \\mu| \\geq \\frac{\\sigma}{10}) \\geq 1 - \\frac{100}{n} \\geq 0.95. \\end{align}\\] It follows that we need a sample size of \\(n \\geq \\frac{100}{0.05} = 2000\\). 
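A small simulation can illustrate how conservative this Chebyshev-based sample size is (not part of the original solution; the Exponential(1) population below is an arbitrary choice with \\(\\mu = \\sigma = 1\\)):
set.seed(1)
n    <- 2000   # sample size from the derivation above
nrep <- 1000
hits <- replicate(nrep, abs(mean(rexp(n, rate = 1)) - 1) < 1 / 10)
mean(hits)     # empirical probability; typically 1, far above the guaranteed 0.95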
"],["crv.html", "Chapter 11 Convergence of random variables", " Chapter 11 Convergence of random variables This chapter deals with convergence of random variables. The students are expected to acquire the following knowledge: Theoretical Finding convergences of random variables. .fold-btn { float: right; margin: 5px 5px 0 0; } .fold { border: 1px solid black; min-height: 40px; } Exercise 11.1 Let \\(X_1\\), \\(X_2\\),…, \\(X_n\\) be a sequence of Bernoulli random variables. Let \\(Y_n = \\frac{X_1 + X_2 + ... + X_n}{n^2}\\). Show that this sequence converges point-wise to the zero random variable. R: Use a simulation to check your answer. Solution. Let \\(\\epsilon\\) be arbitrary. We need to find such \\(n_0\\), that for every \\(n\\) greater than \\(n_0\\) \\(|Y_n| < \\epsilon\\) holds. \\[\\begin{align} |Y_n| &= |\\frac{X_1 + X_2 + ... + X_n}{n^2}| \\\\ &\\leq |\\frac{n}{n^2}| \\\\ &= \\frac{1}{n}. \\end{align}\\] So we need to find such \\(n_0\\), that for every \\(n > n_0\\) we will have \\(\\frac{1}{n} < \\epsilon\\). So \\(n_0 > \\frac{1}{\\epsilon}\\). x <- 1:1000 X <- matrix(data = NA, nrow = length(x), ncol = 100) y <- vector(mode = "numeric", length = length(x)) for (i in 1:length(x)) { X[i, ] <- rbinom(100, size = 1, prob = 0.5) } X <- apply(X, 2, cumsum) tmp_mat <- matrix(data = (1:1000)^2, nrow = 1000, ncol = 100) X <- X / tmp_mat y <- apply(X, 1, mean) ggplot(data.frame(x = x, y = y), aes(x = x, y = y)) + geom_line() Exercise 11.2 Let \\(\\Omega = [0,1]\\) and let \\(X_n\\) be a sequence of random variables, defined as \\[\\begin{align} X_n(\\omega) = \\begin{cases} \\omega^3, &\\omega = \\frac{i}{n}, &0 \\leq i \\leq 1 \\\\ 1, & \\text{otherwise.} \\end{cases} \\end{align}\\] Show that \\(X_n\\) converges almost surely to \\(X \\sim \\text{Uniform}(0,1)\\). Solution. We need to show \\(P(\\{\\omega: X_n(\\omega) \\rightarrow X(\\omega)\\}) = 1\\). Let \\(\\omega \\neq \\frac{i}{n}\\). Then for any \\(\\omega\\), \\(X_n\\) converges pointwise to \\(X\\): \\[\\begin{align} X_n(\\omega) = 1 \\implies |X_n(\\omega) - X(s)| = |1 - 1| < \\epsilon. \\end{align}\\] The above is independent of \\(n\\). Since there are countably infinite number of elements in the complement (\\(\\frac{i}{n}\\)), the probability of this set is 1. Exercise 11.3 Borrowed from Wasserman. Let \\(X_n \\sim \\text{N}(0, \\frac{1}{n})\\) and let \\(X\\) be a random variable with CDF \\[\\begin{align} F_X(x) = \\begin{cases} 0, &x < 0 \\\\ 1, &x \\geq 0. \\end{cases} \\end{align}\\] Does \\(X_n\\) converge to \\(X\\) in distribution? How about in probability? Prove or disprove these statement. R: Plot the CDF of \\(X_n\\) for \\(n = 1, 2, 5, 10, 100, 1000\\). Solution. Let us first check convergence in distribution. \\[\\begin{align} \\lim_{n \\rightarrow \\infty} F_{X_n}(x) &= \\lim_{n \\rightarrow \\infty} \\phi (\\sqrt(n) x). \\end{align}\\] We have two cases, for \\(x < 0\\) and \\(x > 0\\). We do not need to check for \\(x = 0\\), since \\(F_X\\) is not continuous in that point. \\[\\begin{align} \\lim_{n \\rightarrow \\infty} \\phi (\\sqrt(n) x) = \\begin{cases} 0, & x < 0 \\\\ 1, & x > 0. \\end{cases} \\end{align}\\] This is the same as \\(F_X\\). Let us now check convergence in probability. 
Since \\(X\\) is a point-mass distribution at zero, we have \\[\\begin{align} \\lim_{n \\rightarrow \\infty} P(|X_n| > \\epsilon) &= \\lim_{n \\rightarrow \\infty} (P(X_n > \\epsilon) + P(X_n < -\\epsilon)) \\\\ &= \\lim_{n \\rightarrow \\infty} (1 - P(X_n < \\epsilon) + P(X_n < -\\epsilon)) \\\\ &= \\lim_{n \\rightarrow \\infty} (1 - \\phi(\\sqrt{n} \\epsilon) + \\phi(- \\sqrt{n} \\epsilon)) \\\\ &= 0. \\end{align}\\] n <- c(1,2,5,10,100,1000) ggplot(data = data.frame(x = seq(-5, 5, by = 0.01)), aes(x = x)) + stat_function(fun = pnorm, args = list(mean = 0, sd = 1/1), aes(color = "sd = 1/1")) + stat_function(fun = pnorm, args = list(mean = 0, sd = 1/2), aes(color = "sd = 1/2")) + stat_function(fun = pnorm, args = list(mean = 0, sd = 1/5), aes(color = "sd = 1/5")) + stat_function(fun = pnorm, args = list(mean = 0, sd = 1/10), aes(color = "sd = 1/10")) + stat_function(fun = pnorm, args = list(mean = 0, sd = 1/100), aes(color = "sd = 1/100")) + stat_function(fun = pnorm, args = list(mean = 0, sd = 1/1000), aes(color = "sd = 1/1000")) + stat_function(fun = pnorm, args = list(mean = 0, sd = 1/10000), aes(color = "sd = 1/10000")) Exercise 11.4 Let \\(X_i\\) be i.i.d. and \\(\\mu = E(X_1)\\). Let variance of \\(X_1\\) be finite. Show that the mean of \\(X_i\\), \\(\\bar{X}_n = \\frac{1}{n}\\sum_{i=1}^n X_i\\) converges in quadratic mean to \\(\\mu\\). Solution. \\[\\begin{align} \\lim_{n \\rightarrow \\infty} E(|\\bar{X_n} - \\mu|^2) &= \\lim_{n \\rightarrow \\infty} E(\\bar{X_n}^2 - 2 \\bar{X_n} \\mu + \\mu^2) \\\\ &= \\lim_{n \\rightarrow \\infty} (E(\\bar{X_n}^2) - 2 \\mu E(\\frac{\\sum_{i=1}^n X_i}{n}) + \\mu^2) \\\\ &= \\lim_{n \\rightarrow \\infty} E(\\bar{X_n})^2 + \\lim_{n \\rightarrow \\infty} Var(\\bar{X_n}) - 2 \\mu^2 + \\mu^2 \\\\ &= \\lim_{n \\rightarrow \\infty} \\frac{n^2 \\mu^2}{n^2} + \\lim_{n \\rightarrow \\infty} \\frac{\\sigma^2}{n} - \\mu^2 \\\\ &= \\mu^2 - \\mu^2 + \\lim_{n \\rightarrow \\infty} \\frac{\\sigma^2}{n} \\\\ &= 0. \\end{align}\\] "],["lt.html", "Chapter 12 Limit theorems", " Chapter 12 Limit theorems This chapter deals with limit theorems. The students are expected to acquire the following knowledge: Theoretical Monte Carlo integration convergence. Difference between weak and strong law of large numbers. .fold-btn { float: right; margin: 5px 5px 0 0; } .fold { border: 1px solid black; min-height: 40px; } Exercise 12.1 Show that Monte Carlo integration converges almost surely to the true integral of a bounded function. Solution. Let \\(g\\) be a function defined on \\(\\Omega\\). Let \\(X_i\\), \\(i = 1,...,n\\) be i.i.d. (multivariate) uniform random variables with bounds defined on \\(\\Omega\\). Let \\(Y_i\\) = \\(g(X_i)\\). Then it follows that \\(Y_i\\) are also i.i.d. random variables and their expected value is \\(E[g(X)] = \\int_{\\Omega} g(x) f_X(x) dx = \\frac{1}{V_{\\Omega}} \\int_{\\Omega} g(x) dx\\). By the strong law of large numbers, we have \\[\\begin{equation} \\frac{1}{n}\\sum_{i=1}^n Y_i \\xrightarrow{\\text{a.s.}} E[g(X)]. \\end{equation}\\] It follows that \\[\\begin{equation} V_{\\Omega} \\frac{1}{n}\\sum_{i=1}^n Y_i \\xrightarrow{\\text{a.s.}} \\int_{\\Omega} g(x) dx. \\end{equation}\\] Exercise 12.2 Let \\(X\\) be a geometric random variable with probability 0.5 and support in positive integers. Let \\(Y = 2^X (-1)^X X^{-1}\\). Find the expected value of \\(Y\\) by using conditional convergence (this variable does not have an expected value in the conventional sense – the series is not absolutely convergent). 
R: Draw \\(10000\\) samples from a geometric distribution with probability 0.5 and support in positive integers to get \\(X\\). Then calculate \\(Y\\) and plot the means at each iteration (sample). Additionally, plot the expected value calculated in a. Try it with different seeds. What do you notice? Solution. \\[\\begin{align*} E[Y] &= \\sum_{x=1}^{\\infty} \\frac{2^x (-1)^x}{x} 0.5^x \\\\ &= \\sum_{x=1}^{\\infty} \\frac{(-1)^x}{x} \\\\ &= - \\sum_{x=1}^{\\infty} \\frac{(-1)^{x+1}}{x} \\\\ &= - \\ln(2) \\end{align*}\\] set.seed(3) x <- rgeom(100000, prob = 0.5) + 1 y <- 2^x * (-1)^x * x^{-1} y_means <- cumsum(y) / seq_along(y) df <- data.frame(x = 1:length(y_means), y = y_means) ggplot(data = df, aes(x = x, y = y)) + geom_line() + geom_hline(yintercept = -log(2)) "],["eb.html", "Chapter 13 Estimation basics 13.1 ECDF 13.2 Properties of estimators", " Chapter 13 Estimation basics This chapter deals with estimation basics. The students are expected to acquire the following knowledge: Biased and unbiased estimators. Consistent estimators. Empirical cumulative distribution function. .fold-btn { float: right; margin: 5px 5px 0 0; } .fold { border: 1px solid black; min-height: 40px; } 13.1 ECDF Exercise 13.1 (ECDF intuition) Take any univariate continuous distribution that is readily available in R and plot its CDF (\\(F\\)). Draw one sample (\\(n = 1\\)) from the chosen distribution and draw the ECDF (\\(F_n\\)) of that one sample. Use the definition of the ECDF, not an existing function in R. Implementation hint: ECDFs are always piecewise constant - they only jump at the sampled values and by \\(1/n\\). Repeat (b) for \\(n = 5, 10, 100, 1000...\\) Theory says that \\(F_n\\) should converge to \\(F\\). Can you observe that? For \\(n = 100\\) repeat the process \\(m = 20\\) times and plot every \\(F_n^{(m)}\\). Theory says that \\(F_n\\) will converge to \\(F\\) the slowest where \\(F\\) is close to 0.5 (where the variance is largest). Can you observe that? library(ggplot2) set.seed(1) ggplot(data = data.frame(x = seq(-5, 5, by = 0.01))) + # stat_function(aes(x = x), fun = pbeta, args = list(shape1 = 1, shape2 = 2)) stat_function(aes(x = x), fun = pnorm, args = list(mean = 0, sd = 1)) one_samp <- rnorm(1) X <- data.frame(x = c(-5, one_samp, 5), y = c(0,1,1)) ggplot(data = data.frame(x = seq(-5, 5, by = 0.01))) + # stat_function(aes(x = x), fun = pbeta, args = list(shape1 = 1, shape2 = 2)) stat_function(aes(x = x), fun = pnorm, args = list(mean = 0, sd = 1)) + geom_step(data = X, aes(x = x, y = y)) N <- c(5, 10, 100, 1000) X <- NULL for (n in N) { tmp <- rnorm(n) tmp_X <- data.frame(x = c(-5, sort(tmp), 5), y = c(0, seq(1/n, 1, by = 1/n), 1), n = n) X <- rbind(X, tmp_X) } ggplot(data = data.frame(x = seq(-5, 5, by = 0.01))) + # stat_function(aes(x = x), fun = pbeta, args = list(shape1 = 1, shape2 = 2)) stat_function(aes(x = x), fun = pnorm, args = list(mean = 0, sd = 1)) + geom_step(data = X, aes(x = x, y = y, color = as.factor(n))) + labs(color = "N") 13.2 Properties of estimators Exercise 13.2 Show that the sample average is, as an estimator of the mean: unbiased, consistent, asymptotically normal. Solution. \\[\\begin{align*} E[\\frac{1}{n} \\sum_{i=1}^n X_i] &= \\frac{1}{n} \\sum_{i=i}^n E[X_i] \\\\ &= E[X]. 
\\end{align*}\\] \\[\\begin{align*} \\lim_{n \\rightarrow \\infty} P(|\\frac{1}{n} \\sum_{i=1}^n X_i - E[X]| > \\epsilon) &= \\lim_{n \\rightarrow \\infty} P((\\frac{1}{n} \\sum_{i=1}^n X_i - E[X])^2 > \\epsilon^2) \\\\ & \\leq \\lim_{n \\rightarrow \\infty} \\frac{E[(\\frac{1}{n} \\sum_{i=1}^n X_i - E[X])^2]}{\\epsilon^2} & \\text{Markov inequality} \\\\ & = \\lim_{n \\rightarrow \\infty} \\frac{E[(\\frac{1}{n} \\sum_{i=1}^n X_i)^2 - 2 \\frac{1}{n} \\sum_{i=1}^n X_i E[X] + E[X]^2]}{\\epsilon^2} \\\\ & = \\lim_{n \\rightarrow \\infty} \\frac{E[(\\frac{1}{n} \\sum_{i=1}^n X_i)^2] - 2 E[X]^2 + E[X]^2}{\\epsilon^2} \\\\ &= 0 \\end{align*}\\] For the last equality see the solution to ??. Follows directly from the CLT. Exercise 13.3 (Consistent but biased estimator) Show that sample variance (the plug-in estimator of variance) is a biased estimator of variance. Show that sample variance is a consistent estimator of variance. Show that the estimator with (\\(N-1\\)) (Bessel correction) is unbiased. Solution. \\[\\begin{align*} E[\\frac{1}{n} \\sum_{i=1}^n (Y_i - \\bar{Y})^2] &= \\frac{1}{n} \\sum_{i=1}^n E[(Y_i - \\bar{Y})^2] \\\\ &= \\frac{1}{n} \\sum_{i=1}^n E[Y_i^2] - 2 E[Y_i \\bar{Y}] + \\bar{Y}^2)] \\\\ &= \\frac{1}{n} \\sum_{i=1}^n E[Y_i^2 - 2 Y_i \\bar{Y} + \\bar{Y}^2] \\\\ &= \\frac{1}{n} \\sum_{i=1}^n E[Y_i^2 - \\frac{2}{n} Y_i^2 - \\frac{2}{n} \\sum_{i \\neq j} Y_i Y_j + \\frac{1}{n^2}\\sum_j \\sum_{k \\neq j} Y_j Y_k + \\frac{1}{n^2} \\sum_j Y_j^2] \\\\ &= \\frac{1}{n} \\sum_{i=1}^n \\frac{n - 2}{n} (\\sigma^2 + \\mu^2) - \\frac{2}{n} (n - 1) \\mu^2 + \\frac{1}{n^2}n(n-1)\\mu^2 + \\frac{1}{n^2}n(\\sigma^2 + \\mu^2) \\\\ &= \\frac{n-1}{n}\\sigma^2 \\\\ < \\sigma^2. \\end{align*}\\] Let \\(S_n\\) denote the sample variance. Then we can write it as \\[\\begin{align*} S_n &= \\frac{1}{n} \\sum_{i=1}^n (X_i - \\bar{X})^2 = \\frac{1}{n} \\sum_{i=1}^n (X_i - \\mu)^2 + 2(X_i - \\mu)(\\mu - \\bar{X}) + (\\mu - \\bar{X})^2. \\end{align*}\\] Now \\(\\bar{X}\\) converges in probability (by WLLN) to \\(\\mu\\) therefore the right terms converge in probability to zero. The left term converges in probability to \\(\\sigma^2\\), also by WLLN. Therefore the sample variance is a consistent estimatior of the variance. The denominator changes in the second-to-last line of a., therefore the last line is now equality. Exercise 13.4 (Estimating the median) Show that the sample median is an unbiased estimator of the median for N\\((\\mu, \\sigma^2)\\). Show that the sample median is an unbiased estimator of the mean for any distribution with symmetric density. Hint 1: The pdf of an order statistic is \\(f_{X_{(k)}}(x) = \\frac{n!}{(n - k)!(k - 1)!}f_X(x)\\Big(F_X(x)^{k-1} (1 - F_X(x)^{n - k}) \\Big)\\). Hint 2: A distribution is symmetric when \\(X\\) and \\(2a - X\\) have the same distribution for some \\(a\\). Solution. Let \\(Z_i\\), \\(i = 1,...,n\\) be i.i.d. variables with a symmetric distribution and let \\(Z_{k:n}\\) denote the \\(k\\)-th order statistic. We will distinguish two cases, when \\(n\\) is odd and when \\(n\\) is even. Let first \\(n = 2m + 1\\) be odd. Then the sample median is \\(M = Z_{m+1:2m+1}\\). Its PDF is \\[\\begin{align*} f_M(x) = (m+1)\\binom{2m + 1}{m}f_Z(x)\\Big(F_Z(x)^m (1 - F_Z(x)^m) \\Big). \\end{align*}\\] For every symmetric distribution, it holds that \\(F_X(x) = 1 - F(2a - x)\\). Let \\(a = \\mu\\), the population mean. Plugging this into the PDF, we get that \\(f_M(x) = f_M(2\\mu -x)\\). 
It follows that \\[\\begin{align*} E[M] &= E[2\\mu - M] \\\\ 2E[M] &= 2\\mu \\\\ E[M] &= \\mu. \\end{align*}\\] Now let \\(n = 2m\\) be even. Then the sample median is \\(M = \\frac{Z_{m:2m} + Z_{m+1:2m}}{2}\\). It can be shown, that the joint PDF of these terms is also symmetric. Therefore, similar to the above \\[\\begin{align*} E[M] &= E[\\frac{Z_{m:2m} + Z_{m+1:2m}}{2}] \\\\ &= E[\\frac{2\\mu - M + 2\\mu - M}{2}] \\\\ &= E[2\\mu - M]. \\end{align*}\\] The above also proves point a. as the median and the mean are the same in normal distribution. Exercise 13.5 (Matrix trace estimation) The Hutchinson trace estimator [1] is an estimator of the trace of a symmetric positive semidefinite matrix A that relies on Monte Carlo sampling. The estimator is defined as \\[\\begin{align*} \\textrm{tr}(A) \\approx \\frac{1}{n} \\Sigma_{i=1}^n z_i^T A z_i, &\\\\ z_i \\sim_{\\mathrm{IID}} \\textrm{Uniform}(\\{-1, 1\\}^m), & \\end{align*}\\] where \\(A \\in \\mathbb{R}^{m \\times m}\\) is a symmetric positive semidefinite matrix. Elements of each vector \\(z_i\\) are either \\(-1\\) or \\(1\\) with equal probability. This is also called a Rademacher distribution. Data scientists often want the trace of a Hessian to obtain valuable curvature information for a loss function. Per [2], an example is classifying ten digits based on \\((28,28)\\) grayscale images (i.e. MNIST data) using logistic regression. The number of parameters is \\(m = 28^2 \\cdot 10 = 7840\\) and the size of the Hessian is \\(m^2\\), roughly \\(6 \\cdot 10^6\\). The diagonal average is equal to the average eigenvalue, which may be useful for optimization; in MCMC contexts, this would be useful for preconditioners and step size optimization. Computing Hessians (as a means of getting eigenvalue information) is often intractable, but Hessian-vector products can be computed faster by autodifferentiation (with e.g. Tensorflow, Pytorch, Jax). This is one motivation for the use of a stochastic trace estimator as outlined above. References: A stochastic estimator of the trace of the influence matrix for laplacian smoothing splines (Hutchinson, 1990) A Modern Analysis of Hutchinson’s Trace Estimator (Skorski, 2020) Prove that the Hutchinson trace estimator is an unbiased estimator of the trace. Solution. We first simplify our task: \\[\\begin{align} \\mathbb{E}\\left[\\frac{1}{n} \\Sigma_{i=1}^n z_i^T A z_i \\right] &= \\frac{1}{n} \\Sigma_{i=1}^n \\mathbb{E}\\left[z_i^T A z_i \\right] \\\\ &= \\mathbb{E}\\left[z_i^T A z_i \\right], \\end{align}\\] where the second equality is due to having \\(n\\) IID vectors \\(z_i\\). We now only need to show that \\(\\mathbb{E}\\left[z^T A z \\right] = \\mathrm{tr}(A)\\). We omit the index due to all vectors being IID: \\[\\begin{align} \\mathrm{tr}(A) &= \\mathrm{tr}(AI) \\\\ &= \\mathrm{tr}(A\\mathbb{E}[zz^T]) \\\\ &= \\mathbb{E}[\\mathrm{tr}(Azz^T)] \\\\ &= \\mathbb{E}[\\mathrm{tr}(z^TAz)] \\\\ &= \\mathbb{E}[z^TAz]. \\end{align}\\] This concludes the proof. We clarify some equalities below. The second equality assumes that \\(\\mathbb{E}[zz^T] = I\\). By noting that the mean of the Rademacher distribution is 0, we have \\[\\begin{align} \\mathrm{Cov}[z, z] &= \\mathbb{E}[(z - \\mathbb{E}[z])(z - \\mathbb{E}[z])^T] \\\\ &= \\mathbb{E}[zz^T]. \\end{align}\\] Dimensions of \\(z\\) are independent, so \\(\\mathrm{Cov}[z, z]_{ij} = 0\\) for \\(i \\neq j\\). 
The diagonal will contain variances, which are equal to \\(1\\) for all dimensions \\(k = 1 \\dots m\\): \\(\\mathrm{Var}[z^{(k)}] = \\mathbb{E}[z^{(k)}z^{(k)}] - \\mathbb{E}[z^{(k)}]^2 = 1 - 0 = 1\\). It follows that the covariance is an identity matrix. Note that this is a general result for vectors with IID dimensions sampled from a distribution with mean 0 and variance 1. We could therefore use something else instead of the Rademacher, e.g. \\(z ~ N(0, I)\\). The third equality uses the fact that the expectation of a trace equals the trace of an expectation. If \\(X\\) is a random matrix, then \\(\\mathbb{E}[X]_{ij} = \\mathbb{E}[X_{ij}]\\). Therefore: \\[\\begin{align} \\mathrm{tr}(\\mathbb{E}[X]) &= \\Sigma_{i=1}^m(\\mathbb{E}[X]_{ii}) \\\\ &= \\Sigma_{i=1}^m(\\mathbb{E}[X_{ii}]) \\\\ &= \\mathbb{E}[\\Sigma_{i=1}^m(X_{ii})] \\\\ &= \\mathbb{E}[\\mathrm{tr}(X)], \\end{align}\\] where we used the linearity of the expectation in the third step. The fourth equality uses the fact that \\(\\mathrm{tr}(AB) = \\mathrm{tr}(BA)\\) for any matrices \\(A \\in \\mathbb{R}^{n \\times m}, B \\in \\mathbb{R}^{m \\times n}\\). The last inequality uses the fact that the trace of a \\(1 \\times 1\\) matrix is just its element. "],["boot.html", "Chapter 14 Bootstrap", " Chapter 14 Bootstrap This chapter deals with bootstrap. The students are expected to acquire the following knowledge: How to use bootstrap to generate coverage intervals. .fold-btn { float: right; margin: 5px 5px 0 0; } .fold { border: 1px solid black; min-height: 40px; } Exercise 14.1 Ideally, a \\(1-\\alpha\\) CI would have \\(1-\\alpha\\) coverage. That is, say a 95% CI should, in the long run, contain the true value of the parameter 95% of the time. In practice, it is impossible to assess the coverage of our CI method, because we rarely know the true parameter. In simulation, however, we can. Let’s assess the coverage of bootstrap percentile intervals. Pick a univariate distribution with readily available mean and one that you can easily sample from. Draw \\(n = 30\\) random samples from the chosen distribution and use the bootstrap (with large enough m) and percentile CI method to construct 95% CI. Repeat the process many times and count how many times the CI contains the true mean. That is, compute the actual coverage probability (don’t forget to include the standard error of the coverage probability!). What can you observe? Try one or two different distributions. What can you observe? Repeat (b) and (c) using BCa intervals (R package boot). How does the coverage compare to percentile intervals? As (d) but using intervals based on asymptotic normality (+/- 1.96 SE). How do results from (b), (d), and (e) change if we increase the sample size to n = 200? What about n = 5? library(boot) set.seed(0) nit <- 1000 # Repeat the process "many times" alpha <- 0.05 # CI parameter nboot <- 100 # m parameter for bootstrap ("large enough m") # f: change this to 200 or 5. nsample <- 30 # n = 30 random samples from the chosen distribution. Comment out BCa code if it breaks. 
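# Overview of the simulation below (comments added for readability): each of the
# nit iterations draws a fresh sample of size nsample from three populations
# (standard normal, Beta(1, 2), Poisson(5)), builds a 95% CI for the mean with
# the bootstrap percentile method, the BCa method (package boot), and asymptotic
# normality (+/- 1.96 SE), and records whether each interval covers the true mean
# (0, 1/3, and 5, respectively). The estimated coverages and their standard
# errors are printed after the loop.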
covers <- matrix(nrow = nit, ncol = 3) covers_BCa <- matrix(nrow = nit, ncol = 3) covers_asymp_norm <- matrix(nrow = nit, ncol = 3) isin <- function (x, lower, upper) { (x > lower) & (x < upper) } for (j in 1:nit) { # Repeating many times # a: pick a univariate distribution - standard normal x1 <- rnorm(nsample) # c: one or two different distributions - beta and poisson x2 <- rbeta(nsample, 1, 2) x3 <- rpois(nsample, 5) X1 <- matrix(data = NA, nrow = nsample, ncol = nboot) X2 <- matrix(data = NA, nrow = nsample, ncol = nboot) X3 <- matrix(data = NA, nrow = nsample, ncol = nboot) for (i in 1:nboot) { X1[ ,i] <- sample(x1, nsample, replace = T) X2[ ,i] <- sample(x2, nsample, T) X3[ ,i] <- sample(x3, nsample, T) } X1_func <- apply(X1, 2, mean) X2_func <- apply(X2, 2, mean) X3_func <- apply(X3, 2, mean) X1_quant <- quantile(X1_func, probs = c(alpha / 2, 1 - alpha / 2)) X2_quant <- quantile(X2_func, probs = c(alpha / 2, 1 - alpha / 2)) X3_quant <- quantile(X3_func, probs = c(alpha / 2, 1 - alpha / 2)) covers[j,1] <- (0 > X1_quant[1]) & (0 < X1_quant[2]) covers[j,2] <- ((1 / 3) > X2_quant[1]) & ((1 / 3) < X2_quant[2]) covers[j,3] <- (5 > X3_quant[1]) & (5 < X3_quant[2]) mf <- function (x, i) return(mean(x[i])) bootX1 <- boot(x1, statistic = mf, R = nboot) bootX2 <- boot(x2, statistic = mf, R = nboot) bootX3 <- boot(x3, statistic = mf, R = nboot) X1_quant_BCa <- boot.ci(bootX1, type = "bca")$bca X2_quant_BCa <- boot.ci(bootX2, type = "bca")$bca X3_quant_BCa <- boot.ci(bootX3, type = "bca")$bca covers_BCa[j,1] <- (0 > X1_quant_BCa[4]) & (0 < X1_quant_BCa[5]) covers_BCa[j,2] <- ((1 / 3) > X2_quant_BCa[4]) & ((1 / 3) < X2_quant_BCa[5]) covers_BCa[j,3] <- (5 > X3_quant_BCa[4]) & (5 < X3_quant_BCa[5]) # e: estimate mean and standard error # sample mean: x1_bar <- mean(x1) x2_bar <- mean(x2) x3_bar <- mean(x3) # standard error (of the sample mean) estimate: sample standard deviation / sqrt(n) x1_bar_SE <- sd(x1) / sqrt(nsample) x2_bar_SE <- sd(x2) / sqrt(nsample) x3_bar_SE <- sd(x3) / sqrt(nsample) covers_asymp_norm[j,1] <- isin(0, x1_bar - 1.96 * x1_bar_SE, x1_bar + 1.96 * x1_bar_SE) covers_asymp_norm[j,2] <- isin(1/3, x2_bar - 1.96 * x2_bar_SE, x2_bar + 1.96 * x2_bar_SE) covers_asymp_norm[j,3] <- isin(5, x3_bar - 1.96 * x3_bar_SE, x3_bar + 1.96 * x3_bar_SE) } apply(covers, 2, mean) ## [1] 0.918 0.925 0.905 apply(covers, 2, sd) / sqrt(nit) ## [1] 0.008680516 0.008333333 0.009276910 apply(covers_BCa, 2, mean) ## [1] 0.927 0.944 0.927 apply(covers_BCa, 2, sd) / sqrt(nit) ## [1] 0.008230355 0.007274401 0.008230355 apply(covers_asymp_norm, 2, mean) ## [1] 0.939 0.937 0.930 apply(covers_asymp_norm, 2, sd) / sqrt(nit) ## [1] 0.007572076 0.007687008 0.008072494 Exercise 14.2 You are given a sample of independent observations from a process of interest: Index 1 2 3 4 5 6 7 8 X 7 2 4 6 4 5 9 10 Compute the plug-in estimate of mean and 95% symmetric CI based on asymptotic normality. Use the plug-in estimate of SE. Same as (a), but use the unbiased estimate of SE. Apply nonparametric bootstrap with 1000 bootstrap replications and estimate the 95% CI for the mean with percentile-based CI. 
# a x <- c(7, 2, 4, 6, 4, 5, 9, 10) n <- length(x) mu <- mean(x) SE <- sqrt(mean((x - mu)^2)) / sqrt(n) SE ## [1] 0.8915839 z <- qnorm(1 - 0.05 / 2) c(mu - z * SE, mu + z * SE) ## [1] 4.127528 7.622472 # b SE <- sd(x) / sqrt(n) SE ## [1] 0.9531433 c(mu - z * SE, mu + z * SE) ## [1] 4.006873 7.743127 # c set.seed(0) m <- 1000 T_mean <- function(x) {mean(x)} est_boot <- array(NA, m) for (i in 1:m) { x_boot <- x[sample(1:n, n, rep = T)] est_boot[i] <- T_mean(x_boot) } quantile(est_boot, p = c(0.025, 0.975)) ## 2.5% 97.5% ## 4.250 7.625 Exercise 14.3 We are given a sample of 10 independent paired (bivariate) observations: Index 1 2 3 4 5 6 7 8 9 10 X 1.26 -0.33 1.33 1.27 0.41 -1.54 -0.93 -0.29 -0.01 2.40 Y 2.64 0.33 0.48 0.06 -0.88 -2.14 -2.21 0.95 0.83 1.45 Compute Pearson correlation between X and Y. Use the cor.test() from R to estimate a 95% CI for the estimate from (a). Apply nonparametric bootstrap with 1000 bootstrap replications and estimate the 95% CI for the Pearson correlation with percentile-based CI. Compare CI from (b) and (c). Are they similar? How would the bootstrap estimation of CI change if we were interested in Spearman or Kendall correlation instead? x <- c(1.26, -0.33, 1.33, 1.27, 0.41, -1.54, -0.93, -0.29, -0.01, 2.40) y <- c(2.64, 0.33, 0.48, 0.06, -0.88, -2.14, -2.21, 0.95, 0.83, 1.45) # a cor(x, y) ## [1] 0.6991247 # b res <- cor.test(x, y) res$conf.int[1:2] ## [1] 0.1241458 0.9226238 # c set.seed(0) m <- 1000 n <- length(x) T_cor <- function(x, y) {cor(x, y)} est_boot <- array(NA, m) for (i in 1:m) { idx <- sample(1:n, n, rep = T) # !!! important to use same indices to keep dependency between x and y est_boot[i] <- T_cor(x[idx], y[idx]) } quantile(est_boot, p = c(0.025, 0.975)) ## 2.5% 97.5% ## 0.2565537 0.9057664 # d # Yes, but the bootstrap CI is more narrow. # e # We just use the functions for Kendall/Spearman coefficients instead: T_kendall <- function(x, y) {cor(x, y, method = "kendall")} T_spearman <- function(x, y) {cor(x, y, method = "spearman")} # Put this in a function that returns the CI bootstrap_95_ci <- function(x, y, t, m = 1000) { n <- length(x) est_boot <- array(NA, m) for (i in 1:m) { idx <- sample(1:n, n, rep = T) # !!! important to use same indices to keep dependency between x and y est_boot[i] <- t(x[idx], y[idx]) } quantile(est_boot, p = c(0.025, 0.975)) } bootstrap_95_ci(x, y, T_kendall) ## 2.5% 97.5% ## -0.08108108 0.78378378 bootstrap_95_ci(x, y, T_spearman) ## 2.5% 97.5% ## -0.1701115 0.8867925 Exercise 14.4 In this problem we will illustrate the use of the nonparametric bootstrap for estimating CIs of regression model coefficients. Load the longley dataset from base R with data(longley). Use lm() to apply linear regression using “Employed” as the target (dependent) variable and all other variables as the predictors (independent). Using lm() results, print the estimated regression coefficients and standard errors. Estimate 95% CI for the coefficients using +/- 1.96 * SE. Use nonparametric bootstrap with 100 replications to estimate the SE of the coefficients from (b). Compare the SE from (c) with those from (b). # a data(longley) # b res <- lm(Employed ~ . 
, longley) tmp <- data.frame(summary(res)$coefficients[,1:2]) tmp$LB <- tmp[,1] - 1.96 * tmp[,2] tmp$UB <- tmp[,1] + 1.96 * tmp[,2] tmp ## Estimate Std..Error LB UB ## (Intercept) -3.482259e+03 8.904204e+02 -5.227483e+03 -1.737035e+03 ## GNP.deflator 1.506187e-02 8.491493e-02 -1.513714e-01 1.814951e-01 ## GNP -3.581918e-02 3.349101e-02 -1.014616e-01 2.982320e-02 ## Unemployed -2.020230e-02 4.883997e-03 -2.977493e-02 -1.062966e-02 ## Armed.Forces -1.033227e-02 2.142742e-03 -1.453204e-02 -6.132495e-03 ## Population -5.110411e-02 2.260732e-01 -4.942076e-01 3.919994e-01 ## Year 1.829151e+00 4.554785e-01 9.364136e-01 2.721889e+00 # c set.seed(0) m <- 100 n <- nrow(longley) T_coef <- function(x) { lm(Employed ~ . , x)$coefficients } est_boot <- array(NA, c(m, ncol(longley))) for (i in 1:m) { idx <- sample(1:n, n, rep = T) est_boot[i,] <- T_coef(longley[idx,]) } SE <- apply(est_boot, 2, sd) SE ## [1] 1.826011e+03 1.605981e-01 5.693746e-02 8.204892e-03 3.802225e-03 ## [6] 3.907527e-01 9.414436e-01 # Show the standard errors around coefficients library(ggplot2) library(reshape2) df <- data.frame(index = 1:7, bootstrap_SE = SE, lm_SE = tmp$Std..Error) melted_df <- melt(df[2:nrow(df), ], id.vars = "index") # Ignore bias which has a really large magnitude ggplot(melted_df, aes(x = index, y = value, fill = variable)) + geom_bar(stat="identity", position="dodge") + xlab("Coefficient") + ylab("Standard error") # + scale_y_continuous(trans = "log") # If you want to also plot bias Exercise 14.5 This exercise shows a shortcoming of the bootstrap method when using the plug in estimator for the maximum. Compute the 95% bootstrap CI for the maximum of a standard normal distribution. Compute the 95% bootstrap CI for the maximum of a binomial distribution with n = 15 and p = 0.2. Repeat (b) using p = 0.9. Why is the result different? # bootstrap CI for maximum alpha <- 0.05 T_max <- function(x) {max(x)} # Equal to T_max = max bootstrap <- function(x, t, m = 1000) { n <- length(x) values <- rep(0, m) for (i in 1:m) { values[i] <- t(sample(x, n, replace = T)) } quantile(values, probs = c(alpha / 2, 1 - alpha / 2)) } # a # Meaningless, as the normal distribution can yield arbitrarily large values. x <- rnorm(100) bootstrap(x, T_max) ## 2.5% 97.5% ## 1.819425 2.961743 # b x <- rbinom(100, size = 15, prob = 0.2) # min = 0, max = 15 bootstrap(x, T_max) ## 2.5% 97.5% ## 6 7 # c x <- rbinom(100, size = 15, prob = 0.9) # min = 0, max = 15 bootstrap(x, T_max) ## 2.5% 97.5% ## 15 15 # Observation: to estimate the maximum, we need sufficient probability mass near the maximum value the distribution can yield. # Using bootstrap is pointless when there is too little mass near the true maximum. # In general, bootstrap will fail when estimating the CI for the maximum. "],["ml.html", "Chapter 15 Maximum likelihood 15.1 Deriving MLE 15.2 Fisher information 15.3 The German tank problem", " Chapter 15 Maximum likelihood This chapter deals with maximum likelihood estimation. The students are expected to acquire the following knowledge: How to derive MLE. Applying MLE in R. Calculating and interpreting Fisher information. Practical use of MLE. .fold-btn { float: right; margin: 5px 5px 0 0; } .fold { border: 1px solid black; min-height: 40px; } 15.1 Deriving MLE Exercise 15.1 Derive the maximum likelihood estimator of variance for N\\((\\mu, \\sigma^2)\\). Compare with results from 13.3. What does that say about the MLE estimator? Solution. 
The mean is assumed constant, so we have the likelihood \\[\\begin{align} L(\\sigma^2; y) &= \\prod_{i=1}^n \\frac{1}{\\sqrt{2 \\pi \\sigma^2}} e^{-\\frac{(y_i - \\mu)^2}{2 \\sigma^2}} \\\\ &= \\frac{1}{\\sqrt{2 \\pi \\sigma^2}^n} e^{\\frac{-\\sum_{i=1}^n (y_i - \\mu)^2}{2 \\sigma^2}} \\end{align}\\] We need to find the maximum of this function. We first observe that we can replace \\(\\frac{-\\sum_{i=1}^n (y_i - \\mu)^2}{2}\\) with a constant \\(c\\), since none of the terms are dependent on \\(\\sigma^2\\). Additionally, the term \\(\\frac{1}{\\sqrt{2 \\pi}^n}\\) does not affect the calculation of the maximum. So now we have \\[\\begin{align} L(\\sigma^2; y) &= (\\sigma^2)^{-\\frac{n}{2}} e^{\\frac{c}{\\sigma^2}}. \\end{align}\\] Differentiating we get \\[\\begin{align} \\frac{d}{d \\sigma^2} L(\\sigma^2; y) &= (\\sigma^2)^{-\\frac{n}{2}} \\frac{d}{d \\sigma^2} e^{\\frac{c}{\\sigma^2}} + e^{\\frac{c}{\\sigma^2}} \\frac{d}{d \\sigma^2} (\\sigma^2)^{-\\frac{n}{2}} \\\\ &= - (\\sigma^2)^{-\\frac{n}{2}} e^{\\frac{c}{\\sigma^2}} \\frac{c}{(\\sigma^2)^2} - e^{\\frac{c}{\\sigma^2}} \\frac{n}{2} (\\sigma^2)^{-\\frac{n + 2}{2}} \\\\ &= - (\\sigma^2)^{-\\frac{n + 4}{2}} e^{\\frac{c}{\\sigma^2}} c - e^{\\frac{c}{\\sigma^2}} \\frac{n}{2} (\\sigma^2)^{-\\frac{n + 2}{2}} \\\\ &= - e^{\\frac{c}{\\sigma^2}} (\\sigma^2)^{-\\frac{n + 4}{2}} \\Big(c + \\frac{n}{2}\\sigma^2 \\Big). \\end{align}\\] To get the maximum, this has to equal to 0, so \\[\\begin{align} c + \\frac{n}{2}\\sigma^2 &= 0 \\\\ \\sigma^2 &= -\\frac{2c}{n} \\\\ \\sigma^2 &= \\frac{\\sum_{i=1}^n (Y_i - \\mu)^2}{n}. \\end{align}\\] The MLE estimator is biased. Exercise 15.2 (Multivariate normal distribution) Derive the maximum likelihood estimate for the mean and covariance matrix of the multivariate normal. Simulate \\(n = 40\\) samples from a bivariate normal distribution (choose non-trivial parameters, that is, mean \\(\\neq 0\\) and covariance \\(\\neq 0\\)). Compute the MLE for the sample. Overlay the data with an ellipse that is determined by the MLE and an ellipse that is determined by the chosen true parameters. Repeat b. several times and observe how the estimates (ellipses) vary around the true value. Hint: For the derivation of MLE, these identities will be helpful: \\(\\frac{\\partial b^T a}{\\partial a} = \\frac{\\partial a^T b}{\\partial a} = b\\), \\(\\frac{\\partial a^T A a}{\\partial a} = (A + A^T)a\\), \\(\\frac{\\partial \\text{tr}(BA)}{\\partial A} = B^T\\), \\(\\frac{\\partial \\ln |A|}{\\partial A} = (A^{-1})^T\\), \\(a^T A a = \\text{tr}(a^T A a) = \\text{tr}(a a^T A) = \\text{tr}(Aaa^T)\\). Solution. The log likelihood of the MVN distribution is \\[\\begin{align*} l(\\mu, \\Sigma ; x) &= -\\frac{1}{2}\\Big(\\sum_{i=1}^n k\\ln(2\\pi) + |\\Sigma| + (x_i - \\mu)^T \\Sigma^{-1} (x_i - \\mu)\\Big) \\\\ &= -\\frac{n}{2}\\ln|\\Sigma| + -\\frac{1}{2}\\Big(\\sum_{i=1}^n(x_i - \\mu)^T \\Sigma^{-1} (x_i - \\mu)\\Big) + c, \\end{align*}\\] where \\(c\\) is a constant with respect to \\(\\mu\\) and \\(\\Sigma\\). To find the MLE we first need to find partial derivatives. Let us start with \\(\\mu\\). \\[\\begin{align*} \\frac{\\partial}{\\partial \\mu}l(\\mu, \\Sigma ; x) &= \\frac{\\partial}{\\partial \\mu} -\\frac{1}{2}\\Big(\\sum_{i=1}^n x_i^T \\Sigma^{-1} x_i - x_i^T \\Sigma^{-1} \\mu - \\mu^T \\Sigma^{-1} x_i + \\mu^T \\Sigma^{-1} \\mu \\Big) \\\\ &= -\\frac{1}{2}\\Big(\\sum_{i=1}^n - \\Sigma^{-1} x_i - \\Sigma^{-1} x_i + 2 \\Sigma^{-1} \\mu \\Big) \\\\ &= -\\Sigma^{-1}\\Big(\\sum_{i=1}^n - x_i + \\mu \\Big). 
\\end{align*}\\] Equating above with zero, we get \\[\\begin{align*} \\sum_{i=1}^n - x_i + \\mu &= 0 \\\\ \\hat{\\mu} = \\frac{1}{n} \\sum_{i=1}^n x_i, \\end{align*}\\] which is the dimension-wise empirical mean. Now for the covariance matrix \\[\\begin{align*} \\frac{\\partial}{\\partial \\Sigma^{-1}}l(\\mu, \\Sigma ; x) &= \\frac{\\partial}{\\partial \\Sigma^{-1}} -\\frac{n}{2}\\ln|\\Sigma| + -\\frac{1}{2}\\Big(\\sum_{i=1}^n(x_i - \\mu)^T \\Sigma^{-1} (x_i - \\mu)\\Big) \\\\ &= \\frac{\\partial}{\\partial \\Sigma^{-1}} -\\frac{n}{2}\\ln|\\Sigma| + -\\frac{1}{2}\\Big(\\sum_{i=1}^n \\text{tr}((x_i - \\mu)^T \\Sigma^{-1} (x_i - \\mu))\\Big) \\\\ &= \\frac{\\partial}{\\partial \\Sigma^{-1}} -\\frac{n}{2}\\ln|\\Sigma| + -\\frac{1}{2}\\Big(\\sum_{i=1}^n \\text{tr}((\\Sigma^{-1} (x_i - \\mu) (x_i - \\mu)^T )\\Big) \\\\ &= \\frac{n}{2}\\Sigma + -\\frac{1}{2}\\Big(\\sum_{i=1}^n (x_i - \\mu) (x_i - \\mu)^T \\Big). \\end{align*}\\] Equating above with zero, we get \\[\\begin{align*} \\hat{\\Sigma} = \\frac{1}{n}\\sum_{i=1}^n (x_i - \\mu) (x_i - \\mu)^T. \\end{align*}\\] set.seed(1) n <- 40 mu <- c(1, -2) Sigma <- matrix(data = c(2, -1.6, -1.6, 1.8), ncol = 2) X <- mvrnorm(n = n, mu = mu, Sigma = Sigma) colnames(X) <- c("X1", "X2") X <- as.data.frame(X) # plot.new() tru_ellip <- ellipse(mu, Sigma, draw = FALSE) colnames(tru_ellip) <- c("X1", "X2") tru_ellip <- as.data.frame(tru_ellip) mu_est <- apply(X, 2, mean) tmp <- as.matrix(sweep(X, 2, mu_est)) Sigma_est <- (1 / n) * t(tmp) %*% tmp est_ellip <- ellipse(mu_est, Sigma_est, draw = FALSE) colnames(est_ellip) <- c("X1", "X2") est_ellip <- as.data.frame(est_ellip) ggplot(data = X, aes(x = X1, y = X2)) + geom_point() + geom_path(data = tru_ellip, aes(x = X1, y = X2, color = "truth")) + geom_path(data = est_ellip, aes(x = X1, y = X2, color = "estimated")) + labs(color = "type") Exercise 15.3 (Logistic regression) Logistic regression is a popular discriminative model when our target variable is binary (categorical with 2 values). One of the ways of looking at logistic regression is that it is linear regression but instead of using the linear term as the mean of a normal RV, we use it as the mean of a Bernoulli RV. Of course, the mean of a Bernoulli is bounded on \\([0,1]\\), so, to avoid non-sensical values, we squeeze the linear between 0 and 1 with the inverse logit function inv_logit\\((z) = 1 / (1 + e^{-z})\\). This leads to the following model: \\(y_i | \\beta, x_i \\sim \\text{Bernoulli}(\\text{inv_logit}(\\beta x_i))\\). Explicitly write the likelihood function of beta. Implement the likelihood function in R. Use black-box box-constraint optimization (for example, optim() with L-BFGS) to find the maximum likelihood estimate for beta for \\(x\\) and \\(y\\) defined below. Plot the estimated probability as a function of the independent variable. Compare with the truth. Let \\(y2\\) be a response defined below. Will logistic regression work well on this dataset? Why not? How can we still use the model, without changing it? inv_log <- function (z) { return (1 / (1 + exp(-z))) } set.seed(1) x <- rnorm(100) y <- rbinom(100, size = 1, prob = inv_log(1.2 * x)) y2 <- rbinom(100, size = 1, prob = inv_log(1.2 * x + 1.4 * x^2)) Solution. \\[\\begin{align*} l(\\beta; x, y) &= p(y | x, \\beta) \\\\ &= \\ln(\\prod_{i=1}^n \\text{inv_logit}(\\beta x_i)^{y_i} (1 - \\text{inv_logit}(\\beta x_i))^{1 - y_i}) \\\\ &= \\sum_{i=1}^n y_i \\ln(\\text{inv_logit}(\\beta x_i)) + (1 - y_i) \\ln(1 - \\text{inv_logit}(\\beta x_i)). 
\\end{align*}\\] set.seed(1) inv_log <- function (z) { return (1 / (1 + exp(-z))) } x <- rnorm(100) y <- x y <- rbinom(100, size = 1, prob = inv_log(1.2 * x)) l_logistic <- function (beta, X, y) { logl <- -sum(y * log(inv_log(as.vector(beta %*% X))) + (1 - y) * log((1 - inv_log(as.vector(beta %*% X))))) return(logl) } my_optim <- optim(par = 0.5, fn = l_logistic, method = "L-BFGS-B", lower = 0, upper = 10, X = x, y = y) my_optim$par ## [1] 1.166558 truth_p <- data.frame(x = x, prob = inv_log(1.2 * x), type = "truth") est_p <- data.frame(x = x, prob = inv_log(my_optim$par * x), type = "estimated") plot_df <- rbind(truth_p, est_p) ggplot(data = plot_df, aes(x = x, y = prob, color = type)) + geom_point(alpha = 0.3) y2 <- rbinom(2000, size = 1, prob = inv_log(1.2 * x + 1.4 * x^2)) X2 <- cbind(x, x^2) my_optim2 <- optim(par = c(0, 0), fn = l_logistic, method = "L-BFGS-B", lower = c(0, 0), upper = c(2, 2), X = t(X2), y = y2) my_optim2$par ## [1] 1.153656 1.257649 tmp <- sweep(data.frame(x = x, x2 = x^2), 2, my_optim2$par, FUN = "*") tmp <- tmp[ ,1] + tmp[ ,2] truth_p <- data.frame(x = x, prob = inv_log(1.2 * x + 1.4 * x^2), type = "truth") est_p <- data.frame(x = x, prob = inv_log(tmp), type = "estimated") plot_df <- rbind(truth_p, est_p) ggplot(data = plot_df, aes(x = x, y = prob, color = type)) + geom_point(alpha = 0.3) Exercise 15.4 (Linear regression) For the data generated below, do the following: Compute the least squares (MLE) estimate of coefficients beta using the matrix exact solution. Compute the MLE by minimizing the sum of squared residuals using black-box optimization (optim()). Compute the MLE by using the output built-in linear regression (lm() ). Compare (a-c and the true coefficients). Compute 95% CI on the beta coefficients using the output of built-in linear regression. Compute 95% CI on the beta coefficients by using (a or b) and the bootstrap with percentile method for CI. Compare with d. 
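# Roadmap for the code below (comments added for readability): after the
# data-generating snippet from the exercise, the coefficients are estimated in
# three ways (numerical minimization of the residual sum of squares with optim(),
# lm(), and the normal equations solve(t(X) %*% X) %*% t(X) %*% y); 95% CIs are
# then obtained from the lm() standard errors and from a nonparametric bootstrap
# with percentile intervals.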
set.seed(1) n <- 100 x1 <- rnorm(n) x2 <- rnorm(n) x3 <- rnorm(n) X <- cbind(x1, x2, x3) beta <- c(0.2, 0.6, -1.2) y <- as.vector(t(beta %*% t(X))) + rnorm(n, sd = 0.2) set.seed(1) n <- 100 x1 <- rnorm(n) x2 <- rnorm(n) x3 <- rnorm(n) X <- cbind(x1, x2, x3) beta <- c(0.2, 0.6, -1.2) y <- as.vector(t(beta %*% t(X))) + rnorm(n, sd = 0.2) LS_fun <- function (beta, X, y) { return(sum((y - beta %*% t(X))^2)) } my_optim <- optim(par = c(0, 0, 0), fn = LS_fun, lower = -5, upper = 5, X = X, y = y, method = "L-BFGS-B") my_optim$par ## [1] 0.1898162 0.5885946 -1.1788264 df <- data.frame(y = y, x1 = x1, x2 = x2, x3 = x3) my_lm <- lm(y ~ x1 + x2 + x3 - 1, data = df) my_lm ## ## Call: ## lm(formula = y ~ x1 + x2 + x3 - 1, data = df) ## ## Coefficients: ## x1 x2 x3 ## 0.1898 0.5886 -1.1788 # matrix solution beta_hat <- solve(t(X) %*% X) %*% t(X) %*% y beta_hat ## [,1] ## x1 0.1898162 ## x2 0.5885946 ## x3 -1.1788264 out <- summary(my_lm) out$coefficients[ ,2] ## x1 x2 x3 ## 0.02209328 0.02087542 0.01934506 # bootstrap CI nboot <- 1000 beta_boot <- matrix(data = NA, ncol = length(beta), nrow = nboot) for (i in 1:nboot) { inds <- sample(1:n, n, replace = T) new_df <- df[inds, ] X_tmp <- as.matrix(new_df[ ,-1]) y_tmp <- new_df[ ,1] # print(nrow(new_df)) tmp_beta <- solve(t(X_tmp) %*% X_tmp) %*% t(X_tmp) %*% y_tmp beta_boot[i, ] <- tmp_beta } apply(beta_boot, 2, mean) ## [1] 0.1893281 0.5887068 -1.1800738 apply(beta_boot, 2, quantile, probs = c(0.025, 0.975)) ## [,1] [,2] [,3] ## 2.5% 0.1389441 0.5436911 -1.221560 ## 97.5% 0.2386295 0.6363102 -1.140416 out$coefficients[ ,2] ## x1 x2 x3 ## 0.02209328 0.02087542 0.01934506 Exercise 15.5 (Principal component analysis) Load the olympic data set from package ade4. The data show decathlon results for 33 men in 1988 Olympic Games. This data set serves as a great example of finding the latent structure in the data, as there are certain characteristics of the athletes that make them excel at different events. For example an explosive athlete will do particulary well in sprints and long jumps. Perform PCA (prcomp) on the data set and interpret the first 2 latent dimensions. Hint: Standardize the data first to get meaningful results. Use MLE to estimate the covariance of the standardized multivariate distribution. Decompose the estimated covariance matrix with the eigendecomposition. Compare the eigenvectors to the output of PCA. 
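# Roadmap for the code below (comments added for readability): the decathlon
# results are standardized with scale(), prcomp() computes the principal
# components, the MLE of the covariance of the standardized data
# ((1 / n) * t(X_scaled) %*% X_scaled) is eigendecomposed, and its eigenvectors
# agree with the prcomp() rotation up to the sign of each column.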
data(olympic) X <- olympic$tab X_scaled <- scale(X) my_pca <- prcomp(X_scaled) summary(my_pca) ## Importance of components: ## PC1 PC2 PC3 PC4 PC5 PC6 PC7 ## Standard deviation 1.8488 1.6144 0.97123 0.9370 0.74607 0.70088 0.65620 ## Proportion of Variance 0.3418 0.2606 0.09433 0.0878 0.05566 0.04912 0.04306 ## Cumulative Proportion 0.3418 0.6025 0.69679 0.7846 0.84026 0.88938 0.93244 ## PC8 PC9 PC10 ## Standard deviation 0.55389 0.51667 0.31915 ## Proportion of Variance 0.03068 0.02669 0.01019 ## Cumulative Proportion 0.96312 0.98981 1.00000 autoplot(my_pca, data = X, loadings = TRUE, loadings.colour = 'blue', loadings.label = TRUE, loadings.label.size = 3) Sigma_est <- (1 / nrow(X_scaled)) * t(X_scaled) %*% X_scaled Sigma_dec <- eigen(Sigma_est) Sigma_dec$vectors ## [,1] [,2] [,3] [,4] [,5] [,6] ## [1,] 0.4158823 0.1488081 -0.26747198 -0.08833244 -0.442314456 0.03071237 ## [2,] -0.3940515 -0.1520815 -0.16894945 -0.24424963 0.368913901 -0.09378242 ## [3,] -0.2691057 0.4835374 0.09853273 -0.10776276 -0.009754680 0.23002054 ## [4,] -0.2122818 0.0278985 -0.85498656 0.38794393 -0.001876311 0.07454380 ## [5,] 0.3558474 0.3521598 -0.18949642 0.08057457 0.146965351 -0.32692886 ## [6,] 0.4334816 0.0695682 -0.12616012 -0.38229029 -0.088802794 0.21049130 ## [7,] -0.1757923 0.5033347 0.04609969 0.02558404 0.019358607 0.61491241 ## [8,] -0.3840821 0.1495820 0.13687235 0.14396548 -0.716743474 -0.34776037 ## [9,] -0.1799436 0.3719570 -0.19232803 -0.60046566 0.095582043 -0.43744387 ## [10,] 0.1701426 0.4209653 0.22255233 0.48564231 0.339772188 -0.30032419 ## [,7] [,8] [,9] [,10] ## [1,] 0.2543985 0.663712826 -0.10839531 0.10948045 ## [2,] 0.7505343 0.141264141 0.04613910 0.05580431 ## [3,] -0.1106637 0.072505560 0.42247611 0.65073655 ## [4,] -0.1351242 -0.155435871 -0.10206505 0.11941181 ## [5,] 0.1413388 -0.146839303 0.65076229 -0.33681395 ## [6,] 0.2725296 -0.639003579 -0.20723854 0.25971800 ## [7,] 0.1439726 0.009400445 -0.16724055 -0.53450315 ## [8,] 0.2732665 -0.276873049 -0.01766443 -0.06589572 ## [9,] -0.3419099 0.058519366 -0.30619617 -0.13093187 ## [10,] 0.1868704 0.007310045 -0.45688227 0.24311846 my_pca$rotation ## PC1 PC2 PC3 PC4 PC5 PC6 ## 100 -0.4158823 0.1488081 0.26747198 -0.08833244 -0.442314456 0.03071237 ## long 0.3940515 -0.1520815 0.16894945 -0.24424963 0.368913901 -0.09378242 ## poid 0.2691057 0.4835374 -0.09853273 -0.10776276 -0.009754680 0.23002054 ## haut 0.2122818 0.0278985 0.85498656 0.38794393 -0.001876311 0.07454380 ## 400 -0.3558474 0.3521598 0.18949642 0.08057457 0.146965351 -0.32692886 ## 110 -0.4334816 0.0695682 0.12616012 -0.38229029 -0.088802794 0.21049130 ## disq 0.1757923 0.5033347 -0.04609969 0.02558404 0.019358607 0.61491241 ## perc 0.3840821 0.1495820 -0.13687235 0.14396548 -0.716743474 -0.34776037 ## jave 0.1799436 0.3719570 0.19232803 -0.60046566 0.095582043 -0.43744387 ## 1500 -0.1701426 0.4209653 -0.22255233 0.48564231 0.339772188 -0.30032419 ## PC7 PC8 PC9 PC10 ## 100 0.2543985 -0.663712826 0.10839531 -0.10948045 ## long 0.7505343 -0.141264141 -0.04613910 -0.05580431 ## poid -0.1106637 -0.072505560 -0.42247611 -0.65073655 ## haut -0.1351242 0.155435871 0.10206505 -0.11941181 ## 400 0.1413388 0.146839303 -0.65076229 0.33681395 ## 110 0.2725296 0.639003579 0.20723854 -0.25971800 ## disq 0.1439726 -0.009400445 0.16724055 0.53450315 ## perc 0.2732665 0.276873049 0.01766443 0.06589572 ## jave -0.3419099 -0.058519366 0.30619617 0.13093187 ## 1500 0.1868704 -0.007310045 0.45688227 -0.24311846 15.2 Fisher information Exercise 15.6 Let us assume a Poisson likelihood. 
Derive the MLE estimate of the mean. Derive the Fisher information. For the data below compute the MLE and construct confidence intervals. Use bootstrap to construct the CI for the mean. Compare with c) and discuss. x <- c(2, 5, 3, 1, 2, 1, 0, 3, 0, 2) Solution. The log likelihood of the Poisson is \\[\\begin{align*} l(\\lambda; x) = \\sum_{i=1}^n x_i \\ln \\lambda - n \\lambda - \\sum_{i=1}^n \\ln x_i! \\end{align*}\\] Taking the derivative and equating with 0 we get \\[\\begin{align*} \\frac{1}{\\hat{\\lambda}}\\sum_{i=1}^n x_i - n &= 0 \\\\ \\hat{\\lambda} &= \\frac{1}{n} \\sum_{i=1}^n x_i. \\end{align*}\\] Since \\(\\lambda\\) is the mean parameter, this was expected. For the Fischer information, we first need the second derivative, which is \\[\\begin{align*} - \\lambda^{-2} \\sum_{i=1}^n x_i. \\\\ \\end{align*}\\] Now taking the expectation of the negative of the above, we get \\[\\begin{align*} E[\\lambda^{-2} \\sum_{i=1}^n x_i] &= \\lambda^{-2} E[\\sum_{i=1}^n x_i] \\\\ &= \\lambda^{-2} n \\lambda \\\\ &= \\frac{n}{\\lambda}. \\end{align*}\\] set.seed(1) x <- c(2, 5, 3, 1, 2, 1, 0, 3, 0, 2) lambda_hat <- mean(x) finfo <- length(x) / lambda_hat mle_CI <- c(lambda_hat - 1.96 * sqrt(1 / finfo), lambda_hat + 1.96 * sqrt(1 / finfo)) boot_lambda <- c() nboot <- 1000 for (i in 1:nboot) { tmp_x <- sample(x, length(x), replace = T) boot_lambda[i] <- mean(tmp_x) } boot_CI <- c(quantile(boot_lambda, 0.025), quantile(boot_lambda, 0.975)) mle_CI ## [1] 1.045656 2.754344 boot_CI ## 2.5% 97.5% ## 1.0 2.7 Exercise 15.7 Find the Fisher information matrix for the Gamma distribution. Generate 20 samples from a Gamma distribution and plot a confidence ellipse of the inverse of Fisher information matrix around the ML estimates of the parameters. Also plot the theoretical values. Repeat the sampling several times. What do you observe? Discuss what a non-diagonal Fisher matrix implies. Hint: The digamma function is defined as \\(\\psi(x) = \\frac{\\frac{d}{dx} \\Gamma(x)}{\\Gamma(x)}\\). Additionally, you do not need to evaluate \\(\\frac{d}{dx} \\psi(x)\\). To calculate its value in R, use package numDeriv. Solution. The log likelihood of the Gamma is \\[\\begin{equation*} l(\\alpha, \\beta; x) = n \\alpha \\ln \\beta - n \\ln \\Gamma(\\alpha) + (\\alpha - 1) \\sum_{i=1}^n \\ln x_i - \\beta \\sum_{i=1}^n x_i. \\end{equation*}\\] Let us calculate the derivatives. \\[\\begin{align*} \\frac{\\partial}{\\partial \\alpha} l(\\alpha, \\beta; x) &= n \\ln \\beta - n \\psi(\\alpha) + \\sum_{i=1}^n \\ln x_i, \\\\ \\frac{\\partial}{\\partial \\beta} l(\\alpha, \\beta; x) &= \\frac{n \\alpha}{\\beta} - \\sum_{i=1}^n x_i, \\\\ \\frac{\\partial^2}{\\partial \\alpha \\beta} l(\\alpha, \\beta; x) &= \\frac{n}{\\beta}, \\\\ \\frac{\\partial^2}{\\partial \\alpha^2} l(\\alpha, \\beta; x) &= - n \\frac{\\partial}{\\partial \\alpha} \\psi(\\alpha), \\\\ \\frac{\\partial^2}{\\partial \\beta^2} l(\\alpha, \\beta; x) &= - \\frac{n \\alpha}{\\beta^2}. \\end{align*}\\] The Fisher information matrix is then \\[\\begin{align*} I(\\alpha, \\beta) = - E[ \\begin{bmatrix} - n \\psi'(\\alpha) & \\frac{n}{\\beta} \\\\ \\frac{n}{\\beta} & - \\frac{n \\alpha}{\\beta^2} \\end{bmatrix} ] = \\begin{bmatrix} n \\psi'(\\alpha) & - \\frac{n}{\\beta} \\\\ - \\frac{n}{\\beta} & \\frac{n \\alpha}{\\beta^2} \\end{bmatrix} \\end{align*}\\] A non-diagonal Fisher matrix implies that the parameter estimates are linearly dependent. 
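A side note on the hint (not part of the original solution): base R already provides trigamma(), which is the derivative of the digamma function, so \(\psi'(\alpha)\) can also be evaluated directly and compared against the numerical derivative from numDeriv used in the code below.

```{r}
# trigamma() is the derivative of digamma(), so it gives psi'(alpha) directly.
library(numDeriv)
alpha <- 2.3
trigamma(alpha)       # analytic value of psi'(alpha)
grad(digamma, alpha)  # numerical derivative, should agree to several decimals
```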
set.seed(1) n <- 20 pars_theor <- c(5, 2) x <- rgamma(n, 5, 2) # MLE for alpha and beta log_lik <- function (pars, x) { n <- length(x) return (- (n * pars[1] * log(pars[2]) - n * log(gamma(pars[1])) + (pars[1] - 1) * sum(log(x)) - pars[2] * sum(x))) } my_optim <- optim(par = c(1,1), fn = log_lik, method = "L-BFGS-B", lower = c(0.001, 0.001), upper = c(8, 8), x = x) pars_mle <- my_optim$par fish_mat <- matrix(data = NA, nrow = 2, ncol = 2) fish_mat[1,2] <- - n / pars_mle[2] fish_mat[2,1] <- - n / pars_mle[2] fish_mat[2,2] <- (n * pars_mle[1]) / (pars_mle[2]^2) fish_mat[1,1] <- n * grad(digamma, pars_mle[1]) fish_mat_inv <- solve(fish_mat) est_ellip <- ellipse(pars_mle, fish_mat_inv, draw = FALSE) colnames(est_ellip) <- c("X1", "X2") est_ellip <- as.data.frame(est_ellip) ggplot() + geom_point(data = data.frame(x = pars_mle[1], y = pars_mle[2]), aes(x = x, y = y)) + geom_path(data = est_ellip, aes(x = X1, y = X2)) + geom_point(aes(x = pars_theor[1], y = pars_theor[2]), color = "red") + geom_text(aes(x = pars_theor[1], y = pars_theor[2], label = "Theoretical parameters"), color = "red", nudge_y = -0.2) 15.3 The German tank problem Exercise 15.8 (The German tank problem) During WWII the allied intelligence were faced with an important problem of estimating the total production of certain German tanks, such as the Panther. What turned out to be a successful approach was to estimate the maximum from the serial numbers of the small sample of captured or destroyed tanks (describe the statistical model used). What assumptions were made by using the above model? Do you think they are reasonable assumptions in practice? Show that the plug-in estimate for the maximum (i.e. the maximum of the sample) is a biased estimator. Derive the maximum likelihood estimate of the maximum. Check that the following estimator is not biased: \\(\\hat{n} = \\frac{k + 1}{k}m - 1\\). Solution. The data are the serial numbers of the tanks. The parameter is \\(n\\), the total production of the tank. The distribution of the serial numbers is a discrete uniform distribution over all serial numbers. One of the assumptions is that we have i.i.d samples, however in practice this might not be true, as some tanks produced later could be sent to the field later, therefore already in theory we would not be able to recover some values from the population. To find the expected value we first need to find the distribution of \\(m\\). Let us start with the CDF. \\[\\begin{align*} F_m(x) = P(Y_1 < x,...,Y_k < x). \\end{align*}\\] If \\(x < k\\) then \\(F_m(x) = 0\\) and if \\(x \\geq 1\\) then \\(F_m(x) = 1\\). What about between those values. So the probability that the maximum value is less than or equal to \\(m\\) is just the number of possible draws from \\(Y\\) that are all smaller than \\(m\\), divided by all possible draws. This is \\(\\frac{{x}\\choose{k}}{{n}\\choose{k}}\\). The PDF on the suitable bounds is then \\[\\begin{align*} P(m = x) = F_m(x) - F_m(x - 1) = \\frac{\\binom{x}{k} - \\binom{x - 1}{k}}{\\binom{n}{k}} = \\frac{\\binom{x - 1}{k - 1}}{\\binom{n}{k}}. \\end{align*}\\] Now we can calculate the expected value of \\(m\\) using some combinatorial identities. \\[\\begin{align*} E[m] &= \\sum_{i = k}^n i \\frac{{i - 1}\\choose{k - 1}}{{n}\\choose{k}} \\\\ &= \\sum_{i = k}^n i \\frac{\\frac{(i - 1)!}{(k - 1)!(i - k)!}}{{n}\\choose{k}} \\\\ &= \\frac{k}{\\binom{n}{k}}\\sum_{i = k}^n \\binom{i}{k} \\\\ &= \\frac{k}{\\binom{n}{k}} \\binom{n + 1}{k + 1} \\\\ &= \\frac{k(n + 1)}{k + 1}. 
\\end{align*}\\] The bias of this estimator is then \\[\\begin{align*} E[m] - n = \\frac{k(n + 1)}{k + 1} - n = \\frac{k - n}{k + 1}. \\end{align*}\\] The probability that we observed our sample \\(Y = {Y_1, Y_2,...,,Y_k}\\) given \\(n\\) is \\(\\frac{1}{{n}\\choose{k}}\\). We need to find such \\(n^*\\) that this function is maximized. Additionally, we have a constraint that \\(n^* \\geq m = \\max{(Y)}\\). Let us plot this function for \\(m = 10\\) and \\(k = 4\\). library(ggplot2) my_fun <- function (x, m, k) { tmp <- 1 / (choose(x, k)) tmp[x < m] <- 0 return (tmp) } x <- 1:20 y <- my_fun(x, 10, 4) df <- data.frame(x = x, y = y) ggplot(data = df, aes(x = x, y = y)) + geom_line() ::: {.solution} (continued) We observe that the maximum of this function lies at the maximum value of the sample. Therefore \\(n^* = m\\) and ML estimate equals the plug-in estimate. \\[\\begin{align*} E[\\hat{n}] &= \\frac{k + 1}{k} E[m] - 1 \\\\ &= \\frac{k + 1}{k} \\frac{k(n + 1)}{k + 1} - 1 \\\\ &= n. \\end{align*}\\] ::: "],["nhst.html", "Chapter 16 Null hypothesis significance testing", " Chapter 16 Null hypothesis significance testing This chapter deals with null hypothesis significance testing. The students are expected to acquire the following knowledge: Binomial test. t-test. Chi-squared test. .fold-btn { float: right; margin: 5px 5px 0 0; } .fold { border: 1px solid black; min-height: 40px; } Exercise 16.1 (Binomial test) We assume \\(y_i \\in \\{0,1\\}\\), \\(i = 1,...,n\\) and \\(y_i | \\theta = 0.5 \\sim i.i.d.\\) Bernoulli\\((\\theta)\\). The test statistic is \\(X = \\sum_{i=1}^n\\) and the rejection region R is defined as the region where the probability of obtaining such or more extreme \\(X\\) given \\(\\theta = 0.5\\) is less than 0.05. Derive and plot the power function of the test for \\(n=100\\). What is the significance level of this test if \\(H0: \\theta = 0.5\\)? At which values of X will we reject the null hypothesis? # a # First we need the rejection region, so we need to find X_min and X_max n <- 100 qbinom(0.025, n, 0.5) ## [1] 40 qbinom(0.975, n, 0.5) ## [1] 60 pbinom(40, n, 0.5) ## [1] 0.02844397 pbinom(60, n, 0.5) ## [1] 0.9823999 X_min <- 39 X_max <- 60 thetas <- seq(0, 1, by = 0.01) beta_t <- 1 - pbinom(X_max, size = n, prob = thetas) + pbinom(X_min, size = n, prob = thetas) plot(beta_t) # b # The significance level is beta_t[51] ## [1] 0.0352002 # We will reject the null hypothesis at X values below X_min and above X_max. Exercise 16.2 (Long-run guarantees of the t-test) Generate a sample of size \\(n = 10\\) from the standard normal. Use the two-sided t-test with \\(H0: \\mu = 0\\) and record the p-value. Can you reject H0 at 0.05 significance level? (before simulating) If we repeated (b) many times, what would be the relative frequency of false positives/Type I errors (rejecting the null that is true)? What would be the relative frequency of false negatives /Type II errors (retaining the null when the null is false)? (now simulate b and check if the simulation results match your answer in b) Similar to (a-c) but now we generate data from N(-0.5, 1). Similar to (a-c) but now we generate data from N(\\(\\mu\\), 1) where we every time pick a different \\(\\mu < 0\\) and use a one-sided test \\(H0: \\mu <= 0\\). 
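Before simulating, a useful analytic reference point for part (d): under those assumptions (n = 10 observations from N(-0.5, 1), two-sided one-sample t-test at \(\alpha = 0.05\)) the power, and hence the Type II error rate, can be computed directly with power.t.test. The sketch below is only a cross-check for the simulation that follows.

```{r}
# Analytic power for part (d): n = 10, true mean -0.5, sd = 1, alpha = 0.05.
pwr <- power.t.test(n = 10, delta = 0.5, sd = 1, sig.level = 0.05,
                    type = "one.sample", alternative = "two.sided")
pwr$power      # probability of correctly rejecting H0
1 - pwr$power  # expected long-run share of Type II errors in the simulation
```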
set.seed(2) # a x <- rnorm(10) my_test <- t.test(x, alternative = "two.sided", mu = 0) my_test ## ## One Sample t-test ## ## data: x ## t = 0.6779, df = 9, p-value = 0.5149 ## alternative hypothesis: true mean is not equal to 0 ## 95 percent confidence interval: ## -0.4934661 0.9157694 ## sample estimates: ## mean of x ## 0.2111516 # we can not reject the null hypothesis # b # The expected value of false positives would be 0.05. The expected value of # true negatives would be 0, as there are no negatives (the null hypothesis is # always the truth). nit <- 1000 typeIerr <- vector(mode = "logical", length = nit) typeIIerr <- vector(mode = "logical", length = nit) for (i in 1:nit) { x <- rnorm(10) my_test <- t.test(x, alternative = "two.sided", mu = 0) if (my_test$p.value < 0.05) { typeIerr[i] <- T } else { typeIerr[i] <- F } } mean(typeIerr) ## [1] 0.052 sd(typeIerr) / sqrt(nit) ## [1] 0.007024624 # d # We can not estimate the percentage of true negatives, but it will probably be # higher than 0.05. There will be no false positives as the null hypothesis is # always false. typeIIerr <- vector(mode = "logical", length = nit) for (i in 1:nit) { x <- rnorm(10, -0.5) my_test <- t.test(x, alternative = "two.sided", mu = 0) if (my_test$p.value < 0.05) { typeIIerr[i] <- F } else { typeIIerr[i] <- T } } mean(typeIIerr) ## [1] 0.719 sd(typeIIerr) / sqrt(nit) ## [1] 0.01422115 # e # The expected value of false positives would be lower than 0.05. The expected # value of true negatives would be 0, as there are no negatives (the null # hypothesis is always the truth). typeIerr <- vector(mode = "logical", length = nit) for (i in 1:nit) { u <- runif(1, -1, 0) x <- rnorm(10, u) my_test <- t.test(x, alternative = "greater", mu = 0) if (my_test$p.value < 0.05) { typeIerr[i] <- T } else { typeIerr[i] <- F } } mean(typeIerr) ## [1] 0.012 sd(typeIerr) / sqrt(nit) ## [1] 0.003444977 Exercise 16.3 (T-test, confidence intervals, and bootstrap) Sample \\(n=20\\) from a standard normal distribution and calculate the p-value using t-test, confidence intervals based on normal distribution, and bootstrap. Repeat this several times and check how many times we rejected the null hypothesis (made a type I error). Hint: For the confidence intervals you can use function CI from the Rmisc package. set.seed(1) library(Rmisc) nit <- 1000 n_boot <- 100 t_logic <- rep(F, nit) boot_logic <- rep(F, nit) norm_logic <- rep(F, nit) for (i in 1:nit) { x <- rnorm(20) my_test <- t.test(x) my_CI <- CI(x) if (my_test$p.value <= 0.05) t_logic[i] <- T boot_tmp <- vector(mode = "numeric", length = n_boot) for (j in 1:n_boot) { tmp_samp <- sample(x, size = 20, replace = T) boot_tmp[j] <- mean(tmp_samp) } if ((quantile(boot_tmp, 0.025) >= 0) | (quantile(boot_tmp, 0.975) <= 0)) { boot_logic[i] <- T } if ((my_CI[3] >= 0) | (my_CI[1] <= 0)) { norm_logic[i] <- T } } mean(t_logic) ## [1] 0.053 sd(t_logic) / sqrt(nit) ## [1] 0.007088106 mean(boot_logic) ## [1] 0.093 sd(boot_logic) / sqrt(nit) ## [1] 0.009188876 mean(norm_logic) ## [1] 0.053 sd(norm_logic) / sqrt(nit) ## [1] 0.007088106 Exercise 16.4 (Chi-squared test) Show that the \\(\\chi^2 = \\sum_{i=1}^k \\frac{(O_i - E_i)^2}{E_i}\\) test statistic is approximately \\(\\chi^2\\) distributed when we have two categories. Let us look at the US voting data here. Compare the number of voters who voted for Trump or Hillary depending on their income (less or more than 100.000 dollars per year). Manually calculate the chi-squared statistic, compare to the chisq.test in R, and discuss the results. 
Visualize the test. Solution. Let \\(X_i\\) be binary variables, \\(i = 1,...,n\\). We can then express the test statistic as \\[\\begin{align} \\chi^2 = &\\frac{(O_i - np)^2}{np} + \\frac{(n - O_i - n(1 - p))^2}{n(1 - p)} \\\\ &= \\frac{(O_i - np)^2}{np(1 - p)} \\\\ &= (\\frac{O_i - np}{\\sqrt{np(1 - p)}})^2. \\end{align}\\] When \\(n\\) is large, this distrbution is approximately normal with \\(\\mu = np\\) and \\(\\sigma^2 = np(1 - p)\\) (binomial converges in distribution to standard normal). By definition, the chi-squared distribution with \\(k\\) degrees of freedom is a sum of squares of \\(k\\) independent standard normal random variables. n <- 24588 less100 <- round(0.66 * n * c(0.49, 0.45, 0.06)) # some rounding, but it should not affect results more100 <- round(0.34 * n * c(0.47, 0.47, 0.06)) x <- rbind(less100, more100) colnames(x) <- c("Clinton", "Trump", "other/no answer") print(x) ## Clinton Trump other/no answer ## less100 7952 7303 974 ## more100 3929 3929 502 chisq.test(x) ## ## Pearson's Chi-squared test ## ## data: x ## X-squared = 9.3945, df = 2, p-value = 0.00912 x ## Clinton Trump other/no answer ## less100 7952 7303 974 ## more100 3929 3929 502 csum <- apply(x, 2, sum) rsum <- apply(x, 1, sum) chi2 <- (x[1,1] - csum[1] * rsum[1] / sum(x))^2 / (csum[1] * rsum[1] / sum(x)) + (x[1,2] - csum[2] * rsum[1] / sum(x))^2 / (csum[2] * rsum[1] / sum(x)) + (x[1,3] - csum[3] * rsum[1] / sum(x))^2 / (csum[3] * rsum[1] / sum(x)) + (x[2,1] - csum[1] * rsum[2] / sum(x))^2 / (csum[1] * rsum[2] / sum(x)) + (x[2,2] - csum[2] * rsum[2] / sum(x))^2 / (csum[2] * rsum[2] / sum(x)) + (x[2,3] - csum[3] * rsum[2] / sum(x))^2 / (csum[3] * rsum[2] / sum(x)) chi2 ## Clinton ## 9.394536 1 - pchisq(chi2, df = 2) ## Clinton ## 0.009120161 x <- seq(0, 15, by = 0.01) df <- data.frame(x = x) ggplot(data = df, aes(x = x)) + stat_function(fun = dchisq, args = list(df = 2)) + geom_segment(aes(x = chi2, y = 0, xend = chi2, yend = dchisq(chi2, df = 2))) + stat_function(fun = dchisq, args = list(df = 2), xlim = c(chi2, 15), geom = "area", fill = "red") "],["bi.html", "Chapter 17 Bayesian inference 17.1 Conjugate priors 17.2 Posterior sampling", " Chapter 17 Bayesian inference This chapter deals with Bayesian inference. The students are expected to acquire the following knowledge: How to set prior distribution. Compute posterior distribution. Compute posterior predictive distribution. Use sampling for inference. .fold-btn { float: right; margin: 5px 5px 0 0; } .fold { border: 1px solid black; min-height: 40px; } 17.1 Conjugate priors Exercise 17.1 (Poisson-gamma model) Let us assume a Poisson likelihood and a gamma prior on the Poisson mean parameter (this is a conjugate prior). Derive posterior Below we have some data, which represents number of goals in a football match. Choose sensible prior for this data (draw the gamma density if necessary), justify it. Compute the posterior. Compute an interval such that the probability that the true mean is in there is 95%. What is the probability that the true mean is greater than 2.5? Back to theory: Compute prior predictive and posterior predictive. Discuss why the posterior predictive is overdispersed and not Poisson? Draw a histogram of the prior predictive and posterior predictive for the data from (b). Discuss. Generate 10 and 100 random samples from a Poisson distribution and compare the posteriors with a flat prior, and a prior concentrated away from the truth. x <- c(3, 2, 1, 1, 5, 4, 0, 0, 4, 3) Solution. 
\\[\\begin{align*} p(\\lambda | X) &= \\frac{p(X | \\lambda) p(\\lambda)}{\\int_0^\\infty p(X | \\lambda) p(\\lambda) d\\lambda} \\\\ &\\propto p(X | \\lambda) p(\\lambda) \\\\ &= \\Big(\\prod_{i=1}^n \\frac{1}{x_i!} \\lambda^{x_i} e^{-\\lambda}\\Big) \\frac{\\beta^\\alpha}{\\Gamma(\\alpha)} \\lambda^{\\alpha - 1} e^{-\\beta \\lambda} \\\\ &\\propto \\lambda^{\\sum_{i=1}^n x_i + \\alpha - 1} e^{- \\lambda (n + \\beta)} \\\\ \\end{align*}\\] We recognize this as the shape of a gamma distribution, therefore \\[\\begin{align*} \\lambda | X \\sim \\text{gamma}(\\alpha + \\sum_{i=1}^n x_i, \\beta + n) \\end{align*}\\] For the prior predictive, we have \\[\\begin{align*} p(x^*) &= \\int_0^\\infty p(x^*, \\lambda) d\\lambda \\\\ &= \\int_0^\\infty p(x^* | \\lambda) p(\\lambda) d\\lambda \\\\ &= \\int_0^\\infty \\frac{1}{x^*!} \\lambda^{x^*} e^{-\\lambda} \\frac{\\beta^\\alpha}{\\Gamma(\\alpha)} \\lambda^{\\alpha - 1} e^{-\\beta \\lambda} d\\lambda \\\\ &= \\frac{\\beta^\\alpha}{\\Gamma(x^* + 1)\\Gamma(\\alpha)} \\int_0^\\infty \\lambda^{x^* + \\alpha - 1} e^{-\\lambda (1 + \\beta)} d\\lambda \\\\ &= \\frac{\\beta^\\alpha}{\\Gamma(x^* + 1)\\Gamma(\\alpha)} \\frac{\\Gamma(x^* + \\alpha)}{(1 + \\beta)^{x^* + \\alpha}} \\int_0^\\infty \\frac{(1 + \\beta)^{x^* + \\alpha}}{\\Gamma(x^* + \\alpha)} \\lambda^{x^* + \\alpha - 1} e^{-\\lambda (1 + \\beta)} d\\lambda \\\\ &= \\frac{\\beta^\\alpha}{\\Gamma(x^* + 1)\\Gamma(\\alpha)} \\frac{\\Gamma(x^* + \\alpha)}{(1 + \\beta)^{x^* + \\alpha}} \\\\ &= \\frac{\\Gamma(x^* + \\alpha)}{\\Gamma(x^* + 1)\\Gamma(\\alpha)} (\\frac{\\beta}{1 + \\beta})^\\alpha (\\frac{1}{1 + \\beta})^{x^*}, \\end{align*}\\] which we recognize as the negative binomial distribution with \\(r = \\alpha\\) and \\(p = \\frac{1}{\\beta + 1}\\). For the posterior predictive, the calculation is the same, only now the parameters are \\(r = \\alpha + \\sum_{i=1}^n x_i\\) and \\(p = \\frac{1}{\\beta + n + 1}\\). There are two sources of uncertainty in the predictive distribution. First is the uncertainty about the population. Second is the variability in sampling from the population. When \\(n\\) is large, the latter is going to be very small. But when \\(n\\) is small, the latter is going to be higher, resulting in an overdispersed predictive distribution. 
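The overdispersion claim is easy to check by simulation: draw from the posterior predictive (the negative binomial derived above) and compare its variance to its mean, which would be equal under a Poisson. A small sketch, assuming the gamma(1, 1) prior used in the code below and the goal data from the exercise:

```{r}
# For a Poisson, variance = mean; the posterior predictive should be wider.
set.seed(1)
x      <- c(3, 2, 1, 1, 5, 4, 0, 0, 4, 3)
palpha <- 1
pbeta  <- 1
n      <- length(x)
pp <- rnbinom(100000, size = palpha + sum(x), prob = 1 - 1 / (pbeta + n + 1))
mean(pp)
var(pp)  # slightly larger than the mean -- overdispersed relative to a Poisson
```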
x <- c(3, 2, 1, 1, 5, 4, 0, 0, 4, 3) # b # quick visual check of the prior ggplot(data = data.frame(x = seq(0, 5, by = 0.01)), aes(x = x)) + stat_function(fun = dgamma, args = list(shape = 1, rate = 1)) palpha <- 1 pbeta <- 1 alpha_post <- palpha + sum(x) beta_post <- pbeta + length(x) ggplot(data = data.frame(x = seq(0, 5, by = 0.01)), aes(x = x)) + stat_function(fun = dgamma, args = list(shape = alpha_post, rate = beta_post)) # probability of being higher than 2.5 1 - pgamma(2.5, alpha_post, beta_post) ## [1] 0.2267148 # interval qgamma(c(0.025, 0.975), alpha_post, beta_post) ## [1] 1.397932 3.137390 # d prior_pred <- rnbinom(1000, size = palpha, prob = 1 - 1 / (pbeta + 1)) post_pred <- rnbinom(1000, size = palpha + sum(x), prob = 1 - 1 / (pbeta + 10 + 1)) df <- data.frame(prior = prior_pred, posterior = post_pred) df <- gather(df) ggplot(df, aes(x = value, fill = key)) + geom_histogram(position = "dodge") # e set.seed(1) x1 <- rpois(10, 2.5) x2 <- rpois(100, 2.5) alpha_flat <- 1 beta_flat <- 0.1 alpha_conc <- 50 beta_conc <- 10 n <- 10000 df_flat <- data.frame(x1 = rgamma(n, alpha_flat + sum(x1), beta_flat + 10), x2 = rgamma(n, alpha_flat + sum(x2), beta_flat + 100), type = "flat") df_flat <- tidyr::gather(df_flat, key = "key", value = "value", - type) df_conc <- data.frame(x1 = rgamma(n, alpha_conc + sum(x1), beta_conc + 10), x2 = rgamma(n, alpha_conc + sum(x2), beta_conc + 100), type = "conc") df_conc <- tidyr::gather(df_conc, key = "key", value = "value", - type) df <- rbind(df_flat, df_conc) ggplot(data = df, aes(x = value, color = type)) + facet_wrap(~ key) + geom_density() 17.2 Posterior sampling Exercise 17.2 (Bayesian logistic regression) In Chapter 15 we implemented a MLE for logistic regression (see the code below). For this model, conjugate priors do not exist, which complicates the calculation of the posterior. However, we can use sampling from the numerator of the posterior, using rejection sampling. Set a sensible prior distribution on \\(\\beta\\) and use rejection sampling to find the posterior distribution. In a) you will get a distribution of parameter \\(\\beta\\). Plot the probabilities (as in exercise 15.3) for each sample of \\(\\beta\\) and compare to the truth. Hint: We can use rejection sampling even for functions which are not PDFs – they do not have to sum/integrate to 1. We just need to use a suitable envelope that we know how to sample from. For example, here we could use a uniform distribution and scale it suitably. set.seed(1) inv_log <- function (z) { return (1 / (1 + exp(-z))) } x <- rnorm(100) y <- x y <- rbinom(100, size = 1, prob = inv_log(1.2 * x)) l_logistic <- function (beta, X, y) { logl <- -sum(y * log(inv_log(as.vector(beta %*% X))) + (1 - y) * log((1 - inv_log(as.vector(beta %*% X))))) return(logl) } my_optim <- optim(par = 0.5, fn = l_logistic, method = "L-BFGS-B", lower = 0, upper = 10, X = x, y = y) my_optim$par # Let's say we believe that the mean of beta is 0.5. Since we are not very sure # about this, we will give it a relatively high variance. So a normal prior with # mean 0.5 and standard deviation 5. But there is no right solution to this, # this is basically us expressing our prior belief in the parameter values. 
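To make the hint about unnormalized targets concrete, here is a toy rejection sampler of my own (not part of the original exercise): the target f(x) = exp(-x^2/2) is an unnormalized standard normal density, and a scaled uniform on [-5, 5] serves as the envelope, which is exactly the strategy applied to the posterior numerator below.

```{r}
# Toy rejection sampling from an unnormalized target with a scaled uniform envelope.
set.seed(1)
f <- function (x) exp(-x^2 / 2)  # unnormalized N(0, 1) density
M <- 10                          # ensures f(x) <= M * dunif(x, -5, 5) = 1 on [-5, 5]
samps <- c()
for (i in 1:10000) {
  x_prop <- runif(1, -5, 5)
  u      <- runif(1, 0, 1)
  if (u < f(x_prop) / (M * dunif(x_prop, -5, 5))) {
    samps <- c(samps, x_prop)
  }
}
c(mean(samps), sd(samps))        # should be roughly 0 and 1
```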
set.seed(1) inv_log <- function (z) { return (1 / (1 + exp(-z))) } x <- rnorm(100) y <- x y <- rbinom(100, size = 1, prob = inv_log(1.2 * x)) l_logistic <- function (beta, X, y) { logl <- -sum(y * log(inv_log(as.vector(beta %*% X))) + (1 - y) * log((1 - inv_log(as.vector(beta %*% X))))) if (is.nan(logl)) logl <- Inf return(logl) } my_optim <- optim(par = 0.5, fn = l_logistic, method = "L-BFGS-B", lower = 0, upper = 10, X = x, y = y) my_optim$par ## [1] 1.166558 f_logistic <- function (beta, X, y) { logl <- prod(inv_log(as.vector(beta %*% X))^y * (1 - inv_log(as.vector(beta %*% X)))^(1 - y)) return(logl) } a <- seq(0, 3, by = 0.01) my_l <- c() for (i in a) { my_l <- c(my_l, f_logistic(i, x, y) * dnorm(i, 0.5, 5)) } plot(my_l) envlp <- 10^(-25.8) * dunif(a, -5, 5) # found by trial and error tmp <- data.frame(envel = envlp, l = my_l, t = a) tmp <- gather(tmp, key = "key", value = "value", - t) ggplot(tmp, aes(x = t, y = value, color = key)) + geom_line() # envelope OK set.seed(1) nsamps <- 1000 samps <- c() for (i in 1:nsamps) { tmp <- runif(1, -5, 5) u <- runif(1, 0, 1) if (u < (f_logistic(tmp, x, y) * dnorm(tmp, 0.5, 5)) / (10^(-25.8) * dunif(tmp, -5, 5))) { samps <- c(samps, tmp) } } plot(density(samps)) mean(samps) ## [1] 1.211578 median(samps) ## [1] 1.204279 truth_p <- data.frame(x = x, prob = inv_log(1.2 * x), type = "truth") preds <- inv_log(x %*% t(samps)) preds <- gather(cbind(as.data.frame(preds), x = x), key = "key", "value" = value, - x) ggplot(preds, aes(x = x, y = value)) + geom_line(aes(group = key), color = "gray", alpha = 0.7) + geom_point(data = truth_p, aes(y = prob), color = "red", alpha = 0.7) + theme_bw() "],["distributions-intutition.html", "Chapter 18 Distributions intutition 18.1 Discrete distributions 18.2 Continuous distributions", " Chapter 18 Distributions intutition This chapter is intended to help you familiarize yourself with the different probability distributions you will encounter in this course. You will need to use Appendix B extensively as a reference for the basic properties of distributions, so keep it close! .fold-btn { float: right; margin: 5px 5px 0 0; } .fold { border: 1px solid black; min-height: 40px; } 18.1 Discrete distributions Exercise 18.1 (Bernoulli intuition 1) Arguably the simplest distribution you will enocounter is the Bernoulli distribution. It is a discrete probability distribution used to represent the outcome of a yes/no question. It has one parameter \\(p\\) which is the probability of success. The probability of failure is \\((1-p)\\), sometimes denoted as \\(q\\). A classic way to think about a Bernoulli trial (a yes/no experiment) is a coin flip. Real coins are fair, meaning the probability of either heads (1) or tails (0) are the same, so \\(p=0.5\\) as shown below in figure a. Alternatively we may want to represent a process that doesn’t have equal probabilities of outcomes like “Will a throw of a fair die result in a 6?”. In this case \\(p=\\frac{1}{6}\\), shown in figure b. Using your knowledge of the Bernoulli distribution use the throw of a fair die to think of events, such that: \\(p = 0.5\\) \\(p = \\frac{5}{6}\\) \\(q = \\frac{2}{3}\\) Solution. An event that is equally likely to happen or not happen i.e. \\(p = 0.5\\) would be throwing an even number. More formally we can name this event \\(A\\) and write: \\(A = \\{2,4,6\\}\\), its probability being \\(P(A) = 0.5\\) An example of an event with \\(p = \\frac{5}{6}\\) would be throwing a number greater than 1. Defined as \\(B = \\{2,3,4,5,6\\}\\). 
We need an event that fails \\(\\frac{2}{3}\\) of the time. Alternatively we can reverse the problem and find an event that succeeds \\(\\frac{1}{3}\\) of the time, since: \\(q = 1 - p \\implies p = 1 - q = \\frac{1}{3}\\). The event that our outcome is divisible by 3: \\(C = \\{3, 6\\}\\) satisfies this condition. Exercise 18.2 (Binomial intuition 1) The binomial distribution is a generalization of the Bernoulli distribution. Instead of considering a single Bernoulli trial, we now consider a sequence of \\(n\\) trials, which are independent and have the same parameter \\(p\\). So the binomial distribution has two parameters \\(n\\) - the number of trials and \\(p\\) - the probability of success for each trial. If we return to our coin flip representation, we now flip a coin several times. The binomial distribution will give us the probabilities of all possible outcomes. Below we show the distribution for a series of 10 coin flips with a fair coin (left) and a biased coin (right). The numbers on the x axis represent the number of times the coin landed heads. Using your knowledge of the binomial distribution: Take the pmf of the binomial distribution and plug in \\(n=1\\), check that it is in fact equivalent to a Bernoulli distribution. In our examples we show the graph of a binomial distribution over 10 trials with \\(p=0.8\\). If we take a look at the graph, it appears as though the probabilities of getting 0,1,2 or 3 heads in 10 flips are zero. Is it actually zero? Check by plugging in the values into the pmf. Solution. The pmf of a binomial distribution is \\(\\binom{n}{k} p^k (1 - p)^{n - k}\\), now we insert \\(n=1\\) to get: \\[\\binom{1}{k} p^k (1 - p)^{1 - k}\\]. Not quite equivalent to a Bernoulli, however note that the support of the binomial distribution is defined as \\(k \\in \\{0,1,\\dots,n\\}\\), so in our case \\(k = \\{0,1\\}\\), then: \\[\\binom{1}{0} = \\binom{1}{1} = 1\\] we get: \\(p^k (1 - p)^{1 - k}\\) ,the Bernoulli distribution. As we already know \\(p=0.8, n=10\\), so: \\[\\binom{10}{0} 0.8^0 (1 - 0.8)^{10 - 0} = 1.024 \\cdot 10^{-7}\\] \\[\\binom{10}{1} 0.8^1 (1 - 0.8)^{10 - 1} = 4.096 \\cdot 10^{-6}\\] \\[\\binom{10}{2} 0.8^2 (1 - 0.8)^{10 - 2} = 7.3728 \\cdot 10^{-5}\\] \\[\\binom{10}{3} 0.8^3 (1 - 0.8)^{10 - 3} = 7.86432\\cdot 10^{-4}\\] So the probabilities are not zero, just very small. Exercise 18.3 (Poisson intuition 1) Below are shown 3 different graphs of the Poisson distribution. Your task is to replicate them on your own in R by varying the \\(\\lambda\\) parameter. Hint: You can use dpois() to get the probabilities. library(ggplot2) library(gridExtra) x = 0:15 # Create Poisson data data1 <- data.frame(x = x, y = dpois(x, lambda = 0.1)) data2 <- data.frame(x = x, y = dpois(x, lambda = 1)) data3 <- data.frame(x = x, y = dpois(x, lambda = 7.5)) # Create individual ggplot objects plot1 <- ggplot(data1, aes(x, y)) + geom_col() + xlab("x") + ylab("Probability") + ylim(0,1) plot2 <- ggplot(data2, aes(x, y)) + geom_col() + xlab("x") + ylab(NULL) + ylim(0,1) plot3 <- ggplot(data3, aes(x, y)) + geom_col() + xlab("x") + ylab(NULL) + ylim(0,1) # Combine the plots grid.arrange(plot1, plot2, plot3, ncol = 3) Exercise 18.4 (Poisson intuition 2) The Poisson distribution is a discrete probability distribution that models the probability of a given number of events occuring within processes where events occur at a constant mean rate and independently of each other - a Poisson process. It has a single parameter \\(\\lambda\\), which represents the constant mean rate. 
A classic example of a scenario that can be modeled using the Poisson distribution is the number of calls received by a call center in a day (or in fact any other time interval). Suppose you work in a call center and have some understanding of probability distributions. You overhear your supervisor mentioning that the call center receives an average of 2.5 calls per day. Using your knowledge of the Poisson distribution, calculate: The probability you will get no calls today. The probability you will get more than 5 calls today. Solution. First recall the Poisson pmf: \\[p(k) = \\frac{\\lambda^k e^{-\\lambda}}{k!}\\] as stated previously our parameter \\(\\lambda = 2.5\\) To get the probability of no calls we simply plug in \\(k = 0\\), so: \\[p(0) = \\frac{2.5^0 e^{-2.5}}{0!} = e^{-2.5} \\approx 0.082\\] The support of the Poisson distribution is non-negative integers. So if we wanted to calculate the probability of getting more than 5 calls we would need to add up the probabilities of getting 6 calls and 7 calls and so on up to infinity. Let us instead remember that the sum of all probabilties will be 1, we will reverse the problem and instead ask “What is the probability we get 5 calls or less?”. We can subtract the probability of the opposite outcome (the complement) from 1 to get the probability of our original question. \\[P(k > 5) = 1 - P(k \\leq 5)\\] \\[P(k \\leq 5) = \\sum_{i=0}^{5} p(i) = p(0) + p(1) + p(2) + p(3) + p(4) + p(5) =\\] \\[= \\frac{2.5^0 e^{-2.5}}{0!} + \\frac{2.5^1 e^{-2.5}}{1!} + \\dots =\\] \\[=0.957979\\] So the probability of geting more than 5 calls will be \\(1 - 0.957979 = 0.042021\\) Exercise 18.5 (Geometric intuition 1) The geometric distribution is a discrete distribution that models the number of failures before the first success in a sequence of independent Bernoulli trials. It has a single parameter \\(p\\), representing the probability of success. NOTE: There are two forms of this distribution, the one we just described and another that models the number of trials before the first success. The difference is subtle yet significant and you are likely to encounter both forms. In the graph below we show the pmf of a geometric distribution with \\(p=0.5\\). This can be thought of as the number of successive failures (tails) in the flip of a fair coin. You can see that there’s a 50% chance you will have zero failures i.e. you will flip a heads on your very first attempt. But there is some smaller chance that you will flip a sequence of tails in a row, with longer sequences having ever lower probability. Create an equivalent graph that represents the probability of rolling a 6 with a fair 6-sided die. Use the formula for the mean of the geometric distribution and determine the average number of failures before you roll a 6. Look up the alternative form of the geometric distribtuion and again use the formula for the mean to determine the average number of trials before you roll a 6. Solution. Parameter p (the probability of success) for rolling a 6 is \\(p=\\frac{1}{6}\\). 
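A quick sanity check before plotting (keeping in mind that R's dgeom() uses the number-of-failures parametrization discussed in the exercise): zero failures means rolling a 6 on the very first attempt, so its probability should be exactly p = 1/6.

```{r}
dgeom(0, prob = 1/6)  # zero failures, i.e. a 6 on the first roll: equals 1/6
dgeom(1, prob = 1/6)  # one failure and then a 6: equals (5/6) * (1/6)
```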
library(ggplot2) # Parameters p <- 1/6 x_vals <- 0:9 # Starting from 0 probs <- dgeom(x_vals, p) # Data data <- data.frame(x_vals, probs) # Plot ggplot(data, aes(x=x_vals, y=probs)) + geom_segment(aes(xend=x_vals, yend=0), color="black", size=1) + geom_point(color="red", size=2) + labs(x = "Number of trials", y = "Probability") + theme_minimal() + scale_x_continuous(breaks = x_vals) # This line ensures integer x-axis labels ::: {.solution} b) The expected value of a random variable (the mean) is denoted as \\(E[X]\\). \\[E[X] = \\frac{1-p}{p}= \\frac{1- \\frac{1}{6}}{\\frac{1}{6}} = \\frac{5}{6}\\cdot 6 = 5\\] On average we will fail 5 times before we roll our first 6. The alternative form of this distribution (with support on all positive integers) has a slightly different formula for the mean. This change reflects the difference in the way we posed our question: \\[E[X] = \\frac{1}{p} = \\frac{1}{\\frac{1}{6}} = 6\\] On average we will have to throw the die 6 times before we roll a 6. ::: 18.2 Continuous distributions Exercise 18.6 (Uniform intuition 1) The need for a randomness is a common problem. A practical solution are so-called random number generators (RNGs). The simplest RNG one would think of is choosing a set of numbers and having the generator return a number at random, where the probability of returning any number from this set is the same. If this set is an interval of real numbers, then we’ve basically described the continuous uniform distribution. It has two parameters \\(a\\) and \\(b\\), which define the beginning and end of its support respectively. Let’s think of the mean intuitively. The expected value or mean of a distribution is the pivot point on our x-axis, which “balances” the graph. Given parameters \\(a\\) and \\(b\\) what is your intuitive guess of the mean for this distribution? A special case of the uniform distribution is the standard uniform distribution with \\(a=0\\) and \\(b=1\\). Write the pdf \\(f(x)\\) of this particular distribution. Solution. It’s the midpoint between \\(a\\) and \\(b\\), so \\(\\frac{a+b}{2}\\) Inserting the parameter values we get:\\[f(x) = \\begin{cases} 1 & \\text{if } 0 \\leq x \\leq 1 \\\\ 0 & \\text{otherwise} \\end{cases} \\] Notice how the pdf is just a constant \\(1\\) across all values of \\(x \\in [0,1]\\). Here it is important to distinguish between probability and probability density. The density may be 1, but the probability is not and while discreet distributions never exceed 1 on the y-axis, continuous distributions can go as high as you like. Exercise 18.7 (Normal intuition 1) The normal distribution, also known as the Gaussian distribution, is a continuous distribution that encompasses the entire real number line. It has two parameters: the mean, denoted by \\(\\mu\\), and the variance, represented by \\(\\sigma^2\\). Its shape resembles the iconic bell curve. The position of its peak is determined by the parameter \\(\\mu\\), while the variance determines the spread or width of the curve. A smaller variance results in a sharper, narrower peak, while a larger variance leads to a broader, more spread-out curve. Below, we graph the distribution of IQ scores for two different populations. We aim to identify individuals with an IQ at or above 140 for an experiment. We can identify them reliably; however, we only have time to examine one of the two groups. Which group should we investigate to have the best chance? 
NOTE: The graph below displays the parameter \\(\\sigma\\), which is the square root of the variance, more commonly referred to as the standard deviation. Keep this in mind when solving the problems. Insert the values of either population into the pdf of a normal distribution and determine which one has a higher density at \\(x=140\\). Generate the graph yourself and zoom into the relevant area to graphically verify your answer. To determine probability density, we can use the pdf. However, if we wish to know the proportion of the population that falls within certain parameters, we would need to integrate the pdf. Fortunately, the integrals of common distributions are well-established. This integral gives us the cumulative distribution function \\(F(x)\\) (CDF). BONUS: Look up the CDF of the normal distribution and input the appropriate values to determine the percentage of each population that comprises individuals with an IQ of 140 or higher. Solution. Group 1: \\(\\mu = 100, \\sigma=10 \\rightarrow \\sigma^2 = 100\\) \\[\\frac{1}{\\sqrt{2 \\pi \\sigma^2}} e^{-\\frac{(x - \\mu)^2}{2 \\sigma^2}} = \\frac{1}{\\sqrt{2 \\pi 100}} e^{-\\frac{(140 - 100)^2}{2 \\cdot 100}} \\approx 1.34e-05\\] Group 2: \\(\\mu = 105, \\sigma=8 \\rightarrow \\sigma^2 = 64\\) \\[\\frac{1}{\\sqrt{2 \\pi \\sigma^2}} e^{-\\frac{(x - \\mu)^2}{2 \\sigma^2}} = \\frac{1}{\\sqrt{2 \\pi 64}} e^{-\\frac{(140 - 105)^2}{2 \\cdot 64}} \\approx 3.48e-06\\] So despite the fact that group 1 has a lower average IQ, we are more likely to find 140 IQ individuals in group 2. library(ggplot2) library(tidyr) # Create data x <- seq(135, 145, by = 0.01) # Adjusting the x range to account for the larger standard deviations df <- data.frame(x = x) # Define the IQ distributions df$IQ_mu100_sd10 <- dnorm(df$x, mean = 100, sd = 10) df$IQ_mu105_sd8 <- dnorm(df$x, mean = 105, sd = 8) # Convert from wide to long format for ggplot2 df_long <- gather(df, distribution, density, -x) # Ensure the levels of the 'distribution' factor match our desired order df_long$distribution <- factor(df_long$distribution, levels = c("IQ_mu100_sd10", "IQ_mu105_sd8")) # Plot ggplot(df_long, aes(x = x, y = density, color = distribution)) + geom_line() + labs(x = "IQ Score", y = "Density") + scale_color_manual( name = "IQ Distribution", values = c(IQ_mu100_sd10 = "red", IQ_mu105_sd8 = "blue"), labels = c("Group 1 (µ=100, σ=10)", "Group 2 (µ=105, σ=8)") ) + theme_minimal() ::: {.solution} c. The CDF of the normal distribution is \\(\\Phi(x) = \\frac{1}{2} \\left[ 1 + \\text{erf} \\left( \\frac{x - \\mu}{\\sigma \\sqrt{2}} \\right) \\right]\\). The CDF is defined as the integral of the distribution density up to x. So to get the total percentage of individuals with IQ at 140 or higher we will need to subtract the value from 1. Group 1: \\[1 - \\Phi(140) = \\frac{1}{2} \\left[ 1 + \\text{erf} \\left( \\frac{140 - 100}{10 \\sqrt{2}} \\right) \\right] \\approx 3.17e-05 \\] Group 2 : \\[1 - \\Phi(140) = \\frac{1}{2} \\left[ 1 + \\text{erf} \\left( \\frac{140 - 105}{8 \\sqrt{2}} \\right) \\right] \\approx 6.07e-06 \\] So roughly 0.003% and 0.0006% of individuals in groups 1 and 2 respectively have an IQ at or above 140. ::: Exercise 18.8 (Beta intuition 1) The beta distribution is a continuous distribution defined on the unit interval \\([0,1]\\). It has two strictly positive paramters \\(\\alpha\\) and \\(\\beta\\), which determine its shape. Its support makes it especially suitable to model distribtuions of percentages and proportions. 
Below you’ve been provided with some code that you can copy into Rstudio. Once you run the code, an interactive Shiny app will appear and you will be able to manipulate the graph of the beta distribution. Play around with the parameters to get: A straight line from (0,0) to (1,2) A straight line from (0,2) to (1,0) A symmetric bell curve A bowl-shaped curve The standard uniform distribution is actually a special case of the beta distribution. Find the exact parameters \\(\\alpha\\) and \\(\\beta\\). Once you do, prove the equality by inserting the values into our pdf. Hint: The beta function is evaluated as \\(\\text{B}(a,b) = \\frac{\\Gamma(a)\\Gamma(b)}{\\Gamma(a+b)}\\), the gamma function for positive integers \\(n\\) is evaluated as \\(\\Gamma(n)= (n-1)!\\) # Install and load necessary packages install.packages(c("shiny", "ggplot2")) library(shiny) library(ggplot2) # The Shiny App ui <- fluidPage( titlePanel("Beta Distribution Viewer"), sidebarLayout( sidebarPanel( sliderInput("alpha", "Alpha:", min = 0.1, max = 10, value = 2, step = 0.1), sliderInput("beta", "Beta:", min = 0.1, max = 10, value = 2, step = 0.1) ), mainPanel( plotOutput("betaPlot") ) ) ) server <- function(input, output) { output$betaPlot <- renderPlot({ x <- seq(0, 1, by = 0.01) y <- dbeta(x, shape1 = input$alpha, shape2 = input$beta) ggplot(data.frame(x = x, y = y), aes(x = x, y = y)) + geom_line() + labs(x = "Value", y = "Density") + theme_minimal() }) } shinyApp(ui = ui, server = server) Solution. \\(\\alpha = 2, \\beta=1\\) \\(\\alpha = 1, \\beta=2\\) Possible solution \\(\\alpha = \\beta= 5\\) Possible solution \\(\\alpha = \\beta= 0.5\\) The correct parameters are \\(\\alpha = 1, \\beta=1\\), to prove the equality we insert them into the beta pdf: \\[\\frac{x^{\\alpha - 1} (1 - x)^{\\beta - 1}}{\\text{B}(\\alpha, \\beta)} = \\frac{x^{1 - 1} (1 - x)^{1 - 1}}{\\text{B}(1, 1)} = \\frac{1}{\\frac{\\Gamma(1)\\Gamma(1)}{\\Gamma(1+1)}}= \\frac{1}{\\frac{(1-1)!(1-1)!}{(2-1)!}} = 1\\] Exercise 18.9 (Exponential intuition 1) The exponential distribution represents the distributon of time between events in a Poisson process. It is the continuous analogue of the geometric distribution. It has a single parameter \\(\\lambda\\), which is strictly positive and represents the constant rate of the corresponding Poisson process. The support is all positive reals, since time between events is non-negative, but not bound upwards. Let’s revisit the call center from our Poisson problem. We get 2.5 calls per day on average, this is our rate parameter \\(\\lambda\\). A work day is 8 hours. What is the mean time between phone calls? The cdf \\(F(x)\\) tells us what percentage of calls occur within x amount of time of each other. You want to take an hour long lunch break but are worried about missing calls. Calculate the percentage of calls you are likely to miss if you’re gone for an hour. Hint: The cdf is \\(F(x) = \\int_{-\\infty}^{x} f(x) dx\\) Solution. 
Taking \\(\\lambda = \\frac{2.5 \\text{ calls}}{8 \\text{ hours}} = \\frac{1 \\text{ call}}{3.2 \\text{ hours}}\\) \\[E[X] = \\frac{1}{\\lambda} = \\frac{3.2 \\text{ hours}}{\\text{call}}\\] First we derive the CDF, we can integrate from 0 instead of \\(-\\infty\\), since we have no support in the negatives: \\[\\begin{align} F(x) &= \\int_{0}^{x} \\lambda e^{-\\lambda t} dt \\\\ &= \\lambda \\int_{0}^{x} e^{-\\lambda t} dt \\\\ &= \\lambda (\\frac{1}{-\\lambda}e^{-\\lambda t} |_{0}^{x}) \\\\ &= \\lambda(\\frac{1}{\\lambda} - \\frac{1}{\\lambda} e^{-\\lambda x}) \\\\ &= 1 - e^{-\\lambda x}. \\end{align}\\] Then we just evaluate it for a time of 1 hour: \\[F(1 \\text{ hour}) = 1 - e^{-\\frac{1 \\text{ call}}{3.2 \\text{ hours}} \\cdot 1 \\text{ hour}}= 1 - e^{-\\frac{1 \\text{ call}}{3.2 \\text{ hours}}} \\approx 0.268\\] So we have about a 27% chance of missing a call if we’re gone for an hour. Exercise 18.10 (Gamma intuition 1) The gamma distribution is a continuous distribution characterized by two parameters, \\(\\alpha\\) and \\(\\beta\\), both greater than 0. These parameters afford the distribution a broad range of shapes, leading to it being commonly referred to as a family of distributions. Given its support over the positive real numbers, it is well suited for modeling a diverse range of positive-valued phenomena. The exponential distribution is actually just a particular form of the gamma distribution. What are the values of \\(\\alpha\\) and \\(\\beta\\)? Copy the code from our beta distribution Shiny app and modify it to simulate the gamma distribution. Then get it to show the exponential. Solution. Let’s start by taking a look at the pdfs of the two distributions side by side: \\[\\frac{\\beta^\\alpha}{\\Gamma(\\alpha)} x^{\\alpha - 1}e^{-\\beta x} = \\lambda e^{-\\lambda x}\\] The \\(x^{\\alpha - 1}\\) term is not found anywhere in the pdf of the exponential so we need to eliminate it by setting \\(\\alpha = 1\\). This also makes the fraction evaluate to \\(\\frac{\\beta^1}{\\Gamma(1)} = \\beta\\), which leaves us with \\[\\beta \\cdot e^{-\\beta x}\\] Now we can see that \\(\\beta = \\lambda\\) and \\(\\alpha = 1\\). # Install and load necessary packages install.packages(c("shiny", "ggplot2")) library(shiny) library(ggplot2) # The Shiny App ui <- fluidPage( titlePanel("Gamma Distribution Viewer"), sidebarLayout( sidebarPanel( sliderInput("shape", "Shape (α):", min = 0.1, max = 10, value = 2, step = 0.1), sliderInput("scale", "Scale (β):", min = 0.1, max = 10, value = 2, step = 0.1) ), mainPanel( plotOutput("gammaPlot") ) ) ) server <- function(input, output) { output$gammaPlot <- renderPlot({ x <- seq(0, 25, by = 0.1) y <- dgamma(x, shape = input$shape, scale = input$scale) ggplot(data.frame(x = x, y = y), aes(x = x, y = y)) + geom_line() + labs(x = "Value", y = "Density") + theme_minimal() }) } shinyApp(ui = ui, server = server) "],["A1.html", "A R programming language A.1 Basic characteristics A.2 Why R? A.3 Setting up A.4 R basics A.5 Functions A.6 Other tips A.7 Further reading and references", " A R programming language A.1 Basic characteristics R is free software for statistical computing and graphics. It is widely used by statisticians, scientists, and other professionals for software development and data analysis. It is an interpreted language and therefore the programs do not need compilation. A.2 Why R? R is one of the main two languages used for statistics and machine learning (the other being Python). Pros Libraries. 
Comprehensive collection of statistical and machine learning packages. Easy to code. Open source. Anyone can access R and develop new methods. Additionally, it is relatively simple to get source code of established methods. Large community. The use of R has been rising for some time, in industry and academia. Therefore a large collection of blogs and tutorials exists, along with people offering help on pages like StackExchange and CrossValidated. Integration with other languages and LaTeX. New methods. Many researchers develop R packages based on their research, therefore new methods are available soon after development. Cons Slow. Programs run slower than in other programming languages, however this can be somewhat ammended by effective coding or integration with other languages. Memory intensive. This can become a problem with large data sets, as they need to be stored in the memory, along with all the information the models produce. Some packages are not as good as they should be, or have poor documentation. Object oriented programming in R can be very confusing and complex. A.3 Setting up https://www.r-project.org/. A.3.1 RStudio RStudio is the most widely used IDE for R. It is free, you can download it from https://rstudio.com/. While console R is sufficient for the requirements of this course, we recommend the students install RStudio for its better user interface. A.3.2 Libraries for data science Listed below are some of the more useful libraries (packages) for data science. Students are also encouraged to find other useful packages. dplyr Efficient data manipulation. Part of the wider package collection called tidyverse. ggplot2 Plotting based on grammar of graphics. stats Several statistical models. rstan Bayesian inference using Hamiltonian Monte Carlo. Very flexible model building. MCMCpack Bayesian inference. rmarkdown, knitr, and bookdown Dynamic reports (for example such as this one). devtools Package development. 
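For completeness, a minimal example of how any of the packages above are installed and loaded (shown here for dplyr and ggplot2; the same pattern applies to the rest):

```{r}
install.packages(c("dplyr", "ggplot2"))  # install once per machine
library(dplyr)                           # load in every session that uses it
library(ggplot2)
```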
A.4 R basics A.4.1 Variables and types Important information and tips: no type declaration define variables with <- instead of = (although both work, there is a slight difference, additionally most of the packages use the arrow) for strings use \"\" for comments use # change types with as.type() functions no special type for single character like C++ for example n <- 20 x <- 2.7 m <- n # m gets value 20 my_flag <- TRUE student_name <- "Luka" typeof(n) ## [1] "double" typeof(student_name) ## [1] "character" typeof(my_flag) ## [1] "logical" typeof(as.integer(n)) ## [1] "integer" typeof(as.character(n)) ## [1] "character" A.4.2 Basic operations n + x ## [1] 22.7 n - x ## [1] 17.3 diff <- n - x # variable diff gets the difference between n and x diff ## [1] 17.3 n * x ## [1] 54 n / x ## [1] 7.407407 x^2 ## [1] 7.29 sqrt(x) ## [1] 1.643168 n > 2 * n ## [1] FALSE n == n ## [1] TRUE n == 2 * n ## [1] FALSE n != n ## [1] FALSE paste(student_name, "is", n, "years old") ## [1] "Luka is 20 years old" A.4.3 Vectors use c() to combine elements into vectors can only contain one type of variable if different types are provided, all are transformed to the most basic type in the vector access elements by indexes or logical vectors of the same length a scalar value is regarded as a vector of length 1 1:4 # creates a vector of integers from 1 to 4 ## [1] 1 2 3 4 student_ages <- c(20, 23, 21) student_names <- c("Luke", "Jen", "Mike") passed <- c(TRUE, TRUE, FALSE) length(student_ages) ## [1] 3 # access by index student_ages[2] ## [1] 23 student_ages[1:2] ## [1] 20 23 student_ages[2] <- 24 # change values # access by logical vectors student_ages[passed == TRUE] # same as student_ages[passed] ## [1] 20 24 student_ages[student_names %in% c("Luke", "Mike")] ## [1] 20 21 student_names[student_ages > 20] ## [1] "Jen" "Mike" A.4.3.1 Operations with vectors most operations are element-wise if we operate on vectors of different lengths, the shorter vector periodically repeats its elements until it reaches the length of the longer one a <- c(1, 3, 5) b <- c(2, 2, 1) d <- c(6, 7) a + b ## [1] 3 5 6 a * b ## [1] 2 6 5 a + d ## Warning in a + d: longer object length is not a multiple of shorter object ## length ## [1] 7 10 11 a + 2 * b ## [1] 5 7 7 a > b ## [1] FALSE TRUE TRUE b == a ## [1] FALSE FALSE FALSE a %*% b # vector multiplication, not element-wise ## [,1] ## [1,] 13 A.4.4 Factors vectors of finite predetermined classes suitable for categorical variables ordinal (ordered) or nominal (unordered) car_brand <- factor(c("Audi", "BMW", "Mercedes", "BMW"), ordered = FALSE) car_brand ## [1] Audi BMW Mercedes BMW ## Levels: Audi BMW Mercedes freq <- factor(x = NA, levels = c("never","rarely","sometimes","often","always"), ordered = TRUE) freq[1:3] <- c("rarely", "sometimes", "rarely") freq ## [1] rarely sometimes rarely ## Levels: never < rarely < sometimes < often < always freq[4] <- "quite_often" # non-existing level, returns NA ## Warning in `[<-.factor`(`*tmp*`, 4, value = "quite_often"): invalid factor ## level, NA generated freq ## [1] rarely sometimes rarely <NA> ## Levels: never < rarely < sometimes < often < always A.4.5 Matrices two-dimensional generalizations of vectors my_matrix <- matrix(c(1, 2, 1, 5, 4, 2), nrow = 2, byrow = TRUE) my_matrix ## [,1] [,2] [,3] ## [1,] 1 2 1 ## [2,] 5 4 2 my_square_matrix <- matrix(c(1, 3, 2, 3), nrow = 2) my_square_matrix ## [,1] [,2] ## [1,] 1 2 ## [2,] 3 3 my_matrix[1,2] # first row, second column ## [1] 2 my_matrix[2, ] # second row ## [1] 5 4 2 my_matrix[ ,3] # third 
column ## [1] 1 2 A.4.5.1 Matrix functions and operations most operation element-wise mind the dimensions when using matrix multiplication %*% nrow(my_matrix) # number of matrix rows ## [1] 2 ncol(my_matrix) # number of matrix columns ## [1] 3 dim(my_matrix) # matrix dimension ## [1] 2 3 t(my_matrix) # transpose ## [,1] [,2] ## [1,] 1 5 ## [2,] 2 4 ## [3,] 1 2 diag(my_matrix) # the diagonal of the matrix as vector ## [1] 1 4 diag(1, nrow = 3) # creates a diagonal matrix ## [,1] [,2] [,3] ## [1,] 1 0 0 ## [2,] 0 1 0 ## [3,] 0 0 1 det(my_square_matrix) # matrix determinant ## [1] -3 my_matrix + 2 * my_matrix ## [,1] [,2] [,3] ## [1,] 3 6 3 ## [2,] 15 12 6 my_matrix * my_matrix # element-wise multiplication ## [,1] [,2] [,3] ## [1,] 1 4 1 ## [2,] 25 16 4 my_matrix %*% t(my_matrix) # matrix multiplication ## [,1] [,2] ## [1,] 6 15 ## [2,] 15 45 my_vec <- as.vector(my_matrix) # transform to vector my_vec ## [1] 1 5 2 4 1 2 A.4.6 Arrays multi-dimensional generalizations of matrices my_array <- array(c(1, 2, 3, 4, 5, 6, 7, 8), dim = c(2, 2, 2)) my_array[1, 1, 1] ## [1] 1 my_array[2, 2, 1] ## [1] 4 my_array[1, , ] ## [,1] [,2] ## [1,] 1 5 ## [2,] 3 7 dim(my_array) ## [1] 2 2 2 A.4.7 Data frames basic data structure for analysis differ from matrices as columns can be of different types student_data <- data.frame("Name" = student_names, "Age" = student_ages, "Pass" = passed) student_data ## Name Age Pass ## 1 Luke 20 TRUE ## 2 Jen 24 TRUE ## 3 Mike 21 FALSE colnames(student_data) <- c("name", "age", "pass") # change column names student_data[1, ] ## name age pass ## 1 Luke 20 TRUE student_data[ ,colnames(student_data) %in% c("name", "pass")] ## name pass ## 1 Luke TRUE ## 2 Jen TRUE ## 3 Mike FALSE student_data$pass # access column by name ## [1] TRUE TRUE FALSE student_data[student_data$pass == TRUE, ] ## name age pass ## 1 Luke 20 TRUE ## 2 Jen 24 TRUE A.4.8 Lists useful for storing different data structures access elements with double square brackets elements can be named first_list <- list(student_ages, my_matrix, student_data) second_list <- list(student_ages, my_matrix, student_data, first_list) first_list[[1]] ## [1] 20 24 21 second_list[[4]] ## [[1]] ## [1] 20 24 21 ## ## [[2]] ## [,1] [,2] [,3] ## [1,] 1 2 1 ## [2,] 5 4 2 ## ## [[3]] ## name age pass ## 1 Luke 20 TRUE ## 2 Jen 24 TRUE ## 3 Mike 21 FALSE second_list[[4]][[1]] # first element of the fourth element of second_list ## [1] 20 24 21 length(second_list) ## [1] 4 second_list[[length(second_list) + 1]] <- "add_me" # append an element names(first_list) <- c("Age", "Matrix", "Data") first_list$Age ## [1] 20 24 21 A.4.9 Loops mostly for loop for loop can iterate over an arbitrary vector # iterate over consecutive natural numbers my_sum <- 0 for (i in 1:10) { my_sum <- my_sum + i } my_sum ## [1] 55 # iterate over an arbirary vector my_sum <- 0 some_numbers <- c(2, 3.5, 6, 100) for (i in some_numbers) { my_sum <- my_sum + i } my_sum ## [1] 111.5 A.5 Functions for help use ?function_name A.5.1 Writing functions We can write our own functions with function(). In the brackets, we define the parameters the function gets, and in curly brackets we define what the function does. We use return() to return values. sum_first_n_elements <- function (n) { my_sum <- 0 for (i in 1:n) { my_sum <- my_sum + i } return (my_sum) } sum_first_n_elements(10) ## [1] 55 A.6 Other tips Use set.seed(arbitrary_number) at the beginning of a script to set the seed and ensure replication. 
To dynamically set the working directory in R Studio to the parent folder of a R script use setwd(dirname(rstudioapi::getSourceEditorContext()$path)). To avoid slow R loops use the apply family of functions. See ?apply and ?lapply. To make your data manipulation (and therefore your life) a whole lot easier, use the dplyr package. Use getAnywhere(function_name) to get the source code of any function. Use browser for debugging. See ?browser. A.7 Further reading and references Getting started with R Studio: https://www.youtube.com/watch?v=lVKMsaWju8w Official R manuals: https://cran.r-project.org/manuals.html Cheatsheets: https://www.rstudio.com/resources/cheatsheets/ Workshop on R, dplyr, ggplot2, and R Markdown: https://github.com/bstatcomp/Rworkshop "],["distributions.html", "B Probability distributions", " B Probability distributions Name parameters support pdf/pmf mean variance Bernoulli \\(p \\in [0,1]\\) \\(k \\in \\{0,1\\}\\) \\(p^k (1 - p)^{1 - k}\\) 1.12 \\(p\\) 7.1 \\(p(1-p)\\) 7.1 binomial \\(n \\in \\mathbb{N}\\), \\(p \\in [0,1]\\) \\(k \\in \\{0,1,\\dots,n\\}\\) \\(\\binom{n}{k} p^k (1 - p)^{n - k}\\) 4.4 \\(np\\) 7.2 \\(np(1-p)\\) 7.2 Poisson \\(\\lambda > 0\\) \\(k \\in \\mathbb{N}_0\\) \\(\\frac{\\lambda^k e^{-\\lambda}}{k!}\\) 4.6 \\(\\lambda\\) 7.3 \\(\\lambda\\) 7.3 geometric \\(p \\in (0,1]\\) \\(k \\in \\mathbb{N}_0\\) \\(p(1-p)^k\\) 4.5 \\(\\frac{1 - p}{p}\\) 7.4 \\(\\frac{1 - p}{p^2}\\) 9.3 normal \\(\\mu \\in \\mathbb{R}\\), \\(\\sigma^2 > 0\\) \\(x \\in \\mathbb{R}\\) \\(\\frac{1}{\\sqrt{2 \\pi \\sigma^2}} e^{-\\frac{(x - \\mu)^2}{2 \\sigma^2}}\\) 4.12 \\(\\mu\\) 7.8 \\(\\sigma^2\\) 7.8 uniform \\(a,b \\in \\mathbb{R}\\), \\(a < b\\) \\(x \\in [a,b]\\) \\(\\frac{1}{b-a}\\) 4.9 \\(\\frac{a+b}{2}\\) \\(\\frac{(b-a)^2}{12}\\) beta \\(\\alpha,\\beta > 0\\) \\(x \\in [0,1]\\) \\(\\frac{x^{\\alpha - 1} (1 - x)^{\\beta - 1}}{\\text{B}(\\alpha, \\beta)}\\) 4.10 \\(\\frac{\\alpha}{\\alpha + \\beta}\\) 7.6 \\(\\frac{\\alpha \\beta}{(\\alpha + \\beta)^2(\\alpha + \\beta + 1)}\\) 7.6 gamma \\(\\alpha,\\beta > 0\\) \\(x \\in (0, \\infty)\\) \\(\\frac{\\beta^\\alpha}{\\Gamma(\\alpha)} x^{\\alpha - 1}e^{-\\beta x}\\) 4.11 \\(\\frac{\\alpha}{\\beta}\\) 7.5 \\(\\frac{\\alpha}{\\beta^2}\\) 7.5 exponential \\(\\lambda > 0\\) \\(x \\in [0, \\infty)\\) \\(\\lambda e^{-\\lambda x}\\) 4.8 \\(\\frac{1}{\\lambda}\\) 7.7 \\(\\frac{1}{\\lambda^2}\\) 7.7 logistic \\(\\mu \\in \\mathbb{R}\\), \\(s > 0\\) \\(x \\in \\mathbb{R}\\) \\(\\frac{e^{-\\frac{x - \\mu}{s}}}{s(1 + e^{-\\frac{x - \\mu}{s}})^2}\\) 4.13 \\(\\mu\\) \\(\\frac{s^2 \\pi^2}{3}\\) negative binomial \\(r \\in \\mathbb{N}\\), \\(p \\in [0,1]\\) \\(k \\in \\mathbb{N}_0\\) \\(\\binom{k + r - 1}{k}(1-p)^r p^k\\) 4.7 \\(\\frac{rp}{1 - p}\\) 9.2 \\(\\frac{rp}{(1 - p)^2}\\) 9.2 multinomial \\(n \\in \\mathbb{N}\\), \\(k \\in \\mathbb{N}\\) \\(p_i \\in [0,1]\\), \\(\\sum p_i = 1\\) \\(x_i \\in \\{0,..., n\\}\\), \\(i \\in \\{1,...,k\\}\\), \\(\\sum{x_i} = n\\) \\(\\frac{n!}{x_1!x_2!...x_k!} p_1^{x_1} p_2^{x_2}...p_k^{x_k}\\) 8.1 \\(np_i\\) \\(np_i(1-p_i)\\) "],["references.html", "References", " References "],["404.html", "Page not found", " Page not found The page you requested cannot be found (perhaps it was moved or renamed). You may want to try searching to find the page's new location, or use the table of contents to find the page you are looking for. 
"]] diff --git a/docs/uprobspaces.html b/docs/uprobspaces.html index c119e98..4989568 100644 --- a/docs/uprobspaces.html +++ b/docs/uprobspaces.html @@ -23,7 +23,7 @@ - + From 73c804396fe397d625ca2e41ab7101c79428ba46 Mon Sep 17 00:00:00 2001 From: LeonHvastja Date: Sun, 1 Oct 2023 21:41:51 +0200 Subject: [PATCH 6/7] minor fixest --- 18-distributions_intuition.Rmd | 17 ++++++++++------- docs/distributions-intutition.html | 19 +++++++++++-------- docs/search_index.json | 2 +- 3 files changed, 22 insertions(+), 16 deletions(-) diff --git a/18-distributions_intuition.Rmd b/18-distributions_intuition.Rmd index cd2f50a..5a57781 100644 --- a/18-distributions_intuition.Rmd +++ b/18-distributions_intuition.Rmd @@ -41,7 +41,7 @@ $(document).ready(function() { ## Discrete distributions ```{exercise, name = "Bernoulli intuition 1"} -Arguably the simplest distribution you will enocounter is the Bernoulli distribution. +Arguably the simplest distribution you will encounter is the Bernoulli distribution. It is a discrete probability distribution used to represent the outcome of a yes/no question. It has one parameter $p$ which is the probability of success. The probability of failure is $(1-p)$, sometimes denoted as $q$. @@ -127,7 +127,7 @@ into the pmf. ```{solution, echo = togs} a. The pmf of a binomial distribution is $\binom{n}{k} p^k (1 - p)^{n - k}$, now we insert $n=1$ to get: - $$\binom{1}{k} p^k (1 - p)^{1 - k}$$. + $$\binom{1}{k} p^k (1 - p)^{1 - k}$$ Not quite equivalent to a Bernoulli, however note that the support of the binomial distribution is defined as $k \in \{0,1,\dots,n\}$, so in our case $k = \{0,1\}$, then: @@ -240,7 +240,7 @@ occur at a constant mean rate and independently of each other - a **Poisson proc It has a single parameter $\lambda$, which represents the constant mean rate. A classic example of a scenario that can be modeled using the Poisson distribution -is the number of calls received by a call center in a day (or in fact any other +is the number of calls received at a call center in a day (or in fact any other time interval). Suppose you work in a call center and have some understanding of probability @@ -281,11 +281,14 @@ to get the probability of our original question. ```{exercise, name = "Geometric intuition 1"} The geometric distribution is a discrete distribution that models the **number of failures** before the first success in a sequence of independent Bernoulli trials. -It has a single parameter $p$, representing the probability of success. +It has a single parameter $p$, representing the probability of success and its +support is all non-negative integers $\{0,1,2,\dots\}$. NOTE: There are two forms of this distribution, the one we just described and another that models the **number of trials** before the first success. The difference is subtle yet significant and you are likely to encounter both forms. +The key to telling them apart is to check their support, since the number of trials +has to be at least $1$, for this case we have $\{1,2,\dots\}$. In the graph below we show the pmf of a geometric distribution with $p=0.5$. This can be thought of as the number of successive failures (tails) in the flip of a fair coin. @@ -402,7 +405,7 @@ b. Inserting the parameter values we get:$$f(x) = 0 & \text{otherwise} \end{cases} $$ -Notice how the pdf is just a constant $1$ across all values of $x \in [0,1]$. Here it is important to distinguish between probability and **probability density**. 
The density may be 1, but the probability is not and while discreet distributions never exceed 1 on the y-axis, continuous distributions can go as high as you like. +Notice how the pdf is just a constant $1$ across all values of $x \in [0,1]$. Here it is important to distinguish between probability and **probability density**. The density may be 1, but the probability is not and while discrete distributions never exceed 1 on the y-axis, continuous distributions can go as high as you like. ```
        @@ -412,7 +415,7 @@ The normal distribution, also known as the Gaussian distribution, is a continuou Below, we graph the distribution of IQ scores for two different populations. -We aim to identify individuals with an IQ at or above 140 for an experiment. We can identify them reliably; however, we only have time to examine one of the two groups. Which group should we investigate to have the best chance? +We aim to identify individuals with an IQ at or above 140 for an experiment. We can identify them reliably; however, we only have time to examine one of the two groups. Which group should we investigate to have the best chance of finding such individuals? NOTE: The graph below displays the parameter $\sigma$, which is the square root of the variance, more commonly referred to as the **standard deviation**. Keep this in mind when solving the problems. @@ -466,7 +469,7 @@ a. Group 1: $\mu = 100, \sigma=10 \rightarrow \sigma^2 = 100$ $$\frac{1}{\sqrt{2 \pi \sigma^2}} e^{-\frac{(x - \mu)^2}{2 \sigma^2}} = \frac{1}{\sqrt{2 \pi 64}} e^{-\frac{(140 - 105)^2}{2 \cdot 64}} \approx 3.48e-06$$ - So despite the fact that group 1 has a lower average IQ, we are more likely to find 140 IQ individuals in group 2. + So despite the fact that group 1 has a lower average IQ, we are more likely to find 140 IQ individuals in this group. ``` b. ```{r, echo=togs} diff --git a/docs/distributions-intutition.html b/docs/distributions-intutition.html index c25a59e..da87a6e 100644 --- a/docs/distributions-intutition.html +++ b/docs/distributions-intutition.html @@ -327,7 +327,7 @@

Chapter 18 Distributions intuition

        18.1 Discrete distributions

        -

        Exercise 18.1 (Bernoulli intuition 1) Arguably the simplest distribution you will enocounter is the Bernoulli distribution. +

        Exercise 18.1 (Bernoulli intuition 1) Arguably the simplest distribution you will encounter is the Bernoulli distribution. It is a discrete probability distribution used to represent the outcome of a yes/no question. It has one parameter \(p\) which is the probability of success. The probability of failure is \((1-p)\), sometimes denoted as \(q\).
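As an optional R sketch of the same idea (not part of the exercise itself): a Bernoulli variable is a binomial variable with a single trial, so dbinom with size = 1 gives its pmf. The probabilities below correspond to the fair coin and the "throw a 6" examples.

```{r}
# Bernoulli pmf via the binomial with one trial
dbinom(0:1, size = 1, prob = 0.5) # fair coin: 0.5 0.5
dbinom(0:1, size = 1, prob = 1/6) # throwing a 6 with a fair die: ~0.833 0.167

# simulate Bernoulli trials and check the relative frequency of success
set.seed(1)
mean(rbinom(10000, size = 1, prob = 1/6)) # close to 1/6
```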

        @@ -389,7 +389,7 @@

        18.1 Discrete distributions
      1. The pmf of a binomial distribution is \(\binom{n}{k} p^k (1 - p)^{n - k}\), now we insert \(n=1\) to get: -\[\binom{1}{k} p^k (1 - p)^{1 - k}\]. +\[\binom{1}{k} p^k (1 - p)^{1 - k}\] Not quite equivalent to a Bernoulli, however note that the support of the binomial distribution is defined as \(k \in \{0,1,\dots,n\}\), so in our case \(k = \{0,1\}\), then: @@ -441,7 +441,7 @@

18.1 Discrete distributions
… occur at a constant mean rate and independently of each other - a Poisson process.

        It has a single parameter \(\lambda\), which represents the constant mean rate.

        A classic example of a scenario that can be modeled using the Poisson distribution -is the number of calls received by a call center in a day (or in fact any other +is the number of calls received at a call center in a day (or in fact any other time interval).
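As an optional illustration, such probabilities can be evaluated directly with dpois and ppois. The rate used below is an arbitrary value picked for the sketch, not the one from the exercise.

```{r}
lambda <- 2.5 # assumed average number of calls per time interval (made up)

dpois(0, lambda)     # probability of exactly 0 calls, approx. 0.082
ppois(1, lambda)     # probability of at most 1 call, approx. 0.287
1 - ppois(1, lambda) # probability of at least 2 calls, approx. 0.713
```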

        Suppose you work in a call center and have some understanding of probability distributions. You overhear your supervisor mentioning that the call center @@ -476,10 +476,13 @@

        18.1 Discrete distributions

        Exercise 18.5 (Geometric intuition 1) The geometric distribution is a discrete distribution that models the number of failures before the first success in a sequence of independent Bernoulli trials. -It has a single parameter \(p\), representing the probability of success.

        +It has a single parameter \(p\), representing the probability of success and its +support is all non-negative integers \(\{0,1,2,\dots\}\).

        NOTE: There are two forms of this distribution, the one we just described and another that models the number of trials before the first success. The -difference is subtle yet significant and you are likely to encounter both forms.

        +difference is subtle yet significant and you are likely to encounter both forms. +The key to telling them apart is to check their support, since the number of trials +has to be at least \(1\), for this case we have \(\{1,2,\dots\}\).

        In the graph below we show the pmf of a geometric distribution with \(p=0.5\). This can be thought of as the number of successive failures (tails) in the flip of a fair coin. You can see that there’s a 50% chance you will have zero failures i.e. you will @@ -552,14 +555,14 @@
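A small optional sketch for telling the two forms apart in R: the built-in dgeom, pgeom and rgeom functions use the number-of-failures form, so their support starts at 0; the number-of-trials form is obtained by shifting the argument by one.

```{r}
dgeom(0:3, prob = 0.5) # number-of-failures form: 0.5 0.25 0.125 0.0625

# pmf of the number-of-trials form at k = 1, 2, 3, 4 (shifted by one)
k <- 1:4
dgeom(k - 1, prob = 0.5)
```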

        18.2 Continuous distributions -Notice how the pdf is just a constant \(1\) across all values of \(x \in [0,1]\). Here it is important to distinguish between probability and probability density. The density may be 1, but the probability is not and while discreet distributions never exceed 1 on the y-axis, continuous distributions can go as high as you like.

      2. +Notice how the pdf is just a constant \(1\) across all values of \(x \in [0,1]\). Here it is important to distinguish between probability and probability density. The density may be 1, but the probability is not and while discrete distributions never exceed 1 on the y-axis, continuous distributions can go as high as you like.

      Exercise 18.7 (Normal intuition 1) The normal distribution, also known as the Gaussian distribution, is a continuous distribution that encompasses the entire real number line. It has two parameters: the mean, denoted by \(\mu\), and the variance, represented by \(\sigma^2\). Its shape resembles the iconic bell curve. The position of its peak is determined by the parameter \(\mu\), while the variance determines the spread or width of the curve. A smaller variance results in a sharper, narrower peak, while a larger variance leads to a broader, more spread-out curve.
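An optional sketch for building intuition about the two parameters; the values below are arbitrary and only illustrate how the mean shifts the peak and the variance widens the curve.

```{r}
curve(dnorm(x, mean = 0, sd = 1), from = -6, to = 6, ylab = "density")
curve(dnorm(x, mean = 0, sd = 2), add = TRUE, lty = 2) # larger variance: wider, flatter
curve(dnorm(x, mean = 2, sd = 1), add = TRUE, lty = 3) # larger mean: shifted right
```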

      Below, we graph the distribution of IQ scores for two different populations.

      -

      We aim to identify individuals with an IQ at or above 140 for an experiment. We can identify them reliably; however, we only have time to examine one of the two groups. Which group should we investigate to have the best chance?

      +

      We aim to identify individuals with an IQ at or above 140 for an experiment. We can identify them reliably; however, we only have time to examine one of the two groups. Which group should we investigate to have the best chance of finding such individuals?

      NOTE: The graph below displays the parameter \(\sigma\), which is the square root of the variance, more commonly referred to as the standard deviation. Keep this in mind when solving the problems.

      1. Insert the values of either population into the pdf of a normal distribution and determine which one has a higher density at \(x=140\).

      2. @@ -582,7 +585,7 @@

18.2 Continuous distributions
\[\frac{1}{\sqrt{2 \pi \sigma^2}} e^{-\frac{(x - \mu)^2}{2 \sigma^2}} = \frac{1}{\sqrt{2 \pi \cdot 64}} e^{-\frac{(140 - 105)^2}{2 \cdot 64}} \approx 3.48 \times 10^{-6}\]

      -

      So despite the fact that group 1 has a lower average IQ, we are more likely to find 140 IQ individuals in group 2.

      +

      So despite the fact that group 1 has a lower average IQ, we are more likely to find 140 IQ individuals in this group.
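For readers who want to verify this numerically, the built-in dnorm and pnorm functions reproduce the comparison with the group parameters stated above (group 1 with mean 100 and standard deviation 10, group 2 with mean 105 and standard deviation 8); this is a quick check, not part of the solution code.

```{r}
dnorm(140, mean = 100, sd = 10) # density for group 1, approx. 1.34e-05
dnorm(140, mean = 105, sd = 8)  # density for group 2, approx. 3.48e-06

# the tail probabilities point the same way
pnorm(140, mean = 100, sd = 10, lower.tail = FALSE) # approx. 3.2e-05
pnorm(140, mean = 105, sd = 8, lower.tail = FALSE)  # approx. 6.1e-06
```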

      1. diff --git a/docs/search_index.json b/docs/search_index.json index 955f399..1068bd3 100644 --- a/docs/search_index.json +++ b/docs/search_index.json @@ -1 +1 @@ -[["index.html", "Principles of Uncertainty – exercises Preface", " Principles of Uncertainty – exercises Gregor Pirš and Erik Štrumbelj 2023-10-01 Preface These are the exercises for the Principles of Uncertainty course of the Data Science Master’s at University of Ljubljana, Faculty of Computer and Information Science. This document will be extended each week as the course progresses. At the end of each exercise session, we will post the solutions to the exercises worked in class and select exercises for homework. Students are also encouraged to solve the remaining exercises to further extend their knowledge. Some exercises require the use of R. Those exercises (or parts of) are coloured blue. Students that are not familiar with R programming language should study A to learn the basics. As the course progresses, we will cover more relevant uses of R for data science. "],["introduction.html", "Chapter 1 Probability spaces 1.1 Measure and probability spaces 1.2 Properties of probability measures 1.3 Discrete probability spaces", " Chapter 1 Probability spaces This chapter deals with measures and probability spaces. At the end of the chapter, we look more closely at discrete probability spaces. The students are expected to acquire the following knowledge: Theoretical Use properties of probability to calculate probabilities. Combinatorics. Understanding of continuity of probability. R Vectors and vector operations. For loop. Estimating probability with simulation. sample function. Matrices and matrix operations. .fold-btn { float: right; margin: 5px 5px 0 0; } .fold { border: 1px solid black; min-height: 40px; } 1.1 Measure and probability spaces Exercise 1.1 (Completing a set to a sigma algebra) Let \\(\\Omega = \\{1,2,...,10\\}\\) and let \\(A = \\{\\emptyset, \\{1\\}, \\{2\\}, \\Omega \\}\\). Show that \\(A\\) is not a sigma algebra of \\(\\Omega\\). Find the minimum number of elements to complete A to a sigma algebra of \\(\\Omega\\). Solution. \\(1^c = \\{2,3,...,10\\} \\notin A \\implies\\) \\(A\\) is not sigma algebra. First we need the complements of all elements, so we need to add sets \\(\\{2,3,...,10\\}\\) and \\(\\{1,3,4,...,10\\}\\). Next we need unions of all sets – we add the set \\(\\{1,2\\}\\). Again we need the complement of this set, so we add \\(\\{3,4,...,10\\}\\). So the minimum number of elements we need to add is 4. Exercise 1.2 (Diversity of sigma algebras) Let \\(\\Omega\\) be a set. Find the smallest sigma algebra of \\(\\Omega\\). Find the largest sigma algebra of \\(\\Omega\\). Solution. \\(A = \\{\\emptyset, \\Omega\\}\\) \\(2^{\\Omega}\\) Exercise 1.3 Find all sigma algebras for \\(\\Omega = \\{0, 1, 2\\}\\). Solution. \\(A = \\{\\emptyset, \\Omega\\}\\) \\(A = 2^{\\Omega}\\) \\(A = \\{\\emptyset, \\{0\\}, \\{1,2\\}, \\Omega\\}\\) \\(A = \\{\\emptyset, \\{1\\}, \\{0,2\\}, \\Omega\\}\\) \\(A = \\{\\emptyset, \\{2\\}, \\{0,1\\}, \\Omega\\}\\) Exercise 1.4 (Difference between algebra and sigma algebra) Let \\(\\Omega = \\mathbb{N}\\) and \\(\\mathcal{A} = \\{A \\subseteq \\mathbb{N}: A \\text{ is finite or } A^c \\text{ is finite.} \\}\\). Show that \\(\\mathcal{A}\\) is an algebra but not a sigma algebra. Solution. \\(\\emptyset\\) is finite so \\(\\emptyset \\in \\mathcal{A}\\). Let \\(A \\in \\mathcal{A}\\) and \\(B \\in \\mathcal{A}\\). 
If both are finite, then their union is also finite and therefore in \\(\\mathcal{A}\\). Let at least one of them not be finite. Then their union is not finite. But \\((A \\cup B)^c = A^c \\cap B^c\\). And since at least one is infinite, then its complement is finite and the intersection is too. So finite unions are in \\(\\mathcal{A}\\). Let us look at numbers \\(2n\\). For any \\(n\\), \\(2n \\in \\mathcal{A}\\) as it is finite. But \\(\\bigcup_{k = 1}^{\\infty} 2n \\notin \\mathcal{A}\\). Exercise 1.5 We define \\(\\sigma(X) = \\cap_{\\lambda \\in I} S_\\lambda\\) to be a sigma algebra, generated by the set \\(X\\), where \\(S_\\lambda\\) are all sigma algebras such that \\(X \\subseteq S_\\lambda\\). \\(S_\\lambda\\) are indexed by \\(\\lambda \\in I\\). Let \\(A, B \\subseteq 2^{\\Omega}\\). Prove that \\(\\sigma(A) = \\sigma(B) \\iff A \\subseteq \\sigma(B) \\land B \\subseteq \\sigma(A)\\). Solution. To prove the equivalence, we need to prove that the left hand side implies the right hand side and vice versa. Proving \\(\\sigma(A) = \\sigma(B) \\Rightarrow A \\subseteq \\sigma(B) \\land B \\subseteq \\sigma(A)\\): we know \\(A \\subseteq \\sigma(A)\\) is always true, so by substituting in \\(\\sigma(B)\\) from the left hand side equality we obtain \\(A \\subseteq \\sigma(B)\\). We obtain \\(B \\subseteq \\sigma(A)\\) by symmetry. This proves the implication. Proving \\(A \\subseteq \\sigma(B) \\land B \\subseteq \\sigma(A) \\Rightarrow \\sigma(A) = \\sigma(B)\\): by definition of a sigma algebra, generated by a set, we have \\(\\sigma(B) = \\cap_{\\lambda \\in I} S_\\lambda\\) where \\(S_\\lambda\\) are all sigma algebras where \\(B \\subseteq S_\\lambda\\). But \\(\\sigma(A)\\) is one of \\(S_\\lambda\\), so we can write \\(\\sigma(B) = \\sigma(A) \\cap \\left(\\cap_{\\lambda \\in I} S_\\lambda \\right)\\), which implies \\(\\sigma(B) \\subseteq \\sigma(A)\\). By symmetry, we have \\(\\sigma(A) \\subseteq \\sigma(B)\\). Since \\(\\sigma(A) \\subseteq \\sigma(B)\\) and \\(\\sigma(B) \\subseteq \\sigma(A)\\), we obtain \\(\\sigma(A) = \\sigma(B)\\), which proves the implication and completes the equivalence proof. Exercise 1.6 (Intro to measure) Take the measurable space \\(\\Omega = \\{1,2\\}\\), \\(F = 2^{\\Omega}\\). Which of the following is a measure? Which is a probability measure? \\(\\mu(\\emptyset) = 0\\), \\(\\mu(\\{1\\}) = 5\\), \\(\\mu(\\{2\\}) = 6\\), \\(\\mu(\\{1,2\\}) = 11\\) \\(\\mu(\\emptyset) = 0\\), \\(\\mu(\\{1\\}) = 0\\), \\(\\mu(\\{2\\}) = 0\\), \\(\\mu(\\{1,2\\}) = 1\\) \\(\\mu(\\emptyset) = 0\\), \\(\\mu(\\{1\\}) = 0\\), \\(\\mu(\\{2\\}) = 0\\), \\(\\mu(\\{1,2\\}) = 0\\) \\(\\mu(\\emptyset) = 0\\), \\(\\mu(\\{1\\}) = 0\\), \\(\\mu(\\{2\\}) = 1\\), \\(\\mu(\\{1,2\\}) = 1\\) \\(\\mu(\\emptyset)=0\\), \\(\\mu(\\{1\\})=0\\), \\(\\mu(\\{2\\})=\\infty\\), \\(\\mu(\\{1,2\\})=\\infty\\) Solution. Measure. Not probability measure since \\(\\mu(\\Omega) > 1\\). Neither due to countable additivity. Measure. Not probability measure since \\(\\mu(\\Omega) = 0\\). Probability measure. Measure. Not probability measure since \\(\\mu(\\Omega) > 1\\). Exercise 1.7 Define a probability space that could be used to model the outcome of throwing two fair 6-sided dice. Solution. 
\\(\\Omega = \\{\\{i,j\\}, i = 1,...,6, j = 1,...,6\\}\\) \\(F = 2^{\\Omega}\\) \\(\\forall \\omega \\in \\Omega\\), \\(P(\\omega) = \\frac{1}{6} \\times \\frac{1}{6} = \\frac{1}{36}\\) 1.2 Properties of probability measures Exercise 1.8 A standard deck (52 cards) is distributed to two persons: 26 cards to each person. All partitions are equally likely. Find the probability that: The first person gets 4 Queens. The first person gets at least 2 Queens. R: Use simulation (sample) to check the above answers. Solution. \\(\\frac{\\binom{48}{22}}{\\binom{52}{26}}\\) 1 - \\(\\frac{\\binom{48}{26} + 4 \\times \\binom{48}{25}}{\\binom{52}{26}}\\) For the simulation, let us represent cards with numbers from 1 to 52, and let 1 through 4 represent Queens. set.seed(1) cards <- 1:52 n <- 10000 q4 <- vector(mode = "logical", length = n) q2 <- vector(mode = "logical", length = n) tmp <- vector(mode = "logical", length = n) for (i in 1:n) { p1 <- sample(1:52, 26) q4[i] <- sum(1:4 %in% p1) == 4 q2[i] <- sum(1:4 %in% p1) >= 2 } sum(q4) / n ## [1] 0.0572 sum(q2) / n ## [1] 0.6894 Exercise 1.9 Let \\(A\\) and \\(B\\) be events with probabilities \\(P(A) = \\frac{2}{3}\\) and \\(P(B) = \\frac{1}{2}\\). Show that \\(\\frac{1}{6} \\leq P(A\\cap B) \\leq \\frac{1}{2}\\), and give examples to show that both extremes are possible. Find corresponding bounds for \\(P(A\\cup B)\\). R: Draw samples from the examples and show the probability bounds of \\(P(A \\cap B)\\) . Solution. From the properties of probability we have \\[\\begin{equation} P(A \\cup B) = P(A) + P(B) - P(A \\cap B) \\leq 1. \\end{equation}\\] From this follows \\[\\begin{align} P(A \\cap B) &\\geq P(A) + P(B) - 1 \\\\ &= \\frac{2}{3} + \\frac{1}{2} - 1 \\\\ &= \\frac{1}{6}, \\end{align}\\] which is the lower bound for the intersection. Conversely, we have \\[\\begin{equation} P(A \\cup B) = P(A) + P(B) - P(A \\cap B) \\geq P(A). \\end{equation}\\] From this follows \\[\\begin{align} P(A \\cap B) &\\leq P(B) \\\\ &= \\frac{1}{2}, \\end{align}\\] which is the upper bound for the intersection. For an example take a fair die. To achieve the lower bound let \\(A = \\{3,4,5,6\\}\\) and \\(B = \\{1,2,3\\}\\), then their intersection is \\(A \\cap B = \\{3\\}\\). To achieve the upper bound take \\(A = \\{1,2,3,4\\}\\) and $B = {1,2,3} $. For the bounds of the union we will use the results from the first part. Again from the properties of probability we have \\[\\begin{align} P(A \\cup B) &= P(A) + P(B) - P(A \\cap B) \\\\ &\\geq P(A) + P(B) - \\frac{1}{2} \\\\ &= \\frac{2}{3}. \\end{align}\\] Conversely \\[\\begin{align} P(A \\cup B) &= P(A) + P(B) - P(A \\cap B) \\\\ &\\leq P(A) + P(B) - \\frac{1}{6} \\\\ &= 1. \\end{align}\\] Therefore \\(\\frac{2}{3} \\leq P(A \\cup B) \\leq 1\\). We use sample in R: set.seed(1) n <- 10000 samps <- sample(1:6, n, replace = TRUE) # lower bound lb <- vector(mode = "logical", length = n) A <- c(1,2,3) B <- c(3,4,5,6) for (i in 1:n) { lb[i] <- samps[i] %in% A & samps[i] %in% B } sum(lb) / n ## [1] 0.1605 # upper bound ub <- vector(mode = "logical", length = n) A <- c(1,2,3) B <- c(1,2,3,4) for (i in 1:n) { ub[i] <- samps[i] %in% A & samps[i] %in% B } sum(ub) / n ## [1] 0.4913 Exercise 1.10 A fair coin is tossed repeatedly. Show that, with probability one, a head turns up sooner or later. Show similarly that any given finite sequence of heads and tails occurs eventually with probability one. Solution. 
\\[\\begin{align} P(\\text{no heads}) &= \\lim_{n \\rightarrow \\infty} P(\\text{no heads in first }n \\text{ tosses}) \\\\ &= \\lim_{n \\rightarrow \\infty} \\frac{1}{2^n} \\\\ &= 0. \\end{align}\\] For the second part, let us fix the given sequence of heads and tails of length \\(k\\) as \\(s\\). A probability that this happens in \\(k\\) tosses is \\(\\frac{1}{2^k}\\). \\[\\begin{align} P(s \\text{ occurs}) &= \\lim_{n \\rightarrow \\infty} P(s \\text{ occurs in first } nk \\text{ tosses}) \\end{align}\\] The right part of the upper equation is greater than if \\(s\\) occurs either in the first \\(k\\) tosses, second \\(k\\) tosses,…, \\(n\\)-th \\(k\\) tosses. Therefore \\[\\begin{align} P(s \\text{ occurs}) &\\geq \\lim_{n \\rightarrow \\infty} P(s \\text{ occurs in first } n \\text{ disjoint sequences of length } k) \\\\ &= \\lim_{n \\rightarrow \\infty} (1 - P(s \\text{ does not occur in first } n \\text{ disjoint sequences})) \\\\ &= 1 - \\lim_{n \\rightarrow \\infty} P(s \\text{ does not occur in first } n \\text{ disjoint sequences}) \\\\ &= 1 - \\lim_{n \\rightarrow \\infty} (1 - \\frac{1}{2^k})^n \\\\ &= 1. \\end{align}\\] Exercise 1.11 An Erdos-Renyi random graph \\(G(n,p)\\) is a model with \\(n\\) nodes, where each pair of nodes is connected with probability \\(p\\). Calculate the probability that there exists a node that is not connected to any other node in \\(G(4,0.6)\\). Show that the upper bound for the probability that there exist 2 nodes that are not connected to any other node for an arbitrary \\(G(n,p)\\) is \\(\\binom{n}{2} (1-p)^{2n - 3}\\). R: Estimate the probability from the first point using simulation. Solution. Let \\(A_i\\) be the event that the \\(i\\)-th node is not connected to any other node. Then our goal is to calculate \\(P(\\cup_{i=1}^n A_i)\\). Using the inclusion-exclusion principle, we get \\[\\begin{align} P(\\cup_{i=1}^n A_i) &= \\sum_i A_i - \\sum_{i<j} P(A_i \\cap A_j) + \\sum_{i<j<k} P(A_i \\cap A_j \\cap A_k) - P(A_1 \\cap A_2 \\cap A_3 \\cap A_4) \\\\ &=4 (1 - p)^3 - \\binom{4}{2} (1 - p)^5 + \\binom{4}{3} (1 - p)^6 - (1 - p)^6 \\\\ &\\approx 0.21. \\end{align}\\] Let \\(A_{ij}\\) be the event that nodes \\(i\\) and \\(j\\) are not connected to any other node. We are interested in \\(P(\\cup_{i<j}A_{ij})\\). By using Boole`s inequality, we get \\[\\begin{align} P(\\cup_{i<j}A_{ij}) \\leq \\sum_{i<j} P(A_{ij}). \\end{align}\\] What is the probability of \\(A_{ij}\\)? There need to be no connections to the \\(i\\)-th node to the remaining nodes (excluding \\(j\\)), the same for the \\(j\\)-th node, and there can be no connection between them. Therefore \\[\\begin{align} P(\\cup_{i<j}A_{ij}) &\\leq \\sum_{i<j} (1 - p)^{2(n-2) + 1} \\\\ &= \\binom{n}{2} (1 - p)^{2n - 3}. \\end{align}\\] set.seed(1) n_samp <- 100000 n <- 4 p <- 0.6 conn_samp <- vector(mode = "logical", length = n_samp) for (i in 1:n_samp) { tmp_mat <- matrix(data = 0, nrow = n, ncol = n) samp_conn <- sample(c(0,1), choose(4,2), replace = TRUE, prob = c(1 - p, p)) tmp_mat[lower.tri(tmp_mat)] <- samp_conn tmp_mat[upper.tri(tmp_mat)] <- t(tmp_mat)[upper.tri(t(tmp_mat))] not_conn <- apply(tmp_mat, 1, sum) if (any(not_conn == 0)) { conn_samp[i] <- TRUE } else { conn_samp[i] <- FALSE } } sum(conn_samp) / n_samp ## [1] 0.20565 1.3 Discrete probability spaces Exercise 1.12 Show that the standard measurable space on \\(\\Omega = \\{0,1,...,n\\}\\) equipped with binomial measure is a discrete probability space. Define another probability measure on this measurable space. 
Show that for \\(n=1\\) the binomial measure is the same as the Bernoulli measure. R: Draw 1000 samples from the binomial distribution \\(p=0.5\\), \\(n=20\\) (rbinom) and compare relative frequencies with theoretical probability measure. Solution. We need to show that the terms of \\(\\sum_{k=0}^n \\binom{n}{k} p^k (1 - p)^{n - k}\\) sum to 1. For that we use the binomial theorem \\(\\sum_{k=0}^n \\binom{n}{k} x^k y^{n-k} = (x + y)^n\\). So \\[\\begin{equation} \\sum_{k=0}^n \\binom{n}{k} p^k (1 - p)^{n - k} = (p + 1 - p)^n = 1. \\end{equation}\\] \\(P(\\{k\\}) = \\frac{1}{n + 1}\\). When \\(n=1\\) then \\(k \\in \\{0,1\\}\\). Inserting \\(n=1\\) into the binomial measure, we get \\(\\binom{1}{k}p^k (1-p)^{1 - k}\\). Now \\(\\binom{1}{1} = \\binom{1}{0} = 1\\), so the measure is \\(p^k (1-p)^{1 - k}\\), which is the Bernoulli measure. set.seed(1) library(ggplot2) library(dplyr) bin_samp <- rbinom(n = 1000, size = 20, prob = 0.5) bin_samp <- data.frame(x = bin_samp) %>% count(x) %>% mutate(n = n / 1000, type = "empirical_frequencies") %>% bind_rows(data.frame(x = 0:20, n = dbinom(0:20, size = 20, prob = 0.5), type = "theoretical_measure")) bin_plot <- ggplot(data = bin_samp, aes(x = x, y = n, fill = type)) + geom_bar(stat="identity", position = "dodge") plot(bin_plot) Exercise 1.13 Show that the standard measurable space on \\(\\Omega = \\{0,1,...,\\infty\\}\\) equipped with geometric measure is a discrete probability space, equipped with Poisson measure is a discrete probability space. Define another probability measure on this measurable space. R: Draw 1000 samples from the Poisson distribution \\(\\lambda = 10\\) (rpois) and compare relative frequencies with theoretical probability measure. Solution. \\(\\sum_{k = 0}^{\\infty} p(1 - p)^k = p \\sum_{k = 0}^{\\infty} (1 - p)^k = p \\frac{1}{1 - 1 + p} = 1\\). We used the formula for geometric series. \\(\\sum_{k = 0}^{\\infty} \\frac{\\lambda^k e^{-\\lambda}}{k!} = e^{-\\lambda} \\sum_{k = 0}^{\\infty} \\frac{\\lambda^k}{k!} = e^{-\\lambda} e^{\\lambda} = 1.\\) We used the Taylor expansion of the exponential function. Since we only have to define a probability measure, we could only assign probabilities that sum to one to a finite number of events in \\(\\Omega\\), and probability zero to the other infinite number of events. However to make this solution more educational, we will try to find a measure that assigns a non-zero probability to all events in \\(\\Omega\\). A good start for this would be to find a converging infinite series, as the probabilities will have to sum to one. One simple converging series is the geometric series \\(\\sum_{k=0}^{\\infty} p^k\\) for \\(|p| < 1\\). Let us choose an arbitrary \\(p = 0.5\\). Then \\(\\sum_{k=0}^{\\infty} p^k = \\frac{1}{1 - 0.5} = 2\\). To complete the measure, we have to normalize it, so it sums to one, therefore \\(P(\\{k\\}) = \\frac{0.5^k}{2}\\) is a probability measure on \\(\\Omega\\). We could make it even more difficult by making this measure dependent on some parameter \\(\\alpha\\), but this is out of the scope of this introductory chapter. 
set.seed(1) pois_samp <- rpois(n = 1000, lambda = 10) pois_samp <- data.frame(x = pois_samp) %>% count(x) %>% mutate(n = n / 1000, type = "empirical_frequencies") %>% bind_rows(data.frame(x = 0:25, n = dpois(0:25, lambda = 10), type = "theoretical_measure")) pois_plot <- ggplot(data = pois_samp, aes(x = x, y = n, fill = type)) + geom_bar(stat="identity", position = "dodge") plot(pois_plot) Exercise 1.14 Define a probability measure on \\((\\Omega = \\mathbb{Z}, 2^{\\mathbb{Z}})\\). Define a probability measure such that \\(P(\\omega) > 0, \\forall \\omega \\in \\Omega\\). R: Implement a random generator that will generate samples with the relative frequency that corresponds to your probability measure. Compare relative frequencies with theoretical probability measure . Solution. \\(P(0) = 1, P(\\omega) = 0, \\forall \\omega \\neq 0\\). \\(P(\\{k\\}) = \\sum_{k = -\\infty}^{\\infty} \\frac{p(1 - p)^{|k|}}{2^{1 - 1_0(k)}}\\), where \\(1_0(k)\\) is the indicator function, which equals to one if \\(k\\) is 0, and equals to zero in every other case. n <- 1000 geom_samps <- rgeom(n, prob = 0.5) sign_samps <- sample(c(FALSE, TRUE), size = n, replace = TRUE) geom_samps[sign_samps] <- -geom_samps[sign_samps] my_pmf <- function (k, p) { indic <- rep(1, length(k)) indic[k == 0] <- 0 return ((p * (1 - p)^(abs(k))) / 2^indic) } geom_samps <- data.frame(x = geom_samps) %>% count(x) %>% mutate(n = n / 1000, type = "empirical_frequencies") %>% bind_rows(data.frame(x = -10:10, n = my_pmf(-10:10, 0.5), type = "theoretical_measure")) geom_plot <- ggplot(data = geom_samps, aes(x = x, y = n, fill = type)) + geom_bar(stat="identity", position = "dodge") plot(geom_plot) Exercise 1.15 Define a probability measure on \\(\\Omega = \\{1,2,3,4,5,6\\}\\) with parameter \\(m \\in \\{1,2,3,4,5,6\\}\\), so that the probability of outcome at distance \\(1\\) from \\(m\\) is half of the probability at distance \\(0\\), at distance \\(2\\) is half of the probability at distance \\(1\\), etc. R: Implement a random generator that will generate samples with the relative frequency that corresponds to your probability measure. Compare relative frequencies with theoretical probability measure . Solution. \\(P(\\{k\\}) = \\frac{\\frac{1}{2}^{|m - k|}}{\\sum_{i=1}^6 \\frac{1}{2}^{|m - i|}}\\) n <- 10000 m <- 4 my_pmf <- function (k, m) { denom <- sum(0.5^abs(m - 1:6)) return (0.5^abs(m - k) / denom) } samps <- c() for (i in 1:n) { a <- sample(1:6, 1) a_val <- my_pmf(a, m) prob <- runif(1) if (prob < a_val) { samps <- c(samps, a) } } samps <- data.frame(x = samps) %>% count(x) %>% mutate(n = n / length(samps), type = "empirical_frequencies") %>% bind_rows(data.frame(x = 1:6, n = my_pmf(1:6, m), type = "theoretical_measure")) my_plot <- ggplot(data = samps, aes(x = x, y = n, fill = type)) + geom_bar(stat="identity", position = "dodge") plot(my_plot) "],["uprobspaces.html", "Chapter 2 Uncountable probability spaces 2.1 Borel sets 2.2 Lebesgue measure", " Chapter 2 Uncountable probability spaces This chapter deals with uncountable probability spaces. The students are expected to acquire the following knowledge: Theoretical Understand Borel sets and identify them. Estimate Lebesgue measure for different sets. Know when sets are Borel-measurable. Understanding of countable and uncountable sets. R Uniform sampling. .fold-btn { float: right; margin: 5px 5px 0 0; } .fold { border: 1px solid black; min-height: 40px; } 2.1 Borel sets Exercise 2.1 Prove that the intersection of two sigma algebras on \\(\\Omega\\) is a sigma algebra. 
Prove that the collection of all open subsets \\((a,b)\\) on \\((0,1]\\) is not a sigma algebra of \\((0,1]\\). Solution. Empty set: \\[\\begin{equation} \\emptyset \\in \\mathcal{A} \\wedge \\emptyset \\in \\mathcal{B} \\Rightarrow \\emptyset \\in \\mathcal{A} \\cap \\mathcal{B} \\end{equation}\\] Complement: \\[\\begin{equation} \\text{Let } A \\in \\mathcal{A} \\cap \\mathcal{B} \\Rightarrow A \\in \\mathcal{A} \\wedge A \\in \\mathcal{B} \\Rightarrow A^c \\in \\mathcal{A} \\wedge A^c \\in \\mathcal{B} \\Rightarrow A^c \\in \\mathcal{A} \\cap \\mathcal{B} \\end{equation}\\] Countable additivity: Let \\(\\{A_i\\}\\) be a countable sequence of subsets in \\(\\mathcal{A} \\cap \\mathcal{B}\\). \\[\\begin{equation} \\forall i: A_i \\in \\mathcal{A} \\cap \\mathcal{B} \\Rightarrow A_i \\in \\mathcal{A} \\wedge A_i \\in \\mathcal{B} \\Rightarrow \\cup A_i \\in \\mathcal{A} \\wedge \\cup A_i \\in \\mathcal{B} \\Rightarrow \\cup A_i \\in \\mathcal{A} \\cap \\mathcal{B} \\end{equation}\\] Let \\(A\\) denote the collection of all open subsets \\((a,b)\\) on \\((0,1]\\). Then \\((0,1) \\in A\\). But \\((0,1)^c = 1 \\notin A\\). Exercise 2.2 Show that \\(\\mathcal{C} = \\sigma(\\mathcal{C})\\) if and only if \\(\\mathcal{C}\\) is a sigma algebra. Solution. “\\(\\Rightarrow\\)” This follows from the definition of a generated sigma algebra. “\\(\\Leftarrow\\)” Let \\(\\mathcal{F} = \\cap_i F_i\\) be the intersection of all sigma algebras that contain \\(\\mathcal{C}\\). Then \\(\\sigma(\\mathcal{C}) = \\mathcal{F}\\). Additionally, \\(\\forall i: \\mathcal{C} \\in F_i\\). So each \\(F_i\\) can be written as \\(F_i = \\mathcal{C} \\cup D\\), where \\(D\\) are the rest of the elements in the sigma algebra. In other words, each sigma algebra in the collection contains at least \\(\\mathcal{C}\\), but can contain other elements. Now for some \\(j\\), \\(F_j = \\mathcal{C}\\) as \\(\\{F_i\\}\\) contains all sigma algebras that contain \\(\\mathcal{C}\\) and \\(\\mathcal{C}\\) is such a sigma algebra. Since this is the smallest subset in the intersection it follows that \\(\\sigma(\\mathcal{C}) = \\mathcal{F} = \\mathcal{C}\\). Exercise 2.3 Let \\(\\mathcal{C}\\) and \\(\\mathcal{D}\\) be two collections of subsets on \\(\\Omega\\) such that \\(\\mathcal{C} \\subset \\mathcal{D}\\). Prove that \\(\\sigma(\\mathcal{C}) \\subseteq \\sigma(\\mathcal{D})\\). Solution. \\(\\sigma(\\mathcal{D})\\) is a sigma algebra that contains \\(\\mathcal{D}\\). It follows that \\(\\sigma(\\mathcal{D})\\) is a sigma algebra that contains \\(\\mathcal{C}\\). Let us write \\(\\sigma(\\mathcal{C}) = \\cap_i F_i\\), where \\(\\{F_i\\}\\) is the collection of all sigma algebras that contain \\(\\mathcal{C}\\). Since \\(\\sigma(\\mathcal{D})\\) is such a sigma algebra, there exists an index \\(j\\), so that \\(F_j = \\sigma(\\mathcal{D})\\). Then we can write \\[\\begin{align} \\sigma(\\mathcal{C}) &= (\\cap_{i \\neq j} F_i) \\cap \\sigma(\\mathcal{D}) \\\\ &\\subseteq \\sigma(\\mathcal{D}). \\end{align}\\] Exercise 2.4 Prove that the following subsets of \\((0,1]\\) are Borel-measurable by finding their measure. Any countable set. The set of numbers in (0,1] whose decimal expansion does not contain 7. Solution. This follows directly from the fact that every countable set is a union of singletons, whose measure is 0. Let us first look at numbers which have a 7 as the first decimal numbers. Their measure is 0.1. Then we take all the numbers with a 7 as the second decimal number (excluding those who already have it as the first). 
These have the measure 0.01, and there are 9 of them, so their total measure is 0.09. We can continue to do so infinitely many times. At each \\(n\\), we have the measure of the intervals which is \\(10^n\\) and the number of those intervals is \\(9^{n-1}\\). Now \\[\\begin{align} \\lambda(A) &= 1 - \\sum_{n = 0}^{\\infty} \\frac{9^n}{10^{n+1}} \\\\ &= 1 - \\frac{1}{10} \\sum_{n = 0}^{\\infty} (\\frac{9}{10})^n \\\\ &= 1 - \\frac{1}{10} \\frac{10}{1} \\\\ &= 0. \\end{align}\\] Since we have shown that the measure of the set is \\(0\\), we have also shown that the set is measurable. Exercise 2.5 Let \\(\\Omega = [0,1]\\), and let \\(\\mathcal{F}_3\\) consist of all countable subsets of \\(\\Omega\\), and all subsets of \\(\\Omega\\) having a countable complement. Show that \\(\\mathcal{F}_3\\) is a sigma algebra. Let us define \\(P(A)=0\\) if \\(A\\) is countable, and \\(P(A) = 1\\) if \\(A\\) has a countable complement. Is \\((\\Omega, \\mathcal{F}_3, P)\\) a legitimate probability space? Solution. The empty set is countable, therefore it is in \\(\\mathcal{F}_3\\). For any \\(A \\in \\mathcal{F}_3\\). If \\(A\\) is countable, then \\(A^c\\) has a countable complement and is in \\(\\mathcal{F}_3\\). If \\(A\\) is uncountable, then it has a countable complement \\(A^c\\) which is therefore also in \\(\\mathcal{F}_3\\). We are left with showing countable additivity. Let \\(\\{A_i\\}\\) be an arbitrary collection of sets in \\(\\mathcal{F}_3\\). We will look at two possibilities. First let all \\(A_i\\) be countable. A countable union of countable sets is countable, and therefore in \\(\\mathcal{F}_3\\). Second, let at least one \\(A_i\\) be uncountable. It follows that it has a countable complement. We can write \\[\\begin{equation} (\\cup_{i=1}^{\\infty} A_i)^c = \\cap_{i=1}^{\\infty} A_i^c. \\end{equation}\\] Since at least one \\(A_i^c\\) on the right side is countable, the whole intersection is countable, and therefore the union has a countable complement. It follows that the union is in \\(\\mathcal{F}_3\\). The tuple \\((\\Omega, \\mathcal{F}_3)\\) is a measurable space. Therefore, we only need to check whether \\(P\\) is a probability measure. The measure of the empty set is zero as it is countable. We have to check for countable additivity. Let us look at three situations. Let \\(A_i\\) be disjoint sets. First, let all \\(A_i\\) be countable. \\[\\begin{equation} P(\\cup_{i=1}^{\\infty} A_i) = \\sum_{i=1}^{\\infty}P( A_i)) = 0. \\end{equation}\\] Since the union is countable, the above equation holds. Second, let exactly one \\(A_i\\) be uncountable. W.L.O.G. let that be \\(A_1\\). Then \\[\\begin{equation} P(\\cup_{i=1}^{\\infty} A_i) = 1 + \\sum_{i=2}^{\\infty}P( A_i)) = 1. \\end{equation}\\] Since the union is uncountable, the above equation holds. Third, let at least two \\(A_i\\) be uncountable. We have to check whether it is possible for two uncountable sets in \\(\\mathcal{F}_3\\) to be disjoint. If that is possible, then their measures would sum to more than one and \\(P\\) would not be a probability measure. W.L.O.G. let \\(A_1\\) and \\(A_2\\) be uncountable. Then we have \\[\\begin{equation} A_1 \\cap A_2 = (A_1^c \\cup A_2^c)^c. \\end{equation}\\] Now \\(A_1^c\\) and \\(A_2^c\\) are countable and their union is therefore countable. Let \\(B = A_1^c \\cup A_2^c\\). So the intersection of \\(A_1\\) and \\(A_2\\) equals the complement of \\(B\\), which is countable. For the intersection to be the empty set, \\(B\\) would have to equal to \\(\\Omega\\). 
But \\(\\Omega\\) is uncountable and therefore \\(B\\) can not equal to \\(\\Omega\\). It follows that two uncountable sets in \\(\\mathcal{F}_3\\) can not have an empty intersection. Therefore the tuple is a legitimate probability space. 2.2 Lebesgue measure Exercise 2.6 Show that the Lebesgue measure of rational numbers on \\([0,1]\\) is 0. R: Implement a random number generator, which generates uniform samples of irrational numbers in \\([0,1]\\) by uniformly sampling from \\([0,1]\\) and rejecting a sample if it is rational. Solution. There are a countable number of rational numbers. Therefore, we can write \\[\\begin{align} \\lambda(\\mathbb{Q}) &= \\lambda(\\cup_{i = 1}^{\\infty} q_i) &\\\\ &= \\sum_{i = 1}^{\\infty} \\lambda(q_i) &\\text{ (countable additivity)} \\\\ &= \\sum_{i = 1}^{\\infty} 0 &\\text{ (Lebesgue measure of a singleton)} \\\\ &= 0. \\end{align}\\] Exercise 2.7 Prove that the Lebesgue measure of \\(\\mathbb{R}\\) is infinity. Paradox. Show that the cardinality of \\(\\mathbb{R}\\) and \\((0,1)\\) is the same, while their Lebesgue measures are infinity and one respectively. Solution. Let \\(a_i\\) be the \\(i\\)-th integer for \\(i \\in \\mathbb{Z}\\). We can write \\(\\mathbb{R} = \\cup_{-\\infty}^{\\infty} (a_i, a_{i + 1}]\\). \\[\\begin{align} \\lambda(\\mathbb{R}) &= \\lambda(\\cup_{i = -\\infty}^{\\infty} (a_i, a_{i + 1}]) \\\\ &= \\lambda(\\lim_{n \\rightarrow \\infty} \\cup_{i = -n}^{n} (a_i, a_{i + 1}]) \\\\ &= \\lim_{n \\rightarrow \\infty} \\lambda(\\cup_{i = -n}^{n} (a_i, a_{i + 1}]) \\\\ &= \\lim_{n \\rightarrow \\infty} \\sum_{i = -n}^{n} \\lambda((a_i, a_{i + 1}]) \\\\ &= \\lim_{n \\rightarrow \\infty} \\sum_{i = -n}^{n} 1 \\\\ &= \\lim_{n \\rightarrow \\infty} 2n \\\\ &= \\infty. \\end{align}\\] We need to find a bijection between \\(\\mathbb{R}\\) and \\((0,1)\\). A well-known function that maps from a bounded interval to \\(\\mathbb{R}\\) is the tangent. To make the bijection easier to achieve, we will take the inverse, which maps from \\(\\mathbb{R}\\) to \\((-\\frac{\\pi}{2}, \\frac{\\pi}{2})\\). However, we need to change the function so it maps to \\((0,1)\\). First we add \\(\\frac{\\pi}{2}\\), so that we move the function above zero. Then we only have to divide by the max value, which in this case is \\(\\pi\\). So our bijection is \\[\\begin{equation} f(x) = \\frac{\\tan^{-1}(x) + \\frac{\\pi}{2}}{\\pi}. \\end{equation}\\] Exercise 2.8 Take the measure space \\((\\Omega_1 = (0,1], B_{(0,1]}, \\lambda)\\) (we know that this is a probability space on \\((0,1]\\)). Define a map (function) from \\(\\Omega_1\\) to \\(\\Omega_2 = \\{1,2,3,4,5,6\\}\\) such that the measure space \\((\\Omega_2, 2^{\\Omega_2}, \\lambda(f^{-1}()))\\) will be a discrete probability space with uniform probabilities (\\(P(\\omega) = \\frac{1}{6}, \\forall \\omega \\in \\Omega_2)\\). Is the map that you defined in (a) the only such map? How would you in the same fashion define a map that would result in a probability space that can be interpreted as a coin toss with probability \\(p\\) of heads? R: Use the map in (a) as a basis for a random generator for this fair die. Solution. In other words, we have to assign disjunct intervals of the same size to each element of \\(\\Omega_2\\). Therefore \\[\\begin{equation} f(x) = \\lceil 6x \\rceil. \\end{equation}\\] No, we could for example rearrange the order in which the intervals are mapped to integers. 
Additionally, we could have several disjoint intervals that mapped to the same integer, as long as the Lebesgue measure of their union would be \\(\\frac{1}{6}\\) and the function would remain injective. We have \\(\\Omega_3 = \\{0,1\\}\\), where zero represents heads and one represents tails. Then \\[\\begin{equation} f(x) = 0^{I_{A}(x)}, \\end{equation}\\] where \\(A = \\{y \\in (0,1] : y < p\\}\\). set.seed(1) unif_s <- runif(1000) die_s <- ceiling(6 * unif_s) summary(as.factor(die_s)) ## 1 2 3 4 5 6 ## 166 154 200 146 166 168 "],["condprob.html", "Chapter 3 Conditional probability 3.1 Calculating conditional probabilities 3.2 Conditional independence 3.3 Monty Hall problem", " Chapter 3 Conditional probability This chapter deals with conditional probability. The students are expected to acquire the following knowledge: Theoretical Identify whether variables are independent. Calculation of conditional probabilities. Understanding of conditional dependence and independence. How to apply Bayes’ theorem to solve difficult probabilistic questions. R Simulating conditional probabilities. cumsum. apply. .fold-btn { float: right; margin: 5px 5px 0 0; } .fold { border: 1px solid black; min-height: 40px; } 3.1 Calculating conditional probabilities Exercise 3.1 A military officer is in charge of identifying enemy aircraft and shooting them down. He is able to positively identify an enemy airplane 95% of the time and positively identify a friendly airplane 90% of the time. Furthermore, 99% of the airplanes are friendly. When the officer identifies an airplane as an enemy airplane, what is the probability that it is not and they will shoot at a friendly airplane? Solution. Let \\(E = 0\\) denote that the observed plane is friendly and \\(E=1\\) that it is an enemy. Let \\(I = 0\\) denote that the officer identified it as friendly and \\(I = 1\\) as enemy. Then \\[\\begin{align} P(E = 0 | I = 1) &= \\frac{P(I = 1 | E = 0)P(E = 0)}{P(I = 1)} \\\\ &= \\frac{P(I = 1 | E = 0)P(E = 0)}{P(I = 1 | E = 0)P(E = 0) + P(I = 1 | E = 1)P(E = 1)} \\\\ &= \\frac{0.1 \\times 0.99}{0.1 \\times 0.99 + 0.95 \\times 0.01} \\\\ &= 0.91. \\end{align}\\] Exercise 3.2 R: Consider tossing a fair die. Let \\(A = \\{2,4,6\\}\\) and \\(B = \\{1,2,3,4\\}\\). Then \\(P(A) = \\frac{1}{2}\\), \\(P(B) = \\frac{2}{3}\\) and \\(P(AB) = \\frac{1}{3}\\). Since \\(P(AB) = P(A)P(B)\\), the events \\(A\\) and \\(B\\) are independent. Simulate draws from the sample space and verify that the proportions are the same. Then find two events \\(C\\) and \\(D\\) that are not independent and repeat the simulation. set.seed(1) nsamps <- 10000 tosses <- sample(1:6, nsamps, replace = TRUE) PA <- sum(tosses %in% c(2,4,6)) / nsamps PB <- sum(tosses %in% c(1,2,3,4)) / nsamps PA * PB ## [1] 0.3295095 sum(tosses %in% c(2,4)) / nsamps ## [1] 0.3323 # Let C = {1,2} and D = {2,3} PC <- sum(tosses %in% c(1,2)) / nsamps PD <- sum(tosses %in% c(2,3)) / nsamps PC * PD ## [1] 0.1067492 sum(tosses %in% c(2)) / nsamps ## [1] 0.1622 Exercise 3.3 A machine reports the true value of a thrown 12-sided die 5 out of 6 times. If the machine reports a 1 has been tossed, what is the probability that it is actually a 1? Now let the machine only report whether a 1 has been tossed or not. Does the probability change? R: Use simulation to check your answers to a) and b). Solution. Let \\(T = 1\\) denote that the toss is 1 and \\(M = 1\\) that the machine reports a 1. 
\\[\\begin{align} P(T = 1 | M = 1) &= \\frac{P(M = 1 | T = 1)P(T = 1)}{P(M = 1)} \\\\ &= \\frac{P(M = 1 | T = 1)P(T = 1)}{\\sum_{k=1}^{12} P(M = 1 | T = k)P(T = k)} \\\\ &= \\frac{\\frac{5}{6}\\frac{1}{12}}{\\frac{5}{6}\\frac{1}{12} + 11 \\frac{1}{6} \\frac{1}{11} \\frac{1}{12}} \\\\ &= \\frac{5}{6}. \\end{align}\\] Yes. \\[\\begin{align} P(T = 1 | M = 1) &= \\frac{P(M = 1 | T = 1)P(T = 1)}{P(M = 1)} \\\\ &= \\frac{P(M = 1 | T = 1)P(T = 1)}{\\sum_{k=1}^{12} P(M = 1 | T = k)P(T = k)} \\\\ &= \\frac{\\frac{5}{6}\\frac{1}{12}}{\\frac{5}{6}\\frac{1}{12} + 11 \\frac{1}{6} \\frac{1}{12}} \\\\ &= \\frac{5}{16}. \\end{align}\\] set.seed(1) nsamps <- 10000 report_a <- vector(mode = "numeric", length = nsamps) report_b <- vector(mode = "logical", length = nsamps) truths <- vector(mode = "logical", length = nsamps) for (i in 1:10000) { toss <- sample(1:12, size = 1) truth <- sample(c(TRUE, FALSE), size = 1, prob = c(5/6, 1/6)) truths[i] <- truth if (truth) { report_a[i] <- toss report_b[i] <- toss == 1 } else { remaining <- (1:12)[1:12 != toss] report_a[i] <- sample(remaining, size = 1) report_b[i] <- toss != 1 } } truth_a1 <- truths[report_a == 1] sum(truth_a1) / length(truth_a1) ## [1] 0.8300733 truth_b1 <- truths[report_b] sum(truth_b1) / length(truth_b1) ## [1] 0.3046209 Exercise 3.4 A coin is tossed independently \\(n\\) times. The probability of heads at each toss is \\(p\\). At each time \\(k\\), \\((k = 2,3,...,n)\\) we get a reward at time \\(k+1\\) if \\(k\\)-th toss was a head and the previous toss was a tail. Let \\(A_k\\) be the event that a reward is obtained at time \\(k\\). Are events \\(A_k\\) and \\(A_{k+1}\\) independent? Are events \\(A_k\\) and \\(A_{k+2}\\) independent? R: simulate 10 tosses 10000 times, where \\(p = 0.7\\). Check your answers to a) and b) by counting the frequencies of the events \\(A_5\\), \\(A_6\\), and \\(A_7\\). Solution. For \\(A_k\\) to happen, we need the tosses \\(k-2\\) and \\(k-1\\) be tails and heads respectively. For \\(A_{k+1}\\) to happen, we need tosses \\(k-1\\) and \\(k\\) be tails and heads respectively. As the toss \\(k-1\\) need to be heads for one and tails for the other, these two events can not happen simultaneously. Therefore the probability of their intersection is 0. But the probability of each of them separately is \\(p(1-p) > 0\\). Therefore, they are not independent. For \\(A_k\\) to happen, we need the tosses \\(k-2\\) and \\(k-1\\) be tails and heads respectively. For \\(A_{k+2}\\) to happen, we need tosses \\(k\\) and \\(k+1\\) be tails and heads respectively. So the probability of intersection is \\(p^2(1-p)^2\\). And the probability of each separately is again \\(p(1-p)\\). Therefore, they are independent. 
set.seed(1) nsamps <- 10000 p <- 0.7 rewardA_5 <- vector(mode = "logical", length = nsamps) rewardA_6 <- vector(mode = "logical", length = nsamps) rewardA_7 <- vector(mode = "logical", length = nsamps) rewardA_56 <- vector(mode = "logical", length = nsamps) rewardA_57 <- vector(mode = "logical", length = nsamps) for (i in 1:nsamps) { samps <- sample(c(0,1), size = 10, replace = TRUE, prob = c(0.7, 0.3)) rewardA_5[i] <- (samps[4] == 0 & samps[3] == 1) rewardA_6[i] <- (samps[5] == 0 & samps[4] == 1) rewardA_7[i] <- (samps[6] == 0 & samps[5] == 1) rewardA_56[i] <- (rewardA_5[i] & rewardA_6[i]) rewardA_57[i] <- (rewardA_5[i] & rewardA_7[i]) } sum(rewardA_5) / nsamps ## [1] 0.2141 sum(rewardA_6) / nsamps ## [1] 0.2122 sum(rewardA_7) / nsamps ## [1] 0.2107 sum(rewardA_56) / nsamps ## [1] 0 sum(rewardA_57) / nsamps ## [1] 0.0454 Exercise 3.5 A drawer contains two coins. One is an unbiased coin, the other is a biased coin, which will turn up heads with probability \\(p\\) and tails with probability \\(1-p\\). One coin is selected uniformly at random. The selected coin is tossed \\(n\\) times. The coin turns up heads \\(k\\) times and tails \\(n-k\\) times. What is the probability that the coin is biased? The selected coin is tossed repeatedly until it turns up heads \\(k\\) times. Given that it is tossed \\(n\\) times in total, what is the probability that the coin is biased? Solution. Let \\(B = 1\\) denote that the coin is biased and let \\(H = k\\) denote that we’ve seen \\(k\\) heads. \\[\\begin{align} P(B = 1 | H = k) &= \\frac{P(H = k | B = 1)P(B = 1)}{P(H = k)} \\\\ &= \\frac{P(H = k | B = 1)P(B = 1)}{P(H = k | B = 1)P(B = 1) + P(H = k | B = 0)P(B = 0)} \\\\ &= \\frac{p^k(1-p)^{n-k} 0.5}{p^k(1-p)^{n-k} 0.5 + 0.5^{n+1}} \\\\ &= \\frac{p^k(1-p)^{n-k}}{p^k(1-p)^{n-k} + 0.5^n}. \\end{align}\\] The same results as in a). The only difference between these two scenarios is that in b) the last throw must be heads. However, this holds for the biased and the unbiased coin and therefore does not affect the probability of the coin being biased. Exercise 3.6 Judy goes around the company for Women’s day and shares flowers. In every office she leaves a flower, if there is at least one woman inside. The probability that there’s a woman in the office is \\(\\frac{3}{5}\\). What is the probability that Judy leaves her first flower in the fourth office? Given that she has given away exactly three flowers in the first four offices, what is the probability that she gives her fourth flower in the eighth office? What is the probability that she leaves the second flower in the fifth office? What is the probability that she leaves the second flower in the fifth office, given that she did not leave the second flower in the second office? Judy needs a new supply of flowers immediately after the office, where she gives away her last flower. What is the probability that she visits at least five offices, if she starts with two flowers? R: simulate Judy’s walk 10000 times to check your answers a) - e). Solution. Let \\(X_i = k\\) denote the event that … \\(i\\)-th sample on the \\(k\\)-th run. Since the events are independent, we can multiply their probabilities to get \\[\\begin{equation} P(X_1 = 4) = 0.4^3 \\times 0.6 = 0.0384. \\end{equation}\\] Same as in a) as we have a fresh start after first four offices. For this to be possible, she had to leave the first flower in one of the first four offices. Therefore there are four possibilities, and for each of those the probability is \\(0.4^3 \\times 0.6\\). 
Additionally, the probability that she leaves a flower in the fifth office is \\(0.6\\). So \\[\\begin{equation} P(X_2 = 5) = \\binom{4}{1} \\times 0.4^3 \\times 0.6^2 = 0.09216. \\end{equation}\\] We use Bayes’ theorem. \\[\\begin{align} P(X_2 = 5 | X_2 \\neq 2) &= \\frac{P(X_2 \\neq 2 | X_2 = 5)P(X_2 = 5)}{P(X_2 \\neq 2)} \\\\ &= \\frac{0.09216}{0.64} \\\\ &= 0.144. \\end{align}\\] The denominator in the second equation can be calculated as follows. One of three things has to happen for the second not to be dealt in the second round. First, both are zero, so \\(0.4^2\\). Second, first is zero, and second is one, so \\(0.4 \\times 0.6\\). Third, the first is one and the second one zero, so \\(0.6 \\times 0.4\\). Summing these values we get \\(0.64\\). We will look at the complement, so the events that she gave away exactly two flowers after two, three and four offices. \\[\\begin{equation} P(X_2 \\geq 5) = 1 - 0.6^2 - 2 \\times 0.4 \\times 0.6^2 - 3 \\times 0.4^2 \\times 0.6^2 = 0.1792. \\end{equation}\\] The multiplying parts represent the possibilities of the first flower. set.seed(1) nsamps <- 100000 Judyswalks <- matrix(data = NA, nrow = nsamps, ncol = 8) for (i in 1:nsamps) { thiswalk <- sample(c(0,1), size = 8, replace = TRUE, prob = c(0.4, 0.6)) Judyswalks[i, ] <- thiswalk } csJudy <- t(apply(Judyswalks, 1, cumsum)) # a sum(csJudy[ ,4] == 1 & csJudy[ ,3] == 0) / nsamps ## [1] 0.03848 # b csJsubset <- csJudy[csJudy[ ,4] == 3 & csJudy[ ,3] == 2, ] sum(csJsubset[ ,8] == 4 & csJsubset[ ,7] == 3) / nrow(csJsubset) ## [1] 0.03665893 # c sum(csJudy[ ,5] == 2 & csJudy[ ,4] == 1) / nsamps ## [1] 0.09117 # d sum(csJudy[ ,5] == 2 & csJudy[ ,4] == 1) / sum(csJudy[ ,2] != 2) ## [1] 0.1422398 # e sum(csJudy[ ,4] < 2) / nsamps ## [1] 0.17818 3.2 Conditional independence Exercise 3.7 Describe: A real-world example of two events \\(A\\) and \\(B\\) that are dependent but become conditionally independent if conditioned on a third event \\(C\\). A real-world example of two events \\(A\\) and \\(B\\) that are independent, but become dependent if conditioned on some third event \\(C\\). Solution. Let \\(A\\) be the height of a person and let \\(B\\) be the person’s knowledge of the Dutch language. These events are dependent since the Dutch are known to be taller than average. However if \\(C\\) is the nationality of the person, then \\(A\\) and \\(B\\) are independent given \\(C\\). Let \\(A\\) be the event that Mary passes the exam and let \\(B\\) be the event that John passes the exam. These events are independent. However, if the event \\(C\\) is that Mary and John studied together, then \\(A\\) and \\(B\\) are conditionally dependent given \\(C\\). Exercise 3.8 We have two coins of identical appearance. We know that one is a fair coin and the other flips heads 80% of the time. We choose one of the two coins uniformly at random. We discard the coin that was not chosen. We now flip the chosen coin independently 10 times, producing a sequence \\(Y_1 = y_1\\), \\(Y_2 = y_2\\), …, \\(Y_{10} = y_{10}\\). Intuitively, without doing and computation, are these random variables independent? Compute the probability \\(P(Y_1 = 1)\\). Compute the probabilities \\(P(Y_2 = 1 | Y_1 = 1)\\) and \\(P(Y_{10} = 1 | Y_1 = 1,...,Y_9 = 1)\\). Given your answers to b) and c), would you now change your answer to a)? If so, discuss why your intuition had failed. Solution. \\(P(Y_1 = 1) = 0.5 * 0.8 + 0.5 * 0.5 = 0.65\\). 
Since we know that \\(Y_1 = 1\\) this should change our view of the probability of the coin being biased or not. Let \\(B = 1\\) denote the event that the coin is biased and let \\(B = 0\\) denote that the coin is unbiased. By using marginal probability, we can write \\[\\begin{align} P(Y_2 = 1 | Y_1 = 1) &= P(Y_2 = 1, B = 1 | Y_1 = 1) + P(Y_2 = 1, B = 0 | Y_1 = 1) \\\\ &= \\sum_{k=1}^2 P(Y_2 = 1 | B = k, Y_1 = 1)P(B = k | Y_1 = 1) \\\\ &= 0.8 \\frac{P(Y_1 = 1 | B = 1)P(B = 1)}{P(Y_1 = 1)} + 0.5 \\frac{P(Y_1 = 1 | B = 0)P(B = 0)}{P(Y_1 = 1)} \\\\ &= 0.8 \\frac{0.8 \\times 0.5}{0.65} + 0.5 \\frac{0.5 \\times 0.5}{0.65} \\\\ &\\approx 0.68. \\end{align}\\] For the other calculation we follow the same procedure. Let \\(X = 1\\) denote that first nine tosses are all heads (equivalent to \\(Y_1 = 1\\),…, \\(Y_9 = 1\\)). \\[\\begin{align} P(Y_{10} = 1 | X = 1) &= P(Y_2 = 1, B = 1 | X = 1) + P(Y_2 = 1, B = 0 | X = 1) \\\\ &= \\sum_{k=1}^2 P(Y_2 = 1 | B = k, X = 1)P(B = k | X = 1) \\\\ &= 0.8 \\frac{P(X = 1 | B = 1)P(B = 1)}{P(X = 1)} + 0.5 \\frac{P(X = 1 | B = 0)P(B = 0)}{P(X = 1)} \\\\ &= 0.8 \\frac{0.8^9 \\times 0.5}{0.5 \\times 0.8^9 + 0.5 \\times 0.5^9} + 0.5 \\frac{0.5^9 \\times 0.5}{0.5 \\times 0.8^9 + 0.5 \\times 0.5^9} \\\\ &\\approx 0.8. \\end{align}\\] 3.3 Monty Hall problem The Monty Hall problem is a famous probability puzzle with non-intuitive outcome. Many established mathematicians and statisticians had problems solving it and many even disregarded the correct solution until they’ve seen the proof by simulation. Here we will show how it can be solved relatively simply with the use of Bayes’ theorem if we select the variables in a smart way. Exercise 3.9 (Monty Hall problem) A prize is placed at random behind one of three doors. You pick a door. Now Monty Hall chooses one of the other two doors, opens it and shows you that it is empty. He then gives you the opportunity to keep your door or switch to the other unopened door. Should you stay or switch? Use Bayes’ theorem to calculate the probability of winning if you switch and if you do not. R: Check your answers in R. Solution. W.L.O.G. assume we always pick the first door. The host can only open door 2 or door 3, as he can not open the door we picked. Let \\(k \\in \\{2,3\\}\\). Let us first look at what happens if we do not change. Then we have \\[\\begin{align} P(\\text{car in 1} | \\text{open $k$}) &= \\frac{P(\\text{open $k$} | \\text{car in 1})P(\\text{car in 1})}{P(\\text{open $k$})} \\\\ &= \\frac{P(\\text{open $k$} | \\text{car in 1})P(\\text{car in 1})}{\\sum_{n=1}^3 P(\\text{open $k$} | \\text{car in $n$})P(\\text{car in $n$)}}. \\end{align}\\] The probability that he opened \\(k\\) if the car is in 1 is \\(\\frac{1}{2}\\), as he can choose between door 2 and 3 as both have a goat behind it. Let us look at the normalization constant. When \\(n = 1\\) we get the value in the nominator. When \\(n=k\\), we get 0, as he will not open the door if there’s a prize behind. The remaining option is that we select 1, the car is behind \\(k\\) and he opens the only door left. Since he can’t open 1 due to it being our pick and \\(k\\) due to having the prize, the probability of opening the remaining door is 1, and the prior probability of the car being behind this door is \\(\\frac{1}{3}\\). So we have \\[\\begin{align} P(\\text{car in 1} | \\text{open $k$}) &= \\frac{\\frac{1}{2}\\frac{1}{3}}{\\frac{1}{2}\\frac{1}{3} + \\frac{1}{3}} \\\\ &= \\frac{1}{3}. \\end{align}\\] Now let us look at what happens if we do change. 
Let \\(k' \\in \\{2,3\\}\\) be the door that is not opened. If we change, we select this door, so we have \\[\\begin{align} P(\\text{car in $k'$} | \\text{open $k$}) &= \\frac{P(\\text{open $k$} | \\text{car in $k'$})P(\\text{car in $k'$})}{P(\\text{open $k$})} \\\\ &= \\frac{P(\\text{open $k$} | \\text{car in $k'$})P(\\text{car in $k'$})}{\\sum_{n=1}^3 P(\\text{open $k$} | \\text{car in $n$})P(\\text{car in $n$)}}. \\end{align}\\] The denominator stays the same, the only thing that is different from before is \\(P(\\text{open $k$} | \\text{car in $k'$})\\). We have a situation where we initially selected door 1 and the car is in door \\(k'\\). The probability that the host will open door \\(k\\) is then 1, as he can not pick any other door. So we have \\[\\begin{align} P(\\text{car in $k'$} | \\text{open $k$}) &= \\frac{\\frac{1}{3}}{\\frac{1}{2}\\frac{1}{3} + \\frac{1}{3}} \\\\ &= \\frac{2}{3}. \\end{align}\\] Therefore it makes sense to change the door. set.seed(1) nsamps <- 1000 ifchange <- vector(mode = "logical", length = nsamps) ifstay <- vector(mode = "logical", length = nsamps) for (i in 1:nsamps) { where_car <- sample(c(1:3), 1) where_player <- sample(c(1:3), 1) open_samp <- (1:3)[where_car != (1:3) & where_player != (1:3)] if (length(open_samp) == 1) { where_open <- open_samp } else { where_open <- sample(open_samp, 1) } ifstay[i] <- where_car == where_player where_ifchange <- (1:3)[where_open != (1:3) & where_player != (1:3)] ifchange[i] <- where_ifchange == where_car } sum(ifstay) / nsamps ## [1] 0.328 sum(ifchange) / nsamps ## [1] 0.672 "],["rvs.html", "Chapter 4 Random variables 4.1 General properties and calculations 4.2 Discrete random variables 4.3 Continuous random variables 4.4 Singular random variables 4.5 Transformations", " Chapter 4 Random variables This chapter deals with random variables and their distributions. The students are expected to acquire the following knowledge: Theoretical Identification of random variables. Convolutions of random variables. Derivation of PDF, PMF, CDF, and quantile function. Definitions and properties of common discrete random variables. Definitions and properties of common continuous random variables. Transforming univariate random variables. R Familiarize with PDF, PMF, CDF, and quantile functions for several distributions. Visual inspection of probability distributions. Analytical and empirical calculation of probabilities based on distributions. New R functions for plotting (for example, facet_wrap). Creating random number generators based on the Uniform distribution. .fold-btn { float: right; margin: 5px 5px 0 0; } .fold { border: 1px solid black; min-height: 40px; } 4.1 General properties and calculations Exercise 4.1 Which of the functions below are valid CDFs? Find their respective densities. R: Plot the three functions. \\[\\begin{equation} F(x) = \\begin{cases} 1 - e^{-x^2} & x \\geq 0 \\\\ 0 & x < 0. \\end{cases} \\end{equation}\\] \\[\\begin{equation} F(x) = \\begin{cases} e^{-\\frac{1}{x}} & x > 0 \\\\ 0 & x \\leq 0. \\end{cases} \\end{equation}\\] \\[\\begin{equation} F(x) = \\begin{cases} 0 & x \\leq 0 \\\\ \\frac{1}{3} & 0 < x \\leq \\frac{1}{2} \\\\ 1 & x > \\frac{1}{2}. \\end{cases} \\end{equation}\\] Solution. Yes. First, let us check the limits. \\(\\lim_{x \\rightarrow -\\infty} (0) = 0\\). \\(\\lim_{x \\rightarrow \\infty} (1 - e^{-x^2}) = 1 - \\lim_{x \\rightarrow \\infty} e^{-x^2} = 1 - 0 = 1\\). Second, let us check whether the function is increasing. Let \\(x > y \\geq 0\\). 
Then \\(1 - e^{-x^2} \\geq 1 - e^{-y^2}\\). We only have to check right continuity for the point zero. \\(F(0) = 0\\) and \\(\\lim_{\\epsilon \\downarrow 0}F (0 + \\epsilon) = \\lim_{\\epsilon \\downarrow 0} 1 - e^{-\\epsilon^2} = 1 - \\lim_{\\epsilon \\downarrow 0} e^{-\\epsilon^2} = 1 - 1 = 0\\). We get the density by differentiating the CDF. \\(p(x) = \\frac{d}{dx} 1 - e^{-x^2} = 2xe^{-x^2}.\\) Students are encouraged to check that this is a proper PDF. Yes. First, let us check the limits. $_{x -} (0) = 0 and \\(\\lim_{x \\rightarrow \\infty} (e^{-\\frac{1}{x}}) = 1\\). Second, let us check whether the function is increasing. Let \\(x > y \\geq 0\\). Then \\(e^{-\\frac{1}{x}} \\geq e^{-\\frac{1}{y}}\\). We only have to check right continuity for the point zero. \\(F(0) = 0\\) and \\(\\lim_{\\epsilon \\downarrow 0}F (0 + \\epsilon) = \\lim_{\\epsilon \\downarrow 0} e^{-\\frac{1}{\\epsilon}} = 0\\). We get the density by differentiating the CDF. \\(p(x) = \\frac{d}{dx} e^{-\\frac{1}{x}} = \\frac{1}{x^2}e^{-\\frac{1}{x}}.\\) Students are encouraged to check that this is a proper PDF. No. The function is not right continuous as \\(F(\\frac{1}{2}) = \\frac{1}{3}\\), but \\(\\lim_{\\epsilon \\downarrow 0} F(\\frac{1}{2} + \\epsilon) = 1\\). f1 <- function (x) { tmp <- 1 - exp(-x^2) tmp[x < 0] <- 0 return(tmp) } f2 <- function (x) { tmp <- exp(-(1 / x)) tmp[x <= 0] <- 0 return(tmp) } f3 <- function (x) { tmp <- x tmp[x == x] <- 1 tmp[x <= 0.5] <- 1/3 tmp[x <= 0] <- 0 return(tmp) } cdf_data <- tibble(x = seq(-1, 20, by = 0.001), f1 = f1(x), f2 = f2(x), f3 = f3(x)) %>% melt(id.vars = "x") cdf_plot <- ggplot(data = cdf_data, aes(x = x, y = value, color = variable)) + geom_hline(yintercept = 1) + geom_line() plot(cdf_plot) Exercise 4.2 Let \\(X\\) be a random variable with CDF \\[\\begin{equation} F(x) = \\begin{cases} 0 & x < 0 \\\\ \\frac{x^2}{2} & 0 \\leq x < 1 \\\\ \\frac{1}{2} + \\frac{p}{2} & 1 \\leq x < 2 \\\\ \\frac{1}{2} + \\frac{p}{2} + \\frac{1 - p}{2} & x \\geq 2 \\end{cases} \\end{equation}\\] R: Plot this CDF for \\(p = 0.3\\). Is it a discrete, continuous, or mixed random varible? Find the probability density/mass of \\(X\\). f1 <- function (x, p) { tmp <- x tmp[x >= 2] <- 0.5 + (p * 0.5) + ((1-p) * 0.5) tmp[x < 2] <- 0.5 + (p * 0.5) tmp[x < 1] <- (x[x < 1])^2 / 2 tmp[x < 0] <- 0 return(tmp) } cdf_data <- tibble(x = seq(-1, 5, by = 0.001), y = f1(x, 0.3)) cdf_plot <- ggplot(data = cdf_data, aes(x = x, y = y)) + geom_hline(yintercept = 1) + geom_line(color = "blue") plot(cdf_plot) ::: {.solution} \\(X\\) is a mixed random variable. Since \\(X\\) is a mixed random variable, we have to find the PDF of the continuous part and the PMF of the discrete part. We get the continuous part by differentiating the corresponding CDF, \\(\\frac{d}{dx}\\frac{x^2}{2} = x\\). So the PDF, when \\(0 \\leq x < 1\\), is \\(p(x) = x\\). Let us look at the discrete part now. It has two steps, so this is a discrete distribution with two outcomes – numbers 1 and 2. The first happens with probability \\(\\frac{p}{2}\\), and the second with probability \\(\\frac{1 - p}{2}\\). This reminds us of the Bernoulli distribution. The PMF for the discrete part is \\(P(X = x) = (\\frac{p}{2})^{2 - x} (\\frac{1 - p}{2})^{x - 1}\\). ::: Exercise 4.3 (Convolutions) Convolutions are probability distributions that correspond to sums of independent random variables. Let \\(X\\) and \\(Y\\) be independent discrete variables. Find the PMF of \\(Z = X + Y\\). Hint: Use the law of total probability. 
Let \\(X\\) and \\(Y\\) be independent continuous variables. Find the PDF of \\(Z = X + Y\\). Hint: Start with the CDF. Solution. \\[\\begin{align} P(Z = z) &= P(X + Y = z) & \\\\ &= \\sum_{k = -\\infty}^\\infty P(X + Y = z | Y = k) P(Y = k) & \\text{ (law of total probability)} \\\\ &= \\sum_{k = -\\infty}^\\infty P(X + k = z | Y = k) P(Y = k) & \\\\ &= \\sum_{k = -\\infty}^\\infty P(X + k = z) P(Y = k) & \\text{ (independence of $X$ and $Y$)} \\\\ &= \\sum_{k = -\\infty}^\\infty P(X = z - k) P(Y = k). & \\end{align}\\] Let \\(f\\) and \\(g\\) be the PDFs of \\(X\\) and \\(Y\\) respectively. \\[\\begin{align} F(z) &= P(Z < z) \\\\ &= P(X + Y < z) \\\\ &= \\int_{-\\infty}^{\\infty} P(X + Y < z | Y = y)P(Y = y)dy \\\\ &= \\int_{-\\infty}^{\\infty} P(X + y < z | Y = y)P(Y = y)dy \\\\ &= \\int_{-\\infty}^{\\infty} P(X + y < z)P(Y = y)dy \\\\ &= \\int_{-\\infty}^{\\infty} P(X < z - y)P(Y = y)dy \\\\ &= \\int_{-\\infty}^{\\infty} (\\int_{-\\infty}^{z - y} f(x) dx) g(y) dy \\end{align}\\] Now \\[\\begin{align} p(z) &= \\frac{d}{dz} F(z) & \\\\ &= \\int_{-\\infty}^{\\infty} (\\frac{d}{dz}\\int_{-\\infty}^{z - y} f(x) dx) g(y) dy & \\\\ &= \\int_{-\\infty}^{\\infty} f(z - y) g(y) dy & \\text{ (fundamental theorem of calculus)}. \\end{align}\\] 4.2 Discrete random variables Exercise 4.4 (Binomial random variable) Let \\(X_k\\), \\(k = 1,...,n\\), be random variables with the Bernoulli measure as the PMF. Let \\(X = \\sum_{k=1}^n X_k\\). We call \\(X_k\\) a Bernoulli random variable with parameter \\(p \\in (0,1)\\). Find the CDF of \\(X_k\\). Find PMF of \\(X\\). This is a Binomial random variable with support in \\(\\{0,1,2,...,n\\}\\) and parameters \\(p \\in (0,1)\\) and \\(n \\in \\mathbb{N}_0\\). We denote \\[\\begin{equation} X | n,p \\sim \\text{binomial}(n,p). \\end{equation}\\] Find CDF of \\(X\\). R: Simulate from the binomial distribution with \\(n = 10\\) and \\(p = 0.5\\), and from \\(n\\) Bernoulli distributions with \\(p = 0.5\\). Visually compare the sum of Bernoullis and the binomial. Hint: there is no standard function like rpois for a Bernoulli random variable. Check exercise 1.12 to find out how to sample from a Bernoulli distribution. Solution. There are two outcomes – zero and one. Zero happens with probability \\(1 - p\\). Therefore \\[\\begin{equation} F(k) = \\begin{cases} 0 & k < 0 \\\\ 1 - p & 0 \\leq k < 1 \\\\ 1 & k \\geq 1. \\end{cases} \\end{equation}\\] For the probability of \\(X\\) to be equal to some \\(k \\leq n\\), exactly \\(k\\) Bernoulli variables need to be one, and the others zero. So \\(p^k(1-p)^{n-k}\\). There are \\(\\binom{n}{k}\\) such possible arrangements. Therefore \\[\\begin{align} P(X = k) = \\binom{n}{k} p^k (1 - p)^{n-k}. \\end{align}\\] \\[\\begin{equation} F(k) = \\sum_{i = 0}^{\\lfloor k \\rfloor} \\binom{n}{i} p^i (1 - p)^{n - i} \\end{equation}\\] set.seed(1) nsamps <- 10000 binom_samp <- rbinom(nsamps, size = 10, prob = 0.5) bernoulli_mat <- matrix(data = NA, nrow = nsamps, ncol = 10) for (i in 1:nsamps) { bernoulli_mat[i, ] <- rbinom(10, size = 1, prob = 0.5) } bern_samp <- apply(bernoulli_mat, 1, sum) b_data <- tibble(x = c(binom_samp, bern_samp), type = c(rep("binomial", 10000), rep("Bernoulli_sum", 10000))) b_plot <- ggplot(data = b_data, aes(x = x, fill = type)) + geom_bar(position = "dodge") plot(b_plot) Exercise 4.5 (Geometric random variable) A variable with PMF \\[\\begin{equation} P(k) = p(1-p)^k \\end{equation}\\] is a geometric random variable with support in non-negative integers. It has one parameter \\(p \\in (0,1]\\). 
We denote \\[\\begin{equation} X | p \\sim \\text{geometric}(p) \\end{equation}\\] Derive the CDF of a geometric random variable. R: Draw 1000 samples from the geometric distribution with \\(p = 0.3\\) and compare their frequencies to theoretical values. Solution. \\[\\begin{align} P(X \\leq k) &= \\sum_{i = 0}^k p(1-p)^i \\\\ &= p \\sum_{i = 0}^k (1-p)^i \\\\ &= p \\frac{1 - (1-p)^{k+1}}{1 - (1 - p)} \\\\ &= 1 - (1-p)^{k + 1} \\end{align}\\] set.seed(1) geo_samp <- rgeom(n = 1000, prob = 0.3) geo_samp <- data.frame(x = geo_samp) %>% count(x) %>% mutate(n = n / 1000, type = "empirical_frequencies") %>% bind_rows(data.frame(x = 0:20, n = dgeom(0:20, prob = 0.3), type = "theoretical_measure")) geo_plot <- ggplot(data = geo_samp, aes(x = x, y = n, fill = type)) + geom_bar(stat="identity", position = "dodge") plot(geo_plot) Exercise 4.6 (Poisson random variable) A variable with PMF \\[\\begin{equation} P(k) = \\frac{\\lambda^k e^{-\\lambda}}{k!} \\end{equation}\\] is a Poisson random variable with support in non-negative integers. It has one positive parameter \\(\\lambda\\), which also represents its mean value and variance (a measure of the deviation of the values from the mean – more on mean and variance in the next chapter). We denote \\[\\begin{equation} X | \\lambda \\sim \\text{Poisson}(\\lambda). \\end{equation}\\] This distribution is usually the default choice for modeling counts. We have already encountered a Poisson random variable in exercise 1.13, where we also sampled from this distribution. The CDF of a Poisson random variable is \\(P(X <= x) = e^{-\\lambda} \\sum_{i=0}^x \\frac{\\lambda^{i}}{i!}\\). R: Draw 1000 samples from the Poisson distribution with \\(\\lambda = 5\\) and compare their empirical cumulative distribution function with the theoretical CDF. set.seed(1) pois_samp <- rpois(n = 1000, lambda = 5) pois_samp <- data.frame(x = pois_samp) pois_plot <- ggplot(data = pois_samp, aes(x = x, colour = "ECDF")) + stat_ecdf(geom = "step") + geom_step(data = tibble(x = 0:17, y = ppois(x, 5)), aes(x = x, y = y, colour = "CDF")) + scale_colour_manual("Lgend title", values = c("black", "red")) plot(pois_plot) Exercise 4.7 (Negative binomial random variable) A variable with PMF \\[\\begin{equation} p(k) = \\binom{k + r - 1}{k}(1-p)^r p^k \\end{equation}\\] is a negative binomial random variable with support in non-negative integers. It has two parameters \\(r > 0\\) and \\(p \\in (0,1)\\). We denote \\[\\begin{equation} X | r,p \\sim \\text{NB}(r,p). \\end{equation}\\] Let us reparameterize the negative binomial distribution with \\(q = 1 - p\\). Find the PMF of \\(X \\sim \\text{NB}(1, q)\\). Do you recognize this distribution? Show that the sum of two negative binomial random variables with the same \\(p\\) is also a negative binomial random variable. Hint: Use the fact that the number of ways to place \\(n\\) indistinct balls into \\(k\\) boxes is \\(\\binom{n + k - 1}{n}\\). R: Draw samples from \\(X \\sim \\text{NB}(5, 0.4)\\) and \\(Y \\sim \\text{NB}(3, 0.4)\\). Draw samples from \\(Z = X + Y\\), where you use the parameters calculated in b). Plot both distributions, their sum, and \\(Z\\) using facet_wrap. Be careful, as R uses a different parameterization size=\\(r\\) and prob=\\(1 - p\\). Solution. \\[\\begin{align} P(X = k) &= \\binom{k + 1 - 1}{k}q^1 (1-q)^k \\\\ &= q(1-q)^k. \\end{align}\\] This is the geometric distribution. Let \\(X \\sim \\text{NB}(r_1, p)\\) and \\(Y \\sim \\text{NB}(r_2, p)\\). Let \\(Z = X + Y\\). 
\\[\\begin{align} P(Z = z) &= \\sum_{k = 0}^{\\infty} P(X = z - k)P(Y = k), \\text{ if k < 0, then the probabilities are 0} \\\\ &= \\sum_{k = 0}^{z} P(X = z - k)P(Y = k), \\text{ if k > z, then the probabilities are 0} \\\\ &= \\sum_{k = 0}^{z} \\binom{z - k + r_1 - 1}{z - k}(1 - p)^{r_1} p^{z - k} \\binom{k + r_2 - 1}{k}(1 - p)^{r_2} p^{k} & \\\\ &= \\sum_{k = 0}^{z} \\binom{z - k + r_1 - 1}{z - k} \\binom{k + r_2 - 1}{k}(1 - p)^{r_1 + r_2} p^{z} & \\\\ &= (1 - p)^{r_1 + r_2} p^{z} \\sum_{k = 0}^{z} \\binom{z - k + r_1 - 1}{z - k} \\binom{k + r_2 - 1}{k}& \\end{align}\\] The part before the sum reminds us of the negative binomial distribution with parameters \\(r_1 + r_2\\) and \\(p\\). To complete this term to the negative binomial PMF we need \\(\\binom{z + r_1 + r_2 -1}{z}\\). So the only thing we need to prove is that the sum equals this term. Both terms in the sum can be interpreted as numbers of ways to place a number of balls into boxes. For the left term it is \\(z-k\\) balls into \\(r_1\\) boxes, and for the right \\(k\\) balls into \\(r_2\\) boxes. For each \\(k\\) we are distributing \\(z\\) balls in total. By summing over all \\(k\\), we actually get all the possible placements of \\(z\\) balls into \\(r_1 + r_2\\) boxes. Therefore \\[\\begin{align} P(Z = z) &= (1 - p)^{r_1 + r_2} p^{z} \\sum_{k = 0}^{z} \\binom{z - k + r_1 - 1}{z - k} \\binom{k + r_2 - 1}{k}& \\\\ &= \\binom{z + r_1 + r_2 -1}{z} (1 - p)^{r_1 + r_2} p^{z}. \\end{align}\\] From this it also follows that the sum of geometric distributions with the same parameter is a negative binomial distribution. \\(Z \\sim \\text{NB}(8, 0.4)\\). set.seed(1) nsamps <- 10000 x <- rnbinom(nsamps, size = 5, prob = 0.6) y <- rnbinom(nsamps, size = 3, prob = 0.6) xpy <- x + y z <- rnbinom(nsamps, size = 8, prob = 0.6) samps <- tibble(x, y, xpy, z) samps <- melt(samps) ggplot(data = samps, aes(x = value)) + geom_bar() + facet_wrap(~ variable) 4.3 Continuous random variables Exercise 4.8 (Exponential random variable) A variable \\(X\\) with PDF \\(\\lambda e^{-\\lambda x}\\) is an exponential random variable with support in non-negative real numbers. It has one positive parameter \\(\\lambda\\). We denote \\[\\begin{equation} X | \\lambda \\sim \\text{Exp}(\\lambda). \\end{equation}\\] Find the CDF of an exponential random variable. Find the quantile function of an exponential random variable. Calculate the probability \\(P(1 \\leq X \\leq 3)\\), where \\(X \\sim \\text{Exp(1.5)}\\). R: Check your answer to c) with a simulation (rexp). Plot the probability in a meaningful way. R: Implement PDF, CDF, and the quantile function and compare their values with corresponding R functions visually. Hint: use the size parameter to make one of the curves wider. Solution. \\[\\begin{align} F(x) &= \\int_{0}^{x} \\lambda e^{-\\lambda t} dt \\\\ &= \\lambda \\int_{0}^{x} e^{-\\lambda t} dt \\\\ &= \\lambda (\\frac{1}{-\\lambda}e^{-\\lambda t} |_{0}^{x}) \\\\ &= \\lambda(\\frac{1}{\\lambda} - \\frac{1}{\\lambda} e^{-\\lambda x}) \\\\ &= 1 - e^{-\\lambda x}. \\end{align}\\] \\[\\begin{align} F(F^{-1}(x)) &= x \\\\ 1 - e^{-\\lambda F^{-1}(x)} &= x \\\\ e^{-\\lambda F^{-1}(x)} &= 1 - x \\\\ -\\lambda F^{-1}(x) &= \\ln(1 - x) \\\\ F^{-1}(x) &= - \\frac{ln(1 - x)}{\\lambda}. \\end{align}\\] \\[\\begin{align} P(1 \\leq X \\leq 3) &= P(X \\leq 3) - P(X \\leq 1) \\\\ &= P(X \\leq 3) - P(X \\leq 1) \\\\ &= 1 - e^{-1.5 \\times 3} - 1 + e^{-1.5 \\times 1} \\\\ &\\approx 0.212. 
\\end{align}\\] set.seed(1) nsamps <- 1000 samps <- rexp(nsamps, rate = 1.5) sum(samps >= 1 & samps <= 3) / nsamps ## [1] 0.212 exp_plot <- ggplot(data.frame(x = seq(0, 5, by = 0.01)), aes(x = x)) + stat_function(fun = dexp, args = list(rate = 1.5)) + stat_function(fun = dexp, args = list(rate = 1.5), xlim = c(1,3), geom = "area", fill = "red") plot(exp_plot) exp_pdf <- function(x, lambda) { return (lambda * exp(-lambda * x)) } exp_cdf <- function(x, lambda) { return (1 - exp(-lambda * x)) } exp_quant <- function(q, lambda) { return (-(log(1 - q) / lambda)) } ggplot(data = data.frame(x = seq(0, 5, by = 0.01)), aes(x = x)) + stat_function(fun = dexp, args = list(rate = 1.5), aes(color = "R"), size = 2.5) + stat_function(fun = exp_pdf, args = list(lambda = 1.5), aes(color = "Mine"), size = 1.2) + scale_color_manual(values = c("red", "black")) ggplot(data = data.frame(x = seq(0, 5, by = 0.01)), aes(x = x)) + stat_function(fun = pexp, args = list(rate = 1.5), aes(color = "R"), size = 2.5) + stat_function(fun = exp_cdf, args = list(lambda = 1.5), aes(color = "Mine"), size = 1.2) + scale_color_manual(values = c("red", "black")) ggplot(data = data.frame(x = seq(0, 1, by = 0.01)), aes(x = x)) + stat_function(fun = qexp, args = list(rate = 1.5), aes(color = "R"), size = 2.5) + stat_function(fun = exp_quant, args = list(lambda = 1.5), aes(color = "Mine"), size = 1.2) + scale_color_manual(values = c("red", "black")) Exercise 4.9 (Uniform random variable) Continuous uniform random variable with parameters \\(a\\) and \\(b\\) has the PDF \\[\\begin{equation} p(x) = \\begin{cases} \\frac{1}{b - a} & x \\in [a,b] \\\\ 0 & \\text{otherwise}. \\end{cases} \\end{equation}\\] Find the CDF of the uniform random variable. Find the quantile function of the uniform random variable. Let \\(X \\sim \\text{Uniform}(a,b)\\). Find the CDF of the variable \\(Y = \\frac{X - a}{b - a}\\). This is the standard uniform random variable. Let \\(X \\sim \\text{Uniform}(-1, 3)\\). Find such \\(z\\) that \\(P(X < z + \\mu_x) = \\frac{1}{5}\\). R: Check your result from d) using simulation. Solution. \\[\\begin{align} F(x) &= \\int_{a}^x \\frac{1}{b - a} dt \\\\ &= \\frac{1}{b - a} \\int_{a}^x dt \\\\ &= \\frac{x - a}{b - a}. \\end{align}\\] \\[\\begin{align} F(F^{-1}(p)) &= p \\\\ \\frac{F^{-1}(p) - a}{b - a} &= p \\\\ F^{-1}(p) &= p(b - a) + a. \\end{align}\\] \\[\\begin{align} F_Y(y) &= P(Y < y) \\\\ &= P(\\frac{X - a}{b - a} < y) \\\\ &= P(X < y(b - a) + a) \\\\ &= F_X(y(b - a) + a) \\\\ &= \\frac{(y(b - a) + a) - a}{b - a} \\\\ &= y. \\end{align}\\] \\[\\begin{align} P(X < z + 1) &= \\frac{1}{5} \\\\ F(z + 1) &= \\frac{1}{5} \\\\ z + 1 &= F^{-1}(\\frac{1}{5}) \\\\ z &= \\frac{1}{5}4 - 1 - 1 \\\\ z &= -1.2. \\end{align}\\] set.seed(1) a <- -1 b <- 3 nsamps <- 10000 unif_samp <- runif(nsamps, a, b) mu_x <- mean(unif_samp) new_samp <- unif_samp - mu_x quantile(new_samp, probs = 1/5) ## 20% ## -1.203192 punif(-0.2, -1, 3) ## [1] 0.2 Exercise 4.10 (Beta random variable) A variable \\(X\\) with PDF \\[\\begin{equation} p(x) = \\frac{x^{\\alpha - 1} (1 - x)^{\\beta - 1}}{\\text{B}(\\alpha, \\beta)}, \\end{equation}\\] where \\(\\text{B}(\\alpha, \\beta) = \\frac{\\Gamma(\\alpha) \\Gamma(\\beta)}{\\Gamma(\\alpha + \\beta)}\\) and \\(\\Gamma(x) = \\int_0^{\\infty} x^{z - 1} e^{-x} dx\\) is a Beta random variable with support on \\([0,1]\\). It has two positive parameters \\(\\alpha\\) and \\(\\beta\\). 
Notation: \\[\\begin{equation} X | \\alpha, \\beta \\sim \\text{Beta}(\\alpha, \\beta) \\end{equation}\\] It is often used in modeling rates. Calculate the PDF for \\(\\alpha = 1\\) and \\(\\beta = 1\\). What do you notice? R: Plot densities of the beta distribution for parameter pairs (2, 2), (4, 1), (1, 4), (2, 5), and (0.1, 0.1). R: Sample from \\(X \\sim \\text{Beta}(2, 5)\\) and compare the histogram with Beta PDF. Solution. \\[\\begin{equation} p(x) = \\frac{x^{1 - 1} (1 - x)^{1 - 1}}{\\text{B}(1, 1)} = 1. \\end{equation}\\] This is the standard uniform distribution. set.seed(1) ggplot(data = data.frame(x = seq(0, 1, by = 0.01)), aes(x = x)) + stat_function(fun = dbeta, args = list(shape1 = 2, shape2 = 2), aes(color = "alpha = 0.5")) + stat_function(fun = dbeta, args = list(shape1 = 4, shape2 = 1), aes(color = "alpha = 4")) + stat_function(fun = dbeta, args = list(shape1 = 1, shape2 = 4), aes(color = "alpha = 1")) + stat_function(fun = dbeta, args = list(shape1 = 2, shape2 = 5), aes(color = "alpha = 25")) + stat_function(fun = dbeta, args = list(shape1 = 0.1, shape2 = 0.1), aes(color = "alpha = 0.1")) set.seed(1) nsamps <- 1000 samps <- rbeta(nsamps, 2, 5) ggplot(data = data.frame(x = samps), aes(x = x)) + geom_histogram(aes(y = ..density..), color = "black") + stat_function(data = data.frame(x = seq(0, 1, by = 0.01)), aes(x = x), fun = dbeta, args = list(shape1 = 2, shape2 = 5), color = "red", size = 1.2) Exercise 4.11 (Gamma random variable) A random variable with PDF \\[\\begin{equation} p(x) = \\frac{\\beta^\\alpha}{\\Gamma(\\alpha)} x^{\\alpha - 1}e^{-\\beta x} \\end{equation}\\] is a Gamma random variable with support on the positive numbers and parameters shape \\(\\alpha > 0\\) and rate \\(\\beta > 0\\). We write \\[\\begin{equation} X | \\alpha, \\beta \\sim \\text{Gamma}(\\alpha, \\beta) \\end{equation}\\] and it’s CDF is \\[\\begin{equation} \\frac{\\gamma(\\alpha, \\beta x)}{\\Gamma(\\alpha)}, \\end{equation}\\] where \\(\\gamma(s, x) = \\int_0^x t^{s-1} e^{-t} dt\\). It is usually used in modeling positive phenomena (for example insurance claims and rainfalls). Let \\(X \\sim \\text{Gamma}(1, \\beta)\\). Find the PDF of \\(X\\). Do you recognize this PDF? Let \\(k = \\alpha\\) and \\(\\theta = \\frac{1}{\\beta}\\). Find the PDF of \\(X | k, \\theta \\sim \\text{Gamma}(k, \\theta)\\). Random variables can be reparameterized, and sometimes a reparameterized distribution is more suitable for certain calculations. The first parameterization is for example usually used in Bayesian statistics, while this parameterization is more common in econometrics and some other applied fields. Note that you also need to pay attention to the parameters in statistical software, so diligently read the help files when using functions like rgamma to see how the function is parameterized. R: Plot gamma CDF for random variables with shape and rate parameters (1,1), (10,1), (1,10). Solution. \\[\\begin{align} p(x) &= \\frac{\\beta^1}{\\Gamma(1)} x^{1 - 1}e^{-\\beta x} \\\\ &= \\beta e^{-\\beta x} \\end{align}\\] This is the PDF of the exponential distribution with parameter \\(\\beta\\). \\[\\begin{align} p(x) &= \\frac{1}{\\Gamma(k)\\beta^k} x^{k - 1}e^{-\\frac{x}{\\theta}}. 
\\end{align}\\] set.seed(1) ggplot(data = data.frame(x = seq(0, 25, by = 0.01)), aes(x = x)) + stat_function(fun = pgamma, args = list(shape = 1, rate = 1), aes(color = "Gamma(1,1)")) + stat_function(fun = pgamma, args = list(shape = 10, rate = 1), aes(color = "Gamma(10,1)")) + stat_function(fun = pgamma, args = list(shape = 1, rate = 10), aes(color = "Gamma(1,10)")) Exercise 4.12 (Normal random variable) A random variable with PDF \\[\\begin{equation} p(x) = \\frac{1}{\\sqrt{2 \\pi \\sigma^2}} e^{-\\frac{(x - \\mu)^2}{2 \\sigma^2}} \\end{equation}\\] is a normal random variable with support on the real axis and parameters \\(\\mu\\) in reals and \\(\\sigma^2 > 0\\). The first is the mean parameter and the second is the variance parameter. Many statistical methods assume a normal distribution. We denote \\[\\begin{equation} X | \\mu, \\sigma \\sim \\text{N}(\\mu, \\sigma^2), \\end{equation}\\] and it’s CDF is \\[\\begin{equation} F(x) = \\int_{-\\infty}^x \\frac{1}{\\sqrt{2 \\pi \\sigma^2}} e^{-\\frac{(t - \\mu)^2}{2 \\sigma^2}} dt, \\end{equation}\\] which is intractable and is usually approximated. Due to its flexibility it is also one of the most researched distributions. For that reason statisticians often use transformations of variables or approximate distributions with the normal distribution. Show that a variable \\(\\frac{X - \\mu}{\\sigma} \\sim \\text{N}(0,1)\\). This transformation is called standardization, and \\(\\text{N}(0,1)\\) is a standard normal distribution. R: Plot the normal distribution with \\(\\mu = 0\\) and different values for the \\(\\sigma\\) parameter. R: The normal distribution provides a good approximation for the Poisson distribution with a large \\(\\lambda\\). Let \\(X \\sim \\text{Poisson}(50)\\). Approximate \\(X\\) with the normal distribution and compare its density with the Poisson histogram. What are the values of \\(\\mu\\) and \\(\\sigma^2\\) that should provide the best approximation? Note that R function rnorm takes standard deviation (\\(\\sigma\\)) as a parameter and not variance. Solution. \\[\\begin{align} P(\\frac{X - \\mu}{\\sigma} < x) &= P(X < \\sigma x + \\mu) \\\\ &= F(\\sigma x + \\mu) \\\\ &= \\int_{-\\infty}^{\\sigma x + \\mu} \\frac{1}{\\sqrt{2 \\pi \\sigma^2}} e^{-\\frac{(t - \\mu)^2}{2\\sigma^2}} dt \\end{align}\\] Now let \\(s = f(t) = \\frac{t - \\mu}{\\sigma}\\), then \\(ds = \\frac{dt}{\\sigma}\\) and \\(f(\\sigma x + \\mu) = x\\), so \\[\\begin{align} P(\\frac{X - \\mu}{\\sigma} < x) &= \\int_{-\\infty}^{x} \\frac{1}{\\sqrt{2 \\pi}} e^{-\\frac{s^2}{2}} ds. \\end{align}\\] There is no need to evaluate this integral, as we recognize it as the CDF of a normal distribution with \\(\\mu = 0\\) and \\(\\sigma^2 = 1\\). 
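As a quick numerical sanity check of a) (a sketch that is not part of the original solution; the values of \(\mu\) and \(\sigma\) below are arbitrary), we can standardize normal samples and compare them with the standard normal distribution. The book's code for b) and c) follows.
set.seed(1)
mu <- 3
sigma <- 2
x <- rnorm(100000, mean = mu, sd = sigma)
z <- (x - mu) / sigma                # standardized samples
round(c(mean(z), sd(z)), 2)          # should be close to 0 and 1
ks.test(z, "pnorm")                  # compares the ECDF of z with the N(0,1) CDF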
set.seed(1) # b ggplot(data = data.frame(x = seq(-15, 15, by = 0.01)), aes(x = x)) + stat_function(fun = dnorm, args = list(mean = 0, sd = 1), aes(color = "sd = 1")) + stat_function(fun = dnorm, args = list(mean = 0, sd = 0.4), aes(color = "sd = 0.1")) + stat_function(fun = dnorm, args = list(mean = 0, sd = 2), aes(color = "sd = 2")) + stat_function(fun = dnorm, args = list(mean = 0, sd = 5), aes(color = "sd = 5")) # c mean_par <- 50 nsamps <- 100000 pois_samps <- rpois(nsamps, lambda = mean_par) norm_samps <- rnorm(nsamps, mean = mean_par, sd = sqrt(mean_par)) my_plot <- ggplot() + geom_bar(data = tibble(x = pois_samps), aes(x = x, y = (..count..)/sum(..count..))) + geom_density(data = tibble(x = norm_samps), aes(x = x), color = "red") plot(my_plot) Exercise 4.13 (Logistic random variable) A logistic random variable has CDF \\[\\begin{equation} F(x) = \\frac{1}{1 + e^{-\\frac{x - \\mu}{s}}}, \\end{equation}\\] where \\(\\mu\\) is real and \\(s > 0\\). The support is on the real axis. We denote \\[\\begin{equation} X | \\mu, s \\sim \\text{Logistic}(\\mu, s). \\end{equation}\\] The distribution of the logistic random variable resembles a normal random variable, however it has heavier tails. Find the PDF of a logistic random variable. R: Implement logistic PDF and CDF and visually compare both for \\(X \\sim \\text{N}(0, 1)\\) and \\(Y \\sim \\text{logit}(0, \\sqrt{\\frac{3}{\\pi^2}})\\). These distributions have the same mean and variance. Additionally, plot the same plot on the interval [5,10], to better see the difference in the tails. R: For the distributions in b) find the probability \\(P(|X| > 4)\\) and interpret the result. Solution. \\[\\begin{align} p(x) &= \\frac{d}{dx} \\frac{1}{1 + e^{-\\frac{x - \\mu}{s}}} \\\\ &= \\frac{- \\frac{d}{dx} (1 + e^{-\\frac{x - \\mu}{s}})}{(1 + e{-\\frac{x - \\mu}{s}})^2} \\\\ &= \\frac{e^{-\\frac{x - \\mu}{s}}}{s(1 + e{-\\frac{x - \\mu}{s}})^2}. \\end{align}\\] # b set.seed(1) logit_pdf <- function (x, mu, s) { return ((exp(-(x - mu)/(s))) / (s * (1 + exp(-(x - mu)/(s)))^2)) } nl_plot <- ggplot(data = data.frame(x = seq(-12, 12, by = 0.01)), aes(x = x)) + stat_function(fun = dnorm, args = list(mean = 0, sd = 2), aes(color = "normal")) + stat_function(fun = logit_pdf, args = list(mu = 0, s = sqrt(12/pi^2)), aes(color = "logit")) plot(nl_plot) nl_plot <- ggplot(data = data.frame(x = seq(5, 10, by = 0.01)), aes(x = x)) + stat_function(fun = dnorm, args = list(mean = 0, sd = 2), aes(color = "normal")) + stat_function(fun = logit_pdf, args = list(mu = 0, s = sqrt(12/pi^2)), aes(color = "logit")) plot(nl_plot) # c logit_cdf <- function (x, mu, s) { return (1 / (1 + exp(-(x - mu) / s))) } p_logistic <- 1 - logit_cdf(4, 0, sqrt(12/pi^2)) + logit_cdf(-4, 0, sqrt(12/pi^2)) p_norm <- 1 - pnorm(4, 0, 2) + pnorm(-4, 0, 2) p_logistic ## [1] 0.05178347 p_norm ## [1] 0.04550026 # Logistic distribution has wider tails, therefore the probability of larger # absolute values is higher. 4.4 Singular random variables Exercise 4.14 (Cantor distribution) The Cantor set is a subset of \\([0,1]\\), which we create by iteratively deleting the middle third of the interval. For example, in the first iteration, we get the sets \\([0,\\frac{1}{3}]\\) and \\([\\frac{2}{3},1]\\). In the second iteration, we get \\([0,\\frac{1}{9}]\\), \\([\\frac{2}{9},\\frac{1}{3}]\\), \\([\\frac{2}{3}, \\frac{7}{9}]\\), and \\([\\frac{8}{9}, 1]\\). 
On the \\(n\\)-th iteration, we have \\[\\begin{equation} C_n = \\frac{C_{n-1}}{3} \\cup \\bigg(\\frac{2}{3} + \\frac{C_{n-1}}{3} \\bigg), \\end{equation}\\] where \\(C_0 = [0,1]\\). The Cantor set is then defined as the intersection of these sets \\[\\begin{equation} C = \\cap_{n=1}^{\\infty} C_n. \\end{equation}\\] It has the same cardinality as \\([0,1]\\). Another way to define the Cantor set is the set of all numbers on \\([0,1]\\), that do not have a 1 in the ternary representation \\(x = \\sum_{n=1}^\\infty \\frac{x_i}{3^i}, x_i \\in \\{0,1,2\\}\\). A random variable follows the Cantor distribution, if its CDF is the Cantor function (below). You can find the implementations of random number generator, CDF, and quantile functions for the Cantor distributions at https://github.com/Henrygb/CantorDist.R. Show that the Lebesgue measure of the Cantor set is 0. (Jagannathan) Let us look at an infinite sequence of independent fair-coin tosses. If the outcome is heads, let \\(x_i = 2\\) and \\(x_i = 0\\), when tails. Then use these to create \\(x = \\sum_{n=1}^\\infty \\frac{x_i}{3^i}\\). This is a random variable with the Cantor distribution. Show that \\(X\\) has a singular distribution. Solution. \\[\\begin{align} \\lambda(C) &= 1 - \\lambda(C^c) \\\\ &= 1 - \\frac{1}{3}\\sum_{k = 0}^\\infty (\\frac{2}{3})^k \\\\ &= 1 - \\frac{\\frac{1}{3}}{1 - \\frac{2}{3}} \\\\ &= 0. \\end{align}\\] First, for every \\(x\\), the probability of observing it is \\(\\lim_{n \\rightarrow \\infty} \\frac{1}{2^n} = 0\\). Second, the probability that we observe one of all the possible sequences is 1. Therefore \\(P(C) = 1\\). So this is a singular variable. The CDF only increments on the elements of the Cantor set. 4.5 Transformations Exercise 4.15 Let \\(X\\) be a random variable that is uniformly distributed on \\(\\{-2, -1, 0, 1, 2\\}\\). Find the PMF of \\(Y = X^2\\). Solution. \\[\\begin{align} P_Y(y) = \\sum_{x \\in \\sqrt(y)} P_X(x) = \\begin{cases} 0 & y \\notin \\{0,1,4\\} \\\\ \\frac{1}{5} & y = 0 \\\\ \\frac{2}{5} & y \\in \\{1,4\\} \\end{cases} \\end{align}\\] Exercise 4.16 (Lognormal random variable) A lognormal random variable is a variable whose logarithm is normally distributed. In practice, we often encounter skewed data. Usually using a log transformation on such data makes it more symmetric and therefore more suitable for modeling with the normal distribution (more on why we wish to model data with the normal distribution in the following chapters). Let \\(X \\sim \\text{N}(\\mu,\\sigma)\\). Find the PDF of \\(Y: \\log(Y) = X\\). R: Sample from the lognormal distribution with parameters \\(\\mu = 5\\) and \\(\\sigma = 2\\). Plot a histogram of the samples. Then log-transform the samples and plot a histogram along with the theoretical normal PDF. Solution. \\[\\begin{align} p_Y(y) &= p_X(\\log(y)) \\frac{d}{dy} \\log(y) \\\\ &= \\frac{1}{\\sqrt{2 \\pi \\sigma^2}} e^{\\frac{(\\log(y) - \\mu)^2}{2 \\sigma^2}} \\frac{1}{y} \\\\ &= \\frac{1}{y \\sqrt{2 \\pi \\sigma^2}} e^{\\frac{(\\log(y) - \\mu)^2}{2 \\sigma^2}}. 
\\end{align}\\] set.seed(1) nsamps <- 10000 mu <- 0.5 sigma <- 0.4 ln_samps <- rlnorm(nsamps, mu, sigma) ln_plot <- ggplot(data = data.frame(x = ln_samps), aes(x = x)) + geom_histogram(color = "black") plot(ln_plot) norm_samps <- log(ln_samps) n_plot <- ggplot(data = data.frame(x = norm_samps), aes(x = x)) + geom_histogram(aes(y = ..density..), color = "black") + stat_function(fun = dnorm, args = list(mean = mu, sd = sigma), color = "red") plot(n_plot) Exercise 4.17 (Probability integral transform) This exercise is borrowed from Wasserman. Let \\(X\\) have a continuous, strictly increasing CDF \\(F\\). Let \\(Y = F(X)\\). Find the density of \\(Y\\). This is called the probability integral transform. Let \\(U \\sim \\text{Uniform}(0,1)\\) and let \\(X = F^{-1}(U)\\). Show that \\(X \\sim F\\). R: Implement a program that takes Uniform(0,1) random variables and generates random variables from an exponential(\\(\\beta\\)) distribution. Compare your implemented function with function rexp in R. Solution. \\[\\begin{align} F_Y(y) &= P(Y < y) \\\\ &= P(F(X) < y) \\\\ &= P(X < F_X^{-1}(y)) \\\\ &= F_X(F_X^{-1}(y)) \\\\ &= y. \\end{align}\\] From the above it follows that \\(p(y) = 1\\). Note that we need to know the inverse CDF to be able to apply this procedure. \\[\\begin{align} P(X < x) &= P(F^{-1}(U) < x) \\\\ &= P(U < F(x)) \\\\ &= F_U(F(x)) \\\\ &= F(x). \\end{align}\\] set.seed(1) nsamps <- 10000 beta <- 4 generate_exp <- function (n, beta) { tmp <- runif(n) X <- qexp(tmp, beta) return (X) } df <- tibble("R" = rexp(nsamps, beta), "myGenerator" = generate_exp(nsamps, beta)) %>% gather() ggplot(data = df, aes(x = value, fill = key)) + geom_histogram(position = "dodge") "],["mrvs.html", "Chapter 5 Multiple random variables 5.1 General 5.2 Bivariate distribution examples 5.3 Transformations", " Chapter 5 Multiple random variables This chapter deals with multiple random variables and their distributions. The students are expected to acquire the following knowledge: Theoretical Calculation of PDF of transformed multiple random variables. Finding marginal and conditional distributions. R Scatterplots of bivariate random variables. New R functions (for example, expand.grid). .fold-btn { float: right; margin: 5px 5px 0 0; } .fold { border: 1px solid black; min-height: 40px; } 5.1 General Exercise 5.1 Let \\(X \\sim \\text{N}(0,1)\\) and \\(Y \\sim \\text{N}(0,1)\\) be independent random variables. Draw 1000 samples from \\((X,Y)\\) and plot a scatterplot. Now let \\(X \\sim \\text{N}(0,1)\\) and \\(Y | X = x \\sim N(ax, 1)\\). Draw 1000 samples from \\((X,Y)\\) for \\(a = 1\\), \\(a=0\\), and \\(a=-0.5\\). Plot the scatterplots. How would you interpret parameter \\(a\\)? Plot the marginal distribution of \\(Y\\) for cases \\(a=1\\), \\(a=0\\), and \\(a=-0.5\\). Can you guess which distribution it is? set.seed(1) nsamps <- 1000 x <- rnorm(nsamps) y <- rnorm(nsamps) ggplot(data.frame(x, y), aes(x = x, y = y)) + geom_point() y1 <- rnorm(nsamps, mean = 1 * x) y2 <- rnorm(nsamps, mean = 0 * x) y3 <- rnorm(nsamps, mean = -0.5 * x) df <- tibble(x = c(x,x,x), y = c(y1,y2,y3), a = c(rep(1, nsamps), rep(0, nsamps), rep(-0.5, nsamps))) ggplot(df, aes(x = x, y = y)) + geom_point() + facet_wrap(~a) + coord_equal(ratio=1) # Parameter a controls the scale of linear dependency between X and Y. 
ggplot(df, aes(x = y)) + geom_density() + facet_wrap(~a) 5.2 Bivariate distribution examples Exercise 5.2 (Discrete bivariate random variable) Let \\(X\\) represent the event that a die rolls an even number and let \\(Y\\) represent the event that a die rolls one, two, or a three. Find the marginal distributions of \\(X\\) and \\(Y\\). Find the PMF of \\((X,Y)\\). Find the CDF of \\((X,Y)\\). Find \\(P(X = 1 | Y = 1)\\). Solution. \\[\\begin{align} P(X = 1) = \\frac{1}{2} \\text{ and } P(X = 0) = \\frac{1}{2} \\\\ P(Y = 1) = \\frac{1}{2} \\text{ and } P(Y = 0) = \\frac{1}{2} \\\\ \\end{align}\\] \\[\\begin{align} P(X = 1, Y = 1) = \\frac{1}{6} \\\\ P(X = 1, Y = 0) = \\frac{2}{6} \\\\ P(X = 0, Y = 1) = \\frac{2}{6} \\\\ P(X = 0, Y = 0) = \\frac{1}{6} \\end{align}\\] \\[\\begin{align} P(X \\leq x, Y \\leq y) = \\begin{cases} \\frac{1}{6} & x = 0, y = 0 \\\\ \\frac{3}{6} & x \\neq y \\\\ 1 & x = 1, y = 1 \\end{cases} \\end{align}\\] \\[\\begin{align} P(X = 1 | Y = 1) = \\frac{1}{3} \\end{align}\\] Exercise 5.3 (Continuous bivariate random variable) Let \\(p(x,y) = 6 (x - y)^2\\) be the PDF of a bivariate random variable \\((X,Y)\\), where both variables range from zero to one. Find CDF. Find marginal distributions. Find conditional distributions. R: Plot a grid of points and colour them by value – this can help us visualize the PDF. R: Implement a random number generator, which will generate numbers from \\((X,Y)\\) and visually check the results. R: Plot the marginal distribution of \\(Y\\) and the conditional distributions of \\(X | Y = y\\), where \\(y \\in \\{0, 0.1, 0.5\\}\\). Solution. \\[\\begin{align} F(x,y) &= \\int_0^{x} \\int_0^{y} 6 (t - s)^2 ds dt\\\\ &= 6 \\int_0^{x} \\int_0^{y} t^2 - 2ts + s^2 ds dt\\\\ &= 6 \\int_0^{x} t^2y - ty^2 + \\frac{y^3}{3} dt \\\\ &= 6 (\\frac{x^3 y}{3} - \\frac{x^2y^2}{2} + \\frac{x y^3}{3}) \\\\ &= 2 x^3 y - 3 t^2y^2 + 2 x y^3 \\end{align}\\] \\[\\begin{align} p(x) &= \\int_0^{1} 6 (x - y)^2 dy\\\\ &= 6 (x^2 - x + \\frac{1}{3}) \\\\ &= 6x^2 - 6x + 2 \\end{align}\\] \\[\\begin{align} p(y) &= \\int_0^{1} 6 (x - y)^2 dx\\\\ &= 6 (y^2 - y + \\frac{1}{3}) \\\\ &= 6y^2 - 6y + 2 \\end{align}\\] \\[\\begin{align} p(x|y) &= \\frac{p(xy)}{p(y)} \\\\ &= \\frac{6 (x - y)^2}{6 (y^2 - y + \\frac{1}{3})} \\\\ &= \\frac{(x - y)^2}{y^2 - y + \\frac{1}{3}} \\end{align}\\] \\[\\begin{align} p(y|x) &= \\frac{p(xy)}{p(x)} \\\\ &= \\frac{6 (x - y)^2}{6 (x^2 - x + \\frac{1}{3})} \\\\ &= \\frac{(x - y)^2}{x^2 - x + \\frac{1}{3}} \\end{align}\\] set.seed(1) # d pxy <- function (x, y) { return ((x - y)^2) } x_axis <- seq(0, 1, length.out = 100) y_axis <- seq(0, 1, length.out = 100) df <- expand.grid(x_axis, y_axis) colnames(df) <- c("x", "y") df <- cbind(df, pdf = pxy(df$x, df$y)) ggplot(data = df, aes(x = x, y = y, color = pdf)) + geom_point() # e samps <- NULL for (i in 1:10000) { xt <- runif(1, 0, 1) yt <- runif(1, 0, 1) pdft <- pxy(xt, yt) acc <- runif(1, 0, 6) if (acc <= pdft) { samps <- rbind(samps, c(xt, yt)) } } colnames(samps) <- c("x", "y") ggplot(data = as.data.frame(samps), aes(x = x, y = y)) + geom_point() # f mar_pdf <- function (x) { return (6 * x^2 - 6 * x + 2) } cond_pdf <- function (x, y) { return (((x - y)^2) / (y^2 - y + 1/3)) } df <- tibble(x = x_axis, mar = mar_pdf(x), y0 = cond_pdf(x, 0), y0.1 = cond_pdf(x, 0.1), y0.5 = cond_pdf(x, 0.5)) %>% gather(dist, value, -x) ggplot(df, aes(x = x, y = value, color = dist)) + geom_line() Exercise 5.4 (Mixed bivariate random variable) Let \\(f(x,y) = \\frac{\\beta^{\\alpha}}{\\Gamma(\\alpha)y!} x^{y+ \\alpha -1} 
e^{-x(1 + \\beta)}\\) be the PDF of a bivariate random variable, where \\(x \\in (0, \\infty)\\) and \\(y \\in \\mathbb{N}_0\\). Find the marginal distribution of \\(X\\). Do you recognize this distribution? Find the conditional distribution of \\(Y | X\\). Do you recognize this distribution? Calculate the probability \\(P(Y = 2 | X = 2.5)\\) for \\((X,Y)\\). Find the marginal distribution of \\(Y\\). Do you recognize this distribution? R: Take 1000 random samples from \\((X,Y)\\) with parameters \\(\\beta = 1\\) and \\(\\alpha = 1\\). Plot a scatterplot. Plot a bar plot of the marginal distribution of \\(Y\\), and the theoretical PMF calculated from d) on the range from 0 to 10. Hint: Use the gamma function in R.? Solution. \\[\\begin{align} p(x) &= \\sum_{k = 0}^{\\infty} \\frac{\\beta^{\\alpha}}{\\Gamma(\\alpha)k!} x^{k + \\alpha -1} e^{-x(1 + \\beta)} & \\\\ &= \\sum_{k = 0}^{\\infty} \\frac{\\beta^{\\alpha}}{\\Gamma(\\alpha)k!} x^{k} x^{\\alpha -1} e^{-x} e^{-\\beta x} & \\\\ &= \\frac{\\beta^{\\alpha}}{\\Gamma(\\alpha)} x^{\\alpha -1} e^{-\\beta x} \\sum_{k = 0}^{\\infty} \\frac{1}{k!} x^{k} e^{-x} & \\\\ &= \\frac{\\beta^{\\alpha}}{\\Gamma(\\alpha)} x^{\\alpha -1} e^{-\\beta x} & \\text{the last term above sums to one} \\end{align}\\] This is the Gamma PDF. \\[\\begin{align} p(y|x) &= \\frac{p(x,y)}{p(x)} \\\\ &= \\frac{\\frac{\\beta^{\\alpha}}{\\Gamma(\\alpha)y!} x^{y+ \\alpha -1} e^{-x(1 + \\beta)}}{\\frac{\\beta^{\\alpha}}{\\Gamma(\\alpha)} x^{\\alpha -1} e^{-\\beta x}} \\\\ &= \\frac{x^y e^{-x}}{y!}. \\end{align}\\] This is the Poisson PMF. \\[\\begin{align} P(Y = 2 | X = 2.5) = \\frac{2.5^2 e^{-2.5}}{2!} \\approx 0.26. \\end{align}\\] \\[\\begin{align} p(y) &= \\int_{0}^{\\infty} \\frac{\\beta^{\\alpha}}{\\Gamma(\\alpha)y!} x^{y + \\alpha -1} e^{-x(1 + \\beta)} dx & \\\\ &= \\frac{1}{y!} \\int_{0}^{\\infty} \\frac{\\beta^{\\alpha}}{\\Gamma(\\alpha)} x^{(y + \\alpha) -1} e^{-(1 + \\beta)x} dx & \\\\ &= \\frac{1}{y!} \\frac{\\beta^{\\alpha}}{\\Gamma(\\alpha)} \\int_{0}^{\\infty} \\frac{\\Gamma(y + \\alpha)}{(1 + \\beta)^{y + \\alpha}} \\frac{(1 + \\beta)^{y + \\alpha}}{\\Gamma(y + \\alpha)} x^{(y + \\alpha) -1} e^{-(1 + \\beta)x} dx & \\text{complete to Gamma PDF} \\\\ &= \\frac{1}{y!} \\frac{\\beta^{\\alpha}}{\\Gamma(\\alpha)} \\frac{\\Gamma(y + \\alpha)}{(1 + \\beta)^{y + \\alpha}}. \\end{align}\\] We add the terms in the third equality to get a Gamma PDF inside the integral, which then integrates to one. We do not recognize this distribution. set.seed(1) px <- function (x, alpha, beta) { return((1 / factorial(x)) * (beta^alpha / gamma(alpha)) * (gamma(x + alpha) / (1 + beta)^(x + alpha))) } nsamps <- 1000 rx <- rgamma(nsamps, 1, 1) ryx <- rpois(nsamps, rx) ggplot(data = data.frame(x = rx, y = ryx), aes(x = x, y = y)) + geom_point() ggplot(data = data.frame(x = rx, y = ryx), aes(x = y)) + geom_bar(aes(y = (..count..)/sum(..count..))) + stat_function(fun = px, args = list(alpha = 1, beta = 1), color = "red") Exercise 5.5 Let \\(f(x,y) = cx^2y\\) for \\(x^2 \\leq y \\leq 1\\) and zero otherwise. Find such \\(c\\) that \\(f\\) is a PDF of a bivariate random variable. This exercise is borrowed from Wasserman. Solution. \\[\\begin{align} 1 &= \\int_{-1}^{1} \\int_{x^2}^1 cx^2y dy dx \\\\ &= \\int_{-1}^{1} cx^2 (\\frac{1}{2} - \\frac{x^4}{2}) dx \\\\ &= \\frac{c}{2} \\int_{-1}^{1} x^2 - x^6 dx \\\\ &= \\frac{c}{2} (\\frac{1}{3} + \\frac{1}{3} - \\frac{1}{7} - \\frac{1}{7}) \\\\ &= \\frac{c}{2} \\frac{8}{21} \\\\ &= \\frac{4c}{21} \\end{align}\\] It follows \\(c = \\frac{21}{4}\\). 
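The constant can also be verified numerically with a simple Monte Carlo estimate of the integral over the support (a rough sketch, not part of the original solution; the sample size is illustrative). We sample uniformly on the bounding box \([-1,1] \times [0,1]\), evaluate the candidate density with \(c = \frac{21}{4}\) inside the region \(x^2 \leq y \leq 1\) and zero outside, and multiply the average by the box area.
set.seed(1)
nsamps <- 1000000
x <- runif(nsamps, -1, 1)
y <- runif(nsamps, 0, 1)
f <- ifelse(x^2 <= y & y <= 1, (21 / 4) * x^2 * y, 0)  # density on the support, 0 elsewhere
2 * mean(f)                                            # box area is 2; estimate should be close to 1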
5.3 Transformations Exercise 5.6 Let \\((X,Y)\\) be uniformly distributed on the unit ball \\(\\{(x,y,z) : x^2 + y^2 + z^2 \\leq 1\\}\\). Let \\(R = \\sqrt{X^2 + Y^2 + Z^2}\\). Find the CDF and PDF of \\(R\\). Solution. \\[\\begin{align} P(R < r) &= P(\\sqrt{X^2 + Y^2 + Z^2} < r) \\\\ &= P(X^2 + Y^2 + Z^2 < r^2) \\\\ &= \\frac{\\frac{4}{3} \\pi r^3}{\\frac{4}{3}\\pi} \\\\ &= r^3. \\end{align}\\] The second line shows us that we are looking at the probability which is represented by a smaller ball with radius \\(r\\). To get the probability, we divide it by the radius of the whole ball. We get the PDF by differentiating the CDF, so \\(p(r) = 3r^2\\). "],["integ.html", "Chapter 6 Integration 6.1 Monte Carlo integration 6.2 Lebesgue integrals", " Chapter 6 Integration This chapter deals with abstract and Monte Carlo integration. The students are expected to acquire the following knowledge: Theoretical How to calculate Lebesgue integrals for non-simple functions. R Monte Carlo integration. .fold-btn { float: right; margin: 5px 5px 0 0; } .fold { border: 1px solid black; min-height: 40px; } 6.1 Monte Carlo integration Exercise 6.1 Let \\(X\\) and \\(Y\\) be continuous random variables on the unit interval and \\(p(x,y) = 6(x - y)^2\\). Use Monte Carlo integration to estimate the probability \\(P(0.2 \\leq X \\leq 0.5, \\: 0.1 \\leq Y \\leq 0.2)\\). Can you find the exact value? set.seed(1) nsamps <- 1000 V <- (0.5 - 0.2) * (0.2 - 0.1) x1 <- runif(nsamps, 0.2, 0.5) x2 <- runif(nsamps, 0.1, 0.2) f_1 <- function (x, y) { return (6 * (x - y)^2) } mcint <- V * (1 / nsamps) * sum(f_1(x1, x2)) sdm <- sqrt((V^2 / nsamps) * var(f_1(x1, x2))) mcint ## [1] 0.008793445 sdm ## [1] 0.0002197686 F_1 <- function (x, y) { return (2 * x^3 * y - 3 * x^2 * y^2 + 2 * x * y^3) } F_1(0.5, 0.2) - F_1(0.2, 0.2) - F_1(0.5, 0.1) + F_1(0.2, 0.1) ## [1] 0.0087 6.2 Lebesgue integrals Exercise 6.2 (borrowed from Jagannathan) Find the Lebesgue integral of the following functions on (\\(\\mathbb{R}\\), \\(\\mathcal{B}(\\mathbb{R})\\), \\(\\lambda\\)). \\[\\begin{align} f(\\omega) = \\begin{cases} \\omega, & \\text{for } \\omega = 0,1,...,n \\\\ 0, & \\text{elsewhere} \\end{cases} \\end{align}\\] \\[\\begin{align} f(\\omega) = \\begin{cases} 1, & \\text{for } \\omega = \\mathbb{Q}^c \\cap [0,1] \\\\ 0, & \\text{elsewhere} \\end{cases} \\end{align}\\] \\[\\begin{align} f(\\omega) = \\begin{cases} n, & \\text{for } \\omega = \\mathbb{Q}^c \\cap [0,n] \\\\ 0, & \\text{elsewhere} \\end{cases} \\end{align}\\] Solution. \\[\\begin{align} \\int f(\\omega) d\\lambda = \\sum_{\\omega = 0}^n \\omega \\lambda(\\omega) = 0. \\end{align}\\] \\[\\begin{align} \\int f(\\omega) d\\lambda = 1 \\times \\lambda(\\mathbb{Q}^c \\cap [0,1]) = 1. \\end{align}\\] \\[\\begin{align} \\int f(\\omega) d\\lambda = n \\times \\lambda(\\mathbb{Q}^c \\cap [0,n]) = n^2. \\end{align}\\] Exercise 6.3 (borrowed from Jagannathan) Let \\(c \\in \\mathbb{R}\\) be fixed and (\\(\\mathbb{R}\\), \\(\\mathcal{B}(\\mathbb{R})\\)) a measurable space. If for any Borel set \\(A\\), \\(\\delta_c (A) = 1\\) if \\(c \\in A\\), and \\(\\delta_c (A) = 0\\) otherwise, then \\(\\delta_c\\) is called a Dirac measure. Let \\(g\\) be a non-negative, measurable function. Show that \\(\\int g d \\delta_c = g(c)\\). Solution. 
\\[\\begin{align} \\int g d \\delta_c &= \\sup_{q \\in S(g)} \\int q d \\delta_c \\\\ &= \\sup_{q \\in S(g)} \\sum_{i = 1}^n a_i \\delta_c(A_i) \\\\ &= \\sup_{q \\in S(g)} \\sum_{i = 1}^n a_i \\text{I}_{A_i}(c) \\\\ &= \\sup_{q \\in S(g)} q(c) \\\\ &= g(c) \\end{align}\\] "],["ev.html", "Chapter 7 Expected value 7.1 Discrete random variables 7.2 Continuous random variables 7.3 Sums, functions, conditional expectations 7.4 Covariance", " Chapter 7 Expected value This chapter deals with expected values of random variables. The students are expected to acquire the following knowledge: Theoretical Calculation of the expected value. Calculation of variance and covariance. Cauchy distribution. R Estimation of expected value. Estimation of variance and covariance. .fold-btn { float: right; margin: 5px 5px 0 0; } .fold { border: 1px solid black; min-height: 40px; } 7.1 Discrete random variables Exercise 7.1 (Bernoulli) Let \\(X \\sim \\text{Bernoulli}(p)\\). Find \\(E[X]\\). Find \\(Var[X]\\). R: Let \\(p = 0.4\\). Check your answers to a) and b) with a simulation. Solution. \\[\\begin{align*} E[X] = \\sum_{k=0}^1 p^k (1-p)^{1-k} k = p. \\end{align*}\\] \\[\\begin{align*} Var[X] = E[X^2] - E[X]^2 = \\sum_{k=0}^1 (p^k (1-p)^{1-k} k^2) - p^2 = p(1-p). \\end{align*}\\] set.seed(1) nsamps <- 1000 x <- rbinom(nsamps, 1, 0.4) mean(x) ## [1] 0.394 var(x) ## [1] 0.239003 0.4 * (1 - 0.4) ## [1] 0.24 Exercise 7.2 (Binomial) Let \\(X \\sim \\text{Binomial}(n,p)\\). Find \\(E[X]\\). Find \\(Var[X]\\). Solution. Let \\(X = \\sum_{i=0}^n X_i\\), where \\(X_i \\sim \\text{Bernoulli}(p)\\). Then, due to linearity of expectation \\[\\begin{align*} E[X] = E[\\sum_{i=0}^n X_i] = \\sum_{i=0}^n E[X_i] = np. \\end{align*}\\] Again let \\(X = \\sum_{i=0}^n X_i\\), where \\(X_i \\sim \\text{Bernoulli}(p)\\). Since the Bernoulli variables \\(X_i\\) are independent we have \\[\\begin{align*} Var[X] = Var[\\sum_{i=0}^n X_i] = \\sum_{i=0}^n Var[X_i] = np(1-p). \\end{align*}\\] Exercise 7.3 (Poisson) Let \\(X \\sim \\text{Poisson}(\\lambda)\\). Find \\(E[X]\\). Find \\(Var[X]\\). Solution. \\[\\begin{align*} E[X] &= \\sum_{k=0}^\\infty \\frac{\\lambda^k e^{-\\lambda}}{k!} k & \\\\ &= \\sum_{k=1}^\\infty \\frac{\\lambda^k e^{-\\lambda}}{k!} k & \\text{term at $k=0$ is 0} \\\\ &= e^{-\\lambda} \\lambda \\sum_{k=1}^\\infty \\frac{\\lambda^{k-1}}{(k - 1)!} & \\\\ &= e^{-\\lambda} \\lambda \\sum_{k=0}^\\infty \\frac{\\lambda^{k}}{k!} & \\\\ &= e^{-\\lambda} \\lambda e^\\lambda & \\\\ &= \\lambda. \\end{align*}\\] \\[\\begin{align*} Var[X] &= E[X^2] - E[X]^2 & \\\\ &= e^{-\\lambda} \\lambda \\sum_{k=1}^\\infty k \\frac{\\lambda^{k-1}}{(k - 1)!} - \\lambda^2 & \\\\ &= e^{-\\lambda} \\lambda \\sum_{k=1}^\\infty (k - 1) + 1) \\frac{\\lambda^{k-1}}{(k - 1)!} - \\lambda^2 & \\\\ &= e^{-\\lambda} \\lambda \\big(\\sum_{k=1}^\\infty (k - 1) \\frac{\\lambda^{k-1}}{(k - 1)!} + \\sum_{k=1}^\\infty \\frac{\\lambda^{k-1}}{(k - 1)!}\\Big) - \\lambda^2 & \\\\ &= e^{-\\lambda} \\lambda \\big(\\lambda\\sum_{k=2}^\\infty \\frac{\\lambda^{k-2}}{(k - 2)!} + e^\\lambda\\Big) - \\lambda^2 & \\\\ &= e^{-\\lambda} \\lambda \\big(\\lambda e^\\lambda + e^\\lambda\\Big) - \\lambda^2 & \\\\ &= \\lambda^2 + \\lambda - \\lambda^2 & \\\\ &= \\lambda. \\end{align*}\\] Exercise 7.4 (Geometric) Let \\(X \\sim \\text{Geometric}(p)\\). Find \\(E[X]\\). Hint: \\(\\frac{d}{dx} x^k = k x^{(k - 1)}\\). Solution. 
\\[\\begin{align*} E[X] &= \\sum_{k=0}^\\infty (1 - p)^k p k & \\\\ &= p (1 - p) \\sum_{k=0}^\\infty (1 - p)^{k-1} k & \\\\ &= p (1 - p) \\sum_{k=0}^\\infty -\\frac{d}{dp}(1 - p)^k & \\\\ &= p (1 - p) \\Big(-\\frac{d}{dp}\\Big) \\sum_{k=0}^\\infty (1 - p)^k & \\\\ &= p (1 - p) \\Big(-\\frac{d}{dp}\\Big) \\frac{1}{1 - (1 - p)} & \\text{geometric series} \\\\ &= \\frac{1 - p}{p} \\end{align*}\\] 7.2 Continuous random variables Exercise 7.5 (Gamma) Let \\(X \\sim \\text{Gamma}(\\alpha, \\beta)\\). Hint: \\(\\Gamma(z) = \\int_0^\\infty t^{z-1}e^{-t} dt\\) and \\(\\Gamma(z + 1) = z \\Gamma(z)\\). Find \\(E[X]\\). Find \\(Var[X]\\). R: Let \\(\\alpha = 10\\) and \\(\\beta = 2\\). Plot the density of \\(X\\). Add a horizontal line at the expected value that touches the density curve (geom_segment). Shade the area within a standard deviation of the expected value. Solution. \\[\\begin{align*} E[X] &= \\int_0^\\infty \\frac{\\beta^\\alpha}{\\Gamma(\\alpha)}x^\\alpha e^{-\\beta x} dx & \\\\ &= \\frac{\\beta^\\alpha}{\\Gamma(\\alpha)} \\int_0^\\infty x^\\alpha e^{-\\beta x} dx & \\text{ (let $t = \\beta x$)} \\\\ &= \\frac{\\beta^\\alpha}{\\Gamma(\\alpha) }\\int_0^\\infty \\frac{t^\\alpha}{\\beta^\\alpha} e^{-t} \\frac{dt}{\\beta} & \\\\ &= \\frac{1}{\\beta \\Gamma(\\alpha) }\\int_0^\\infty t^\\alpha e^{-t} dt & \\\\ &= \\frac{\\Gamma(\\alpha + 1)}{\\beta \\Gamma(\\alpha)} & \\\\ &= \\frac{\\alpha \\Gamma(\\alpha)}{\\beta \\Gamma(\\alpha)} & \\\\ &= \\frac{\\alpha}{\\beta}. & \\end{align*}\\] \\[\\begin{align*} Var[X] &= E[X^2] - E[X]^2 \\\\ &= \\int_0^\\infty \\frac{\\beta^\\alpha}{\\Gamma(\\alpha)}x^{\\alpha+1} e^{-\\beta x} dx - \\frac{\\alpha^2}{\\beta^2} \\\\ &= \\frac{\\Gamma(\\alpha + 2)}{\\beta^2 \\Gamma(\\alpha)} - \\frac{\\alpha^2}{\\beta^2} \\\\ &= \\frac{(\\alpha + 1)\\alpha\\Gamma(\\alpha)}{\\beta^2 \\Gamma(\\alpha)} - \\frac{\\alpha^2}{\\beta^2} \\\\ &= \\frac{\\alpha^2 + \\alpha}{\\beta^2} - \\frac{\\alpha^2}{\\beta^2} \\\\ &= \\frac{\\alpha}{\\beta^2}. \\end{align*}\\] set.seed(1) x <- seq(0, 25, by = 0.01) y <- dgamma(x, shape = 10, rate = 2) df <- data.frame(x = x, y = y) ggplot(df, aes(x = x, y = y)) + geom_line() + geom_segment(aes(x = 5, y = 0, xend = 5, yend = dgamma(5, shape = 10, rate = 2)), color = "red") + stat_function(fun = dgamma, args = list(shape = 10, rate = 2), xlim = c(5 - sqrt(10/4), 5 + sqrt(10/4)), geom = "area", fill = "gray", alpha = 0.4) Exercise 7.6 (Beta) Let \\(X \\sim \\text{Beta}(\\alpha, \\beta)\\). Find \\(E[X]\\). Hint 1: \\(\\text{B}(x,y) = \\int_0^1 t^{x-1} (1 - t)^{y-1} dt\\). Hint 2: \\(\\text{B}(x + 1, y) = \\text{B}(x,y)\\frac{x}{x + y}\\). Find \\(Var[X]\\). Solution. \\[\\begin{align*} E[X] &= \\int_0^1 \\frac{x^{\\alpha - 1} (1 - x)^{\\beta - 1}}{\\text{B}(\\alpha, \\beta)} x dx \\\\ &= \\frac{1}{\\text{B}(\\alpha, \\beta)}\\int_0^1 x^{\\alpha} (1 - x)^{\\beta - 1} dx \\\\ &= \\frac{1}{\\text{B}(\\alpha, \\beta)} \\text{B}(\\alpha + 1, \\beta) \\\\ &= \\frac{1}{\\text{B}(\\alpha, \\beta)} \\text{B}(\\alpha, \\beta) \\frac{\\alpha}{\\alpha + \\beta} \\\\ &= \\frac{\\alpha}{\\alpha + \\beta}. 
\\\\ \\end{align*}\\] \\[\\begin{align*} Var[X] &= E[X^2] - E[X]^2 \\\\ &= \\int_0^1 \\frac{x^{\\alpha - 1} (1 - x)^{\\beta - 1}}{\\text{B}(\\alpha, \\beta)} x^2 dx - \\frac{\\alpha^2}{(\\alpha + \\beta)^2} \\\\ &= \\frac{1}{\\text{B}(\\alpha, \\beta)}\\int_0^1 x^{\\alpha + 1} (1 - x)^{\\beta - 1} dx - \\frac{\\alpha^2}{(\\alpha + \\beta)^2} \\\\ &= \\frac{1}{\\text{B}(\\alpha, \\beta)} \\text{B}(\\alpha + 2, \\beta) - \\frac{\\alpha^2}{(\\alpha + \\beta)^2} \\\\ &= \\frac{1}{\\text{B}(\\alpha, \\beta)} \\text{B}(\\alpha + 1, \\beta) \\frac{\\alpha + 1}{\\alpha + \\beta + 1} - \\frac{\\alpha^2}{(\\alpha + \\beta)^2} \\\\ &= \\frac{\\alpha + 1}{\\alpha + \\beta + 1} \\frac{\\alpha}{\\alpha + \\beta} - \\frac{\\alpha^2}{(\\alpha + \\beta)^2}\\\\ &= \\frac{\\alpha \\beta}{(\\alpha + \\beta)^2(\\alpha + \\beta + 1)}. \\end{align*}\\] Exercise 7.7 (Exponential) Let \\(X \\sim \\text{Exp}(\\lambda)\\). Find \\(E[X]\\). Hint: \\(\\Gamma(z + 1) = z\\Gamma(z)\\) and \\(\\Gamma(1) = 1\\). Find \\(Var[X]\\). Solution. \\[\\begin{align*} E[X] &= \\int_0^\\infty \\lambda e^{-\\lambda x} x dx & \\\\ &= \\lambda \\int_0^\\infty x e^{-\\lambda x} dx & \\\\ &= \\lambda \\int_0^\\infty \\frac{t}{\\lambda} e^{-t} \\frac{dt}{\\lambda} & \\text{$t = \\lambda x$}\\\\ &= \\lambda \\lambda^{-2} \\Gamma(2) & \\text{definition of gamma function} \\\\ &= \\lambda^{-1}. \\end{align*}\\] \\[\\begin{align*} Var[X] &= E[X^2] - E[X]^2 & \\\\ &= \\int_0^\\infty \\lambda e^{-\\lambda x} x^2 dx - \\lambda^{-2} & \\\\ &= \\lambda \\int_0^\\infty \\frac{t^2}{\\lambda^2} e^{-t} \\frac{dt}{\\lambda} - \\lambda^{-2} & \\text{$t = \\lambda x$} \\\\ &= \\lambda \\lambda^{-3} \\Gamma(3) - \\lambda^{-2} & \\text{definition of gamma function} & \\\\ &= \\lambda^{-2} 2 \\Gamma(2) - \\lambda^{-2} & \\\\ &= 2 \\lambda^{-2} - \\lambda^{-2} & \\\\ &= \\lambda^{-2}. & \\\\ \\end{align*}\\] Exercise 7.8 (Normal) Let \\(X \\sim \\text{N}(\\mu, \\sigma)\\). Show that \\(E[X] = \\mu\\). Hint: Use the error function \\(\\text{erf}(x) = \\frac{1}{\\sqrt(\\pi)} \\int_{-x}^x e^{-t^2} dt\\). The statistical interpretation of this function is that if \\(Y \\sim \\text{N}(0, 0.5)\\), then the error function describes the probability of \\(Y\\) falling between \\(-x\\) and \\(x\\). Also, \\(\\text{erf}(\\infty) = 1\\). Show that \\(Var[X] = \\sigma^2\\). Hint: Start with the definition of variance. Solution. \\[\\begin{align*} E[X] &= \\int_{-\\infty}^\\infty \\frac{1}{\\sqrt{2\\pi \\sigma^2}} e^{-\\frac{(x - \\mu)^2}{2\\sigma^2}} x dx & \\\\ &= \\frac{1}{\\sqrt{2\\pi \\sigma^2}} \\int_{-\\infty}^\\infty x e^{-\\frac{(x - \\mu)^2}{2\\sigma^2}} dx & \\\\ &= \\frac{1}{\\sqrt{2\\pi \\sigma^2}} \\int_{-\\infty}^\\infty \\Big(t \\sqrt{2\\sigma^2} + \\mu\\Big)e^{-t^2} \\sqrt{2 \\sigma^2} dt & t = \\frac{x - \\mu}{\\sqrt{2}\\sigma} \\\\ &= \\frac{\\sqrt{2\\sigma^2}}{\\sqrt{\\pi}} \\int_{-\\infty}^\\infty t e^{-t^2} dt + \\frac{1}{\\sqrt{\\pi}} \\int_{-\\infty}^\\infty \\mu e^{-t^2} dt & \\\\ \\end{align*}\\] Let us calculate these integrals separately. \\[\\begin{align*} \\int t e^{-t^2} dt &= -\\frac{1}{2}\\int e^{s} ds & s = -t^2 \\\\ &= -\\frac{e^s}{2} + C \\\\ &= -\\frac{e^{-t^2}}{2} + C & \\text{undoing substitution}. \\end{align*}\\] Inserting the integration limits we get \\[\\begin{align*} \\int_{-\\infty}^\\infty t e^{-t^2} dt &= 0, \\end{align*}\\] due to the integrated function being symmetric. 
Reordering the second integral we get \\[\\begin{align*} \\mu \\frac{1}{\\sqrt{\\pi}} \\int_{-\\infty}^\\infty e^{-t^2} dt &= \\mu \\text{erf}(\\infty) & \\text{definition of error function} \\\\ &= \\mu & \\text{probability of $Y$ falling between $-\\infty$ and $\\infty$}. \\end{align*}\\] Combining all of the above we get \\[\\begin{align*} E[X] &= \\frac{\\sqrt{2\\sigma^2}}{\\sqrt{\\pi}} \\times 0 + \\mu \\\\ &= \\mu. \\end{align*}\\] \\[\\begin{align*} Var[X] &= E[(X - E[X])^2] \\\\ &= E[(X - \\mu)^2] \\\\ &= \\frac{1}{\\sqrt{2\\pi \\sigma^2}} \\int_{-\\infty}^\\infty (x - \\mu)^2 e^{-\\frac{(x - \\mu)^2}{2\\sigma^2}} dx \\\\ &= \\frac{\\sigma^2}{\\sqrt{2\\pi}} \\int_{-\\infty}^\\infty t^2 e^{-\\frac{t^2}{2}} dt \\\\ &= \\frac{\\sigma^2}{\\sqrt{2\\pi}} \\bigg(\\Big(- t e^{-\\frac{t^2}{2}}\\Big) \\Big|_{-\\infty}^\\infty + \\int_{-\\infty}^\\infty e^{-\\frac{t^2}{2}} dt \\bigg) & \\text{integration by parts} \\\\ &= \\frac{\\sigma^2}{\\sqrt{2\\pi}} \\sqrt{2 \\pi} \\int_{-\\infty}^\\infty \\frac{1}{\\sqrt{\\pi}}e^{-s^2} ds & s = \\frac{t}{\\sqrt{2}} \\text{, the evaluated term being zero at both bounds} \\\\ &= \\frac{\\sigma^2}{\\sqrt{2\\pi}} \\sqrt{2 \\pi} \\, \\text{erf}(\\infty) & \\text{definition of error function} \\\\ &= \\sigma^2. \\end{align*}\\] 7.3 Sums, functions, conditional expectations Exercise 7.9 (Expectation of transformations) Let \\(X\\) follow a normal distribution with mean \\(\\mu\\) and variance \\(\\sigma^2\\). Find \\(E[2X + 4]\\). Find \\(E[X^2]\\). Find \\(E[\\exp(X)]\\). Hint: Use the error function \\(\\text{erf}(x) = \\frac{1}{\\sqrt{\\pi}} \\int_{-x}^x e^{-t^2} dt\\). Also, \\(\\text{erf}(\\infty) = 1\\). R: Check your results numerically for \\(\\mu = 0.4\\) and \\(\\sigma^2 = 0.25\\) and plot the densities of all four distributions. Solution. \\[\\begin{align} E[2X + 4] &= 2E[X] + 4 & \\text{linearity of expectation} \\\\ &= 2\\mu + 4. \\\\ \\end{align}\\] \\[\\begin{align} E[X^2] &= E[X]^2 + Var[X] & \\text{definition of variance} \\\\ &= \\mu^2 + \\sigma^2. \\end{align}\\] \\[\\begin{align} E[\\exp(X)] &= \\int_{-\\infty}^\\infty \\frac{1}{\\sqrt{2\\pi \\sigma^2}} e^{-\\frac{(x - \\mu)^2}{2\\sigma^2}} e^x dx & \\\\ &= \\frac{1}{\\sqrt{2\\pi \\sigma^2}} \\int_{-\\infty}^\\infty e^{\\frac{2 \\sigma^2 x}{2\\sigma^2} -\\frac{(x - \\mu)^2}{2\\sigma^2}} dx & \\\\ &= \\frac{1}{\\sqrt{2\\pi \\sigma^2}} \\int_{-\\infty}^\\infty e^{-\\frac{x^2 - 2x(\\mu + \\sigma^2) + \\mu^2}{2\\sigma^2}} dx & \\\\ &= \\frac{1}{\\sqrt{2\\pi \\sigma^2}} \\int_{-\\infty}^\\infty e^{-\\frac{(x - (\\mu + \\sigma^2))^2 + \\mu^2 - (\\mu + \\sigma^2)^2}{2\\sigma^2}} dx & \\text{complete the square} \\\\ &= \\frac{1}{\\sqrt{2\\pi \\sigma^2}} e^{\\frac{- \\mu^2 + (\\mu + \\sigma^2)^2}{2\\sigma^2}} \\int_{-\\infty}^\\infty e^{-\\frac{(x - (\\mu + \\sigma^2))^2}{2\\sigma^2}} dx & \\\\ &= \\frac{1}{\\sqrt{2\\pi \\sigma^2}} e^{\\frac{- \\mu^2 + (\\mu + \\sigma^2)^2}{2\\sigma^2}} \\sigma \\sqrt{2 \\pi} \\text{erf}(\\infty) & \\\\ &= e^{\\frac{2\\mu + \\sigma^2}{2}}. \\end{align}\\] set.seed(1) mu <- 0.4 sigma <- 0.5 x <- rnorm(100000, mean = mu, sd = sigma) mean(2*x + 4) ## [1] 4.797756 2 * mu + 4 ## [1] 4.8 mean(x^2) ## [1] 0.4108658 mu^2 + sigma^2 ## [1] 0.41 mean(exp(x)) ## [1] 1.689794 exp((2 * mu + sigma^2) / 2) ## [1] 1.690459 Exercise 7.10 (Sum of independent random variables) Borrowed from Wasserman. Let \\(X_1, X_2,...,X_n\\) be IID random variables with expected value \\(E[X_i] = \\mu\\) and variance \\(Var[X_i] = \\sigma^2\\). 
Find the expected value and variance of \\(\\bar{X} = \\frac{1}{n} \\sum_{i=1}^n X_i\\). \\(\\bar{X}\\) is called a statistic (a function of the values in a sample). It is itself a random variable and its distribution is called a sampling distribution. R: Take \\(n = 5, 10, 100, 1000\\) samples from the N(\\(2\\), \\(6\\)) distribution 10000 times. Plot the theoretical density and the densities of \\(\\bar{X}\\) statistic for each \\(n\\). Intuitively, are the results in correspondence with your calculations? Check them numerically. Solution. Let us start with the expectation of \\(\\bar{X}\\). \\[\\begin{align} E[\\bar{X}] &= E[\\frac{1}{n} \\sum_{i=1}^n X_i] & \\\\ &= \\frac{1}{n} E[\\sum_{i=1}^n X_i] & \\text{ (multiplication with a scalar)} \\\\ &= \\frac{1}{n} \\sum_{i=1}^n E[X_i] & \\text{ (linearity)} \\\\ &= \\frac{1}{n} n \\mu & \\\\ &= \\mu. \\end{align}\\] Now the variance \\[\\begin{align} Var[\\bar{X}] &= Var[\\frac{1}{n} \\sum_{i=1}^n X_i] & \\\\ &= \\frac{1}{n^2} Var[\\sum_{i=1}^n X_i] & \\text{ (multiplication with a scalar)} \\\\ &= \\frac{1}{n^2} \\sum_{i=1}^n Var[X_i] & \\text{ (independence of samples)} \\\\ &= \\frac{1}{n^2} n \\sigma^2 & \\\\ &= \\frac{1}{n} \\sigma^2. \\end{align}\\] set.seed(1) nsamps <- 10000 mu <- 2 sigma <- sqrt(6) N <- c(5, 10, 100, 500) X <- matrix(data = NA, nrow = nsamps, ncol = length(N)) ind <- 1 for (n in N) { for (i in 1:nsamps) { X[i,ind] <- mean(rnorm(n, mu, sigma)) } ind <- ind + 1 } colnames(X) <- N X <- melt(as.data.frame(X)) ggplot(data = X, aes(x = value, colour = variable)) + geom_density() + stat_function(data = data.frame(x = seq(-2, 6, by = 0.01)), aes(x = x), fun = dnorm, args = list(mean = mu, sd = sigma), color = "black") Exercise 7.11 (Conditional expectation) Let \\(X \\in \\mathbb{R}_0^+\\) and \\(Y \\in \\mathbb{N}_0\\) be random variables with joint distribution \\(p_{XY}(X,Y) = \\frac{1}{y + 1} e^{-\\frac{x}{y + 1}} 0.5^{y + 1}\\). Find \\(E[X | Y = y]\\) by first finding \\(p_Y\\) and then \\(p_{X|Y}\\). Find \\(E[X]\\). R: check your answers to a) and b) by drawing 10000 samples from \\(p_Y\\) and \\(p_{X|Y}\\). Solution. \\[\\begin{align} p(y) &= \\int_0^\\infty \\frac{1}{y + 1} e^{-\\frac{x}{y + 1}} 0.5^{y + 1} dx \\\\ &= \\frac{0.5^{y + 1}}{y + 1} \\int_0^\\infty e^{-\\frac{x}{y + 1}} dx \\\\ &= \\frac{0.5^{y + 1}}{y + 1} (y + 1) \\\\ &= 0.5^{y + 1} \\\\ &= 0.5(1 - 0.5)^y. \\end{align}\\] We recognize this as the geometric distribution. \\[\\begin{align} p(x|y) &= \\frac{p(x,y)}{p(y)} \\\\ &= \\frac{1}{y + 1} e^{-\\frac{x}{y + 1}}. \\end{align}\\] We recognize this as the exponential distribution. \\[\\begin{align} E[X | Y = y] &= \\int_0^\\infty x \\frac{1}{y + 1} e^{-\\frac{x}{y + 1}} dx \\\\ &= y + 1 & \\text{expected value of the exponential distribution} \\end{align}\\] Use the law of iterated expectation. \\[\\begin{align} E[X] &= E[E[X | Y]] \\\\ &= E[Y + 1] \\\\ &= E[Y] + 1 \\\\ &= \\frac{1 - 0.5}{0.5} + 1 \\\\ &= 2. \\end{align}\\] set.seed(1) y <- rgeom(100000, 0.5) x <- rexp(100000, rate = 1 / (y + 1)) x2 <- x[y == 3] mean(x2) ## [1] 4.048501 3 + 1 ## [1] 4 mean(x) ## [1] 2.007639 (1 - 0.5) / 0.5 + 1 ## [1] 2 Exercise 7.12 (Cauchy distribution) Let \\(p(x | x_0, \\gamma) = \\frac{1}{\\pi \\gamma \\Big(1 + \\big(\\frac{x - x_0}{\\gamma}\\big)^2\\Big)}\\). A random variable with this PDF follows a Cauchy distribution. This distribution is symmetric and has wider tails than the normal distribution. R: Draw \\(n = 1,...,1000\\) samples from a standard normal and \\(\\text{Cauchy}(0, 1)\\). 
For each \\(n\\) plot the mean and the median of the sample using facets. Interpret the results. To get a mathematical explanation of the results in a), evaluate the integral \\(\\int_0^\\infty \\frac{x}{1 + x^2} dx\\) and consider that \\(E[X] = \\int_{-\\infty}^\\infty \\frac{x}{1 + x^2}dx\\). set.seed(1) n <- 1000 means_n <- vector(mode = "numeric", length = n) means_c <- vector(mode = "numeric", length = n) medians_n <- vector(mode = "numeric", length = n) medians_c <- vector(mode = "numeric", length = n) for (i in 1:n) { tmp_n <- rnorm(i) tmp_c <- rcauchy(i) means_n[i] <- mean(tmp_n) means_c[i] <- mean(tmp_c) medians_n[i] <- median(tmp_n) medians_c[i] <- median(tmp_c) } df <- data.frame("distribution" = c(rep("normal", 2 * n), rep("Cauchy", 2 * n)), "type" = c(rep("mean", n), rep("median", n), rep("mean", n), rep("median", n)), "value" = c(means_n, medians_n, means_c, medians_c), "n" = rep(1:n, times = 4)) ggplot(df, aes(x = n, y = value)) + geom_line(alpha = 0.5) + facet_wrap(~ type + distribution , scales = "free") Solution. \\[\\begin{align} \\int_0^\\infty \\frac{x}{1 + x^2} dx &= \\frac{1}{2} \\int_1^\\infty \\frac{1}{u} du & u = 1 + x^2 \\\\ &= \\frac{1}{2} \\ln(u) \\Big|_1^\\infty. \\end{align}\\] This integral is not finite. The same holds for the negative part. Therefore, the expectation is undefined, as \\(E[|X|] = \\infty\\). Why can we not just claim that \\(f(x) = x / (1 + x^2)\\) is odd and \\(\\int_{-\\infty}^\\infty f(x) = 0\\)? By definition of the Lebesgue integral \\(\\int_{-\\infty}^{\\infty} f= \\int_{-\\infty}^{\\infty} f_+-\\int_{-\\infty}^{\\infty} f_-\\). At least one of the two integrals needs to be finite for \\(\\int_{-\\infty}^{\\infty} f\\) to be well-defined. However \\(\\int_{-\\infty}^{\\infty} f_+=\\int_0^{\\infty} x/(1+x^2)\\) and \\(\\int_{-\\infty}^{\\infty} f_-=\\int_{-\\infty}^{0} |x|/(1+x^2)\\). We have just shown that both of these integrals are infinite, which implies that their sum is also infinite. 7.4 Covariance Exercise 7.13 Below is a table of values for random variables \\(X\\) and \\(Y\\). X Y 2.1 8 -0.5 11 1 10 -2 12 4 9 Find sample covariance of \\(X\\) and \\(Y\\). Find sample variances of \\(X\\) and \\(Y\\). Find sample correlation of \\(X\\) and \\(Y\\). Find sample variance of \\(Z = 2X - 3Y\\). Solution. \\(\\bar{X} = 0.92\\) and \\(\\bar{Y} = 10\\). \\[\\begin{align} s(X, Y) &= \\frac{1}{n - 1} \\sum_{i=1}^5 (X_i - 0.92) (Y_i - 10) \\\\ &= -3.175. \\end{align}\\] \\[\\begin{align} s(X) &= \\frac{\\sum_{i=1}^5(X_i - 0.92)^2}{5 - 1} \\\\ &= 5.357. \\end{align}\\] \\[\\begin{align} s(Y) &= \\frac{\\sum_{i=1}^5(Y_i - 10)^2}{5 - 1} \\\\ &= 2.5. \\end{align}\\] \\[\\begin{align} r(X,Y) &= \\frac{Cov(X,Y)}{\\sqrt{Var[X]Var[Y]}} \\\\ &= \\frac{-3.175}{\\sqrt{5.357 \\times 2.5}} \\\\ &= -0.868. \\end{align}\\] \\[\\begin{align} s(Z) &= 2^2 s(X) + 3^2 s(Y) - 2 \\times 2 \\times 3 \\, s(X, Y) \\\\ &= 4 \\times 5.357 + 9 \\times 2.5 - 12 \\times (-3.175) \\\\ &= 82.028. \\end{align}\\] Exercise 7.14 Let \\(X \\sim \\text{Uniform}(0,1)\\) and \\(Y | X = x \\sim \\text{Uniform}(0,x)\\). Find the covariance of \\(X\\) and \\(Y\\). Find the correlation of \\(X\\) and \\(Y\\). R: check your answers to a) and b) with simulation. Plot \\(X\\) against \\(Y\\) on a scatterplot. Solution. The joint PDF is \\(p(x,y) = p(x)p(y|x) = \\frac{1}{x}\\). 
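Before evaluating the covariance, a quick added numerical aside (not part of the original solution): the first term computed in the derivation that follows, \\(E[XY]\\), can be checked with nested numerical integration and should come out as approximately \\(\\frac{1}{6}\\):
E_xy <- integrate(function(x) sapply(x, function(xi) integrate(function(y) y, 0, xi)$value), 0, 1)$value # integrand x * y * (1/x) = y
E_xy # ~0.1667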
\\[\\begin{align} Cov(X,Y) &= E[XY] - E[X]E[Y] \\\\ \\end{align}\\] Let us first evaluate the first term: \\[\\begin{align} E[XY] &= \\int_0^1 \\int_0^x x y \\frac{1}{x} dy dx \\\\ &= \\int_0^1 \\int_0^x y dy dx \\\\ &= \\int_0^1 \\frac{x^2}{2} dx \\\\ &= \\frac{1}{6}. \\end{align}\\] Now let us find \\(E[Y]\\), \\(E[X]\\) is trivial. \\[\\begin{align} E[Y] = E[E[Y | X]] = E[\\frac{X}{2}] = \\frac{1}{2} \\int_0^1 x dx = \\frac{1}{4}. \\end{align}\\] Combining all: \\[\\begin{align} Cov(X,Y) &= \\frac{1}{6} - \\frac{1}{2} \\frac{1}{4} = \\frac{1}{24}. \\end{align}\\] \\[\\begin{align} \\rho(X,Y) &= \\frac{Cov(X,Y)}{\\sqrt{Var[X]Var[Y]}} \\\\ \\end{align}\\] Let us calculate \\(Var[X]\\). \\[\\begin{align} Var[X] &= E[X^2] - \\frac{1}{4} \\\\ &= \\int_0^1 x^2 - \\frac{1}{4} \\\\ &= \\frac{1}{3} - \\frac{1}{4} \\\\ &= \\frac{1}{12}. \\end{align}\\] Let us calculate \\(E[E[Y^2|X]]\\). \\[\\begin{align} E[E[Y^2|X]] &= E[\\frac{x^2}{3}] \\\\ &= \\frac{1}{9}. \\end{align}\\] Then \\(Var[Y] = \\frac{1}{9} - \\frac{1}{16} = \\frac{7}{144}\\). Combining all \\[\\begin{align} \\rho(X,Y) &= \\frac{\\frac{1}{24}}{\\sqrt{\\frac{1}{12}\\frac{7}{144}}} \\\\ &= 0.65. \\end{align}\\] set.seed(1) nsamps <- 10000 x <- runif(nsamps) y <- runif(nsamps, 0, x) cov(x, y) ## [1] 0.04274061 1/24 ## [1] 0.04166667 cor(x, y) ## [1] 0.6629567 (1 / 24) / (sqrt(7 / (12 * 144))) ## [1] 0.6546537 ggplot(data.frame(x = x, y = y), aes(x = x, y = y)) + geom_point(alpha = 0.2) + geom_smooth(method = "lm") "],["mrv.html", "Chapter 8 Multivariate random variables 8.1 Multinomial random variables 8.2 Multivariate normal random variables 8.3 Transformations", " Chapter 8 Multivariate random variables This chapter deals with multivariate random variables. The students are expected to acquire the following knowledge: Theoretical Multinomial distribution. Multivariate normal distribution. Cholesky decomposition. Eigendecomposition. R Sampling from the multinomial distribution. Sampling from the multivariate normal distribution. Matrix decompositions. .fold-btn { float: right; margin: 5px 5px 0 0; } .fold { border: 1px solid black; min-height: 40px; } 8.1 Multinomial random variables Exercise 8.1 Let \\(X_i\\), \\(i = 1,...,k\\) represent \\(k\\) events, and \\(p_i\\) the probabilities of these events happening in a trial. Let \\(n\\) be the number of trials, and \\(X\\) a multivariate random variable, the collection of \\(X_i\\). Then \\(p(x) = \\frac{n!}{x_1!x_2!...x_k!} p_1^{x_1} p_2^{x_2}...p_k^{x_k}\\) is the PMF of a multinomial distribution, where \\(n = \\sum_{i = 1}^k x_i\\). Show that the marginal distribution of \\(X_i\\) is a binomial distribution. Take 1000 samples from the multinomial distribution with \\(n=4\\) and probabilities \\(p = (0.2, 0.2, 0.5, 0.1)\\). Then take 1000 samples from four binomial distributions with the same parameters. Inspect the results visually. Solution. We will approach this proof from the probabilistic point of view. W.L.O.G. let \\(x_1\\) be the marginal distribution we are interested in. The term \\(p^{x_1}\\) denotes the probability that event 1 happened \\(x_1\\) times. For this event not to happen, one of the other events needs to happen. So for each of the remaining trials, the probability of another event is \\(\\sum_{i=2}^k p_i = 1 - p_1\\), and there were \\(n - x_1\\) such trials. What is left to do is to calculate the number of permutations of event 1 happening and event 1 not happening. We choose \\(x_1\\) trials, from \\(n\\) trials. 
Therefore \\(p(x_1) = \\binom{n}{x_1} p_1^{x_1} (1 - p_1)^{n - x_1}\\), which is the binomial PMF. Interested students are encouraged to prove this mathematically. set.seed(1) nsamps <- 1000 samps_mult <- rmultinom(nsamps, 4, prob = c(0.2, 0.2, 0.5, 0.1)) samps_mult <- as_tibble(t(samps_mult)) %>% gather() samps <- tibble( V1 = rbinom(nsamps, 4, 0.2), V2 = rbinom(nsamps, 4, 0.2), V3 = rbinom(nsamps, 4, 0.5), V4 = rbinom(nsamps, 4, 0.1) ) %>% gather() %>% bind_rows(samps_mult) %>% bind_cols("dist" = c(rep("binomial", 4*nsamps), rep("multinomial", 4*nsamps))) ggplot(samps, aes(x = value, fill = dist)) + geom_bar(position = "dodge") + facet_wrap(~ key) Exercise 8.2 (Multinomial expected value) Find the expected value, variance and covariance of the multinomial distribution. Hint: First find the expected value for \\(n = 1\\) and then use the fact that the trials are independent. Solution. Let us first calculate the expected value of \\(X_1\\), when \\(n = 1\\). \\[\\begin{align} E[X_1] &= \\sum_{n_1 = 0}^1 \\sum_{n_2 = 0}^1 ... \\sum_{n_k = 0}^1 \\frac{1}{n_1!n_2!...n_k!}p_1^{n_1}p_2^{n_2}...p_k^{n_k}n_1 \\\\ &= \\sum_{n_1 = 0}^1 \\frac{p_1^{n_1} n_1}{n_1!} \\sum_{n_2 = 0}^1 ... \\sum_{n_k = 0}^1 \\frac{1}{n_2!...n_k!}p_2^{n_2}...p_k^{n_k} \\end{align}\\] When \\(n_1 = 0\\) then the whole terms is zero, so we do not need to evaluate other sums. When \\(n_1 = 1\\), all other \\(n_i\\) must be zero, as we have \\(1 = \\sum_{i=1}^k n_i\\). Therefore the other sums equal \\(1\\). So \\(E[X_1] = p_1\\) and \\(E[X_i] = p_i\\) for \\(i = 1,...,k\\). Now let \\(Y_j\\), \\(j = 1,...,n\\), have a multinomial distribution with \\(n = 1\\), and let \\(X\\) have a multinomial distribution with an arbitrary \\(n\\). Then we can write \\(X = \\sum_{j=1}^n Y_j\\). And due to independence \\[\\begin{align} E[X] &= E[\\sum_{j=1}^n Y_j] \\\\ &= \\sum_{j=1}^n E[Y_j] \\\\ &= np. \\end{align}\\] For the variance, we need \\(E[X^2]\\). Let us follow the same procedure as above and first calculate \\(E[X_i]\\) for \\(n = 1\\). The only thing that changes is that the term \\(n_i\\) becomes \\(n_i^2\\). Since we only have \\(0\\) and \\(1\\) this does not change the outcome. So \\[\\begin{align} Var[X_i] &= E[X_i^2] - E[X_i]^2\\\\ &= p_i(1 - p_i). \\end{align}\\] Analogous to above for arbitrary \\(n\\) \\[\\begin{align} Var[X] &= E[X^2] - E[X]^2 \\\\ &= \\sum_{j=1}^n E[Y_j^2] - \\sum_{j=1}^n E[Y_j]^2 \\\\ &= \\sum_{j=1}^n E[Y_j^2] - E[Y_j]^2 \\\\ &= \\sum_{j=1}^n p(1-p) \\\\ &= np(1-p). \\end{align}\\] To calculate the covariance, we need \\(E[X_i X_j]\\). Again, let us start with \\(n = 1\\). Without loss of generality, let us assume \\(i = 1\\) and \\(j = 2\\). \\[\\begin{align} E[X_1 X_2] = \\sum_{n_1 = 0}^1 \\sum_{n_2 = 0}^1 \\frac{p_1^{n_1} n_1}{n_1!} \\frac{p_2^{n_2} n_2}{n_2!} \\sum_{n_3 = 0}^1 ... \\sum_{n_k = 0}^1 \\frac{1}{n_3!...n_k!}p_3^{n_3}...p_k^{n_k}. \\end{align}\\] In the above expression, at each iteration we multiply with \\(n_1\\) and \\(n_2\\). Since \\(n = 1\\), one of these always has to be zero. Therefore \\(E[X_1 X_2] = 0\\) and \\[\\begin{align} Cov(X_i, X_j) &= E[X_i X_j] - E[X_i]E[X_j] \\\\ &= - p_i p_j. \\end{align}\\] For arbitrary \\(n\\), let \\(X = \\sum_{t = 1}^n Y_t\\) be the sum of independent multinomial random variables \\(Y_t = [X_{1t}, X_{2t},...,X_{kt}]^T\\) with \\(n=1\\). Then \\(X_1 = \\sum_{t = 1}^n X_{1t}\\) and \\(X_2 = \\sum_{l = 1}^n X_{2l}\\). 
\\[\\begin{align} Cov(X_1, X_2) &= E[X_1 X_2] - E[X_1] E[X_2] \\\\ &= E[\\sum_{t = 1}^n X_{1t} \\sum_{l = 1}^n X_{2l}] - n^2 p_1 p_2 \\\\ &= \\sum_{t = 1}^n \\sum_{l = 1}^n E[X_{1t} X_{2l}] - n^2 p_1 p_2. \\end{align}\\] For \\(X_{1t}\\) and \\(X_{2l}\\) the expected value is zero when \\(t = l\\). When \\(t \\neq l\\) then they are independent, so the expected value is the product \\(p_1 p_2\\). There are \\(n^2\\) total terms, and for \\(n\\) of them \\(t = l\\) holds. So \\(E[X_1 X_2] = (n^2 - n) p_1 p_2\\). Inserting into the above \\[\\begin{align} Cov(X_1, X_2) &= (n^2 - n) p_1 p_2 - n^2 p_1 p_2 \\\\ &= - n p_1 p_2. \\end{align}\\] 8.2 Multivariate normal random variables Exercise 8.3 (Cholesky decomposition) Let \\(X\\) be a random vector of length \\(k\\) with \\(X_i \\sim \\text{N}(0, 1)\\) and \\(LL^*\\) the Cholesky decomposition of a Hermitian positive-definite matrix \\(A\\). Let \\(\\mu\\) be a vector of length \\(k\\). Find the distribution of the random vector \\(Y = \\mu + L X\\). Find the Cholesky decomposition of \\(A = \\begin{bmatrix} 2 & 1.2 \\\\ 1.2 & 1 \\end{bmatrix}\\). R: Use the results from a) and b) to sample from the MVN distribution \\(\\text{N}(\\mu, A)\\), where \\(\\mu = [1.5, -1]^T\\). Plot a scatterplot and compare it to direct samples from the multivariate normal distribution (rmvnorm). Solution. \\(X\\) has an independent normal distribution of dimension \\(k\\). Then \\[\\begin{align} Y = \\mu + L X &\\sim \\text{N}(\\mu, LL^T) \\\\ &\\sim \\text{N}(\\mu, A). \\end{align}\\] Solve \\[\\begin{align} \\begin{bmatrix} a & 0 \\\\ b & c \\end{bmatrix} \\begin{bmatrix} a & b \\\\ 0 & c \\end{bmatrix} = \\begin{bmatrix} 2 & 1.2 \\\\ 1.2 & 1 \\end{bmatrix} \\end{align}\\] # a set.seed(1) nsamps <- 1000 X <- matrix(data = rnorm(nsamps * 2), ncol = 2) mu <- c(1.5, -1) L <- matrix(data = c(sqrt(2), 0, 1.2 / sqrt(2), sqrt(1 - 1.2^2/2)), ncol = 2, byrow = TRUE) Y <- t(mu + L %*% t(X)) plot_df <- data.frame(rbind(X, Y), c(rep("X", nsamps), rep("Y", nsamps))) colnames(plot_df) <- c("D1", "D2", "var") ggplot(data = plot_df, aes(x = D1, y = D2, colour = as.factor(var))) + geom_point() Exercise 8.4 (Eigendecomposition) R: Let \\(\\Sigma = U \\Lambda U^T\\) be the eigendecomposition of covariance matrix \\(\\Sigma\\). Follow the procedure below, to sample from a multivariate normal with \\(\\mu = [-2, 1]^T\\) and \\(\\Sigma = \\begin{bmatrix} 0.3, -0.5 \\\\ -0.5, 1.6 \\end{bmatrix}\\): Sample from two independent standardized normal distributions to get \\(X\\). Find the eigendecomposition of \\(X\\) (eigen). Multiply \\(X\\) by \\(\\Lambda^{\\frac{1}{2}}\\) to get \\(X2\\). Consider how the eigendecomposition for \\(X2\\) changes compared to \\(X\\). Multiply \\(X2\\) by \\(U\\) to get \\(X3\\). Consider how the eigendecomposition for \\(X3\\) changes compared to \\(X2\\). Add \\(\\mu\\) to \\(X3\\). Consider how the eigendecomposition for \\(X4\\) changes compared to \\(X3\\). Plot the data and the eigenvectors (scaled with \\(\\Lambda^{\\frac{1}{2}}\\)) at each step. Hint: Use geom_segment for the eigenvectors. 
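A brief added reminder before the solution code (an aside, with a self-contained check): if \\(z \\sim \\text{N}(0, I)\\), then \\(U \\Lambda^{1/2} z\\) has covariance \\(U \\Lambda^{1/2} \\Lambda^{1/2} U^T = U \\Lambda U^T = \\Sigma\\), which is why the scaling and rotation steps below reproduce the target covariance:
Sigma <- matrix(c(0.3, -0.5, -0.5, 1.6), nrow = 2, byrow = TRUE)
ed <- eigen(Sigma)
ed$vectors %*% diag(ed$values) %*% t(ed$vectors) # recovers Sigma up to floating point error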
# a set.seed(1) sigma <- matrix(data = c(0.3, -0.5, -0.5, 1.6), nrow = 2, byrow = TRUE) ed <- eigen(sigma) e_val <- ed$values e_vec <- ed$vectors # b set.seed(1) nsamps <- 1000 X <- matrix(data = rnorm(nsamps * 2), ncol = 2) vec1 <- matrix(c(1,0,0,1), nrow = 2) X2 <- t(sqrt(diag(e_val)) %*% t(X)) vec2 <- sqrt(diag(e_val)) %*% vec1 X3 <- t(e_vec %*% t(X2)) vec3 <- e_vec %*% vec2 X4 <- t(c(-2, 1) + t(X3)) vec4 <- c(-2, 1) + vec3 vec_mat <- data.frame(matrix(c(0,0,0,0,0,0,0,0,0,0,0,0,-2,1,-2,1), ncol = 2, byrow = TRUE), t(cbind(vec1, vec2, vec3, vec4)), c(1,1,2,2,3,3,4,4)) df <- data.frame(rbind(X, X2, X3, X4), c(rep(1, nsamps), rep(2, nsamps), rep(3, nsamps), rep(4, nsamps))) colnames(df) <- c("D1", "D2", "wh") colnames(vec_mat) <- c("D1", "D2", "E1", "E2", "wh") ggplot(data = df, aes(x = D1, y = D2)) + geom_point() + geom_segment(data = vec_mat, aes(xend = E1, yend = E2), color = "red") + facet_wrap(~ wh) + coord_fixed() Exercise 8.5 (Marginal and conditional distributions) Let \\(X \\sim \\text{N}(\\mu, \\Sigma)\\), where \\(\\mu = [2, 0, -1]^T\\) and \\(\\Sigma = \\begin{bmatrix} 1 & -0.2 & 0.5 \\\\ -0.2 & 1.4 & -1.2 \\\\ 0.5 & -1.2 & 2 \\\\ \\end{bmatrix}\\). Let \\(A\\) represent the first two random variables and \\(B\\) the third random variable. R: For the calculation in the following points, you can use R. Find the marginal distribution of \\(B\\). Find the conditional distribution of \\(B | A = [a_1, a_2]^T\\). Find the marginal distribution of \\(A\\). Find the conditional distribution of \\(A | B = b\\). R: Visually compare the distributions of a) and b), and c) and d) at three different conditional values. mu <- c(2, 0, -1) Sigma <- matrix(c(1, -0.2, 0.5, -0.2, 1.4, -1.2, 0.5, -1.2, 2), nrow = 3, byrow = TRUE) mu_A <- c(2, 0) mu_B <- -1 Sigma_A <- Sigma[1:2, 1:2] Sigma_B <- Sigma[3, 3] Sigma_AB <- Sigma[1:2, 3] # b tmp_b <- t(Sigma_AB) %*% solve(Sigma_A) mu_b <- mu_B - tmp_b %*% mu_A Sigma_b <- Sigma_B - t(Sigma_AB) %*% solve(Sigma_A) %*% Sigma_AB mu_b ## [,1] ## [1,] -1.676471 tmp_b ## [,1] [,2] ## [1,] 0.3382353 -0.8088235 Sigma_b ## [,1] ## [1,] 0.8602941 # d tmp_a <- Sigma_AB * (1 / Sigma_B) mu_a <- mu_A - tmp_a * mu_B Sigma_d <- Sigma_A - (Sigma_AB * (1 / Sigma_B)) %*% t(Sigma_AB) mu_a ## [1] 2.25 -0.60 tmp_a ## [1] 0.25 -0.60 Sigma_d ## [,1] [,2] ## [1,] 0.875 0.10 ## [2,] 0.100 0.68 Solution. \\(B \\sim \\text{N}(-1, 2)\\). \\(B | A = a \\sim \\text{N}(-1.68 + [0.34, -0.81] a, 0.86)\\). \\(\\mu_A = [2, 0]^T\\) and \\(\\Sigma_A = \\begin{bmatrix} 1 & -0.2 & \\\\ -0.2 & 1.4 \\\\ \\end{bmatrix}\\). 
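As a small added simulation aside (not in the original solution), the marginal parameters of \\(A\\) can be confirmed by sampling from the full three-dimensional distribution and keeping only the first two components:
library(mvtnorm)
set.seed(2)
S <- rmvnorm(100000, mean = c(2, 0, -1), sigma = matrix(c(1, -0.2, 0.5, -0.2, 1.4, -1.2, 0.5, -1.2, 2), nrow = 3, byrow = TRUE))
round(colMeans(S[ ,1:2]), 2) # approximately (2, 0)
round(cov(S[ ,1:2]), 2) # approximately Sigma_A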
\\[\\begin{align} A | B = b &\\sim \\text{N}(\\mu_t, \\Sigma_t), \\\\ \\mu_t &= [2.25, -0.6]^T + [0.25, -0.6]^T b, \\\\ \\Sigma_t &= \\begin{bmatrix} 0.875 & 0.1 \\\\ 0.1 & 0.68 \\\\ \\end{bmatrix} \\end{align}\\] library(mvtnorm) set.seed(1) nsamps <- 1000 # a and b samps <- as.data.frame(matrix(data = NA, nrow = 4 * nsamps, ncol = 2)) samps[1:nsamps,1] <- rnorm(nsamps, mu_B, Sigma_B) samps[1:nsamps,2] <- "marginal" for (i in 1:3) { a <- rmvnorm(1, mu_A, Sigma_A) samps[(i*nsamps + 1):((i + 1) * nsamps), 1] <- rnorm(nsamps, mu_b + tmp_b %*% t(a), Sigma_b) samps[(i*nsamps + 1):((i + 1) * nsamps), 2] <- paste0(# "cond", round(a, digits = 2), collapse = "-") } colnames(samps) <- c("x", "dist") ggplot(samps, aes(x = x)) + geom_density() + facet_wrap(~ dist) # c and d samps <- as.data.frame(matrix(data = NA, nrow = 4 * nsamps, ncol = 3)) samps[1:nsamps,1:2] <- rmvnorm(nsamps, mu_A, Sigma_A) samps[1:nsamps,3] <- "marginal" for (i in 1:3) { b <- rnorm(1, mu_B, Sigma_B) samps[(i*nsamps + 1):((i + 1) * nsamps), 1:2] <- rmvnorm(nsamps, mu_a + tmp_a * b, Sigma_d) samps[(i*nsamps + 1):((i + 1) * nsamps), 3] <- b } colnames(samps) <- c("x", "y", "dist") ggplot(samps, aes(x = x, y = y)) + geom_point() + geom_smooth(method = "lm") + facet_wrap(~ dist) 8.3 Transformations Exercise 8.6 Let \\((U,V)\\) be a random variable with PDF \\(p(u,v) = \\frac{1}{4 \\sqrt{u}}\\), \\(U \\in [0,4]\\) and \\(V \\in [\\sqrt{U}, \\sqrt{U} + 1]\\). Let \\(X = \\sqrt{U}\\) and \\(Y = V - \\sqrt{U}\\). Find PDF of \\((X,Y)\\). What can you tell about distributions of \\(X\\) and \\(Y\\)? This exercise shows how we can simplify a probabilistic problem with a clever use of transformations. R: Take 1000 samples from \\((X,Y)\\) and transform them with inverses of the above functions to get samples from \\((U,V)\\). Plot both sets of samples. Solution. First we need to find the inverse functions. Since \\(x = \\sqrt{u}\\) it follows that \\(u = x^2\\), and that \\(x \\in [0,2]\\). Similarly \\(v = y + x\\) and \\(y \\in [0,1]\\). Let us first find the Jacobian. \\[\\renewcommand\\arraystretch{1.6} J(x,y) = \\begin{bmatrix} \\frac{\\partial u}{\\partial x} & \\frac{\\partial v}{\\partial x} \\\\%[1ex] % <-- 1ex more space between rows of matrix \\frac{\\partial u}{\\partial y} & \\frac{\\partial v}{\\partial y} \\end{bmatrix} = \\begin{bmatrix} 2x & 1 \\\\%[1ex] % <-- 1ex more space between rows of matrix 0 & 1 \\end{bmatrix}, \\] and the determinant is \\(|J(x,y)| = 2x\\). Putting everything together, we get \\[\\begin{align} p_{X,Y}(x,y) = p_{U,V}(x^2, y + x) |J(x,y)| = \\frac{1}{4 \\sqrt{x^2}} 2x = \\frac{1}{2}. \\end{align}\\] This reminds us of the Uniform distribution. Indeed we can see that \\(p_X(x) = \\frac{1}{2}\\) and \\(p_Y(y) = 1\\). So instead of dealing with an awkward PDF of \\((U,V)\\) and the corresponding dynamic bounds, we are now looking at two independent Uniform random variables. In practice, this could make modeling much easier. set.seed(1) nsamps <- 2000 x <- runif(nsamps, min = 0, max = 2) y <- runif(nsamps) orig <- tibble(x = x, y = y, vrs = "original") u <- x^2 v <- y + x transf <- tibble(x = u, y = v, vrs = "transformed") df <- bind_rows(orig, transf) ggplot(df, aes(x = x, y = y, color = vrs)) + geom_point(alpha = 0.3) Exercise 8.7 R: Write a function that will calculate the probability density of an arbitraty multivariate normal distribution, based on independent standardized normal PDFs. Compare with dmvnorm from the mvtnorm package. 
library(mvtnorm) set.seed(1) mvn_dens <- function (y, mu, Sigma) { L <- chol(Sigma) L_inv <- solve(t(L)) g_inv <- L_inv %*% t(y - mu) J <- L_inv J_det <- det(J) return(prod(dnorm(g_inv)) * J_det) } mu_v <- c(-2, 0, 1) cov_m <- matrix(c(1, -0.2, 0.5, -0.2, 2, 0.3, 0.5, 0.3, 1.6), ncol = 3, byrow = TRUE) n_comp <- 20 for (i in 1:n_comp) { x <- rmvnorm(1, mean = mu_v, sigma = cov_m) print(paste0("My function: ", mvn_dens(x, mu_v, cov_m), ", dmvnorm: ", dmvnorm(x, mu_v, cov_m))) } ## [1] "My function: 0.0229514237156383, dmvnorm: 0.0229514237156383" ## [1] "My function: 0.00763138915406231, dmvnorm: 0.00763138915406231" ## [1] "My function: 0.0230688881105741, dmvnorm: 0.0230688881105741" ## [1] "My function: 0.0113616213114731, dmvnorm: 0.0113616213114731" ## [1] "My function: 0.00151808500121907, dmvnorm: 0.00151808500121907" ## [1] "My function: 0.0257658045974509, dmvnorm: 0.0257658045974509" ## [1] "My function: 0.0157963825730805, dmvnorm: 0.0157963825730805" ## [1] "My function: 0.00408856287529248, dmvnorm: 0.00408856287529248" ## [1] "My function: 0.0327793540101256, dmvnorm: 0.0327793540101256" ## [1] "My function: 0.0111606542967978, dmvnorm: 0.0111606542967978" ## [1] "My function: 0.0147636757585684, dmvnorm: 0.0147636757585684" ## [1] "My function: 0.0142948300412207, dmvnorm: 0.0142948300412207" ## [1] "My function: 0.0203093820657542, dmvnorm: 0.0203093820657542" ## [1] "My function: 0.0287533273357481, dmvnorm: 0.0287533273357481" ## [1] "My function: 0.0213402305128623, dmvnorm: 0.0213402305128623" ## [1] "My function: 0.0218356957993885, dmvnorm: 0.0218356957993885" ## [1] "My function: 0.0250750113961771, dmvnorm: 0.0250750113961771" ## [1] "My function: 0.0166498666348048, dmvnorm: 0.0166498666348048" ## [1] "My function: 0.00189725106874659, dmvnorm: 0.00189725106874659" ## [1] "My function: 0.0196697814975113, dmvnorm: 0.0196697814975113" "],["ard.html", "Chapter 9 Alternative representation of distributions 9.1 Probability generating functions (PGFs) 9.2 Moment generating functions (MGFs)", " Chapter 9 Alternative representation of distributions This chapter deals with alternative representation of distributions. The students are expected to acquire the following knowledge: Theoretical Probability generating functions. Moment generating functions. .fold-btn { float: right; margin: 5px 5px 0 0; } .fold { border: 1px solid black; min-height: 40px; } 9.1 Probability generating functions (PGFs) Exercise 9.1 Show that the sum of independent Poisson random variables is itself a Poisson random variable. R: Let \\(X\\) be a sum of three Poisson distributions with \\(\\lambda_i \\in \\{2, 5.2, 10\\}\\). Take 1000 samples and plot the three distributions and the sum. Then take 1000 samples from the theoretical distribution of \\(X\\) and compare them to the sum. Solution. Let \\(X_i \\sim \\text{Poisson}(\\lambda_i)\\) for \\(i = 1,...,n\\), and let \\(X = \\sum_{i=1}^n X_i\\). 
\\[\\begin{align} \\alpha_X(t) &= \\prod_{i=1}^n \\alpha_{X_i}(t) \\\\ &= \\prod_{i=1}^n \\bigg( \\sum_{j=0}^\\infty t^j \\frac{\\lambda_i^j e^{-\\lambda_i}}{j!} \\bigg) \\\\ &= \\prod_{i=1}^n \\bigg( e^{-\\lambda_i} \\sum_{j=0}^\\infty \\frac{(t\\lambda_i)^j }{j!} \\bigg) \\\\ &= \\prod_{i=1}^n \\bigg( e^{-\\lambda_i} e^{t \\lambda_i} \\bigg) & \\text{power series} \\\\ &= \\prod_{i=1}^n \\bigg( e^{\\lambda_i(t - 1)} \\bigg) \\\\ &= e^{\\sum_{i=1}^n \\lambda_i(t - 1)} \\\\ &= e^{t \\sum_{i=1}^n \\lambda_i - \\sum_{i=1}^n \\lambda_i} \\\\ &= e^{-\\sum_{i=1}^n \\lambda_i} \\sum_{j=0}^\\infty \\frac{(t \\sum_{i=1}^n \\lambda_i)^j}{j!}\\\\ &= \\sum_{j=0}^\\infty \\frac{e^{-\\sum_{i=1}^n \\lambda_i} (t \\sum_{i=1}^n \\lambda_i)^j}{j!}\\\\ \\end{align}\\] The last term is the PGF of a Poisson random variable with parameter \\(\\sum_{i=1}^n \\lambda_i\\). Because the PGF is unique, \\(X\\) is a Poisson random variable. set.seed(1) library(tidyr) nsamps <- 1000 samps <- matrix(data = NA, nrow = nsamps, ncol = 4) samps[ ,1] <- rpois(nsamps, 2) samps[ ,2] <- rpois(nsamps, 5.2) samps[ ,3] <- rpois(nsamps, 10) samps[ ,4] <- samps[ ,1] + samps[ ,2] + samps[ ,3] colnames(samps) <- c(2, 2.5, 10, "sum") gsamps <- as_tibble(samps) gsamps <- gather(gsamps, key = "dist", value = "value") ggplot(gsamps, aes(x = value)) + geom_bar() + facet_wrap(~ dist) samps <- cbind(samps, "theoretical" = rpois(nsamps, 2 + 5.2 + 10)) gsamps <- as_tibble(samps[ ,4:5]) gsamps <- gather(gsamps, key = "dist", value = "value") ggplot(gsamps, aes(x = value, fill = dist)) + geom_bar(position = "dodge") Exercise 9.2 Find the expected value and variance of the negative binomial distribution. Hint: Find the Taylor series of \\((1 - y)^{-r}\\) at point 0. Solution. Let \\(X \\sim \\text{NB}(r, p)\\). \\[\\begin{align} \\alpha_X(t) &= E[t^X] \\\\ &= \\sum_{j=0}^\\infty t^j \\binom{j + r - 1}{j} (1 - p)^r p^j \\\\ &= (1 - p)^r \\sum_{j=0}^\\infty \\binom{j + r - 1}{j} (tp)^j \\\\ &= (1 - p)^r \\sum_{j=0}^\\infty \\frac{(j + r - 1)(j + r - 2)...r}{j!} (tp)^j. \\\\ \\end{align}\\] Let us look at the Taylor series of \\((1 - y)^{-r}\\) at 0 \\[\\begin{align} (1 - y)^{-r} = &1 + \\frac{-r(-1)}{1!}y + \\frac{-r(-r - 1)(-1)^2}{2!}y^2 + \\\\ &\\frac{-r(-r - 1)(-r - 2)(-1)^3}{3!}y^3 + ... \\\\ \\end{align}\\] How does the \\(k\\)-th term look like? We have \\(k\\) derivatives of our function so \\[\\begin{align} \\frac{d^k}{d^k y} (1 - y)^{-r} &= \\frac{-r(-r - 1)...(-r - k + 1)(-1)^k}{k!}y^k \\\\ &= \\frac{r(r + 1)...(r + k - 1)}{k!}y^k. \\end{align}\\] We observe that this equals to the \\(j\\)-th term in the sum of NB PGF. Therefore \\[\\begin{align} \\alpha_X(t) &= (1 - p)^r (1 - tp)^{-r} \\\\ &= \\Big(\\frac{1 - p}{1 - tp}\\Big)^r \\end{align}\\] To find the expected value, we need to differentiate \\[\\begin{align} \\frac{d}{dt} \\Big(\\frac{1 - p}{1 - tp}\\Big)^r &= r \\Big(\\frac{1 - p}{1 - tp}\\Big)^{r-1} \\frac{d}{dt} \\frac{1 - p}{1 - tp} \\\\ &= r \\Big(\\frac{1 - p}{1 - tp}\\Big)^{r-1} \\frac{p(1 - p)}{(1 - tp)^2}. \\\\ \\end{align}\\] Evaluating this at 1, we get: \\[\\begin{align} E[X] = \\frac{rp}{1 - p}. \\end{align}\\] For the variance we need the second derivative. 
\\[\\begin{align} \\frac{d^2}{d^2t} \\Big(\\frac{1 - p}{1 - tp}\\Big)^r &= \\frac{p^2 r (r + 1) (\\frac{1 - p}{1 - tp})^r}{(tp - 1)^2} \\end{align}\\] Evaluating this at 1 and inserting the first derivatives, we get: \\[\\begin{align} Var[X] &= \\frac{d^2}{dt^2} \\alpha_X(1) + \\frac{d}{dt}\\alpha_X(1) - \\Big(\\frac{d}{dt}\\alpha_X(t) \\Big)^2 \\\\ &= \\frac{p^2 r (r + 1)}{(1 - p)^2} + \\frac{rp}{1 - p} - \\frac{r^2p^2}{(1 - p)^2} \\\\ &= \\frac{rp}{(1 - p)^2}. \\end{align}\\] library(tidyr) set.seed(1) nsamps <- 100000 find_p <- function (mu, r) { return (10 / (r + 10)) } r <- c(1,2,10,20) p <- find_p(10, r) sigma <- rep(sqrt(p*r / (1 - p)^2), each = nsamps) samps <- cbind("r=1" = rnbinom(nsamps, size = r[1], prob = 1 - p[1]), "r=2" = rnbinom(nsamps, size = r[2], prob = 1 - p[2]), "r=4" = rnbinom(nsamps, size = r[3], prob = 1 - p[3]), "r=20" = rnbinom(nsamps, size = r[4], prob = 1 - p[4])) gsamps <- gather(as.data.frame(samps)) iw <- (gsamps$value > sigma + 10) | (gsamps$value < sigma - 10) ggplot(gsamps, aes(x = value, fill = iw)) + geom_bar() + # geom_density() + facet_wrap(~ key) 9.2 Moment generating functions (MGFs) Exercise 9.3 Find the variance of the geometric distribution. Solution. Let \\(X \\sim \\text{Geometric}(p)\\). The MGF of the geometric distribution is \\[\\begin{align} M_X(t) &= E[e^{tX}] \\\\ &= \\sum_{k=0}^\\infty p(1 - p)^k e^{tk} \\\\ &= p \\sum_{k=0}^\\infty ((1 - p)e^t)^k. \\end{align}\\] Let us assume that \\((1 - p)e^t < 1\\). Then, by using the geometric series we get \\[\\begin{align} M_X(t) &= \\frac{p}{1 - e^t + pe^t}. \\end{align}\\] The first derivative of the above expression is \\[\\begin{align} \\frac{d}{dt}M_X(t) &= \\frac{-p(-e^t + pe^t)}{(1 - e^t + pe^t)^2}, \\end{align}\\] and evaluating at \\(t = 0\\), we get \\(\\frac{1 - p}{p}\\), which we already recognize as the expected value of the geometric distribution. The second derivative is \\[\\begin{align} \\frac{d^2}{dt^2}M_X(t) &= \\frac{(p-1)pe^t((p-1)e^t - 1)}{((p - 1)e^t + 1)^3}, \\end{align}\\] and evaluating at \\(t = 0\\), we get \\(\\frac{(p - 1)(p - 2)}{p^2}\\). Combining we get the variance \\[\\begin{align} Var(X) &= \\frac{(p - 1)(p - 2)}{p^2} - \\frac{(1 - p)^2}{p^2} \\\\ &= \\frac{(p-1)(p-2) - (1-p)^2}{p^2} \\\\ &= \\frac{1 - p}{p^2}. \\end{align}\\] Exercise 9.4 Find the distribution of sum of two normal random variables \\(X\\) and \\(Y\\), by comparing \\(M_{X+Y}(t)\\) to \\(M_X(t)\\). R: To illustrate the result draw random samples from N\\((-3, 1)\\) and N\\((5, 1.2)\\) and calculate the empirical mean and variance of \\(X+Y\\). Plot all three histograms in one plot. Solution. Let \\(X \\sim \\text{N}(\\mu_X, 1)\\) and \\(Y \\sim \\text{N}(\\mu_Y, 1)\\). The MGF of the sum is \\[\\begin{align} M_{X+Y}(t) &= M_X(t) M_Y(t). \\end{align}\\] Let us calculate \\(M_X(t)\\), the MGF for \\(Y\\) then follows analogously. 
\\[\\begin{align} M_X(t) &= \\int_{-\\infty}^\\infty e^{tx} \\frac{1}{\\sqrt{2 \\pi \\sigma_X^2}} e^{-\\frac{(x - \\mu_X)^2}{2\\sigma_X^2}} dx \\\\ &= \\int_{-\\infty}^\\infty \\frac{1}{\\sqrt{2 \\pi \\sigma_X^2}} e^{-\\frac{(x - \\mu_X)^2 - 2\\sigma_X^2 tx}{2\\sigma_X^2}} dx \\\\ &= \\int_{-\\infty}^\\infty \\frac{1}{\\sqrt{2 \\pi \\sigma_X^2}} e^{-\\frac{x^2 - 2\\mu_X x + \\mu_X^2 - 2\\sigma_X^2 tx}{2\\sigma_X^2}} dx \\\\ &= \\int_{-\\infty}^\\infty \\frac{1}{\\sqrt{2 \\pi \\sigma_X^2}} e^{-\\frac{(x - (\\mu_X + \\sigma_X^2 t))^2 + \\mu_X^2 - (\\mu_X + \\sigma_X^2 t)^2}{2\\sigma_X^2}} dx & \\text{complete the square}\\\\ &= e^{-\\frac{\\mu_X^2 - (\\mu_X + \\sigma_X^2 t)^2}{2\\sigma_X^2}} \\int_{-\\infty}^\\infty \\frac{1}{\\sqrt{2 \\pi \\sigma_X^2}} e^{-\\frac{(x - (\\mu_X + \\sigma_X^2 t))^2}{2\\sigma_X^2}} dx & \\\\ &= e^{-\\frac{\\mu_X^2 - (\\mu_X + \\sigma_X^2 t)^2}{2\\sigma_X^2}} & \\text{normal PDF} \\\\ &= e^{-\\frac{\\mu_X^2 - \\mu_X^2 - 2\\mu_X \\sigma_X^2 t - \\sigma_X^4 t^2}{2\\sigma_X^2}} \\\\ &= e^{\\mu_X t + \\frac{\\sigma_X^2 t^2}{2}}. \\\\ \\end{align}\\] The MGF of the sum is then \\[\\begin{align} M_{X+Y}(t) &= e^{\\mu_X t + 0.5\\sigma_X^2 t^2} e^{\\mu_Y t + 0.5\\sigma_Y^2 t^2} \\\\ &= e^{t(\\mu_X + \\mu_Y) + 0.5 t^2(\\sigma_X^2 + \\sigma_Y^2)}. \\end{align}\\] By comparing \\(M_{X+Y}(t)\\) and \\(M_X(t)\\) we observe that both have the same form: the exponent is \\(t\\) multiplied by the mean plus \\(\\frac{t^2}{2}\\) multiplied by the variance. Since MGFs are unique, we conclude that \\(Z = X + Y \\sim \\text{N}(\\mu_X + \\mu_Y, \\sigma_X^2 + \\sigma_Y^2)\\). library(tidyr) library(ggplot2) set.seed(1) nsamps <- 1000 x <- rnorm(nsamps, -3, 1) y <- rnorm(nsamps, 5, 1.2) z <- x + y mean(z) ## [1] 1.968838 var(z) ## [1] 2.645034 df <- data.frame(x = x, y = y, z = z) %>% gather() ggplot(df, aes(x = value, fill = key)) + geom_histogram(position = "dodge") "],["ci.html", "Chapter 10 Concentration inequalities 10.1 Comparison 10.2 Practical", " Chapter 10 Concentration inequalities This chapter deals with concentration inequalities. The students are expected to acquire the following knowledge: Theoretical More assumptions produce closer bounds. R Optimization. Estimating probability inequalities. .fold-btn { float: right; margin: 5px 5px 0 0; } .fold { border: 1px solid black; min-height: 40px; } 10.1 Comparison Exercise 10.1 R: Let \\(X\\) be a geometric random variable with \\(p = 0.7\\). Visually compare the Markov bound, Chernoff bound, and the theoretical probabilities for \\(x = 1,...,12\\). To get the best fitting Chernoff bound, you will need to optimize the bound depending on \\(t\\). Use either analytical or numerical optimization. 
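One possible analytical route (an added aside; the solution below optimizes numerically): writing the Chernoff bound as \\(M_X(t)e^{-at}\\) with \\(M_X(t) = \\frac{p}{1 - (1-p)e^t}\\), setting the derivative of its logarithm to zero gives the optimum \\(t^* = \\ln \\frac{a}{(1-p)(1+a)}\\) (valid when \\(a > \\frac{1-p}{p}\\)), which can be compared against the numerically optimized values printed below:
p <- 0.7
a <- 1:12
t_star <- log(a / ((1 - p) * (1 + a))) # analytical optimum of the Chernoff bound
round(t_star, 4) # should match the optimize() output in the solution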
bound_chernoff <- function (t, p, a) { return ((p / (1 - exp(t) + p * exp(t))) / exp(a * t)) } set.seed(1) p <- 0.7 a <- seq(1, 12, by = 1) ci_markov <- (1 - p) / p / a t <- vector(mode = "numeric", length = length(a)) for (i in 1:length(t)) { t[i] <- optimize(bound_chernoff, interval = c(0, log(1 / (1 - p))), p = p, a = a[i])$minimum } t ## [1] 0.5108267 0.7984981 0.9162927 0.9808238 1.0216635 1.0498233 1.0704327 ## [8] 1.0861944 1.0986159 1.1086800 1.1169653 1.1239426 ci_chernoff <- (p / (1 - exp(t) + p * exp(t))) / exp(a * t) actual <- 1 - pgeom(a, 0.7) plot_df <- rbind( data.frame(x = a, y = ci_markov, type = "Markov"), data.frame(x = a, y = ci_chernoff, type = "Chernoff"), data.frame(x = a, y = actual, type = "Actual") ) ggplot(plot_df, aes(x = x, y = y, color = type)) + geom_line() Exercise 10.2 R: Let \\(X\\) be a sum of 100 Beta distributions with random parameters. Take 1000 samples and plot the Chebyshev bound, Hoeffding bound, and the empirical probabilities. set.seed(1) nvars <- 100 nsamps <- 1000 samps <- matrix(data = NA, nrow = nsamps, ncol = nvars) Sn_mean <- 0 Sn_var <- 0 for (i in 1:nvars) { alpha1 <- rgamma(1, 10, 1) beta1 <- rgamma(1, 10, 1) X <- rbeta(nsamps, alpha1, beta1) Sn_mean <- Sn_mean + alpha1 / (alpha1 + beta1) Sn_var <- Sn_var + alpha1 * beta1 / ((alpha1 + beta1)^2 * (alpha1 + beta1 + 1)) samps[ ,i] <- X } mean(apply(samps, 1, sum)) ## [1] 51.12511 Sn_mean ## [1] 51.15723 var(apply(samps, 1, sum)) ## [1] 1.170652 Sn_var ## [1] 1.166183 a <- 1:30 b <- a / sqrt(Sn_var) ci_chebyshev <- 1 / b^2 ci_hoeffding <- 2 * exp(- 2 * a^2 / nvars) empirical <- NULL for (i in 1:length(a)) { empirical[i] <- sum(abs((apply(samps, 1, sum)) - Sn_mean) >= a[i])/ nsamps } plot_df <- rbind( data.frame(x = a, y = ci_chebyshev, type = "Chebyshev"), data.frame(x = a, y = ci_hoeffding, type = "Hoeffding"), data.frame(x = a, y = empirical, type = "Empirical") ) ggplot(plot_df, aes(x = x, y = y, color = type)) + geom_line() ggplot(plot_df, aes(x = x, y = y, color = type)) + geom_line() + coord_cartesian(xlim = c(15, 25), ylim = c(0, 0.05)) 10.2 Practical Exercise 10.3 From Jagannathan. Let \\(X_i\\), \\(i = 1,...n\\), be a random sample of size \\(n\\) of a random variable \\(X\\). Let \\(X\\) have mean \\(\\mu\\) and variance \\(\\sigma^2\\). Find the size of the sample \\(n\\) required so that the probability that the difference between sample mean and true mean is smaller than \\(\\frac{\\sigma}{10}\\) is at least 0.95. Hint: Derive a version of the Chebyshev inequality for \\(P(|X - \\mu| \\geq a)\\) using Markov inequality. Solution. Let \\(\\bar{X} = \\sum_{i=1}^n X_i\\). Then \\(E[\\bar{X}] = \\mu\\) and \\(Var[\\bar{X}] = \\frac{\\sigma^2}{n}\\). Let us first derive another representation of Chebyshev inequality. \\[\\begin{align} P(|X - \\mu| \\geq a) = P(|X - \\mu|^2 \\geq a^2) \\leq \\frac{E[|X - \\mu|^2]}{a^2} = \\frac{Var[X]}{a^2}. \\end{align}\\] Let us use that on our sampling distribution: \\[\\begin{align} P(|\\bar{X} - \\mu| \\geq \\frac{\\sigma}{10}) \\leq \\frac{100 Var[\\bar{X}]}{\\sigma^2} = \\frac{100 Var[X]}{n \\sigma^2} = \\frac{100}{n}. \\end{align}\\] We are interested in the difference being smaller, therefore \\[\\begin{align} P(|\\bar{X} - \\mu| < \\frac{\\sigma}{10}) = 1 - P(|\\bar{X} - \\mu| \\geq \\frac{\\sigma}{10}) \\geq 1 - \\frac{100}{n} \\geq 0.95. \\end{align}\\] It follows that we need a sample size of \\(n \\geq \\frac{100}{0.05} = 2000\\). 
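A short added simulation sketch (using a standard normal purely as an illustrative example) shows how conservative this Chebyshev-based answer is; with \\(n = 2000\\) the actual probability is far above 0.95:
set.seed(1)
n <- 2000
hits <- replicate(10000, abs(mean(rnorm(n))) < 1 / 10) # sigma = 1, so sigma/10 = 0.1
mean(hits) # close to 1, i.e. well above the guaranteed 0.95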
"],["crv.html", "Chapter 11 Convergence of random variables", " Chapter 11 Convergence of random variables This chapter deals with convergence of random variables. The students are expected to acquire the following knowledge: Theoretical Finding convergences of random variables. .fold-btn { float: right; margin: 5px 5px 0 0; } .fold { border: 1px solid black; min-height: 40px; } Exercise 11.1 Let \\(X_1\\), \\(X_2\\),…, \\(X_n\\) be a sequence of Bernoulli random variables. Let \\(Y_n = \\frac{X_1 + X_2 + ... + X_n}{n^2}\\). Show that this sequence converges point-wise to the zero random variable. R: Use a simulation to check your answer. Solution. Let \\(\\epsilon\\) be arbitrary. We need to find such \\(n_0\\), that for every \\(n\\) greater than \\(n_0\\) \\(|Y_n| < \\epsilon\\) holds. \\[\\begin{align} |Y_n| &= |\\frac{X_1 + X_2 + ... + X_n}{n^2}| \\\\ &\\leq |\\frac{n}{n^2}| \\\\ &= \\frac{1}{n}. \\end{align}\\] So we need to find such \\(n_0\\), that for every \\(n > n_0\\) we will have \\(\\frac{1}{n} < \\epsilon\\). So \\(n_0 > \\frac{1}{\\epsilon}\\). x <- 1:1000 X <- matrix(data = NA, nrow = length(x), ncol = 100) y <- vector(mode = "numeric", length = length(x)) for (i in 1:length(x)) { X[i, ] <- rbinom(100, size = 1, prob = 0.5) } X <- apply(X, 2, cumsum) tmp_mat <- matrix(data = (1:1000)^2, nrow = 1000, ncol = 100) X <- X / tmp_mat y <- apply(X, 1, mean) ggplot(data.frame(x = x, y = y), aes(x = x, y = y)) + geom_line() Exercise 11.2 Let \\(\\Omega = [0,1]\\) and let \\(X_n\\) be a sequence of random variables, defined as \\[\\begin{align} X_n(\\omega) = \\begin{cases} \\omega^3, &\\omega = \\frac{i}{n}, &0 \\leq i \\leq 1 \\\\ 1, & \\text{otherwise.} \\end{cases} \\end{align}\\] Show that \\(X_n\\) converges almost surely to \\(X \\sim \\text{Uniform}(0,1)\\). Solution. We need to show \\(P(\\{\\omega: X_n(\\omega) \\rightarrow X(\\omega)\\}) = 1\\). Let \\(\\omega \\neq \\frac{i}{n}\\). Then for any \\(\\omega\\), \\(X_n\\) converges pointwise to \\(X\\): \\[\\begin{align} X_n(\\omega) = 1 \\implies |X_n(\\omega) - X(s)| = |1 - 1| < \\epsilon. \\end{align}\\] The above is independent of \\(n\\). Since there are countably infinite number of elements in the complement (\\(\\frac{i}{n}\\)), the probability of this set is 1. Exercise 11.3 Borrowed from Wasserman. Let \\(X_n \\sim \\text{N}(0, \\frac{1}{n})\\) and let \\(X\\) be a random variable with CDF \\[\\begin{align} F_X(x) = \\begin{cases} 0, &x < 0 \\\\ 1, &x \\geq 0. \\end{cases} \\end{align}\\] Does \\(X_n\\) converge to \\(X\\) in distribution? How about in probability? Prove or disprove these statement. R: Plot the CDF of \\(X_n\\) for \\(n = 1, 2, 5, 10, 100, 1000\\). Solution. Let us first check convergence in distribution. \\[\\begin{align} \\lim_{n \\rightarrow \\infty} F_{X_n}(x) &= \\lim_{n \\rightarrow \\infty} \\phi (\\sqrt(n) x). \\end{align}\\] We have two cases, for \\(x < 0\\) and \\(x > 0\\). We do not need to check for \\(x = 0\\), since \\(F_X\\) is not continuous in that point. \\[\\begin{align} \\lim_{n \\rightarrow \\infty} \\phi (\\sqrt(n) x) = \\begin{cases} 0, & x < 0 \\\\ 1, & x > 0. \\end{cases} \\end{align}\\] This is the same as \\(F_X\\). Let us now check convergence in probability. 
Since \\(X\\) is a point-mass distribution at zero, we have \\[\\begin{align} \\lim_{n \\rightarrow \\infty} P(|X_n| > \\epsilon) &= \\lim_{n \\rightarrow \\infty} (P(X_n > \\epsilon) + P(X_n < -\\epsilon)) \\\\ &= \\lim_{n \\rightarrow \\infty} (1 - P(X_n < \\epsilon) + P(X_n < -\\epsilon)) \\\\ &= \\lim_{n \\rightarrow \\infty} (1 - \\phi(\\sqrt{n} \\epsilon) + \\phi(- \\sqrt{n} \\epsilon)) \\\\ &= 0. \\end{align}\\] n <- c(1,2,5,10,100,1000) ggplot(data = data.frame(x = seq(-5, 5, by = 0.01)), aes(x = x)) + stat_function(fun = pnorm, args = list(mean = 0, sd = 1/1), aes(color = "sd = 1/1")) + stat_function(fun = pnorm, args = list(mean = 0, sd = 1/2), aes(color = "sd = 1/2")) + stat_function(fun = pnorm, args = list(mean = 0, sd = 1/5), aes(color = "sd = 1/5")) + stat_function(fun = pnorm, args = list(mean = 0, sd = 1/10), aes(color = "sd = 1/10")) + stat_function(fun = pnorm, args = list(mean = 0, sd = 1/100), aes(color = "sd = 1/100")) + stat_function(fun = pnorm, args = list(mean = 0, sd = 1/1000), aes(color = "sd = 1/1000")) + stat_function(fun = pnorm, args = list(mean = 0, sd = 1/10000), aes(color = "sd = 1/10000")) Exercise 11.4 Let \\(X_i\\) be i.i.d. and \\(\\mu = E(X_1)\\). Let variance of \\(X_1\\) be finite. Show that the mean of \\(X_i\\), \\(\\bar{X}_n = \\frac{1}{n}\\sum_{i=1}^n X_i\\) converges in quadratic mean to \\(\\mu\\). Solution. \\[\\begin{align} \\lim_{n \\rightarrow \\infty} E(|\\bar{X_n} - \\mu|^2) &= \\lim_{n \\rightarrow \\infty} E(\\bar{X_n}^2 - 2 \\bar{X_n} \\mu + \\mu^2) \\\\ &= \\lim_{n \\rightarrow \\infty} (E(\\bar{X_n}^2) - 2 \\mu E(\\frac{\\sum_{i=1}^n X_i}{n}) + \\mu^2) \\\\ &= \\lim_{n \\rightarrow \\infty} E(\\bar{X_n})^2 + \\lim_{n \\rightarrow \\infty} Var(\\bar{X_n}) - 2 \\mu^2 + \\mu^2 \\\\ &= \\lim_{n \\rightarrow \\infty} \\frac{n^2 \\mu^2}{n^2} + \\lim_{n \\rightarrow \\infty} \\frac{\\sigma^2}{n} - \\mu^2 \\\\ &= \\mu^2 - \\mu^2 + \\lim_{n \\rightarrow \\infty} \\frac{\\sigma^2}{n} \\\\ &= 0. \\end{align}\\] "],["lt.html", "Chapter 12 Limit theorems", " Chapter 12 Limit theorems This chapter deals with limit theorems. The students are expected to acquire the following knowledge: Theoretical Monte Carlo integration convergence. Difference between weak and strong law of large numbers. .fold-btn { float: right; margin: 5px 5px 0 0; } .fold { border: 1px solid black; min-height: 40px; } Exercise 12.1 Show that Monte Carlo integration converges almost surely to the true integral of a bounded function. Solution. Let \\(g\\) be a function defined on \\(\\Omega\\). Let \\(X_i\\), \\(i = 1,...,n\\) be i.i.d. (multivariate) uniform random variables with bounds defined on \\(\\Omega\\). Let \\(Y_i\\) = \\(g(X_i)\\). Then it follows that \\(Y_i\\) are also i.i.d. random variables and their expected value is \\(E[g(X)] = \\int_{\\Omega} g(x) f_X(x) dx = \\frac{1}{V_{\\Omega}} \\int_{\\Omega} g(x) dx\\). By the strong law of large numbers, we have \\[\\begin{equation} \\frac{1}{n}\\sum_{i=1}^n Y_i \\xrightarrow{\\text{a.s.}} E[g(X)]. \\end{equation}\\] It follows that \\[\\begin{equation} V_{\\Omega} \\frac{1}{n}\\sum_{i=1}^n Y_i \\xrightarrow{\\text{a.s.}} \\int_{\\Omega} g(x) dx. \\end{equation}\\] Exercise 12.2 Let \\(X\\) be a geometric random variable with probability 0.5 and support in positive integers. Let \\(Y = 2^X (-1)^X X^{-1}\\). Find the expected value of \\(Y\\) by using conditional convergence (this variable does not have an expected value in the conventional sense – the series is not absolutely convergent). 
R: Draw \\(10000\\) samples from a geometric distribution with probability 0.5 and support in positive integers to get \\(X\\). Then calculate \\(Y\\) and plot the means at each iteration (sample). Additionally, plot the expected value calculated in a. Try it with different seeds. What do you notice? Solution. \\[\\begin{align*} E[Y] &= \\sum_{x=1}^{\\infty} \\frac{2^x (-1)^x}{x} 0.5^x \\\\ &= \\sum_{x=1}^{\\infty} \\frac{(-1)^x}{x} \\\\ &= - \\sum_{x=1}^{\\infty} \\frac{(-1)^{x+1}}{x} \\\\ &= - \\ln(2) \\end{align*}\\] set.seed(3) x <- rgeom(100000, prob = 0.5) + 1 y <- 2^x * (-1)^x * x^{-1} y_means <- cumsum(y) / seq_along(y) df <- data.frame(x = 1:length(y_means), y = y_means) ggplot(data = df, aes(x = x, y = y)) + geom_line() + geom_hline(yintercept = -log(2)) "],["eb.html", "Chapter 13 Estimation basics 13.1 ECDF 13.2 Properties of estimators", " Chapter 13 Estimation basics This chapter deals with estimation basics. The students are expected to acquire the following knowledge: Biased and unbiased estimators. Consistent estimators. Empirical cumulative distribution function. .fold-btn { float: right; margin: 5px 5px 0 0; } .fold { border: 1px solid black; min-height: 40px; } 13.1 ECDF Exercise 13.1 (ECDF intuition) Take any univariate continuous distribution that is readily available in R and plot its CDF (\\(F\\)). Draw one sample (\\(n = 1\\)) from the chosen distribution and draw the ECDF (\\(F_n\\)) of that one sample. Use the definition of the ECDF, not an existing function in R. Implementation hint: ECDFs are always piecewise constant - they only jump at the sampled values and by \\(1/n\\). Repeat (b) for \\(n = 5, 10, 100, 1000...\\) Theory says that \\(F_n\\) should converge to \\(F\\). Can you observe that? For \\(n = 100\\) repeat the process \\(m = 20\\) times and plot every \\(F_n^{(m)}\\). Theory says that \\(F_n\\) will converge to \\(F\\) the slowest where \\(F\\) is close to 0.5 (where the variance is largest). Can you observe that? library(ggplot2) set.seed(1) ggplot(data = data.frame(x = seq(-5, 5, by = 0.01))) + # stat_function(aes(x = x), fun = pbeta, args = list(shape1 = 1, shape2 = 2)) stat_function(aes(x = x), fun = pnorm, args = list(mean = 0, sd = 1)) one_samp <- rnorm(1) X <- data.frame(x = c(-5, one_samp, 5), y = c(0,1,1)) ggplot(data = data.frame(x = seq(-5, 5, by = 0.01))) + # stat_function(aes(x = x), fun = pbeta, args = list(shape1 = 1, shape2 = 2)) stat_function(aes(x = x), fun = pnorm, args = list(mean = 0, sd = 1)) + geom_step(data = X, aes(x = x, y = y)) N <- c(5, 10, 100, 1000) X <- NULL for (n in N) { tmp <- rnorm(n) tmp_X <- data.frame(x = c(-5, sort(tmp), 5), y = c(0, seq(1/n, 1, by = 1/n), 1), n = n) X <- rbind(X, tmp_X) } ggplot(data = data.frame(x = seq(-5, 5, by = 0.01))) + # stat_function(aes(x = x), fun = pbeta, args = list(shape1 = 1, shape2 = 2)) stat_function(aes(x = x), fun = pnorm, args = list(mean = 0, sd = 1)) + geom_step(data = X, aes(x = x, y = y, color = as.factor(n))) + labs(color = "N") 13.2 Properties of estimators Exercise 13.2 Show that the sample average is, as an estimator of the mean: unbiased, consistent, asymptotically normal. Solution. \\[\\begin{align*} E[\\frac{1}{n} \\sum_{i=1}^n X_i] &= \\frac{1}{n} \\sum_{i=i}^n E[X_i] \\\\ &= E[X]. 
\\end{align*}\\] \\[\\begin{align*} \\lim_{n \\rightarrow \\infty} P(|\\frac{1}{n} \\sum_{i=1}^n X_i - E[X]| > \\epsilon) &= \\lim_{n \\rightarrow \\infty} P((\\frac{1}{n} \\sum_{i=1}^n X_i - E[X])^2 > \\epsilon^2) \\\\ & \\leq \\lim_{n \\rightarrow \\infty} \\frac{E[(\\frac{1}{n} \\sum_{i=1}^n X_i - E[X])^2]}{\\epsilon^2} & \\text{Markov inequality} \\\\ & = \\lim_{n \\rightarrow \\infty} \\frac{E[(\\frac{1}{n} \\sum_{i=1}^n X_i)^2 - 2 \\frac{1}{n} \\sum_{i=1}^n X_i E[X] + E[X]^2]}{\\epsilon^2} \\\\ & = \\lim_{n \\rightarrow \\infty} \\frac{E[(\\frac{1}{n} \\sum_{i=1}^n X_i)^2] - 2 E[X]^2 + E[X]^2}{\\epsilon^2} \\\\ &= 0 \\end{align*}\\] For the last equality see the solution to ??. Follows directly from the CLT. Exercise 13.3 (Consistent but biased estimator) Show that sample variance (the plug-in estimator of variance) is a biased estimator of variance. Show that sample variance is a consistent estimator of variance. Show that the estimator with (\\(N-1\\)) (Bessel correction) is unbiased. Solution. \\[\\begin{align*} E[\\frac{1}{n} \\sum_{i=1}^n (Y_i - \\bar{Y})^2] &= \\frac{1}{n} \\sum_{i=1}^n E[(Y_i - \\bar{Y})^2] \\\\ &= \\frac{1}{n} \\sum_{i=1}^n E[Y_i^2] - 2 E[Y_i \\bar{Y}] + \\bar{Y}^2)] \\\\ &= \\frac{1}{n} \\sum_{i=1}^n E[Y_i^2 - 2 Y_i \\bar{Y} + \\bar{Y}^2] \\\\ &= \\frac{1}{n} \\sum_{i=1}^n E[Y_i^2 - \\frac{2}{n} Y_i^2 - \\frac{2}{n} \\sum_{i \\neq j} Y_i Y_j + \\frac{1}{n^2}\\sum_j \\sum_{k \\neq j} Y_j Y_k + \\frac{1}{n^2} \\sum_j Y_j^2] \\\\ &= \\frac{1}{n} \\sum_{i=1}^n \\frac{n - 2}{n} (\\sigma^2 + \\mu^2) - \\frac{2}{n} (n - 1) \\mu^2 + \\frac{1}{n^2}n(n-1)\\mu^2 + \\frac{1}{n^2}n(\\sigma^2 + \\mu^2) \\\\ &= \\frac{n-1}{n}\\sigma^2 \\\\ < \\sigma^2. \\end{align*}\\] Let \\(S_n\\) denote the sample variance. Then we can write it as \\[\\begin{align*} S_n &= \\frac{1}{n} \\sum_{i=1}^n (X_i - \\bar{X})^2 = \\frac{1}{n} \\sum_{i=1}^n (X_i - \\mu)^2 + 2(X_i - \\mu)(\\mu - \\bar{X}) + (\\mu - \\bar{X})^2. \\end{align*}\\] Now \\(\\bar{X}\\) converges in probability (by WLLN) to \\(\\mu\\) therefore the right terms converge in probability to zero. The left term converges in probability to \\(\\sigma^2\\), also by WLLN. Therefore the sample variance is a consistent estimatior of the variance. The denominator changes in the second-to-last line of a., therefore the last line is now equality. Exercise 13.4 (Estimating the median) Show that the sample median is an unbiased estimator of the median for N\\((\\mu, \\sigma^2)\\). Show that the sample median is an unbiased estimator of the mean for any distribution with symmetric density. Hint 1: The pdf of an order statistic is \\(f_{X_{(k)}}(x) = \\frac{n!}{(n - k)!(k - 1)!}f_X(x)\\Big(F_X(x)^{k-1} (1 - F_X(x)^{n - k}) \\Big)\\). Hint 2: A distribution is symmetric when \\(X\\) and \\(2a - X\\) have the same distribution for some \\(a\\). Solution. Let \\(Z_i\\), \\(i = 1,...,n\\) be i.i.d. variables with a symmetric distribution and let \\(Z_{k:n}\\) denote the \\(k\\)-th order statistic. We will distinguish two cases, when \\(n\\) is odd and when \\(n\\) is even. Let first \\(n = 2m + 1\\) be odd. Then the sample median is \\(M = Z_{m+1:2m+1}\\). Its PDF is \\[\\begin{align*} f_M(x) = (m+1)\\binom{2m + 1}{m}f_Z(x)\\Big(F_Z(x)^m (1 - F_Z(x)^m) \\Big). \\end{align*}\\] For every symmetric distribution, it holds that \\(F_X(x) = 1 - F(2a - x)\\). Let \\(a = \\mu\\), the population mean. Plugging this into the PDF, we get that \\(f_M(x) = f_M(2\\mu -x)\\). 
It follows that \\[\\begin{align*} E[M] &= E[2\\mu - M] \\\\ 2E[M] &= 2\\mu \\\\ E[M] &= \\mu. \\end{align*}\\] Now let \\(n = 2m\\) be even. Then the sample median is \\(M = \\frac{Z_{m:2m} + Z_{m+1:2m}}{2}\\). It can be shown, that the joint PDF of these terms is also symmetric. Therefore, similar to the above \\[\\begin{align*} E[M] &= E[\\frac{Z_{m:2m} + Z_{m+1:2m}}{2}] \\\\ &= E[\\frac{2\\mu - M + 2\\mu - M}{2}] \\\\ &= E[2\\mu - M]. \\end{align*}\\] The above also proves point a. as the median and the mean are the same in normal distribution. Exercise 13.5 (Matrix trace estimation) The Hutchinson trace estimator [1] is an estimator of the trace of a symmetric positive semidefinite matrix A that relies on Monte Carlo sampling. The estimator is defined as \\[\\begin{align*} \\textrm{tr}(A) \\approx \\frac{1}{n} \\Sigma_{i=1}^n z_i^T A z_i, &\\\\ z_i \\sim_{\\mathrm{IID}} \\textrm{Uniform}(\\{-1, 1\\}^m), & \\end{align*}\\] where \\(A \\in \\mathbb{R}^{m \\times m}\\) is a symmetric positive semidefinite matrix. Elements of each vector \\(z_i\\) are either \\(-1\\) or \\(1\\) with equal probability. This is also called a Rademacher distribution. Data scientists often want the trace of a Hessian to obtain valuable curvature information for a loss function. Per [2], an example is classifying ten digits based on \\((28,28)\\) grayscale images (i.e. MNIST data) using logistic regression. The number of parameters is \\(m = 28^2 \\cdot 10 = 7840\\) and the size of the Hessian is \\(m^2\\), roughly \\(6 \\cdot 10^6\\). The diagonal average is equal to the average eigenvalue, which may be useful for optimization; in MCMC contexts, this would be useful for preconditioners and step size optimization. Computing Hessians (as a means of getting eigenvalue information) is often intractable, but Hessian-vector products can be computed faster by autodifferentiation (with e.g. Tensorflow, Pytorch, Jax). This is one motivation for the use of a stochastic trace estimator as outlined above. References: A stochastic estimator of the trace of the influence matrix for laplacian smoothing splines (Hutchinson, 1990) A Modern Analysis of Hutchinson’s Trace Estimator (Skorski, 2020) Prove that the Hutchinson trace estimator is an unbiased estimator of the trace. Solution. We first simplify our task: \\[\\begin{align} \\mathbb{E}\\left[\\frac{1}{n} \\Sigma_{i=1}^n z_i^T A z_i \\right] &= \\frac{1}{n} \\Sigma_{i=1}^n \\mathbb{E}\\left[z_i^T A z_i \\right] \\\\ &= \\mathbb{E}\\left[z_i^T A z_i \\right], \\end{align}\\] where the second equality is due to having \\(n\\) IID vectors \\(z_i\\). We now only need to show that \\(\\mathbb{E}\\left[z^T A z \\right] = \\mathrm{tr}(A)\\). We omit the index due to all vectors being IID: \\[\\begin{align} \\mathrm{tr}(A) &= \\mathrm{tr}(AI) \\\\ &= \\mathrm{tr}(A\\mathbb{E}[zz^T]) \\\\ &= \\mathbb{E}[\\mathrm{tr}(Azz^T)] \\\\ &= \\mathbb{E}[\\mathrm{tr}(z^TAz)] \\\\ &= \\mathbb{E}[z^TAz]. \\end{align}\\] This concludes the proof. We clarify some equalities below. The second equality assumes that \\(\\mathbb{E}[zz^T] = I\\). By noting that the mean of the Rademacher distribution is 0, we have \\[\\begin{align} \\mathrm{Cov}[z, z] &= \\mathbb{E}[(z - \\mathbb{E}[z])(z - \\mathbb{E}[z])^T] \\\\ &= \\mathbb{E}[zz^T]. \\end{align}\\] Dimensions of \\(z\\) are independent, so \\(\\mathrm{Cov}[z, z]_{ij} = 0\\) for \\(i \\neq j\\). 
The diagonal will contain variances, which are equal to \\(1\\) for all dimensions \\(k = 1 \\dots m\\): \\(\\mathrm{Var}[z^{(k)}] = \\mathbb{E}[z^{(k)}z^{(k)}] - \\mathbb{E}[z^{(k)}]^2 = 1 - 0 = 1\\). It follows that the covariance is an identity matrix. Note that this is a general result for vectors with IID dimensions sampled from a distribution with mean 0 and variance 1. We could therefore use something else instead of the Rademacher, e.g. \\(z ~ N(0, I)\\). The third equality uses the fact that the expectation of a trace equals the trace of an expectation. If \\(X\\) is a random matrix, then \\(\\mathbb{E}[X]_{ij} = \\mathbb{E}[X_{ij}]\\). Therefore: \\[\\begin{align} \\mathrm{tr}(\\mathbb{E}[X]) &= \\Sigma_{i=1}^m(\\mathbb{E}[X]_{ii}) \\\\ &= \\Sigma_{i=1}^m(\\mathbb{E}[X_{ii}]) \\\\ &= \\mathbb{E}[\\Sigma_{i=1}^m(X_{ii})] \\\\ &= \\mathbb{E}[\\mathrm{tr}(X)], \\end{align}\\] where we used the linearity of the expectation in the third step. The fourth equality uses the fact that \\(\\mathrm{tr}(AB) = \\mathrm{tr}(BA)\\) for any matrices \\(A \\in \\mathbb{R}^{n \\times m}, B \\in \\mathbb{R}^{m \\times n}\\). The last inequality uses the fact that the trace of a \\(1 \\times 1\\) matrix is just its element. "],["boot.html", "Chapter 14 Bootstrap", " Chapter 14 Bootstrap This chapter deals with bootstrap. The students are expected to acquire the following knowledge: How to use bootstrap to generate coverage intervals. .fold-btn { float: right; margin: 5px 5px 0 0; } .fold { border: 1px solid black; min-height: 40px; } Exercise 14.1 Ideally, a \\(1-\\alpha\\) CI would have \\(1-\\alpha\\) coverage. That is, say a 95% CI should, in the long run, contain the true value of the parameter 95% of the time. In practice, it is impossible to assess the coverage of our CI method, because we rarely know the true parameter. In simulation, however, we can. Let’s assess the coverage of bootstrap percentile intervals. Pick a univariate distribution with readily available mean and one that you can easily sample from. Draw \\(n = 30\\) random samples from the chosen distribution and use the bootstrap (with large enough m) and percentile CI method to construct 95% CI. Repeat the process many times and count how many times the CI contains the true mean. That is, compute the actual coverage probability (don’t forget to include the standard error of the coverage probability!). What can you observe? Try one or two different distributions. What can you observe? Repeat (b) and (c) using BCa intervals (R package boot). How does the coverage compare to percentile intervals? As (d) but using intervals based on asymptotic normality (+/- 1.96 SE). How do results from (b), (d), and (e) change if we increase the sample size to n = 200? What about n = 5? library(boot) set.seed(0) nit <- 1000 # Repeat the process "many times" alpha <- 0.05 # CI parameter nboot <- 100 # m parameter for bootstrap ("large enough m") # f: change this to 200 or 5. nsample <- 30 # n = 30 random samples from the chosen distribution. Comment out BCa code if it breaks. 
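# (Supplementary comments describing the simulation below.) # covers, covers_BCa and covers_asymp_norm record, for each of the nit repetitions, whether the percentile, BCa and asymptotic-normality intervals covered the true mean of the three chosen distributions: N(0,1) (mean 0), Beta(1,2) (mean 1/3) and Poisson(5) (mean 5).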
covers <- matrix(nrow = nit, ncol = 3) covers_BCa <- matrix(nrow = nit, ncol = 3) covers_asymp_norm <- matrix(nrow = nit, ncol = 3) isin <- function (x, lower, upper) { (x > lower) & (x < upper) } for (j in 1:nit) { # Repeating many times # a: pick a univariate distribution - standard normal x1 <- rnorm(nsample) # c: one or two different distributions - beta and poisson x2 <- rbeta(nsample, 1, 2) x3 <- rpois(nsample, 5) X1 <- matrix(data = NA, nrow = nsample, ncol = nboot) X2 <- matrix(data = NA, nrow = nsample, ncol = nboot) X3 <- matrix(data = NA, nrow = nsample, ncol = nboot) for (i in 1:nboot) { X1[ ,i] <- sample(x1, nsample, replace = T) X2[ ,i] <- sample(x2, nsample, T) X3[ ,i] <- sample(x3, nsample, T) } X1_func <- apply(X1, 2, mean) X2_func <- apply(X2, 2, mean) X3_func <- apply(X3, 2, mean) X1_quant <- quantile(X1_func, probs = c(alpha / 2, 1 - alpha / 2)) X2_quant <- quantile(X2_func, probs = c(alpha / 2, 1 - alpha / 2)) X3_quant <- quantile(X3_func, probs = c(alpha / 2, 1 - alpha / 2)) covers[j,1] <- (0 > X1_quant[1]) & (0 < X1_quant[2]) covers[j,2] <- ((1 / 3) > X2_quant[1]) & ((1 / 3) < X2_quant[2]) covers[j,3] <- (5 > X3_quant[1]) & (5 < X3_quant[2]) mf <- function (x, i) return(mean(x[i])) bootX1 <- boot(x1, statistic = mf, R = nboot) bootX2 <- boot(x2, statistic = mf, R = nboot) bootX3 <- boot(x3, statistic = mf, R = nboot) X1_quant_BCa <- boot.ci(bootX1, type = "bca")$bca X2_quant_BCa <- boot.ci(bootX2, type = "bca")$bca X3_quant_BCa <- boot.ci(bootX3, type = "bca")$bca covers_BCa[j,1] <- (0 > X1_quant_BCa[4]) & (0 < X1_quant_BCa[5]) covers_BCa[j,2] <- ((1 / 3) > X2_quant_BCa[4]) & ((1 / 3) < X2_quant_BCa[5]) covers_BCa[j,3] <- (5 > X3_quant_BCa[4]) & (5 < X3_quant_BCa[5]) # e: estimate mean and standard error # sample mean: x1_bar <- mean(x1) x2_bar <- mean(x2) x3_bar <- mean(x3) # standard error (of the sample mean) estimate: sample standard deviation / sqrt(n) x1_bar_SE <- sd(x1) / sqrt(nsample) x2_bar_SE <- sd(x2) / sqrt(nsample) x3_bar_SE <- sd(x3) / sqrt(nsample) covers_asymp_norm[j,1] <- isin(0, x1_bar - 1.96 * x1_bar_SE, x1_bar + 1.96 * x1_bar_SE) covers_asymp_norm[j,2] <- isin(1/3, x2_bar - 1.96 * x2_bar_SE, x2_bar + 1.96 * x2_bar_SE) covers_asymp_norm[j,3] <- isin(5, x3_bar - 1.96 * x3_bar_SE, x3_bar + 1.96 * x3_bar_SE) } apply(covers, 2, mean) ## [1] 0.918 0.925 0.905 apply(covers, 2, sd) / sqrt(nit) ## [1] 0.008680516 0.008333333 0.009276910 apply(covers_BCa, 2, mean) ## [1] 0.927 0.944 0.927 apply(covers_BCa, 2, sd) / sqrt(nit) ## [1] 0.008230355 0.007274401 0.008230355 apply(covers_asymp_norm, 2, mean) ## [1] 0.939 0.937 0.930 apply(covers_asymp_norm, 2, sd) / sqrt(nit) ## [1] 0.007572076 0.007687008 0.008072494 Exercise 14.2 You are given a sample of independent observations from a process of interest: Index 1 2 3 4 5 6 7 8 X 7 2 4 6 4 5 9 10 Compute the plug-in estimate of mean and 95% symmetric CI based on asymptotic normality. Use the plug-in estimate of SE. Same as (a), but use the unbiased estimate of SE. Apply nonparametric bootstrap with 1000 bootstrap replications and estimate the 95% CI for the mean with percentile-based CI. 
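As a reference for (a) and (b) (a supplementary note, with \\(\\bar{x}\\) the sample mean): the plug-in standard error uses the divide-by-\\(n\\) standard deviation, while the unbiased version uses the Bessel-corrected, divide-by-\\(n-1\\), standard deviation, \\[\\begin{align*} \\widehat{SE}_{\\text{plug-in}} = \\frac{1}{\\sqrt{n}} \\sqrt{\\frac{1}{n}\\sum_{i=1}^n (x_i - \\bar{x})^2}, \\qquad \\widehat{SE}_{\\text{unbiased}} = \\frac{1}{\\sqrt{n}} \\sqrt{\\frac{1}{n-1}\\sum_{i=1}^n (x_i - \\bar{x})^2}. \\end{align*}\\]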
# a x <- c(7, 2, 4, 6, 4, 5, 9, 10) n <- length(x) mu <- mean(x) SE <- sqrt(mean((x - mu)^2)) / sqrt(n) SE ## [1] 0.8915839 z <- qnorm(1 - 0.05 / 2) c(mu - z * SE, mu + z * SE) ## [1] 4.127528 7.622472 # b SE <- sd(x) / sqrt(n) SE ## [1] 0.9531433 c(mu - z * SE, mu + z * SE) ## [1] 4.006873 7.743127 # c set.seed(0) m <- 1000 T_mean <- function(x) {mean(x)} est_boot <- array(NA, m) for (i in 1:m) { x_boot <- x[sample(1:n, n, rep = T)] est_boot[i] <- T_mean(x_boot) } quantile(est_boot, p = c(0.025, 0.975)) ## 2.5% 97.5% ## 4.250 7.625 Exercise 14.3 We are given a sample of 10 independent paired (bivariate) observations: Index 1 2 3 4 5 6 7 8 9 10 X 1.26 -0.33 1.33 1.27 0.41 -1.54 -0.93 -0.29 -0.01 2.40 Y 2.64 0.33 0.48 0.06 -0.88 -2.14 -2.21 0.95 0.83 1.45 Compute Pearson correlation between X and Y. Use the cor.test() from R to estimate a 95% CI for the estimate from (a). Apply nonparametric bootstrap with 1000 bootstrap replications and estimate the 95% CI for the Pearson correlation with percentile-based CI. Compare CI from (b) and (c). Are they similar? How would the bootstrap estimation of CI change if we were interested in Spearman or Kendall correlation instead? x <- c(1.26, -0.33, 1.33, 1.27, 0.41, -1.54, -0.93, -0.29, -0.01, 2.40) y <- c(2.64, 0.33, 0.48, 0.06, -0.88, -2.14, -2.21, 0.95, 0.83, 1.45) # a cor(x, y) ## [1] 0.6991247 # b res <- cor.test(x, y) res$conf.int[1:2] ## [1] 0.1241458 0.9226238 # c set.seed(0) m <- 1000 n <- length(x) T_cor <- function(x, y) {cor(x, y)} est_boot <- array(NA, m) for (i in 1:m) { idx <- sample(1:n, n, rep = T) # !!! important to use same indices to keep dependency between x and y est_boot[i] <- T_cor(x[idx], y[idx]) } quantile(est_boot, p = c(0.025, 0.975)) ## 2.5% 97.5% ## 0.2565537 0.9057664 # d # Yes, but the bootstrap CI is more narrow. # e # We just use the functions for Kendall/Spearman coefficients instead: T_kendall <- function(x, y) {cor(x, y, method = "kendall")} T_spearman <- function(x, y) {cor(x, y, method = "spearman")} # Put this in a function that returns the CI bootstrap_95_ci <- function(x, y, t, m = 1000) { n <- length(x) est_boot <- array(NA, m) for (i in 1:m) { idx <- sample(1:n, n, rep = T) # !!! important to use same indices to keep dependency between x and y est_boot[i] <- t(x[idx], y[idx]) } quantile(est_boot, p = c(0.025, 0.975)) } bootstrap_95_ci(x, y, T_kendall) ## 2.5% 97.5% ## -0.08108108 0.78378378 bootstrap_95_ci(x, y, T_spearman) ## 2.5% 97.5% ## -0.1701115 0.8867925 Exercise 14.4 In this problem we will illustrate the use of the nonparametric bootstrap for estimating CIs of regression model coefficients. Load the longley dataset from base R with data(longley). Use lm() to apply linear regression using “Employed” as the target (dependent) variable and all other variables as the predictors (independent). Using lm() results, print the estimated regression coefficients and standard errors. Estimate 95% CI for the coefficients using +/- 1.96 * SE. Use nonparametric bootstrap with 100 replications to estimate the SE of the coefficients from (b). Compare the SE from (c) with those from (b). # a data(longley) # b res <- lm(Employed ~ . 
, longley) tmp <- data.frame(summary(res)$coefficients[,1:2]) tmp$LB <- tmp[,1] - 1.96 * tmp[,2] tmp$UB <- tmp[,1] + 1.96 * tmp[,2] tmp ## Estimate Std..Error LB UB ## (Intercept) -3.482259e+03 8.904204e+02 -5.227483e+03 -1.737035e+03 ## GNP.deflator 1.506187e-02 8.491493e-02 -1.513714e-01 1.814951e-01 ## GNP -3.581918e-02 3.349101e-02 -1.014616e-01 2.982320e-02 ## Unemployed -2.020230e-02 4.883997e-03 -2.977493e-02 -1.062966e-02 ## Armed.Forces -1.033227e-02 2.142742e-03 -1.453204e-02 -6.132495e-03 ## Population -5.110411e-02 2.260732e-01 -4.942076e-01 3.919994e-01 ## Year 1.829151e+00 4.554785e-01 9.364136e-01 2.721889e+00 # c set.seed(0) m <- 100 n <- nrow(longley) T_coef <- function(x) { lm(Employed ~ . , x)$coefficients } est_boot <- array(NA, c(m, ncol(longley))) for (i in 1:m) { idx <- sample(1:n, n, rep = T) est_boot[i,] <- T_coef(longley[idx,]) } SE <- apply(est_boot, 2, sd) SE ## [1] 1.826011e+03 1.605981e-01 5.693746e-02 8.204892e-03 3.802225e-03 ## [6] 3.907527e-01 9.414436e-01 # Show the standard errors around coefficients library(ggplot2) library(reshape2) df <- data.frame(index = 1:7, bootstrap_SE = SE, lm_SE = tmp$Std..Error) melted_df <- melt(df[2:nrow(df), ], id.vars = "index") # Ignore bias which has a really large magnitude ggplot(melted_df, aes(x = index, y = value, fill = variable)) + geom_bar(stat="identity", position="dodge") + xlab("Coefficient") + ylab("Standard error") # + scale_y_continuous(trans = "log") # If you want to also plot bias Exercise 14.5 This exercise shows a shortcoming of the bootstrap method when using the plug in estimator for the maximum. Compute the 95% bootstrap CI for the maximum of a standard normal distribution. Compute the 95% bootstrap CI for the maximum of a binomial distribution with n = 15 and p = 0.2. Repeat (b) using p = 0.9. Why is the result different? # bootstrap CI for maximum alpha <- 0.05 T_max <- function(x) {max(x)} # Equal to T_max = max bootstrap <- function(x, t, m = 1000) { n <- length(x) values <- rep(0, m) for (i in 1:m) { values[i] <- t(sample(x, n, replace = T)) } quantile(values, probs = c(alpha / 2, 1 - alpha / 2)) } # a # Meaningless, as the normal distribution can yield arbitrarily large values. x <- rnorm(100) bootstrap(x, T_max) ## 2.5% 97.5% ## 1.819425 2.961743 # b x <- rbinom(100, size = 15, prob = 0.2) # min = 0, max = 15 bootstrap(x, T_max) ## 2.5% 97.5% ## 6 7 # c x <- rbinom(100, size = 15, prob = 0.9) # min = 0, max = 15 bootstrap(x, T_max) ## 2.5% 97.5% ## 15 15 # Observation: to estimate the maximum, we need sufficient probability mass near the maximum value the distribution can yield. # Using bootstrap is pointless when there is too little mass near the true maximum. # In general, bootstrap will fail when estimating the CI for the maximum. "],["ml.html", "Chapter 15 Maximum likelihood 15.1 Deriving MLE 15.2 Fisher information 15.3 The German tank problem", " Chapter 15 Maximum likelihood This chapter deals with maximum likelihood estimation. The students are expected to acquire the following knowledge: How to derive MLE. Applying MLE in R. Calculating and interpreting Fisher information. Practical use of MLE. .fold-btn { float: right; margin: 5px 5px 0 0; } .fold { border: 1px solid black; min-height: 40px; } 15.1 Deriving MLE Exercise 15.1 Derive the maximum likelihood estimator of variance for N\\((\\mu, \\sigma^2)\\). Compare with results from 13.3. What does that say about the MLE estimator? Solution. 
The mean is assumed constant, so we have the likelihood \\[\\begin{align} L(\\sigma^2; y) &= \\prod_{i=1}^n \\frac{1}{\\sqrt{2 \\pi \\sigma^2}} e^{-\\frac{(y_i - \\mu)^2}{2 \\sigma^2}} \\\\ &= \\frac{1}{\\sqrt{2 \\pi \\sigma^2}^n} e^{\\frac{-\\sum_{i=1}^n (y_i - \\mu)^2}{2 \\sigma^2}} \\end{align}\\] We need to find the maximum of this function. We first observe that we can replace \\(\\frac{-\\sum_{i=1}^n (y_i - \\mu)^2}{2}\\) with a constant \\(c\\), since none of the terms are dependent on \\(\\sigma^2\\). Additionally, the term \\(\\frac{1}{\\sqrt{2 \\pi}^n}\\) does not affect the calculation of the maximum. So now we have \\[\\begin{align} L(\\sigma^2; y) &= (\\sigma^2)^{-\\frac{n}{2}} e^{\\frac{c}{\\sigma^2}}. \\end{align}\\] Differentiating we get \\[\\begin{align} \\frac{d}{d \\sigma^2} L(\\sigma^2; y) &= (\\sigma^2)^{-\\frac{n}{2}} \\frac{d}{d \\sigma^2} e^{\\frac{c}{\\sigma^2}} + e^{\\frac{c}{\\sigma^2}} \\frac{d}{d \\sigma^2} (\\sigma^2)^{-\\frac{n}{2}} \\\\ &= - (\\sigma^2)^{-\\frac{n}{2}} e^{\\frac{c}{\\sigma^2}} \\frac{c}{(\\sigma^2)^2} - e^{\\frac{c}{\\sigma^2}} \\frac{n}{2} (\\sigma^2)^{-\\frac{n + 2}{2}} \\\\ &= - (\\sigma^2)^{-\\frac{n + 4}{2}} e^{\\frac{c}{\\sigma^2}} c - e^{\\frac{c}{\\sigma^2}} \\frac{n}{2} (\\sigma^2)^{-\\frac{n + 2}{2}} \\\\ &= - e^{\\frac{c}{\\sigma^2}} (\\sigma^2)^{-\\frac{n + 4}{2}} \\Big(c + \\frac{n}{2}\\sigma^2 \\Big). \\end{align}\\] To get the maximum, this has to equal to 0, so \\[\\begin{align} c + \\frac{n}{2}\\sigma^2 &= 0 \\\\ \\sigma^2 &= -\\frac{2c}{n} \\\\ \\sigma^2 &= \\frac{\\sum_{i=1}^n (Y_i - \\mu)^2}{n}. \\end{align}\\] The MLE estimator is biased. Exercise 15.2 (Multivariate normal distribution) Derive the maximum likelihood estimate for the mean and covariance matrix of the multivariate normal. Simulate \\(n = 40\\) samples from a bivariate normal distribution (choose non-trivial parameters, that is, mean \\(\\neq 0\\) and covariance \\(\\neq 0\\)). Compute the MLE for the sample. Overlay the data with an ellipse that is determined by the MLE and an ellipse that is determined by the chosen true parameters. Repeat b. several times and observe how the estimates (ellipses) vary around the true value. Hint: For the derivation of MLE, these identities will be helpful: \\(\\frac{\\partial b^T a}{\\partial a} = \\frac{\\partial a^T b}{\\partial a} = b\\), \\(\\frac{\\partial a^T A a}{\\partial a} = (A + A^T)a\\), \\(\\frac{\\partial \\text{tr}(BA)}{\\partial A} = B^T\\), \\(\\frac{\\partial \\ln |A|}{\\partial A} = (A^{-1})^T\\), \\(a^T A a = \\text{tr}(a^T A a) = \\text{tr}(a a^T A) = \\text{tr}(Aaa^T)\\). Solution. The log likelihood of the MVN distribution is \\[\\begin{align*} l(\\mu, \\Sigma ; x) &= -\\frac{1}{2}\\Big(\\sum_{i=1}^n k\\ln(2\\pi) + |\\Sigma| + (x_i - \\mu)^T \\Sigma^{-1} (x_i - \\mu)\\Big) \\\\ &= -\\frac{n}{2}\\ln|\\Sigma| + -\\frac{1}{2}\\Big(\\sum_{i=1}^n(x_i - \\mu)^T \\Sigma^{-1} (x_i - \\mu)\\Big) + c, \\end{align*}\\] where \\(c\\) is a constant with respect to \\(\\mu\\) and \\(\\Sigma\\). To find the MLE we first need to find partial derivatives. Let us start with \\(\\mu\\). \\[\\begin{align*} \\frac{\\partial}{\\partial \\mu}l(\\mu, \\Sigma ; x) &= \\frac{\\partial}{\\partial \\mu} -\\frac{1}{2}\\Big(\\sum_{i=1}^n x_i^T \\Sigma^{-1} x_i - x_i^T \\Sigma^{-1} \\mu - \\mu^T \\Sigma^{-1} x_i + \\mu^T \\Sigma^{-1} \\mu \\Big) \\\\ &= -\\frac{1}{2}\\Big(\\sum_{i=1}^n - \\Sigma^{-1} x_i - \\Sigma^{-1} x_i + 2 \\Sigma^{-1} \\mu \\Big) \\\\ &= -\\Sigma^{-1}\\Big(\\sum_{i=1}^n - x_i + \\mu \\Big). 
\\end{align*}\\] Equating above with zero, we get \\[\\begin{align*} \\sum_{i=1}^n - x_i + \\mu &= 0 \\\\ \\hat{\\mu} = \\frac{1}{n} \\sum_{i=1}^n x_i, \\end{align*}\\] which is the dimension-wise empirical mean. Now for the covariance matrix \\[\\begin{align*} \\frac{\\partial}{\\partial \\Sigma^{-1}}l(\\mu, \\Sigma ; x) &= \\frac{\\partial}{\\partial \\Sigma^{-1}} -\\frac{n}{2}\\ln|\\Sigma| + -\\frac{1}{2}\\Big(\\sum_{i=1}^n(x_i - \\mu)^T \\Sigma^{-1} (x_i - \\mu)\\Big) \\\\ &= \\frac{\\partial}{\\partial \\Sigma^{-1}} -\\frac{n}{2}\\ln|\\Sigma| + -\\frac{1}{2}\\Big(\\sum_{i=1}^n \\text{tr}((x_i - \\mu)^T \\Sigma^{-1} (x_i - \\mu))\\Big) \\\\ &= \\frac{\\partial}{\\partial \\Sigma^{-1}} -\\frac{n}{2}\\ln|\\Sigma| + -\\frac{1}{2}\\Big(\\sum_{i=1}^n \\text{tr}((\\Sigma^{-1} (x_i - \\mu) (x_i - \\mu)^T )\\Big) \\\\ &= \\frac{n}{2}\\Sigma + -\\frac{1}{2}\\Big(\\sum_{i=1}^n (x_i - \\mu) (x_i - \\mu)^T \\Big). \\end{align*}\\] Equating above with zero, we get \\[\\begin{align*} \\hat{\\Sigma} = \\frac{1}{n}\\sum_{i=1}^n (x_i - \\mu) (x_i - \\mu)^T. \\end{align*}\\] set.seed(1) n <- 40 mu <- c(1, -2) Sigma <- matrix(data = c(2, -1.6, -1.6, 1.8), ncol = 2) X <- mvrnorm(n = n, mu = mu, Sigma = Sigma) colnames(X) <- c("X1", "X2") X <- as.data.frame(X) # plot.new() tru_ellip <- ellipse(mu, Sigma, draw = FALSE) colnames(tru_ellip) <- c("X1", "X2") tru_ellip <- as.data.frame(tru_ellip) mu_est <- apply(X, 2, mean) tmp <- as.matrix(sweep(X, 2, mu_est)) Sigma_est <- (1 / n) * t(tmp) %*% tmp est_ellip <- ellipse(mu_est, Sigma_est, draw = FALSE) colnames(est_ellip) <- c("X1", "X2") est_ellip <- as.data.frame(est_ellip) ggplot(data = X, aes(x = X1, y = X2)) + geom_point() + geom_path(data = tru_ellip, aes(x = X1, y = X2, color = "truth")) + geom_path(data = est_ellip, aes(x = X1, y = X2, color = "estimated")) + labs(color = "type") Exercise 15.3 (Logistic regression) Logistic regression is a popular discriminative model when our target variable is binary (categorical with 2 values). One of the ways of looking at logistic regression is that it is linear regression but instead of using the linear term as the mean of a normal RV, we use it as the mean of a Bernoulli RV. Of course, the mean of a Bernoulli is bounded on \\([0,1]\\), so, to avoid non-sensical values, we squeeze the linear between 0 and 1 with the inverse logit function inv_logit\\((z) = 1 / (1 + e^{-z})\\). This leads to the following model: \\(y_i | \\beta, x_i \\sim \\text{Bernoulli}(\\text{inv_logit}(\\beta x_i))\\). Explicitly write the likelihood function of beta. Implement the likelihood function in R. Use black-box box-constraint optimization (for example, optim() with L-BFGS) to find the maximum likelihood estimate for beta for \\(x\\) and \\(y\\) defined below. Plot the estimated probability as a function of the independent variable. Compare with the truth. Let \\(y2\\) be a response defined below. Will logistic regression work well on this dataset? Why not? How can we still use the model, without changing it? inv_log <- function (z) { return (1 / (1 + exp(-z))) } set.seed(1) x <- rnorm(100) y <- rbinom(100, size = 1, prob = inv_log(1.2 * x)) y2 <- rbinom(100, size = 1, prob = inv_log(1.2 * x + 1.4 * x^2)) Solution. \\[\\begin{align*} l(\\beta; x, y) &= p(y | x, \\beta) \\\\ &= \\ln(\\prod_{i=1}^n \\text{inv_logit}(\\beta x_i)^{y_i} (1 - \\text{inv_logit}(\\beta x_i))^{1 - y_i}) \\\\ &= \\sum_{i=1}^n y_i \\ln(\\text{inv_logit}(\\beta x_i)) + (1 - y_i) \\ln(1 - \\text{inv_logit}(\\beta x_i)). 
\\end{align*}\\] set.seed(1) inv_log <- function (z) { return (1 / (1 + exp(-z))) } x <- rnorm(100) y <- x y <- rbinom(100, size = 1, prob = inv_log(1.2 * x)) l_logistic <- function (beta, X, y) { logl <- -sum(y * log(inv_log(as.vector(beta %*% X))) + (1 - y) * log((1 - inv_log(as.vector(beta %*% X))))) return(logl) } my_optim <- optim(par = 0.5, fn = l_logistic, method = "L-BFGS-B", lower = 0, upper = 10, X = x, y = y) my_optim$par ## [1] 1.166558 truth_p <- data.frame(x = x, prob = inv_log(1.2 * x), type = "truth") est_p <- data.frame(x = x, prob = inv_log(my_optim$par * x), type = "estimated") plot_df <- rbind(truth_p, est_p) ggplot(data = plot_df, aes(x = x, y = prob, color = type)) + geom_point(alpha = 0.3) y2 <- rbinom(2000, size = 1, prob = inv_log(1.2 * x + 1.4 * x^2)) X2 <- cbind(x, x^2) my_optim2 <- optim(par = c(0, 0), fn = l_logistic, method = "L-BFGS-B", lower = c(0, 0), upper = c(2, 2), X = t(X2), y = y2) my_optim2$par ## [1] 1.153656 1.257649 tmp <- sweep(data.frame(x = x, x2 = x^2), 2, my_optim2$par, FUN = "*") tmp <- tmp[ ,1] + tmp[ ,2] truth_p <- data.frame(x = x, prob = inv_log(1.2 * x + 1.4 * x^2), type = "truth") est_p <- data.frame(x = x, prob = inv_log(tmp), type = "estimated") plot_df <- rbind(truth_p, est_p) ggplot(data = plot_df, aes(x = x, y = prob, color = type)) + geom_point(alpha = 0.3) Exercise 15.4 (Linear regression) For the data generated below, do the following: Compute the least squares (MLE) estimate of coefficients beta using the matrix exact solution. Compute the MLE by minimizing the sum of squared residuals using black-box optimization (optim()). Compute the MLE by using the output built-in linear regression (lm() ). Compare (a-c and the true coefficients). Compute 95% CI on the beta coefficients using the output of built-in linear regression. Compute 95% CI on the beta coefficients by using (a or b) and the bootstrap with percentile method for CI. Compare with d. 
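For part (a), the "matrix exact solution" refers to the usual least-squares normal equations (stated here for reference; under Gaussian noise this coincides with the MLE): \\[\\begin{align*} \\hat{\\beta} = (X^T X)^{-1} X^T y. \\end{align*}\\] This is exactly what solve(t(X) %*% X) %*% t(X) %*% y computes in the solution below.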
set.seed(1) n <- 100 x1 <- rnorm(n) x2 <- rnorm(n) x3 <- rnorm(n) X <- cbind(x1, x2, x3) beta <- c(0.2, 0.6, -1.2) y <- as.vector(t(beta %*% t(X))) + rnorm(n, sd = 0.2) set.seed(1) n <- 100 x1 <- rnorm(n) x2 <- rnorm(n) x3 <- rnorm(n) X <- cbind(x1, x2, x3) beta <- c(0.2, 0.6, -1.2) y <- as.vector(t(beta %*% t(X))) + rnorm(n, sd = 0.2) LS_fun <- function (beta, X, y) { return(sum((y - beta %*% t(X))^2)) } my_optim <- optim(par = c(0, 0, 0), fn = LS_fun, lower = -5, upper = 5, X = X, y = y, method = "L-BFGS-B") my_optim$par ## [1] 0.1898162 0.5885946 -1.1788264 df <- data.frame(y = y, x1 = x1, x2 = x2, x3 = x3) my_lm <- lm(y ~ x1 + x2 + x3 - 1, data = df) my_lm ## ## Call: ## lm(formula = y ~ x1 + x2 + x3 - 1, data = df) ## ## Coefficients: ## x1 x2 x3 ## 0.1898 0.5886 -1.1788 # matrix solution beta_hat <- solve(t(X) %*% X) %*% t(X) %*% y beta_hat ## [,1] ## x1 0.1898162 ## x2 0.5885946 ## x3 -1.1788264 out <- summary(my_lm) out$coefficients[ ,2] ## x1 x2 x3 ## 0.02209328 0.02087542 0.01934506 # bootstrap CI nboot <- 1000 beta_boot <- matrix(data = NA, ncol = length(beta), nrow = nboot) for (i in 1:nboot) { inds <- sample(1:n, n, replace = T) new_df <- df[inds, ] X_tmp <- as.matrix(new_df[ ,-1]) y_tmp <- new_df[ ,1] # print(nrow(new_df)) tmp_beta <- solve(t(X_tmp) %*% X_tmp) %*% t(X_tmp) %*% y_tmp beta_boot[i, ] <- tmp_beta } apply(beta_boot, 2, mean) ## [1] 0.1893281 0.5887068 -1.1800738 apply(beta_boot, 2, quantile, probs = c(0.025, 0.975)) ## [,1] [,2] [,3] ## 2.5% 0.1389441 0.5436911 -1.221560 ## 97.5% 0.2386295 0.6363102 -1.140416 out$coefficients[ ,2] ## x1 x2 x3 ## 0.02209328 0.02087542 0.01934506 Exercise 15.5 (Principal component analysis) Load the olympic data set from package ade4. The data show decathlon results for 33 men in 1988 Olympic Games. This data set serves as a great example of finding the latent structure in the data, as there are certain characteristics of the athletes that make them excel at different events. For example an explosive athlete will do particulary well in sprints and long jumps. Perform PCA (prcomp) on the data set and interpret the first 2 latent dimensions. Hint: Standardize the data first to get meaningful results. Use MLE to estimate the covariance of the standardized multivariate distribution. Decompose the estimated covariance matrix with the eigendecomposition. Compare the eigenvectors to the output of PCA. 
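As a reference for (b) and (c) (a supplementary note): for standardized data the MLE of the covariance matrix is \\(\\hat{\\Sigma} = \\frac{1}{n} X^T X\\), and its eigendecomposition \\(\\hat{\\Sigma} = V \\Lambda V^T\\) yields eigenvectors (the columns of \\(V\\)) that should match the PCA loadings returned by prcomp (the rotation matrix) up to sign, while the eigenvalues correspond to the variances explained by the components, up to the \\(1/n\\) versus \\(1/(n-1)\\) scaling used by prcomp.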
data(olympic) X <- olympic$tab X_scaled <- scale(X) my_pca <- prcomp(X_scaled) summary(my_pca) ## Importance of components: ## PC1 PC2 PC3 PC4 PC5 PC6 PC7 ## Standard deviation 1.8488 1.6144 0.97123 0.9370 0.74607 0.70088 0.65620 ## Proportion of Variance 0.3418 0.2606 0.09433 0.0878 0.05566 0.04912 0.04306 ## Cumulative Proportion 0.3418 0.6025 0.69679 0.7846 0.84026 0.88938 0.93244 ## PC8 PC9 PC10 ## Standard deviation 0.55389 0.51667 0.31915 ## Proportion of Variance 0.03068 0.02669 0.01019 ## Cumulative Proportion 0.96312 0.98981 1.00000 autoplot(my_pca, data = X, loadings = TRUE, loadings.colour = 'blue', loadings.label = TRUE, loadings.label.size = 3) Sigma_est <- (1 / nrow(X_scaled)) * t(X_scaled) %*% X_scaled Sigma_dec <- eigen(Sigma_est) Sigma_dec$vectors ## [,1] [,2] [,3] [,4] [,5] [,6] ## [1,] 0.4158823 0.1488081 -0.26747198 -0.08833244 -0.442314456 0.03071237 ## [2,] -0.3940515 -0.1520815 -0.16894945 -0.24424963 0.368913901 -0.09378242 ## [3,] -0.2691057 0.4835374 0.09853273 -0.10776276 -0.009754680 0.23002054 ## [4,] -0.2122818 0.0278985 -0.85498656 0.38794393 -0.001876311 0.07454380 ## [5,] 0.3558474 0.3521598 -0.18949642 0.08057457 0.146965351 -0.32692886 ## [6,] 0.4334816 0.0695682 -0.12616012 -0.38229029 -0.088802794 0.21049130 ## [7,] -0.1757923 0.5033347 0.04609969 0.02558404 0.019358607 0.61491241 ## [8,] -0.3840821 0.1495820 0.13687235 0.14396548 -0.716743474 -0.34776037 ## [9,] -0.1799436 0.3719570 -0.19232803 -0.60046566 0.095582043 -0.43744387 ## [10,] 0.1701426 0.4209653 0.22255233 0.48564231 0.339772188 -0.30032419 ## [,7] [,8] [,9] [,10] ## [1,] 0.2543985 0.663712826 -0.10839531 0.10948045 ## [2,] 0.7505343 0.141264141 0.04613910 0.05580431 ## [3,] -0.1106637 0.072505560 0.42247611 0.65073655 ## [4,] -0.1351242 -0.155435871 -0.10206505 0.11941181 ## [5,] 0.1413388 -0.146839303 0.65076229 -0.33681395 ## [6,] 0.2725296 -0.639003579 -0.20723854 0.25971800 ## [7,] 0.1439726 0.009400445 -0.16724055 -0.53450315 ## [8,] 0.2732665 -0.276873049 -0.01766443 -0.06589572 ## [9,] -0.3419099 0.058519366 -0.30619617 -0.13093187 ## [10,] 0.1868704 0.007310045 -0.45688227 0.24311846 my_pca$rotation ## PC1 PC2 PC3 PC4 PC5 PC6 ## 100 -0.4158823 0.1488081 0.26747198 -0.08833244 -0.442314456 0.03071237 ## long 0.3940515 -0.1520815 0.16894945 -0.24424963 0.368913901 -0.09378242 ## poid 0.2691057 0.4835374 -0.09853273 -0.10776276 -0.009754680 0.23002054 ## haut 0.2122818 0.0278985 0.85498656 0.38794393 -0.001876311 0.07454380 ## 400 -0.3558474 0.3521598 0.18949642 0.08057457 0.146965351 -0.32692886 ## 110 -0.4334816 0.0695682 0.12616012 -0.38229029 -0.088802794 0.21049130 ## disq 0.1757923 0.5033347 -0.04609969 0.02558404 0.019358607 0.61491241 ## perc 0.3840821 0.1495820 -0.13687235 0.14396548 -0.716743474 -0.34776037 ## jave 0.1799436 0.3719570 0.19232803 -0.60046566 0.095582043 -0.43744387 ## 1500 -0.1701426 0.4209653 -0.22255233 0.48564231 0.339772188 -0.30032419 ## PC7 PC8 PC9 PC10 ## 100 0.2543985 -0.663712826 0.10839531 -0.10948045 ## long 0.7505343 -0.141264141 -0.04613910 -0.05580431 ## poid -0.1106637 -0.072505560 -0.42247611 -0.65073655 ## haut -0.1351242 0.155435871 0.10206505 -0.11941181 ## 400 0.1413388 0.146839303 -0.65076229 0.33681395 ## 110 0.2725296 0.639003579 0.20723854 -0.25971800 ## disq 0.1439726 -0.009400445 0.16724055 0.53450315 ## perc 0.2732665 0.276873049 0.01766443 0.06589572 ## jave -0.3419099 -0.058519366 0.30619617 0.13093187 ## 1500 0.1868704 -0.007310045 0.45688227 -0.24311846 15.2 Fisher information Exercise 15.6 Let us assume a Poisson likelihood. 
Derive the MLE estimate of the mean. Derive the Fisher information. For the data below compute the MLE and construct confidence intervals. Use bootstrap to construct the CI for the mean. Compare with c) and discuss. x <- c(2, 5, 3, 1, 2, 1, 0, 3, 0, 2) Solution. The log likelihood of the Poisson is \\[\\begin{align*} l(\\lambda; x) = \\sum_{i=1}^n x_i \\ln \\lambda - n \\lambda - \\sum_{i=1}^n \\ln x_i! \\end{align*}\\] Taking the derivative and equating with 0 we get \\[\\begin{align*} \\frac{1}{\\hat{\\lambda}}\\sum_{i=1}^n x_i - n &= 0 \\\\ \\hat{\\lambda} &= \\frac{1}{n} \\sum_{i=1}^n x_i. \\end{align*}\\] Since \\(\\lambda\\) is the mean parameter, this was expected. For the Fischer information, we first need the second derivative, which is \\[\\begin{align*} - \\lambda^{-2} \\sum_{i=1}^n x_i. \\\\ \\end{align*}\\] Now taking the expectation of the negative of the above, we get \\[\\begin{align*} E[\\lambda^{-2} \\sum_{i=1}^n x_i] &= \\lambda^{-2} E[\\sum_{i=1}^n x_i] \\\\ &= \\lambda^{-2} n \\lambda \\\\ &= \\frac{n}{\\lambda}. \\end{align*}\\] set.seed(1) x <- c(2, 5, 3, 1, 2, 1, 0, 3, 0, 2) lambda_hat <- mean(x) finfo <- length(x) / lambda_hat mle_CI <- c(lambda_hat - 1.96 * sqrt(1 / finfo), lambda_hat + 1.96 * sqrt(1 / finfo)) boot_lambda <- c() nboot <- 1000 for (i in 1:nboot) { tmp_x <- sample(x, length(x), replace = T) boot_lambda[i] <- mean(tmp_x) } boot_CI <- c(quantile(boot_lambda, 0.025), quantile(boot_lambda, 0.975)) mle_CI ## [1] 1.045656 2.754344 boot_CI ## 2.5% 97.5% ## 1.0 2.7 Exercise 15.7 Find the Fisher information matrix for the Gamma distribution. Generate 20 samples from a Gamma distribution and plot a confidence ellipse of the inverse of Fisher information matrix around the ML estimates of the parameters. Also plot the theoretical values. Repeat the sampling several times. What do you observe? Discuss what a non-diagonal Fisher matrix implies. Hint: The digamma function is defined as \\(\\psi(x) = \\frac{\\frac{d}{dx} \\Gamma(x)}{\\Gamma(x)}\\). Additionally, you do not need to evaluate \\(\\frac{d}{dx} \\psi(x)\\). To calculate its value in R, use package numDeriv. Solution. The log likelihood of the Gamma is \\[\\begin{equation*} l(\\alpha, \\beta; x) = n \\alpha \\ln \\beta - n \\ln \\Gamma(\\alpha) + (\\alpha - 1) \\sum_{i=1}^n \\ln x_i - \\beta \\sum_{i=1}^n x_i. \\end{equation*}\\] Let us calculate the derivatives. \\[\\begin{align*} \\frac{\\partial}{\\partial \\alpha} l(\\alpha, \\beta; x) &= n \\ln \\beta - n \\psi(\\alpha) + \\sum_{i=1}^n \\ln x_i, \\\\ \\frac{\\partial}{\\partial \\beta} l(\\alpha, \\beta; x) &= \\frac{n \\alpha}{\\beta} - \\sum_{i=1}^n x_i, \\\\ \\frac{\\partial^2}{\\partial \\alpha \\beta} l(\\alpha, \\beta; x) &= \\frac{n}{\\beta}, \\\\ \\frac{\\partial^2}{\\partial \\alpha^2} l(\\alpha, \\beta; x) &= - n \\frac{\\partial}{\\partial \\alpha} \\psi(\\alpha), \\\\ \\frac{\\partial^2}{\\partial \\beta^2} l(\\alpha, \\beta; x) &= - \\frac{n \\alpha}{\\beta^2}. \\end{align*}\\] The Fisher information matrix is then \\[\\begin{align*} I(\\alpha, \\beta) = - E[ \\begin{bmatrix} - n \\psi'(\\alpha) & \\frac{n}{\\beta} \\\\ \\frac{n}{\\beta} & - \\frac{n \\alpha}{\\beta^2} \\end{bmatrix} ] = \\begin{bmatrix} n \\psi'(\\alpha) & - \\frac{n}{\\beta} \\\\ - \\frac{n}{\\beta} & \\frac{n \\alpha}{\\beta^2} \\end{bmatrix} \\end{align*}\\] A non-diagonal Fisher matrix implies that the parameter estimates are linearly dependent. 
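A supplementary note on how this matrix is used in the code below: by standard MLE asymptotics, \\((\\hat{\\alpha}, \\hat{\\beta})\\) is approximately normal around the true parameters with covariance \\(I(\\alpha, \\beta)^{-1}\\), which for the matrix above is \\[\\begin{align*} I(\\alpha, \\beta)^{-1} = \\frac{1}{n(\\alpha \\psi'(\\alpha) - 1)} \\begin{bmatrix} \\alpha & \\beta \\\\ \\beta & \\beta^2 \\psi'(\\alpha) \\end{bmatrix}. \\end{align*}\\] This is why the confidence ellipse is drawn from the inverse of the Fisher information; the positive off-diagonal term means the two estimates are positively correlated (rather than independent), which shows up as the tilt of the ellipse.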
set.seed(1) n <- 20 pars_theor <- c(5, 2) x <- rgamma(n, 5, 2) # MLE for alpha and beta log_lik <- function (pars, x) { n <- length(x) return (- (n * pars[1] * log(pars[2]) - n * log(gamma(pars[1])) + (pars[1] - 1) * sum(log(x)) - pars[2] * sum(x))) } my_optim <- optim(par = c(1,1), fn = log_lik, method = "L-BFGS-B", lower = c(0.001, 0.001), upper = c(8, 8), x = x) pars_mle <- my_optim$par fish_mat <- matrix(data = NA, nrow = 2, ncol = 2) fish_mat[1,2] <- - n / pars_mle[2] fish_mat[2,1] <- - n / pars_mle[2] fish_mat[2,2] <- (n * pars_mle[1]) / (pars_mle[2]^2) fish_mat[1,1] <- n * grad(digamma, pars_mle[1]) fish_mat_inv <- solve(fish_mat) est_ellip <- ellipse(pars_mle, fish_mat_inv, draw = FALSE) colnames(est_ellip) <- c("X1", "X2") est_ellip <- as.data.frame(est_ellip) ggplot() + geom_point(data = data.frame(x = pars_mle[1], y = pars_mle[2]), aes(x = x, y = y)) + geom_path(data = est_ellip, aes(x = X1, y = X2)) + geom_point(aes(x = pars_theor[1], y = pars_theor[2]), color = "red") + geom_text(aes(x = pars_theor[1], y = pars_theor[2], label = "Theoretical parameters"), color = "red", nudge_y = -0.2) 15.3 The German tank problem Exercise 15.8 (The German tank problem) During WWII the allied intelligence were faced with an important problem of estimating the total production of certain German tanks, such as the Panther. What turned out to be a successful approach was to estimate the maximum from the serial numbers of the small sample of captured or destroyed tanks (describe the statistical model used). What assumptions were made by using the above model? Do you think they are reasonable assumptions in practice? Show that the plug-in estimate for the maximum (i.e. the maximum of the sample) is a biased estimator. Derive the maximum likelihood estimate of the maximum. Check that the following estimator is not biased: \\(\\hat{n} = \\frac{k + 1}{k}m - 1\\). Solution. The data are the serial numbers of the tanks. The parameter is \\(n\\), the total production of the tank. The distribution of the serial numbers is a discrete uniform distribution over all serial numbers. One of the assumptions is that we have i.i.d samples, however in practice this might not be true, as some tanks produced later could be sent to the field later, therefore already in theory we would not be able to recover some values from the population. To find the expected value we first need to find the distribution of \\(m\\). Let us start with the CDF. \\[\\begin{align*} F_m(x) = P(Y_1 < x,...,Y_k < x). \\end{align*}\\] If \\(x < k\\) then \\(F_m(x) = 0\\) and if \\(x \\geq 1\\) then \\(F_m(x) = 1\\). What about between those values. So the probability that the maximum value is less than or equal to \\(m\\) is just the number of possible draws from \\(Y\\) that are all smaller than \\(m\\), divided by all possible draws. This is \\(\\frac{{x}\\choose{k}}{{n}\\choose{k}}\\). The PDF on the suitable bounds is then \\[\\begin{align*} P(m = x) = F_m(x) - F_m(x - 1) = \\frac{\\binom{x}{k} - \\binom{x - 1}{k}}{\\binom{n}{k}} = \\frac{\\binom{x - 1}{k - 1}}{\\binom{n}{k}}. \\end{align*}\\] Now we can calculate the expected value of \\(m\\) using some combinatorial identities. \\[\\begin{align*} E[m] &= \\sum_{i = k}^n i \\frac{{i - 1}\\choose{k - 1}}{{n}\\choose{k}} \\\\ &= \\sum_{i = k}^n i \\frac{\\frac{(i - 1)!}{(k - 1)!(i - k)!}}{{n}\\choose{k}} \\\\ &= \\frac{k}{\\binom{n}{k}}\\sum_{i = k}^n \\binom{i}{k} \\\\ &= \\frac{k}{\\binom{n}{k}} \\binom{n + 1}{k + 1} \\\\ &= \\frac{k(n + 1)}{k + 1}. 
\\end{align*}\\] The bias of this estimator is then \\[\\begin{align*} E[m] - n = \\frac{k(n + 1)}{k + 1} - n = \\frac{k - n}{k + 1}. \\end{align*}\\] The probability that we observed our sample \\(Y = {Y_1, Y_2,...,,Y_k}\\) given \\(n\\) is \\(\\frac{1}{{n}\\choose{k}}\\). We need to find such \\(n^*\\) that this function is maximized. Additionally, we have a constraint that \\(n^* \\geq m = \\max{(Y)}\\). Let us plot this function for \\(m = 10\\) and \\(k = 4\\). library(ggplot2) my_fun <- function (x, m, k) { tmp <- 1 / (choose(x, k)) tmp[x < m] <- 0 return (tmp) } x <- 1:20 y <- my_fun(x, 10, 4) df <- data.frame(x = x, y = y) ggplot(data = df, aes(x = x, y = y)) + geom_line() ::: {.solution} (continued) We observe that the maximum of this function lies at the maximum value of the sample. Therefore \\(n^* = m\\) and ML estimate equals the plug-in estimate. \\[\\begin{align*} E[\\hat{n}] &= \\frac{k + 1}{k} E[m] - 1 \\\\ &= \\frac{k + 1}{k} \\frac{k(n + 1)}{k + 1} - 1 \\\\ &= n. \\end{align*}\\] ::: "],["nhst.html", "Chapter 16 Null hypothesis significance testing", " Chapter 16 Null hypothesis significance testing This chapter deals with null hypothesis significance testing. The students are expected to acquire the following knowledge: Binomial test. t-test. Chi-squared test. .fold-btn { float: right; margin: 5px 5px 0 0; } .fold { border: 1px solid black; min-height: 40px; } Exercise 16.1 (Binomial test) We assume \\(y_i \\in \\{0,1\\}\\), \\(i = 1,...,n\\) and \\(y_i | \\theta = 0.5 \\sim i.i.d.\\) Bernoulli\\((\\theta)\\). The test statistic is \\(X = \\sum_{i=1}^n\\) and the rejection region R is defined as the region where the probability of obtaining such or more extreme \\(X\\) given \\(\\theta = 0.5\\) is less than 0.05. Derive and plot the power function of the test for \\(n=100\\). What is the significance level of this test if \\(H0: \\theta = 0.5\\)? At which values of X will we reject the null hypothesis? # a # First we need the rejection region, so we need to find X_min and X_max n <- 100 qbinom(0.025, n, 0.5) ## [1] 40 qbinom(0.975, n, 0.5) ## [1] 60 pbinom(40, n, 0.5) ## [1] 0.02844397 pbinom(60, n, 0.5) ## [1] 0.9823999 X_min <- 39 X_max <- 60 thetas <- seq(0, 1, by = 0.01) beta_t <- 1 - pbinom(X_max, size = n, prob = thetas) + pbinom(X_min, size = n, prob = thetas) plot(beta_t) # b # The significance level is beta_t[51] ## [1] 0.0352002 # We will reject the null hypothesis at X values below X_min and above X_max. Exercise 16.2 (Long-run guarantees of the t-test) Generate a sample of size \\(n = 10\\) from the standard normal. Use the two-sided t-test with \\(H0: \\mu = 0\\) and record the p-value. Can you reject H0 at 0.05 significance level? (before simulating) If we repeated (b) many times, what would be the relative frequency of false positives/Type I errors (rejecting the null that is true)? What would be the relative frequency of false negatives /Type II errors (retaining the null when the null is false)? (now simulate b and check if the simulation results match your answer in b) Similar to (a-c) but now we generate data from N(-0.5, 1). Similar to (a-c) but now we generate data from N(\\(\\mu\\), 1) where we every time pick a different \\(\\mu < 0\\) and use a one-sided test \\(H0: \\mu <= 0\\). 
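Before simulating, the expected error rates in (b) and (d) can be sanity-checked analytically; a minimal sketch using base R's power.t.test (supplementary, not part of the original solution; the variable pw is ours): # b: under a true H0 the Type I error rate of a calibrated two-sided t-test is alpha = 0.05 by construction # d: Type II error rate when the data come from N(-0.5, 1) with n = 10 pw <- power.t.test(n = 10, delta = 0.5, sd = 1, sig.level = 0.05, type = "one.sample", alternative = "two.sided") pw$power # power of the test, roughly 0.3 1 - pw$power # expected Type II error rate, roughly 0.7, which should match the simulation below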
set.seed(2) # a x <- rnorm(10) my_test <- t.test(x, alternative = "two.sided", mu = 0) my_test ## ## One Sample t-test ## ## data: x ## t = 0.6779, df = 9, p-value = 0.5149 ## alternative hypothesis: true mean is not equal to 0 ## 95 percent confidence interval: ## -0.4934661 0.9157694 ## sample estimates: ## mean of x ## 0.2111516 # we can not reject the null hypothesis # b # The expected value of false positives would be 0.05. The expected value of # true negatives would be 0, as there are no negatives (the null hypothesis is # always the truth). nit <- 1000 typeIerr <- vector(mode = "logical", length = nit) typeIIerr <- vector(mode = "logical", length = nit) for (i in 1:nit) { x <- rnorm(10) my_test <- t.test(x, alternative = "two.sided", mu = 0) if (my_test$p.value < 0.05) { typeIerr[i] <- T } else { typeIerr[i] <- F } } mean(typeIerr) ## [1] 0.052 sd(typeIerr) / sqrt(nit) ## [1] 0.007024624 # d # We can not estimate the percentage of true negatives, but it will probably be # higher than 0.05. There will be no false positives as the null hypothesis is # always false. typeIIerr <- vector(mode = "logical", length = nit) for (i in 1:nit) { x <- rnorm(10, -0.5) my_test <- t.test(x, alternative = "two.sided", mu = 0) if (my_test$p.value < 0.05) { typeIIerr[i] <- F } else { typeIIerr[i] <- T } } mean(typeIIerr) ## [1] 0.719 sd(typeIIerr) / sqrt(nit) ## [1] 0.01422115 # e # The expected value of false positives would be lower than 0.05. The expected # value of true negatives would be 0, as there are no negatives (the null # hypothesis is always the truth). typeIerr <- vector(mode = "logical", length = nit) for (i in 1:nit) { u <- runif(1, -1, 0) x <- rnorm(10, u) my_test <- t.test(x, alternative = "greater", mu = 0) if (my_test$p.value < 0.05) { typeIerr[i] <- T } else { typeIerr[i] <- F } } mean(typeIerr) ## [1] 0.012 sd(typeIerr) / sqrt(nit) ## [1] 0.003444977 Exercise 16.3 (T-test, confidence intervals, and bootstrap) Sample \\(n=20\\) from a standard normal distribution and calculate the p-value using t-test, confidence intervals based on normal distribution, and bootstrap. Repeat this several times and check how many times we rejected the null hypothesis (made a type I error). Hint: For the confidence intervals you can use function CI from the Rmisc package. set.seed(1) library(Rmisc) nit <- 1000 n_boot <- 100 t_logic <- rep(F, nit) boot_logic <- rep(F, nit) norm_logic <- rep(F, nit) for (i in 1:nit) { x <- rnorm(20) my_test <- t.test(x) my_CI <- CI(x) if (my_test$p.value <= 0.05) t_logic[i] <- T boot_tmp <- vector(mode = "numeric", length = n_boot) for (j in 1:n_boot) { tmp_samp <- sample(x, size = 20, replace = T) boot_tmp[j] <- mean(tmp_samp) } if ((quantile(boot_tmp, 0.025) >= 0) | (quantile(boot_tmp, 0.975) <= 0)) { boot_logic[i] <- T } if ((my_CI[3] >= 0) | (my_CI[1] <= 0)) { norm_logic[i] <- T } } mean(t_logic) ## [1] 0.053 sd(t_logic) / sqrt(nit) ## [1] 0.007088106 mean(boot_logic) ## [1] 0.093 sd(boot_logic) / sqrt(nit) ## [1] 0.009188876 mean(norm_logic) ## [1] 0.053 sd(norm_logic) / sqrt(nit) ## [1] 0.007088106 Exercise 16.4 (Chi-squared test) Show that the \\(\\chi^2 = \\sum_{i=1}^k \\frac{(O_i - E_i)^2}{E_i}\\) test statistic is approximately \\(\\chi^2\\) distributed when we have two categories. Let us look at the US voting data here. Compare the number of voters who voted for Trump or Hillary depending on their income (less or more than 100.000 dollars per year). Manually calculate the chi-squared statistic, compare to the chisq.test in R, and discuss the results. 
Visualize the test. Solution. Let \\(X_i\\) be binary variables, \\(i = 1,...,n\\). We can then express the test statistic as \\[\\begin{align} \\chi^2 = &\\frac{(O_i - np)^2}{np} + \\frac{(n - O_i - n(1 - p))^2}{n(1 - p)} \\\\ &= \\frac{(O_i - np)^2}{np(1 - p)} \\\\ &= (\\frac{O_i - np}{\\sqrt{np(1 - p)}})^2. \\end{align}\\] When \\(n\\) is large, this distrbution is approximately normal with \\(\\mu = np\\) and \\(\\sigma^2 = np(1 - p)\\) (binomial converges in distribution to standard normal). By definition, the chi-squared distribution with \\(k\\) degrees of freedom is a sum of squares of \\(k\\) independent standard normal random variables. n <- 24588 less100 <- round(0.66 * n * c(0.49, 0.45, 0.06)) # some rounding, but it should not affect results more100 <- round(0.34 * n * c(0.47, 0.47, 0.06)) x <- rbind(less100, more100) colnames(x) <- c("Clinton", "Trump", "other/no answer") print(x) ## Clinton Trump other/no answer ## less100 7952 7303 974 ## more100 3929 3929 502 chisq.test(x) ## ## Pearson's Chi-squared test ## ## data: x ## X-squared = 9.3945, df = 2, p-value = 0.00912 x ## Clinton Trump other/no answer ## less100 7952 7303 974 ## more100 3929 3929 502 csum <- apply(x, 2, sum) rsum <- apply(x, 1, sum) chi2 <- (x[1,1] - csum[1] * rsum[1] / sum(x))^2 / (csum[1] * rsum[1] / sum(x)) + (x[1,2] - csum[2] * rsum[1] / sum(x))^2 / (csum[2] * rsum[1] / sum(x)) + (x[1,3] - csum[3] * rsum[1] / sum(x))^2 / (csum[3] * rsum[1] / sum(x)) + (x[2,1] - csum[1] * rsum[2] / sum(x))^2 / (csum[1] * rsum[2] / sum(x)) + (x[2,2] - csum[2] * rsum[2] / sum(x))^2 / (csum[2] * rsum[2] / sum(x)) + (x[2,3] - csum[3] * rsum[2] / sum(x))^2 / (csum[3] * rsum[2] / sum(x)) chi2 ## Clinton ## 9.394536 1 - pchisq(chi2, df = 2) ## Clinton ## 0.009120161 x <- seq(0, 15, by = 0.01) df <- data.frame(x = x) ggplot(data = df, aes(x = x)) + stat_function(fun = dchisq, args = list(df = 2)) + geom_segment(aes(x = chi2, y = 0, xend = chi2, yend = dchisq(chi2, df = 2))) + stat_function(fun = dchisq, args = list(df = 2), xlim = c(chi2, 15), geom = "area", fill = "red") "],["bi.html", "Chapter 17 Bayesian inference 17.1 Conjugate priors 17.2 Posterior sampling", " Chapter 17 Bayesian inference This chapter deals with Bayesian inference. The students are expected to acquire the following knowledge: How to set prior distribution. Compute posterior distribution. Compute posterior predictive distribution. Use sampling for inference. .fold-btn { float: right; margin: 5px 5px 0 0; } .fold { border: 1px solid black; min-height: 40px; } 17.1 Conjugate priors Exercise 17.1 (Poisson-gamma model) Let us assume a Poisson likelihood and a gamma prior on the Poisson mean parameter (this is a conjugate prior). Derive posterior Below we have some data, which represents number of goals in a football match. Choose sensible prior for this data (draw the gamma density if necessary), justify it. Compute the posterior. Compute an interval such that the probability that the true mean is in there is 95%. What is the probability that the true mean is greater than 2.5? Back to theory: Compute prior predictive and posterior predictive. Discuss why the posterior predictive is overdispersed and not Poisson? Draw a histogram of the prior predictive and posterior predictive for the data from (b). Discuss. Generate 10 and 100 random samples from a Poisson distribution and compare the posteriors with a flat prior, and a prior concentrated away from the truth. x <- c(3, 2, 1, 1, 5, 4, 0, 0, 4, 3) Solution. 
\\[\\begin{align*} p(\\lambda | X) &= \\frac{p(X | \\lambda) p(\\lambda)}{\\int_0^\\infty p(X | \\lambda) p(\\lambda) d\\lambda} \\\\ &\\propto p(X | \\lambda) p(\\lambda) \\\\ &= \\Big(\\prod_{i=1}^n \\frac{1}{x_i!} \\lambda^{x_i} e^{-\\lambda}\\Big) \\frac{\\beta^\\alpha}{\\Gamma(\\alpha)} \\lambda^{\\alpha - 1} e^{-\\beta \\lambda} \\\\ &\\propto \\lambda^{\\sum_{i=1}^n x_i + \\alpha - 1} e^{- \\lambda (n + \\beta)} \\\\ \\end{align*}\\] We recognize this as the shape of a gamma distribution, therefore \\[\\begin{align*} \\lambda | X \\sim \\text{gamma}(\\alpha + \\sum_{i=1}^n x_i, \\beta + n) \\end{align*}\\] For the prior predictive, we have \\[\\begin{align*} p(x^*) &= \\int_0^\\infty p(x^*, \\lambda) d\\lambda \\\\ &= \\int_0^\\infty p(x^* | \\lambda) p(\\lambda) d\\lambda \\\\ &= \\int_0^\\infty \\frac{1}{x^*!} \\lambda^{x^*} e^{-\\lambda} \\frac{\\beta^\\alpha}{\\Gamma(\\alpha)} \\lambda^{\\alpha - 1} e^{-\\beta \\lambda} d\\lambda \\\\ &= \\frac{\\beta^\\alpha}{\\Gamma(x^* + 1)\\Gamma(\\alpha)} \\int_0^\\infty \\lambda^{x^* + \\alpha - 1} e^{-\\lambda (1 + \\beta)} d\\lambda \\\\ &= \\frac{\\beta^\\alpha}{\\Gamma(x^* + 1)\\Gamma(\\alpha)} \\frac{\\Gamma(x^* + \\alpha)}{(1 + \\beta)^{x^* + \\alpha}} \\int_0^\\infty \\frac{(1 + \\beta)^{x^* + \\alpha}}{\\Gamma(x^* + \\alpha)} \\lambda^{x^* + \\alpha - 1} e^{-\\lambda (1 + \\beta)} d\\lambda \\\\ &= \\frac{\\beta^\\alpha}{\\Gamma(x^* + 1)\\Gamma(\\alpha)} \\frac{\\Gamma(x^* + \\alpha)}{(1 + \\beta)^{x^* + \\alpha}} \\\\ &= \\frac{\\Gamma(x^* + \\alpha)}{\\Gamma(x^* + 1)\\Gamma(\\alpha)} (\\frac{\\beta}{1 + \\beta})^\\alpha (\\frac{1}{1 + \\beta})^{x^*}, \\end{align*}\\] which we recognize as the negative binomial distribution with \\(r = \\alpha\\) and \\(p = \\frac{1}{\\beta + 1}\\). For the posterior predictive, the calculation is the same, only now the parameters are \\(r = \\alpha + \\sum_{i=1}^n x_i\\) and \\(p = \\frac{1}{\\beta + n + 1}\\). There are two sources of uncertainty in the predictive distribution. The first is the uncertainty about the population parameter \\(\\lambda\\); the second is the sampling variability of a new observation given \\(\\lambda\\). When \\(n\\) is large, the first source becomes negligible and the predictive distribution is close to a Poisson. When \\(n\\) is small, the parameter uncertainty remains substantial, resulting in an overdispersed predictive distribution.
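To make this explicit (a supplementary check using the moments of the gamma–Poisson mixture): \\[\\begin{align*} E[x^* | X] &= E[\\lambda | X] = \\frac{\\alpha + \\sum_{i=1}^n x_i}{\\beta + n}, \\\\ \\mathrm{Var}[x^* | X] &= E[\\lambda | X] + \\mathrm{Var}[\\lambda | X] = \\frac{\\alpha + \\sum_{i=1}^n x_i}{\\beta + n} \\Big(1 + \\frac{1}{\\beta + n}\\Big), \\end{align*}\\] so the variance-to-mean ratio of the posterior predictive is \\(1 + \\frac{1}{\\beta + n} > 1\\), whereas a Poisson would have ratio exactly 1; the excess vanishes as \\(n\\) grows.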
x <- c(3, 2, 1, 1, 5, 4, 0, 0, 4, 3) # b # quick visual check of the prior ggplot(data = data.frame(x = seq(0, 5, by = 0.01)), aes(x = x)) + stat_function(fun = dgamma, args = list(shape = 1, rate = 1)) palpha <- 1 pbeta <- 1 alpha_post <- palpha + sum(x) beta_post <- pbeta + length(x) ggplot(data = data.frame(x = seq(0, 5, by = 0.01)), aes(x = x)) + stat_function(fun = dgamma, args = list(shape = alpha_post, rate = beta_post)) # probability of being higher than 2.5 1 - pgamma(2.5, alpha_post, beta_post) ## [1] 0.2267148 # interval qgamma(c(0.025, 0.975), alpha_post, beta_post) ## [1] 1.397932 3.137390 # d prior_pred <- rnbinom(1000, size = palpha, prob = 1 - 1 / (pbeta + 1)) post_pred <- rnbinom(1000, size = palpha + sum(x), prob = 1 - 1 / (pbeta + 10 + 1)) df <- data.frame(prior = prior_pred, posterior = post_pred) df <- gather(df) ggplot(df, aes(x = value, fill = key)) + geom_histogram(position = "dodge") # e set.seed(1) x1 <- rpois(10, 2.5) x2 <- rpois(100, 2.5) alpha_flat <- 1 beta_flat <- 0.1 alpha_conc <- 50 beta_conc <- 10 n <- 10000 df_flat <- data.frame(x1 = rgamma(n, alpha_flat + sum(x1), beta_flat + 10), x2 = rgamma(n, alpha_flat + sum(x2), beta_flat + 100), type = "flat") df_flat <- tidyr::gather(df_flat, key = "key", value = "value", - type) df_conc <- data.frame(x1 = rgamma(n, alpha_conc + sum(x1), beta_conc + 10), x2 = rgamma(n, alpha_conc + sum(x2), beta_conc + 100), type = "conc") df_conc <- tidyr::gather(df_conc, key = "key", value = "value", - type) df <- rbind(df_flat, df_conc) ggplot(data = df, aes(x = value, color = type)) + facet_wrap(~ key) + geom_density() 17.2 Posterior sampling Exercise 17.2 (Bayesian logistic regression) In Chapter 15 we implemented a MLE for logistic regression (see the code below). For this model, conjugate priors do not exist, which complicates the calculation of the posterior. However, we can use sampling from the numerator of the posterior, using rejection sampling. Set a sensible prior distribution on \\(\\beta\\) and use rejection sampling to find the posterior distribution. In a) you will get a distribution of parameter \\(\\beta\\). Plot the probabilities (as in exercise 15.3) for each sample of \\(\\beta\\) and compare to the truth. Hint: We can use rejection sampling even for functions which are not PDFs – they do not have to sum/integrate to 1. We just need to use a suitable envelope that we know how to sample from. For example, here we could use a uniform distribution and scale it suitably. set.seed(1) inv_log <- function (z) { return (1 / (1 + exp(-z))) } x <- rnorm(100) y <- x y <- rbinom(100, size = 1, prob = inv_log(1.2 * x)) l_logistic <- function (beta, X, y) { logl <- -sum(y * log(inv_log(as.vector(beta %*% X))) + (1 - y) * log((1 - inv_log(as.vector(beta %*% X))))) return(logl) } my_optim <- optim(par = 0.5, fn = l_logistic, method = "L-BFGS-B", lower = 0, upper = 10, X = x, y = y) my_optim$par # Let's say we believe that the mean of beta is 0.5. Since we are not very sure # about this, we will give it a relatively high variance. So a normal prior with # mean 0.5 and standard deviation 5. But there is no right solution to this, # this is basically us expressing our prior belief in the parameter values. 
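As a reminder of the rule used in the solution below (a generic statement of rejection sampling, not specific to this model): with \\(f\\) the unnormalized posterior (here the likelihood times the normal prior with mean 0.5 and standard deviation 5), \\(g\\) a proposal density we can sample from, and \\(M\\) a constant such that \\(M g(\\beta) \\geq f(\\beta)\\) everywhere, we repeatedly draw \\(\\beta \\sim g\\) and \\(u \\sim \\text{Uniform}(0, 1)\\) and accept \\(\\beta\\) whenever \\(u < \\frac{f(\\beta)}{M g(\\beta)}\\); the accepted draws are samples from the normalized posterior. In the code below, \\(g\\) is the Uniform\\((-5, 5)\\) density and \\(M = 10^{-25.8}\\), found by trial and error.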
set.seed(1) inv_log <- function (z) { return (1 / (1 + exp(-z))) } x <- rnorm(100) y <- x y <- rbinom(100, size = 1, prob = inv_log(1.2 * x)) l_logistic <- function (beta, X, y) { logl <- -sum(y * log(inv_log(as.vector(beta %*% X))) + (1 - y) * log((1 - inv_log(as.vector(beta %*% X))))) if (is.nan(logl)) logl <- Inf return(logl) } my_optim <- optim(par = 0.5, fn = l_logistic, method = "L-BFGS-B", lower = 0, upper = 10, X = x, y = y) my_optim$par ## [1] 1.166558 f_logistic <- function (beta, X, y) { logl <- prod(inv_log(as.vector(beta %*% X))^y * (1 - inv_log(as.vector(beta %*% X)))^(1 - y)) return(logl) } a <- seq(0, 3, by = 0.01) my_l <- c() for (i in a) { my_l <- c(my_l, f_logistic(i, x, y) * dnorm(i, 0.5, 5)) } plot(my_l) envlp <- 10^(-25.8) * dunif(a, -5, 5) # found by trial and error tmp <- data.frame(envel = envlp, l = my_l, t = a) tmp <- gather(tmp, key = "key", value = "value", - t) ggplot(tmp, aes(x = t, y = value, color = key)) + geom_line() # envelope OK set.seed(1) nsamps <- 1000 samps <- c() for (i in 1:nsamps) { tmp <- runif(1, -5, 5) u <- runif(1, 0, 1) if (u < (f_logistic(tmp, x, y) * dnorm(tmp, 0.5, 5)) / (10^(-25.8) * dunif(tmp, -5, 5))) { samps <- c(samps, tmp) } } plot(density(samps)) mean(samps) ## [1] 1.211578 median(samps) ## [1] 1.204279 truth_p <- data.frame(x = x, prob = inv_log(1.2 * x), type = "truth") preds <- inv_log(x %*% t(samps)) preds <- gather(cbind(as.data.frame(preds), x = x), key = "key", "value" = value, - x) ggplot(preds, aes(x = x, y = value)) + geom_line(aes(group = key), color = "gray", alpha = 0.7) + geom_point(data = truth_p, aes(y = prob), color = "red", alpha = 0.7) + theme_bw() "],["distributions-intutition.html", "Chapter 18 Distributions intutition 18.1 Discrete distributions 18.2 Continuous distributions", " Chapter 18 Distributions intutition This chapter is intended to help you familiarize yourself with the different probability distributions you will encounter in this course. You will need to use Appendix B extensively as a reference for the basic properties of distributions, so keep it close! .fold-btn { float: right; margin: 5px 5px 0 0; } .fold { border: 1px solid black; min-height: 40px; } 18.1 Discrete distributions Exercise 18.1 (Bernoulli intuition 1) Arguably the simplest distribution you will enocounter is the Bernoulli distribution. It is a discrete probability distribution used to represent the outcome of a yes/no question. It has one parameter \\(p\\) which is the probability of success. The probability of failure is \\((1-p)\\), sometimes denoted as \\(q\\). A classic way to think about a Bernoulli trial (a yes/no experiment) is a coin flip. Real coins are fair, meaning the probability of either heads (1) or tails (0) are the same, so \\(p=0.5\\) as shown below in figure a. Alternatively we may want to represent a process that doesn’t have equal probabilities of outcomes like “Will a throw of a fair die result in a 6?”. In this case \\(p=\\frac{1}{6}\\), shown in figure b. Using your knowledge of the Bernoulli distribution use the throw of a fair die to think of events, such that: \\(p = 0.5\\) \\(p = \\frac{5}{6}\\) \\(q = \\frac{2}{3}\\) Solution. An event that is equally likely to happen or not happen i.e. \\(p = 0.5\\) would be throwing an even number. More formally we can name this event \\(A\\) and write: \\(A = \\{2,4,6\\}\\), its probability being \\(P(A) = 0.5\\) An example of an event with \\(p = \\frac{5}{6}\\) would be throwing a number greater than 1. Defined as \\(B = \\{2,3,4,5,6\\}\\). 
We need an event that fails \\(\\frac{2}{3}\\) of the time. Alternatively we can reverse the problem and find an event that succeeds \\(\\frac{1}{3}\\) of the time, since: \\(q = 1 - p \\implies p = 1 - q = \\frac{1}{3}\\). The event that our outcome is divisible by 3: \\(C = \\{3, 6\\}\\) satisfies this condition. Exercise 18.2 (Binomial intuition 1) The binomial distribution is a generalization of the Bernoulli distribution. Instead of considering a single Bernoulli trial, we now consider a sequence of \\(n\\) trials, which are independent and have the same parameter \\(p\\). So the binomial distribution has two parameters \\(n\\) - the number of trials and \\(p\\) - the probability of success for each trial. If we return to our coin flip representation, we now flip a coin several times. The binomial distribution will give us the probabilities of all possible outcomes. Below we show the distribution for a series of 10 coin flips with a fair coin (left) and a biased coin (right). The numbers on the x axis represent the number of times the coin landed heads. Using your knowledge of the binomial distribution: Take the pmf of the binomial distribution and plug in \\(n=1\\), check that it is in fact equivalent to a Bernoulli distribution. In our examples we show the graph of a binomial distribution over 10 trials with \\(p=0.8\\). If we take a look at the graph, it appears as though the probabilities of getting 0,1,2 or 3 heads in 10 flips are zero. Is it actually zero? Check by plugging in the values into the pmf. Solution. The pmf of a binomial distribution is \\(\\binom{n}{k} p^k (1 - p)^{n - k}\\), now we insert \\(n=1\\) to get: \\[\\binom{1}{k} p^k (1 - p)^{1 - k}\\]. Not quite equivalent to a Bernoulli, however note that the support of the binomial distribution is defined as \\(k \\in \\{0,1,\\dots,n\\}\\), so in our case \\(k = \\{0,1\\}\\), then: \\[\\binom{1}{0} = \\binom{1}{1} = 1\\] we get: \\(p^k (1 - p)^{1 - k}\\) ,the Bernoulli distribution. As we already know \\(p=0.8, n=10\\), so: \\[\\binom{10}{0} 0.8^0 (1 - 0.8)^{10 - 0} = 1.024 \\cdot 10^{-7}\\] \\[\\binom{10}{1} 0.8^1 (1 - 0.8)^{10 - 1} = 4.096 \\cdot 10^{-6}\\] \\[\\binom{10}{2} 0.8^2 (1 - 0.8)^{10 - 2} = 7.3728 \\cdot 10^{-5}\\] \\[\\binom{10}{3} 0.8^3 (1 - 0.8)^{10 - 3} = 7.86432\\cdot 10^{-4}\\] So the probabilities are not zero, just very small. Exercise 18.3 (Poisson intuition 1) Below are shown 3 different graphs of the Poisson distribution. Your task is to replicate them on your own in R by varying the \\(\\lambda\\) parameter. Hint: You can use dpois() to get the probabilities. library(ggplot2) library(gridExtra) x = 0:15 # Create Poisson data data1 <- data.frame(x = x, y = dpois(x, lambda = 0.1)) data2 <- data.frame(x = x, y = dpois(x, lambda = 1)) data3 <- data.frame(x = x, y = dpois(x, lambda = 7.5)) # Create individual ggplot objects plot1 <- ggplot(data1, aes(x, y)) + geom_col() + xlab("x") + ylab("Probability") + ylim(0,1) plot2 <- ggplot(data2, aes(x, y)) + geom_col() + xlab("x") + ylab(NULL) + ylim(0,1) plot3 <- ggplot(data3, aes(x, y)) + geom_col() + xlab("x") + ylab(NULL) + ylim(0,1) # Combine the plots grid.arrange(plot1, plot2, plot3, ncol = 3) Exercise 18.4 (Poisson intuition 2) The Poisson distribution is a discrete probability distribution that models the probability of a given number of events occuring within processes where events occur at a constant mean rate and independently of each other - a Poisson process. It has a single parameter \\(\\lambda\\), which represents the constant mean rate. 
A classic example of a scenario that can be modeled using the Poisson distribution is the number of calls received by a call center in a day (or in fact any other time interval). Suppose you work in a call center and have some understanding of probability distributions. You overhear your supervisor mentioning that the call center receives an average of 2.5 calls per day. Using your knowledge of the Poisson distribution, calculate: The probability you will get no calls today. The probability you will get more than 5 calls today. Solution. First recall the Poisson pmf: \\[p(k) = \\frac{\\lambda^k e^{-\\lambda}}{k!}\\] As stated previously, our parameter is \\(\\lambda = 2.5\\). To get the probability of no calls we simply plug in \\(k = 0\\), so: \\[p(0) = \\frac{2.5^0 e^{-2.5}}{0!} = e^{-2.5} \\approx 0.082\\] The support of the Poisson distribution is the non-negative integers. So if we wanted to calculate the probability of getting more than 5 calls directly, we would need to add up the probabilities of getting 6 calls, 7 calls, and so on up to infinity. Let us instead remember that the sum of all probabilities is 1; we can reverse the problem and ask “What is the probability we get 5 calls or fewer?”. We can subtract the probability of this opposite outcome (the complement) from 1 to get the answer to our original question. \\[P(k > 5) = 1 - P(k \\leq 5)\\] \\[P(k \\leq 5) = \\sum_{i=0}^{5} p(i) = p(0) + p(1) + p(2) + p(3) + p(4) + p(5) =\\] \\[= \\frac{2.5^0 e^{-2.5}}{0!} + \\frac{2.5^1 e^{-2.5}}{1!} + \\dots =\\] \\[=0.957979\\] So the probability of getting more than 5 calls is \\(1 - 0.957979 = 0.042021\\). Exercise 18.5 (Geometric intuition 1) The geometric distribution is a discrete distribution that models the number of failures before the first success in a sequence of independent Bernoulli trials. It has a single parameter \\(p\\), representing the probability of success. NOTE: There are two forms of this distribution, the one we just described and another that models the number of trials before the first success. The difference is subtle yet significant and you are likely to encounter both forms. In the graph below we show the pmf of a geometric distribution with \\(p=0.5\\). This can be thought of as the number of successive failures (tails) in the flips of a fair coin. You can see that there’s a 50% chance you will have zero failures, i.e. you will flip heads on your very first attempt. But there is some smaller chance that you will flip a sequence of tails in a row, with longer sequences having ever lower probability. Create an equivalent graph that represents the probability of rolling a 6 with a fair 6-sided die. Use the formula for the mean of the geometric distribution and determine the average number of failures before you roll a 6. Look up the alternative form of the geometric distribution and again use the formula for the mean to determine the average number of trials before you roll a 6. Solution. The parameter \\(p\\) (the probability of success) for rolling a 6 is \\(p=\\frac{1}{6}\\).
library(ggplot2) # Parameters p <- 1/6 x_vals <- 0:9 # Starting from 0 probs <- dgeom(x_vals, p) # Data data <- data.frame(x_vals, probs) # Plot ggplot(data, aes(x=x_vals, y=probs)) + geom_segment(aes(xend=x_vals, yend=0), color="black", size=1) + geom_point(color="red", size=2) + labs(x = "Number of trials", y = "Probability") + theme_minimal() + scale_x_continuous(breaks = x_vals) # This line ensures integer x-axis labels ::: {.solution} b) The expected value of a random variable (the mean) is denoted as \\(E[X]\\). \\[E[X] = \\frac{1-p}{p}= \\frac{1- \\frac{1}{6}}{\\frac{1}{6}} = \\frac{5}{6}\\cdot 6 = 5\\] On average we will fail 5 times before we roll our first 6. The alternative form of this distribution (with support on all positive integers) has a slightly different formula for the mean. This change reflects the difference in the way we posed our question: \\[E[X] = \\frac{1}{p} = \\frac{1}{\\frac{1}{6}} = 6\\] On average we will have to throw the die 6 times before we roll a 6. ::: 18.2 Continuous distributions Exercise 18.6 (Uniform intuition 1) The need for a randomness is a common problem. A practical solution are so-called random number generators (RNGs). The simplest RNG one would think of is choosing a set of numbers and having the generator return a number at random, where the probability of returning any number from this set is the same. If this set is an interval of real numbers, then we’ve basically described the continuous uniform distribution. It has two parameters \\(a\\) and \\(b\\), which define the beginning and end of its support respectively. Let’s think of the mean intuitively. The expected value or mean of a distribution is the pivot point on our x-axis, which “balances” the graph. Given parameters \\(a\\) and \\(b\\) what is your intuitive guess of the mean for this distribution? A special case of the uniform distribution is the standard uniform distribution with \\(a=0\\) and \\(b=1\\). Write the pdf \\(f(x)\\) of this particular distribution. Solution. It’s the midpoint between \\(a\\) and \\(b\\), so \\(\\frac{a+b}{2}\\) Inserting the parameter values we get:\\[f(x) = \\begin{cases} 1 & \\text{if } 0 \\leq x \\leq 1 \\\\ 0 & \\text{otherwise} \\end{cases} \\] Notice how the pdf is just a constant \\(1\\) across all values of \\(x \\in [0,1]\\). Here it is important to distinguish between probability and probability density. The density may be 1, but the probability is not and while discreet distributions never exceed 1 on the y-axis, continuous distributions can go as high as you like. Exercise 18.7 (Normal intuition 1) The normal distribution, also known as the Gaussian distribution, is a continuous distribution that encompasses the entire real number line. It has two parameters: the mean, denoted by \\(\\mu\\), and the variance, represented by \\(\\sigma^2\\). Its shape resembles the iconic bell curve. The position of its peak is determined by the parameter \\(\\mu\\), while the variance determines the spread or width of the curve. A smaller variance results in a sharper, narrower peak, while a larger variance leads to a broader, more spread-out curve. Below, we graph the distribution of IQ scores for two different populations. We aim to identify individuals with an IQ at or above 140 for an experiment. We can identify them reliably; however, we only have time to examine one of the two groups. Which group should we investigate to have the best chance? 
NOTE: The graph below displays the parameter \\(\\sigma\\), which is the square root of the variance, more commonly referred to as the standard deviation. Keep this in mind when solving the problems. Insert the values of either population into the pdf of a normal distribution and determine which one has a higher density at \\(x=140\\). Generate the graph yourself and zoom into the relevant area to graphically verify your answer. To determine probability density, we can use the pdf. However, if we wish to know the proportion of the population that falls within certain parameters, we would need to integrate the pdf. Fortunately, the integrals of common distributions are well-established. This integral gives us the cumulative distribution function \\(F(x)\\) (CDF). BONUS: Look up the CDF of the normal distribution and input the appropriate values to determine the percentage of each population that comprises individuals with an IQ of 140 or higher. Solution. Group 1: \\(\\mu = 100, \\sigma=10 \\rightarrow \\sigma^2 = 100\\) \\[\\frac{1}{\\sqrt{2 \\pi \\sigma^2}} e^{-\\frac{(x - \\mu)^2}{2 \\sigma^2}} = \\frac{1}{\\sqrt{2 \\pi 100}} e^{-\\frac{(140 - 100)^2}{2 \\cdot 100}} \\approx 1.34 \\cdot 10^{-5}\\] Group 2: \\(\\mu = 105, \\sigma=8 \\rightarrow \\sigma^2 = 64\\) \\[\\frac{1}{\\sqrt{2 \\pi \\sigma^2}} e^{-\\frac{(x - \\mu)^2}{2 \\sigma^2}} = \\frac{1}{\\sqrt{2 \\pi 64}} e^{-\\frac{(140 - 105)^2}{2 \\cdot 64}} \\approx 3.48 \\cdot 10^{-6}\\] So despite the fact that group 1 has a lower average IQ, we are more likely to find individuals with an IQ of 140 in group 2. library(ggplot2) library(tidyr) # Create data x <- seq(135, 145, by = 0.01) # Adjusting the x range to account for the larger standard deviations df <- data.frame(x = x) # Define the IQ distributions df$IQ_mu100_sd10 <- dnorm(df$x, mean = 100, sd = 10) df$IQ_mu105_sd8 <- dnorm(df$x, mean = 105, sd = 8) # Convert from wide to long format for ggplot2 df_long <- gather(df, distribution, density, -x) # Ensure the levels of the 'distribution' factor match our desired order df_long$distribution <- factor(df_long$distribution, levels = c("IQ_mu100_sd10", "IQ_mu105_sd8")) # Plot ggplot(df_long, aes(x = x, y = density, color = distribution)) + geom_line() + labs(x = "IQ Score", y = "Density") + scale_color_manual( name = "IQ Distribution", values = c(IQ_mu100_sd10 = "red", IQ_mu105_sd8 = "blue"), labels = c("Group 1 (µ=100, σ=10)", "Group 2 (µ=105, σ=8)") ) + theme_minimal() Solution. c. The CDF of the normal distribution is \\(\\Phi(x) = \\frac{1}{2} \\left[ 1 + \\text{erf} \\left( \\frac{x - \\mu}{\\sigma \\sqrt{2}} \\right) \\right]\\). The CDF is defined as the integral of the distribution density up to \\(x\\). So to get the total percentage of individuals with an IQ of 140 or higher we need to subtract this value from 1. Group 1: \\[1 - \\Phi(140) = \\frac{1}{2} \\left[ 1 - \\text{erf} \\left( \\frac{140 - 100}{10 \\sqrt{2}} \\right) \\right] \\approx 3.17 \\cdot 10^{-5} \\] Group 2: \\[1 - \\Phi(140) = \\frac{1}{2} \\left[ 1 - \\text{erf} \\left( \\frac{140 - 105}{8 \\sqrt{2}} \\right) \\right] \\approx 6.07 \\cdot 10^{-6} \\] So roughly 0.003% and 0.0006% of individuals in groups 1 and 2 respectively have an IQ at or above 140. Exercise 18.8 (Beta intuition 1) The beta distribution is a continuous distribution defined on the unit interval \\([0,1]\\). It has two strictly positive parameters \\(\\alpha\\) and \\(\\beta\\), which determine its shape. Its support makes it especially suitable to model distributions of percentages and proportions.
Below you’ve been provided with some code that you can copy into Rstudio. Once you run the code, an interactive Shiny app will appear and you will be able to manipulate the graph of the beta distribution. Play around with the parameters to get: A straight line from (0,0) to (1,2) A straight line from (0,2) to (1,0) A symmetric bell curve A bowl-shaped curve The standard uniform distribution is actually a special case of the beta distribution. Find the exact parameters \\(\\alpha\\) and \\(\\beta\\). Once you do, prove the equality by inserting the values into our pdf. Hint: The beta function is evaluated as \\(\\text{B}(a,b) = \\frac{\\Gamma(a)\\Gamma(b)}{\\Gamma(a+b)}\\), the gamma function for positive integers \\(n\\) is evaluated as \\(\\Gamma(n)= (n-1)!\\) # Install and load necessary packages install.packages(c("shiny", "ggplot2")) library(shiny) library(ggplot2) # The Shiny App ui <- fluidPage( titlePanel("Beta Distribution Viewer"), sidebarLayout( sidebarPanel( sliderInput("alpha", "Alpha:", min = 0.1, max = 10, value = 2, step = 0.1), sliderInput("beta", "Beta:", min = 0.1, max = 10, value = 2, step = 0.1) ), mainPanel( plotOutput("betaPlot") ) ) ) server <- function(input, output) { output$betaPlot <- renderPlot({ x <- seq(0, 1, by = 0.01) y <- dbeta(x, shape1 = input$alpha, shape2 = input$beta) ggplot(data.frame(x = x, y = y), aes(x = x, y = y)) + geom_line() + labs(x = "Value", y = "Density") + theme_minimal() }) } shinyApp(ui = ui, server = server) Solution. \\(\\alpha = 2, \\beta=1\\) \\(\\alpha = 1, \\beta=2\\) Possible solution \\(\\alpha = \\beta= 5\\) Possible solution \\(\\alpha = \\beta= 0.5\\) The correct parameters are \\(\\alpha = 1, \\beta=1\\), to prove the equality we insert them into the beta pdf: \\[\\frac{x^{\\alpha - 1} (1 - x)^{\\beta - 1}}{\\text{B}(\\alpha, \\beta)} = \\frac{x^{1 - 1} (1 - x)^{1 - 1}}{\\text{B}(1, 1)} = \\frac{1}{\\frac{\\Gamma(1)\\Gamma(1)}{\\Gamma(1+1)}}= \\frac{1}{\\frac{(1-1)!(1-1)!}{(2-1)!}} = 1\\] Exercise 18.9 (Exponential intuition 1) The exponential distribution represents the distributon of time between events in a Poisson process. It is the continuous analogue of the geometric distribution. It has a single parameter \\(\\lambda\\), which is strictly positive and represents the constant rate of the corresponding Poisson process. The support is all positive reals, since time between events is non-negative, but not bound upwards. Let’s revisit the call center from our Poisson problem. We get 2.5 calls per day on average, this is our rate parameter \\(\\lambda\\). A work day is 8 hours. What is the mean time between phone calls? The cdf \\(F(x)\\) tells us what percentage of calls occur within x amount of time of each other. You want to take an hour long lunch break but are worried about missing calls. Calculate the percentage of calls you are likely to miss if you’re gone for an hour. Hint: The cdf is \\(F(x) = \\int_{-\\infty}^{x} f(x) dx\\) Solution. 
Taking \\(\\lambda = \\frac{2.5 \\text{ calls}}{8 \\text{ hours}} = \\frac{1 \\text{ call}}{3.2 \\text{ hours}}\\) \\[E[X] = \\frac{1}{\\lambda} = \\frac{3.2 \\text{ hours}}{\\text{call}}\\] First we derive the CDF, we can integrate from 0 instead of \\(-\\infty\\), since we have no support in the negatives: \\[\\begin{align} F(x) &= \\int_{0}^{x} \\lambda e^{-\\lambda t} dt \\\\ &= \\lambda \\int_{0}^{x} e^{-\\lambda t} dt \\\\ &= \\lambda (\\frac{1}{-\\lambda}e^{-\\lambda t} |_{0}^{x}) \\\\ &= \\lambda(\\frac{1}{\\lambda} - \\frac{1}{\\lambda} e^{-\\lambda x}) \\\\ &= 1 - e^{-\\lambda x}. \\end{align}\\] Then we just evaluate it for a time of 1 hour: \\[F(1 \\text{ hour}) = 1 - e^{-\\frac{1 \\text{ call}}{3.2 \\text{ hours}} \\cdot 1 \\text{ hour}}= 1 - e^{-\\frac{1 \\text{ call}}{3.2 \\text{ hours}}} \\approx 0.268\\] So we have about a 27% chance of missing a call if we’re gone for an hour. Exercise 18.10 (Gamma intuition 1) The gamma distribution is a continuous distribution characterized by two parameters, \\(\\alpha\\) and \\(\\beta\\), both greater than 0. These parameters afford the distribution a broad range of shapes, leading to it being commonly referred to as a family of distributions. Given its support over the positive real numbers, it is well suited for modeling a diverse range of positive-valued phenomena. The exponential distribution is actually just a particular form of the gamma distribution. What are the values of \\(\\alpha\\) and \\(\\beta\\)? Copy the code from our beta distribution Shiny app and modify it to simulate the gamma distribution. Then get it to show the exponential. Solution. Let’s start by taking a look at the pdfs of the two distributions side by side: \\[\\frac{\\beta^\\alpha}{\\Gamma(\\alpha)} x^{\\alpha - 1}e^{-\\beta x} = \\lambda e^{-\\lambda x}\\] The \\(x^{\\alpha - 1}\\) term is not found anywhere in the pdf of the exponential so we need to eliminate it by setting \\(\\alpha = 1\\). This also makes the fraction evaluate to \\(\\frac{\\beta^1}{\\Gamma(1)} = \\beta\\), which leaves us with \\[\\beta \\cdot e^{-\\beta x}\\] Now we can see that \\(\\beta = \\lambda\\) and \\(\\alpha = 1\\). # Install and load necessary packages install.packages(c("shiny", "ggplot2")) library(shiny) library(ggplot2) # The Shiny App ui <- fluidPage( titlePanel("Gamma Distribution Viewer"), sidebarLayout( sidebarPanel( sliderInput("shape", "Shape (α):", min = 0.1, max = 10, value = 2, step = 0.1), sliderInput("scale", "Scale (β):", min = 0.1, max = 10, value = 2, step = 0.1) ), mainPanel( plotOutput("gammaPlot") ) ) ) server <- function(input, output) { output$gammaPlot <- renderPlot({ x <- seq(0, 25, by = 0.1) y <- dgamma(x, shape = input$shape, scale = input$scale) ggplot(data.frame(x = x, y = y), aes(x = x, y = y)) + geom_line() + labs(x = "Value", y = "Density") + theme_minimal() }) } shinyApp(ui = ui, server = server) "],["A1.html", "A R programming language A.1 Basic characteristics A.2 Why R? A.3 Setting up A.4 R basics A.5 Functions A.6 Other tips A.7 Further reading and references", " A R programming language A.1 Basic characteristics R is free software for statistical computing and graphics. It is widely used by statisticians, scientists, and other professionals for software development and data analysis. It is an interpreted language and therefore the programs do not need compilation. A.2 Why R? R is one of the main two languages used for statistics and machine learning (the other being Python). Pros Libraries. 
Comprehensive collection of statistical and machine learning packages. Easy to code. Open source. Anyone can access R and develop new methods. Additionally, it is relatively simple to get the source code of established methods. Large community. The use of R has been rising for some time, in industry and academia. Therefore, a large collection of blogs and tutorials exists, along with people offering help on pages like StackExchange and CrossValidated. Integration with other languages and LaTeX. New methods. Many researchers develop R packages based on their research, so new methods are available soon after development. Cons Slow. Programs run slower than in other programming languages; however, this can be somewhat mitigated by effective coding or integration with other languages. Memory intensive. This can become a problem with large data sets, as they need to be stored in memory, along with all the information the models produce. Some packages are not as good as they should be, or have poor documentation. Object-oriented programming in R can be very confusing and complex. A.3 Setting up https://www.r-project.org/. A.3.1 RStudio RStudio is the most widely used IDE for R. It is free and can be downloaded from https://rstudio.com/. While console R is sufficient for the requirements of this course, we recommend that students install RStudio for its better user interface. A.3.2 Libraries for data science Listed below are some of the more useful libraries (packages) for data science. Students are also encouraged to find other useful packages. dplyr Efficient data manipulation. Part of the wider package collection called tidyverse. ggplot2 Plotting based on the grammar of graphics. stats Several statistical models. rstan Bayesian inference using Hamiltonian Monte Carlo. Very flexible model building. MCMCpack Bayesian inference. rmarkdown, knitr, and bookdown Dynamic reports (such as this one). devtools Package development.
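As a small illustration of how these libraries are used in practice, a package is installed once with install.packages() and then loaded at the start of each script with library(). The sketch below uses dplyr and ggplot2 purely as examples; any package from the list above is handled the same way.
install.packages(c("dplyr", "ggplot2"))  # one-time installation
library(dplyr)             # load for the current session
library(ggplot2)
packageVersion("ggplot2")  # check which version is loaded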
A.4 R basics A.4.1 Variables and types Important information and tips: no type declaration define variables with <- instead of = (although both work, there is a slight difference, additionally most of the packages use the arrow) for strings use \"\" for comments use # change types with as.type() functions no special type for single character like C++ for example n <- 20 x <- 2.7 m <- n # m gets value 20 my_flag <- TRUE student_name <- "Luka" typeof(n) ## [1] "double" typeof(student_name) ## [1] "character" typeof(my_flag) ## [1] "logical" typeof(as.integer(n)) ## [1] "integer" typeof(as.character(n)) ## [1] "character" A.4.2 Basic operations n + x ## [1] 22.7 n - x ## [1] 17.3 diff <- n - x # variable diff gets the difference between n and x diff ## [1] 17.3 n * x ## [1] 54 n / x ## [1] 7.407407 x^2 ## [1] 7.29 sqrt(x) ## [1] 1.643168 n > 2 * n ## [1] FALSE n == n ## [1] TRUE n == 2 * n ## [1] FALSE n != n ## [1] FALSE paste(student_name, "is", n, "years old") ## [1] "Luka is 20 years old" A.4.3 Vectors use c() to combine elements into vectors can only contain one type of variable if different types are provided, all are transformed to the most basic type in the vector access elements by indexes or logical vectors of the same length a scalar value is regarded as a vector of length 1 1:4 # creates a vector of integers from 1 to 4 ## [1] 1 2 3 4 student_ages <- c(20, 23, 21) student_names <- c("Luke", "Jen", "Mike") passed <- c(TRUE, TRUE, FALSE) length(student_ages) ## [1] 3 # access by index student_ages[2] ## [1] 23 student_ages[1:2] ## [1] 20 23 student_ages[2] <- 24 # change values # access by logical vectors student_ages[passed == TRUE] # same as student_ages[passed] ## [1] 20 24 student_ages[student_names %in% c("Luke", "Mike")] ## [1] 20 21 student_names[student_ages > 20] ## [1] "Jen" "Mike" A.4.3.1 Operations with vectors most operations are element-wise if we operate on vectors of different lengths, the shorter vector periodically repeats its elements until it reaches the length of the longer one a <- c(1, 3, 5) b <- c(2, 2, 1) d <- c(6, 7) a + b ## [1] 3 5 6 a * b ## [1] 2 6 5 a + d ## Warning in a + d: longer object length is not a multiple of shorter object ## length ## [1] 7 10 11 a + 2 * b ## [1] 5 7 7 a > b ## [1] FALSE TRUE TRUE b == a ## [1] FALSE FALSE FALSE a %*% b # vector multiplication, not element-wise ## [,1] ## [1,] 13 A.4.4 Factors vectors of finite predetermined classes suitable for categorical variables ordinal (ordered) or nominal (unordered) car_brand <- factor(c("Audi", "BMW", "Mercedes", "BMW"), ordered = FALSE) car_brand ## [1] Audi BMW Mercedes BMW ## Levels: Audi BMW Mercedes freq <- factor(x = NA, levels = c("never","rarely","sometimes","often","always"), ordered = TRUE) freq[1:3] <- c("rarely", "sometimes", "rarely") freq ## [1] rarely sometimes rarely ## Levels: never < rarely < sometimes < often < always freq[4] <- "quite_often" # non-existing level, returns NA ## Warning in `[<-.factor`(`*tmp*`, 4, value = "quite_often"): invalid factor ## level, NA generated freq ## [1] rarely sometimes rarely <NA> ## Levels: never < rarely < sometimes < often < always A.4.5 Matrices two-dimensional generalizations of vectors my_matrix <- matrix(c(1, 2, 1, 5, 4, 2), nrow = 2, byrow = TRUE) my_matrix ## [,1] [,2] [,3] ## [1,] 1 2 1 ## [2,] 5 4 2 my_square_matrix <- matrix(c(1, 3, 2, 3), nrow = 2) my_square_matrix ## [,1] [,2] ## [1,] 1 2 ## [2,] 3 3 my_matrix[1,2] # first row, second column ## [1] 2 my_matrix[2, ] # second row ## [1] 5 4 2 my_matrix[ ,3] # third 
column ## [1] 1 2 A.4.5.1 Matrix functions and operations most operation element-wise mind the dimensions when using matrix multiplication %*% nrow(my_matrix) # number of matrix rows ## [1] 2 ncol(my_matrix) # number of matrix columns ## [1] 3 dim(my_matrix) # matrix dimension ## [1] 2 3 t(my_matrix) # transpose ## [,1] [,2] ## [1,] 1 5 ## [2,] 2 4 ## [3,] 1 2 diag(my_matrix) # the diagonal of the matrix as vector ## [1] 1 4 diag(1, nrow = 3) # creates a diagonal matrix ## [,1] [,2] [,3] ## [1,] 1 0 0 ## [2,] 0 1 0 ## [3,] 0 0 1 det(my_square_matrix) # matrix determinant ## [1] -3 my_matrix + 2 * my_matrix ## [,1] [,2] [,3] ## [1,] 3 6 3 ## [2,] 15 12 6 my_matrix * my_matrix # element-wise multiplication ## [,1] [,2] [,3] ## [1,] 1 4 1 ## [2,] 25 16 4 my_matrix %*% t(my_matrix) # matrix multiplication ## [,1] [,2] ## [1,] 6 15 ## [2,] 15 45 my_vec <- as.vector(my_matrix) # transform to vector my_vec ## [1] 1 5 2 4 1 2 A.4.6 Arrays multi-dimensional generalizations of matrices my_array <- array(c(1, 2, 3, 4, 5, 6, 7, 8), dim = c(2, 2, 2)) my_array[1, 1, 1] ## [1] 1 my_array[2, 2, 1] ## [1] 4 my_array[1, , ] ## [,1] [,2] ## [1,] 1 5 ## [2,] 3 7 dim(my_array) ## [1] 2 2 2 A.4.7 Data frames basic data structure for analysis differ from matrices as columns can be of different types student_data <- data.frame("Name" = student_names, "Age" = student_ages, "Pass" = passed) student_data ## Name Age Pass ## 1 Luke 20 TRUE ## 2 Jen 24 TRUE ## 3 Mike 21 FALSE colnames(student_data) <- c("name", "age", "pass") # change column names student_data[1, ] ## name age pass ## 1 Luke 20 TRUE student_data[ ,colnames(student_data) %in% c("name", "pass")] ## name pass ## 1 Luke TRUE ## 2 Jen TRUE ## 3 Mike FALSE student_data$pass # access column by name ## [1] TRUE TRUE FALSE student_data[student_data$pass == TRUE, ] ## name age pass ## 1 Luke 20 TRUE ## 2 Jen 24 TRUE A.4.8 Lists useful for storing different data structures access elements with double square brackets elements can be named first_list <- list(student_ages, my_matrix, student_data) second_list <- list(student_ages, my_matrix, student_data, first_list) first_list[[1]] ## [1] 20 24 21 second_list[[4]] ## [[1]] ## [1] 20 24 21 ## ## [[2]] ## [,1] [,2] [,3] ## [1,] 1 2 1 ## [2,] 5 4 2 ## ## [[3]] ## name age pass ## 1 Luke 20 TRUE ## 2 Jen 24 TRUE ## 3 Mike 21 FALSE second_list[[4]][[1]] # first element of the fourth element of second_list ## [1] 20 24 21 length(second_list) ## [1] 4 second_list[[length(second_list) + 1]] <- "add_me" # append an element names(first_list) <- c("Age", "Matrix", "Data") first_list$Age ## [1] 20 24 21 A.4.9 Loops mostly for loop for loop can iterate over an arbitrary vector # iterate over consecutive natural numbers my_sum <- 0 for (i in 1:10) { my_sum <- my_sum + i } my_sum ## [1] 55 # iterate over an arbirary vector my_sum <- 0 some_numbers <- c(2, 3.5, 6, 100) for (i in some_numbers) { my_sum <- my_sum + i } my_sum ## [1] 111.5 A.5 Functions for help use ?function_name A.5.1 Writing functions We can write our own functions with function(). In the brackets, we define the parameters the function gets, and in curly brackets we define what the function does. We use return() to return values. sum_first_n_elements <- function (n) { my_sum <- 0 for (i in 1:n) { my_sum <- my_sum + i } return (my_sum) } sum_first_n_elements(10) ## [1] 55 A.6 Other tips Use set.seed(arbitrary_number) at the beginning of a script to set the seed and ensure replication. 
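As a minimal illustration of the point about set.seed(), resetting the same seed reproduces exactly the same pseudo-random draws (the seed value 42 below is arbitrary):
set.seed(42)
x1 <- rnorm(3)
set.seed(42)       # reset to the same seed
x2 <- rnorm(3)
identical(x1, x2)  # TRUE: identical draws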
To dynamically set the working directory in R Studio to the parent folder of a R script use setwd(dirname(rstudioapi::getSourceEditorContext()$path)). To avoid slow R loops use the apply family of functions. See ?apply and ?lapply. To make your data manipulation (and therefore your life) a whole lot easier, use the dplyr package. Use getAnywhere(function_name) to get the source code of any function. Use browser for debugging. See ?browser. A.7 Further reading and references Getting started with R Studio: https://www.youtube.com/watch?v=lVKMsaWju8w Official R manuals: https://cran.r-project.org/manuals.html Cheatsheets: https://www.rstudio.com/resources/cheatsheets/ Workshop on R, dplyr, ggplot2, and R Markdown: https://github.com/bstatcomp/Rworkshop "],["distributions.html", "B Probability distributions", " B Probability distributions Name parameters support pdf/pmf mean variance Bernoulli \\(p \\in [0,1]\\) \\(k \\in \\{0,1\\}\\) \\(p^k (1 - p)^{1 - k}\\) 1.12 \\(p\\) 7.1 \\(p(1-p)\\) 7.1 binomial \\(n \\in \\mathbb{N}\\), \\(p \\in [0,1]\\) \\(k \\in \\{0,1,\\dots,n\\}\\) \\(\\binom{n}{k} p^k (1 - p)^{n - k}\\) 4.4 \\(np\\) 7.2 \\(np(1-p)\\) 7.2 Poisson \\(\\lambda > 0\\) \\(k \\in \\mathbb{N}_0\\) \\(\\frac{\\lambda^k e^{-\\lambda}}{k!}\\) 4.6 \\(\\lambda\\) 7.3 \\(\\lambda\\) 7.3 geometric \\(p \\in (0,1]\\) \\(k \\in \\mathbb{N}_0\\) \\(p(1-p)^k\\) 4.5 \\(\\frac{1 - p}{p}\\) 7.4 \\(\\frac{1 - p}{p^2}\\) 9.3 normal \\(\\mu \\in \\mathbb{R}\\), \\(\\sigma^2 > 0\\) \\(x \\in \\mathbb{R}\\) \\(\\frac{1}{\\sqrt{2 \\pi \\sigma^2}} e^{-\\frac{(x - \\mu)^2}{2 \\sigma^2}}\\) 4.12 \\(\\mu\\) 7.8 \\(\\sigma^2\\) 7.8 uniform \\(a,b \\in \\mathbb{R}\\), \\(a < b\\) \\(x \\in [a,b]\\) \\(\\frac{1}{b-a}\\) 4.9 \\(\\frac{a+b}{2}\\) \\(\\frac{(b-a)^2}{12}\\) beta \\(\\alpha,\\beta > 0\\) \\(x \\in [0,1]\\) \\(\\frac{x^{\\alpha - 1} (1 - x)^{\\beta - 1}}{\\text{B}(\\alpha, \\beta)}\\) 4.10 \\(\\frac{\\alpha}{\\alpha + \\beta}\\) 7.6 \\(\\frac{\\alpha \\beta}{(\\alpha + \\beta)^2(\\alpha + \\beta + 1)}\\) 7.6 gamma \\(\\alpha,\\beta > 0\\) \\(x \\in (0, \\infty)\\) \\(\\frac{\\beta^\\alpha}{\\Gamma(\\alpha)} x^{\\alpha - 1}e^{-\\beta x}\\) 4.11 \\(\\frac{\\alpha}{\\beta}\\) 7.5 \\(\\frac{\\alpha}{\\beta^2}\\) 7.5 exponential \\(\\lambda > 0\\) \\(x \\in [0, \\infty)\\) \\(\\lambda e^{-\\lambda x}\\) 4.8 \\(\\frac{1}{\\lambda}\\) 7.7 \\(\\frac{1}{\\lambda^2}\\) 7.7 logistic \\(\\mu \\in \\mathbb{R}\\), \\(s > 0\\) \\(x \\in \\mathbb{R}\\) \\(\\frac{e^{-\\frac{x - \\mu}{s}}}{s(1 + e^{-\\frac{x - \\mu}{s}})^2}\\) 4.13 \\(\\mu\\) \\(\\frac{s^2 \\pi^2}{3}\\) negative binomial \\(r \\in \\mathbb{N}\\), \\(p \\in [0,1]\\) \\(k \\in \\mathbb{N}_0\\) \\(\\binom{k + r - 1}{k}(1-p)^r p^k\\) 4.7 \\(\\frac{rp}{1 - p}\\) 9.2 \\(\\frac{rp}{(1 - p)^2}\\) 9.2 multinomial \\(n \\in \\mathbb{N}\\), \\(k \\in \\mathbb{N}\\) \\(p_i \\in [0,1]\\), \\(\\sum p_i = 1\\) \\(x_i \\in \\{0,..., n\\}\\), \\(i \\in \\{1,...,k\\}\\), \\(\\sum{x_i} = n\\) \\(\\frac{n!}{x_1!x_2!...x_k!} p_1^{x_1} p_2^{x_2}...p_k^{x_k}\\) 8.1 \\(np_i\\) \\(np_i(1-p_i)\\) "],["references.html", "References", " References "],["404.html", "Page not found", " Page not found The page you requested cannot be found (perhaps it was moved or renamed). You may want to try searching to find the page's new location, or use the table of contents to find the page you are looking for. 
"]] +[["index.html", "Principles of Uncertainty – exercises Preface", " Principles of Uncertainty – exercises Gregor Pirš and Erik Štrumbelj 2023-10-01 Preface These are the exercises for the Principles of Uncertainty course of the Data Science Master’s at University of Ljubljana, Faculty of Computer and Information Science. This document will be extended each week as the course progresses. At the end of each exercise session, we will post the solutions to the exercises worked in class and select exercises for homework. Students are also encouraged to solve the remaining exercises to further extend their knowledge. Some exercises require the use of R. Those exercises (or parts of) are coloured blue. Students that are not familiar with R programming language should study A to learn the basics. As the course progresses, we will cover more relevant uses of R for data science. "],["introduction.html", "Chapter 1 Probability spaces 1.1 Measure and probability spaces 1.2 Properties of probability measures 1.3 Discrete probability spaces", " Chapter 1 Probability spaces This chapter deals with measures and probability spaces. At the end of the chapter, we look more closely at discrete probability spaces. The students are expected to acquire the following knowledge: Theoretical Use properties of probability to calculate probabilities. Combinatorics. Understanding of continuity of probability. R Vectors and vector operations. For loop. Estimating probability with simulation. sample function. Matrices and matrix operations. .fold-btn { float: right; margin: 5px 5px 0 0; } .fold { border: 1px solid black; min-height: 40px; } 1.1 Measure and probability spaces Exercise 1.1 (Completing a set to a sigma algebra) Let \\(\\Omega = \\{1,2,...,10\\}\\) and let \\(A = \\{\\emptyset, \\{1\\}, \\{2\\}, \\Omega \\}\\). Show that \\(A\\) is not a sigma algebra of \\(\\Omega\\). Find the minimum number of elements to complete A to a sigma algebra of \\(\\Omega\\). Solution. \\(1^c = \\{2,3,...,10\\} \\notin A \\implies\\) \\(A\\) is not sigma algebra. First we need the complements of all elements, so we need to add sets \\(\\{2,3,...,10\\}\\) and \\(\\{1,3,4,...,10\\}\\). Next we need unions of all sets – we add the set \\(\\{1,2\\}\\). Again we need the complement of this set, so we add \\(\\{3,4,...,10\\}\\). So the minimum number of elements we need to add is 4. Exercise 1.2 (Diversity of sigma algebras) Let \\(\\Omega\\) be a set. Find the smallest sigma algebra of \\(\\Omega\\). Find the largest sigma algebra of \\(\\Omega\\). Solution. \\(A = \\{\\emptyset, \\Omega\\}\\) \\(2^{\\Omega}\\) Exercise 1.3 Find all sigma algebras for \\(\\Omega = \\{0, 1, 2\\}\\). Solution. \\(A = \\{\\emptyset, \\Omega\\}\\) \\(A = 2^{\\Omega}\\) \\(A = \\{\\emptyset, \\{0\\}, \\{1,2\\}, \\Omega\\}\\) \\(A = \\{\\emptyset, \\{1\\}, \\{0,2\\}, \\Omega\\}\\) \\(A = \\{\\emptyset, \\{2\\}, \\{0,1\\}, \\Omega\\}\\) Exercise 1.4 (Difference between algebra and sigma algebra) Let \\(\\Omega = \\mathbb{N}\\) and \\(\\mathcal{A} = \\{A \\subseteq \\mathbb{N}: A \\text{ is finite or } A^c \\text{ is finite.} \\}\\). Show that \\(\\mathcal{A}\\) is an algebra but not a sigma algebra. Solution. \\(\\emptyset\\) is finite so \\(\\emptyset \\in \\mathcal{A}\\). Let \\(A \\in \\mathcal{A}\\) and \\(B \\in \\mathcal{A}\\). If both are finite, then their union is also finite and therefore in \\(\\mathcal{A}\\). Let at least one of them not be finite. Then their union is not finite. But \\((A \\cup B)^c = A^c \\cap B^c\\). 
And since at least one is infinite, then its complement is finite and the intersection is too. So finite unions are in \\(\\mathcal{A}\\). Let us look at numbers \\(2n\\). For any \\(n\\), \\(2n \\in \\mathcal{A}\\) as it is finite. But \\(\\bigcup_{k = 1}^{\\infty} 2n \\notin \\mathcal{A}\\). Exercise 1.5 We define \\(\\sigma(X) = \\cap_{\\lambda \\in I} S_\\lambda\\) to be a sigma algebra, generated by the set \\(X\\), where \\(S_\\lambda\\) are all sigma algebras such that \\(X \\subseteq S_\\lambda\\). \\(S_\\lambda\\) are indexed by \\(\\lambda \\in I\\). Let \\(A, B \\subseteq 2^{\\Omega}\\). Prove that \\(\\sigma(A) = \\sigma(B) \\iff A \\subseteq \\sigma(B) \\land B \\subseteq \\sigma(A)\\). Solution. To prove the equivalence, we need to prove that the left hand side implies the right hand side and vice versa. Proving \\(\\sigma(A) = \\sigma(B) \\Rightarrow A \\subseteq \\sigma(B) \\land B \\subseteq \\sigma(A)\\): we know \\(A \\subseteq \\sigma(A)\\) is always true, so by substituting in \\(\\sigma(B)\\) from the left hand side equality we obtain \\(A \\subseteq \\sigma(B)\\). We obtain \\(B \\subseteq \\sigma(A)\\) by symmetry. This proves the implication. Proving \\(A \\subseteq \\sigma(B) \\land B \\subseteq \\sigma(A) \\Rightarrow \\sigma(A) = \\sigma(B)\\): by definition of a sigma algebra, generated by a set, we have \\(\\sigma(B) = \\cap_{\\lambda \\in I} S_\\lambda\\) where \\(S_\\lambda\\) are all sigma algebras where \\(B \\subseteq S_\\lambda\\). But \\(\\sigma(A)\\) is one of \\(S_\\lambda\\), so we can write \\(\\sigma(B) = \\sigma(A) \\cap \\left(\\cap_{\\lambda \\in I} S_\\lambda \\right)\\), which implies \\(\\sigma(B) \\subseteq \\sigma(A)\\). By symmetry, we have \\(\\sigma(A) \\subseteq \\sigma(B)\\). Since \\(\\sigma(A) \\subseteq \\sigma(B)\\) and \\(\\sigma(B) \\subseteq \\sigma(A)\\), we obtain \\(\\sigma(A) = \\sigma(B)\\), which proves the implication and completes the equivalence proof. Exercise 1.6 (Intro to measure) Take the measurable space \\(\\Omega = \\{1,2\\}\\), \\(F = 2^{\\Omega}\\). Which of the following is a measure? Which is a probability measure? \\(\\mu(\\emptyset) = 0\\), \\(\\mu(\\{1\\}) = 5\\), \\(\\mu(\\{2\\}) = 6\\), \\(\\mu(\\{1,2\\}) = 11\\) \\(\\mu(\\emptyset) = 0\\), \\(\\mu(\\{1\\}) = 0\\), \\(\\mu(\\{2\\}) = 0\\), \\(\\mu(\\{1,2\\}) = 1\\) \\(\\mu(\\emptyset) = 0\\), \\(\\mu(\\{1\\}) = 0\\), \\(\\mu(\\{2\\}) = 0\\), \\(\\mu(\\{1,2\\}) = 0\\) \\(\\mu(\\emptyset) = 0\\), \\(\\mu(\\{1\\}) = 0\\), \\(\\mu(\\{2\\}) = 1\\), \\(\\mu(\\{1,2\\}) = 1\\) \\(\\mu(\\emptyset)=0\\), \\(\\mu(\\{1\\})=0\\), \\(\\mu(\\{2\\})=\\infty\\), \\(\\mu(\\{1,2\\})=\\infty\\) Solution. Measure. Not probability measure since \\(\\mu(\\Omega) > 1\\). Neither due to countable additivity. Measure. Not probability measure since \\(\\mu(\\Omega) = 0\\). Probability measure. Measure. Not probability measure since \\(\\mu(\\Omega) > 1\\). Exercise 1.7 Define a probability space that could be used to model the outcome of throwing two fair 6-sided dice. Solution. \\(\\Omega = \\{\\{i,j\\}, i = 1,...,6, j = 1,...,6\\}\\) \\(F = 2^{\\Omega}\\) \\(\\forall \\omega \\in \\Omega\\), \\(P(\\omega) = \\frac{1}{6} \\times \\frac{1}{6} = \\frac{1}{36}\\) 1.2 Properties of probability measures Exercise 1.8 A standard deck (52 cards) is distributed to two persons: 26 cards to each person. All partitions are equally likely. Find the probability that: The first person gets 4 Queens. The first person gets at least 2 Queens. R: Use simulation (sample) to check the above answers. 
Solution. \\(\\frac{\\binom{48}{22}}{\\binom{52}{26}}\\) 1 - \\(\\frac{\\binom{48}{26} + 4 \\times \\binom{48}{25}}{\\binom{52}{26}}\\) For the simulation, let us represent cards with numbers from 1 to 52, and let 1 through 4 represent Queens. set.seed(1) cards <- 1:52 n <- 10000 q4 <- vector(mode = "logical", length = n) q2 <- vector(mode = "logical", length = n) tmp <- vector(mode = "logical", length = n) for (i in 1:n) { p1 <- sample(1:52, 26) q4[i] <- sum(1:4 %in% p1) == 4 q2[i] <- sum(1:4 %in% p1) >= 2 } sum(q4) / n ## [1] 0.0572 sum(q2) / n ## [1] 0.6894 Exercise 1.9 Let \\(A\\) and \\(B\\) be events with probabilities \\(P(A) = \\frac{2}{3}\\) and \\(P(B) = \\frac{1}{2}\\). Show that \\(\\frac{1}{6} \\leq P(A\\cap B) \\leq \\frac{1}{2}\\), and give examples to show that both extremes are possible. Find corresponding bounds for \\(P(A\\cup B)\\). R: Draw samples from the examples and show the probability bounds of \\(P(A \\cap B)\\) . Solution. From the properties of probability we have \\[\\begin{equation} P(A \\cup B) = P(A) + P(B) - P(A \\cap B) \\leq 1. \\end{equation}\\] From this follows \\[\\begin{align} P(A \\cap B) &\\geq P(A) + P(B) - 1 \\\\ &= \\frac{2}{3} + \\frac{1}{2} - 1 \\\\ &= \\frac{1}{6}, \\end{align}\\] which is the lower bound for the intersection. Conversely, we have \\[\\begin{equation} P(A \\cup B) = P(A) + P(B) - P(A \\cap B) \\geq P(A). \\end{equation}\\] From this follows \\[\\begin{align} P(A \\cap B) &\\leq P(B) \\\\ &= \\frac{1}{2}, \\end{align}\\] which is the upper bound for the intersection. For an example take a fair die. To achieve the lower bound let \\(A = \\{3,4,5,6\\}\\) and \\(B = \\{1,2,3\\}\\), then their intersection is \\(A \\cap B = \\{3\\}\\). To achieve the upper bound take \\(A = \\{1,2,3,4\\}\\) and $B = {1,2,3} $. For the bounds of the union we will use the results from the first part. Again from the properties of probability we have \\[\\begin{align} P(A \\cup B) &= P(A) + P(B) - P(A \\cap B) \\\\ &\\geq P(A) + P(B) - \\frac{1}{2} \\\\ &= \\frac{2}{3}. \\end{align}\\] Conversely \\[\\begin{align} P(A \\cup B) &= P(A) + P(B) - P(A \\cap B) \\\\ &\\leq P(A) + P(B) - \\frac{1}{6} \\\\ &= 1. \\end{align}\\] Therefore \\(\\frac{2}{3} \\leq P(A \\cup B) \\leq 1\\). We use sample in R: set.seed(1) n <- 10000 samps <- sample(1:6, n, replace = TRUE) # lower bound lb <- vector(mode = "logical", length = n) A <- c(1,2,3) B <- c(3,4,5,6) for (i in 1:n) { lb[i] <- samps[i] %in% A & samps[i] %in% B } sum(lb) / n ## [1] 0.1605 # upper bound ub <- vector(mode = "logical", length = n) A <- c(1,2,3) B <- c(1,2,3,4) for (i in 1:n) { ub[i] <- samps[i] %in% A & samps[i] %in% B } sum(ub) / n ## [1] 0.4913 Exercise 1.10 A fair coin is tossed repeatedly. Show that, with probability one, a head turns up sooner or later. Show similarly that any given finite sequence of heads and tails occurs eventually with probability one. Solution. \\[\\begin{align} P(\\text{no heads}) &= \\lim_{n \\rightarrow \\infty} P(\\text{no heads in first }n \\text{ tosses}) \\\\ &= \\lim_{n \\rightarrow \\infty} \\frac{1}{2^n} \\\\ &= 0. \\end{align}\\] For the second part, let us fix the given sequence of heads and tails of length \\(k\\) as \\(s\\). A probability that this happens in \\(k\\) tosses is \\(\\frac{1}{2^k}\\). 
\\[\\begin{align} P(s \\text{ occurs}) &= \\lim_{n \\rightarrow \\infty} P(s \\text{ occurs in first } nk \\text{ tosses}) \\end{align}\\] The right part of the upper equation is greater than if \\(s\\) occurs either in the first \\(k\\) tosses, second \\(k\\) tosses,…, \\(n\\)-th \\(k\\) tosses. Therefore \\[\\begin{align} P(s \\text{ occurs}) &\\geq \\lim_{n \\rightarrow \\infty} P(s \\text{ occurs in first } n \\text{ disjoint sequences of length } k) \\\\ &= \\lim_{n \\rightarrow \\infty} (1 - P(s \\text{ does not occur in first } n \\text{ disjoint sequences})) \\\\ &= 1 - \\lim_{n \\rightarrow \\infty} P(s \\text{ does not occur in first } n \\text{ disjoint sequences}) \\\\ &= 1 - \\lim_{n \\rightarrow \\infty} (1 - \\frac{1}{2^k})^n \\\\ &= 1. \\end{align}\\] Exercise 1.11 An Erdos-Renyi random graph \\(G(n,p)\\) is a model with \\(n\\) nodes, where each pair of nodes is connected with probability \\(p\\). Calculate the probability that there exists a node that is not connected to any other node in \\(G(4,0.6)\\). Show that the upper bound for the probability that there exist 2 nodes that are not connected to any other node for an arbitrary \\(G(n,p)\\) is \\(\\binom{n}{2} (1-p)^{2n - 3}\\). R: Estimate the probability from the first point using simulation. Solution. Let \\(A_i\\) be the event that the \\(i\\)-th node is not connected to any other node. Then our goal is to calculate \\(P(\\cup_{i=1}^n A_i)\\). Using the inclusion-exclusion principle, we get \\[\\begin{align} P(\\cup_{i=1}^n A_i) &= \\sum_i A_i - \\sum_{i<j} P(A_i \\cap A_j) + \\sum_{i<j<k} P(A_i \\cap A_j \\cap A_k) - P(A_1 \\cap A_2 \\cap A_3 \\cap A_4) \\\\ &=4 (1 - p)^3 - \\binom{4}{2} (1 - p)^5 + \\binom{4}{3} (1 - p)^6 - (1 - p)^6 \\\\ &\\approx 0.21. \\end{align}\\] Let \\(A_{ij}\\) be the event that nodes \\(i\\) and \\(j\\) are not connected to any other node. We are interested in \\(P(\\cup_{i<j}A_{ij})\\). By using Boole`s inequality, we get \\[\\begin{align} P(\\cup_{i<j}A_{ij}) \\leq \\sum_{i<j} P(A_{ij}). \\end{align}\\] What is the probability of \\(A_{ij}\\)? There need to be no connections to the \\(i\\)-th node to the remaining nodes (excluding \\(j\\)), the same for the \\(j\\)-th node, and there can be no connection between them. Therefore \\[\\begin{align} P(\\cup_{i<j}A_{ij}) &\\leq \\sum_{i<j} (1 - p)^{2(n-2) + 1} \\\\ &= \\binom{n}{2} (1 - p)^{2n - 3}. \\end{align}\\] set.seed(1) n_samp <- 100000 n <- 4 p <- 0.6 conn_samp <- vector(mode = "logical", length = n_samp) for (i in 1:n_samp) { tmp_mat <- matrix(data = 0, nrow = n, ncol = n) samp_conn <- sample(c(0,1), choose(4,2), replace = TRUE, prob = c(1 - p, p)) tmp_mat[lower.tri(tmp_mat)] <- samp_conn tmp_mat[upper.tri(tmp_mat)] <- t(tmp_mat)[upper.tri(t(tmp_mat))] not_conn <- apply(tmp_mat, 1, sum) if (any(not_conn == 0)) { conn_samp[i] <- TRUE } else { conn_samp[i] <- FALSE } } sum(conn_samp) / n_samp ## [1] 0.20565 1.3 Discrete probability spaces Exercise 1.12 Show that the standard measurable space on \\(\\Omega = \\{0,1,...,n\\}\\) equipped with binomial measure is a discrete probability space. Define another probability measure on this measurable space. Show that for \\(n=1\\) the binomial measure is the same as the Bernoulli measure. R: Draw 1000 samples from the binomial distribution \\(p=0.5\\), \\(n=20\\) (rbinom) and compare relative frequencies with theoretical probability measure. Solution. We need to show that the terms of \\(\\sum_{k=0}^n \\binom{n}{k} p^k (1 - p)^{n - k}\\) sum to 1. 
For that we use the binomial theorem \\(\\sum_{k=0}^n \\binom{n}{k} x^k y^{n-k} = (x + y)^n\\). So \\[\\begin{equation} \\sum_{k=0}^n \\binom{n}{k} p^k (1 - p)^{n - k} = (p + 1 - p)^n = 1. \\end{equation}\\] \\(P(\\{k\\}) = \\frac{1}{n + 1}\\). When \\(n=1\\) then \\(k \\in \\{0,1\\}\\). Inserting \\(n=1\\) into the binomial measure, we get \\(\\binom{1}{k}p^k (1-p)^{1 - k}\\). Now \\(\\binom{1}{1} = \\binom{1}{0} = 1\\), so the measure is \\(p^k (1-p)^{1 - k}\\), which is the Bernoulli measure. set.seed(1) library(ggplot2) library(dplyr) bin_samp <- rbinom(n = 1000, size = 20, prob = 0.5) bin_samp <- data.frame(x = bin_samp) %>% count(x) %>% mutate(n = n / 1000, type = "empirical_frequencies") %>% bind_rows(data.frame(x = 0:20, n = dbinom(0:20, size = 20, prob = 0.5), type = "theoretical_measure")) bin_plot <- ggplot(data = bin_samp, aes(x = x, y = n, fill = type)) + geom_bar(stat="identity", position = "dodge") plot(bin_plot) Exercise 1.13 Show that the standard measurable space on \\(\\Omega = \\{0,1,...,\\infty\\}\\) equipped with geometric measure is a discrete probability space, equipped with Poisson measure is a discrete probability space. Define another probability measure on this measurable space. R: Draw 1000 samples from the Poisson distribution \\(\\lambda = 10\\) (rpois) and compare relative frequencies with theoretical probability measure. Solution. \\(\\sum_{k = 0}^{\\infty} p(1 - p)^k = p \\sum_{k = 0}^{\\infty} (1 - p)^k = p \\frac{1}{1 - 1 + p} = 1\\). We used the formula for geometric series. \\(\\sum_{k = 0}^{\\infty} \\frac{\\lambda^k e^{-\\lambda}}{k!} = e^{-\\lambda} \\sum_{k = 0}^{\\infty} \\frac{\\lambda^k}{k!} = e^{-\\lambda} e^{\\lambda} = 1.\\) We used the Taylor expansion of the exponential function. Since we only have to define a probability measure, we could only assign probabilities that sum to one to a finite number of events in \\(\\Omega\\), and probability zero to the other infinite number of events. However to make this solution more educational, we will try to find a measure that assigns a non-zero probability to all events in \\(\\Omega\\). A good start for this would be to find a converging infinite series, as the probabilities will have to sum to one. One simple converging series is the geometric series \\(\\sum_{k=0}^{\\infty} p^k\\) for \\(|p| < 1\\). Let us choose an arbitrary \\(p = 0.5\\). Then \\(\\sum_{k=0}^{\\infty} p^k = \\frac{1}{1 - 0.5} = 2\\). To complete the measure, we have to normalize it, so it sums to one, therefore \\(P(\\{k\\}) = \\frac{0.5^k}{2}\\) is a probability measure on \\(\\Omega\\). We could make it even more difficult by making this measure dependent on some parameter \\(\\alpha\\), but this is out of the scope of this introductory chapter. set.seed(1) pois_samp <- rpois(n = 1000, lambda = 10) pois_samp <- data.frame(x = pois_samp) %>% count(x) %>% mutate(n = n / 1000, type = "empirical_frequencies") %>% bind_rows(data.frame(x = 0:25, n = dpois(0:25, lambda = 10), type = "theoretical_measure")) pois_plot <- ggplot(data = pois_samp, aes(x = x, y = n, fill = type)) + geom_bar(stat="identity", position = "dodge") plot(pois_plot) Exercise 1.14 Define a probability measure on \\((\\Omega = \\mathbb{Z}, 2^{\\mathbb{Z}})\\). Define a probability measure such that \\(P(\\omega) > 0, \\forall \\omega \\in \\Omega\\). R: Implement a random generator that will generate samples with the relative frequency that corresponds to your probability measure. Compare relative frequencies with theoretical probability measure . Solution. 
\\(P(0) = 1, P(\\omega) = 0, \\forall \\omega \\neq 0\\). \\(P(\\{k\\}) = \\sum_{k = -\\infty}^{\\infty} \\frac{p(1 - p)^{|k|}}{2^{1 - 1_0(k)}}\\), where \\(1_0(k)\\) is the indicator function, which equals to one if \\(k\\) is 0, and equals to zero in every other case. n <- 1000 geom_samps <- rgeom(n, prob = 0.5) sign_samps <- sample(c(FALSE, TRUE), size = n, replace = TRUE) geom_samps[sign_samps] <- -geom_samps[sign_samps] my_pmf <- function (k, p) { indic <- rep(1, length(k)) indic[k == 0] <- 0 return ((p * (1 - p)^(abs(k))) / 2^indic) } geom_samps <- data.frame(x = geom_samps) %>% count(x) %>% mutate(n = n / 1000, type = "empirical_frequencies") %>% bind_rows(data.frame(x = -10:10, n = my_pmf(-10:10, 0.5), type = "theoretical_measure")) geom_plot <- ggplot(data = geom_samps, aes(x = x, y = n, fill = type)) + geom_bar(stat="identity", position = "dodge") plot(geom_plot) Exercise 1.15 Define a probability measure on \\(\\Omega = \\{1,2,3,4,5,6\\}\\) with parameter \\(m \\in \\{1,2,3,4,5,6\\}\\), so that the probability of outcome at distance \\(1\\) from \\(m\\) is half of the probability at distance \\(0\\), at distance \\(2\\) is half of the probability at distance \\(1\\), etc. R: Implement a random generator that will generate samples with the relative frequency that corresponds to your probability measure. Compare relative frequencies with theoretical probability measure . Solution. \\(P(\\{k\\}) = \\frac{\\frac{1}{2}^{|m - k|}}{\\sum_{i=1}^6 \\frac{1}{2}^{|m - i|}}\\) n <- 10000 m <- 4 my_pmf <- function (k, m) { denom <- sum(0.5^abs(m - 1:6)) return (0.5^abs(m - k) / denom) } samps <- c() for (i in 1:n) { a <- sample(1:6, 1) a_val <- my_pmf(a, m) prob <- runif(1) if (prob < a_val) { samps <- c(samps, a) } } samps <- data.frame(x = samps) %>% count(x) %>% mutate(n = n / length(samps), type = "empirical_frequencies") %>% bind_rows(data.frame(x = 1:6, n = my_pmf(1:6, m), type = "theoretical_measure")) my_plot <- ggplot(data = samps, aes(x = x, y = n, fill = type)) + geom_bar(stat="identity", position = "dodge") plot(my_plot) "],["uprobspaces.html", "Chapter 2 Uncountable probability spaces 2.1 Borel sets 2.2 Lebesgue measure", " Chapter 2 Uncountable probability spaces This chapter deals with uncountable probability spaces. The students are expected to acquire the following knowledge: Theoretical Understand Borel sets and identify them. Estimate Lebesgue measure for different sets. Know when sets are Borel-measurable. Understanding of countable and uncountable sets. R Uniform sampling. .fold-btn { float: right; margin: 5px 5px 0 0; } .fold { border: 1px solid black; min-height: 40px; } 2.1 Borel sets Exercise 2.1 Prove that the intersection of two sigma algebras on \\(\\Omega\\) is a sigma algebra. Prove that the collection of all open subsets \\((a,b)\\) on \\((0,1]\\) is not a sigma algebra of \\((0,1]\\). Solution. Empty set: \\[\\begin{equation} \\emptyset \\in \\mathcal{A} \\wedge \\emptyset \\in \\mathcal{B} \\Rightarrow \\emptyset \\in \\mathcal{A} \\cap \\mathcal{B} \\end{equation}\\] Complement: \\[\\begin{equation} \\text{Let } A \\in \\mathcal{A} \\cap \\mathcal{B} \\Rightarrow A \\in \\mathcal{A} \\wedge A \\in \\mathcal{B} \\Rightarrow A^c \\in \\mathcal{A} \\wedge A^c \\in \\mathcal{B} \\Rightarrow A^c \\in \\mathcal{A} \\cap \\mathcal{B} \\end{equation}\\] Countable additivity: Let \\(\\{A_i\\}\\) be a countable sequence of subsets in \\(\\mathcal{A} \\cap \\mathcal{B}\\). 
\\[\\begin{equation} \\forall i: A_i \\in \\mathcal{A} \\cap \\mathcal{B} \\Rightarrow A_i \\in \\mathcal{A} \\wedge A_i \\in \\mathcal{B} \\Rightarrow \\cup A_i \\in \\mathcal{A} \\wedge \\cup A_i \\in \\mathcal{B} \\Rightarrow \\cup A_i \\in \\mathcal{A} \\cap \\mathcal{B} \\end{equation}\\] Let \\(A\\) denote the collection of all open subsets \\((a,b)\\) on \\((0,1]\\). Then \\((0,1) \\in A\\). But \\((0,1)^c = 1 \\notin A\\). Exercise 2.2 Show that \\(\\mathcal{C} = \\sigma(\\mathcal{C})\\) if and only if \\(\\mathcal{C}\\) is a sigma algebra. Solution. “\\(\\Rightarrow\\)” This follows from the definition of a generated sigma algebra. “\\(\\Leftarrow\\)” Let \\(\\mathcal{F} = \\cap_i F_i\\) be the intersection of all sigma algebras that contain \\(\\mathcal{C}\\). Then \\(\\sigma(\\mathcal{C}) = \\mathcal{F}\\). Additionally, \\(\\forall i: \\mathcal{C} \\in F_i\\). So each \\(F_i\\) can be written as \\(F_i = \\mathcal{C} \\cup D\\), where \\(D\\) are the rest of the elements in the sigma algebra. In other words, each sigma algebra in the collection contains at least \\(\\mathcal{C}\\), but can contain other elements. Now for some \\(j\\), \\(F_j = \\mathcal{C}\\) as \\(\\{F_i\\}\\) contains all sigma algebras that contain \\(\\mathcal{C}\\) and \\(\\mathcal{C}\\) is such a sigma algebra. Since this is the smallest subset in the intersection it follows that \\(\\sigma(\\mathcal{C}) = \\mathcal{F} = \\mathcal{C}\\). Exercise 2.3 Let \\(\\mathcal{C}\\) and \\(\\mathcal{D}\\) be two collections of subsets on \\(\\Omega\\) such that \\(\\mathcal{C} \\subset \\mathcal{D}\\). Prove that \\(\\sigma(\\mathcal{C}) \\subseteq \\sigma(\\mathcal{D})\\). Solution. \\(\\sigma(\\mathcal{D})\\) is a sigma algebra that contains \\(\\mathcal{D}\\). It follows that \\(\\sigma(\\mathcal{D})\\) is a sigma algebra that contains \\(\\mathcal{C}\\). Let us write \\(\\sigma(\\mathcal{C}) = \\cap_i F_i\\), where \\(\\{F_i\\}\\) is the collection of all sigma algebras that contain \\(\\mathcal{C}\\). Since \\(\\sigma(\\mathcal{D})\\) is such a sigma algebra, there exists an index \\(j\\), so that \\(F_j = \\sigma(\\mathcal{D})\\). Then we can write \\[\\begin{align} \\sigma(\\mathcal{C}) &= (\\cap_{i \\neq j} F_i) \\cap \\sigma(\\mathcal{D}) \\\\ &\\subseteq \\sigma(\\mathcal{D}). \\end{align}\\] Exercise 2.4 Prove that the following subsets of \\((0,1]\\) are Borel-measurable by finding their measure. Any countable set. The set of numbers in (0,1] whose decimal expansion does not contain 7. Solution. This follows directly from the fact that every countable set is a union of singletons, whose measure is 0. Let us first look at numbers which have a 7 as the first decimal numbers. Their measure is 0.1. Then we take all the numbers with a 7 as the second decimal number (excluding those who already have it as the first). These have the measure 0.01, and there are 9 of them, so their total measure is 0.09. We can continue to do so infinitely many times. At each \\(n\\), we have the measure of the intervals which is \\(10^n\\) and the number of those intervals is \\(9^{n-1}\\). Now \\[\\begin{align} \\lambda(A) &= 1 - \\sum_{n = 0}^{\\infty} \\frac{9^n}{10^{n+1}} \\\\ &= 1 - \\frac{1}{10} \\sum_{n = 0}^{\\infty} (\\frac{9}{10})^n \\\\ &= 1 - \\frac{1}{10} \\frac{10}{1} \\\\ &= 0. \\end{align}\\] Since we have shown that the measure of the set is \\(0\\), we have also shown that the set is measurable. 
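The second result can also be illustrated numerically. The derivation above implies that the proportion of numbers in \\((0,1]\\) whose first \\(n\\) decimals contain no 7 is \\((\\frac{9}{10})^n\\), which tends to 0 as \\(n\\) grows. The sketch below estimates this proportion by uniform sampling; the sample size and the cut-off at finitely many decimals are our own simplifying choices for illustration.
set.seed(1)
x <- runif(1e4)
# TRUE if none of the first n decimal digits of a single number equals 7
no_seven <- function(z, n) {
  digits <- floor(z * 10^(1:n)) %% 10
  all(digits != 7)
}
for (n in c(1, 3, 6)) {
  est <- mean(sapply(x, no_seven, n = n))
  cat("n =", n, " estimate:", est, " theoretical:", (9/10)^n, "\n")
}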
Exercise 2.5 Let \\(\\Omega = [0,1]\\), and let \\(\\mathcal{F}_3\\) consist of all countable subsets of \\(\\Omega\\), and all subsets of \\(\\Omega\\) having a countable complement. Show that \\(\\mathcal{F}_3\\) is a sigma algebra. Let us define \\(P(A)=0\\) if \\(A\\) is countable, and \\(P(A) = 1\\) if \\(A\\) has a countable complement. Is \\((\\Omega, \\mathcal{F}_3, P)\\) a legitimate probability space? Solution. The empty set is countable, therefore it is in \\(\\mathcal{F}_3\\). For any \\(A \\in \\mathcal{F}_3\\). If \\(A\\) is countable, then \\(A^c\\) has a countable complement and is in \\(\\mathcal{F}_3\\). If \\(A\\) is uncountable, then it has a countable complement \\(A^c\\) which is therefore also in \\(\\mathcal{F}_3\\). We are left with showing countable additivity. Let \\(\\{A_i\\}\\) be an arbitrary collection of sets in \\(\\mathcal{F}_3\\). We will look at two possibilities. First let all \\(A_i\\) be countable. A countable union of countable sets is countable, and therefore in \\(\\mathcal{F}_3\\). Second, let at least one \\(A_i\\) be uncountable. It follows that it has a countable complement. We can write \\[\\begin{equation} (\\cup_{i=1}^{\\infty} A_i)^c = \\cap_{i=1}^{\\infty} A_i^c. \\end{equation}\\] Since at least one \\(A_i^c\\) on the right side is countable, the whole intersection is countable, and therefore the union has a countable complement. It follows that the union is in \\(\\mathcal{F}_3\\). The tuple \\((\\Omega, \\mathcal{F}_3)\\) is a measurable space. Therefore, we only need to check whether \\(P\\) is a probability measure. The measure of the empty set is zero as it is countable. We have to check for countable additivity. Let us look at three situations. Let \\(A_i\\) be disjoint sets. First, let all \\(A_i\\) be countable. \\[\\begin{equation} P(\\cup_{i=1}^{\\infty} A_i) = \\sum_{i=1}^{\\infty}P( A_i)) = 0. \\end{equation}\\] Since the union is countable, the above equation holds. Second, let exactly one \\(A_i\\) be uncountable. W.L.O.G. let that be \\(A_1\\). Then \\[\\begin{equation} P(\\cup_{i=1}^{\\infty} A_i) = 1 + \\sum_{i=2}^{\\infty}P( A_i)) = 1. \\end{equation}\\] Since the union is uncountable, the above equation holds. Third, let at least two \\(A_i\\) be uncountable. We have to check whether it is possible for two uncountable sets in \\(\\mathcal{F}_3\\) to be disjoint. If that is possible, then their measures would sum to more than one and \\(P\\) would not be a probability measure. W.L.O.G. let \\(A_1\\) and \\(A_2\\) be uncountable. Then we have \\[\\begin{equation} A_1 \\cap A_2 = (A_1^c \\cup A_2^c)^c. \\end{equation}\\] Now \\(A_1^c\\) and \\(A_2^c\\) are countable and their union is therefore countable. Let \\(B = A_1^c \\cup A_2^c\\). So the intersection of \\(A_1\\) and \\(A_2\\) equals the complement of \\(B\\), which is countable. For the intersection to be the empty set, \\(B\\) would have to equal to \\(\\Omega\\). But \\(\\Omega\\) is uncountable and therefore \\(B\\) can not equal to \\(\\Omega\\). It follows that two uncountable sets in \\(\\mathcal{F}_3\\) can not have an empty intersection. Therefore the tuple is a legitimate probability space. 2.2 Lebesgue measure Exercise 2.6 Show that the Lebesgue measure of rational numbers on \\([0,1]\\) is 0. R: Implement a random number generator, which generates uniform samples of irrational numbers in \\([0,1]\\) by uniformly sampling from \\([0,1]\\) and rejecting a sample if it is rational. Solution. There are a countable number of rational numbers. 
Therefore, we can write \\[\\begin{align} \\lambda(\\mathbb{Q}) &= \\lambda(\\cup_{i = 1}^{\\infty} q_i) &\\\\ &= \\sum_{i = 1}^{\\infty} \\lambda(q_i) &\\text{ (countable additivity)} \\\\ &= \\sum_{i = 1}^{\\infty} 0 &\\text{ (Lebesgue measure of a singleton)} \\\\ &= 0. \\end{align}\\] Exercise 2.7 Prove that the Lebesgue measure of \\(\\mathbb{R}\\) is infinity. Paradox. Show that the cardinality of \\(\\mathbb{R}\\) and \\((0,1)\\) is the same, while their Lebesgue measures are infinity and one respectively. Solution. Let \\(a_i\\) be the \\(i\\)-th integer for \\(i \\in \\mathbb{Z}\\). We can write \\(\\mathbb{R} = \\cup_{-\\infty}^{\\infty} (a_i, a_{i + 1}]\\). \\[\\begin{align} \\lambda(\\mathbb{R}) &= \\lambda(\\cup_{i = -\\infty}^{\\infty} (a_i, a_{i + 1}]) \\\\ &= \\lambda(\\lim_{n \\rightarrow \\infty} \\cup_{i = -n}^{n} (a_i, a_{i + 1}]) \\\\ &= \\lim_{n \\rightarrow \\infty} \\lambda(\\cup_{i = -n}^{n} (a_i, a_{i + 1}]) \\\\ &= \\lim_{n \\rightarrow \\infty} \\sum_{i = -n}^{n} \\lambda((a_i, a_{i + 1}]) \\\\ &= \\lim_{n \\rightarrow \\infty} \\sum_{i = -n}^{n} 1 \\\\ &= \\lim_{n \\rightarrow \\infty} 2n \\\\ &= \\infty. \\end{align}\\] We need to find a bijection between \\(\\mathbb{R}\\) and \\((0,1)\\). A well-known function that maps from a bounded interval to \\(\\mathbb{R}\\) is the tangent. To make the bijection easier to achieve, we will take the inverse, which maps from \\(\\mathbb{R}\\) to \\((-\\frac{\\pi}{2}, \\frac{\\pi}{2})\\). However, we need to change the function so it maps to \\((0,1)\\). First we add \\(\\frac{\\pi}{2}\\), so that we move the function above zero. Then we only have to divide by the max value, which in this case is \\(\\pi\\). So our bijection is \\[\\begin{equation} f(x) = \\frac{\\tan^{-1}(x) + \\frac{\\pi}{2}}{\\pi}. \\end{equation}\\] Exercise 2.8 Take the measure space \\((\\Omega_1 = (0,1], B_{(0,1]}, \\lambda)\\) (we know that this is a probability space on \\((0,1]\\)). Define a map (function) from \\(\\Omega_1\\) to \\(\\Omega_2 = \\{1,2,3,4,5,6\\}\\) such that the measure space \\((\\Omega_2, 2^{\\Omega_2}, \\lambda(f^{-1}()))\\) will be a discrete probability space with uniform probabilities (\\(P(\\omega) = \\frac{1}{6}, \\forall \\omega \\in \\Omega_2)\\). Is the map that you defined in (a) the only such map? How would you in the same fashion define a map that would result in a probability space that can be interpreted as a coin toss with probability \\(p\\) of heads? R: Use the map in (a) as a basis for a random generator for this fair die. Solution. In other words, we have to assign disjunct intervals of the same size to each element of \\(\\Omega_2\\). Therefore \\[\\begin{equation} f(x) = \\lceil 6x \\rceil. \\end{equation}\\] No, we could for example rearrange the order in which the intervals are mapped to integers. Additionally, we could have several disjoint intervals that mapped to the same integer, as long as the Lebesgue measure of their union would be \\(\\frac{1}{6}\\) and the function would remain injective. We have \\(\\Omega_3 = \\{0,1\\}\\), where zero represents heads and one represents tails. Then \\[\\begin{equation} f(x) = 0^{I_{A}(x)}, \\end{equation}\\] where \\(A = \\{y \\in (0,1] : y < p\\}\\). 
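The R code below implements the fair-die map from (a). For completeness, a similar sketch for the coin map from the last part might look as follows; the value p = 0.3 is an arbitrary illustrative choice, and heads are encoded as 0, as in the solution above.
set.seed(1)
p <- 0.3                            # illustrative probability of heads
unif_s <- runif(1000)
coin_s <- ifelse(unif_s < p, 0, 1)  # f(x): 0 (heads) if x < p, else 1 (tails)
table(coin_s) / length(coin_s)      # relative frequencies, roughly p and 1 - p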
set.seed(1) unif_s <- runif(1000) die_s <- ceiling(6 * unif_s) summary(as.factor(die_s)) ## 1 2 3 4 5 6 ## 166 154 200 146 166 168 "],["condprob.html", "Chapter 3 Conditional probability 3.1 Calculating conditional probabilities 3.2 Conditional independence 3.3 Monty Hall problem", " Chapter 3 Conditional probability This chapter deals with conditional probability. The students are expected to acquire the following knowledge: Theoretical Identify whether variables are independent. Calculation of conditional probabilities. Understanding of conditional dependence and independence. How to apply Bayes’ theorem to solve difficult probabilistic questions. R Simulating conditional probabilities. cumsum. apply. .fold-btn { float: right; margin: 5px 5px 0 0; } .fold { border: 1px solid black; min-height: 40px; } 3.1 Calculating conditional probabilities Exercise 3.1 A military officer is in charge of identifying enemy aircraft and shooting them down. He is able to positively identify an enemy airplane 95% of the time and positively identify a friendly airplane 90% of the time. Furthermore, 99% of the airplanes are friendly. When the officer identifies an airplane as an enemy airplane, what is the probability that it is not and they will shoot at a friendly airplane? Solution. Let \\(E = 0\\) denote that the observed plane is friendly and \\(E=1\\) that it is an enemy. Let \\(I = 0\\) denote that the officer identified it as friendly and \\(I = 1\\) as enemy. Then \\[\\begin{align} P(E = 0 | I = 1) &= \\frac{P(I = 1 | E = 0)P(E = 0)}{P(I = 1)} \\\\ &= \\frac{P(I = 1 | E = 0)P(E = 0)}{P(I = 1 | E = 0)P(E = 0) + P(I = 1 | E = 1)P(E = 1)} \\\\ &= \\frac{0.1 \\times 0.99}{0.1 \\times 0.99 + 0.95 \\times 0.01} \\\\ &= 0.91. \\end{align}\\] Exercise 3.2 R: Consider tossing a fair die. Let \\(A = \\{2,4,6\\}\\) and \\(B = \\{1,2,3,4\\}\\). Then \\(P(A) = \\frac{1}{2}\\), \\(P(B) = \\frac{2}{3}\\) and \\(P(AB) = \\frac{1}{3}\\). Since \\(P(AB) = P(A)P(B)\\), the events \\(A\\) and \\(B\\) are independent. Simulate draws from the sample space and verify that the proportions are the same. Then find two events \\(C\\) and \\(D\\) that are not independent and repeat the simulation. set.seed(1) nsamps <- 10000 tosses <- sample(1:6, nsamps, replace = TRUE) PA <- sum(tosses %in% c(2,4,6)) / nsamps PB <- sum(tosses %in% c(1,2,3,4)) / nsamps PA * PB ## [1] 0.3295095 sum(tosses %in% c(2,4)) / nsamps ## [1] 0.3323 # Let C = {1,2} and D = {2,3} PC <- sum(tosses %in% c(1,2)) / nsamps PD <- sum(tosses %in% c(2,3)) / nsamps PC * PD ## [1] 0.1067492 sum(tosses %in% c(2)) / nsamps ## [1] 0.1622 Exercise 3.3 A machine reports the true value of a thrown 12-sided die 5 out of 6 times. If the machine reports a 1 has been tossed, what is the probability that it is actually a 1? Now let the machine only report whether a 1 has been tossed or not. Does the probability change? R: Use simulation to check your answers to a) and b). Solution. Let \\(T = 1\\) denote that the toss is 1 and \\(M = 1\\) that the machine reports a 1. \\[\\begin{align} P(T = 1 | M = 1) &= \\frac{P(M = 1 | T = 1)P(T = 1)}{P(M = 1)} \\\\ &= \\frac{P(M = 1 | T = 1)P(T = 1)}{\\sum_{k=1}^{12} P(M = 1 | T = k)P(T = k)} \\\\ &= \\frac{\\frac{5}{6}\\frac{1}{12}}{\\frac{5}{6}\\frac{1}{12} + 11 \\frac{1}{6} \\frac{1}{11} \\frac{1}{12}} \\\\ &= \\frac{5}{6}. \\end{align}\\] Yes. 
\\[\\begin{align} P(T = 1 | M = 1) &= \\frac{P(M = 1 | T = 1)P(T = 1)}{P(M = 1)} \\\\ &= \\frac{P(M = 1 | T = 1)P(T = 1)}{\\sum_{k=1}^{12} P(M = 1 | T = k)P(T = k)} \\\\ &= \\frac{\\frac{5}{6}\\frac{1}{12}}{\\frac{5}{6}\\frac{1}{12} + 11 \\frac{1}{6} \\frac{1}{12}} \\\\ &= \\frac{5}{16}. \\end{align}\\] set.seed(1) nsamps <- 10000 report_a <- vector(mode = "numeric", length = nsamps) report_b <- vector(mode = "logical", length = nsamps) truths <- vector(mode = "logical", length = nsamps) for (i in 1:10000) { toss <- sample(1:12, size = 1) truth <- sample(c(TRUE, FALSE), size = 1, prob = c(5/6, 1/6)) truths[i] <- truth if (truth) { report_a[i] <- toss report_b[i] <- toss == 1 } else { remaining <- (1:12)[1:12 != toss] report_a[i] <- sample(remaining, size = 1) report_b[i] <- toss != 1 } } truth_a1 <- truths[report_a == 1] sum(truth_a1) / length(truth_a1) ## [1] 0.8300733 truth_b1 <- truths[report_b] sum(truth_b1) / length(truth_b1) ## [1] 0.3046209 Exercise 3.4 A coin is tossed independently \\(n\\) times. The probability of heads at each toss is \\(p\\). At each time \\(k\\), \\((k = 2,3,...,n)\\) we get a reward at time \\(k+1\\) if \\(k\\)-th toss was a head and the previous toss was a tail. Let \\(A_k\\) be the event that a reward is obtained at time \\(k\\). Are events \\(A_k\\) and \\(A_{k+1}\\) independent? Are events \\(A_k\\) and \\(A_{k+2}\\) independent? R: simulate 10 tosses 10000 times, where \\(p = 0.7\\). Check your answers to a) and b) by counting the frequencies of the events \\(A_5\\), \\(A_6\\), and \\(A_7\\). Solution. For \\(A_k\\) to happen, we need the tosses \\(k-2\\) and \\(k-1\\) be tails and heads respectively. For \\(A_{k+1}\\) to happen, we need tosses \\(k-1\\) and \\(k\\) be tails and heads respectively. As the toss \\(k-1\\) need to be heads for one and tails for the other, these two events can not happen simultaneously. Therefore the probability of their intersection is 0. But the probability of each of them separately is \\(p(1-p) > 0\\). Therefore, they are not independent. For \\(A_k\\) to happen, we need the tosses \\(k-2\\) and \\(k-1\\) be tails and heads respectively. For \\(A_{k+2}\\) to happen, we need tosses \\(k\\) and \\(k+1\\) be tails and heads respectively. So the probability of intersection is \\(p^2(1-p)^2\\). And the probability of each separately is again \\(p(1-p)\\). Therefore, they are independent. set.seed(1) nsamps <- 10000 p <- 0.7 rewardA_5 <- vector(mode = "logical", length = nsamps) rewardA_6 <- vector(mode = "logical", length = nsamps) rewardA_7 <- vector(mode = "logical", length = nsamps) rewardA_56 <- vector(mode = "logical", length = nsamps) rewardA_57 <- vector(mode = "logical", length = nsamps) for (i in 1:nsamps) { samps <- sample(c(0,1), size = 10, replace = TRUE, prob = c(0.7, 0.3)) rewardA_5[i] <- (samps[4] == 0 & samps[3] == 1) rewardA_6[i] <- (samps[5] == 0 & samps[4] == 1) rewardA_7[i] <- (samps[6] == 0 & samps[5] == 1) rewardA_56[i] <- (rewardA_5[i] & rewardA_6[i]) rewardA_57[i] <- (rewardA_5[i] & rewardA_7[i]) } sum(rewardA_5) / nsamps ## [1] 0.2141 sum(rewardA_6) / nsamps ## [1] 0.2122 sum(rewardA_7) / nsamps ## [1] 0.2107 sum(rewardA_56) / nsamps ## [1] 0 sum(rewardA_57) / nsamps ## [1] 0.0454 Exercise 3.5 A drawer contains two coins. One is an unbiased coin, the other is a biased coin, which will turn up heads with probability \\(p\\) and tails with probability \\(1-p\\). One coin is selected uniformly at random. The selected coin is tossed \\(n\\) times. 
The coin turns up heads \\(k\\) times and tails \\(n-k\\) times. What is the probability that the coin is biased? The selected coin is tossed repeatedly until it turns up heads \\(k\\) times. Given that it is tossed \\(n\\) times in total, what is the probability that the coin is biased? Solution. Let \\(B = 1\\) denote that the coin is biased and let \\(H = k\\) denote that we’ve seen \\(k\\) heads. \\[\\begin{align} P(B = 1 | H = k) &= \\frac{P(H = k | B = 1)P(B = 1)}{P(H = k)} \\\\ &= \\frac{P(H = k | B = 1)P(B = 1)}{P(H = k | B = 1)P(B = 1) + P(H = k | B = 0)P(B = 0)} \\\\ &= \\frac{p^k(1-p)^{n-k} 0.5}{p^k(1-p)^{n-k} 0.5 + 0.5^{n+1}} \\\\ &= \\frac{p^k(1-p)^{n-k}}{p^k(1-p)^{n-k} + 0.5^n}. \\end{align}\\] The same results as in a). The only difference between these two scenarios is that in b) the last throw must be heads. However, this holds for the biased and the unbiased coin and therefore does not affect the probability of the coin being biased. Exercise 3.6 Judy goes around the company for Women’s day and shares flowers. In every office she leaves a flower, if there is at least one woman inside. The probability that there’s a woman in the office is \\(\\frac{3}{5}\\). What is the probability that Judy leaves her first flower in the fourth office? Given that she has given away exactly three flowers in the first four offices, what is the probability that she gives her fourth flower in the eighth office? What is the probability that she leaves the second flower in the fifth office? What is the probability that she leaves the second flower in the fifth office, given that she did not leave the second flower in the second office? Judy needs a new supply of flowers immediately after the office, where she gives away her last flower. What is the probability that she visits at least five offices, if she starts with two flowers? R: simulate Judy’s walk 10000 times to check your answers a) - e). Solution. Let \\(X_i = k\\) denote the event that … \\(i\\)-th sample on the \\(k\\)-th run. Since the events are independent, we can multiply their probabilities to get \\[\\begin{equation} P(X_1 = 4) = 0.4^3 \\times 0.6 = 0.0384. \\end{equation}\\] Same as in a) as we have a fresh start after first four offices. For this to be possible, she had to leave the first flower in one of the first four offices. Therefore there are four possibilities, and for each of those the probability is \\(0.4^3 \\times 0.6\\). Additionally, the probability that she leaves a flower in the fifth office is \\(0.6\\). So \\[\\begin{equation} P(X_2 = 5) = \\binom{4}{1} \\times 0.4^3 \\times 0.6^2 = 0.09216. \\end{equation}\\] We use Bayes’ theorem. \\[\\begin{align} P(X_2 = 5 | X_2 \\neq 2) &= \\frac{P(X_2 \\neq 2 | X_2 = 5)P(X_2 = 5)}{P(X_2 \\neq 2)} \\\\ &= \\frac{0.09216}{0.64} \\\\ &= 0.144. \\end{align}\\] The denominator in the second equation can be calculated as follows. One of three things has to happen for the second not to be dealt in the second round. First, both are zero, so \\(0.4^2\\). Second, first is zero, and second is one, so \\(0.4 \\times 0.6\\). Third, the first is one and the second one zero, so \\(0.6 \\times 0.4\\). Summing these values we get \\(0.64\\). We will look at the complement, so the events that she gave away exactly two flowers after two, three and four offices. \\[\\begin{equation} P(X_2 \\geq 5) = 1 - 0.6^2 - 2 \\times 0.4 \\times 0.6^2 - 3 \\times 0.4^2 \\times 0.6^2 = 0.1792. \\end{equation}\\] The multiplying parts represent the possibilities of the first flower. 
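Before running the simulation, the closed-form answers can also be cross-checked against R's geometric and negative binomial mass functions (a side check, not part of the original solution): the index of the office with the first flower is geometric, and the index of the office with the second flower is negative binomial, counting offices without a woman before the second success.
dgeom(3, prob = 0.6)                     # a) first flower in the fourth office: 0.0384
dnbinom(3, size = 2, prob = 0.6)         # c) second flower in the fifth office: 0.09216
dnbinom(3, size = 2, prob = 0.6) /
  (1 - dnbinom(0, size = 2, prob = 0.6)) # d) conditional on no second flower in office two: 0.144
1 - pnbinom(2, size = 2, prob = 0.6)     # e) at least five offices with two flowers: 0.1792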
set.seed(1) nsamps <- 100000 Judyswalks <- matrix(data = NA, nrow = nsamps, ncol = 8) for (i in 1:nsamps) { thiswalk <- sample(c(0,1), size = 8, replace = TRUE, prob = c(0.4, 0.6)) Judyswalks[i, ] <- thiswalk } csJudy <- t(apply(Judyswalks, 1, cumsum)) # a sum(csJudy[ ,4] == 1 & csJudy[ ,3] == 0) / nsamps ## [1] 0.03848 # b csJsubset <- csJudy[csJudy[ ,4] == 3 & csJudy[ ,3] == 2, ] sum(csJsubset[ ,8] == 4 & csJsubset[ ,7] == 3) / nrow(csJsubset) ## [1] 0.03665893 # c sum(csJudy[ ,5] == 2 & csJudy[ ,4] == 1) / nsamps ## [1] 0.09117 # d sum(csJudy[ ,5] == 2 & csJudy[ ,4] == 1) / sum(csJudy[ ,2] != 2) ## [1] 0.1422398 # e sum(csJudy[ ,4] < 2) / nsamps ## [1] 0.17818 3.2 Conditional independence Exercise 3.7 Describe: A real-world example of two events \\(A\\) and \\(B\\) that are dependent but become conditionally independent if conditioned on a third event \\(C\\). A real-world example of two events \\(A\\) and \\(B\\) that are independent, but become dependent if conditioned on some third event \\(C\\). Solution. Let \\(A\\) be the height of a person and let \\(B\\) be the person’s knowledge of the Dutch language. These events are dependent since the Dutch are known to be taller than average. However if \\(C\\) is the nationality of the person, then \\(A\\) and \\(B\\) are independent given \\(C\\). Let \\(A\\) be the event that Mary passes the exam and let \\(B\\) be the event that John passes the exam. These events are independent. However, if the event \\(C\\) is that Mary and John studied together, then \\(A\\) and \\(B\\) are conditionally dependent given \\(C\\). Exercise 3.8 We have two coins of identical appearance. We know that one is a fair coin and the other flips heads 80% of the time. We choose one of the two coins uniformly at random. We discard the coin that was not chosen. We now flip the chosen coin independently 10 times, producing a sequence \\(Y_1 = y_1\\), \\(Y_2 = y_2\\), …, \\(Y_{10} = y_{10}\\). Intuitively, without doing and computation, are these random variables independent? Compute the probability \\(P(Y_1 = 1)\\). Compute the probabilities \\(P(Y_2 = 1 | Y_1 = 1)\\) and \\(P(Y_{10} = 1 | Y_1 = 1,...,Y_9 = 1)\\). Given your answers to b) and c), would you now change your answer to a)? If so, discuss why your intuition had failed. Solution. \\(P(Y_1 = 1) = 0.5 * 0.8 + 0.5 * 0.5 = 0.65\\). Since we know that \\(Y_1 = 1\\) this should change our view of the probability of the coin being biased or not. Let \\(B = 1\\) denote the event that the coin is biased and let \\(B = 0\\) denote that the coin is unbiased. By using marginal probability, we can write \\[\\begin{align} P(Y_2 = 1 | Y_1 = 1) &= P(Y_2 = 1, B = 1 | Y_1 = 1) + P(Y_2 = 1, B = 0 | Y_1 = 1) \\\\ &= \\sum_{k=1}^2 P(Y_2 = 1 | B = k, Y_1 = 1)P(B = k | Y_1 = 1) \\\\ &= 0.8 \\frac{P(Y_1 = 1 | B = 1)P(B = 1)}{P(Y_1 = 1)} + 0.5 \\frac{P(Y_1 = 1 | B = 0)P(B = 0)}{P(Y_1 = 1)} \\\\ &= 0.8 \\frac{0.8 \\times 0.5}{0.65} + 0.5 \\frac{0.5 \\times 0.5}{0.65} \\\\ &\\approx 0.68. \\end{align}\\] For the other calculation we follow the same procedure. Let \\(X = 1\\) denote that first nine tosses are all heads (equivalent to \\(Y_1 = 1\\),…, \\(Y_9 = 1\\)). 
\\[\\begin{align} P(Y_{10} = 1 | X = 1) &= P(Y_2 = 1, B = 1 | X = 1) + P(Y_2 = 1, B = 0 | X = 1) \\\\ &= \\sum_{k=1}^2 P(Y_2 = 1 | B = k, X = 1)P(B = k | X = 1) \\\\ &= 0.8 \\frac{P(X = 1 | B = 1)P(B = 1)}{P(X = 1)} + 0.5 \\frac{P(X = 1 | B = 0)P(B = 0)}{P(X = 1)} \\\\ &= 0.8 \\frac{0.8^9 \\times 0.5}{0.5 \\times 0.8^9 + 0.5 \\times 0.5^9} + 0.5 \\frac{0.5^9 \\times 0.5}{0.5 \\times 0.8^9 + 0.5 \\times 0.5^9} \\\\ &\\approx 0.8. \\end{align}\\] 3.3 Monty Hall problem The Monty Hall problem is a famous probability puzzle with non-intuitive outcome. Many established mathematicians and statisticians had problems solving it and many even disregarded the correct solution until they’ve seen the proof by simulation. Here we will show how it can be solved relatively simply with the use of Bayes’ theorem if we select the variables in a smart way. Exercise 3.9 (Monty Hall problem) A prize is placed at random behind one of three doors. You pick a door. Now Monty Hall chooses one of the other two doors, opens it and shows you that it is empty. He then gives you the opportunity to keep your door or switch to the other unopened door. Should you stay or switch? Use Bayes’ theorem to calculate the probability of winning if you switch and if you do not. R: Check your answers in R. Solution. W.L.O.G. assume we always pick the first door. The host can only open door 2 or door 3, as he can not open the door we picked. Let \\(k \\in \\{2,3\\}\\). Let us first look at what happens if we do not change. Then we have \\[\\begin{align} P(\\text{car in 1} | \\text{open $k$}) &= \\frac{P(\\text{open $k$} | \\text{car in 1})P(\\text{car in 1})}{P(\\text{open $k$})} \\\\ &= \\frac{P(\\text{open $k$} | \\text{car in 1})P(\\text{car in 1})}{\\sum_{n=1}^3 P(\\text{open $k$} | \\text{car in $n$})P(\\text{car in $n$)}}. \\end{align}\\] The probability that he opened \\(k\\) if the car is in 1 is \\(\\frac{1}{2}\\), as he can choose between door 2 and 3 as both have a goat behind it. Let us look at the normalization constant. When \\(n = 1\\) we get the value in the nominator. When \\(n=k\\), we get 0, as he will not open the door if there’s a prize behind. The remaining option is that we select 1, the car is behind \\(k\\) and he opens the only door left. Since he can’t open 1 due to it being our pick and \\(k\\) due to having the prize, the probability of opening the remaining door is 1, and the prior probability of the car being behind this door is \\(\\frac{1}{3}\\). So we have \\[\\begin{align} P(\\text{car in 1} | \\text{open $k$}) &= \\frac{\\frac{1}{2}\\frac{1}{3}}{\\frac{1}{2}\\frac{1}{3} + \\frac{1}{3}} \\\\ &= \\frac{1}{3}. \\end{align}\\] Now let us look at what happens if we do change. Let \\(k' \\in \\{2,3\\}\\) be the door that is not opened. If we change, we select this door, so we have \\[\\begin{align} P(\\text{car in $k'$} | \\text{open $k$}) &= \\frac{P(\\text{open $k$} | \\text{car in $k'$})P(\\text{car in $k'$})}{P(\\text{open $k$})} \\\\ &= \\frac{P(\\text{open $k$} | \\text{car in $k'$})P(\\text{car in $k'$})}{\\sum_{n=1}^3 P(\\text{open $k$} | \\text{car in $n$})P(\\text{car in $n$)}}. \\end{align}\\] The denominator stays the same, the only thing that is different from before is \\(P(\\text{open $k$} | \\text{car in $k'$})\\). We have a situation where we initially selected door 1 and the car is in door \\(k'\\). The probability that the host will open door \\(k\\) is then 1, as he can not pick any other door. 
So we have \\[\\begin{align} P(\\text{car in $k'$} | \\text{open $k$}) &= \\frac{\\frac{1}{3}}{\\frac{1}{2}\\frac{1}{3} + \\frac{1}{3}} \\\\ &= \\frac{2}{3}. \\end{align}\\] Therefore it makes sense to change the door. set.seed(1) nsamps <- 1000 ifchange <- vector(mode = "logical", length = nsamps) ifstay <- vector(mode = "logical", length = nsamps) for (i in 1:nsamps) { where_car <- sample(c(1:3), 1) where_player <- sample(c(1:3), 1) open_samp <- (1:3)[where_car != (1:3) & where_player != (1:3)] if (length(open_samp) == 1) { where_open <- open_samp } else { where_open <- sample(open_samp, 1) } ifstay[i] <- where_car == where_player where_ifchange <- (1:3)[where_open != (1:3) & where_player != (1:3)] ifchange[i] <- where_ifchange == where_car } sum(ifstay) / nsamps ## [1] 0.328 sum(ifchange) / nsamps ## [1] 0.672 "],["rvs.html", "Chapter 4 Random variables 4.1 General properties and calculations 4.2 Discrete random variables 4.3 Continuous random variables 4.4 Singular random variables 4.5 Transformations", " Chapter 4 Random variables This chapter deals with random variables and their distributions. The students are expected to acquire the following knowledge: Theoretical Identification of random variables. Convolutions of random variables. Derivation of PDF, PMF, CDF, and quantile function. Definitions and properties of common discrete random variables. Definitions and properties of common continuous random variables. Transforming univariate random variables. R Familiarize with PDF, PMF, CDF, and quantile functions for several distributions. Visual inspection of probability distributions. Analytical and empirical calculation of probabilities based on distributions. New R functions for plotting (for example, facet_wrap). Creating random number generators based on the Uniform distribution. .fold-btn { float: right; margin: 5px 5px 0 0; } .fold { border: 1px solid black; min-height: 40px; } 4.1 General properties and calculations Exercise 4.1 Which of the functions below are valid CDFs? Find their respective densities. R: Plot the three functions. \\[\\begin{equation} F(x) = \\begin{cases} 1 - e^{-x^2} & x \\geq 0 \\\\ 0 & x < 0. \\end{cases} \\end{equation}\\] \\[\\begin{equation} F(x) = \\begin{cases} e^{-\\frac{1}{x}} & x > 0 \\\\ 0 & x \\leq 0. \\end{cases} \\end{equation}\\] \\[\\begin{equation} F(x) = \\begin{cases} 0 & x \\leq 0 \\\\ \\frac{1}{3} & 0 < x \\leq \\frac{1}{2} \\\\ 1 & x > \\frac{1}{2}. \\end{cases} \\end{equation}\\] Solution. Yes. First, let us check the limits. \\(\\lim_{x \\rightarrow -\\infty} (0) = 0\\). \\(\\lim_{x \\rightarrow \\infty} (1 - e^{-x^2}) = 1 - \\lim_{x \\rightarrow \\infty} e^{-x^2} = 1 - 0 = 1\\). Second, let us check whether the function is increasing. Let \\(x > y \\geq 0\\). Then \\(1 - e^{-x^2} \\geq 1 - e^{-y^2}\\). We only have to check right continuity for the point zero. \\(F(0) = 0\\) and \\(\\lim_{\\epsilon \\downarrow 0}F (0 + \\epsilon) = \\lim_{\\epsilon \\downarrow 0} 1 - e^{-\\epsilon^2} = 1 - \\lim_{\\epsilon \\downarrow 0} e^{-\\epsilon^2} = 1 - 1 = 0\\). We get the density by differentiating the CDF. \\(p(x) = \\frac{d}{dx} 1 - e^{-x^2} = 2xe^{-x^2}.\\) Students are encouraged to check that this is a proper PDF. Yes. First, let us check the limits. $_{x -} (0) = 0 and \\(\\lim_{x \\rightarrow \\infty} (e^{-\\frac{1}{x}}) = 1\\). Second, let us check whether the function is increasing. Let \\(x > y \\geq 0\\). Then \\(e^{-\\frac{1}{x}} \\geq e^{-\\frac{1}{y}}\\). We only have to check right continuity for the point zero. 
\\(F(0) = 0\\) and \\(\\lim_{\\epsilon \\downarrow 0}F (0 + \\epsilon) = \\lim_{\\epsilon \\downarrow 0} e^{-\\frac{1}{\\epsilon}} = 0\\). We get the density by differentiating the CDF. \\(p(x) = \\frac{d}{dx} e^{-\\frac{1}{x}} = \\frac{1}{x^2}e^{-\\frac{1}{x}}.\\) Students are encouraged to check that this is a proper PDF. No. The function is not right continuous as \\(F(\\frac{1}{2}) = \\frac{1}{3}\\), but \\(\\lim_{\\epsilon \\downarrow 0} F(\\frac{1}{2} + \\epsilon) = 1\\). f1 <- function (x) { tmp <- 1 - exp(-x^2) tmp[x < 0] <- 0 return(tmp) } f2 <- function (x) { tmp <- exp(-(1 / x)) tmp[x <= 0] <- 0 return(tmp) } f3 <- function (x) { tmp <- x tmp[x == x] <- 1 tmp[x <= 0.5] <- 1/3 tmp[x <= 0] <- 0 return(tmp) } cdf_data <- tibble(x = seq(-1, 20, by = 0.001), f1 = f1(x), f2 = f2(x), f3 = f3(x)) %>% melt(id.vars = "x") cdf_plot <- ggplot(data = cdf_data, aes(x = x, y = value, color = variable)) + geom_hline(yintercept = 1) + geom_line() plot(cdf_plot) Exercise 4.2 Let \\(X\\) be a random variable with CDF \\[\\begin{equation} F(x) = \\begin{cases} 0 & x < 0 \\\\ \\frac{x^2}{2} & 0 \\leq x < 1 \\\\ \\frac{1}{2} + \\frac{p}{2} & 1 \\leq x < 2 \\\\ \\frac{1}{2} + \\frac{p}{2} + \\frac{1 - p}{2} & x \\geq 2 \\end{cases} \\end{equation}\\] R: Plot this CDF for \\(p = 0.3\\). Is it a discrete, continuous, or mixed random varible? Find the probability density/mass of \\(X\\). f1 <- function (x, p) { tmp <- x tmp[x >= 2] <- 0.5 + (p * 0.5) + ((1-p) * 0.5) tmp[x < 2] <- 0.5 + (p * 0.5) tmp[x < 1] <- (x[x < 1])^2 / 2 tmp[x < 0] <- 0 return(tmp) } cdf_data <- tibble(x = seq(-1, 5, by = 0.001), y = f1(x, 0.3)) cdf_plot <- ggplot(data = cdf_data, aes(x = x, y = y)) + geom_hline(yintercept = 1) + geom_line(color = "blue") plot(cdf_plot) ::: {.solution} \\(X\\) is a mixed random variable. Since \\(X\\) is a mixed random variable, we have to find the PDF of the continuous part and the PMF of the discrete part. We get the continuous part by differentiating the corresponding CDF, \\(\\frac{d}{dx}\\frac{x^2}{2} = x\\). So the PDF, when \\(0 \\leq x < 1\\), is \\(p(x) = x\\). Let us look at the discrete part now. It has two steps, so this is a discrete distribution with two outcomes – numbers 1 and 2. The first happens with probability \\(\\frac{p}{2}\\), and the second with probability \\(\\frac{1 - p}{2}\\). This reminds us of the Bernoulli distribution. The PMF for the discrete part is \\(P(X = x) = (\\frac{p}{2})^{2 - x} (\\frac{1 - p}{2})^{x - 1}\\). ::: Exercise 4.3 (Convolutions) Convolutions are probability distributions that correspond to sums of independent random variables. Let \\(X\\) and \\(Y\\) be independent discrete variables. Find the PMF of \\(Z = X + Y\\). Hint: Use the law of total probability. Let \\(X\\) and \\(Y\\) be independent continuous variables. Find the PDF of \\(Z = X + Y\\). Hint: Start with the CDF. Solution. \\[\\begin{align} P(Z = z) &= P(X + Y = z) & \\\\ &= \\sum_{k = -\\infty}^\\infty P(X + Y = z | Y = k) P(Y = k) & \\text{ (law of total probability)} \\\\ &= \\sum_{k = -\\infty}^\\infty P(X + k = z | Y = k) P(Y = k) & \\\\ &= \\sum_{k = -\\infty}^\\infty P(X + k = z) P(Y = k) & \\text{ (independence of $X$ and $Y$)} \\\\ &= \\sum_{k = -\\infty}^\\infty P(X = z - k) P(Y = k). & \\end{align}\\] Let \\(f\\) and \\(g\\) be the PDFs of \\(X\\) and \\(Y\\) respectively. 
\\[\\begin{align} F(z) &= P(Z < z) \\\\ &= P(X + Y < z) \\\\ &= \\int_{-\\infty}^{\\infty} P(X + Y < z | Y = y)P(Y = y)dy \\\\ &= \\int_{-\\infty}^{\\infty} P(X + y < z | Y = y)P(Y = y)dy \\\\ &= \\int_{-\\infty}^{\\infty} P(X + y < z)P(Y = y)dy \\\\ &= \\int_{-\\infty}^{\\infty} P(X < z - y)P(Y = y)dy \\\\ &= \\int_{-\\infty}^{\\infty} (\\int_{-\\infty}^{z - y} f(x) dx) g(y) dy \\end{align}\\] Now \\[\\begin{align} p(z) &= \\frac{d}{dz} F(z) & \\\\ &= \\int_{-\\infty}^{\\infty} (\\frac{d}{dz}\\int_{-\\infty}^{z - y} f(x) dx) g(y) dy & \\\\ &= \\int_{-\\infty}^{\\infty} f(z - y) g(y) dy & \\text{ (fundamental theorem of calculus)}. \\end{align}\\] 4.2 Discrete random variables Exercise 4.4 (Binomial random variable) Let \\(X_k\\), \\(k = 1,...,n\\), be random variables with the Bernoulli measure as the PMF. Let \\(X = \\sum_{k=1}^n X_k\\). We call \\(X_k\\) a Bernoulli random variable with parameter \\(p \\in (0,1)\\). Find the CDF of \\(X_k\\). Find PMF of \\(X\\). This is a Binomial random variable with support in \\(\\{0,1,2,...,n\\}\\) and parameters \\(p \\in (0,1)\\) and \\(n \\in \\mathbb{N}_0\\). We denote \\[\\begin{equation} X | n,p \\sim \\text{binomial}(n,p). \\end{equation}\\] Find CDF of \\(X\\). R: Simulate from the binomial distribution with \\(n = 10\\) and \\(p = 0.5\\), and from \\(n\\) Bernoulli distributions with \\(p = 0.5\\). Visually compare the sum of Bernoullis and the binomial. Hint: there is no standard function like rpois for a Bernoulli random variable. Check exercise 1.12 to find out how to sample from a Bernoulli distribution. Solution. There are two outcomes – zero and one. Zero happens with probability \\(1 - p\\). Therefore \\[\\begin{equation} F(k) = \\begin{cases} 0 & k < 0 \\\\ 1 - p & 0 \\leq k < 1 \\\\ 1 & k \\geq 1. \\end{cases} \\end{equation}\\] For the probability of \\(X\\) to be equal to some \\(k \\leq n\\), exactly \\(k\\) Bernoulli variables need to be one, and the others zero. So \\(p^k(1-p)^{n-k}\\). There are \\(\\binom{n}{k}\\) such possible arrangements. Therefore \\[\\begin{align} P(X = k) = \\binom{n}{k} p^k (1 - p)^{n-k}. \\end{align}\\] \\[\\begin{equation} F(k) = \\sum_{i = 0}^{\\lfloor k \\rfloor} \\binom{n}{i} p^i (1 - p)^{n - i} \\end{equation}\\] set.seed(1) nsamps <- 10000 binom_samp <- rbinom(nsamps, size = 10, prob = 0.5) bernoulli_mat <- matrix(data = NA, nrow = nsamps, ncol = 10) for (i in 1:nsamps) { bernoulli_mat[i, ] <- rbinom(10, size = 1, prob = 0.5) } bern_samp <- apply(bernoulli_mat, 1, sum) b_data <- tibble(x = c(binom_samp, bern_samp), type = c(rep("binomial", 10000), rep("Bernoulli_sum", 10000))) b_plot <- ggplot(data = b_data, aes(x = x, fill = type)) + geom_bar(position = "dodge") plot(b_plot) Exercise 4.5 (Geometric random variable) A variable with PMF \\[\\begin{equation} P(k) = p(1-p)^k \\end{equation}\\] is a geometric random variable with support in non-negative integers. It has one parameter \\(p \\in (0,1]\\). We denote \\[\\begin{equation} X | p \\sim \\text{geometric}(p) \\end{equation}\\] Derive the CDF of a geometric random variable. R: Draw 1000 samples from the geometric distribution with \\(p = 0.3\\) and compare their frequencies to theoretical values. Solution. 
\\[\\begin{align} P(X \\leq k) &= \\sum_{i = 0}^k p(1-p)^i \\\\ &= p \\sum_{i = 0}^k (1-p)^i \\\\ &= p \\frac{1 - (1-p)^{k+1}}{1 - (1 - p)} \\\\ &= 1 - (1-p)^{k + 1} \\end{align}\\] set.seed(1) geo_samp <- rgeom(n = 1000, prob = 0.3) geo_samp <- data.frame(x = geo_samp) %>% count(x) %>% mutate(n = n / 1000, type = "empirical_frequencies") %>% bind_rows(data.frame(x = 0:20, n = dgeom(0:20, prob = 0.3), type = "theoretical_measure")) geo_plot <- ggplot(data = geo_samp, aes(x = x, y = n, fill = type)) + geom_bar(stat="identity", position = "dodge") plot(geo_plot) Exercise 4.6 (Poisson random variable) A variable with PMF \\[\\begin{equation} P(k) = \\frac{\\lambda^k e^{-\\lambda}}{k!} \\end{equation}\\] is a Poisson random variable with support in non-negative integers. It has one positive parameter \\(\\lambda\\), which also represents its mean value and variance (a measure of the deviation of the values from the mean – more on mean and variance in the next chapter). We denote \\[\\begin{equation} X | \\lambda \\sim \\text{Poisson}(\\lambda). \\end{equation}\\] This distribution is usually the default choice for modeling counts. We have already encountered a Poisson random variable in exercise 1.13, where we also sampled from this distribution. The CDF of a Poisson random variable is \\(P(X <= x) = e^{-\\lambda} \\sum_{i=0}^x \\frac{\\lambda^{i}}{i!}\\). R: Draw 1000 samples from the Poisson distribution with \\(\\lambda = 5\\) and compare their empirical cumulative distribution function with the theoretical CDF. set.seed(1) pois_samp <- rpois(n = 1000, lambda = 5) pois_samp <- data.frame(x = pois_samp) pois_plot <- ggplot(data = pois_samp, aes(x = x, colour = "ECDF")) + stat_ecdf(geom = "step") + geom_step(data = tibble(x = 0:17, y = ppois(x, 5)), aes(x = x, y = y, colour = "CDF")) + scale_colour_manual("Lgend title", values = c("black", "red")) plot(pois_plot) Exercise 4.7 (Negative binomial random variable) A variable with PMF \\[\\begin{equation} p(k) = \\binom{k + r - 1}{k}(1-p)^r p^k \\end{equation}\\] is a negative binomial random variable with support in non-negative integers. It has two parameters \\(r > 0\\) and \\(p \\in (0,1)\\). We denote \\[\\begin{equation} X | r,p \\sim \\text{NB}(r,p). \\end{equation}\\] Let us reparameterize the negative binomial distribution with \\(q = 1 - p\\). Find the PMF of \\(X \\sim \\text{NB}(1, q)\\). Do you recognize this distribution? Show that the sum of two negative binomial random variables with the same \\(p\\) is also a negative binomial random variable. Hint: Use the fact that the number of ways to place \\(n\\) indistinct balls into \\(k\\) boxes is \\(\\binom{n + k - 1}{n}\\). R: Draw samples from \\(X \\sim \\text{NB}(5, 0.4)\\) and \\(Y \\sim \\text{NB}(3, 0.4)\\). Draw samples from \\(Z = X + Y\\), where you use the parameters calculated in b). Plot both distributions, their sum, and \\(Z\\) using facet_wrap. Be careful, as R uses a different parameterization size=\\(r\\) and prob=\\(1 - p\\). Solution. \\[\\begin{align} P(X = k) &= \\binom{k + 1 - 1}{k}q^1 (1-q)^k \\\\ &= q(1-q)^k. \\end{align}\\] This is the geometric distribution. Let \\(X \\sim \\text{NB}(r_1, p)\\) and \\(Y \\sim \\text{NB}(r_2, p)\\). Let \\(Z = X + Y\\). 
\\[\\begin{align} P(Z = z) &= \\sum_{k = 0}^{\\infty} P(X = z - k)P(Y = k), \\text{ if k < 0, then the probabilities are 0} \\\\ &= \\sum_{k = 0}^{z} P(X = z - k)P(Y = k), \\text{ if k > z, then the probabilities are 0} \\\\ &= \\sum_{k = 0}^{z} \\binom{z - k + r_1 - 1}{z - k}(1 - p)^{r_1} p^{z - k} \\binom{k + r_2 - 1}{k}(1 - p)^{r_2} p^{k} & \\\\ &= \\sum_{k = 0}^{z} \\binom{z - k + r_1 - 1}{z - k} \\binom{k + r_2 - 1}{k}(1 - p)^{r_1 + r_2} p^{z} & \\\\ &= (1 - p)^{r_1 + r_2} p^{z} \\sum_{k = 0}^{z} \\binom{z - k + r_1 - 1}{z - k} \\binom{k + r_2 - 1}{k}& \\end{align}\\] The part before the sum reminds us of the negative binomial distribution with parameters \\(r_1 + r_2\\) and \\(p\\). To complete this term to the negative binomial PMF we need \\(\\binom{z + r_1 + r_2 -1}{z}\\). So the only thing we need to prove is that the sum equals this term. Both terms in the sum can be interpreted as numbers of ways to place a number of balls into boxes. For the left term it is \\(z-k\\) balls into \\(r_1\\) boxes, and for the right \\(k\\) balls into \\(r_2\\) boxes. For each \\(k\\) we are distributing \\(z\\) balls in total. By summing over all \\(k\\), we actually get all the possible placements of \\(z\\) balls into \\(r_1 + r_2\\) boxes. Therefore \\[\\begin{align} P(Z = z) &= (1 - p)^{r_1 + r_2} p^{z} \\sum_{k = 0}^{z} \\binom{z - k + r_1 - 1}{z - k} \\binom{k + r_2 - 1}{k}& \\\\ &= \\binom{z + r_1 + r_2 -1}{z} (1 - p)^{r_1 + r_2} p^{z}. \\end{align}\\] From this it also follows that the sum of geometric distributions with the same parameter is a negative binomial distribution. \\(Z \\sim \\text{NB}(8, 0.4)\\). set.seed(1) nsamps <- 10000 x <- rnbinom(nsamps, size = 5, prob = 0.6) y <- rnbinom(nsamps, size = 3, prob = 0.6) xpy <- x + y z <- rnbinom(nsamps, size = 8, prob = 0.6) samps <- tibble(x, y, xpy, z) samps <- melt(samps) ggplot(data = samps, aes(x = value)) + geom_bar() + facet_wrap(~ variable) 4.3 Continuous random variables Exercise 4.8 (Exponential random variable) A variable \\(X\\) with PDF \\(\\lambda e^{-\\lambda x}\\) is an exponential random variable with support in non-negative real numbers. It has one positive parameter \\(\\lambda\\). We denote \\[\\begin{equation} X | \\lambda \\sim \\text{Exp}(\\lambda). \\end{equation}\\] Find the CDF of an exponential random variable. Find the quantile function of an exponential random variable. Calculate the probability \\(P(1 \\leq X \\leq 3)\\), where \\(X \\sim \\text{Exp(1.5)}\\). R: Check your answer to c) with a simulation (rexp). Plot the probability in a meaningful way. R: Implement PDF, CDF, and the quantile function and compare their values with corresponding R functions visually. Hint: use the size parameter to make one of the curves wider. Solution. \\[\\begin{align} F(x) &= \\int_{0}^{x} \\lambda e^{-\\lambda t} dt \\\\ &= \\lambda \\int_{0}^{x} e^{-\\lambda t} dt \\\\ &= \\lambda (\\frac{1}{-\\lambda}e^{-\\lambda t} |_{0}^{x}) \\\\ &= \\lambda(\\frac{1}{\\lambda} - \\frac{1}{\\lambda} e^{-\\lambda x}) \\\\ &= 1 - e^{-\\lambda x}. \\end{align}\\] \\[\\begin{align} F(F^{-1}(x)) &= x \\\\ 1 - e^{-\\lambda F^{-1}(x)} &= x \\\\ e^{-\\lambda F^{-1}(x)} &= 1 - x \\\\ -\\lambda F^{-1}(x) &= \\ln(1 - x) \\\\ F^{-1}(x) &= - \\frac{ln(1 - x)}{\\lambda}. \\end{align}\\] \\[\\begin{align} P(1 \\leq X \\leq 3) &= P(X \\leq 3) - P(X \\leq 1) \\\\ &= P(X \\leq 3) - P(X \\leq 1) \\\\ &= 1 - e^{-1.5 \\times 3} - 1 + e^{-1.5 \\times 1} \\\\ &\\approx 0.212. 
\\end{align}\\] set.seed(1) nsamps <- 1000 samps <- rexp(nsamps, rate = 1.5) sum(samps >= 1 & samps <= 3) / nsamps ## [1] 0.212 exp_plot <- ggplot(data.frame(x = seq(0, 5, by = 0.01)), aes(x = x)) + stat_function(fun = dexp, args = list(rate = 1.5)) + stat_function(fun = dexp, args = list(rate = 1.5), xlim = c(1,3), geom = "area", fill = "red") plot(exp_plot) exp_pdf <- function(x, lambda) { return (lambda * exp(-lambda * x)) } exp_cdf <- function(x, lambda) { return (1 - exp(-lambda * x)) } exp_quant <- function(q, lambda) { return (-(log(1 - q) / lambda)) } ggplot(data = data.frame(x = seq(0, 5, by = 0.01)), aes(x = x)) + stat_function(fun = dexp, args = list(rate = 1.5), aes(color = "R"), size = 2.5) + stat_function(fun = exp_pdf, args = list(lambda = 1.5), aes(color = "Mine"), size = 1.2) + scale_color_manual(values = c("red", "black")) ggplot(data = data.frame(x = seq(0, 5, by = 0.01)), aes(x = x)) + stat_function(fun = pexp, args = list(rate = 1.5), aes(color = "R"), size = 2.5) + stat_function(fun = exp_cdf, args = list(lambda = 1.5), aes(color = "Mine"), size = 1.2) + scale_color_manual(values = c("red", "black")) ggplot(data = data.frame(x = seq(0, 1, by = 0.01)), aes(x = x)) + stat_function(fun = qexp, args = list(rate = 1.5), aes(color = "R"), size = 2.5) + stat_function(fun = exp_quant, args = list(lambda = 1.5), aes(color = "Mine"), size = 1.2) + scale_color_manual(values = c("red", "black")) Exercise 4.9 (Uniform random variable) Continuous uniform random variable with parameters \\(a\\) and \\(b\\) has the PDF \\[\\begin{equation} p(x) = \\begin{cases} \\frac{1}{b - a} & x \\in [a,b] \\\\ 0 & \\text{otherwise}. \\end{cases} \\end{equation}\\] Find the CDF of the uniform random variable. Find the quantile function of the uniform random variable. Let \\(X \\sim \\text{Uniform}(a,b)\\). Find the CDF of the variable \\(Y = \\frac{X - a}{b - a}\\). This is the standard uniform random variable. Let \\(X \\sim \\text{Uniform}(-1, 3)\\). Find such \\(z\\) that \\(P(X < z + \\mu_x) = \\frac{1}{5}\\). R: Check your result from d) using simulation. Solution. \\[\\begin{align} F(x) &= \\int_{a}^x \\frac{1}{b - a} dt \\\\ &= \\frac{1}{b - a} \\int_{a}^x dt \\\\ &= \\frac{x - a}{b - a}. \\end{align}\\] \\[\\begin{align} F(F^{-1}(p)) &= p \\\\ \\frac{F^{-1}(p) - a}{b - a} &= p \\\\ F^{-1}(p) &= p(b - a) + a. \\end{align}\\] \\[\\begin{align} F_Y(y) &= P(Y < y) \\\\ &= P(\\frac{X - a}{b - a} < y) \\\\ &= P(X < y(b - a) + a) \\\\ &= F_X(y(b - a) + a) \\\\ &= \\frac{(y(b - a) + a) - a}{b - a} \\\\ &= y. \\end{align}\\] \\[\\begin{align} P(X < z + 1) &= \\frac{1}{5} \\\\ F(z + 1) &= \\frac{1}{5} \\\\ z + 1 &= F^{-1}(\\frac{1}{5}) \\\\ z &= \\frac{1}{5}4 - 1 - 1 \\\\ z &= -1.2. \\end{align}\\] set.seed(1) a <- -1 b <- 3 nsamps <- 10000 unif_samp <- runif(nsamps, a, b) mu_x <- mean(unif_samp) new_samp <- unif_samp - mu_x quantile(new_samp, probs = 1/5) ## 20% ## -1.203192 punif(-0.2, -1, 3) ## [1] 0.2 Exercise 4.10 (Beta random variable) A variable \\(X\\) with PDF \\[\\begin{equation} p(x) = \\frac{x^{\\alpha - 1} (1 - x)^{\\beta - 1}}{\\text{B}(\\alpha, \\beta)}, \\end{equation}\\] where \\(\\text{B}(\\alpha, \\beta) = \\frac{\\Gamma(\\alpha) \\Gamma(\\beta)}{\\Gamma(\\alpha + \\beta)}\\) and \\(\\Gamma(x) = \\int_0^{\\infty} x^{z - 1} e^{-x} dx\\) is a Beta random variable with support on \\([0,1]\\). It has two positive parameters \\(\\alpha\\) and \\(\\beta\\). 
Notation: \\[\\begin{equation} X | \\alpha, \\beta \\sim \\text{Beta}(\\alpha, \\beta) \\end{equation}\\] It is often used in modeling rates. Calculate the PDF for \\(\\alpha = 1\\) and \\(\\beta = 1\\). What do you notice? R: Plot densities of the beta distribution for parameter pairs (2, 2), (4, 1), (1, 4), (2, 5), and (0.1, 0.1). R: Sample from \\(X \\sim \\text{Beta}(2, 5)\\) and compare the histogram with Beta PDF. Solution. \\[\\begin{equation} p(x) = \\frac{x^{1 - 1} (1 - x)^{1 - 1}}{\\text{B}(1, 1)} = 1. \\end{equation}\\] This is the standard uniform distribution. set.seed(1) ggplot(data = data.frame(x = seq(0, 1, by = 0.01)), aes(x = x)) + stat_function(fun = dbeta, args = list(shape1 = 2, shape2 = 2), aes(color = "alpha = 0.5")) + stat_function(fun = dbeta, args = list(shape1 = 4, shape2 = 1), aes(color = "alpha = 4")) + stat_function(fun = dbeta, args = list(shape1 = 1, shape2 = 4), aes(color = "alpha = 1")) + stat_function(fun = dbeta, args = list(shape1 = 2, shape2 = 5), aes(color = "alpha = 25")) + stat_function(fun = dbeta, args = list(shape1 = 0.1, shape2 = 0.1), aes(color = "alpha = 0.1")) set.seed(1) nsamps <- 1000 samps <- rbeta(nsamps, 2, 5) ggplot(data = data.frame(x = samps), aes(x = x)) + geom_histogram(aes(y = ..density..), color = "black") + stat_function(data = data.frame(x = seq(0, 1, by = 0.01)), aes(x = x), fun = dbeta, args = list(shape1 = 2, shape2 = 5), color = "red", size = 1.2) Exercise 4.11 (Gamma random variable) A random variable with PDF \\[\\begin{equation} p(x) = \\frac{\\beta^\\alpha}{\\Gamma(\\alpha)} x^{\\alpha - 1}e^{-\\beta x} \\end{equation}\\] is a Gamma random variable with support on the positive numbers and parameters shape \\(\\alpha > 0\\) and rate \\(\\beta > 0\\). We write \\[\\begin{equation} X | \\alpha, \\beta \\sim \\text{Gamma}(\\alpha, \\beta) \\end{equation}\\] and it’s CDF is \\[\\begin{equation} \\frac{\\gamma(\\alpha, \\beta x)}{\\Gamma(\\alpha)}, \\end{equation}\\] where \\(\\gamma(s, x) = \\int_0^x t^{s-1} e^{-t} dt\\). It is usually used in modeling positive phenomena (for example insurance claims and rainfalls). Let \\(X \\sim \\text{Gamma}(1, \\beta)\\). Find the PDF of \\(X\\). Do you recognize this PDF? Let \\(k = \\alpha\\) and \\(\\theta = \\frac{1}{\\beta}\\). Find the PDF of \\(X | k, \\theta \\sim \\text{Gamma}(k, \\theta)\\). Random variables can be reparameterized, and sometimes a reparameterized distribution is more suitable for certain calculations. The first parameterization is for example usually used in Bayesian statistics, while this parameterization is more common in econometrics and some other applied fields. Note that you also need to pay attention to the parameters in statistical software, so diligently read the help files when using functions like rgamma to see how the function is parameterized. R: Plot gamma CDF for random variables with shape and rate parameters (1,1), (10,1), (1,10). Solution. \\[\\begin{align} p(x) &= \\frac{\\beta^1}{\\Gamma(1)} x^{1 - 1}e^{-\\beta x} \\\\ &= \\beta e^{-\\beta x} \\end{align}\\] This is the PDF of the exponential distribution with parameter \\(\\beta\\). \\[\\begin{align} p(x) &= \\frac{1}{\\Gamma(k)\\beta^k} x^{k - 1}e^{-\\frac{x}{\\theta}}. 
\\end{align}\\] set.seed(1) ggplot(data = data.frame(x = seq(0, 25, by = 0.01)), aes(x = x)) + stat_function(fun = pgamma, args = list(shape = 1, rate = 1), aes(color = "Gamma(1,1)")) + stat_function(fun = pgamma, args = list(shape = 10, rate = 1), aes(color = "Gamma(10,1)")) + stat_function(fun = pgamma, args = list(shape = 1, rate = 10), aes(color = "Gamma(1,10)")) Exercise 4.12 (Normal random variable) A random variable with PDF \\[\\begin{equation} p(x) = \\frac{1}{\\sqrt{2 \\pi \\sigma^2}} e^{-\\frac{(x - \\mu)^2}{2 \\sigma^2}} \\end{equation}\\] is a normal random variable with support on the real axis and parameters \\(\\mu\\) in reals and \\(\\sigma^2 > 0\\). The first is the mean parameter and the second is the variance parameter. Many statistical methods assume a normal distribution. We denote \\[\\begin{equation} X | \\mu, \\sigma \\sim \\text{N}(\\mu, \\sigma^2), \\end{equation}\\] and it’s CDF is \\[\\begin{equation} F(x) = \\int_{-\\infty}^x \\frac{1}{\\sqrt{2 \\pi \\sigma^2}} e^{-\\frac{(t - \\mu)^2}{2 \\sigma^2}} dt, \\end{equation}\\] which is intractable and is usually approximated. Due to its flexibility it is also one of the most researched distributions. For that reason statisticians often use transformations of variables or approximate distributions with the normal distribution. Show that a variable \\(\\frac{X - \\mu}{\\sigma} \\sim \\text{N}(0,1)\\). This transformation is called standardization, and \\(\\text{N}(0,1)\\) is a standard normal distribution. R: Plot the normal distribution with \\(\\mu = 0\\) and different values for the \\(\\sigma\\) parameter. R: The normal distribution provides a good approximation for the Poisson distribution with a large \\(\\lambda\\). Let \\(X \\sim \\text{Poisson}(50)\\). Approximate \\(X\\) with the normal distribution and compare its density with the Poisson histogram. What are the values of \\(\\mu\\) and \\(\\sigma^2\\) that should provide the best approximation? Note that R function rnorm takes standard deviation (\\(\\sigma\\)) as a parameter and not variance. Solution. \\[\\begin{align} P(\\frac{X - \\mu}{\\sigma} < x) &= P(X < \\sigma x + \\mu) \\\\ &= F(\\sigma x + \\mu) \\\\ &= \\int_{-\\infty}^{\\sigma x + \\mu} \\frac{1}{\\sqrt{2 \\pi \\sigma^2}} e^{-\\frac{(t - \\mu)^2}{2\\sigma^2}} dt \\end{align}\\] Now let \\(s = f(t) = \\frac{t - \\mu}{\\sigma}\\), then \\(ds = \\frac{dt}{\\sigma}\\) and \\(f(\\sigma x + \\mu) = x\\), so \\[\\begin{align} P(\\frac{X - \\mu}{\\sigma} < x) &= \\int_{-\\infty}^{x} \\frac{1}{\\sqrt{2 \\pi}} e^{-\\frac{s^2}{2}} ds. \\end{align}\\] There is no need to evaluate this integral, as we recognize it as the CDF of a normal distribution with \\(\\mu = 0\\) and \\(\\sigma^2 = 1\\). 
set.seed(1) # b ggplot(data = data.frame(x = seq(-15, 15, by = 0.01)), aes(x = x)) + stat_function(fun = dnorm, args = list(mean = 0, sd = 1), aes(color = "sd = 1")) + stat_function(fun = dnorm, args = list(mean = 0, sd = 0.4), aes(color = "sd = 0.1")) + stat_function(fun = dnorm, args = list(mean = 0, sd = 2), aes(color = "sd = 2")) + stat_function(fun = dnorm, args = list(mean = 0, sd = 5), aes(color = "sd = 5")) # c mean_par <- 50 nsamps <- 100000 pois_samps <- rpois(nsamps, lambda = mean_par) norm_samps <- rnorm(nsamps, mean = mean_par, sd = sqrt(mean_par)) my_plot <- ggplot() + geom_bar(data = tibble(x = pois_samps), aes(x = x, y = (..count..)/sum(..count..))) + geom_density(data = tibble(x = norm_samps), aes(x = x), color = "red") plot(my_plot) Exercise 4.13 (Logistic random variable) A logistic random variable has CDF \\[\\begin{equation} F(x) = \\frac{1}{1 + e^{-\\frac{x - \\mu}{s}}}, \\end{equation}\\] where \\(\\mu\\) is real and \\(s > 0\\). The support is on the real axis. We denote \\[\\begin{equation} X | \\mu, s \\sim \\text{Logistic}(\\mu, s). \\end{equation}\\] The distribution of the logistic random variable resembles a normal random variable, however it has heavier tails. Find the PDF of a logistic random variable. R: Implement logistic PDF and CDF and visually compare both for \\(X \\sim \\text{N}(0, 1)\\) and \\(Y \\sim \\text{logit}(0, \\sqrt{\\frac{3}{\\pi^2}})\\). These distributions have the same mean and variance. Additionally, plot the same plot on the interval [5,10], to better see the difference in the tails. R: For the distributions in b) find the probability \\(P(|X| > 4)\\) and interpret the result. Solution. \\[\\begin{align} p(x) &= \\frac{d}{dx} \\frac{1}{1 + e^{-\\frac{x - \\mu}{s}}} \\\\ &= \\frac{- \\frac{d}{dx} (1 + e^{-\\frac{x - \\mu}{s}})}{(1 + e{-\\frac{x - \\mu}{s}})^2} \\\\ &= \\frac{e^{-\\frac{x - \\mu}{s}}}{s(1 + e{-\\frac{x - \\mu}{s}})^2}. \\end{align}\\] # b set.seed(1) logit_pdf <- function (x, mu, s) { return ((exp(-(x - mu)/(s))) / (s * (1 + exp(-(x - mu)/(s)))^2)) } nl_plot <- ggplot(data = data.frame(x = seq(-12, 12, by = 0.01)), aes(x = x)) + stat_function(fun = dnorm, args = list(mean = 0, sd = 2), aes(color = "normal")) + stat_function(fun = logit_pdf, args = list(mu = 0, s = sqrt(12/pi^2)), aes(color = "logit")) plot(nl_plot) nl_plot <- ggplot(data = data.frame(x = seq(5, 10, by = 0.01)), aes(x = x)) + stat_function(fun = dnorm, args = list(mean = 0, sd = 2), aes(color = "normal")) + stat_function(fun = logit_pdf, args = list(mu = 0, s = sqrt(12/pi^2)), aes(color = "logit")) plot(nl_plot) # c logit_cdf <- function (x, mu, s) { return (1 / (1 + exp(-(x - mu) / s))) } p_logistic <- 1 - logit_cdf(4, 0, sqrt(12/pi^2)) + logit_cdf(-4, 0, sqrt(12/pi^2)) p_norm <- 1 - pnorm(4, 0, 2) + pnorm(-4, 0, 2) p_logistic ## [1] 0.05178347 p_norm ## [1] 0.04550026 # Logistic distribution has wider tails, therefore the probability of larger # absolute values is higher. 4.4 Singular random variables Exercise 4.14 (Cantor distribution) The Cantor set is a subset of \\([0,1]\\), which we create by iteratively deleting the middle third of the interval. For example, in the first iteration, we get the sets \\([0,\\frac{1}{3}]\\) and \\([\\frac{2}{3},1]\\). In the second iteration, we get \\([0,\\frac{1}{9}]\\), \\([\\frac{2}{9},\\frac{1}{3}]\\), \\([\\frac{2}{3}, \\frac{7}{9}]\\), and \\([\\frac{8}{9}, 1]\\). 
On the \\(n\\)-th iteration, we have \\[\\begin{equation} C_n = \\frac{C_{n-1}}{3} \\cup \\bigg(\\frac{2}{3} + \\frac{C_{n-1}}{3} \\bigg), \\end{equation}\\] where \\(C_0 = [0,1]\\). The Cantor set is then defined as the intersection of these sets \\[\\begin{equation} C = \\cap_{n=1}^{\\infty} C_n. \\end{equation}\\] It has the same cardinality as \\([0,1]\\). Another way to define the Cantor set is the set of all numbers on \\([0,1]\\), that do not have a 1 in the ternary representation \\(x = \\sum_{n=1}^\\infty \\frac{x_i}{3^i}, x_i \\in \\{0,1,2\\}\\). A random variable follows the Cantor distribution, if its CDF is the Cantor function (below). You can find the implementations of random number generator, CDF, and quantile functions for the Cantor distributions at https://github.com/Henrygb/CantorDist.R. Show that the Lebesgue measure of the Cantor set is 0. (Jagannathan) Let us look at an infinite sequence of independent fair-coin tosses. If the outcome is heads, let \\(x_i = 2\\) and \\(x_i = 0\\), when tails. Then use these to create \\(x = \\sum_{n=1}^\\infty \\frac{x_i}{3^i}\\). This is a random variable with the Cantor distribution. Show that \\(X\\) has a singular distribution. Solution. \\[\\begin{align} \\lambda(C) &= 1 - \\lambda(C^c) \\\\ &= 1 - \\frac{1}{3}\\sum_{k = 0}^\\infty (\\frac{2}{3})^k \\\\ &= 1 - \\frac{\\frac{1}{3}}{1 - \\frac{2}{3}} \\\\ &= 0. \\end{align}\\] First, for every \\(x\\), the probability of observing it is \\(\\lim_{n \\rightarrow \\infty} \\frac{1}{2^n} = 0\\). Second, the probability that we observe one of all the possible sequences is 1. Therefore \\(P(C) = 1\\). So this is a singular variable. The CDF only increments on the elements of the Cantor set. 4.5 Transformations Exercise 4.15 Let \\(X\\) be a random variable that is uniformly distributed on \\(\\{-2, -1, 0, 1, 2\\}\\). Find the PMF of \\(Y = X^2\\). Solution. \\[\\begin{align} P_Y(y) = \\sum_{x \\in \\sqrt(y)} P_X(x) = \\begin{cases} 0 & y \\notin \\{0,1,4\\} \\\\ \\frac{1}{5} & y = 0 \\\\ \\frac{2}{5} & y \\in \\{1,4\\} \\end{cases} \\end{align}\\] Exercise 4.16 (Lognormal random variable) A lognormal random variable is a variable whose logarithm is normally distributed. In practice, we often encounter skewed data. Usually using a log transformation on such data makes it more symmetric and therefore more suitable for modeling with the normal distribution (more on why we wish to model data with the normal distribution in the following chapters). Let \\(X \\sim \\text{N}(\\mu,\\sigma)\\). Find the PDF of \\(Y: \\log(Y) = X\\). R: Sample from the lognormal distribution with parameters \\(\\mu = 5\\) and \\(\\sigma = 2\\). Plot a histogram of the samples. Then log-transform the samples and plot a histogram along with the theoretical normal PDF. Solution. \\[\\begin{align} p_Y(y) &= p_X(\\log(y)) \\frac{d}{dy} \\log(y) \\\\ &= \\frac{1}{\\sqrt{2 \\pi \\sigma^2}} e^{\\frac{(\\log(y) - \\mu)^2}{2 \\sigma^2}} \\frac{1}{y} \\\\ &= \\frac{1}{y \\sqrt{2 \\pi \\sigma^2}} e^{\\frac{(\\log(y) - \\mu)^2}{2 \\sigma^2}}. 
\\end{align}\\] set.seed(1) nsamps <- 10000 mu <- 0.5 sigma <- 0.4 ln_samps <- rlnorm(nsamps, mu, sigma) ln_plot <- ggplot(data = data.frame(x = ln_samps), aes(x = x)) + geom_histogram(color = "black") plot(ln_plot) norm_samps <- log(ln_samps) n_plot <- ggplot(data = data.frame(x = norm_samps), aes(x = x)) + geom_histogram(aes(y = ..density..), color = "black") + stat_function(fun = dnorm, args = list(mean = mu, sd = sigma), color = "red") plot(n_plot) Exercise 4.17 (Probability integral transform) This exercise is borrowed from Wasserman. Let \\(X\\) have a continuous, strictly increasing CDF \\(F\\). Let \\(Y = F(X)\\). Find the density of \\(Y\\). This is called the probability integral transform. Let \\(U \\sim \\text{Uniform}(0,1)\\) and let \\(X = F^{-1}(U)\\). Show that \\(X \\sim F\\). R: Implement a program that takes Uniform(0,1) random variables and generates random variables from an exponential(\\(\\beta\\)) distribution. Compare your implemented function with function rexp in R. Solution. \\[\\begin{align} F_Y(y) &= P(Y < y) \\\\ &= P(F(X) < y) \\\\ &= P(X < F_X^{-1}(y)) \\\\ &= F_X(F_X^{-1}(y)) \\\\ &= y. \\end{align}\\] From the above it follows that \\(p(y) = 1\\). Note that we need to know the inverse CDF to be able to apply this procedure. \\[\\begin{align} P(X < x) &= P(F^{-1}(U) < x) \\\\ &= P(U < F(x)) \\\\ &= F_U(F(x)) \\\\ &= F(x). \\end{align}\\] set.seed(1) nsamps <- 10000 beta <- 4 generate_exp <- function (n, beta) { tmp <- runif(n) X <- qexp(tmp, beta) return (X) } df <- tibble("R" = rexp(nsamps, beta), "myGenerator" = generate_exp(nsamps, beta)) %>% gather() ggplot(data = df, aes(x = value, fill = key)) + geom_histogram(position = "dodge") "],["mrvs.html", "Chapter 5 Multiple random variables 5.1 General 5.2 Bivariate distribution examples 5.3 Transformations", " Chapter 5 Multiple random variables This chapter deals with multiple random variables and their distributions. The students are expected to acquire the following knowledge: Theoretical Calculation of PDF of transformed multiple random variables. Finding marginal and conditional distributions. R Scatterplots of bivariate random variables. New R functions (for example, expand.grid). .fold-btn { float: right; margin: 5px 5px 0 0; } .fold { border: 1px solid black; min-height: 40px; } 5.1 General Exercise 5.1 Let \\(X \\sim \\text{N}(0,1)\\) and \\(Y \\sim \\text{N}(0,1)\\) be independent random variables. Draw 1000 samples from \\((X,Y)\\) and plot a scatterplot. Now let \\(X \\sim \\text{N}(0,1)\\) and \\(Y | X = x \\sim N(ax, 1)\\). Draw 1000 samples from \\((X,Y)\\) for \\(a = 1\\), \\(a=0\\), and \\(a=-0.5\\). Plot the scatterplots. How would you interpret parameter \\(a\\)? Plot the marginal distribution of \\(Y\\) for cases \\(a=1\\), \\(a=0\\), and \\(a=-0.5\\). Can you guess which distribution it is? set.seed(1) nsamps <- 1000 x <- rnorm(nsamps) y <- rnorm(nsamps) ggplot(data.frame(x, y), aes(x = x, y = y)) + geom_point() y1 <- rnorm(nsamps, mean = 1 * x) y2 <- rnorm(nsamps, mean = 0 * x) y3 <- rnorm(nsamps, mean = -0.5 * x) df <- tibble(x = c(x,x,x), y = c(y1,y2,y3), a = c(rep(1, nsamps), rep(0, nsamps), rep(-0.5, nsamps))) ggplot(df, aes(x = x, y = y)) + geom_point() + facet_wrap(~a) + coord_equal(ratio=1) # Parameter a controls the scale of linear dependency between X and Y. 
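# Side note (our addition, not part of the original solution): for X ~ N(0, 1)
# and Y | X = x ~ N(ax, 1), the marginal of Y is again normal, Y ~ N(0, 1 + a^2),
# which is what the density plots below show. A quick empirical check of the
# variances for a = 1, 0, -0.5 (they should be roughly 2, 1, and 1.25):
sapply(list(y1, y2, y3), var)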
ggplot(df, aes(x = y)) + geom_density() + facet_wrap(~a) 5.2 Bivariate distribution examples Exercise 5.2 (Discrete bivariate random variable) Let \\(X\\) represent the event that a die rolls an even number and let \\(Y\\) represent the event that a die rolls one, two, or a three. Find the marginal distributions of \\(X\\) and \\(Y\\). Find the PMF of \\((X,Y)\\). Find the CDF of \\((X,Y)\\). Find \\(P(X = 1 | Y = 1)\\). Solution. \\[\\begin{align} P(X = 1) = \\frac{1}{2} \\text{ and } P(X = 0) = \\frac{1}{2} \\\\ P(Y = 1) = \\frac{1}{2} \\text{ and } P(Y = 0) = \\frac{1}{2} \\\\ \\end{align}\\] \\[\\begin{align} P(X = 1, Y = 1) = \\frac{1}{6} \\\\ P(X = 1, Y = 0) = \\frac{2}{6} \\\\ P(X = 0, Y = 1) = \\frac{2}{6} \\\\ P(X = 0, Y = 0) = \\frac{1}{6} \\end{align}\\] \\[\\begin{align} P(X \\leq x, Y \\leq y) = \\begin{cases} \\frac{1}{6} & x = 0, y = 0 \\\\ \\frac{3}{6} & x \\neq y \\\\ 1 & x = 1, y = 1 \\end{cases} \\end{align}\\] \\[\\begin{align} P(X = 1 | Y = 1) = \\frac{1}{3} \\end{align}\\] Exercise 5.3 (Continuous bivariate random variable) Let \\(p(x,y) = 6 (x - y)^2\\) be the PDF of a bivariate random variable \\((X,Y)\\), where both variables range from zero to one. Find CDF. Find marginal distributions. Find conditional distributions. R: Plot a grid of points and colour them by value – this can help us visualize the PDF. R: Implement a random number generator, which will generate numbers from \\((X,Y)\\) and visually check the results. R: Plot the marginal distribution of \\(Y\\) and the conditional distributions of \\(X | Y = y\\), where \\(y \\in \\{0, 0.1, 0.5\\}\\). Solution. \\[\\begin{align} F(x,y) &= \\int_0^{x} \\int_0^{y} 6 (t - s)^2 ds dt\\\\ &= 6 \\int_0^{x} \\int_0^{y} t^2 - 2ts + s^2 ds dt\\\\ &= 6 \\int_0^{x} t^2y - ty^2 + \\frac{y^3}{3} dt \\\\ &= 6 (\\frac{x^3 y}{3} - \\frac{x^2y^2}{2} + \\frac{x y^3}{3}) \\\\ &= 2 x^3 y - 3 t^2y^2 + 2 x y^3 \\end{align}\\] \\[\\begin{align} p(x) &= \\int_0^{1} 6 (x - y)^2 dy\\\\ &= 6 (x^2 - x + \\frac{1}{3}) \\\\ &= 6x^2 - 6x + 2 \\end{align}\\] \\[\\begin{align} p(y) &= \\int_0^{1} 6 (x - y)^2 dx\\\\ &= 6 (y^2 - y + \\frac{1}{3}) \\\\ &= 6y^2 - 6y + 2 \\end{align}\\] \\[\\begin{align} p(x|y) &= \\frac{p(xy)}{p(y)} \\\\ &= \\frac{6 (x - y)^2}{6 (y^2 - y + \\frac{1}{3})} \\\\ &= \\frac{(x - y)^2}{y^2 - y + \\frac{1}{3}} \\end{align}\\] \\[\\begin{align} p(y|x) &= \\frac{p(xy)}{p(x)} \\\\ &= \\frac{6 (x - y)^2}{6 (x^2 - x + \\frac{1}{3})} \\\\ &= \\frac{(x - y)^2}{x^2 - x + \\frac{1}{3}} \\end{align}\\] set.seed(1) # d pxy <- function (x, y) { return ((x - y)^2) } x_axis <- seq(0, 1, length.out = 100) y_axis <- seq(0, 1, length.out = 100) df <- expand.grid(x_axis, y_axis) colnames(df) <- c("x", "y") df <- cbind(df, pdf = pxy(df$x, df$y)) ggplot(data = df, aes(x = x, y = y, color = pdf)) + geom_point() # e samps <- NULL for (i in 1:10000) { xt <- runif(1, 0, 1) yt <- runif(1, 0, 1) pdft <- pxy(xt, yt) acc <- runif(1, 0, 6) if (acc <= pdft) { samps <- rbind(samps, c(xt, yt)) } } colnames(samps) <- c("x", "y") ggplot(data = as.data.frame(samps), aes(x = x, y = y)) + geom_point() # f mar_pdf <- function (x) { return (6 * x^2 - 6 * x + 2) } cond_pdf <- function (x, y) { return (((x - y)^2) / (y^2 - y + 1/3)) } df <- tibble(x = x_axis, mar = mar_pdf(x), y0 = cond_pdf(x, 0), y0.1 = cond_pdf(x, 0.1), y0.5 = cond_pdf(x, 0.5)) %>% gather(dist, value, -x) ggplot(df, aes(x = x, y = value, color = dist)) + geom_line() Exercise 5.4 (Mixed bivariate random variable) Let \\(f(x,y) = \\frac{\\beta^{\\alpha}}{\\Gamma(\\alpha)y!} x^{y+ \\alpha -1} 
e^{-x(1 + \\beta)}\\) be the PDF of a bivariate random variable, where \\(x \\in (0, \\infty)\\) and \\(y \\in \\mathbb{N}_0\\). Find the marginal distribution of \\(X\\). Do you recognize this distribution? Find the conditional distribution of \\(Y | X\\). Do you recognize this distribution? Calculate the probability \\(P(Y = 2 | X = 2.5)\\) for \\((X,Y)\\). Find the marginal distribution of \\(Y\\). Do you recognize this distribution? R: Take 1000 random samples from \\((X,Y)\\) with parameters \\(\\beta = 1\\) and \\(\\alpha = 1\\). Plot a scatterplot. Plot a bar plot of the marginal distribution of \\(Y\\), and the theoretical PMF calculated from d) on the range from 0 to 10. Hint: Use the gamma function in R.? Solution. \\[\\begin{align} p(x) &= \\sum_{k = 0}^{\\infty} \\frac{\\beta^{\\alpha}}{\\Gamma(\\alpha)k!} x^{k + \\alpha -1} e^{-x(1 + \\beta)} & \\\\ &= \\sum_{k = 0}^{\\infty} \\frac{\\beta^{\\alpha}}{\\Gamma(\\alpha)k!} x^{k} x^{\\alpha -1} e^{-x} e^{-\\beta x} & \\\\ &= \\frac{\\beta^{\\alpha}}{\\Gamma(\\alpha)} x^{\\alpha -1} e^{-\\beta x} \\sum_{k = 0}^{\\infty} \\frac{1}{k!} x^{k} e^{-x} & \\\\ &= \\frac{\\beta^{\\alpha}}{\\Gamma(\\alpha)} x^{\\alpha -1} e^{-\\beta x} & \\text{the last term above sums to one} \\end{align}\\] This is the Gamma PDF. \\[\\begin{align} p(y|x) &= \\frac{p(x,y)}{p(x)} \\\\ &= \\frac{\\frac{\\beta^{\\alpha}}{\\Gamma(\\alpha)y!} x^{y+ \\alpha -1} e^{-x(1 + \\beta)}}{\\frac{\\beta^{\\alpha}}{\\Gamma(\\alpha)} x^{\\alpha -1} e^{-\\beta x}} \\\\ &= \\frac{x^y e^{-x}}{y!}. \\end{align}\\] This is the Poisson PMF. \\[\\begin{align} P(Y = 2 | X = 2.5) = \\frac{2.5^2 e^{-2.5}}{2!} \\approx 0.26. \\end{align}\\] \\[\\begin{align} p(y) &= \\int_{0}^{\\infty} \\frac{\\beta^{\\alpha}}{\\Gamma(\\alpha)y!} x^{y + \\alpha -1} e^{-x(1 + \\beta)} dx & \\\\ &= \\frac{1}{y!} \\int_{0}^{\\infty} \\frac{\\beta^{\\alpha}}{\\Gamma(\\alpha)} x^{(y + \\alpha) -1} e^{-(1 + \\beta)x} dx & \\\\ &= \\frac{1}{y!} \\frac{\\beta^{\\alpha}}{\\Gamma(\\alpha)} \\int_{0}^{\\infty} \\frac{\\Gamma(y + \\alpha)}{(1 + \\beta)^{y + \\alpha}} \\frac{(1 + \\beta)^{y + \\alpha}}{\\Gamma(y + \\alpha)} x^{(y + \\alpha) -1} e^{-(1 + \\beta)x} dx & \\text{complete to Gamma PDF} \\\\ &= \\frac{1}{y!} \\frac{\\beta^{\\alpha}}{\\Gamma(\\alpha)} \\frac{\\Gamma(y + \\alpha)}{(1 + \\beta)^{y + \\alpha}}. \\end{align}\\] We add the terms in the third equality to get a Gamma PDF inside the integral, which then integrates to one. We do not recognize this distribution. set.seed(1) px <- function (x, alpha, beta) { return((1 / factorial(x)) * (beta^alpha / gamma(alpha)) * (gamma(x + alpha) / (1 + beta)^(x + alpha))) } nsamps <- 1000 rx <- rgamma(nsamps, 1, 1) ryx <- rpois(nsamps, rx) ggplot(data = data.frame(x = rx, y = ryx), aes(x = x, y = y)) + geom_point() ggplot(data = data.frame(x = rx, y = ryx), aes(x = y)) + geom_bar(aes(y = (..count..)/sum(..count..))) + stat_function(fun = px, args = list(alpha = 1, beta = 1), color = "red") Exercise 5.5 Let \\(f(x,y) = cx^2y\\) for \\(x^2 \\leq y \\leq 1\\) and zero otherwise. Find such \\(c\\) that \\(f\\) is a PDF of a bivariate random variable. This exercise is borrowed from Wasserman. Solution. \\[\\begin{align} 1 &= \\int_{-1}^{1} \\int_{x^2}^1 cx^2y dy dx \\\\ &= \\int_{-1}^{1} cx^2 (\\frac{1}{2} - \\frac{x^4}{2}) dx \\\\ &= \\frac{c}{2} \\int_{-1}^{1} x^2 - x^6 dx \\\\ &= \\frac{c}{2} (\\frac{1}{3} + \\frac{1}{3} - \\frac{1}{7} - \\frac{1}{7}) \\\\ &= \\frac{c}{2} \\frac{8}{21} \\\\ &= \\frac{4c}{21} \\end{align}\\] It follows \\(c = \\frac{21}{4}\\). 
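As a sanity check of Exercise 5.5 (our addition, not part of the original solution), the constant c = 21/4 can be verified by Monte Carlo: sample points uniformly on the rectangle [-1, 1] x [0, 1], which has area 2, and average f over the sampled points; the estimated integral should be close to 1.
set.seed(1)
n <- 100000
x <- runif(n, -1, 1)
y <- runif(n, 0, 1)
f <- ifelse(x^2 <= y, (21 / 4) * x^2 * y, 0)  # f is zero outside the support
2 * mean(f)   # Monte Carlo estimate of the integral of f; should be close to 1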
5.3 Transformations Exercise 5.6 Let \\((X,Y,Z)\\) be uniformly distributed on the unit ball \\(\\{(x,y,z) : x^2 + y^2 + z^2 \\leq 1\\}\\). Let \\(R = \\sqrt{X^2 + Y^2 + Z^2}\\). Find the CDF and PDF of \\(R\\). Solution. \\[\\begin{align} P(R < r) &= P(\\sqrt{X^2 + Y^2 + Z^2} < r) \\\\ &= P(X^2 + Y^2 + Z^2 < r^2) \\\\ &= \\frac{\\frac{4}{3} \\pi r^3}{\\frac{4}{3}\\pi} \\\\ &= r^3. \\end{align}\\] The second line shows us that we are looking at the probability of falling inside a smaller ball with radius \\(r\\). To get the probability, we divide the volume of this smaller ball by the volume of the whole unit ball. We get the PDF by differentiating the CDF, so \\(p(r) = 3r^2\\). "],["integ.html", "Chapter 6 Integration 6.1 Monte Carlo integration 6.2 Lebesgue integrals", " Chapter 6 Integration This chapter deals with abstract and Monte Carlo integration. The students are expected to acquire the following knowledge: Theoretical How to calculate Lebesgue integrals for non-simple functions. R Monte Carlo integration. 6.1 Monte Carlo integration Exercise 6.1 Let \\(X\\) and \\(Y\\) be continuous random variables on the unit interval and \\(p(x,y) = 6(x - y)^2\\). Use Monte Carlo integration to estimate the probability \\(P(0.2 \\leq X \\leq 0.5, \\: 0.1 \\leq Y \\leq 0.2)\\). Can you find the exact value? set.seed(1) nsamps <- 1000 V <- (0.5 - 0.2) * (0.2 - 0.1) x1 <- runif(nsamps, 0.2, 0.5) x2 <- runif(nsamps, 0.1, 0.2) f_1 <- function (x, y) { return (6 * (x - y)^2) } mcint <- V * (1 / nsamps) * sum(f_1(x1, x2)) sdm <- sqrt((V^2 / nsamps) * var(f_1(x1, x2))) mcint ## [1] 0.008793445 sdm ## [1] 0.0002197686 F_1 <- function (x, y) { return (2 * x^3 * y - 3 * x^2 * y^2 + 2 * x * y^3) } F_1(0.5, 0.2) - F_1(0.2, 0.2) - F_1(0.5, 0.1) + F_1(0.2, 0.1) ## [1] 0.0087 6.2 Lebesgue integrals Exercise 6.2 (borrowed from Jagannathan) Find the Lebesgue integral of the following functions on (\\(\\mathbb{R}\\), \\(\\mathcal{B}(\\mathbb{R})\\), \\(\\lambda\\)). \\[\\begin{align} f(\\omega) = \\begin{cases} \\omega, & \\text{for } \\omega = 0,1,...,n \\\\ 0, & \\text{elsewhere} \\end{cases} \\end{align}\\] \\[\\begin{align} f(\\omega) = \\begin{cases} 1, & \\text{for } \\omega \\in \\mathbb{Q}^c \\cap [0,1] \\\\ 0, & \\text{elsewhere} \\end{cases} \\end{align}\\] \\[\\begin{align} f(\\omega) = \\begin{cases} n, & \\text{for } \\omega \\in \\mathbb{Q}^c \\cap [0,n] \\\\ 0, & \\text{elsewhere} \\end{cases} \\end{align}\\] Solution. \\[\\begin{align} \\int f(\\omega) d\\lambda = \\sum_{\\omega = 0}^n \\omega \\lambda(\\{\\omega\\}) = 0. \\end{align}\\] \\[\\begin{align} \\int f(\\omega) d\\lambda = 1 \\times \\lambda(\\mathbb{Q}^c \\cap [0,1]) = 1. \\end{align}\\] \\[\\begin{align} \\int f(\\omega) d\\lambda = n \\times \\lambda(\\mathbb{Q}^c \\cap [0,n]) = n^2. \\end{align}\\] Exercise 6.3 (borrowed from Jagannathan) Let \\(c \\in \\mathbb{R}\\) be fixed and (\\(\\mathbb{R}\\), \\(\\mathcal{B}(\\mathbb{R})\\)) a measurable space. If for any Borel set \\(A\\), \\(\\delta_c (A) = 1\\) if \\(c \\in A\\), and \\(\\delta_c (A) = 0\\) otherwise, then \\(\\delta_c\\) is called a Dirac measure. Let \\(g\\) be a non-negative, measurable function. Show that \\(\\int g d \\delta_c = g(c)\\). Solution.
\\[\\begin{align} \\int g d \\delta_c &= \\sup_{q \\in S(g)} \\int q d \\delta_c \\\\ &= \\sup_{q \\in S(g)} \\sum_{i = 1}^n a_i \\delta_c(A_i) \\\\ &= \\sup_{q \\in S(g)} \\sum_{i = 1}^n a_i \\text{I}_{A_i}(c) \\\\ &= \\sup_{q \\in S(g)} q(c) \\\\ &= g(c) \\end{align}\\] "],["ev.html", "Chapter 7 Expected value 7.1 Discrete random variables 7.2 Continuous random variables 7.3 Sums, functions, conditional expectations 7.4 Covariance", " Chapter 7 Expected value This chapter deals with expected values of random variables. The students are expected to acquire the following knowledge: Theoretical Calculation of the expected value. Calculation of variance and covariance. Cauchy distribution. R Estimation of expected value. Estimation of variance and covariance. .fold-btn { float: right; margin: 5px 5px 0 0; } .fold { border: 1px solid black; min-height: 40px; } 7.1 Discrete random variables Exercise 7.1 (Bernoulli) Let \\(X \\sim \\text{Bernoulli}(p)\\). Find \\(E[X]\\). Find \\(Var[X]\\). R: Let \\(p = 0.4\\). Check your answers to a) and b) with a simulation. Solution. \\[\\begin{align*} E[X] = \\sum_{k=0}^1 p^k (1-p)^{1-k} k = p. \\end{align*}\\] \\[\\begin{align*} Var[X] = E[X^2] - E[X]^2 = \\sum_{k=0}^1 (p^k (1-p)^{1-k} k^2) - p^2 = p(1-p). \\end{align*}\\] set.seed(1) nsamps <- 1000 x <- rbinom(nsamps, 1, 0.4) mean(x) ## [1] 0.394 var(x) ## [1] 0.239003 0.4 * (1 - 0.4) ## [1] 0.24 Exercise 7.2 (Binomial) Let \\(X \\sim \\text{Binomial}(n,p)\\). Find \\(E[X]\\). Find \\(Var[X]\\). Solution. Let \\(X = \\sum_{i=0}^n X_i\\), where \\(X_i \\sim \\text{Bernoulli}(p)\\). Then, due to linearity of expectation \\[\\begin{align*} E[X] = E[\\sum_{i=0}^n X_i] = \\sum_{i=0}^n E[X_i] = np. \\end{align*}\\] Again let \\(X = \\sum_{i=0}^n X_i\\), where \\(X_i \\sim \\text{Bernoulli}(p)\\). Since the Bernoulli variables \\(X_i\\) are independent we have \\[\\begin{align*} Var[X] = Var[\\sum_{i=0}^n X_i] = \\sum_{i=0}^n Var[X_i] = np(1-p). \\end{align*}\\] Exercise 7.3 (Poisson) Let \\(X \\sim \\text{Poisson}(\\lambda)\\). Find \\(E[X]\\). Find \\(Var[X]\\). Solution. \\[\\begin{align*} E[X] &= \\sum_{k=0}^\\infty \\frac{\\lambda^k e^{-\\lambda}}{k!} k & \\\\ &= \\sum_{k=1}^\\infty \\frac{\\lambda^k e^{-\\lambda}}{k!} k & \\text{term at $k=0$ is 0} \\\\ &= e^{-\\lambda} \\lambda \\sum_{k=1}^\\infty \\frac{\\lambda^{k-1}}{(k - 1)!} & \\\\ &= e^{-\\lambda} \\lambda \\sum_{k=0}^\\infty \\frac{\\lambda^{k}}{k!} & \\\\ &= e^{-\\lambda} \\lambda e^\\lambda & \\\\ &= \\lambda. \\end{align*}\\] \\[\\begin{align*} Var[X] &= E[X^2] - E[X]^2 & \\\\ &= e^{-\\lambda} \\lambda \\sum_{k=1}^\\infty k \\frac{\\lambda^{k-1}}{(k - 1)!} - \\lambda^2 & \\\\ &= e^{-\\lambda} \\lambda \\sum_{k=1}^\\infty (k - 1) + 1) \\frac{\\lambda^{k-1}}{(k - 1)!} - \\lambda^2 & \\\\ &= e^{-\\lambda} \\lambda \\big(\\sum_{k=1}^\\infty (k - 1) \\frac{\\lambda^{k-1}}{(k - 1)!} + \\sum_{k=1}^\\infty \\frac{\\lambda^{k-1}}{(k - 1)!}\\Big) - \\lambda^2 & \\\\ &= e^{-\\lambda} \\lambda \\big(\\lambda\\sum_{k=2}^\\infty \\frac{\\lambda^{k-2}}{(k - 2)!} + e^\\lambda\\Big) - \\lambda^2 & \\\\ &= e^{-\\lambda} \\lambda \\big(\\lambda e^\\lambda + e^\\lambda\\Big) - \\lambda^2 & \\\\ &= \\lambda^2 + \\lambda - \\lambda^2 & \\\\ &= \\lambda. \\end{align*}\\] Exercise 7.4 (Geometric) Let \\(X \\sim \\text{Geometric}(p)\\). Find \\(E[X]\\). Hint: \\(\\frac{d}{dx} x^k = k x^{(k - 1)}\\). Solution. 
\\[\\begin{align*} E[X] &= \\sum_{k=0}^\\infty (1 - p)^k p k & \\\\ &= p (1 - p) \\sum_{k=0}^\\infty (1 - p)^{k-1} k & \\\\ &= p (1 - p) \\sum_{k=0}^\\infty -\\frac{d}{dp}(1 - p)^k & \\\\ &= p (1 - p) \\Big(-\\frac{d}{dp}\\Big) \\sum_{k=0}^\\infty (1 - p)^k & \\\\ &= p (1 - p) \\Big(-\\frac{d}{dp}\\Big) \\frac{1}{1 - (1 - p)} & \\text{geometric series} \\\\ &= \\frac{1 - p}{p} \\end{align*}\\] 7.2 Continuous random variables Exercise 7.5 (Gamma) Let \\(X \\sim \\text{Gamma}(\\alpha, \\beta)\\). Hint: \\(\\Gamma(z) = \\int_0^\\infty t^{z-1}e^{-t} dt\\) and \\(\\Gamma(z + 1) = z \\Gamma(z)\\). Find \\(E[X]\\). Find \\(Var[X]\\). R: Let \\(\\alpha = 10\\) and \\(\\beta = 2\\). Plot the density of \\(X\\). Add a horizontal line at the expected value that touches the density curve (geom_segment). Shade the area within a standard deviation of the expected value. Solution. \\[\\begin{align*} E[X] &= \\int_0^\\infty \\frac{\\beta^\\alpha}{\\Gamma(\\alpha)}x^\\alpha e^{-\\beta x} dx & \\\\ &= \\frac{\\beta^\\alpha}{\\Gamma(\\alpha)} \\int_0^\\infty x^\\alpha e^{-\\beta x} dx & \\text{ (let $t = \\beta x$)} \\\\ &= \\frac{\\beta^\\alpha}{\\Gamma(\\alpha) }\\int_0^\\infty \\frac{t^\\alpha}{\\beta^\\alpha} e^{-t} \\frac{dt}{\\beta} & \\\\ &= \\frac{1}{\\beta \\Gamma(\\alpha) }\\int_0^\\infty t^\\alpha e^{-t} dt & \\\\ &= \\frac{\\Gamma(\\alpha + 1)}{\\beta \\Gamma(\\alpha)} & \\\\ &= \\frac{\\alpha \\Gamma(\\alpha)}{\\beta \\Gamma(\\alpha)} & \\\\ &= \\frac{\\alpha}{\\beta}. & \\end{align*}\\] \\[\\begin{align*} Var[X] &= E[X^2] - E[X]^2 \\\\ &= \\int_0^\\infty \\frac{\\beta^\\alpha}{\\Gamma(\\alpha)}x^{\\alpha+1} e^{-\\beta x} dx - \\frac{\\alpha^2}{\\beta^2} \\\\ &= \\frac{\\Gamma(\\alpha + 2)}{\\beta^2 \\Gamma(\\alpha)} - \\frac{\\alpha^2}{\\beta^2} \\\\ &= \\frac{(\\alpha + 1)\\alpha\\Gamma(\\alpha)}{\\beta^2 \\Gamma(\\alpha)} - \\frac{\\alpha^2}{\\beta^2} \\\\ &= \\frac{\\alpha^2 + \\alpha}{\\beta^2} - \\frac{\\alpha^2}{\\beta^2} \\\\ &= \\frac{\\alpha}{\\beta^2}. \\end{align*}\\] set.seed(1) x <- seq(0, 25, by = 0.01) y <- dgamma(x, shape = 10, rate = 2) df <- data.frame(x = x, y = y) ggplot(df, aes(x = x, y = y)) + geom_line() + geom_segment(aes(x = 5, y = 0, xend = 5, yend = dgamma(5, shape = 10, rate = 2)), color = "red") + stat_function(fun = dgamma, args = list(shape = 10, rate = 2), xlim = c(5 - sqrt(10/4), 5 + sqrt(10/4)), geom = "area", fill = "gray", alpha = 0.4) Exercise 7.6 (Beta) Let \\(X \\sim \\text{Beta}(\\alpha, \\beta)\\). Find \\(E[X]\\). Hint 1: \\(\\text{B}(x,y) = \\int_0^1 t^{x-1} (1 - t)^{y-1} dt\\). Hint 2: \\(\\text{B}(x + 1, y) = \\text{B}(x,y)\\frac{x}{x + y}\\). Find \\(Var[X]\\). Solution. \\[\\begin{align*} E[X] &= \\int_0^1 \\frac{x^{\\alpha - 1} (1 - x)^{\\beta - 1}}{\\text{B}(\\alpha, \\beta)} x dx \\\\ &= \\frac{1}{\\text{B}(\\alpha, \\beta)}\\int_0^1 x^{\\alpha} (1 - x)^{\\beta - 1} dx \\\\ &= \\frac{1}{\\text{B}(\\alpha, \\beta)} \\text{B}(\\alpha + 1, \\beta) \\\\ &= \\frac{1}{\\text{B}(\\alpha, \\beta)} \\text{B}(\\alpha, \\beta) \\frac{\\alpha}{\\alpha + \\beta} \\\\ &= \\frac{\\alpha}{\\alpha + \\beta}. 
\\\\ \\end{align*}\\] \\[\\begin{align*} Var[X] &= E[X^2] - E[X]^2 \\\\ &= \\int_0^1 \\frac{x^{\\alpha - 1} (1 - x)^{\\beta - 1}}{\\text{B}(\\alpha, \\beta)} x^2 dx - \\frac{\\alpha^2}{(\\alpha + \\beta)^2} \\\\ &= \\frac{1}{\\text{B}(\\alpha, \\beta)}\\int_0^1 x^{\\alpha + 1} (1 - x)^{\\beta - 1} dx - \\frac{\\alpha^2}{(\\alpha + \\beta)^2} \\\\ &= \\frac{1}{\\text{B}(\\alpha, \\beta)} \\text{B}(\\alpha + 2, \\beta) - \\frac{\\alpha^2}{(\\alpha + \\beta)^2} \\\\ &= \\frac{1}{\\text{B}(\\alpha, \\beta)} \\text{B}(\\alpha + 1, \\beta) \\frac{\\alpha + 1}{\\alpha + \\beta + 1} - \\frac{\\alpha^2}{(\\alpha + \\beta)^2} \\\\ &= \\frac{\\alpha + 1}{\\alpha + \\beta + 1} \\frac{\\alpha}{\\alpha + \\beta} - \\frac{\\alpha^2}{(\\alpha + \\beta)^2}\\\\ &= \\frac{\\alpha \\beta}{(\\alpha + \\beta)^2(\\alpha + \\beta + 1)}. \\end{align*}\\] Exercise 7.7 (Exponential) Let \\(X \\sim \\text{Exp}(\\lambda)\\). Find \\(E[X]\\). Hint: \\(\\Gamma(z + 1) = z\\Gamma(z)\\) and \\(\\Gamma(1) = 1\\). Find \\(Var[X]\\). Solution. \\[\\begin{align*} E[X] &= \\int_0^\\infty \\lambda e^{-\\lambda x} x dx & \\\\ &= \\lambda \\int_0^\\infty x e^{-\\lambda x} dx & \\\\ &= \\lambda \\int_0^\\infty \\frac{t}{\\lambda} e^{-t} \\frac{dt}{\\lambda} & \\text{$t = \\lambda x$}\\\\ &= \\lambda \\lambda^{-2} \\Gamma(2) & \\text{definition of gamma function} \\\\ &= \\lambda^{-1}. \\end{align*}\\] \\[\\begin{align*} Var[X] &= E[X^2] - E[X]^2 & \\\\ &= \\int_0^\\infty \\lambda e^{-\\lambda x} x^2 dx - \\lambda^{-2} & \\\\ &= \\lambda \\int_0^\\infty \\frac{t^2}{\\lambda^2} e^{-t} \\frac{dt}{\\lambda} - \\lambda^{-2} & \\text{$t = \\lambda x$} \\\\ &= \\lambda \\lambda^{-3} \\Gamma(3) - \\lambda^{-2} & \\text{definition of gamma function} & \\\\ &= \\lambda^{-2} 2 \\Gamma(2) - \\lambda^{-2} & \\\\ &= 2 \\lambda^{-2} - \\lambda^{-2} & \\\\ &= \\lambda^{-2}. & \\\\ \\end{align*}\\] Exercise 7.8 (Normal) Let \\(X \\sim \\text{N}(\\mu, \\sigma)\\). Show that \\(E[X] = \\mu\\). Hint: Use the error function \\(\\text{erf}(x) = \\frac{1}{\\sqrt(\\pi)} \\int_{-x}^x e^{-t^2} dt\\). The statistical interpretation of this function is that if \\(Y \\sim \\text{N}(0, 0.5)\\), then the error function describes the probability of \\(Y\\) falling between \\(-x\\) and \\(x\\). Also, \\(\\text{erf}(\\infty) = 1\\). Show that \\(Var[X] = \\sigma^2\\). Hint: Start with the definition of variance. Solution. \\[\\begin{align*} E[X] &= \\int_{-\\infty}^\\infty \\frac{1}{\\sqrt{2\\pi \\sigma^2}} e^{-\\frac{(x - \\mu)^2}{2\\sigma^2}} x dx & \\\\ &= \\frac{1}{\\sqrt{2\\pi \\sigma^2}} \\int_{-\\infty}^\\infty x e^{-\\frac{(x - \\mu)^2}{2\\sigma^2}} dx & \\\\ &= \\frac{1}{\\sqrt{2\\pi \\sigma^2}} \\int_{-\\infty}^\\infty \\Big(t \\sqrt{2\\sigma^2} + \\mu\\Big)e^{-t^2} \\sqrt{2 \\sigma^2} dt & t = \\frac{x - \\mu}{\\sqrt{2}\\sigma} \\\\ &= \\frac{\\sqrt{2\\sigma^2}}{\\sqrt{\\pi}} \\int_{-\\infty}^\\infty t e^{-t^2} dt + \\frac{1}{\\sqrt{\\pi}} \\int_{-\\infty}^\\infty \\mu e^{-t^2} dt & \\\\ \\end{align*}\\] Let us calculate these integrals separately. \\[\\begin{align*} \\int t e^{-t^2} dt &= -\\frac{1}{2}\\int e^{s} ds & s = -t^2 \\\\ &= -\\frac{e^s}{2} + C \\\\ &= -\\frac{e^{-t^2}}{2} + C & \\text{undoing substitution}. \\end{align*}\\] Inserting the integration limits we get \\[\\begin{align*} \\int_{-\\infty}^\\infty t e^{-t^2} dt &= 0, \\end{align*}\\] due to the integrated function being symmetric. 
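This can also be verified numerically in R (a side check, not needed for the derivation; the finite limits suffice because the integrand is negligible outside them): # numerically integrate the odd function t * exp(-t^2); the result is (close to) zero integrate(function(t) t * exp(-t^2), lower = -10, upper = 10)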
Reordering the second integral we get \\[\\begin{align*} \\mu \\frac{1}{\\sqrt{\\pi}} \\int_{-\\infty}^\\infty e^{-t^2} dt &= \\mu \\text{erf}(\\infty) & \\text{definition of error function} \\\\ &= \\mu & \\text{probability of $Y$ falling between $-\\infty$ and $\\infty$}. \\end{align*}\\] Combining all of the above we get \\[\\begin{align*} E[X] &= \\frac{\\sqrt{2\\sigma^2}}{\\sqrt{\\pi}} \\times 0 + \\mu &= \\mu.\\\\ \\end{align*}\\] \\[\\begin{align*} Var[X] &= E[(X - E[X])^2] \\\\ &= E[(X - \\mu)^2] \\\\ &= \\frac{1}{\\sqrt{2\\pi \\sigma^2}} \\int_{-\\infty}^\\infty (x - \\mu)^2 e^{-\\frac{(x - \\mu)^2}{2\\sigma^2}} dx \\\\ &= \\frac{\\sigma^2}{\\sqrt{2\\pi}} \\int_{-\\infty}^\\infty t^2 e^{-\\frac{t^2}{2}} dt \\\\ &= \\frac{\\sigma^2}{\\sqrt{2\\pi}} \\bigg(\\Big(- t e^{-\\frac{t^2}{2}} |_{-\\infty}^\\infty \\Big) + \\int_{-\\infty}^\\infty e^{-\\frac{t^2}{2}} \\bigg) dt & \\text{integration by parts} \\\\ &= \\frac{\\sigma^2}{\\sqrt{2\\pi}} \\sqrt{2 \\pi} \\int_{-\\infty}^\\infty \\frac{1}{\\sqrt(\\pi)}e^{-s^2} \\bigg) & s = \\frac{t}{\\sqrt{2}} \\text{ and evaluating the left expression at the bounds} \\\\ &= \\frac{\\sigma^2}{\\sqrt{2\\pi}} \\sqrt{2 \\pi} \\Big(\\text{erf}(\\infty) & \\text{definition of error function} \\\\ &= \\sigma^2. \\end{align*}\\] 7.3 Sums, functions, conditional expectations Exercise 7.9 (Expectation of transformations) Let \\(X\\) follow a normal distribution with mean \\(\\mu\\) and variance \\(\\sigma^2\\). Find \\(E[2X + 4]\\). Find \\(E[X^2]\\). Find \\(E[\\exp(X)]\\). Hint: Use the error function \\(\\text{erf}(x) = \\frac{1}{\\sqrt(\\pi)} \\int_{-x}^x e^{-t^2} dt\\). Also, \\(\\text{erf}(\\infty) = 1\\). R: Check your results numerically for \\(\\mu = 0.4\\) and \\(\\sigma^2 = 0.25\\) and plot the densities of all four distributions. Solution. \\[\\begin{align} E[2X + 4] &= 2E[X] + 4 & \\text{linearity of expectation} \\\\ &= 2\\mu + 4. \\\\ \\end{align}\\] \\[\\begin{align} E[X^2] &= E[X]^2 + Var[X] & \\text{definition of variance} \\\\ &= \\mu^2 + \\sigma^2. \\end{align}\\] \\[\\begin{align} E[\\exp(X)] &= \\int_{-\\infty}^\\infty \\frac{1}{\\sqrt{2\\pi \\sigma^2}} e^{-\\frac{(x - \\mu)^2}{2\\sigma^2}} e^x dx & \\\\ &= \\frac{1}{\\sqrt{2\\pi \\sigma^2}} \\int_{-\\infty}^\\infty e^{\\frac{2 \\sigma^2 x}{2\\sigma^2} -\\frac{(x - \\mu)^2}{2\\sigma^2}} dx & \\\\ &= \\frac{1}{\\sqrt{2\\pi \\sigma^2}} \\int_{-\\infty}^\\infty e^{-\\frac{x^2 - 2x(\\mu + \\sigma^2) + \\mu^2}{2\\sigma^2}} dx & \\\\ &= \\frac{1}{\\sqrt{2\\pi \\sigma^2}} \\int_{-\\infty}^\\infty e^{-\\frac{(x - (\\mu + \\sigma^2))^2 + \\mu^2 - (\\mu + \\sigma^2)^2}{2\\sigma^2}} dx & \\text{complete the square} \\\\ &= \\frac{1}{\\sqrt{2\\pi \\sigma^2}} e^{\\frac{- \\mu^2 + (\\mu + \\sigma^2)^2}{2\\sigma^2}} \\int_{-\\infty}^\\infty e^{-\\frac{(x - (\\mu + \\sigma^2))^2}{2\\sigma^2}} dx & \\\\ &= \\frac{1}{\\sqrt{2\\pi \\sigma^2}} e^{\\frac{- \\mu^2 + (\\mu + \\sigma^2)^2}{2\\sigma^2}} \\sigma \\sqrt{2 \\pi} \\text{erf}(\\infty) & \\\\ &= e^{\\frac{2\\mu + \\sigma^2}{2}}. \\end{align}\\] set.seed(1) mu <- 0.4 sigma <- 0.5 x <- rnorm(100000, mean = mu, sd = sigma) mean(2*x + 4) ## [1] 4.797756 2 * mu + 4 ## [1] 4.8 mean(x^2) ## [1] 0.4108658 mu^2 + sigma^2 ## [1] 0.41 mean(exp(x)) ## [1] 1.689794 exp((2 * mu + sigma^2) / 2) ## [1] 1.690459 Exercise 7.10 (Sum of independent random variables) Borrowed from Wasserman. Let \\(X_1, X_2,...,X_n\\) be IID random variables with expected value \\(E[X_i] = \\mu\\) and variance \\(Var[X_i] = \\sigma^2\\). 
Find the expected value and variance of \\(\\bar{X} = \\frac{1}{n} \\sum_{i=1}^n X_i\\). \\(\\bar{X}\\) is called a statistic (a function of the values in a sample). It is itself a random variable and its distribution is called a sampling distribution. R: Take \\(n = 5, 10, 100, 1000\\) samples from the N(\\(2\\), \\(6\\)) distribution 10000 times. Plot the theoretical density and the densities of \\(\\bar{X}\\) statistic for each \\(n\\). Intuitively, are the results in correspondence with your calculations? Check them numerically. Solution. Let us start with the expectation of \\(\\bar{X}\\). \\[\\begin{align} E[\\bar{X}] &= E[\\frac{1}{n} \\sum_{i=1}^n X_i] & \\\\ &= \\frac{1}{n} E[\\sum_{i=1}^n X_i] & \\text{ (multiplication with a scalar)} \\\\ &= \\frac{1}{n} \\sum_{i=1}^n E[X_i] & \\text{ (linearity)} \\\\ &= \\frac{1}{n} n \\mu & \\\\ &= \\mu. \\end{align}\\] Now the variance \\[\\begin{align} Var[\\bar{X}] &= Var[\\frac{1}{n} \\sum_{i=1}^n X_i] & \\\\ &= \\frac{1}{n^2} Var[\\sum_{i=1}^n X_i] & \\text{ (multiplication with a scalar)} \\\\ &= \\frac{1}{n^2} \\sum_{i=1}^n Var[X_i] & \\text{ (independence of samples)} \\\\ &= \\frac{1}{n^2} n \\sigma^2 & \\\\ &= \\frac{1}{n} \\sigma^2. \\end{align}\\] set.seed(1) nsamps <- 10000 mu <- 2 sigma <- sqrt(6) N <- c(5, 10, 100, 500) X <- matrix(data = NA, nrow = nsamps, ncol = length(N)) ind <- 1 for (n in N) { for (i in 1:nsamps) { X[i,ind] <- mean(rnorm(n, mu, sigma)) } ind <- ind + 1 } colnames(X) <- N X <- melt(as.data.frame(X)) ggplot(data = X, aes(x = value, colour = variable)) + geom_density() + stat_function(data = data.frame(x = seq(-2, 6, by = 0.01)), aes(x = x), fun = dnorm, args = list(mean = mu, sd = sigma), color = "black") Exercise 7.11 (Conditional expectation) Let \\(X \\in \\mathbb{R}_0^+\\) and \\(Y \\in \\mathbb{N}_0\\) be random variables with joint distribution \\(p_{XY}(X,Y) = \\frac{1}{y + 1} e^{-\\frac{x}{y + 1}} 0.5^{y + 1}\\). Find \\(E[X | Y = y]\\) by first finding \\(p_Y\\) and then \\(p_{X|Y}\\). Find \\(E[X]\\). R: check your answers to a) and b) by drawing 10000 samples from \\(p_Y\\) and \\(p_{X|Y}\\). Solution. \\[\\begin{align} p(y) &= \\int_0^\\infty \\frac{1}{y + 1} e^{-\\frac{x}{y + 1}} 0.5^{y + 1} dx \\\\ &= \\frac{0.5^{y + 1}}{y + 1} \\int_0^\\infty e^{-\\frac{x}{y + 1}} dx \\\\ &= \\frac{0.5^{y + 1}}{y + 1} (y + 1) \\\\ &= 0.5^{y + 1} \\\\ &= 0.5(1 - 0.5)^y. \\end{align}\\] We recognize this as the geometric distribution. \\[\\begin{align} p(x|y) &= \\frac{p(x,y)}{p(y)} \\\\ &= \\frac{1}{y + 1} e^{-\\frac{x}{y + 1}}. \\end{align}\\] We recognize this as the exponential distribution. \\[\\begin{align} E[X | Y = y] &= \\int_0^\\infty x \\frac{1}{y + 1} e^{-\\frac{x}{y + 1}} dx \\\\ &= y + 1 & \\text{expected value of the exponential distribution} \\end{align}\\] Use the law of iterated expectation. \\[\\begin{align} E[X] &= E[E[X | Y]] \\\\ &= E[Y + 1] \\\\ &= E[Y] + 1 \\\\ &= \\frac{1 - 0.5}{0.5} + 1 \\\\ &= 2. \\end{align}\\] set.seed(1) y <- rgeom(100000, 0.5) x <- rexp(100000, rate = 1 / (y + 1)) x2 <- x[y == 3] mean(x2) ## [1] 4.048501 3 + 1 ## [1] 4 mean(x) ## [1] 2.007639 (1 - 0.5) / 0.5 + 1 ## [1] 2 Exercise 7.12 (Cauchy distribution) Let \\(p(x | x_0, \\gamma) = \\frac{1}{\\pi \\gamma \\Big(1 + \\big(\\frac{x - x_0}{\\gamma}\\big)^2\\Big)}\\). A random variable with this PDF follows a Cauchy distribution. This distribution is symmetric and has wider tails than the normal distribution. R: Draw \\(n = 1,...,1000\\) samples from a standard normal and \\(\\text{Cauchy}(0, 1)\\). 
For each \\(n\\) plot the mean and the median of the sample using facets. Interpret the results. To get a mathematical explanation of the results in a), evaluate the integral \\(\\int_0^\\infty \\frac{x}{1 + x^2} dx\\) and consider that \\(E[X] = \\int_{-\\infty}^\\infty \\frac{x}{1 + x^2}dx\\). set.seed(1) n <- 1000 means_n <- vector(mode = "numeric", length = n) means_c <- vector(mode = "numeric", length = n) medians_n <- vector(mode = "numeric", length = n) medians_c <- vector(mode = "numeric", length = n) for (i in 1:n) { tmp_n <- rnorm(i) tmp_c <- rcauchy(i) means_n[i] <- mean(tmp_n) means_c[i] <- mean(tmp_c) medians_n[i] <- median(tmp_n) medians_c[i] <- median(tmp_c) } df <- data.frame("distribution" = c(rep("normal", 2 * n), rep("Cauchy", 2 * n)), "type" = c(rep("mean", n), rep("median", n), rep("mean", n), rep("median", n)), "value" = c(means_n, medians_n, means_c, medians_c), "n" = rep(1:n, times = 4)) ggplot(df, aes(x = n, y = value)) + geom_line(alpha = 0.5) + facet_wrap(~ type + distribution, scales = "free") Solution. \\[\\begin{align} \\int_0^\\infty \\frac{x}{1 + x^2} dx &= \\frac{1}{2} \\int_1^\\infty \\frac{1}{u} du & u = 1 + x^2 \\\\ &= \\frac{1}{2} \\ln(u) \\Big|_1^\\infty. \\end{align}\\] This integral is not finite. The same holds for the negative part. Therefore, the expectation is undefined, as \\(E[|X|] = \\infty\\). Why can we not just claim that \\(f(x) = x / (1 + x^2)\\) is odd and \\(\\int_{-\\infty}^\\infty f(x) dx = 0\\)? By definition of the Lebesgue integral \\(\\int_{-\\infty}^{\\infty} f = \\int_{-\\infty}^{\\infty} f_+ - \\int_{-\\infty}^{\\infty} f_-\\). At least one of the two integrals needs to be finite for \\(\\int_{-\\infty}^{\\infty} f\\) to be well-defined. However \\(\\int_{-\\infty}^{\\infty} f_+ = \\int_0^{\\infty} x/(1+x^2) dx\\) and \\(\\int_{-\\infty}^{\\infty} f_- = \\int_{-\\infty}^{0} |x|/(1+x^2) dx\\). We have just shown that both of these integrals are infinite, so their difference is of the form \\(\\infty - \\infty\\) and is therefore undefined. 7.4 Covariance Exercise 7.13 Below is a table of values for random variables \\(X\\) and \\(Y\\). X: 2.1, -0.5, 1, -2, 4; Y: 8, 11, 10, 12, 9. Find sample covariance of \\(X\\) and \\(Y\\). Find sample variances of \\(X\\) and \\(Y\\). Find sample correlation of \\(X\\) and \\(Y\\). Find sample variance of \\(Z = 2X - 3Y\\). Solution. \\(\\bar{X} = 0.92\\) and \\(\\bar{Y} = 10\\). \\[\\begin{align} s(X, Y) &= \\frac{1}{n - 1} \\sum_{i=1}^5 (X_i - 0.92) (Y_i - 10) \\\\ &= -3.175. \\end{align}\\] \\[\\begin{align} s(X) &= \\frac{\\sum_{i=1}^5(X_i - 0.92)^2}{5 - 1} \\\\ &= 5.357. \\end{align}\\] \\[\\begin{align} s(Y) &= \\frac{\\sum_{i=1}^5(Y_i - 10)^2}{5 - 1} \\\\ &= 2.5. \\end{align}\\] \\[\\begin{align} r(X,Y) &= \\frac{s(X, Y)}{\\sqrt{s(X)s(Y)}} \\\\ &= \\frac{-3.175}{\\sqrt{5.357 \\times 2.5}} \\\\ &= -0.868. \\end{align}\\] \\[\\begin{align} s(Z) &= 2^2 s(X) + 3^2 s(Y) - 2 \\times 2 \\times 3 \\, s(X, Y) \\\\ &= 4 \\times 5.357 + 9 \\times 2.5 - 12 \\times (-3.175) \\\\ &= 82.028. \\end{align}\\] Exercise 7.14 Let \\(X \\sim \\text{Uniform}(0,1)\\) and \\(Y | X = x \\sim \\text{Uniform}(0,x)\\). Find the covariance of \\(X\\) and \\(Y\\). Find the correlation of \\(X\\) and \\(Y\\). R: check your answers to a) and b) with simulation. Plot \\(X\\) against \\(Y\\) on a scatterplot. Solution. The joint PDF is \\(p(x,y) = p(x)p(y|x) = \\frac{1}{x}\\).
\\[\\begin{align} Cov(X,Y) &= E[XY] - E[X]E[Y] \\\\ \\end{align}\\] Let us first evaluate the first term: \\[\\begin{align} E[XY] &= \\int_0^1 \\int_0^x x y \\frac{1}{x} dy dx \\\\ &= \\int_0^1 \\int_0^x y dy dx \\\\ &= \\int_0^1 \\frac{x^2}{2} dx \\\\ &= \\frac{1}{6}. \\end{align}\\] Now let us find \\(E[Y]\\), \\(E[X]\\) is trivial. \\[\\begin{align} E[Y] = E[E[Y | X]] = E[\\frac{X}{2}] = \\frac{1}{2} \\int_0^1 x dx = \\frac{1}{4}. \\end{align}\\] Combining all: \\[\\begin{align} Cov(X,Y) &= \\frac{1}{6} - \\frac{1}{2} \\frac{1}{4} = \\frac{1}{24}. \\end{align}\\] \\[\\begin{align} \\rho(X,Y) &= \\frac{Cov(X,Y)}{\\sqrt{Var[X]Var[Y]}} \\\\ \\end{align}\\] Let us calculate \\(Var[X]\\). \\[\\begin{align} Var[X] &= E[X^2] - \\frac{1}{4} \\\\ &= \\int_0^1 x^2 - \\frac{1}{4} \\\\ &= \\frac{1}{3} - \\frac{1}{4} \\\\ &= \\frac{1}{12}. \\end{align}\\] Let us calculate \\(E[E[Y^2|X]]\\). \\[\\begin{align} E[E[Y^2|X]] &= E[\\frac{x^2}{3}] \\\\ &= \\frac{1}{9}. \\end{align}\\] Then \\(Var[Y] = \\frac{1}{9} - \\frac{1}{16} = \\frac{7}{144}\\). Combining all \\[\\begin{align} \\rho(X,Y) &= \\frac{\\frac{1}{24}}{\\sqrt{\\frac{1}{12}\\frac{7}{144}}} \\\\ &= 0.65. \\end{align}\\] set.seed(1) nsamps <- 10000 x <- runif(nsamps) y <- runif(nsamps, 0, x) cov(x, y) ## [1] 0.04274061 1/24 ## [1] 0.04166667 cor(x, y) ## [1] 0.6629567 (1 / 24) / (sqrt(7 / (12 * 144))) ## [1] 0.6546537 ggplot(data.frame(x = x, y = y), aes(x = x, y = y)) + geom_point(alpha = 0.2) + geom_smooth(method = "lm") "],["mrv.html", "Chapter 8 Multivariate random variables 8.1 Multinomial random variables 8.2 Multivariate normal random variables 8.3 Transformations", " Chapter 8 Multivariate random variables This chapter deals with multivariate random variables. The students are expected to acquire the following knowledge: Theoretical Multinomial distribution. Multivariate normal distribution. Cholesky decomposition. Eigendecomposition. R Sampling from the multinomial distribution. Sampling from the multivariate normal distribution. Matrix decompositions. .fold-btn { float: right; margin: 5px 5px 0 0; } .fold { border: 1px solid black; min-height: 40px; } 8.1 Multinomial random variables Exercise 8.1 Let \\(X_i\\), \\(i = 1,...,k\\) represent \\(k\\) events, and \\(p_i\\) the probabilities of these events happening in a trial. Let \\(n\\) be the number of trials, and \\(X\\) a multivariate random variable, the collection of \\(X_i\\). Then \\(p(x) = \\frac{n!}{x_1!x_2!...x_k!} p_1^{x_1} p_2^{x_2}...p_k^{x_k}\\) is the PMF of a multinomial distribution, where \\(n = \\sum_{i = 1}^k x_i\\). Show that the marginal distribution of \\(X_i\\) is a binomial distribution. Take 1000 samples from the multinomial distribution with \\(n=4\\) and probabilities \\(p = (0.2, 0.2, 0.5, 0.1)\\). Then take 1000 samples from four binomial distributions with the same parameters. Inspect the results visually. Solution. We will approach this proof from the probabilistic point of view. W.L.O.G. let \\(x_1\\) be the marginal distribution we are interested in. The term \\(p^{x_1}\\) denotes the probability that event 1 happened \\(x_1\\) times. For this event not to happen, one of the other events needs to happen. So for each of the remaining trials, the probability of another event is \\(\\sum_{i=2}^k p_i = 1 - p_1\\), and there were \\(n - x_1\\) such trials. What is left to do is to calculate the number of permutations of event 1 happening and event 1 not happening. We choose \\(x_1\\) trials, from \\(n\\) trials. 
Therefore \\(p(x_1) = \\binom{n}{x_1} p_1^{x_1} (1 - p_1)^{n - x_1}\\), which is the binomial PMF. Interested students are encouraged to prove this mathematically. set.seed(1) nsamps <- 1000 samps_mult <- rmultinom(nsamps, 4, prob = c(0.2, 0.2, 0.5, 0.1)) samps_mult <- as_tibble(t(samps_mult)) %>% gather() samps <- tibble( V1 = rbinom(nsamps, 4, 0.2), V2 = rbinom(nsamps, 4, 0.2), V3 = rbinom(nsamps, 4, 0.5), V4 = rbinom(nsamps, 4, 0.1) ) %>% gather() %>% bind_rows(samps_mult) %>% bind_cols("dist" = c(rep("binomial", 4*nsamps), rep("multinomial", 4*nsamps))) ggplot(samps, aes(x = value, fill = dist)) + geom_bar(position = "dodge") + facet_wrap(~ key) Exercise 8.2 (Multinomial expected value) Find the expected value, variance and covariance of the multinomial distribution. Hint: First find the expected value for \\(n = 1\\) and then use the fact that the trials are independent. Solution. Let us first calculate the expected value of \\(X_1\\), when \\(n = 1\\). \\[\\begin{align} E[X_1] &= \\sum_{n_1 = 0}^1 \\sum_{n_2 = 0}^1 ... \\sum_{n_k = 0}^1 \\frac{1}{n_1!n_2!...n_k!}p_1^{n_1}p_2^{n_2}...p_k^{n_k}n_1 \\\\ &= \\sum_{n_1 = 0}^1 \\frac{p_1^{n_1} n_1}{n_1!} \\sum_{n_2 = 0}^1 ... \\sum_{n_k = 0}^1 \\frac{1}{n_2!...n_k!}p_2^{n_2}...p_k^{n_k} \\end{align}\\] When \\(n_1 = 0\\) then the whole terms is zero, so we do not need to evaluate other sums. When \\(n_1 = 1\\), all other \\(n_i\\) must be zero, as we have \\(1 = \\sum_{i=1}^k n_i\\). Therefore the other sums equal \\(1\\). So \\(E[X_1] = p_1\\) and \\(E[X_i] = p_i\\) for \\(i = 1,...,k\\). Now let \\(Y_j\\), \\(j = 1,...,n\\), have a multinomial distribution with \\(n = 1\\), and let \\(X\\) have a multinomial distribution with an arbitrary \\(n\\). Then we can write \\(X = \\sum_{j=1}^n Y_j\\). And due to independence \\[\\begin{align} E[X] &= E[\\sum_{j=1}^n Y_j] \\\\ &= \\sum_{j=1}^n E[Y_j] \\\\ &= np. \\end{align}\\] For the variance, we need \\(E[X^2]\\). Let us follow the same procedure as above and first calculate \\(E[X_i]\\) for \\(n = 1\\). The only thing that changes is that the term \\(n_i\\) becomes \\(n_i^2\\). Since we only have \\(0\\) and \\(1\\) this does not change the outcome. So \\[\\begin{align} Var[X_i] &= E[X_i^2] - E[X_i]^2\\\\ &= p_i(1 - p_i). \\end{align}\\] Analogous to above for arbitrary \\(n\\) \\[\\begin{align} Var[X] &= E[X^2] - E[X]^2 \\\\ &= \\sum_{j=1}^n E[Y_j^2] - \\sum_{j=1}^n E[Y_j]^2 \\\\ &= \\sum_{j=1}^n E[Y_j^2] - E[Y_j]^2 \\\\ &= \\sum_{j=1}^n p(1-p) \\\\ &= np(1-p). \\end{align}\\] To calculate the covariance, we need \\(E[X_i X_j]\\). Again, let us start with \\(n = 1\\). Without loss of generality, let us assume \\(i = 1\\) and \\(j = 2\\). \\[\\begin{align} E[X_1 X_2] = \\sum_{n_1 = 0}^1 \\sum_{n_2 = 0}^1 \\frac{p_1^{n_1} n_1}{n_1!} \\frac{p_2^{n_2} n_2}{n_2!} \\sum_{n_3 = 0}^1 ... \\sum_{n_k = 0}^1 \\frac{1}{n_3!...n_k!}p_3^{n_3}...p_k^{n_k}. \\end{align}\\] In the above expression, at each iteration we multiply with \\(n_1\\) and \\(n_2\\). Since \\(n = 1\\), one of these always has to be zero. Therefore \\(E[X_1 X_2] = 0\\) and \\[\\begin{align} Cov(X_i, X_j) &= E[X_i X_j] - E[X_i]E[X_j] \\\\ &= - p_i p_j. \\end{align}\\] For arbitrary \\(n\\), let \\(X = \\sum_{t = 1}^n Y_t\\) be the sum of independent multinomial random variables \\(Y_t = [X_{1t}, X_{2t},...,X_{kt}]^T\\) with \\(n=1\\). Then \\(X_1 = \\sum_{t = 1}^n X_{1t}\\) and \\(X_2 = \\sum_{l = 1}^n X_{2l}\\). 
\\[\\begin{align} Cov(X_1, X_2) &= E[X_1 X_2] - E[X_1] E[X_2] \\\\ &= E[\\sum_{t = 1}^n X_{1t} \\sum_{l = 1}^n X_{2l}] - n^2 p_1 p_2 \\\\ &= \\sum_{t = 1}^n \\sum_{l = 1}^n E[X_{1t} X_{2l}] - n^2 p_1 p_2. \\end{align}\\] For \\(X_{1t}\\) and \\(X_{2l}\\) the expected value is zero when \\(t = l\\). When \\(t \\neq l\\) then they are independent, so the expected value is the product \\(p_1 p_2\\). There are \\(n^2\\) total terms, and for \\(n\\) of them \\(t = l\\) holds. So \\(E[X_1 X_2] = (n^2 - n) p_1 p_2\\). Inserting into the above \\[\\begin{align} Cov(X_1, X_2) &= (n^2 - n) p_1 p_2 - n^2 p_1 p_2 \\\\ &= - n p_1 p_2. \\end{align}\\] 8.2 Multivariate normal random variables Exercise 8.3 (Cholesky decomposition) Let \\(X\\) be a random vector of length \\(k\\) with \\(X_i \\sim \\text{N}(0, 1)\\) and \\(LL^*\\) the Cholesky decomposition of a Hermitian positive-definite matrix \\(A\\). Let \\(\\mu\\) be a vector of length \\(k\\). Find the distribution of the random vector \\(Y = \\mu + L X\\). Find the Cholesky decomposition of \\(A = \\begin{bmatrix} 2 & 1.2 \\\\ 1.2 & 1 \\end{bmatrix}\\). R: Use the results from a) and b) to sample from the MVN distribution \\(\\text{N}(\\mu, A)\\), where \\(\\mu = [1.5, -1]^T\\). Plot a scatterplot and compare it to direct samples from the multivariate normal distribution (rmvnorm). Solution. \\(X\\) has an independent normal distribution of dimension \\(k\\). Then \\[\\begin{align} Y = \\mu + L X &\\sim \\text{N}(\\mu, LL^T) \\\\ &\\sim \\text{N}(\\mu, A). \\end{align}\\] Solve \\[\\begin{align} \\begin{bmatrix} a & 0 \\\\ b & c \\end{bmatrix} \\begin{bmatrix} a & b \\\\ 0 & c \\end{bmatrix} = \\begin{bmatrix} 2 & 1.2 \\\\ 1.2 & 1 \\end{bmatrix} \\end{align}\\] # a set.seed(1) nsamps <- 1000 X <- matrix(data = rnorm(nsamps * 2), ncol = 2) mu <- c(1.5, -1) L <- matrix(data = c(sqrt(2), 0, 1.2 / sqrt(2), sqrt(1 - 1.2^2/2)), ncol = 2, byrow = TRUE) Y <- t(mu + L %*% t(X)) plot_df <- data.frame(rbind(X, Y), c(rep("X", nsamps), rep("Y", nsamps))) colnames(plot_df) <- c("D1", "D2", "var") ggplot(data = plot_df, aes(x = D1, y = D2, colour = as.factor(var))) + geom_point() Exercise 8.4 (Eigendecomposition) R: Let \\(\\Sigma = U \\Lambda U^T\\) be the eigendecomposition of covariance matrix \\(\\Sigma\\). Follow the procedure below, to sample from a multivariate normal with \\(\\mu = [-2, 1]^T\\) and \\(\\Sigma = \\begin{bmatrix} 0.3, -0.5 \\\\ -0.5, 1.6 \\end{bmatrix}\\): Sample from two independent standardized normal distributions to get \\(X\\). Find the eigendecomposition of \\(X\\) (eigen). Multiply \\(X\\) by \\(\\Lambda^{\\frac{1}{2}}\\) to get \\(X2\\). Consider how the eigendecomposition for \\(X2\\) changes compared to \\(X\\). Multiply \\(X2\\) by \\(U\\) to get \\(X3\\). Consider how the eigendecomposition for \\(X3\\) changes compared to \\(X2\\). Add \\(\\mu\\) to \\(X3\\). Consider how the eigendecomposition for \\(X4\\) changes compared to \\(X3\\). Plot the data and the eigenvectors (scaled with \\(\\Lambda^{\\frac{1}{2}}\\)) at each step. Hint: Use geom_segment for the eigenvectors. 
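A short note on why this procedure reproduces \\(\\Sigma\\) (a sketch of the reasoning, not part of the exercise text): if \\(X\\) has identity covariance, then \\[\\begin{align} Var[U \\Lambda^{\\frac{1}{2}} X] &= U \\Lambda^{\\frac{1}{2}} Var[X] \\Lambda^{\\frac{1}{2}} U^T = U \\Lambda U^T = \\Sigma, \\end{align}\\] so multiplying by \\(\\Lambda^{\\frac{1}{2}}\\) stretches the standardized sample along the coordinate axes, multiplying by \\(U\\) rotates it into the eigenbasis of \\(\\Sigma\\), and adding \\(\\mu\\) only shifts the mean without changing the covariance. The solution code below follows these steps one by one.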
# a set.seed(1) sigma <- matrix(data = c(0.3, -0.5, -0.5, 1.6), nrow = 2, byrow = TRUE) ed <- eigen(sigma) e_val <- ed$values e_vec <- ed$vectors # b set.seed(1) nsamps <- 1000 X <- matrix(data = rnorm(nsamps * 2), ncol = 2) vec1 <- matrix(c(1,0,0,1), nrow = 2) X2 <- t(sqrt(diag(e_val)) %*% t(X)) vec2 <- sqrt(diag(e_val)) %*% vec1 X3 <- t(e_vec %*% t(X2)) vec3 <- e_vec %*% vec2 X4 <- t(c(-2, 1) + t(X3)) vec4 <- c(-2, 1) + vec3 vec_mat <- data.frame(matrix(c(0,0,0,0,0,0,0,0,0,0,0,0,-2,1,-2,1), ncol = 2, byrow = TRUE), t(cbind(vec1, vec2, vec3, vec4)), c(1,1,2,2,3,3,4,4)) df <- data.frame(rbind(X, X2, X3, X4), c(rep(1, nsamps), rep(2, nsamps), rep(3, nsamps), rep(4, nsamps))) colnames(df) <- c("D1", "D2", "wh") colnames(vec_mat) <- c("D1", "D2", "E1", "E2", "wh") ggplot(data = df, aes(x = D1, y = D2)) + geom_point() + geom_segment(data = vec_mat, aes(xend = E1, yend = E2), color = "red") + facet_wrap(~ wh) + coord_fixed() Exercise 8.5 (Marginal and conditional distributions) Let \\(X \\sim \\text{N}(\\mu, \\Sigma)\\), where \\(\\mu = [2, 0, -1]^T\\) and \\(\\Sigma = \\begin{bmatrix} 1 & -0.2 & 0.5 \\\\ -0.2 & 1.4 & -1.2 \\\\ 0.5 & -1.2 & 2 \\\\ \\end{bmatrix}\\). Let \\(A\\) represent the first two random variables and \\(B\\) the third random variable. R: For the calculation in the following points, you can use R. Find the marginal distribution of \\(B\\). Find the conditional distribution of \\(B | A = [a_1, a_2]^T\\). Find the marginal distribution of \\(A\\). Find the conditional distribution of \\(A | B = b\\). R: Visually compare the distributions of a) and b), and c) and d) at three different conditional values. mu <- c(2, 0, -1) Sigma <- matrix(c(1, -0.2, 0.5, -0.2, 1.4, -1.2, 0.5, -1.2, 2), nrow = 3, byrow = TRUE) mu_A <- c(2, 0) mu_B <- -1 Sigma_A <- Sigma[1:2, 1:2] Sigma_B <- Sigma[3, 3] Sigma_AB <- Sigma[1:2, 3] # b tmp_b <- t(Sigma_AB) %*% solve(Sigma_A) mu_b <- mu_B - tmp_b %*% mu_A Sigma_b <- Sigma_B - t(Sigma_AB) %*% solve(Sigma_A) %*% Sigma_AB mu_b ## [,1] ## [1,] -1.676471 tmp_b ## [,1] [,2] ## [1,] 0.3382353 -0.8088235 Sigma_b ## [,1] ## [1,] 0.8602941 # d tmp_a <- Sigma_AB * (1 / Sigma_B) mu_a <- mu_A - tmp_a * mu_B Sigma_d <- Sigma_A - (Sigma_AB * (1 / Sigma_B)) %*% t(Sigma_AB) mu_a ## [1] 2.25 -0.60 tmp_a ## [1] 0.25 -0.60 Sigma_d ## [,1] [,2] ## [1,] 0.875 0.10 ## [2,] 0.100 0.68 Solution. \\(B \\sim \\text{N}(-1, 2)\\). \\(B | A = a \\sim \\text{N}(-1.68 + [0.34, -0.81] a, 0.86)\\). \\(\\mu_A = [2, 0]^T\\) and \\(\\Sigma_A = \\begin{bmatrix} 1 & -0.2 & \\\\ -0.2 & 1.4 \\\\ \\end{bmatrix}\\). 
\\[\\begin{align} A | B = b &\\sim \\text{N}(\\mu_t, \\Sigma_t), \\\\ \\mu_t &= [2.25, -0.6]^T + [0.25, -0.6]^T b, \\\\ \\Sigma_t &= \\begin{bmatrix} 0.875 & 0.1 \\\\ 0.1 & 0.68 \\\\ \\end{bmatrix} \\end{align}\\] library(mvtnorm) set.seed(1) nsamps <- 1000 # a and b samps <- as.data.frame(matrix(data = NA, nrow = 4 * nsamps, ncol = 2)) samps[1:nsamps,1] <- rnorm(nsamps, mu_B, Sigma_B) samps[1:nsamps,2] <- "marginal" for (i in 1:3) { a <- rmvnorm(1, mu_A, Sigma_A) samps[(i*nsamps + 1):((i + 1) * nsamps), 1] <- rnorm(nsamps, mu_b + tmp_b %*% t(a), Sigma_b) samps[(i*nsamps + 1):((i + 1) * nsamps), 2] <- paste0(# "cond", round(a, digits = 2), collapse = "-") } colnames(samps) <- c("x", "dist") ggplot(samps, aes(x = x)) + geom_density() + facet_wrap(~ dist) # c and d samps <- as.data.frame(matrix(data = NA, nrow = 4 * nsamps, ncol = 3)) samps[1:nsamps,1:2] <- rmvnorm(nsamps, mu_A, Sigma_A) samps[1:nsamps,3] <- "marginal" for (i in 1:3) { b <- rnorm(1, mu_B, Sigma_B) samps[(i*nsamps + 1):((i + 1) * nsamps), 1:2] <- rmvnorm(nsamps, mu_a + tmp_a * b, Sigma_d) samps[(i*nsamps + 1):((i + 1) * nsamps), 3] <- b } colnames(samps) <- c("x", "y", "dist") ggplot(samps, aes(x = x, y = y)) + geom_point() + geom_smooth(method = "lm") + facet_wrap(~ dist) 8.3 Transformations Exercise 8.6 Let \\((U,V)\\) be a random variable with PDF \\(p(u,v) = \\frac{1}{4 \\sqrt{u}}\\), \\(U \\in [0,4]\\) and \\(V \\in [\\sqrt{U}, \\sqrt{U} + 1]\\). Let \\(X = \\sqrt{U}\\) and \\(Y = V - \\sqrt{U}\\). Find PDF of \\((X,Y)\\). What can you tell about distributions of \\(X\\) and \\(Y\\)? This exercise shows how we can simplify a probabilistic problem with a clever use of transformations. R: Take 1000 samples from \\((X,Y)\\) and transform them with inverses of the above functions to get samples from \\((U,V)\\). Plot both sets of samples. Solution. First we need to find the inverse functions. Since \\(x = \\sqrt{u}\\) it follows that \\(u = x^2\\), and that \\(x \\in [0,2]\\). Similarly \\(v = y + x\\) and \\(y \\in [0,1]\\). Let us first find the Jacobian. \\[\\renewcommand\\arraystretch{1.6} J(x,y) = \\begin{bmatrix} \\frac{\\partial u}{\\partial x} & \\frac{\\partial v}{\\partial x} \\\\%[1ex] % <-- 1ex more space between rows of matrix \\frac{\\partial u}{\\partial y} & \\frac{\\partial v}{\\partial y} \\end{bmatrix} = \\begin{bmatrix} 2x & 1 \\\\%[1ex] % <-- 1ex more space between rows of matrix 0 & 1 \\end{bmatrix}, \\] and the determinant is \\(|J(x,y)| = 2x\\). Putting everything together, we get \\[\\begin{align} p_{X,Y}(x,y) = p_{U,V}(x^2, y + x) |J(x,y)| = \\frac{1}{4 \\sqrt{x^2}} 2x = \\frac{1}{2}. \\end{align}\\] This reminds us of the Uniform distribution. Indeed we can see that \\(p_X(x) = \\frac{1}{2}\\) and \\(p_Y(y) = 1\\). So instead of dealing with an awkward PDF of \\((U,V)\\) and the corresponding dynamic bounds, we are now looking at two independent Uniform random variables. In practice, this could make modeling much easier. set.seed(1) nsamps <- 2000 x <- runif(nsamps, min = 0, max = 2) y <- runif(nsamps) orig <- tibble(x = x, y = y, vrs = "original") u <- x^2 v <- y + x transf <- tibble(x = u, y = v, vrs = "transformed") df <- bind_rows(orig, transf) ggplot(df, aes(x = x, y = y, color = vrs)) + geom_point(alpha = 0.3) Exercise 8.7 R: Write a function that will calculate the probability density of an arbitraty multivariate normal distribution, based on independent standardized normal PDFs. Compare with dmvnorm from the mvtnorm package. 
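The idea behind such a function (a sketch based on the change-of-variables formula; the symbols \\(L\\) and \\(Z\\) are our notation): if \\(Y = \\mu + L Z\\), where \\(LL^T = \\Sigma\\) and \\(Z\\) is a vector of \\(k\\) independent standard normal variables, then \\[\\begin{align} p_Y(y) &= p_Z\\big(L^{-1}(y - \\mu)\\big) |\\det L^{-1}| = |\\det L^{-1}| \\prod_{i=1}^k \\phi\\Big(\\big(L^{-1}(y - \\mu)\\big)_i\\Big), \\end{align}\\] where \\(\\phi\\) is the standard normal PDF. The function in the solution below implements exactly this formula, taking \\(L\\) from the Cholesky decomposition of \\(\\Sigma\\).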
library(mvtnorm) set.seed(1) mvn_dens <- function (y, mu, Sigma) { L <- chol(Sigma) L_inv <- solve(t(L)) g_inv <- L_inv %*% t(y - mu) J <- L_inv J_det <- det(J) return(prod(dnorm(g_inv)) * J_det) } mu_v <- c(-2, 0, 1) cov_m <- matrix(c(1, -0.2, 0.5, -0.2, 2, 0.3, 0.5, 0.3, 1.6), ncol = 3, byrow = TRUE) n_comp <- 20 for (i in 1:n_comp) { x <- rmvnorm(1, mean = mu_v, sigma = cov_m) print(paste0("My function: ", mvn_dens(x, mu_v, cov_m), ", dmvnorm: ", dmvnorm(x, mu_v, cov_m))) } ## [1] "My function: 0.0229514237156383, dmvnorm: 0.0229514237156383" ## [1] "My function: 0.00763138915406231, dmvnorm: 0.00763138915406231" ## [1] "My function: 0.0230688881105741, dmvnorm: 0.0230688881105741" ## [1] "My function: 0.0113616213114731, dmvnorm: 0.0113616213114731" ## [1] "My function: 0.00151808500121907, dmvnorm: 0.00151808500121907" ## [1] "My function: 0.0257658045974509, dmvnorm: 0.0257658045974509" ## [1] "My function: 0.0157963825730805, dmvnorm: 0.0157963825730805" ## [1] "My function: 0.00408856287529248, dmvnorm: 0.00408856287529248" ## [1] "My function: 0.0327793540101256, dmvnorm: 0.0327793540101256" ## [1] "My function: 0.0111606542967978, dmvnorm: 0.0111606542967978" ## [1] "My function: 0.0147636757585684, dmvnorm: 0.0147636757585684" ## [1] "My function: 0.0142948300412207, dmvnorm: 0.0142948300412207" ## [1] "My function: 0.0203093820657542, dmvnorm: 0.0203093820657542" ## [1] "My function: 0.0287533273357481, dmvnorm: 0.0287533273357481" ## [1] "My function: 0.0213402305128623, dmvnorm: 0.0213402305128623" ## [1] "My function: 0.0218356957993885, dmvnorm: 0.0218356957993885" ## [1] "My function: 0.0250750113961771, dmvnorm: 0.0250750113961771" ## [1] "My function: 0.0166498666348048, dmvnorm: 0.0166498666348048" ## [1] "My function: 0.00189725106874659, dmvnorm: 0.00189725106874659" ## [1] "My function: 0.0196697814975113, dmvnorm: 0.0196697814975113" "],["ard.html", "Chapter 9 Alternative representation of distributions 9.1 Probability generating functions (PGFs) 9.2 Moment generating functions (MGFs)", " Chapter 9 Alternative representation of distributions This chapter deals with alternative representation of distributions. The students are expected to acquire the following knowledge: Theoretical Probability generating functions. Moment generating functions. .fold-btn { float: right; margin: 5px 5px 0 0; } .fold { border: 1px solid black; min-height: 40px; } 9.1 Probability generating functions (PGFs) Exercise 9.1 Show that the sum of independent Poisson random variables is itself a Poisson random variable. R: Let \\(X\\) be a sum of three Poisson distributions with \\(\\lambda_i \\in \\{2, 5.2, 10\\}\\). Take 1000 samples and plot the three distributions and the sum. Then take 1000 samples from the theoretical distribution of \\(X\\) and compare them to the sum. Solution. Let \\(X_i \\sim \\text{Poisson}(\\lambda_i)\\) for \\(i = 1,...,n\\), and let \\(X = \\sum_{i=1}^n X_i\\). 
\\[\\begin{align} \\alpha_X(t) &= \\prod_{i=1}^n \\alpha_{X_i}(t) \\\\ &= \\prod_{i=1}^n \\bigg( \\sum_{j=0}^\\infty t^j \\frac{\\lambda_i^j e^{-\\lambda_i}}{j!} \\bigg) \\\\ &= \\prod_{i=1}^n \\bigg( e^{-\\lambda_i} \\sum_{j=0}^\\infty \\frac{(t\\lambda_i)^j }{j!} \\bigg) \\\\ &= \\prod_{i=1}^n \\bigg( e^{-\\lambda_i} e^{t \\lambda_i} \\bigg) & \\text{power series} \\\\ &= \\prod_{i=1}^n \\bigg( e^{\\lambda_i(t - 1)} \\bigg) \\\\ &= e^{\\sum_{i=1}^n \\lambda_i(t - 1)} \\\\ &= e^{t \\sum_{i=1}^n \\lambda_i - \\sum_{i=1}^n \\lambda_i} \\\\ &= e^{-\\sum_{i=1}^n \\lambda_i} \\sum_{j=0}^\\infty \\frac{(t \\sum_{i=1}^n \\lambda_i)^j}{j!}\\\\ &= \\sum_{j=0}^\\infty \\frac{e^{-\\sum_{i=1}^n \\lambda_i} (t \\sum_{i=1}^n \\lambda_i)^j}{j!}\\\\ \\end{align}\\] The last term is the PGF of a Poisson random variable with parameter \\(\\sum_{i=1}^n \\lambda_i\\). Because the PGF is unique, \\(X\\) is a Poisson random variable. set.seed(1) library(tidyr) nsamps <- 1000 samps <- matrix(data = NA, nrow = nsamps, ncol = 4) samps[ ,1] <- rpois(nsamps, 2) samps[ ,2] <- rpois(nsamps, 5.2) samps[ ,3] <- rpois(nsamps, 10) samps[ ,4] <- samps[ ,1] + samps[ ,2] + samps[ ,3] colnames(samps) <- c(2, 2.5, 10, "sum") gsamps <- as_tibble(samps) gsamps <- gather(gsamps, key = "dist", value = "value") ggplot(gsamps, aes(x = value)) + geom_bar() + facet_wrap(~ dist) samps <- cbind(samps, "theoretical" = rpois(nsamps, 2 + 5.2 + 10)) gsamps <- as_tibble(samps[ ,4:5]) gsamps <- gather(gsamps, key = "dist", value = "value") ggplot(gsamps, aes(x = value, fill = dist)) + geom_bar(position = "dodge") Exercise 9.2 Find the expected value and variance of the negative binomial distribution. Hint: Find the Taylor series of \\((1 - y)^{-r}\\) at point 0. Solution. Let \\(X \\sim \\text{NB}(r, p)\\). \\[\\begin{align} \\alpha_X(t) &= E[t^X] \\\\ &= \\sum_{j=0}^\\infty t^j \\binom{j + r - 1}{j} (1 - p)^r p^j \\\\ &= (1 - p)^r \\sum_{j=0}^\\infty \\binom{j + r - 1}{j} (tp)^j \\\\ &= (1 - p)^r \\sum_{j=0}^\\infty \\frac{(j + r - 1)(j + r - 2)...r}{j!} (tp)^j. \\\\ \\end{align}\\] Let us look at the Taylor series of \\((1 - y)^{-r}\\) at 0 \\[\\begin{align} (1 - y)^{-r} = &1 + \\frac{-r(-1)}{1!}y + \\frac{-r(-r - 1)(-1)^2}{2!}y^2 + \\\\ &\\frac{-r(-r - 1)(-r - 2)(-1)^3}{3!}y^3 + ... \\\\ \\end{align}\\] How does the \\(k\\)-th term look like? We have \\(k\\) derivatives of our function so \\[\\begin{align} \\frac{d^k}{d^k y} (1 - y)^{-r} &= \\frac{-r(-r - 1)...(-r - k + 1)(-1)^k}{k!}y^k \\\\ &= \\frac{r(r + 1)...(r + k - 1)}{k!}y^k. \\end{align}\\] We observe that this equals to the \\(j\\)-th term in the sum of NB PGF. Therefore \\[\\begin{align} \\alpha_X(t) &= (1 - p)^r (1 - tp)^{-r} \\\\ &= \\Big(\\frac{1 - p}{1 - tp}\\Big)^r \\end{align}\\] To find the expected value, we need to differentiate \\[\\begin{align} \\frac{d}{dt} \\Big(\\frac{1 - p}{1 - tp}\\Big)^r &= r \\Big(\\frac{1 - p}{1 - tp}\\Big)^{r-1} \\frac{d}{dt} \\frac{1 - p}{1 - tp} \\\\ &= r \\Big(\\frac{1 - p}{1 - tp}\\Big)^{r-1} \\frac{p(1 - p)}{(1 - tp)^2}. \\\\ \\end{align}\\] Evaluating this at 1, we get: \\[\\begin{align} E[X] = \\frac{rp}{1 - p}. \\end{align}\\] For the variance we need the second derivative. 
\\[\\begin{align} \\frac{d^2}{d^2t} \\Big(\\frac{1 - p}{1 - tp}\\Big)^r &= \\frac{p^2 r (r + 1) (\\frac{1 - p}{1 - tp})^r}{(tp - 1)^2} \\end{align}\\] Evaluating this at 1 and inserting the first derivatives, we get: \\[\\begin{align} Var[X] &= \\frac{d^2}{dt^2} \\alpha_X(1) + \\frac{d}{dt}\\alpha_X(1) - \\Big(\\frac{d}{dt}\\alpha_X(t) \\Big)^2 \\\\ &= \\frac{p^2 r (r + 1)}{(1 - p)^2} + \\frac{rp}{1 - p} - \\frac{r^2p^2}{(1 - p)^2} \\\\ &= \\frac{rp}{(1 - p)^2}. \\end{align}\\] library(tidyr) set.seed(1) nsamps <- 100000 find_p <- function (mu, r) { return (10 / (r + 10)) } r <- c(1,2,10,20) p <- find_p(10, r) sigma <- rep(sqrt(p*r / (1 - p)^2), each = nsamps) samps <- cbind("r=1" = rnbinom(nsamps, size = r[1], prob = 1 - p[1]), "r=2" = rnbinom(nsamps, size = r[2], prob = 1 - p[2]), "r=4" = rnbinom(nsamps, size = r[3], prob = 1 - p[3]), "r=20" = rnbinom(nsamps, size = r[4], prob = 1 - p[4])) gsamps <- gather(as.data.frame(samps)) iw <- (gsamps$value > sigma + 10) | (gsamps$value < sigma - 10) ggplot(gsamps, aes(x = value, fill = iw)) + geom_bar() + # geom_density() + facet_wrap(~ key) 9.2 Moment generating functions (MGFs) Exercise 9.3 Find the variance of the geometric distribution. Solution. Let \\(X \\sim \\text{Geometric}(p)\\). The MGF of the geometric distribution is \\[\\begin{align} M_X(t) &= E[e^{tX}] \\\\ &= \\sum_{k=0}^\\infty p(1 - p)^k e^{tk} \\\\ &= p \\sum_{k=0}^\\infty ((1 - p)e^t)^k. \\end{align}\\] Let us assume that \\((1 - p)e^t < 1\\). Then, by using the geometric series we get \\[\\begin{align} M_X(t) &= \\frac{p}{1 - e^t + pe^t}. \\end{align}\\] The first derivative of the above expression is \\[\\begin{align} \\frac{d}{dt}M_X(t) &= \\frac{-p(-e^t + pe^t)}{(1 - e^t + pe^t)^2}, \\end{align}\\] and evaluating at \\(t = 0\\), we get \\(\\frac{1 - p}{p}\\), which we already recognize as the expected value of the geometric distribution. The second derivative is \\[\\begin{align} \\frac{d^2}{dt^2}M_X(t) &= \\frac{(p-1)pe^t((p-1)e^t - 1)}{((p - 1)e^t + 1)^3}, \\end{align}\\] and evaluating at \\(t = 0\\), we get \\(\\frac{(p - 1)(p - 2)}{p^2}\\). Combining we get the variance \\[\\begin{align} Var(X) &= \\frac{(p - 1)(p - 2)}{p^2} - \\frac{(1 - p)^2}{p^2} \\\\ &= \\frac{(p-1)(p-2) - (1-p)^2}{p^2} \\\\ &= \\frac{1 - p}{p^2}. \\end{align}\\] Exercise 9.4 Find the distribution of sum of two normal random variables \\(X\\) and \\(Y\\), by comparing \\(M_{X+Y}(t)\\) to \\(M_X(t)\\). R: To illustrate the result draw random samples from N\\((-3, 1)\\) and N\\((5, 1.2)\\) and calculate the empirical mean and variance of \\(X+Y\\). Plot all three histograms in one plot. Solution. Let \\(X \\sim \\text{N}(\\mu_X, 1)\\) and \\(Y \\sim \\text{N}(\\mu_Y, 1)\\). The MGF of the sum is \\[\\begin{align} M_{X+Y}(t) &= M_X(t) M_Y(t). \\end{align}\\] Let us calculate \\(M_X(t)\\), the MGF for \\(Y\\) then follows analogously. 
\\[\\begin{align} M_X(t) &= \\int_{-\\infty}^\\infty e^{tx} \\frac{1}{\\sqrt{2 \\pi \\sigma_X^2}} e^{-\\frac{(x - \\mu_X)^2}{2\\sigma_X^2}} dx \\\\ &= \\int_{-\\infty}^\\infty \\frac{1}{\\sqrt{2 \\pi \\sigma_X^2}} e^{-\\frac{(x - \\mu_X)^2 - 2\\sigma_X^2 tx}{2\\sigma_X^2}} dx \\\\ &= \\int_{-\\infty}^\\infty \\frac{1}{\\sqrt{2 \\pi \\sigma_X^2}} e^{-\\frac{x^2 - 2\\mu_X x + \\mu_X^2 - 2\\sigma_X^2 tx}{2\\sigma_X^2}} dx \\\\ &= \\int_{-\\infty}^\\infty \\frac{1}{\\sqrt{2 \\pi \\sigma_X^2}} e^{-\\frac{(x - (\\mu_X + \\sigma_X^2 t))^2 + \\mu_X^2 - (\\mu_X + \\sigma_X^2 t)^2}{2\\sigma_X^2}} dx & \\text{complete the square}\\\\ &= e^{-\\frac{\\mu_X^2 - (\\mu_X + \\sigma_X^2 t)^2}{2\\sigma_X^2}} \\int_{-\\infty}^\\infty \\frac{1}{\\sqrt{2 \\pi \\sigma_X^2}} e^{-\\frac{(x - (\\mu_X + \\sigma_X^2 t))^2}{2\\sigma_X^2}} dx & \\\\ &= e^{-\\frac{\\mu_X^2 - (\\mu_X + \\sigma_X^2 t)^2}{2\\sigma_X^2}} & \\text{normal PDF} \\\\ &= e^{-\\frac{\\mu_X^2 - \\mu_X^2 - 2\\mu_X \\sigma_X^2 t - \\sigma_X^4 t^2}{2\\sigma_X^2}} \\\\ &= e^{\\mu_X t + \\frac{\\sigma_X^2 t^2}{2}}. \\\\ \\end{align}\\] The MGF of the sum is then \\[\\begin{align} M_{X+Y}(t) &= e^{\\mu_X t + 0.5\\sigma_X^2 t^2} e^{\\mu_Y t + 0.5\\sigma_Y^2 t^2} \\\\ &= e^{t(\\mu_X + \\mu_Y) + 0.5 t^2(\\sigma_X^2 + \\sigma_Y^2)}. \\end{align}\\] By comparing \\(M_{X+Y}(t)\\) and \\(M_X(t)\\) we observe that both have the same form: in the exponent, the coefficient of \\(t\\) is the mean and the coefficient of \\(\\frac{t^2}{2}\\) is the variance. Since MGFs are unique, we conclude that \\(Z = X + Y \\sim \\text{N}(\\mu_X + \\mu_Y, \\sigma_X^2 + \\sigma_Y^2)\\). library(tidyr) library(ggplot2) set.seed(1) nsamps <- 1000 x <- rnorm(nsamps, -3, 1) y <- rnorm(nsamps, 5, 1.2) z <- x + y mean(z) ## [1] 1.968838 var(z) ## [1] 2.645034 df <- data.frame(x = x, y = y, z = z) %>% gather() ggplot(df, aes(x = value, fill = key)) + geom_histogram(position = "dodge") "],["ci.html", "Chapter 10 Concentration inequalities 10.1 Comparison 10.2 Practical", " Chapter 10 Concentration inequalities This chapter deals with concentration inequalities. The students are expected to acquire the following knowledge: Theoretical More assumptions produce closer bounds. R Optimization. Estimating probability inequalities. 10.1 Comparison Exercise 10.1 R: Let \\(X\\) be a geometric random variable with \\(p = 0.7\\). Visually compare the Markov bound, Chernoff bound, and the theoretical probabilities for \\(x = 1,...,12\\). To get the best fitting Chernoff bound, you will need to optimize the bound depending on \\(t\\). Use either analytical or numerical optimization.
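As a reminder of where the two bounds in the solution come from (a sketch using the geometric MGF \\(M_X(t) = \\frac{p}{1 - e^t + pe^t}\\) derived in the previous chapter): Markov's inequality applied directly gives \\(P(X \\geq a) \\leq \\frac{E[X]}{a} = \\frac{1-p}{pa}\\), while applying it to \\(e^{tX}\\) gives the Chernoff bound \\[\\begin{align} P(X \\geq a) = P(e^{tX} \\geq e^{ta}) \\leq \\frac{M_X(t)}{e^{ta}} = \\frac{p}{(1 - e^t + pe^t)e^{ta}}, \\qquad 0 < t < \\ln \\frac{1}{1-p}, \\end{align}\\] which is the expression that the code below minimizes over \\(t\\) for each \\(a\\).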
bound_chernoff <- function (t, p, a) { return ((p / (1 - exp(t) + p * exp(t))) / exp(a * t)) } set.seed(1) p <- 0.7 a <- seq(1, 12, by = 1) ci_markov <- (1 - p) / p / a t <- vector(mode = "numeric", length = length(a)) for (i in 1:length(t)) { t[i] <- optimize(bound_chernoff, interval = c(0, log(1 / (1 - p))), p = p, a = a[i])$minimum } t ## [1] 0.5108267 0.7984981 0.9162927 0.9808238 1.0216635 1.0498233 1.0704327 ## [8] 1.0861944 1.0986159 1.1086800 1.1169653 1.1239426 ci_chernoff <- (p / (1 - exp(t) + p * exp(t))) / exp(a * t) actual <- 1 - pgeom(a, 0.7) plot_df <- rbind( data.frame(x = a, y = ci_markov, type = "Markov"), data.frame(x = a, y = ci_chernoff, type = "Chernoff"), data.frame(x = a, y = actual, type = "Actual") ) ggplot(plot_df, aes(x = x, y = y, color = type)) + geom_line() Exercise 10.2 R: Let \\(X\\) be a sum of 100 Beta distributions with random parameters. Take 1000 samples and plot the Chebyshev bound, Hoeffding bound, and the empirical probabilities. set.seed(1) nvars <- 100 nsamps <- 1000 samps <- matrix(data = NA, nrow = nsamps, ncol = nvars) Sn_mean <- 0 Sn_var <- 0 for (i in 1:nvars) { alpha1 <- rgamma(1, 10, 1) beta1 <- rgamma(1, 10, 1) X <- rbeta(nsamps, alpha1, beta1) Sn_mean <- Sn_mean + alpha1 / (alpha1 + beta1) Sn_var <- Sn_var + alpha1 * beta1 / ((alpha1 + beta1)^2 * (alpha1 + beta1 + 1)) samps[ ,i] <- X } mean(apply(samps, 1, sum)) ## [1] 51.12511 Sn_mean ## [1] 51.15723 var(apply(samps, 1, sum)) ## [1] 1.170652 Sn_var ## [1] 1.166183 a <- 1:30 b <- a / sqrt(Sn_var) ci_chebyshev <- 1 / b^2 ci_hoeffding <- 2 * exp(- 2 * a^2 / nvars) empirical <- NULL for (i in 1:length(a)) { empirical[i] <- sum(abs((apply(samps, 1, sum)) - Sn_mean) >= a[i])/ nsamps } plot_df <- rbind( data.frame(x = a, y = ci_chebyshev, type = "Chebyshev"), data.frame(x = a, y = ci_hoeffding, type = "Hoeffding"), data.frame(x = a, y = empirical, type = "Empirical") ) ggplot(plot_df, aes(x = x, y = y, color = type)) + geom_line() ggplot(plot_df, aes(x = x, y = y, color = type)) + geom_line() + coord_cartesian(xlim = c(15, 25), ylim = c(0, 0.05)) 10.2 Practical Exercise 10.3 From Jagannathan. Let \\(X_i\\), \\(i = 1,...n\\), be a random sample of size \\(n\\) of a random variable \\(X\\). Let \\(X\\) have mean \\(\\mu\\) and variance \\(\\sigma^2\\). Find the size of the sample \\(n\\) required so that the probability that the difference between sample mean and true mean is smaller than \\(\\frac{\\sigma}{10}\\) is at least 0.95. Hint: Derive a version of the Chebyshev inequality for \\(P(|X - \\mu| \\geq a)\\) using Markov inequality. Solution. Let \\(\\bar{X} = \\sum_{i=1}^n X_i\\). Then \\(E[\\bar{X}] = \\mu\\) and \\(Var[\\bar{X}] = \\frac{\\sigma^2}{n}\\). Let us first derive another representation of Chebyshev inequality. \\[\\begin{align} P(|X - \\mu| \\geq a) = P(|X - \\mu|^2 \\geq a^2) \\leq \\frac{E[|X - \\mu|^2]}{a^2} = \\frac{Var[X]}{a^2}. \\end{align}\\] Let us use that on our sampling distribution: \\[\\begin{align} P(|\\bar{X} - \\mu| \\geq \\frac{\\sigma}{10}) \\leq \\frac{100 Var[\\bar{X}]}{\\sigma^2} = \\frac{100 Var[X]}{n \\sigma^2} = \\frac{100}{n}. \\end{align}\\] We are interested in the difference being smaller, therefore \\[\\begin{align} P(|\\bar{X} - \\mu| < \\frac{\\sigma}{10}) = 1 - P(|\\bar{X} - \\mu| \\geq \\frac{\\sigma}{10}) \\geq 1 - \\frac{100}{n} \\geq 0.95. \\end{align}\\] It follows that we need a sample size of \\(n \\geq \\frac{100}{0.05} = 2000\\). 
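As a quick simulation check of this sample size (a sketch with an arbitrarily chosen distribution — here the standard normal — since the Chebyshev argument holds for any distribution with finite variance): set.seed(1) n <- 2000 nsim <- 10000 # sample means of n = 2000 standard normal observations xbar <- replicate(nsim, mean(rnorm(n))) # proportion of runs where the sample mean is within sigma / 10 = 0.1 of the true mean 0 mean(abs(xbar) < 0.1) # Chebyshev only guarantees at least 0.95; for the normal the observed proportion is much higher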
"],["crv.html", "Chapter 11 Convergence of random variables", " Chapter 11 Convergence of random variables This chapter deals with convergence of random variables. The students are expected to acquire the following knowledge: Theoretical Finding convergences of random variables. .fold-btn { float: right; margin: 5px 5px 0 0; } .fold { border: 1px solid black; min-height: 40px; } Exercise 11.1 Let \\(X_1\\), \\(X_2\\),…, \\(X_n\\) be a sequence of Bernoulli random variables. Let \\(Y_n = \\frac{X_1 + X_2 + ... + X_n}{n^2}\\). Show that this sequence converges point-wise to the zero random variable. R: Use a simulation to check your answer. Solution. Let \\(\\epsilon\\) be arbitrary. We need to find such \\(n_0\\), that for every \\(n\\) greater than \\(n_0\\) \\(|Y_n| < \\epsilon\\) holds. \\[\\begin{align} |Y_n| &= |\\frac{X_1 + X_2 + ... + X_n}{n^2}| \\\\ &\\leq |\\frac{n}{n^2}| \\\\ &= \\frac{1}{n}. \\end{align}\\] So we need to find such \\(n_0\\), that for every \\(n > n_0\\) we will have \\(\\frac{1}{n} < \\epsilon\\). So \\(n_0 > \\frac{1}{\\epsilon}\\). x <- 1:1000 X <- matrix(data = NA, nrow = length(x), ncol = 100) y <- vector(mode = "numeric", length = length(x)) for (i in 1:length(x)) { X[i, ] <- rbinom(100, size = 1, prob = 0.5) } X <- apply(X, 2, cumsum) tmp_mat <- matrix(data = (1:1000)^2, nrow = 1000, ncol = 100) X <- X / tmp_mat y <- apply(X, 1, mean) ggplot(data.frame(x = x, y = y), aes(x = x, y = y)) + geom_line() Exercise 11.2 Let \\(\\Omega = [0,1]\\) and let \\(X_n\\) be a sequence of random variables, defined as \\[\\begin{align} X_n(\\omega) = \\begin{cases} \\omega^3, &\\omega = \\frac{i}{n}, &0 \\leq i \\leq 1 \\\\ 1, & \\text{otherwise.} \\end{cases} \\end{align}\\] Show that \\(X_n\\) converges almost surely to \\(X \\sim \\text{Uniform}(0,1)\\). Solution. We need to show \\(P(\\{\\omega: X_n(\\omega) \\rightarrow X(\\omega)\\}) = 1\\). Let \\(\\omega \\neq \\frac{i}{n}\\). Then for any \\(\\omega\\), \\(X_n\\) converges pointwise to \\(X\\): \\[\\begin{align} X_n(\\omega) = 1 \\implies |X_n(\\omega) - X(s)| = |1 - 1| < \\epsilon. \\end{align}\\] The above is independent of \\(n\\). Since there are countably infinite number of elements in the complement (\\(\\frac{i}{n}\\)), the probability of this set is 1. Exercise 11.3 Borrowed from Wasserman. Let \\(X_n \\sim \\text{N}(0, \\frac{1}{n})\\) and let \\(X\\) be a random variable with CDF \\[\\begin{align} F_X(x) = \\begin{cases} 0, &x < 0 \\\\ 1, &x \\geq 0. \\end{cases} \\end{align}\\] Does \\(X_n\\) converge to \\(X\\) in distribution? How about in probability? Prove or disprove these statement. R: Plot the CDF of \\(X_n\\) for \\(n = 1, 2, 5, 10, 100, 1000\\). Solution. Let us first check convergence in distribution. \\[\\begin{align} \\lim_{n \\rightarrow \\infty} F_{X_n}(x) &= \\lim_{n \\rightarrow \\infty} \\phi (\\sqrt(n) x). \\end{align}\\] We have two cases, for \\(x < 0\\) and \\(x > 0\\). We do not need to check for \\(x = 0\\), since \\(F_X\\) is not continuous in that point. \\[\\begin{align} \\lim_{n \\rightarrow \\infty} \\phi (\\sqrt(n) x) = \\begin{cases} 0, & x < 0 \\\\ 1, & x > 0. \\end{cases} \\end{align}\\] This is the same as \\(F_X\\). Let us now check convergence in probability. 
Since \\(X\\) is a point-mass distribution at zero, we have \\[\\begin{align} \\lim_{n \\rightarrow \\infty} P(|X_n| > \\epsilon) &= \\lim_{n \\rightarrow \\infty} (P(X_n > \\epsilon) + P(X_n < -\\epsilon)) \\\\ &= \\lim_{n \\rightarrow \\infty} (1 - P(X_n < \\epsilon) + P(X_n < -\\epsilon)) \\\\ &= \\lim_{n \\rightarrow \\infty} (1 - \\phi(\\sqrt{n} \\epsilon) + \\phi(- \\sqrt{n} \\epsilon)) \\\\ &= 0. \\end{align}\\] n <- c(1,2,5,10,100,1000) ggplot(data = data.frame(x = seq(-5, 5, by = 0.01)), aes(x = x)) + stat_function(fun = pnorm, args = list(mean = 0, sd = 1/1), aes(color = "sd = 1/1")) + stat_function(fun = pnorm, args = list(mean = 0, sd = 1/2), aes(color = "sd = 1/2")) + stat_function(fun = pnorm, args = list(mean = 0, sd = 1/5), aes(color = "sd = 1/5")) + stat_function(fun = pnorm, args = list(mean = 0, sd = 1/10), aes(color = "sd = 1/10")) + stat_function(fun = pnorm, args = list(mean = 0, sd = 1/100), aes(color = "sd = 1/100")) + stat_function(fun = pnorm, args = list(mean = 0, sd = 1/1000), aes(color = "sd = 1/1000")) + stat_function(fun = pnorm, args = list(mean = 0, sd = 1/10000), aes(color = "sd = 1/10000")) Exercise 11.4 Let \\(X_i\\) be i.i.d. and \\(\\mu = E(X_1)\\). Let variance of \\(X_1\\) be finite. Show that the mean of \\(X_i\\), \\(\\bar{X}_n = \\frac{1}{n}\\sum_{i=1}^n X_i\\) converges in quadratic mean to \\(\\mu\\). Solution. \\[\\begin{align} \\lim_{n \\rightarrow \\infty} E(|\\bar{X_n} - \\mu|^2) &= \\lim_{n \\rightarrow \\infty} E(\\bar{X_n}^2 - 2 \\bar{X_n} \\mu + \\mu^2) \\\\ &= \\lim_{n \\rightarrow \\infty} (E(\\bar{X_n}^2) - 2 \\mu E(\\frac{\\sum_{i=1}^n X_i}{n}) + \\mu^2) \\\\ &= \\lim_{n \\rightarrow \\infty} E(\\bar{X_n})^2 + \\lim_{n \\rightarrow \\infty} Var(\\bar{X_n}) - 2 \\mu^2 + \\mu^2 \\\\ &= \\lim_{n \\rightarrow \\infty} \\frac{n^2 \\mu^2}{n^2} + \\lim_{n \\rightarrow \\infty} \\frac{\\sigma^2}{n} - \\mu^2 \\\\ &= \\mu^2 - \\mu^2 + \\lim_{n \\rightarrow \\infty} \\frac{\\sigma^2}{n} \\\\ &= 0. \\end{align}\\] "],["lt.html", "Chapter 12 Limit theorems", " Chapter 12 Limit theorems This chapter deals with limit theorems. The students are expected to acquire the following knowledge: Theoretical Monte Carlo integration convergence. Difference between weak and strong law of large numbers. .fold-btn { float: right; margin: 5px 5px 0 0; } .fold { border: 1px solid black; min-height: 40px; } Exercise 12.1 Show that Monte Carlo integration converges almost surely to the true integral of a bounded function. Solution. Let \\(g\\) be a function defined on \\(\\Omega\\). Let \\(X_i\\), \\(i = 1,...,n\\) be i.i.d. (multivariate) uniform random variables with bounds defined on \\(\\Omega\\). Let \\(Y_i\\) = \\(g(X_i)\\). Then it follows that \\(Y_i\\) are also i.i.d. random variables and their expected value is \\(E[g(X)] = \\int_{\\Omega} g(x) f_X(x) dx = \\frac{1}{V_{\\Omega}} \\int_{\\Omega} g(x) dx\\). By the strong law of large numbers, we have \\[\\begin{equation} \\frac{1}{n}\\sum_{i=1}^n Y_i \\xrightarrow{\\text{a.s.}} E[g(X)]. \\end{equation}\\] It follows that \\[\\begin{equation} V_{\\Omega} \\frac{1}{n}\\sum_{i=1}^n Y_i \\xrightarrow{\\text{a.s.}} \\int_{\\Omega} g(x) dx. \\end{equation}\\] Exercise 12.2 Let \\(X\\) be a geometric random variable with probability 0.5 and support in positive integers. Let \\(Y = 2^X (-1)^X X^{-1}\\). Find the expected value of \\(Y\\) by using conditional convergence (this variable does not have an expected value in the conventional sense – the series is not absolutely convergent). 
R: Draw \\(10000\\) samples from a geometric distribution with probability 0.5 and support in positive integers to get \\(X\\). Then calculate \\(Y\\) and plot the means at each iteration (sample). Additionally, plot the expected value calculated in a. Try it with different seeds. What do you notice? Solution. \\[\\begin{align*} E[Y] &= \\sum_{x=1}^{\\infty} \\frac{2^x (-1)^x}{x} 0.5^x \\\\ &= \\sum_{x=1}^{\\infty} \\frac{(-1)^x}{x} \\\\ &= - \\sum_{x=1}^{\\infty} \\frac{(-1)^{x+1}}{x} \\\\ &= - \\ln(2) \\end{align*}\\] set.seed(3) x <- rgeom(100000, prob = 0.5) + 1 y <- 2^x * (-1)^x * x^{-1} y_means <- cumsum(y) / seq_along(y) df <- data.frame(x = 1:length(y_means), y = y_means) ggplot(data = df, aes(x = x, y = y)) + geom_line() + geom_hline(yintercept = -log(2)) "],["eb.html", "Chapter 13 Estimation basics 13.1 ECDF 13.2 Properties of estimators", " Chapter 13 Estimation basics This chapter deals with estimation basics. The students are expected to acquire the following knowledge: Biased and unbiased estimators. Consistent estimators. Empirical cumulative distribution function. .fold-btn { float: right; margin: 5px 5px 0 0; } .fold { border: 1px solid black; min-height: 40px; } 13.1 ECDF Exercise 13.1 (ECDF intuition) Take any univariate continuous distribution that is readily available in R and plot its CDF (\\(F\\)). Draw one sample (\\(n = 1\\)) from the chosen distribution and draw the ECDF (\\(F_n\\)) of that one sample. Use the definition of the ECDF, not an existing function in R. Implementation hint: ECDFs are always piecewise constant - they only jump at the sampled values and by \\(1/n\\). Repeat (b) for \\(n = 5, 10, 100, 1000...\\) Theory says that \\(F_n\\) should converge to \\(F\\). Can you observe that? For \\(n = 100\\) repeat the process \\(m = 20\\) times and plot every \\(F_n^{(m)}\\). Theory says that \\(F_n\\) will converge to \\(F\\) the slowest where \\(F\\) is close to 0.5 (where the variance is largest). Can you observe that? library(ggplot2) set.seed(1) ggplot(data = data.frame(x = seq(-5, 5, by = 0.01))) + # stat_function(aes(x = x), fun = pbeta, args = list(shape1 = 1, shape2 = 2)) stat_function(aes(x = x), fun = pnorm, args = list(mean = 0, sd = 1)) one_samp <- rnorm(1) X <- data.frame(x = c(-5, one_samp, 5), y = c(0,1,1)) ggplot(data = data.frame(x = seq(-5, 5, by = 0.01))) + # stat_function(aes(x = x), fun = pbeta, args = list(shape1 = 1, shape2 = 2)) stat_function(aes(x = x), fun = pnorm, args = list(mean = 0, sd = 1)) + geom_step(data = X, aes(x = x, y = y)) N <- c(5, 10, 100, 1000) X <- NULL for (n in N) { tmp <- rnorm(n) tmp_X <- data.frame(x = c(-5, sort(tmp), 5), y = c(0, seq(1/n, 1, by = 1/n), 1), n = n) X <- rbind(X, tmp_X) } ggplot(data = data.frame(x = seq(-5, 5, by = 0.01))) + # stat_function(aes(x = x), fun = pbeta, args = list(shape1 = 1, shape2 = 2)) stat_function(aes(x = x), fun = pnorm, args = list(mean = 0, sd = 1)) + geom_step(data = X, aes(x = x, y = y, color = as.factor(n))) + labs(color = "N") 13.2 Properties of estimators Exercise 13.2 Show that the sample average is, as an estimator of the mean: unbiased, consistent, asymptotically normal. Solution. \\[\\begin{align*} E[\\frac{1}{n} \\sum_{i=1}^n X_i] &= \\frac{1}{n} \\sum_{i=i}^n E[X_i] \\\\ &= E[X]. 
\\end{align*}\\] \\[\\begin{align*} \\lim_{n \\rightarrow \\infty} P(|\\frac{1}{n} \\sum_{i=1}^n X_i - E[X]| > \\epsilon) &= \\lim_{n \\rightarrow \\infty} P((\\frac{1}{n} \\sum_{i=1}^n X_i - E[X])^2 > \\epsilon^2) \\\\ & \\leq \\lim_{n \\rightarrow \\infty} \\frac{E[(\\frac{1}{n} \\sum_{i=1}^n X_i - E[X])^2]}{\\epsilon^2} & \\text{Markov inequality} \\\\ & = \\lim_{n \\rightarrow \\infty} \\frac{E[(\\frac{1}{n} \\sum_{i=1}^n X_i)^2 - 2 \\frac{1}{n} \\sum_{i=1}^n X_i E[X] + E[X]^2]}{\\epsilon^2} \\\\ & = \\lim_{n \\rightarrow \\infty} \\frac{E[(\\frac{1}{n} \\sum_{i=1}^n X_i)^2] - 2 E[X]^2 + E[X]^2}{\\epsilon^2} \\\\ &= 0 \\end{align*}\\] For the last equality see the solution to ??. Follows directly from the CLT. Exercise 13.3 (Consistent but biased estimator) Show that sample variance (the plug-in estimator of variance) is a biased estimator of variance. Show that sample variance is a consistent estimator of variance. Show that the estimator with (\\(N-1\\)) (Bessel correction) is unbiased. Solution. \\[\\begin{align*} E[\\frac{1}{n} \\sum_{i=1}^n (Y_i - \\bar{Y})^2] &= \\frac{1}{n} \\sum_{i=1}^n E[(Y_i - \\bar{Y})^2] \\\\ &= \\frac{1}{n} \\sum_{i=1}^n E[Y_i^2] - 2 E[Y_i \\bar{Y}] + \\bar{Y}^2)] \\\\ &= \\frac{1}{n} \\sum_{i=1}^n E[Y_i^2 - 2 Y_i \\bar{Y} + \\bar{Y}^2] \\\\ &= \\frac{1}{n} \\sum_{i=1}^n E[Y_i^2 - \\frac{2}{n} Y_i^2 - \\frac{2}{n} \\sum_{i \\neq j} Y_i Y_j + \\frac{1}{n^2}\\sum_j \\sum_{k \\neq j} Y_j Y_k + \\frac{1}{n^2} \\sum_j Y_j^2] \\\\ &= \\frac{1}{n} \\sum_{i=1}^n \\frac{n - 2}{n} (\\sigma^2 + \\mu^2) - \\frac{2}{n} (n - 1) \\mu^2 + \\frac{1}{n^2}n(n-1)\\mu^2 + \\frac{1}{n^2}n(\\sigma^2 + \\mu^2) \\\\ &= \\frac{n-1}{n}\\sigma^2 \\\\ < \\sigma^2. \\end{align*}\\] Let \\(S_n\\) denote the sample variance. Then we can write it as \\[\\begin{align*} S_n &= \\frac{1}{n} \\sum_{i=1}^n (X_i - \\bar{X})^2 = \\frac{1}{n} \\sum_{i=1}^n (X_i - \\mu)^2 + 2(X_i - \\mu)(\\mu - \\bar{X}) + (\\mu - \\bar{X})^2. \\end{align*}\\] Now \\(\\bar{X}\\) converges in probability (by WLLN) to \\(\\mu\\) therefore the right terms converge in probability to zero. The left term converges in probability to \\(\\sigma^2\\), also by WLLN. Therefore the sample variance is a consistent estimatior of the variance. The denominator changes in the second-to-last line of a., therefore the last line is now equality. Exercise 13.4 (Estimating the median) Show that the sample median is an unbiased estimator of the median for N\\((\\mu, \\sigma^2)\\). Show that the sample median is an unbiased estimator of the mean for any distribution with symmetric density. Hint 1: The pdf of an order statistic is \\(f_{X_{(k)}}(x) = \\frac{n!}{(n - k)!(k - 1)!}f_X(x)\\Big(F_X(x)^{k-1} (1 - F_X(x)^{n - k}) \\Big)\\). Hint 2: A distribution is symmetric when \\(X\\) and \\(2a - X\\) have the same distribution for some \\(a\\). Solution. Let \\(Z_i\\), \\(i = 1,...,n\\) be i.i.d. variables with a symmetric distribution and let \\(Z_{k:n}\\) denote the \\(k\\)-th order statistic. We will distinguish two cases, when \\(n\\) is odd and when \\(n\\) is even. Let first \\(n = 2m + 1\\) be odd. Then the sample median is \\(M = Z_{m+1:2m+1}\\). Its PDF is \\[\\begin{align*} f_M(x) = (m+1)\\binom{2m + 1}{m}f_Z(x)\\Big(F_Z(x)^m (1 - F_Z(x)^m) \\Big). \\end{align*}\\] For every symmetric distribution, it holds that \\(F_X(x) = 1 - F(2a - x)\\). Let \\(a = \\mu\\), the population mean. Plugging this into the PDF, we get that \\(f_M(x) = f_M(2\\mu -x)\\). 
It follows that \\[\\begin{align*} E[M] &= E[2\\mu - M] \\\\ 2E[M] &= 2\\mu \\\\ E[M] &= \\mu. \\end{align*}\\] Now let \\(n = 2m\\) be even. Then the sample median is \\(M = \\frac{Z_{m:2m} + Z_{m+1:2m}}{2}\\). It can be shown, that the joint PDF of these terms is also symmetric. Therefore, similar to the above \\[\\begin{align*} E[M] &= E[\\frac{Z_{m:2m} + Z_{m+1:2m}}{2}] \\\\ &= E[\\frac{2\\mu - M + 2\\mu - M}{2}] \\\\ &= E[2\\mu - M]. \\end{align*}\\] The above also proves point a. as the median and the mean are the same in normal distribution. Exercise 13.5 (Matrix trace estimation) The Hutchinson trace estimator [1] is an estimator of the trace of a symmetric positive semidefinite matrix A that relies on Monte Carlo sampling. The estimator is defined as \\[\\begin{align*} \\textrm{tr}(A) \\approx \\frac{1}{n} \\Sigma_{i=1}^n z_i^T A z_i, &\\\\ z_i \\sim_{\\mathrm{IID}} \\textrm{Uniform}(\\{-1, 1\\}^m), & \\end{align*}\\] where \\(A \\in \\mathbb{R}^{m \\times m}\\) is a symmetric positive semidefinite matrix. Elements of each vector \\(z_i\\) are either \\(-1\\) or \\(1\\) with equal probability. This is also called a Rademacher distribution. Data scientists often want the trace of a Hessian to obtain valuable curvature information for a loss function. Per [2], an example is classifying ten digits based on \\((28,28)\\) grayscale images (i.e. MNIST data) using logistic regression. The number of parameters is \\(m = 28^2 \\cdot 10 = 7840\\) and the size of the Hessian is \\(m^2\\), roughly \\(6 \\cdot 10^6\\). The diagonal average is equal to the average eigenvalue, which may be useful for optimization; in MCMC contexts, this would be useful for preconditioners and step size optimization. Computing Hessians (as a means of getting eigenvalue information) is often intractable, but Hessian-vector products can be computed faster by autodifferentiation (with e.g. Tensorflow, Pytorch, Jax). This is one motivation for the use of a stochastic trace estimator as outlined above. References: A stochastic estimator of the trace of the influence matrix for laplacian smoothing splines (Hutchinson, 1990) A Modern Analysis of Hutchinson’s Trace Estimator (Skorski, 2020) Prove that the Hutchinson trace estimator is an unbiased estimator of the trace. Solution. We first simplify our task: \\[\\begin{align} \\mathbb{E}\\left[\\frac{1}{n} \\Sigma_{i=1}^n z_i^T A z_i \\right] &= \\frac{1}{n} \\Sigma_{i=1}^n \\mathbb{E}\\left[z_i^T A z_i \\right] \\\\ &= \\mathbb{E}\\left[z_i^T A z_i \\right], \\end{align}\\] where the second equality is due to having \\(n\\) IID vectors \\(z_i\\). We now only need to show that \\(\\mathbb{E}\\left[z^T A z \\right] = \\mathrm{tr}(A)\\). We omit the index due to all vectors being IID: \\[\\begin{align} \\mathrm{tr}(A) &= \\mathrm{tr}(AI) \\\\ &= \\mathrm{tr}(A\\mathbb{E}[zz^T]) \\\\ &= \\mathbb{E}[\\mathrm{tr}(Azz^T)] \\\\ &= \\mathbb{E}[\\mathrm{tr}(z^TAz)] \\\\ &= \\mathbb{E}[z^TAz]. \\end{align}\\] This concludes the proof. We clarify some equalities below. The second equality assumes that \\(\\mathbb{E}[zz^T] = I\\). By noting that the mean of the Rademacher distribution is 0, we have \\[\\begin{align} \\mathrm{Cov}[z, z] &= \\mathbb{E}[(z - \\mathbb{E}[z])(z - \\mathbb{E}[z])^T] \\\\ &= \\mathbb{E}[zz^T]. \\end{align}\\] Dimensions of \\(z\\) are independent, so \\(\\mathrm{Cov}[z, z]_{ij} = 0\\) for \\(i \\neq j\\). 
The diagonal will contain variances, which are equal to \\(1\\) for all dimensions \\(k = 1 \\dots m\\): \\(\\mathrm{Var}[z^{(k)}] = \\mathbb{E}[z^{(k)}z^{(k)}] - \\mathbb{E}[z^{(k)}]^2 = 1 - 0 = 1\\). It follows that the covariance is an identity matrix. Note that this is a general result for vectors with IID dimensions sampled from a distribution with mean 0 and variance 1. We could therefore use something else instead of the Rademacher, e.g. \\(z ~ N(0, I)\\). The third equality uses the fact that the expectation of a trace equals the trace of an expectation. If \\(X\\) is a random matrix, then \\(\\mathbb{E}[X]_{ij} = \\mathbb{E}[X_{ij}]\\). Therefore: \\[\\begin{align} \\mathrm{tr}(\\mathbb{E}[X]) &= \\Sigma_{i=1}^m(\\mathbb{E}[X]_{ii}) \\\\ &= \\Sigma_{i=1}^m(\\mathbb{E}[X_{ii}]) \\\\ &= \\mathbb{E}[\\Sigma_{i=1}^m(X_{ii})] \\\\ &= \\mathbb{E}[\\mathrm{tr}(X)], \\end{align}\\] where we used the linearity of the expectation in the third step. The fourth equality uses the fact that \\(\\mathrm{tr}(AB) = \\mathrm{tr}(BA)\\) for any matrices \\(A \\in \\mathbb{R}^{n \\times m}, B \\in \\mathbb{R}^{m \\times n}\\). The last inequality uses the fact that the trace of a \\(1 \\times 1\\) matrix is just its element. "],["boot.html", "Chapter 14 Bootstrap", " Chapter 14 Bootstrap This chapter deals with bootstrap. The students are expected to acquire the following knowledge: How to use bootstrap to generate coverage intervals. .fold-btn { float: right; margin: 5px 5px 0 0; } .fold { border: 1px solid black; min-height: 40px; } Exercise 14.1 Ideally, a \\(1-\\alpha\\) CI would have \\(1-\\alpha\\) coverage. That is, say a 95% CI should, in the long run, contain the true value of the parameter 95% of the time. In practice, it is impossible to assess the coverage of our CI method, because we rarely know the true parameter. In simulation, however, we can. Let’s assess the coverage of bootstrap percentile intervals. Pick a univariate distribution with readily available mean and one that you can easily sample from. Draw \\(n = 30\\) random samples from the chosen distribution and use the bootstrap (with large enough m) and percentile CI method to construct 95% CI. Repeat the process many times and count how many times the CI contains the true mean. That is, compute the actual coverage probability (don’t forget to include the standard error of the coverage probability!). What can you observe? Try one or two different distributions. What can you observe? Repeat (b) and (c) using BCa intervals (R package boot). How does the coverage compare to percentile intervals? As (d) but using intervals based on asymptotic normality (+/- 1.96 SE). How do results from (b), (d), and (e) change if we increase the sample size to n = 200? What about n = 5? library(boot) set.seed(0) nit <- 1000 # Repeat the process "many times" alpha <- 0.05 # CI parameter nboot <- 100 # m parameter for bootstrap ("large enough m") # f: change this to 200 or 5. nsample <- 30 # n = 30 random samples from the chosen distribution. Comment out BCa code if it breaks. 
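# (Added explanatory comments.) Each of the nit iterations below draws fresh data from
# three distributions (standard normal, Beta(1, 2), Poisson(5)), builds a 95% CI for the
# mean with three methods (bootstrap percentile, BCa via the boot package, and asymptotic
# normality), and records whether each interval contains the true mean (0, 1/3, and 5,
# respectively). The column means of the indicator matrices are the estimated coverage
# probabilities, and because each column holds 0/1 outcomes, apply(covers, 2, sd) / sqrt(nit)
# at the end is approximately the usual standard error sqrt(p_hat * (1 - p_hat) / nit).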
covers <- matrix(nrow = nit, ncol = 3) covers_BCa <- matrix(nrow = nit, ncol = 3) covers_asymp_norm <- matrix(nrow = nit, ncol = 3) isin <- function (x, lower, upper) { (x > lower) & (x < upper) } for (j in 1:nit) { # Repeating many times # a: pick a univariate distribution - standard normal x1 <- rnorm(nsample) # c: one or two different distributions - beta and poisson x2 <- rbeta(nsample, 1, 2) x3 <- rpois(nsample, 5) X1 <- matrix(data = NA, nrow = nsample, ncol = nboot) X2 <- matrix(data = NA, nrow = nsample, ncol = nboot) X3 <- matrix(data = NA, nrow = nsample, ncol = nboot) for (i in 1:nboot) { X1[ ,i] <- sample(x1, nsample, replace = T) X2[ ,i] <- sample(x2, nsample, T) X3[ ,i] <- sample(x3, nsample, T) } X1_func <- apply(X1, 2, mean) X2_func <- apply(X2, 2, mean) X3_func <- apply(X3, 2, mean) X1_quant <- quantile(X1_func, probs = c(alpha / 2, 1 - alpha / 2)) X2_quant <- quantile(X2_func, probs = c(alpha / 2, 1 - alpha / 2)) X3_quant <- quantile(X3_func, probs = c(alpha / 2, 1 - alpha / 2)) covers[j,1] <- (0 > X1_quant[1]) & (0 < X1_quant[2]) covers[j,2] <- ((1 / 3) > X2_quant[1]) & ((1 / 3) < X2_quant[2]) covers[j,3] <- (5 > X3_quant[1]) & (5 < X3_quant[2]) mf <- function (x, i) return(mean(x[i])) bootX1 <- boot(x1, statistic = mf, R = nboot) bootX2 <- boot(x2, statistic = mf, R = nboot) bootX3 <- boot(x3, statistic = mf, R = nboot) X1_quant_BCa <- boot.ci(bootX1, type = "bca")$bca X2_quant_BCa <- boot.ci(bootX2, type = "bca")$bca X3_quant_BCa <- boot.ci(bootX3, type = "bca")$bca covers_BCa[j,1] <- (0 > X1_quant_BCa[4]) & (0 < X1_quant_BCa[5]) covers_BCa[j,2] <- ((1 / 3) > X2_quant_BCa[4]) & ((1 / 3) < X2_quant_BCa[5]) covers_BCa[j,3] <- (5 > X3_quant_BCa[4]) & (5 < X3_quant_BCa[5]) # e: estimate mean and standard error # sample mean: x1_bar <- mean(x1) x2_bar <- mean(x2) x3_bar <- mean(x3) # standard error (of the sample mean) estimate: sample standard deviation / sqrt(n) x1_bar_SE <- sd(x1) / sqrt(nsample) x2_bar_SE <- sd(x2) / sqrt(nsample) x3_bar_SE <- sd(x3) / sqrt(nsample) covers_asymp_norm[j,1] <- isin(0, x1_bar - 1.96 * x1_bar_SE, x1_bar + 1.96 * x1_bar_SE) covers_asymp_norm[j,2] <- isin(1/3, x2_bar - 1.96 * x2_bar_SE, x2_bar + 1.96 * x2_bar_SE) covers_asymp_norm[j,3] <- isin(5, x3_bar - 1.96 * x3_bar_SE, x3_bar + 1.96 * x3_bar_SE) } apply(covers, 2, mean) ## [1] 0.918 0.925 0.905 apply(covers, 2, sd) / sqrt(nit) ## [1] 0.008680516 0.008333333 0.009276910 apply(covers_BCa, 2, mean) ## [1] 0.927 0.944 0.927 apply(covers_BCa, 2, sd) / sqrt(nit) ## [1] 0.008230355 0.007274401 0.008230355 apply(covers_asymp_norm, 2, mean) ## [1] 0.939 0.937 0.930 apply(covers_asymp_norm, 2, sd) / sqrt(nit) ## [1] 0.007572076 0.007687008 0.008072494 Exercise 14.2 You are given a sample of independent observations from a process of interest: Index 1 2 3 4 5 6 7 8 X 7 2 4 6 4 5 9 10 Compute the plug-in estimate of mean and 95% symmetric CI based on asymptotic normality. Use the plug-in estimate of SE. Same as (a), but use the unbiased estimate of SE. Apply nonparametric bootstrap with 1000 bootstrap replications and estimate the 95% CI for the mean with percentile-based CI. 
# a x <- c(7, 2, 4, 6, 4, 5, 9, 10) n <- length(x) mu <- mean(x) SE <- sqrt(mean((x - mu)^2)) / sqrt(n) SE ## [1] 0.8915839 z <- qnorm(1 - 0.05 / 2) c(mu - z * SE, mu + z * SE) ## [1] 4.127528 7.622472 # b SE <- sd(x) / sqrt(n) SE ## [1] 0.9531433 c(mu - z * SE, mu + z * SE) ## [1] 4.006873 7.743127 # c set.seed(0) m <- 1000 T_mean <- function(x) {mean(x)} est_boot <- array(NA, m) for (i in 1:m) { x_boot <- x[sample(1:n, n, rep = T)] est_boot[i] <- T_mean(x_boot) } quantile(est_boot, p = c(0.025, 0.975)) ## 2.5% 97.5% ## 4.250 7.625 Exercise 14.3 We are given a sample of 10 independent paired (bivariate) observations: Index 1 2 3 4 5 6 7 8 9 10 X 1.26 -0.33 1.33 1.27 0.41 -1.54 -0.93 -0.29 -0.01 2.40 Y 2.64 0.33 0.48 0.06 -0.88 -2.14 -2.21 0.95 0.83 1.45 Compute Pearson correlation between X and Y. Use the cor.test() from R to estimate a 95% CI for the estimate from (a). Apply nonparametric bootstrap with 1000 bootstrap replications and estimate the 95% CI for the Pearson correlation with percentile-based CI. Compare CI from (b) and (c). Are they similar? How would the bootstrap estimation of CI change if we were interested in Spearman or Kendall correlation instead? x <- c(1.26, -0.33, 1.33, 1.27, 0.41, -1.54, -0.93, -0.29, -0.01, 2.40) y <- c(2.64, 0.33, 0.48, 0.06, -0.88, -2.14, -2.21, 0.95, 0.83, 1.45) # a cor(x, y) ## [1] 0.6991247 # b res <- cor.test(x, y) res$conf.int[1:2] ## [1] 0.1241458 0.9226238 # c set.seed(0) m <- 1000 n <- length(x) T_cor <- function(x, y) {cor(x, y)} est_boot <- array(NA, m) for (i in 1:m) { idx <- sample(1:n, n, rep = T) # !!! important to use same indices to keep dependency between x and y est_boot[i] <- T_cor(x[idx], y[idx]) } quantile(est_boot, p = c(0.025, 0.975)) ## 2.5% 97.5% ## 0.2565537 0.9057664 # d # Yes, but the bootstrap CI is more narrow. # e # We just use the functions for Kendall/Spearman coefficients instead: T_kendall <- function(x, y) {cor(x, y, method = "kendall")} T_spearman <- function(x, y) {cor(x, y, method = "spearman")} # Put this in a function that returns the CI bootstrap_95_ci <- function(x, y, t, m = 1000) { n <- length(x) est_boot <- array(NA, m) for (i in 1:m) { idx <- sample(1:n, n, rep = T) # !!! important to use same indices to keep dependency between x and y est_boot[i] <- t(x[idx], y[idx]) } quantile(est_boot, p = c(0.025, 0.975)) } bootstrap_95_ci(x, y, T_kendall) ## 2.5% 97.5% ## -0.08108108 0.78378378 bootstrap_95_ci(x, y, T_spearman) ## 2.5% 97.5% ## -0.1701115 0.8867925 Exercise 14.4 In this problem we will illustrate the use of the nonparametric bootstrap for estimating CIs of regression model coefficients. Load the longley dataset from base R with data(longley). Use lm() to apply linear regression using “Employed” as the target (dependent) variable and all other variables as the predictors (independent). Using lm() results, print the estimated regression coefficients and standard errors. Estimate 95% CI for the coefficients using +/- 1.96 * SE. Use nonparametric bootstrap with 100 replications to estimate the SE of the coefficients from (b). Compare the SE from (c) with those from (b). # a data(longley) # b res <- lm(Employed ~ . 
, longley) tmp <- data.frame(summary(res)$coefficients[,1:2]) tmp$LB <- tmp[,1] - 1.96 * tmp[,2] tmp$UB <- tmp[,1] + 1.96 * tmp[,2] tmp ## Estimate Std..Error LB UB ## (Intercept) -3.482259e+03 8.904204e+02 -5.227483e+03 -1.737035e+03 ## GNP.deflator 1.506187e-02 8.491493e-02 -1.513714e-01 1.814951e-01 ## GNP -3.581918e-02 3.349101e-02 -1.014616e-01 2.982320e-02 ## Unemployed -2.020230e-02 4.883997e-03 -2.977493e-02 -1.062966e-02 ## Armed.Forces -1.033227e-02 2.142742e-03 -1.453204e-02 -6.132495e-03 ## Population -5.110411e-02 2.260732e-01 -4.942076e-01 3.919994e-01 ## Year 1.829151e+00 4.554785e-01 9.364136e-01 2.721889e+00 # c set.seed(0) m <- 100 n <- nrow(longley) T_coef <- function(x) { lm(Employed ~ . , x)$coefficients } est_boot <- array(NA, c(m, ncol(longley))) for (i in 1:m) { idx <- sample(1:n, n, rep = T) est_boot[i,] <- T_coef(longley[idx,]) } SE <- apply(est_boot, 2, sd) SE ## [1] 1.826011e+03 1.605981e-01 5.693746e-02 8.204892e-03 3.802225e-03 ## [6] 3.907527e-01 9.414436e-01 # Show the standard errors around coefficients library(ggplot2) library(reshape2) df <- data.frame(index = 1:7, bootstrap_SE = SE, lm_SE = tmp$Std..Error) melted_df <- melt(df[2:nrow(df), ], id.vars = "index") # Ignore bias which has a really large magnitude ggplot(melted_df, aes(x = index, y = value, fill = variable)) + geom_bar(stat="identity", position="dodge") + xlab("Coefficient") + ylab("Standard error") # + scale_y_continuous(trans = "log") # If you want to also plot bias Exercise 14.5 This exercise shows a shortcoming of the bootstrap method when using the plug in estimator for the maximum. Compute the 95% bootstrap CI for the maximum of a standard normal distribution. Compute the 95% bootstrap CI for the maximum of a binomial distribution with n = 15 and p = 0.2. Repeat (b) using p = 0.9. Why is the result different? # bootstrap CI for maximum alpha <- 0.05 T_max <- function(x) {max(x)} # Equal to T_max = max bootstrap <- function(x, t, m = 1000) { n <- length(x) values <- rep(0, m) for (i in 1:m) { values[i] <- t(sample(x, n, replace = T)) } quantile(values, probs = c(alpha / 2, 1 - alpha / 2)) } # a # Meaningless, as the normal distribution can yield arbitrarily large values. x <- rnorm(100) bootstrap(x, T_max) ## 2.5% 97.5% ## 1.819425 2.961743 # b x <- rbinom(100, size = 15, prob = 0.2) # min = 0, max = 15 bootstrap(x, T_max) ## 2.5% 97.5% ## 6 7 # c x <- rbinom(100, size = 15, prob = 0.9) # min = 0, max = 15 bootstrap(x, T_max) ## 2.5% 97.5% ## 15 15 # Observation: to estimate the maximum, we need sufficient probability mass near the maximum value the distribution can yield. # Using bootstrap is pointless when there is too little mass near the true maximum. # In general, bootstrap will fail when estimating the CI for the maximum. "],["ml.html", "Chapter 15 Maximum likelihood 15.1 Deriving MLE 15.2 Fisher information 15.3 The German tank problem", " Chapter 15 Maximum likelihood This chapter deals with maximum likelihood estimation. The students are expected to acquire the following knowledge: How to derive MLE. Applying MLE in R. Calculating and interpreting Fisher information. Practical use of MLE. .fold-btn { float: right; margin: 5px 5px 0 0; } .fold { border: 1px solid black; min-height: 40px; } 15.1 Deriving MLE Exercise 15.1 Derive the maximum likelihood estimator of variance for N\\((\\mu, \\sigma^2)\\). Compare with results from 13.3. What does that say about the MLE estimator? Solution. 
The mean is assumed constant, so we have the likelihood \\[\\begin{align} L(\\sigma^2; y) &= \\prod_{i=1}^n \\frac{1}{\\sqrt{2 \\pi \\sigma^2}} e^{-\\frac{(y_i - \\mu)^2}{2 \\sigma^2}} \\\\ &= \\frac{1}{\\sqrt{2 \\pi \\sigma^2}^n} e^{\\frac{-\\sum_{i=1}^n (y_i - \\mu)^2}{2 \\sigma^2}} \\end{align}\\] We need to find the maximum of this function. We first observe that we can replace \\(\\frac{-\\sum_{i=1}^n (y_i - \\mu)^2}{2}\\) with a constant \\(c\\), since none of the terms are dependent on \\(\\sigma^2\\). Additionally, the term \\(\\frac{1}{\\sqrt{2 \\pi}^n}\\) does not affect the calculation of the maximum. So now we have \\[\\begin{align} L(\\sigma^2; y) &= (\\sigma^2)^{-\\frac{n}{2}} e^{\\frac{c}{\\sigma^2}}. \\end{align}\\] Differentiating we get \\[\\begin{align} \\frac{d}{d \\sigma^2} L(\\sigma^2; y) &= (\\sigma^2)^{-\\frac{n}{2}} \\frac{d}{d \\sigma^2} e^{\\frac{c}{\\sigma^2}} + e^{\\frac{c}{\\sigma^2}} \\frac{d}{d \\sigma^2} (\\sigma^2)^{-\\frac{n}{2}} \\\\ &= - (\\sigma^2)^{-\\frac{n}{2}} e^{\\frac{c}{\\sigma^2}} \\frac{c}{(\\sigma^2)^2} - e^{\\frac{c}{\\sigma^2}} \\frac{n}{2} (\\sigma^2)^{-\\frac{n + 2}{2}} \\\\ &= - (\\sigma^2)^{-\\frac{n + 4}{2}} e^{\\frac{c}{\\sigma^2}} c - e^{\\frac{c}{\\sigma^2}} \\frac{n}{2} (\\sigma^2)^{-\\frac{n + 2}{2}} \\\\ &= - e^{\\frac{c}{\\sigma^2}} (\\sigma^2)^{-\\frac{n + 4}{2}} \\Big(c + \\frac{n}{2}\\sigma^2 \\Big). \\end{align}\\] To get the maximum, this has to equal to 0, so \\[\\begin{align} c + \\frac{n}{2}\\sigma^2 &= 0 \\\\ \\sigma^2 &= -\\frac{2c}{n} \\\\ \\sigma^2 &= \\frac{\\sum_{i=1}^n (Y_i - \\mu)^2}{n}. \\end{align}\\] The MLE estimator is biased. Exercise 15.2 (Multivariate normal distribution) Derive the maximum likelihood estimate for the mean and covariance matrix of the multivariate normal. Simulate \\(n = 40\\) samples from a bivariate normal distribution (choose non-trivial parameters, that is, mean \\(\\neq 0\\) and covariance \\(\\neq 0\\)). Compute the MLE for the sample. Overlay the data with an ellipse that is determined by the MLE and an ellipse that is determined by the chosen true parameters. Repeat b. several times and observe how the estimates (ellipses) vary around the true value. Hint: For the derivation of MLE, these identities will be helpful: \\(\\frac{\\partial b^T a}{\\partial a} = \\frac{\\partial a^T b}{\\partial a} = b\\), \\(\\frac{\\partial a^T A a}{\\partial a} = (A + A^T)a\\), \\(\\frac{\\partial \\text{tr}(BA)}{\\partial A} = B^T\\), \\(\\frac{\\partial \\ln |A|}{\\partial A} = (A^{-1})^T\\), \\(a^T A a = \\text{tr}(a^T A a) = \\text{tr}(a a^T A) = \\text{tr}(Aaa^T)\\). Solution. The log likelihood of the MVN distribution is \\[\\begin{align*} l(\\mu, \\Sigma ; x) &= -\\frac{1}{2}\\Big(\\sum_{i=1}^n k\\ln(2\\pi) + |\\Sigma| + (x_i - \\mu)^T \\Sigma^{-1} (x_i - \\mu)\\Big) \\\\ &= -\\frac{n}{2}\\ln|\\Sigma| + -\\frac{1}{2}\\Big(\\sum_{i=1}^n(x_i - \\mu)^T \\Sigma^{-1} (x_i - \\mu)\\Big) + c, \\end{align*}\\] where \\(c\\) is a constant with respect to \\(\\mu\\) and \\(\\Sigma\\). To find the MLE we first need to find partial derivatives. Let us start with \\(\\mu\\). \\[\\begin{align*} \\frac{\\partial}{\\partial \\mu}l(\\mu, \\Sigma ; x) &= \\frac{\\partial}{\\partial \\mu} -\\frac{1}{2}\\Big(\\sum_{i=1}^n x_i^T \\Sigma^{-1} x_i - x_i^T \\Sigma^{-1} \\mu - \\mu^T \\Sigma^{-1} x_i + \\mu^T \\Sigma^{-1} \\mu \\Big) \\\\ &= -\\frac{1}{2}\\Big(\\sum_{i=1}^n - \\Sigma^{-1} x_i - \\Sigma^{-1} x_i + 2 \\Sigma^{-1} \\mu \\Big) \\\\ &= -\\Sigma^{-1}\\Big(\\sum_{i=1}^n - x_i + \\mu \\Big). 
\\end{align*}\\] Equating above with zero, we get \\[\\begin{align*} \\sum_{i=1}^n - x_i + \\mu &= 0 \\\\ \\hat{\\mu} = \\frac{1}{n} \\sum_{i=1}^n x_i, \\end{align*}\\] which is the dimension-wise empirical mean. Now for the covariance matrix \\[\\begin{align*} \\frac{\\partial}{\\partial \\Sigma^{-1}}l(\\mu, \\Sigma ; x) &= \\frac{\\partial}{\\partial \\Sigma^{-1}} -\\frac{n}{2}\\ln|\\Sigma| + -\\frac{1}{2}\\Big(\\sum_{i=1}^n(x_i - \\mu)^T \\Sigma^{-1} (x_i - \\mu)\\Big) \\\\ &= \\frac{\\partial}{\\partial \\Sigma^{-1}} -\\frac{n}{2}\\ln|\\Sigma| + -\\frac{1}{2}\\Big(\\sum_{i=1}^n \\text{tr}((x_i - \\mu)^T \\Sigma^{-1} (x_i - \\mu))\\Big) \\\\ &= \\frac{\\partial}{\\partial \\Sigma^{-1}} -\\frac{n}{2}\\ln|\\Sigma| + -\\frac{1}{2}\\Big(\\sum_{i=1}^n \\text{tr}((\\Sigma^{-1} (x_i - \\mu) (x_i - \\mu)^T )\\Big) \\\\ &= \\frac{n}{2}\\Sigma + -\\frac{1}{2}\\Big(\\sum_{i=1}^n (x_i - \\mu) (x_i - \\mu)^T \\Big). \\end{align*}\\] Equating above with zero, we get \\[\\begin{align*} \\hat{\\Sigma} = \\frac{1}{n}\\sum_{i=1}^n (x_i - \\mu) (x_i - \\mu)^T. \\end{align*}\\] set.seed(1) n <- 40 mu <- c(1, -2) Sigma <- matrix(data = c(2, -1.6, -1.6, 1.8), ncol = 2) X <- mvrnorm(n = n, mu = mu, Sigma = Sigma) colnames(X) <- c("X1", "X2") X <- as.data.frame(X) # plot.new() tru_ellip <- ellipse(mu, Sigma, draw = FALSE) colnames(tru_ellip) <- c("X1", "X2") tru_ellip <- as.data.frame(tru_ellip) mu_est <- apply(X, 2, mean) tmp <- as.matrix(sweep(X, 2, mu_est)) Sigma_est <- (1 / n) * t(tmp) %*% tmp est_ellip <- ellipse(mu_est, Sigma_est, draw = FALSE) colnames(est_ellip) <- c("X1", "X2") est_ellip <- as.data.frame(est_ellip) ggplot(data = X, aes(x = X1, y = X2)) + geom_point() + geom_path(data = tru_ellip, aes(x = X1, y = X2, color = "truth")) + geom_path(data = est_ellip, aes(x = X1, y = X2, color = "estimated")) + labs(color = "type") Exercise 15.3 (Logistic regression) Logistic regression is a popular discriminative model when our target variable is binary (categorical with 2 values). One of the ways of looking at logistic regression is that it is linear regression but instead of using the linear term as the mean of a normal RV, we use it as the mean of a Bernoulli RV. Of course, the mean of a Bernoulli is bounded on \\([0,1]\\), so, to avoid non-sensical values, we squeeze the linear between 0 and 1 with the inverse logit function inv_logit\\((z) = 1 / (1 + e^{-z})\\). This leads to the following model: \\(y_i | \\beta, x_i \\sim \\text{Bernoulli}(\\text{inv_logit}(\\beta x_i))\\). Explicitly write the likelihood function of beta. Implement the likelihood function in R. Use black-box box-constraint optimization (for example, optim() with L-BFGS) to find the maximum likelihood estimate for beta for \\(x\\) and \\(y\\) defined below. Plot the estimated probability as a function of the independent variable. Compare with the truth. Let \\(y2\\) be a response defined below. Will logistic regression work well on this dataset? Why not? How can we still use the model, without changing it? inv_log <- function (z) { return (1 / (1 + exp(-z))) } set.seed(1) x <- rnorm(100) y <- rbinom(100, size = 1, prob = inv_log(1.2 * x)) y2 <- rbinom(100, size = 1, prob = inv_log(1.2 * x + 1.4 * x^2)) Solution. \\[\\begin{align*} l(\\beta; x, y) &= p(y | x, \\beta) \\\\ &= \\ln(\\prod_{i=1}^n \\text{inv_logit}(\\beta x_i)^{y_i} (1 - \\text{inv_logit}(\\beta x_i))^{1 - y_i}) \\\\ &= \\sum_{i=1}^n y_i \\ln(\\text{inv_logit}(\\beta x_i)) + (1 - y_i) \\ln(1 - \\text{inv_logit}(\\beta x_i)). 
\\end{align*}\\] set.seed(1) inv_log <- function (z) { return (1 / (1 + exp(-z))) } x <- rnorm(100) y <- x y <- rbinom(100, size = 1, prob = inv_log(1.2 * x)) l_logistic <- function (beta, X, y) { logl <- -sum(y * log(inv_log(as.vector(beta %*% X))) + (1 - y) * log((1 - inv_log(as.vector(beta %*% X))))) return(logl) } my_optim <- optim(par = 0.5, fn = l_logistic, method = "L-BFGS-B", lower = 0, upper = 10, X = x, y = y) my_optim$par ## [1] 1.166558 truth_p <- data.frame(x = x, prob = inv_log(1.2 * x), type = "truth") est_p <- data.frame(x = x, prob = inv_log(my_optim$par * x), type = "estimated") plot_df <- rbind(truth_p, est_p) ggplot(data = plot_df, aes(x = x, y = prob, color = type)) + geom_point(alpha = 0.3) y2 <- rbinom(2000, size = 1, prob = inv_log(1.2 * x + 1.4 * x^2)) X2 <- cbind(x, x^2) my_optim2 <- optim(par = c(0, 0), fn = l_logistic, method = "L-BFGS-B", lower = c(0, 0), upper = c(2, 2), X = t(X2), y = y2) my_optim2$par ## [1] 1.153656 1.257649 tmp <- sweep(data.frame(x = x, x2 = x^2), 2, my_optim2$par, FUN = "*") tmp <- tmp[ ,1] + tmp[ ,2] truth_p <- data.frame(x = x, prob = inv_log(1.2 * x + 1.4 * x^2), type = "truth") est_p <- data.frame(x = x, prob = inv_log(tmp), type = "estimated") plot_df <- rbind(truth_p, est_p) ggplot(data = plot_df, aes(x = x, y = prob, color = type)) + geom_point(alpha = 0.3) Exercise 15.4 (Linear regression) For the data generated below, do the following: Compute the least squares (MLE) estimate of coefficients beta using the matrix exact solution. Compute the MLE by minimizing the sum of squared residuals using black-box optimization (optim()). Compute the MLE by using the output built-in linear regression (lm() ). Compare (a-c and the true coefficients). Compute 95% CI on the beta coefficients using the output of built-in linear regression. Compute 95% CI on the beta coefficients by using (a or b) and the bootstrap with percentile method for CI. Compare with d. 
set.seed(1) n <- 100 x1 <- rnorm(n) x2 <- rnorm(n) x3 <- rnorm(n) X <- cbind(x1, x2, x3) beta <- c(0.2, 0.6, -1.2) y <- as.vector(t(beta %*% t(X))) + rnorm(n, sd = 0.2) set.seed(1) n <- 100 x1 <- rnorm(n) x2 <- rnorm(n) x3 <- rnorm(n) X <- cbind(x1, x2, x3) beta <- c(0.2, 0.6, -1.2) y <- as.vector(t(beta %*% t(X))) + rnorm(n, sd = 0.2) LS_fun <- function (beta, X, y) { return(sum((y - beta %*% t(X))^2)) } my_optim <- optim(par = c(0, 0, 0), fn = LS_fun, lower = -5, upper = 5, X = X, y = y, method = "L-BFGS-B") my_optim$par ## [1] 0.1898162 0.5885946 -1.1788264 df <- data.frame(y = y, x1 = x1, x2 = x2, x3 = x3) my_lm <- lm(y ~ x1 + x2 + x3 - 1, data = df) my_lm ## ## Call: ## lm(formula = y ~ x1 + x2 + x3 - 1, data = df) ## ## Coefficients: ## x1 x2 x3 ## 0.1898 0.5886 -1.1788 # matrix solution beta_hat <- solve(t(X) %*% X) %*% t(X) %*% y beta_hat ## [,1] ## x1 0.1898162 ## x2 0.5885946 ## x3 -1.1788264 out <- summary(my_lm) out$coefficients[ ,2] ## x1 x2 x3 ## 0.02209328 0.02087542 0.01934506 # bootstrap CI nboot <- 1000 beta_boot <- matrix(data = NA, ncol = length(beta), nrow = nboot) for (i in 1:nboot) { inds <- sample(1:n, n, replace = T) new_df <- df[inds, ] X_tmp <- as.matrix(new_df[ ,-1]) y_tmp <- new_df[ ,1] # print(nrow(new_df)) tmp_beta <- solve(t(X_tmp) %*% X_tmp) %*% t(X_tmp) %*% y_tmp beta_boot[i, ] <- tmp_beta } apply(beta_boot, 2, mean) ## [1] 0.1893281 0.5887068 -1.1800738 apply(beta_boot, 2, quantile, probs = c(0.025, 0.975)) ## [,1] [,2] [,3] ## 2.5% 0.1389441 0.5436911 -1.221560 ## 97.5% 0.2386295 0.6363102 -1.140416 out$coefficients[ ,2] ## x1 x2 x3 ## 0.02209328 0.02087542 0.01934506 Exercise 15.5 (Principal component analysis) Load the olympic data set from package ade4. The data show decathlon results for 33 men in 1988 Olympic Games. This data set serves as a great example of finding the latent structure in the data, as there are certain characteristics of the athletes that make them excel at different events. For example an explosive athlete will do particulary well in sprints and long jumps. Perform PCA (prcomp) on the data set and interpret the first 2 latent dimensions. Hint: Standardize the data first to get meaningful results. Use MLE to estimate the covariance of the standardized multivariate distribution. Decompose the estimated covariance matrix with the eigendecomposition. Compare the eigenvectors to the output of PCA. 
data(olympic) X <- olympic$tab X_scaled <- scale(X) my_pca <- prcomp(X_scaled) summary(my_pca) ## Importance of components: ## PC1 PC2 PC3 PC4 PC5 PC6 PC7 ## Standard deviation 1.8488 1.6144 0.97123 0.9370 0.74607 0.70088 0.65620 ## Proportion of Variance 0.3418 0.2606 0.09433 0.0878 0.05566 0.04912 0.04306 ## Cumulative Proportion 0.3418 0.6025 0.69679 0.7846 0.84026 0.88938 0.93244 ## PC8 PC9 PC10 ## Standard deviation 0.55389 0.51667 0.31915 ## Proportion of Variance 0.03068 0.02669 0.01019 ## Cumulative Proportion 0.96312 0.98981 1.00000 autoplot(my_pca, data = X, loadings = TRUE, loadings.colour = 'blue', loadings.label = TRUE, loadings.label.size = 3) Sigma_est <- (1 / nrow(X_scaled)) * t(X_scaled) %*% X_scaled Sigma_dec <- eigen(Sigma_est) Sigma_dec$vectors ## [,1] [,2] [,3] [,4] [,5] [,6] ## [1,] 0.4158823 0.1488081 -0.26747198 -0.08833244 -0.442314456 0.03071237 ## [2,] -0.3940515 -0.1520815 -0.16894945 -0.24424963 0.368913901 -0.09378242 ## [3,] -0.2691057 0.4835374 0.09853273 -0.10776276 -0.009754680 0.23002054 ## [4,] -0.2122818 0.0278985 -0.85498656 0.38794393 -0.001876311 0.07454380 ## [5,] 0.3558474 0.3521598 -0.18949642 0.08057457 0.146965351 -0.32692886 ## [6,] 0.4334816 0.0695682 -0.12616012 -0.38229029 -0.088802794 0.21049130 ## [7,] -0.1757923 0.5033347 0.04609969 0.02558404 0.019358607 0.61491241 ## [8,] -0.3840821 0.1495820 0.13687235 0.14396548 -0.716743474 -0.34776037 ## [9,] -0.1799436 0.3719570 -0.19232803 -0.60046566 0.095582043 -0.43744387 ## [10,] 0.1701426 0.4209653 0.22255233 0.48564231 0.339772188 -0.30032419 ## [,7] [,8] [,9] [,10] ## [1,] 0.2543985 0.663712826 -0.10839531 0.10948045 ## [2,] 0.7505343 0.141264141 0.04613910 0.05580431 ## [3,] -0.1106637 0.072505560 0.42247611 0.65073655 ## [4,] -0.1351242 -0.155435871 -0.10206505 0.11941181 ## [5,] 0.1413388 -0.146839303 0.65076229 -0.33681395 ## [6,] 0.2725296 -0.639003579 -0.20723854 0.25971800 ## [7,] 0.1439726 0.009400445 -0.16724055 -0.53450315 ## [8,] 0.2732665 -0.276873049 -0.01766443 -0.06589572 ## [9,] -0.3419099 0.058519366 -0.30619617 -0.13093187 ## [10,] 0.1868704 0.007310045 -0.45688227 0.24311846 my_pca$rotation ## PC1 PC2 PC3 PC4 PC5 PC6 ## 100 -0.4158823 0.1488081 0.26747198 -0.08833244 -0.442314456 0.03071237 ## long 0.3940515 -0.1520815 0.16894945 -0.24424963 0.368913901 -0.09378242 ## poid 0.2691057 0.4835374 -0.09853273 -0.10776276 -0.009754680 0.23002054 ## haut 0.2122818 0.0278985 0.85498656 0.38794393 -0.001876311 0.07454380 ## 400 -0.3558474 0.3521598 0.18949642 0.08057457 0.146965351 -0.32692886 ## 110 -0.4334816 0.0695682 0.12616012 -0.38229029 -0.088802794 0.21049130 ## disq 0.1757923 0.5033347 -0.04609969 0.02558404 0.019358607 0.61491241 ## perc 0.3840821 0.1495820 -0.13687235 0.14396548 -0.716743474 -0.34776037 ## jave 0.1799436 0.3719570 0.19232803 -0.60046566 0.095582043 -0.43744387 ## 1500 -0.1701426 0.4209653 -0.22255233 0.48564231 0.339772188 -0.30032419 ## PC7 PC8 PC9 PC10 ## 100 0.2543985 -0.663712826 0.10839531 -0.10948045 ## long 0.7505343 -0.141264141 -0.04613910 -0.05580431 ## poid -0.1106637 -0.072505560 -0.42247611 -0.65073655 ## haut -0.1351242 0.155435871 0.10206505 -0.11941181 ## 400 0.1413388 0.146839303 -0.65076229 0.33681395 ## 110 0.2725296 0.639003579 0.20723854 -0.25971800 ## disq 0.1439726 -0.009400445 0.16724055 0.53450315 ## perc 0.2732665 0.276873049 0.01766443 0.06589572 ## jave -0.3419099 -0.058519366 0.30619617 0.13093187 ## 1500 0.1868704 -0.007310045 0.45688227 -0.24311846 15.2 Fisher information Exercise 15.6 Let us assume a Poisson likelihood. 
Derive the MLE estimate of the mean. Derive the Fisher information. For the data below compute the MLE and construct confidence intervals. Use bootstrap to construct the CI for the mean. Compare with c) and discuss. x <- c(2, 5, 3, 1, 2, 1, 0, 3, 0, 2) Solution. The log likelihood of the Poisson is \\[\\begin{align*} l(\\lambda; x) = \\sum_{i=1}^n x_i \\ln \\lambda - n \\lambda - \\sum_{i=1}^n \\ln x_i! \\end{align*}\\] Taking the derivative and equating with 0 we get \\[\\begin{align*} \\frac{1}{\\hat{\\lambda}}\\sum_{i=1}^n x_i - n &= 0 \\\\ \\hat{\\lambda} &= \\frac{1}{n} \\sum_{i=1}^n x_i. \\end{align*}\\] Since \\(\\lambda\\) is the mean parameter, this was expected. For the Fischer information, we first need the second derivative, which is \\[\\begin{align*} - \\lambda^{-2} \\sum_{i=1}^n x_i. \\\\ \\end{align*}\\] Now taking the expectation of the negative of the above, we get \\[\\begin{align*} E[\\lambda^{-2} \\sum_{i=1}^n x_i] &= \\lambda^{-2} E[\\sum_{i=1}^n x_i] \\\\ &= \\lambda^{-2} n \\lambda \\\\ &= \\frac{n}{\\lambda}. \\end{align*}\\] set.seed(1) x <- c(2, 5, 3, 1, 2, 1, 0, 3, 0, 2) lambda_hat <- mean(x) finfo <- length(x) / lambda_hat mle_CI <- c(lambda_hat - 1.96 * sqrt(1 / finfo), lambda_hat + 1.96 * sqrt(1 / finfo)) boot_lambda <- c() nboot <- 1000 for (i in 1:nboot) { tmp_x <- sample(x, length(x), replace = T) boot_lambda[i] <- mean(tmp_x) } boot_CI <- c(quantile(boot_lambda, 0.025), quantile(boot_lambda, 0.975)) mle_CI ## [1] 1.045656 2.754344 boot_CI ## 2.5% 97.5% ## 1.0 2.7 Exercise 15.7 Find the Fisher information matrix for the Gamma distribution. Generate 20 samples from a Gamma distribution and plot a confidence ellipse of the inverse of Fisher information matrix around the ML estimates of the parameters. Also plot the theoretical values. Repeat the sampling several times. What do you observe? Discuss what a non-diagonal Fisher matrix implies. Hint: The digamma function is defined as \\(\\psi(x) = \\frac{\\frac{d}{dx} \\Gamma(x)}{\\Gamma(x)}\\). Additionally, you do not need to evaluate \\(\\frac{d}{dx} \\psi(x)\\). To calculate its value in R, use package numDeriv. Solution. The log likelihood of the Gamma is \\[\\begin{equation*} l(\\alpha, \\beta; x) = n \\alpha \\ln \\beta - n \\ln \\Gamma(\\alpha) + (\\alpha - 1) \\sum_{i=1}^n \\ln x_i - \\beta \\sum_{i=1}^n x_i. \\end{equation*}\\] Let us calculate the derivatives. \\[\\begin{align*} \\frac{\\partial}{\\partial \\alpha} l(\\alpha, \\beta; x) &= n \\ln \\beta - n \\psi(\\alpha) + \\sum_{i=1}^n \\ln x_i, \\\\ \\frac{\\partial}{\\partial \\beta} l(\\alpha, \\beta; x) &= \\frac{n \\alpha}{\\beta} - \\sum_{i=1}^n x_i, \\\\ \\frac{\\partial^2}{\\partial \\alpha \\beta} l(\\alpha, \\beta; x) &= \\frac{n}{\\beta}, \\\\ \\frac{\\partial^2}{\\partial \\alpha^2} l(\\alpha, \\beta; x) &= - n \\frac{\\partial}{\\partial \\alpha} \\psi(\\alpha), \\\\ \\frac{\\partial^2}{\\partial \\beta^2} l(\\alpha, \\beta; x) &= - \\frac{n \\alpha}{\\beta^2}. \\end{align*}\\] The Fisher information matrix is then \\[\\begin{align*} I(\\alpha, \\beta) = - E[ \\begin{bmatrix} - n \\psi'(\\alpha) & \\frac{n}{\\beta} \\\\ \\frac{n}{\\beta} & - \\frac{n \\alpha}{\\beta^2} \\end{bmatrix} ] = \\begin{bmatrix} n \\psi'(\\alpha) & - \\frac{n}{\\beta} \\\\ - \\frac{n}{\\beta} & \\frac{n \\alpha}{\\beta^2} \\end{bmatrix} \\end{align*}\\] A non-diagonal Fisher matrix implies that the parameter estimates are linearly dependent. 
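Before the solution code below, here is an optional numerical check of the derivation (an addition, assuming the numDeriv package from the hint is available): the observed information, i.e. the Hessian of the negative log-likelihood, averaged over many simulated samples at the true parameters, should approximate \\(I(\\alpha, \\beta)\\). Note that \\(\\psi'(\\alpha)\\) is available in base R as trigamma().
library(numDeriv)
set.seed(1)
n <- 20
alpha <- 5
beta <- 2
# negative log-likelihood of a Gamma(alpha, beta) sample (rate parametrization)
nll <- function(pars, dat) -sum(dgamma(dat, shape = pars[1], rate = pars[2], log = TRUE))
H <- matrix(0, nrow = 2, ncol = 2)
nrep <- 500
for (i in 1:nrep) {
  samp <- rgamma(n, alpha, beta)
  H <- H + hessian(nll, c(alpha, beta), dat = samp) / nrep # average observed information
}
H
# theoretical Fisher information from the derivation above
matrix(c(n * trigamma(alpha), -n / beta,
         -n / beta,           n * alpha / beta^2), nrow = 2, byrow = TRUE)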
set.seed(1) n <- 20 pars_theor <- c(5, 2) x <- rgamma(n, 5, 2) # MLE for alpha and beta log_lik <- function (pars, x) { n <- length(x) return (- (n * pars[1] * log(pars[2]) - n * log(gamma(pars[1])) + (pars[1] - 1) * sum(log(x)) - pars[2] * sum(x))) } my_optim <- optim(par = c(1,1), fn = log_lik, method = "L-BFGS-B", lower = c(0.001, 0.001), upper = c(8, 8), x = x) pars_mle <- my_optim$par fish_mat <- matrix(data = NA, nrow = 2, ncol = 2) fish_mat[1,2] <- - n / pars_mle[2] fish_mat[2,1] <- - n / pars_mle[2] fish_mat[2,2] <- (n * pars_mle[1]) / (pars_mle[2]^2) fish_mat[1,1] <- n * grad(digamma, pars_mle[1]) fish_mat_inv <- solve(fish_mat) est_ellip <- ellipse(pars_mle, fish_mat_inv, draw = FALSE) colnames(est_ellip) <- c("X1", "X2") est_ellip <- as.data.frame(est_ellip) ggplot() + geom_point(data = data.frame(x = pars_mle[1], y = pars_mle[2]), aes(x = x, y = y)) + geom_path(data = est_ellip, aes(x = X1, y = X2)) + geom_point(aes(x = pars_theor[1], y = pars_theor[2]), color = "red") + geom_text(aes(x = pars_theor[1], y = pars_theor[2], label = "Theoretical parameters"), color = "red", nudge_y = -0.2) 15.3 The German tank problem Exercise 15.8 (The German tank problem) During WWII the allied intelligence were faced with an important problem of estimating the total production of certain German tanks, such as the Panther. What turned out to be a successful approach was to estimate the maximum from the serial numbers of the small sample of captured or destroyed tanks (describe the statistical model used). What assumptions were made by using the above model? Do you think they are reasonable assumptions in practice? Show that the plug-in estimate for the maximum (i.e. the maximum of the sample) is a biased estimator. Derive the maximum likelihood estimate of the maximum. Check that the following estimator is not biased: \\(\\hat{n} = \\frac{k + 1}{k}m - 1\\). Solution. The data are the serial numbers of the tanks. The parameter is \\(n\\), the total production of the tank. The distribution of the serial numbers is a discrete uniform distribution over all serial numbers. One of the assumptions is that we have i.i.d samples, however in practice this might not be true, as some tanks produced later could be sent to the field later, therefore already in theory we would not be able to recover some values from the population. To find the expected value we first need to find the distribution of \\(m\\). Let us start with the CDF. \\[\\begin{align*} F_m(x) = P(Y_1 < x,...,Y_k < x). \\end{align*}\\] If \\(x < k\\) then \\(F_m(x) = 0\\) and if \\(x \\geq 1\\) then \\(F_m(x) = 1\\). What about between those values. So the probability that the maximum value is less than or equal to \\(m\\) is just the number of possible draws from \\(Y\\) that are all smaller than \\(m\\), divided by all possible draws. This is \\(\\frac{{x}\\choose{k}}{{n}\\choose{k}}\\). The PDF on the suitable bounds is then \\[\\begin{align*} P(m = x) = F_m(x) - F_m(x - 1) = \\frac{\\binom{x}{k} - \\binom{x - 1}{k}}{\\binom{n}{k}} = \\frac{\\binom{x - 1}{k - 1}}{\\binom{n}{k}}. \\end{align*}\\] Now we can calculate the expected value of \\(m\\) using some combinatorial identities. \\[\\begin{align*} E[m] &= \\sum_{i = k}^n i \\frac{{i - 1}\\choose{k - 1}}{{n}\\choose{k}} \\\\ &= \\sum_{i = k}^n i \\frac{\\frac{(i - 1)!}{(k - 1)!(i - k)!}}{{n}\\choose{k}} \\\\ &= \\frac{k}{\\binom{n}{k}}\\sum_{i = k}^n \\binom{i}{k} \\\\ &= \\frac{k}{\\binom{n}{k}} \\binom{n + 1}{k + 1} \\\\ &= \\frac{k(n + 1)}{k + 1}. 
\\end{align*}\\] The bias of this estimator is then \\[\\begin{align*} E[m] - n = \\frac{k(n + 1)}{k + 1} - n = \\frac{k - n}{k + 1}. \\end{align*}\\] The probability that we observed our sample \\(Y = {Y_1, Y_2,...,,Y_k}\\) given \\(n\\) is \\(\\frac{1}{{n}\\choose{k}}\\). We need to find such \\(n^*\\) that this function is maximized. Additionally, we have a constraint that \\(n^* \\geq m = \\max{(Y)}\\). Let us plot this function for \\(m = 10\\) and \\(k = 4\\). library(ggplot2) my_fun <- function (x, m, k) { tmp <- 1 / (choose(x, k)) tmp[x < m] <- 0 return (tmp) } x <- 1:20 y <- my_fun(x, 10, 4) df <- data.frame(x = x, y = y) ggplot(data = df, aes(x = x, y = y)) + geom_line() ::: {.solution} (continued) We observe that the maximum of this function lies at the maximum value of the sample. Therefore \\(n^* = m\\) and ML estimate equals the plug-in estimate. \\[\\begin{align*} E[\\hat{n}] &= \\frac{k + 1}{k} E[m] - 1 \\\\ &= \\frac{k + 1}{k} \\frac{k(n + 1)}{k + 1} - 1 \\\\ &= n. \\end{align*}\\] ::: "],["nhst.html", "Chapter 16 Null hypothesis significance testing", " Chapter 16 Null hypothesis significance testing This chapter deals with null hypothesis significance testing. The students are expected to acquire the following knowledge: Binomial test. t-test. Chi-squared test. .fold-btn { float: right; margin: 5px 5px 0 0; } .fold { border: 1px solid black; min-height: 40px; } Exercise 16.1 (Binomial test) We assume \\(y_i \\in \\{0,1\\}\\), \\(i = 1,...,n\\) and \\(y_i | \\theta = 0.5 \\sim i.i.d.\\) Bernoulli\\((\\theta)\\). The test statistic is \\(X = \\sum_{i=1}^n\\) and the rejection region R is defined as the region where the probability of obtaining such or more extreme \\(X\\) given \\(\\theta = 0.5\\) is less than 0.05. Derive and plot the power function of the test for \\(n=100\\). What is the significance level of this test if \\(H0: \\theta = 0.5\\)? At which values of X will we reject the null hypothesis? # a # First we need the rejection region, so we need to find X_min and X_max n <- 100 qbinom(0.025, n, 0.5) ## [1] 40 qbinom(0.975, n, 0.5) ## [1] 60 pbinom(40, n, 0.5) ## [1] 0.02844397 pbinom(60, n, 0.5) ## [1] 0.9823999 X_min <- 39 X_max <- 60 thetas <- seq(0, 1, by = 0.01) beta_t <- 1 - pbinom(X_max, size = n, prob = thetas) + pbinom(X_min, size = n, prob = thetas) plot(beta_t) # b # The significance level is beta_t[51] ## [1] 0.0352002 # We will reject the null hypothesis at X values below X_min and above X_max. Exercise 16.2 (Long-run guarantees of the t-test) Generate a sample of size \\(n = 10\\) from the standard normal. Use the two-sided t-test with \\(H0: \\mu = 0\\) and record the p-value. Can you reject H0 at 0.05 significance level? (before simulating) If we repeated (b) many times, what would be the relative frequency of false positives/Type I errors (rejecting the null that is true)? What would be the relative frequency of false negatives /Type II errors (retaining the null when the null is false)? (now simulate b and check if the simulation results match your answer in b) Similar to (a-c) but now we generate data from N(-0.5, 1). Similar to (a-c) but now we generate data from N(\\(\\mu\\), 1) where we every time pick a different \\(\\mu < 0\\) and use a one-sided test \\(H0: \\mu <= 0\\). 
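(Added aside.) For part d., the long-run frequency of Type II errors can also be computed analytically, because the power of the one-sample t-test has a known form; a quick check with base R's power.t.test should roughly agree with the simulation below.
# theoretical power of the two-sided one-sample t-test with n = 10, true mean -0.5, sd = 1
pw <- power.t.test(n = 10, delta = 0.5, sd = 1, sig.level = 0.05,
                   type = "one.sample", alternative = "two.sided")
pw$power     # probability of correctly rejecting H0
1 - pw$power # theoretical Type II error rate, to compare with the simulated frequency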
set.seed(2) # a x <- rnorm(10) my_test <- t.test(x, alternative = "two.sided", mu = 0) my_test ## ## One Sample t-test ## ## data: x ## t = 0.6779, df = 9, p-value = 0.5149 ## alternative hypothesis: true mean is not equal to 0 ## 95 percent confidence interval: ## -0.4934661 0.9157694 ## sample estimates: ## mean of x ## 0.2111516 # we can not reject the null hypothesis # b # The expected value of false positives would be 0.05. The expected value of # true negatives would be 0, as there are no negatives (the null hypothesis is # always the truth). nit <- 1000 typeIerr <- vector(mode = "logical", length = nit) typeIIerr <- vector(mode = "logical", length = nit) for (i in 1:nit) { x <- rnorm(10) my_test <- t.test(x, alternative = "two.sided", mu = 0) if (my_test$p.value < 0.05) { typeIerr[i] <- T } else { typeIerr[i] <- F } } mean(typeIerr) ## [1] 0.052 sd(typeIerr) / sqrt(nit) ## [1] 0.007024624 # d # We can not estimate the percentage of true negatives, but it will probably be # higher than 0.05. There will be no false positives as the null hypothesis is # always false. typeIIerr <- vector(mode = "logical", length = nit) for (i in 1:nit) { x <- rnorm(10, -0.5) my_test <- t.test(x, alternative = "two.sided", mu = 0) if (my_test$p.value < 0.05) { typeIIerr[i] <- F } else { typeIIerr[i] <- T } } mean(typeIIerr) ## [1] 0.719 sd(typeIIerr) / sqrt(nit) ## [1] 0.01422115 # e # The expected value of false positives would be lower than 0.05. The expected # value of true negatives would be 0, as there are no negatives (the null # hypothesis is always the truth). typeIerr <- vector(mode = "logical", length = nit) for (i in 1:nit) { u <- runif(1, -1, 0) x <- rnorm(10, u) my_test <- t.test(x, alternative = "greater", mu = 0) if (my_test$p.value < 0.05) { typeIerr[i] <- T } else { typeIerr[i] <- F } } mean(typeIerr) ## [1] 0.012 sd(typeIerr) / sqrt(nit) ## [1] 0.003444977 Exercise 16.3 (T-test, confidence intervals, and bootstrap) Sample \\(n=20\\) from a standard normal distribution and calculate the p-value using t-test, confidence intervals based on normal distribution, and bootstrap. Repeat this several times and check how many times we rejected the null hypothesis (made a type I error). Hint: For the confidence intervals you can use function CI from the Rmisc package. set.seed(1) library(Rmisc) nit <- 1000 n_boot <- 100 t_logic <- rep(F, nit) boot_logic <- rep(F, nit) norm_logic <- rep(F, nit) for (i in 1:nit) { x <- rnorm(20) my_test <- t.test(x) my_CI <- CI(x) if (my_test$p.value <= 0.05) t_logic[i] <- T boot_tmp <- vector(mode = "numeric", length = n_boot) for (j in 1:n_boot) { tmp_samp <- sample(x, size = 20, replace = T) boot_tmp[j] <- mean(tmp_samp) } if ((quantile(boot_tmp, 0.025) >= 0) | (quantile(boot_tmp, 0.975) <= 0)) { boot_logic[i] <- T } if ((my_CI[3] >= 0) | (my_CI[1] <= 0)) { norm_logic[i] <- T } } mean(t_logic) ## [1] 0.053 sd(t_logic) / sqrt(nit) ## [1] 0.007088106 mean(boot_logic) ## [1] 0.093 sd(boot_logic) / sqrt(nit) ## [1] 0.009188876 mean(norm_logic) ## [1] 0.053 sd(norm_logic) / sqrt(nit) ## [1] 0.007088106 Exercise 16.4 (Chi-squared test) Show that the \\(\\chi^2 = \\sum_{i=1}^k \\frac{(O_i - E_i)^2}{E_i}\\) test statistic is approximately \\(\\chi^2\\) distributed when we have two categories. Let us look at the US voting data here. Compare the number of voters who voted for Trump or Hillary depending on their income (less or more than 100.000 dollars per year). Manually calculate the chi-squared statistic, compare to the chisq.test in R, and discuss the results. 
Visualize the test. Solution. Let \\(X_i\\) be binary variables, \\(i = 1,...,n\\). We can then express the test statistic as \\[\\begin{align} \\chi^2 = &\\frac{(O_i - np)^2}{np} + \\frac{(n - O_i - n(1 - p))^2}{n(1 - p)} \\\\ &= \\frac{(O_i - np)^2}{np(1 - p)} \\\\ &= (\\frac{O_i - np}{\\sqrt{np(1 - p)}})^2. \\end{align}\\] When \\(n\\) is large, this distrbution is approximately normal with \\(\\mu = np\\) and \\(\\sigma^2 = np(1 - p)\\) (binomial converges in distribution to standard normal). By definition, the chi-squared distribution with \\(k\\) degrees of freedom is a sum of squares of \\(k\\) independent standard normal random variables. n <- 24588 less100 <- round(0.66 * n * c(0.49, 0.45, 0.06)) # some rounding, but it should not affect results more100 <- round(0.34 * n * c(0.47, 0.47, 0.06)) x <- rbind(less100, more100) colnames(x) <- c("Clinton", "Trump", "other/no answer") print(x) ## Clinton Trump other/no answer ## less100 7952 7303 974 ## more100 3929 3929 502 chisq.test(x) ## ## Pearson's Chi-squared test ## ## data: x ## X-squared = 9.3945, df = 2, p-value = 0.00912 x ## Clinton Trump other/no answer ## less100 7952 7303 974 ## more100 3929 3929 502 csum <- apply(x, 2, sum) rsum <- apply(x, 1, sum) chi2 <- (x[1,1] - csum[1] * rsum[1] / sum(x))^2 / (csum[1] * rsum[1] / sum(x)) + (x[1,2] - csum[2] * rsum[1] / sum(x))^2 / (csum[2] * rsum[1] / sum(x)) + (x[1,3] - csum[3] * rsum[1] / sum(x))^2 / (csum[3] * rsum[1] / sum(x)) + (x[2,1] - csum[1] * rsum[2] / sum(x))^2 / (csum[1] * rsum[2] / sum(x)) + (x[2,2] - csum[2] * rsum[2] / sum(x))^2 / (csum[2] * rsum[2] / sum(x)) + (x[2,3] - csum[3] * rsum[2] / sum(x))^2 / (csum[3] * rsum[2] / sum(x)) chi2 ## Clinton ## 9.394536 1 - pchisq(chi2, df = 2) ## Clinton ## 0.009120161 x <- seq(0, 15, by = 0.01) df <- data.frame(x = x) ggplot(data = df, aes(x = x)) + stat_function(fun = dchisq, args = list(df = 2)) + geom_segment(aes(x = chi2, y = 0, xend = chi2, yend = dchisq(chi2, df = 2))) + stat_function(fun = dchisq, args = list(df = 2), xlim = c(chi2, 15), geom = "area", fill = "red") "],["bi.html", "Chapter 17 Bayesian inference 17.1 Conjugate priors 17.2 Posterior sampling", " Chapter 17 Bayesian inference This chapter deals with Bayesian inference. The students are expected to acquire the following knowledge: How to set prior distribution. Compute posterior distribution. Compute posterior predictive distribution. Use sampling for inference. .fold-btn { float: right; margin: 5px 5px 0 0; } .fold { border: 1px solid black; min-height: 40px; } 17.1 Conjugate priors Exercise 17.1 (Poisson-gamma model) Let us assume a Poisson likelihood and a gamma prior on the Poisson mean parameter (this is a conjugate prior). Derive posterior Below we have some data, which represents number of goals in a football match. Choose sensible prior for this data (draw the gamma density if necessary), justify it. Compute the posterior. Compute an interval such that the probability that the true mean is in there is 95%. What is the probability that the true mean is greater than 2.5? Back to theory: Compute prior predictive and posterior predictive. Discuss why the posterior predictive is overdispersed and not Poisson? Draw a histogram of the prior predictive and posterior predictive for the data from (b). Discuss. Generate 10 and 100 random samples from a Poisson distribution and compare the posteriors with a flat prior, and a prior concentrated away from the truth. x <- c(3, 2, 1, 1, 5, 4, 0, 0, 4, 3) Solution. 
\\[\\begin{align*} p(\\lambda | X) &= \\frac{p(X | \\lambda) p(\\lambda)}{\\int_0^\\infty p(X | \\lambda) p(\\lambda) d\\lambda} \\\\ &\\propto p(X | \\lambda) p(\\lambda) \\\\ &= \\Big(\\prod_{i=1}^n \\frac{1}{x_i!} \\lambda^{x_i} e^{-\\lambda}\\Big) \\frac{\\beta^\\alpha}{\\Gamma(\\alpha)} \\lambda^{\\alpha - 1} e^{-\\beta \\lambda} \\\\ &\\propto \\lambda^{\\sum_{i=1}^n x_i + \\alpha - 1} e^{- \\lambda (n + \\beta)} \\\\ \\end{align*}\\] We recognize this as the shape of a gamma distribution, therefore \\[\\begin{align*} \\lambda | X \\sim \\text{gamma}(\\alpha + \\sum_{i=1}^n x_i, \\beta + n) \\end{align*}\\] For the prior predictive, we have \\[\\begin{align*} p(x^*) &= \\int_0^\\infty p(x^*, \\lambda) d\\lambda \\\\ &= \\int_0^\\infty p(x^* | \\lambda) p(\\lambda) d\\lambda \\\\ &= \\int_0^\\infty \\frac{1}{x^*!} \\lambda^{x^*} e^{-\\lambda} \\frac{\\beta^\\alpha}{\\Gamma(\\alpha)} \\lambda^{\\alpha - 1} e^{-\\beta \\lambda} d\\lambda \\\\ &= \\frac{\\beta^\\alpha}{\\Gamma(x^* + 1)\\Gamma(\\alpha)} \\int_0^\\infty \\lambda^{x^* + \\alpha - 1} e^{-\\lambda (1 + \\beta)} d\\lambda \\\\ &= \\frac{\\beta^\\alpha}{\\Gamma(x^* + 1)\\Gamma(\\alpha)} \\frac{\\Gamma(x^* + \\alpha)}{(1 + \\beta)^{x^* + \\alpha}} \\int_0^\\infty \\frac{(1 + \\beta)^{x^* + \\alpha}}{\\Gamma(x^* + \\alpha)} \\lambda^{x^* + \\alpha - 1} e^{-\\lambda (1 + \\beta)} d\\lambda \\\\ &= \\frac{\\beta^\\alpha}{\\Gamma(x^* + 1)\\Gamma(\\alpha)} \\frac{\\Gamma(x^* + \\alpha)}{(1 + \\beta)^{x^* + \\alpha}} \\\\ &= \\frac{\\Gamma(x^* + \\alpha)}{\\Gamma(x^* + 1)\\Gamma(\\alpha)} (\\frac{\\beta}{1 + \\beta})^\\alpha (\\frac{1}{1 + \\beta})^{x^*}, \\end{align*}\\] which we recognize as the negative binomial distribution with \\(r = \\alpha\\) and \\(p = \\frac{1}{\\beta + 1}\\). For the posterior predictive, the calculation is the same, only now the parameters are \\(r = \\alpha + \\sum_{i=1}^n x_i\\) and \\(p = \\frac{1}{\\beta + n + 1}\\). There are two sources of uncertainty in the predictive distribution. First is the uncertainty about the population. Second is the variability in sampling from the population. When \\(n\\) is large, the latter is going to be very small. But when \\(n\\) is small, the latter is going to be higher, resulting in an overdispersed predictive distribution. 
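A quick simulation can back up the derivation above. The sketch below is not part of the original solution and assumes the example prior gamma(alpha = 1, beta = 1); it checks that drawing lambda from the prior and then x from Poisson(lambda) reproduces the stated negative binomial prior predictive.

```r
set.seed(42)
alpha <- 1   # assumed example prior shape
beta  <- 1   # assumed example prior rate
nsim  <- 1e5

# Simulate the prior predictive: lambda ~ gamma(alpha, beta), then x | lambda ~ Poisson(lambda).
lambda_draws <- rgamma(nsim, shape = alpha, rate = beta)
x_sim        <- rpois(nsim, lambda_draws)

# Compare simulated frequencies with the claimed closed form: a negative binomial
# with size = alpha and success probability beta / (beta + 1).
sim_freq <- as.numeric(table(factor(x_sim, levels = 0:5))) / nsim
round(rbind(simulated  = sim_freq,
            analytical = dnbinom(0:5, size = alpha, prob = beta / (beta + 1))), 3)
```

The two rows should agree up to Monte Carlo error.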
x <- c(3, 2, 1, 1, 5, 4, 0, 0, 4, 3) # b # quick visual check of the prior ggplot(data = data.frame(x = seq(0, 5, by = 0.01)), aes(x = x)) + stat_function(fun = dgamma, args = list(shape = 1, rate = 1)) palpha <- 1 pbeta <- 1 alpha_post <- palpha + sum(x) beta_post <- pbeta + length(x) ggplot(data = data.frame(x = seq(0, 5, by = 0.01)), aes(x = x)) + stat_function(fun = dgamma, args = list(shape = alpha_post, rate = beta_post)) # probability of being higher than 2.5 1 - pgamma(2.5, alpha_post, beta_post) ## [1] 0.2267148 # interval qgamma(c(0.025, 0.975), alpha_post, beta_post) ## [1] 1.397932 3.137390 # d prior_pred <- rnbinom(1000, size = palpha, prob = 1 - 1 / (pbeta + 1)) post_pred <- rnbinom(1000, size = palpha + sum(x), prob = 1 - 1 / (pbeta + 10 + 1)) df <- data.frame(prior = prior_pred, posterior = post_pred) df <- gather(df) ggplot(df, aes(x = value, fill = key)) + geom_histogram(position = "dodge") # e set.seed(1) x1 <- rpois(10, 2.5) x2 <- rpois(100, 2.5) alpha_flat <- 1 beta_flat <- 0.1 alpha_conc <- 50 beta_conc <- 10 n <- 10000 df_flat <- data.frame(x1 = rgamma(n, alpha_flat + sum(x1), beta_flat + 10), x2 = rgamma(n, alpha_flat + sum(x2), beta_flat + 100), type = "flat") df_flat <- tidyr::gather(df_flat, key = "key", value = "value", - type) df_conc <- data.frame(x1 = rgamma(n, alpha_conc + sum(x1), beta_conc + 10), x2 = rgamma(n, alpha_conc + sum(x2), beta_conc + 100), type = "conc") df_conc <- tidyr::gather(df_conc, key = "key", value = "value", - type) df <- rbind(df_flat, df_conc) ggplot(data = df, aes(x = value, color = type)) + facet_wrap(~ key) + geom_density() 17.2 Posterior sampling Exercise 17.2 (Bayesian logistic regression) In Chapter 15 we implemented a MLE for logistic regression (see the code below). For this model, conjugate priors do not exist, which complicates the calculation of the posterior. However, we can use sampling from the numerator of the posterior, using rejection sampling. Set a sensible prior distribution on \\(\\beta\\) and use rejection sampling to find the posterior distribution. In a) you will get a distribution of parameter \\(\\beta\\). Plot the probabilities (as in exercise 15.3) for each sample of \\(\\beta\\) and compare to the truth. Hint: We can use rejection sampling even for functions which are not PDFs – they do not have to sum/integrate to 1. We just need to use a suitable envelope that we know how to sample from. For example, here we could use a uniform distribution and scale it suitably. set.seed(1) inv_log <- function (z) { return (1 / (1 + exp(-z))) } x <- rnorm(100) y <- x y <- rbinom(100, size = 1, prob = inv_log(1.2 * x)) l_logistic <- function (beta, X, y) { logl <- -sum(y * log(inv_log(as.vector(beta %*% X))) + (1 - y) * log((1 - inv_log(as.vector(beta %*% X))))) return(logl) } my_optim <- optim(par = 0.5, fn = l_logistic, method = "L-BFGS-B", lower = 0, upper = 10, X = x, y = y) my_optim$par # Let's say we believe that the mean of beta is 0.5. Since we are not very sure # about this, we will give it a relatively high variance. So a normal prior with # mean 0.5 and standard deviation 5. But there is no right solution to this, # this is basically us expressing our prior belief in the parameter values. 
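To see how weakly informative that choice is before running the sampler, here is a quick sketch (not part of the original solution) of the assumed normal(0.5, 5) prior density:

```r
# The assumed normal(0.5, 5) prior on beta is nearly flat over the range of
# values the likelihood can realistically support.
curve(dnorm(x, mean = 0.5, sd = 5), from = -15, to = 15,
      xlab = "beta", ylab = "prior density")
```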
set.seed(1) inv_log <- function (z) { return (1 / (1 + exp(-z))) } x <- rnorm(100) y <- x y <- rbinom(100, size = 1, prob = inv_log(1.2 * x)) l_logistic <- function (beta, X, y) { logl <- -sum(y * log(inv_log(as.vector(beta %*% X))) + (1 - y) * log((1 - inv_log(as.vector(beta %*% X))))) if (is.nan(logl)) logl <- Inf return(logl) } my_optim <- optim(par = 0.5, fn = l_logistic, method = "L-BFGS-B", lower = 0, upper = 10, X = x, y = y) my_optim$par ## [1] 1.166558 f_logistic <- function (beta, X, y) { logl <- prod(inv_log(as.vector(beta %*% X))^y * (1 - inv_log(as.vector(beta %*% X)))^(1 - y)) return(logl) } a <- seq(0, 3, by = 0.01) my_l <- c() for (i in a) { my_l <- c(my_l, f_logistic(i, x, y) * dnorm(i, 0.5, 5)) } plot(my_l) envlp <- 10^(-25.8) * dunif(a, -5, 5) # found by trial and error tmp <- data.frame(envel = envlp, l = my_l, t = a) tmp <- gather(tmp, key = "key", value = "value", - t) ggplot(tmp, aes(x = t, y = value, color = key)) + geom_line() # envelope OK set.seed(1) nsamps <- 1000 samps <- c() for (i in 1:nsamps) { tmp <- runif(1, -5, 5) u <- runif(1, 0, 1) if (u < (f_logistic(tmp, x, y) * dnorm(tmp, 0.5, 5)) / (10^(-25.8) * dunif(tmp, -5, 5))) { samps <- c(samps, tmp) } } plot(density(samps)) mean(samps) ## [1] 1.211578 median(samps) ## [1] 1.204279 truth_p <- data.frame(x = x, prob = inv_log(1.2 * x), type = "truth") preds <- inv_log(x %*% t(samps)) preds <- gather(cbind(as.data.frame(preds), x = x), key = "key", "value" = value, - x) ggplot(preds, aes(x = x, y = value)) + geom_line(aes(group = key), color = "gray", alpha = 0.7) + geom_point(data = truth_p, aes(y = prob), color = "red", alpha = 0.7) + theme_bw() "],["distributions-intutition.html", "Chapter 18 Distributions intutition 18.1 Discrete distributions 18.2 Continuous distributions", " Chapter 18 Distributions intutition This chapter is intended to help you familiarize yourself with the different probability distributions you will encounter in this course. You will need to use Appendix B extensively as a reference for the basic properties of distributions, so keep it close! .fold-btn { float: right; margin: 5px 5px 0 0; } .fold { border: 1px solid black; min-height: 40px; } 18.1 Discrete distributions Exercise 18.1 (Bernoulli intuition 1) Arguably the simplest distribution you will encounter is the Bernoulli distribution. It is a discrete probability distribution used to represent the outcome of a yes/no question. It has one parameter \\(p\\) which is the probability of success. The probability of failure is \\((1-p)\\), sometimes denoted as \\(q\\). A classic way to think about a Bernoulli trial (a yes/no experiment) is a coin flip. Real coins are fair, meaning the probability of either heads (1) or tails (0) are the same, so \\(p=0.5\\) as shown below in figure a. Alternatively we may want to represent a process that doesn’t have equal probabilities of outcomes like “Will a throw of a fair die result in a 6?”. In this case \\(p=\\frac{1}{6}\\), shown in figure b. Using your knowledge of the Bernoulli distribution use the throw of a fair die to think of events, such that: \\(p = 0.5\\) \\(p = \\frac{5}{6}\\) \\(q = \\frac{2}{3}\\) Solution. An event that is equally likely to happen or not happen i.e. \\(p = 0.5\\) would be throwing an even number. More formally we can name this event \\(A\\) and write: \\(A = \\{2,4,6\\}\\), its probability being \\(P(A) = 0.5\\) An example of an event with \\(p = \\frac{5}{6}\\) would be throwing a number greater than 1. Defined as \\(B = \\{2,3,4,5,6\\}\\). 
We need an event that fails \\(\\frac{2}{3}\\) of the time. Alternatively we can reverse the problem and find an event that succeeds \\(\\frac{1}{3}\\) of the time, since: \\(q = 1 - p \\implies p = 1 - q = \\frac{1}{3}\\). The event that our outcome is divisible by 3: \\(C = \\{3, 6\\}\\) satisfies this condition. Exercise 18.2 (Binomial intuition 1) The binomial distribution is a generalization of the Bernoulli distribution. Instead of considering a single Bernoulli trial, we now consider a sequence of \\(n\\) trials, which are independent and have the same parameter \\(p\\). So the binomial distribution has two parameters \\(n\\) - the number of trials and \\(p\\) - the probability of success for each trial. If we return to our coin flip representation, we now flip a coin several times. The binomial distribution will give us the probabilities of all possible outcomes. Below we show the distribution for a series of 10 coin flips with a fair coin (left) and a biased coin (right). The numbers on the x axis represent the number of times the coin landed heads. Using your knowledge of the binomial distribution: Take the pmf of the binomial distribution and plug in \\(n=1\\), check that it is in fact equivalent to a Bernoulli distribution. In our examples we show the graph of a binomial distribution over 10 trials with \\(p=0.8\\). If we take a look at the graph, it appears as though the probabilities of getting 0,1,2 or 3 heads in 10 flips are zero. Is it actually zero? Check by plugging in the values into the pmf. Solution. The pmf of a binomial distribution is \\(\\binom{n}{k} p^k (1 - p)^{n - k}\\), now we insert \\(n=1\\) to get: \\[\\binom{1}{k} p^k (1 - p)^{1 - k}\\] Not quite equivalent to a Bernoulli, however note that the support of the binomial distribution is defined as \\(k \\in \\{0,1,\\dots,n\\}\\), so in our case \\(k = \\{0,1\\}\\), then: \\[\\binom{1}{0} = \\binom{1}{1} = 1\\] we get: \\(p^k (1 - p)^{1 - k}\\) ,the Bernoulli distribution. As we already know \\(p=0.8, n=10\\), so: \\[\\binom{10}{0} 0.8^0 (1 - 0.8)^{10 - 0} = 1.024 \\cdot 10^{-7}\\] \\[\\binom{10}{1} 0.8^1 (1 - 0.8)^{10 - 1} = 4.096 \\cdot 10^{-6}\\] \\[\\binom{10}{2} 0.8^2 (1 - 0.8)^{10 - 2} = 7.3728 \\cdot 10^{-5}\\] \\[\\binom{10}{3} 0.8^3 (1 - 0.8)^{10 - 3} = 7.86432\\cdot 10^{-4}\\] So the probabilities are not zero, just very small. Exercise 18.3 (Poisson intuition 1) Below are shown 3 different graphs of the Poisson distribution. Your task is to replicate them on your own in R by varying the \\(\\lambda\\) parameter. Hint: You can use dpois() to get the probabilities. library(ggplot2) library(gridExtra) x = 0:15 # Create Poisson data data1 <- data.frame(x = x, y = dpois(x, lambda = 0.1)) data2 <- data.frame(x = x, y = dpois(x, lambda = 1)) data3 <- data.frame(x = x, y = dpois(x, lambda = 7.5)) # Create individual ggplot objects plot1 <- ggplot(data1, aes(x, y)) + geom_col() + xlab("x") + ylab("Probability") + ylim(0,1) plot2 <- ggplot(data2, aes(x, y)) + geom_col() + xlab("x") + ylab(NULL) + ylim(0,1) plot3 <- ggplot(data3, aes(x, y)) + geom_col() + xlab("x") + ylab(NULL) + ylim(0,1) # Combine the plots grid.arrange(plot1, plot2, plot3, ncol = 3) Exercise 18.4 (Poisson intuition 2) The Poisson distribution is a discrete probability distribution that models the probability of a given number of events occuring within processes where events occur at a constant mean rate and independently of each other - a Poisson process. It has a single parameter \\(\\lambda\\), which represents the constant mean rate. 
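As an aside, the four small binomial probabilities computed in the previous exercise can be verified in one line with R's dbinom(); this quick check is not part of the original solution.

```r
# P(0), P(1), P(2) and P(3) heads in 10 flips of a coin with p = 0.8.
dbinom(0:3, size = 10, prob = 0.8)
## 1.024000e-07 4.096000e-06 7.372800e-05 7.864320e-04
```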
A classic example of a scenario that can be modeled using the Poisson distribution is the number of calls received at a call center in a day (or in fact any other time interval). Suppose you work in a call center and have some understanding of probability distributions. You overhear your supervisor mentioning that the call center receives an average of 2.5 calls per day. Using your knowledge of the Poisson distribution, calculate: The probability you will get no calls today. The probability you will get more than 5 calls today. Solution. First recall the Poisson pmf: \\[p(k) = \\frac{\\lambda^k e^{-\\lambda}}{k!}\\] as stated previously our parameter \\(\\lambda = 2.5\\) To get the probability of no calls we simply plug in \\(k = 0\\), so: \\[p(0) = \\frac{2.5^0 e^{-2.5}}{0!} = e^{-2.5} \\approx 0.082\\] The support of the Poisson distribution is non-negative integers. So if we wanted to calculate the probability of getting more than 5 calls we would need to add up the probabilities of getting 6 calls and 7 calls and so on up to infinity. Let us instead remember that the sum of all probabilties will be 1, we will reverse the problem and instead ask “What is the probability we get 5 calls or less?”. We can subtract the probability of the opposite outcome (the complement) from 1 to get the probability of our original question. \\[P(k > 5) = 1 - P(k \\leq 5)\\] \\[P(k \\leq 5) = \\sum_{i=0}^{5} p(i) = p(0) + p(1) + p(2) + p(3) + p(4) + p(5) =\\] \\[= \\frac{2.5^0 e^{-2.5}}{0!} + \\frac{2.5^1 e^{-2.5}}{1!} + \\dots =\\] \\[=0.957979\\] So the probability of geting more than 5 calls will be \\(1 - 0.957979 = 0.042021\\) Exercise 18.5 (Geometric intuition 1) The geometric distribution is a discrete distribution that models the number of failures before the first success in a sequence of independent Bernoulli trials. It has a single parameter \\(p\\), representing the probability of success and its support is all non-negative integers \\(\\{0,1,2,\\dots\\}\\). NOTE: There are two forms of this distribution, the one we just described and another that models the number of trials before the first success. The difference is subtle yet significant and you are likely to encounter both forms. The key to telling them apart is to check their support, since the number of trials has to be at least \\(1\\), for this case we have \\(\\{1,2,\\dots\\}\\). In the graph below we show the pmf of a geometric distribution with \\(p=0.5\\). This can be thought of as the number of successive failures (tails) in the flip of a fair coin. You can see that there’s a 50% chance you will have zero failures i.e. you will flip a heads on your very first attempt. But there is some smaller chance that you will flip a sequence of tails in a row, with longer sequences having ever lower probability. Create an equivalent graph that represents the probability of rolling a 6 with a fair 6-sided die. Use the formula for the mean of the geometric distribution and determine the average number of failures before you roll a 6. Look up the alternative form of the geometric distribtuion and again use the formula for the mean to determine the average number of trials before you roll a 6. Solution. Parameter p (the probability of success) for rolling a 6 is \\(p=\\frac{1}{6}\\). 
library(ggplot2) # Parameters p <- 1/6 x_vals <- 0:9 # Starting from 0 probs <- dgeom(x_vals, p) # Data data <- data.frame(x_vals, probs) # Plot ggplot(data, aes(x=x_vals, y=probs)) + geom_segment(aes(xend=x_vals, yend=0), color="black", size=1) + geom_point(color="red", size=2) + labs(x = "Number of trials", y = "Probability") + theme_minimal() + scale_x_continuous(breaks = x_vals) # This line ensures integer x-axis labels ::: {.solution} b) The expected value of a random variable (the mean) is denoted as \\(E[X]\\). \\[E[X] = \\frac{1-p}{p}= \\frac{1- \\frac{1}{6}}{\\frac{1}{6}} = \\frac{5}{6}\\cdot 6 = 5\\] On average we will fail 5 times before we roll our first 6. The alternative form of this distribution (with support on all positive integers) has a slightly different formula for the mean. This change reflects the difference in the way we posed our question: \\[E[X] = \\frac{1}{p} = \\frac{1}{\\frac{1}{6}} = 6\\] On average we will have to throw the die 6 times before we roll a 6. ::: 18.2 Continuous distributions Exercise 18.6 (Uniform intuition 1) The need for a randomness is a common problem. A practical solution are so-called random number generators (RNGs). The simplest RNG one would think of is choosing a set of numbers and having the generator return a number at random, where the probability of returning any number from this set is the same. If this set is an interval of real numbers, then we’ve basically described the continuous uniform distribution. It has two parameters \\(a\\) and \\(b\\), which define the beginning and end of its support respectively. Let’s think of the mean intuitively. The expected value or mean of a distribution is the pivot point on our x-axis, which “balances” the graph. Given parameters \\(a\\) and \\(b\\) what is your intuitive guess of the mean for this distribution? A special case of the uniform distribution is the standard uniform distribution with \\(a=0\\) and \\(b=1\\). Write the pdf \\(f(x)\\) of this particular distribution. Solution. It’s the midpoint between \\(a\\) and \\(b\\), so \\(\\frac{a+b}{2}\\) Inserting the parameter values we get:\\[f(x) = \\begin{cases} 1 & \\text{if } 0 \\leq x \\leq 1 \\\\ 0 & \\text{otherwise} \\end{cases} \\] Notice how the pdf is just a constant \\(1\\) across all values of \\(x \\in [0,1]\\). Here it is important to distinguish between probability and probability density. The density may be 1, but the probability is not and while discrete distributions never exceed 1 on the y-axis, continuous distributions can go as high as you like. Exercise 18.7 (Normal intuition 1) The normal distribution, also known as the Gaussian distribution, is a continuous distribution that encompasses the entire real number line. It has two parameters: the mean, denoted by \\(\\mu\\), and the variance, represented by \\(\\sigma^2\\). Its shape resembles the iconic bell curve. The position of its peak is determined by the parameter \\(\\mu\\), while the variance determines the spread or width of the curve. A smaller variance results in a sharper, narrower peak, while a larger variance leads to a broader, more spread-out curve. Below, we graph the distribution of IQ scores for two different populations. We aim to identify individuals with an IQ at or above 140 for an experiment. We can identify them reliably; however, we only have time to examine one of the two groups. Which group should we investigate to have the best chance of finding such individuals? 
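Both results can be checked by simulation. The sketch below (not part of the original solutions) verifies the uniform mean just discussed, using an arbitrary example with a = 2 and b = 10, and the two geometric means from the previous exercise, using rgeom(), which in R counts failures before the first success.

```r
set.seed(123)

# Uniform: the mean sits at the midpoint (a + b) / 2, here (2 + 10) / 2 = 6.
mean(runif(1e5, min = 2, max = 10))

# Geometric: the mean number of failed rolls before the first six should be close to 5,
# and adding 1 gives the number of trials up to and including the first six, close to 6.
rolls <- rgeom(1e5, prob = 1/6)
mean(rolls)
mean(rolls + 1)
```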
NOTE: The graph below displays the parameter \\(\\sigma\\), which is the square root of the variance, more commonly referred to as the standard deviation. Keep this in mind when solving the problems. Insert the values of either population into the pdf of a normal distribution and determine which one has a higher density at \\(x=140\\). Generate the graph yourself and zoom into the relevant area to graphically verify your answer. To determine probability density, we can use the pdf. However, if we wish to know the proportion of the population that falls within certain parameters, we would need to integrate the pdf. Fortunately, the integrals of common distributions are well-established. This integral gives us the cumulative distribution function \\(F(x)\\) (CDF). BONUS: Look up the CDF of the normal distribution and input the appropriate values to determine the percentage of each population that comprises individuals with an IQ of 140 or higher. Solution. Group 1: \\(\\mu = 100, \\sigma=10 \\rightarrow \\sigma^2 = 100\\) \\[\\frac{1}{\\sqrt{2 \\pi \\sigma^2}} e^{-\\frac{(x - \\mu)^2}{2 \\sigma^2}} = \\frac{1}{\\sqrt{2 \\pi 100}} e^{-\\frac{(140 - 100)^2}{2 \\cdot 100}} \\approx 1.34e-05\\] Group 2: \\(\\mu = 105, \\sigma=8 \\rightarrow \\sigma^2 = 64\\) \\[\\frac{1}{\\sqrt{2 \\pi \\sigma^2}} e^{-\\frac{(x - \\mu)^2}{2 \\sigma^2}} = \\frac{1}{\\sqrt{2 \\pi 64}} e^{-\\frac{(140 - 105)^2}{2 \\cdot 64}} \\approx 3.48e-06\\] So despite the fact that group 1 has a lower average IQ, we are more likely to find 140 IQ individuals in this group. library(ggplot2) library(tidyr) # Create data x <- seq(135, 145, by = 0.01) # Adjusting the x range to account for the larger standard deviations df <- data.frame(x = x) # Define the IQ distributions df$IQ_mu100_sd10 <- dnorm(df$x, mean = 100, sd = 10) df$IQ_mu105_sd8 <- dnorm(df$x, mean = 105, sd = 8) # Convert from wide to long format for ggplot2 df_long <- gather(df, distribution, density, -x) # Ensure the levels of the 'distribution' factor match our desired order df_long$distribution <- factor(df_long$distribution, levels = c("IQ_mu100_sd10", "IQ_mu105_sd8")) # Plot ggplot(df_long, aes(x = x, y = density, color = distribution)) + geom_line() + labs(x = "IQ Score", y = "Density") + scale_color_manual( name = "IQ Distribution", values = c(IQ_mu100_sd10 = "red", IQ_mu105_sd8 = "blue"), labels = c("Group 1 (µ=100, σ=10)", "Group 2 (µ=105, σ=8)") ) + theme_minimal() ::: {.solution} c. The CDF of the normal distribution is \\(\\Phi(x) = \\frac{1}{2} \\left[ 1 + \\text{erf} \\left( \\frac{x - \\mu}{\\sigma \\sqrt{2}} \\right) \\right]\\). The CDF is defined as the integral of the distribution density up to x. So to get the total percentage of individuals with IQ at 140 or higher we will need to subtract the value from 1. Group 1: \\[1 - \\Phi(140) = \\frac{1}{2} \\left[ 1 + \\text{erf} \\left( \\frac{140 - 100}{10 \\sqrt{2}} \\right) \\right] \\approx 3.17e-05 \\] Group 2 : \\[1 - \\Phi(140) = \\frac{1}{2} \\left[ 1 + \\text{erf} \\left( \\frac{140 - 105}{8 \\sqrt{2}} \\right) \\right] \\approx 6.07e-06 \\] So roughly 0.003% and 0.0006% of individuals in groups 1 and 2 respectively have an IQ at or above 140. ::: Exercise 18.8 (Beta intuition 1) The beta distribution is a continuous distribution defined on the unit interval \\([0,1]\\). It has two strictly positive paramters \\(\\alpha\\) and \\(\\beta\\), which determine its shape. Its support makes it especially suitable to model distribtuions of percentages and proportions. 
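Returning briefly to the IQ example: the hand calculations there can be reproduced directly with R's built-in normal density and CDF, avoiding the erf form altogether (a quick check, not part of the original solution).

```r
# Density at an IQ of 140 for each group.
dnorm(140, mean = 100, sd = 10)      # group 1, ~1.34e-05
dnorm(140, mean = 105, sd = 8)       # group 2, ~3.48e-06

# Share of each group with an IQ of 140 or higher.
1 - pnorm(140, mean = 100, sd = 10)  # group 1, ~3.17e-05
1 - pnorm(140, mean = 105, sd = 8)   # group 2, ~6.07e-06
```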
Below you’ve been provided with some code that you can copy into Rstudio. Once you run the code, an interactive Shiny app will appear and you will be able to manipulate the graph of the beta distribution. Play around with the parameters to get: A straight line from (0,0) to (1,2) A straight line from (0,2) to (1,0) A symmetric bell curve A bowl-shaped curve The standard uniform distribution is actually a special case of the beta distribution. Find the exact parameters \\(\\alpha\\) and \\(\\beta\\). Once you do, prove the equality by inserting the values into our pdf. Hint: The beta function is evaluated as \\(\\text{B}(a,b) = \\frac{\\Gamma(a)\\Gamma(b)}{\\Gamma(a+b)}\\), the gamma function for positive integers \\(n\\) is evaluated as \\(\\Gamma(n)= (n-1)!\\) # Install and load necessary packages install.packages(c("shiny", "ggplot2")) library(shiny) library(ggplot2) # The Shiny App ui <- fluidPage( titlePanel("Beta Distribution Viewer"), sidebarLayout( sidebarPanel( sliderInput("alpha", "Alpha:", min = 0.1, max = 10, value = 2, step = 0.1), sliderInput("beta", "Beta:", min = 0.1, max = 10, value = 2, step = 0.1) ), mainPanel( plotOutput("betaPlot") ) ) ) server <- function(input, output) { output$betaPlot <- renderPlot({ x <- seq(0, 1, by = 0.01) y <- dbeta(x, shape1 = input$alpha, shape2 = input$beta) ggplot(data.frame(x = x, y = y), aes(x = x, y = y)) + geom_line() + labs(x = "Value", y = "Density") + theme_minimal() }) } shinyApp(ui = ui, server = server) Solution. \\(\\alpha = 2, \\beta=1\\) \\(\\alpha = 1, \\beta=2\\) Possible solution \\(\\alpha = \\beta= 5\\) Possible solution \\(\\alpha = \\beta= 0.5\\) The correct parameters are \\(\\alpha = 1, \\beta=1\\), to prove the equality we insert them into the beta pdf: \\[\\frac{x^{\\alpha - 1} (1 - x)^{\\beta - 1}}{\\text{B}(\\alpha, \\beta)} = \\frac{x^{1 - 1} (1 - x)^{1 - 1}}{\\text{B}(1, 1)} = \\frac{1}{\\frac{\\Gamma(1)\\Gamma(1)}{\\Gamma(1+1)}}= \\frac{1}{\\frac{(1-1)!(1-1)!}{(2-1)!}} = 1\\] Exercise 18.9 (Exponential intuition 1) The exponential distribution represents the distributon of time between events in a Poisson process. It is the continuous analogue of the geometric distribution. It has a single parameter \\(\\lambda\\), which is strictly positive and represents the constant rate of the corresponding Poisson process. The support is all positive reals, since time between events is non-negative, but not bound upwards. Let’s revisit the call center from our Poisson problem. We get 2.5 calls per day on average, this is our rate parameter \\(\\lambda\\). A work day is 8 hours. What is the mean time between phone calls? The cdf \\(F(x)\\) tells us what percentage of calls occur within x amount of time of each other. You want to take an hour long lunch break but are worried about missing calls. Calculate the percentage of calls you are likely to miss if you’re gone for an hour. Hint: The cdf is \\(F(x) = \\int_{-\\infty}^{x} f(x) dx\\) Solution. 
Taking \\(\\lambda = \\frac{2.5 \\text{ calls}}{8 \\text{ hours}} = \\frac{1 \\text{ call}}{3.2 \\text{ hours}}\\) \\[E[X] = \\frac{1}{\\lambda} = \\frac{3.2 \\text{ hours}}{\\text{call}}\\] First we derive the CDF, we can integrate from 0 instead of \\(-\\infty\\), since we have no support in the negatives: \\[\\begin{align} F(x) &= \\int_{0}^{x} \\lambda e^{-\\lambda t} dt \\\\ &= \\lambda \\int_{0}^{x} e^{-\\lambda t} dt \\\\ &= \\lambda (\\frac{1}{-\\lambda}e^{-\\lambda t} |_{0}^{x}) \\\\ &= \\lambda(\\frac{1}{\\lambda} - \\frac{1}{\\lambda} e^{-\\lambda x}) \\\\ &= 1 - e^{-\\lambda x}. \\end{align}\\] Then we just evaluate it for a time of 1 hour: \\[F(1 \\text{ hour}) = 1 - e^{-\\frac{1 \\text{ call}}{3.2 \\text{ hours}} \\cdot 1 \\text{ hour}}= 1 - e^{-\\frac{1 \\text{ call}}{3.2 \\text{ hours}}} \\approx 0.268\\] So we have about a 27% chance of missing a call if we’re gone for an hour. Exercise 18.10 (Gamma intuition 1) The gamma distribution is a continuous distribution characterized by two parameters, \\(\\alpha\\) and \\(\\beta\\), both greater than 0. These parameters afford the distribution a broad range of shapes, leading to it being commonly referred to as a family of distributions. Given its support over the positive real numbers, it is well suited for modeling a diverse range of positive-valued phenomena. The exponential distribution is actually just a particular form of the gamma distribution. What are the values of \\(\\alpha\\) and \\(\\beta\\)? Copy the code from our beta distribution Shiny app and modify it to simulate the gamma distribution. Then get it to show the exponential. Solution. Let’s start by taking a look at the pdfs of the two distributions side by side: \\[\\frac{\\beta^\\alpha}{\\Gamma(\\alpha)} x^{\\alpha - 1}e^{-\\beta x} = \\lambda e^{-\\lambda x}\\] The \\(x^{\\alpha - 1}\\) term is not found anywhere in the pdf of the exponential so we need to eliminate it by setting \\(\\alpha = 1\\). This also makes the fraction evaluate to \\(\\frac{\\beta^1}{\\Gamma(1)} = \\beta\\), which leaves us with \\[\\beta \\cdot e^{-\\beta x}\\] Now we can see that \\(\\beta = \\lambda\\) and \\(\\alpha = 1\\). # Install and load necessary packages install.packages(c("shiny", "ggplot2")) library(shiny) library(ggplot2) # The Shiny App ui <- fluidPage( titlePanel("Gamma Distribution Viewer"), sidebarLayout( sidebarPanel( sliderInput("shape", "Shape (α):", min = 0.1, max = 10, value = 2, step = 0.1), sliderInput("scale", "Scale (β):", min = 0.1, max = 10, value = 2, step = 0.1) ), mainPanel( plotOutput("gammaPlot") ) ) ) server <- function(input, output) { output$gammaPlot <- renderPlot({ x <- seq(0, 25, by = 0.1) y <- dgamma(x, shape = input$shape, scale = input$scale) ggplot(data.frame(x = x, y = y), aes(x = x, y = y)) + geom_line() + labs(x = "Value", y = "Density") + theme_minimal() }) } shinyApp(ui = ui, server = server) "],["A1.html", "A R programming language A.1 Basic characteristics A.2 Why R? A.3 Setting up A.4 R basics A.5 Functions A.6 Other tips A.7 Further reading and references", " A R programming language A.1 Basic characteristics R is free software for statistical computing and graphics. It is widely used by statisticians, scientists, and other professionals for software development and data analysis. It is an interpreted language and therefore the programs do not need compilation. A.2 Why R? R is one of the main two languages used for statistics and machine learning (the other being Python). Pros Libraries. 
Comprehensive collection of statistical and machine learning packages. Easy to code. Open source. Anyone can access R and develop new methods. Additionally, it is relatively simple to get source code of established methods. Large community. The use of R has been rising for some time, in industry and academia. Therefore a large collection of blogs and tutorials exists, along with people offering help on pages like StackExchange and CrossValidated. Integration with other languages and LaTeX. New methods. Many researchers develop R packages based on their research, therefore new methods are available soon after development. Cons Slow. Programs run slower than in other programming languages, however this can be somewhat ammended by effective coding or integration with other languages. Memory intensive. This can become a problem with large data sets, as they need to be stored in the memory, along with all the information the models produce. Some packages are not as good as they should be, or have poor documentation. Object oriented programming in R can be very confusing and complex. A.3 Setting up https://www.r-project.org/. A.3.1 RStudio RStudio is the most widely used IDE for R. It is free, you can download it from https://rstudio.com/. While console R is sufficient for the requirements of this course, we recommend the students install RStudio for its better user interface. A.3.2 Libraries for data science Listed below are some of the more useful libraries (packages) for data science. Students are also encouraged to find other useful packages. dplyr Efficient data manipulation. Part of the wider package collection called tidyverse. ggplot2 Plotting based on grammar of graphics. stats Several statistical models. rstan Bayesian inference using Hamiltonian Monte Carlo. Very flexible model building. MCMCpack Bayesian inference. rmarkdown, knitr, and bookdown Dynamic reports (for example such as this one). devtools Package development. 
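A package from this list is installed once and then loaded at the start of each session, for example (dplyr chosen arbitrarily):

```r
# Install once (commented out so that re-running the script does not reinstall),
# then load the package in every session that uses it.
# install.packages("dplyr")
library(dplyr)
```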
A.4 R basics A.4.1 Variables and types Important information and tips: no type declaration define variables with <- instead of = (although both work, there is a slight difference, additionally most of the packages use the arrow) for strings use \"\" for comments use # change types with as.type() functions no special type for single character like C++ for example n <- 20 x <- 2.7 m <- n # m gets value 20 my_flag <- TRUE student_name <- "Luka" typeof(n) ## [1] "double" typeof(student_name) ## [1] "character" typeof(my_flag) ## [1] "logical" typeof(as.integer(n)) ## [1] "integer" typeof(as.character(n)) ## [1] "character" A.4.2 Basic operations n + x ## [1] 22.7 n - x ## [1] 17.3 diff <- n - x # variable diff gets the difference between n and x diff ## [1] 17.3 n * x ## [1] 54 n / x ## [1] 7.407407 x^2 ## [1] 7.29 sqrt(x) ## [1] 1.643168 n > 2 * n ## [1] FALSE n == n ## [1] TRUE n == 2 * n ## [1] FALSE n != n ## [1] FALSE paste(student_name, "is", n, "years old") ## [1] "Luka is 20 years old" A.4.3 Vectors use c() to combine elements into vectors can only contain one type of variable if different types are provided, all are transformed to the most basic type in the vector access elements by indexes or logical vectors of the same length a scalar value is regarded as a vector of length 1 1:4 # creates a vector of integers from 1 to 4 ## [1] 1 2 3 4 student_ages <- c(20, 23, 21) student_names <- c("Luke", "Jen", "Mike") passed <- c(TRUE, TRUE, FALSE) length(student_ages) ## [1] 3 # access by index student_ages[2] ## [1] 23 student_ages[1:2] ## [1] 20 23 student_ages[2] <- 24 # change values # access by logical vectors student_ages[passed == TRUE] # same as student_ages[passed] ## [1] 20 24 student_ages[student_names %in% c("Luke", "Mike")] ## [1] 20 21 student_names[student_ages > 20] ## [1] "Jen" "Mike" A.4.3.1 Operations with vectors most operations are element-wise if we operate on vectors of different lengths, the shorter vector periodically repeats its elements until it reaches the length of the longer one a <- c(1, 3, 5) b <- c(2, 2, 1) d <- c(6, 7) a + b ## [1] 3 5 6 a * b ## [1] 2 6 5 a + d ## Warning in a + d: longer object length is not a multiple of shorter object ## length ## [1] 7 10 11 a + 2 * b ## [1] 5 7 7 a > b ## [1] FALSE TRUE TRUE b == a ## [1] FALSE FALSE FALSE a %*% b # vector multiplication, not element-wise ## [,1] ## [1,] 13 A.4.4 Factors vectors of finite predetermined classes suitable for categorical variables ordinal (ordered) or nominal (unordered) car_brand <- factor(c("Audi", "BMW", "Mercedes", "BMW"), ordered = FALSE) car_brand ## [1] Audi BMW Mercedes BMW ## Levels: Audi BMW Mercedes freq <- factor(x = NA, levels = c("never","rarely","sometimes","often","always"), ordered = TRUE) freq[1:3] <- c("rarely", "sometimes", "rarely") freq ## [1] rarely sometimes rarely ## Levels: never < rarely < sometimes < often < always freq[4] <- "quite_often" # non-existing level, returns NA ## Warning in `[<-.factor`(`*tmp*`, 4, value = "quite_often"): invalid factor ## level, NA generated freq ## [1] rarely sometimes rarely <NA> ## Levels: never < rarely < sometimes < often < always A.4.5 Matrices two-dimensional generalizations of vectors my_matrix <- matrix(c(1, 2, 1, 5, 4, 2), nrow = 2, byrow = TRUE) my_matrix ## [,1] [,2] [,3] ## [1,] 1 2 1 ## [2,] 5 4 2 my_square_matrix <- matrix(c(1, 3, 2, 3), nrow = 2) my_square_matrix ## [,1] [,2] ## [1,] 1 2 ## [2,] 3 3 my_matrix[1,2] # first row, second column ## [1] 2 my_matrix[2, ] # second row ## [1] 5 4 2 my_matrix[ ,3] # third 
column ## [1] 1 2 A.4.5.1 Matrix functions and operations most operation element-wise mind the dimensions when using matrix multiplication %*% nrow(my_matrix) # number of matrix rows ## [1] 2 ncol(my_matrix) # number of matrix columns ## [1] 3 dim(my_matrix) # matrix dimension ## [1] 2 3 t(my_matrix) # transpose ## [,1] [,2] ## [1,] 1 5 ## [2,] 2 4 ## [3,] 1 2 diag(my_matrix) # the diagonal of the matrix as vector ## [1] 1 4 diag(1, nrow = 3) # creates a diagonal matrix ## [,1] [,2] [,3] ## [1,] 1 0 0 ## [2,] 0 1 0 ## [3,] 0 0 1 det(my_square_matrix) # matrix determinant ## [1] -3 my_matrix + 2 * my_matrix ## [,1] [,2] [,3] ## [1,] 3 6 3 ## [2,] 15 12 6 my_matrix * my_matrix # element-wise multiplication ## [,1] [,2] [,3] ## [1,] 1 4 1 ## [2,] 25 16 4 my_matrix %*% t(my_matrix) # matrix multiplication ## [,1] [,2] ## [1,] 6 15 ## [2,] 15 45 my_vec <- as.vector(my_matrix) # transform to vector my_vec ## [1] 1 5 2 4 1 2 A.4.6 Arrays multi-dimensional generalizations of matrices my_array <- array(c(1, 2, 3, 4, 5, 6, 7, 8), dim = c(2, 2, 2)) my_array[1, 1, 1] ## [1] 1 my_array[2, 2, 1] ## [1] 4 my_array[1, , ] ## [,1] [,2] ## [1,] 1 5 ## [2,] 3 7 dim(my_array) ## [1] 2 2 2 A.4.7 Data frames basic data structure for analysis differ from matrices as columns can be of different types student_data <- data.frame("Name" = student_names, "Age" = student_ages, "Pass" = passed) student_data ## Name Age Pass ## 1 Luke 20 TRUE ## 2 Jen 24 TRUE ## 3 Mike 21 FALSE colnames(student_data) <- c("name", "age", "pass") # change column names student_data[1, ] ## name age pass ## 1 Luke 20 TRUE student_data[ ,colnames(student_data) %in% c("name", "pass")] ## name pass ## 1 Luke TRUE ## 2 Jen TRUE ## 3 Mike FALSE student_data$pass # access column by name ## [1] TRUE TRUE FALSE student_data[student_data$pass == TRUE, ] ## name age pass ## 1 Luke 20 TRUE ## 2 Jen 24 TRUE A.4.8 Lists useful for storing different data structures access elements with double square brackets elements can be named first_list <- list(student_ages, my_matrix, student_data) second_list <- list(student_ages, my_matrix, student_data, first_list) first_list[[1]] ## [1] 20 24 21 second_list[[4]] ## [[1]] ## [1] 20 24 21 ## ## [[2]] ## [,1] [,2] [,3] ## [1,] 1 2 1 ## [2,] 5 4 2 ## ## [[3]] ## name age pass ## 1 Luke 20 TRUE ## 2 Jen 24 TRUE ## 3 Mike 21 FALSE second_list[[4]][[1]] # first element of the fourth element of second_list ## [1] 20 24 21 length(second_list) ## [1] 4 second_list[[length(second_list) + 1]] <- "add_me" # append an element names(first_list) <- c("Age", "Matrix", "Data") first_list$Age ## [1] 20 24 21 A.4.9 Loops mostly for loop for loop can iterate over an arbitrary vector # iterate over consecutive natural numbers my_sum <- 0 for (i in 1:10) { my_sum <- my_sum + i } my_sum ## [1] 55 # iterate over an arbirary vector my_sum <- 0 some_numbers <- c(2, 3.5, 6, 100) for (i in some_numbers) { my_sum <- my_sum + i } my_sum ## [1] 111.5 A.5 Functions for help use ?function_name A.5.1 Writing functions We can write our own functions with function(). In the brackets, we define the parameters the function gets, and in curly brackets we define what the function does. We use return() to return values. sum_first_n_elements <- function (n) { my_sum <- 0 for (i in 1:n) { my_sum <- my_sum + i } return (my_sum) } sum_first_n_elements(10) ## [1] 55 A.6 Other tips Use set.seed(arbitrary_number) at the beginning of a script to set the seed and ensure replication. 
To dynamically set the working directory in R Studio to the parent folder of a R script use setwd(dirname(rstudioapi::getSourceEditorContext()$path)). To avoid slow R loops use the apply family of functions. See ?apply and ?lapply. To make your data manipulation (and therefore your life) a whole lot easier, use the dplyr package. Use getAnywhere(function_name) to get the source code of any function. Use browser for debugging. See ?browser. A.7 Further reading and references Getting started with R Studio: https://www.youtube.com/watch?v=lVKMsaWju8w Official R manuals: https://cran.r-project.org/manuals.html Cheatsheets: https://www.rstudio.com/resources/cheatsheets/ Workshop on R, dplyr, ggplot2, and R Markdown: https://github.com/bstatcomp/Rworkshop "],["distributions.html", "B Probability distributions", " B Probability distributions Name parameters support pdf/pmf mean variance Bernoulli \\(p \\in [0,1]\\) \\(k \\in \\{0,1\\}\\) \\(p^k (1 - p)^{1 - k}\\) 1.12 \\(p\\) 7.1 \\(p(1-p)\\) 7.1 binomial \\(n \\in \\mathbb{N}\\), \\(p \\in [0,1]\\) \\(k \\in \\{0,1,\\dots,n\\}\\) \\(\\binom{n}{k} p^k (1 - p)^{n - k}\\) 4.4 \\(np\\) 7.2 \\(np(1-p)\\) 7.2 Poisson \\(\\lambda > 0\\) \\(k \\in \\mathbb{N}_0\\) \\(\\frac{\\lambda^k e^{-\\lambda}}{k!}\\) 4.6 \\(\\lambda\\) 7.3 \\(\\lambda\\) 7.3 geometric \\(p \\in (0,1]\\) \\(k \\in \\mathbb{N}_0\\) \\(p(1-p)^k\\) 4.5 \\(\\frac{1 - p}{p}\\) 7.4 \\(\\frac{1 - p}{p^2}\\) 9.3 normal \\(\\mu \\in \\mathbb{R}\\), \\(\\sigma^2 > 0\\) \\(x \\in \\mathbb{R}\\) \\(\\frac{1}{\\sqrt{2 \\pi \\sigma^2}} e^{-\\frac{(x - \\mu)^2}{2 \\sigma^2}}\\) 4.12 \\(\\mu\\) 7.8 \\(\\sigma^2\\) 7.8 uniform \\(a,b \\in \\mathbb{R}\\), \\(a < b\\) \\(x \\in [a,b]\\) \\(\\frac{1}{b-a}\\) 4.9 \\(\\frac{a+b}{2}\\) \\(\\frac{(b-a)^2}{12}\\) beta \\(\\alpha,\\beta > 0\\) \\(x \\in [0,1]\\) \\(\\frac{x^{\\alpha - 1} (1 - x)^{\\beta - 1}}{\\text{B}(\\alpha, \\beta)}\\) 4.10 \\(\\frac{\\alpha}{\\alpha + \\beta}\\) 7.6 \\(\\frac{\\alpha \\beta}{(\\alpha + \\beta)^2(\\alpha + \\beta + 1)}\\) 7.6 gamma \\(\\alpha,\\beta > 0\\) \\(x \\in (0, \\infty)\\) \\(\\frac{\\beta^\\alpha}{\\Gamma(\\alpha)} x^{\\alpha - 1}e^{-\\beta x}\\) 4.11 \\(\\frac{\\alpha}{\\beta}\\) 7.5 \\(\\frac{\\alpha}{\\beta^2}\\) 7.5 exponential \\(\\lambda > 0\\) \\(x \\in [0, \\infty)\\) \\(\\lambda e^{-\\lambda x}\\) 4.8 \\(\\frac{1}{\\lambda}\\) 7.7 \\(\\frac{1}{\\lambda^2}\\) 7.7 logistic \\(\\mu \\in \\mathbb{R}\\), \\(s > 0\\) \\(x \\in \\mathbb{R}\\) \\(\\frac{e^{-\\frac{x - \\mu}{s}}}{s(1 + e^{-\\frac{x - \\mu}{s}})^2}\\) 4.13 \\(\\mu\\) \\(\\frac{s^2 \\pi^2}{3}\\) negative binomial \\(r \\in \\mathbb{N}\\), \\(p \\in [0,1]\\) \\(k \\in \\mathbb{N}_0\\) \\(\\binom{k + r - 1}{k}(1-p)^r p^k\\) 4.7 \\(\\frac{rp}{1 - p}\\) 9.2 \\(\\frac{rp}{(1 - p)^2}\\) 9.2 multinomial \\(n \\in \\mathbb{N}\\), \\(k \\in \\mathbb{N}\\) \\(p_i \\in [0,1]\\), \\(\\sum p_i = 1\\) \\(x_i \\in \\{0,..., n\\}\\), \\(i \\in \\{1,...,k\\}\\), \\(\\sum{x_i} = n\\) \\(\\frac{n!}{x_1!x_2!...x_k!} p_1^{x_1} p_2^{x_2}...p_k^{x_k}\\) 8.1 \\(np_i\\) \\(np_i(1-p_i)\\) "],["references.html", "References", " References "],["404.html", "Page not found", " Page not found The page you requested cannot be found (perhaps it was moved or renamed). You may want to try searching to find the page's new location, or use the table of contents to find the page you are looking for. 
"]] From 50860e238c59e80b46e16fcc2a04f902a0a992ee Mon Sep 17 00:00:00 2001 From: LeonHvastja Date: Mon, 2 Oct 2023 16:07:54 +0200 Subject: [PATCH 7/7] fixes after review, authors add --- 18-distributions_intuition.Rmd | 44 +++++++++++++----------------- docs/404.html | 4 +-- docs/A1.html | 4 +-- docs/ard.html | 4 +-- docs/bi.html | 4 +-- docs/boot.html | 4 +-- docs/ci.html | 4 +-- docs/condprob.html | 4 +-- docs/crv.html | 4 +-- docs/distributions-intutition.html | 34 ++++++++++------------- docs/distributions.html | 4 +-- docs/eb.html | 4 +-- docs/ev.html | 4 +-- docs/index.html | 8 +++--- docs/integ.html | 4 +-- docs/introduction.html | 4 +-- docs/lt.html | 4 +-- docs/ml.html | 4 +-- docs/mrv.html | 4 +-- docs/mrvs.html | 4 +-- docs/nhst.html | 4 +-- docs/references.html | 4 +-- docs/rvs.html | 4 +-- docs/search_index.json | 2 +- docs/uprobspaces.html | 4 +-- index.Rmd | 2 +- 26 files changed, 82 insertions(+), 92 deletions(-) diff --git a/18-distributions_intuition.Rmd b/18-distributions_intuition.Rmd index 5a57781..0704669 100644 --- a/18-distributions_intuition.Rmd +++ b/18-distributions_intuition.Rmd @@ -41,10 +41,10 @@ $(document).ready(function() { ## Discrete distributions ```{exercise, name = "Bernoulli intuition 1"} -Arguably the simplest distribution you will encounter is the Bernoulli distribution. +The simplest distribution you will encounter is the Bernoulli distribution. It is a discrete probability distribution used to represent the outcome of a yes/no -question. It has one parameter $p$ which is the probability of success. The -probability of failure is $(1-p)$, sometimes denoted as $q$. +question. It has one parameter $0 \leq p \leq 1$, which is the probability of success. The +probability of failure is $q = (1-p)$. A classic way to think about a Bernoulli trial (a yes/no experiment) is a coin flip. Real coins are fair, meaning the probability of either heads (1) @@ -101,7 +101,7 @@ is divisible by 3: $C = \{3, 6\}$ satisfies this condition. ```{exercise, name = "Binomial intuition 1"} The binomial distribution is a generalization of the Bernoulli distribution. -Instead of considering a single Bernoulli trial, we now consider a sequence of $n$ trials, +Instead of considering a single Bernoulli trial, we now consider a sum of a sequence of $n$ trials, which are independent and have the same parameter $p$. So the binomial distribution has two parameters $n$ - the number of trials and $p$ - the probability of success for each trial. @@ -118,7 +118,7 @@ a. Take the [pmf of the binomial distribution](#distributions) and plug in $n=1 check that it is in fact equivalent to a Bernoulli distribution. b. In our examples we show the graph of a binomial distribution over 10 trials with -$p=0.8$. If we take a look at the graph, it appears as though the probabilities of getting 0,1,2 or 3 +$p=0.8$. If we take a look at the graph, it appears as though the probabilities of getting 0,1, 2 or 3 heads in 10 flips are zero. Is it actually zero? Check by plugging in the values into the pmf. ``` @@ -284,8 +284,8 @@ failures** before the first success in a sequence of independent Bernoulli trial It has a single parameter $p$, representing the probability of success and its support is all non-negative integers $\{0,1,2,\dots\}$. -NOTE: There are two forms of this distribution, the one we just described -and another that models the **number of trials** before the first success. 
The +NOTE: There is an alternative way to think about this distribution, one that models +the **number of trials** before the first success. The difference is subtle yet significant and you are likely to encounter both forms. The key to telling them apart is to check their support, since the number of trials has to be at least $1$, for this case we have $\{1,2,\dots\}$. @@ -299,7 +299,7 @@ probability. a) Create an equivalent graph that represents the probability of rolling a 6 with a fair 6-sided die. b) Use the formula for the [mean](#distributions) of the geometric distribution and determine the average number of **failures** before you roll a 6. -c) Look up the alternative form of the geometric distribtuion and again use the formula for the mean to determine the average number of **trials** before you roll a 6. +c) Look up the alternative form of the geometric distribtuion and again use the formula for the mean to determine the average number of **trials** up to and including rolling a 6. ``` @@ -371,7 +371,7 @@ The need for a randomness is a common problem. A practical solution are so-calle It has two parameters $a$ and $b$, which define the beginning and end of its support respectively. -a) Let's think of the mean intuitively. The expected value or mean of a distribution is the pivot point on our x-axis, which "balances" the graph. Given parameters $a$ and $b$ what is your intuitive guess of the mean for this distribution? +a) Let's think about the mean intuitively. Think of the area under the graph as a geometric shape. The expected value or mean of a distribution is the x-axis value of its center of mass. Given parameters $a$ and $b$ what is your intuitive guess of the mean for the uniform distribution? b) A special case of the uniform distribution is the **standard uniform distribution** with $a=0$ and $b=1$. Write the pdf $f(x)$ of this particular distribution. ``` ```{r, fig.width=5, fig.height=3, echo=FALSE, warning=FALSE, message=FALSE} @@ -398,7 +398,7 @@ print(p) ```
        ```{solution, echo = togs} -a. It's the midpoint between $a$ and $b$, so $\frac{a+b}{2}$ +a. The center of mass is the center of the square from $a$ to $b$ and from 0 to $\frac{1}{b-a}$. Its value on the x-axis is the midpoint between $a$ and $b$, so $\frac{a+b}{2}$ b. Inserting the parameter values we get:$$f(x) = \begin{cases} 1 & \text{if } 0 \leq x \leq 1 \\ @@ -517,11 +517,9 @@ Below you've been provided with some code that you can copy into Rstudio. Once y Play around with the parameters to get: -a) A straight line from (0,0) to (1,2) -b) A straight line from (0,2) to (1,0) -c) A symmetric bell curve -d) A bowl-shaped curve -e) The standard uniform distribution is actually a special case of the beta distribution. Find the exact parameters $\alpha$ and $\beta$. Once you do, prove the equality by inserting the values into our pdf. +a) A symmetric bell curve +b) A bowl-shaped curve +c) The standard uniform distribution is actually a special case of the beta distribution. Find the exact parameters $\alpha$ and $\beta$. Once you do, prove the equality by inserting the values into our pdf. *Hint*: The beta function is evaluated as $\text{B}(a,b) = \frac{\Gamma(a)\Gamma(b)}{\Gamma(a+b)}$, the gamma function for **positive integers** $n$ is evaluated as $\Gamma(n)= (n-1)!$ @@ -566,15 +564,11 @@ shinyApp(ui = ui, server = server) ```
        ```{solution, echo = togs} - a) $\alpha = 2, \beta=1$ + a) Possible solution $\alpha = \beta= 5$ - b) $\alpha = 1, \beta=2$ - - c) Possible solution $\alpha = \beta= 5$ - - d) Possible solution $\alpha = \beta= 0.5$ + b) Possible solution $\alpha = \beta= 0.5$ - e) The correct parameters are $\alpha = 1, \beta=1$, to prove the equality we insert them into the beta pdf: + c) The correct parameters are $\alpha = 1, \beta=1$, to prove the equality we insert them into the beta pdf: $$\frac{x^{\alpha - 1} (1 - x)^{\beta - 1}}{\text{B}(\alpha, \beta)} = \frac{x^{1 - 1} (1 - x)^{1 - 1}}{\text{B}(1, 1)} = \frac{1}{\frac{\Gamma(1)\Gamma(1)}{\Gamma(1+1)}}= @@ -593,7 +587,7 @@ a) What is the mean time between phone calls? The cdf $F(x)$ tells us what percentage of calls occur within x amount of time of each other. -b) You want to take an hour long lunch break but are worried about missing calls. Calculate the percentage of calls you are likely to miss if you're gone for an hour. Hint: The cdf is $F(x) = \int_{-\infty}^{x} f(x) dx$ +b) You want to take an hour long lunch break but are worried about missing calls. Calculate the probability of missing at least one call if you're gone for an hour. Hint: The cdf is $F(x) = \int_{-\infty}^{x} f(x) dx$ ```
        @@ -612,12 +606,12 @@ b. First we derive the CDF, we can integrate from 0 instead of $-\infty$, since Then we just evaluate it for a time of 1 hour: $$F(1 \text{ hour}) = 1 - e^{-\frac{1 \text{ call}}{3.2 \text{ hours}} \cdot 1 \text{ hour}}= 1 - e^{-\frac{1 \text{ call}}{3.2 \text{ hours}}} \approx 0.268$$ - So we have about a 27% chance of missing a call if we're gone for an hour. + So we have about a 27% chance of missing at least one call if we're gone for an hour. ```
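The 0.268 figure can also be read off directly from R's exponential CDF, assuming the hourly rate of 1/3.2 calls derived above (a one-line check, not part of the original solution):

```r
# P(at least one call during a one-hour break), with rate 1/3.2 calls per hour.
pexp(1, rate = 1 / 3.2)  # ~0.268
```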
        ```{exercise, name = "Gamma intuition 1"} -The gamma distribution is a continuous distribution characterized by two parameters, $\alpha$ and $\beta$, both greater than 0. These parameters afford the distribution a broad range of shapes, leading to it being commonly referred to as a *family of distributions*. Given its support over the positive real numbers, it is well suited for modeling a diverse range of positive-valued phenomena. +The gamma distribution is a continuous distribution with by two parameters, $\alpha$ and $\beta$, both greater than 0. These parameters afford the distribution a broad range of shapes, leading to it being commonly referred to as a *family of distributions*. Given its support over the positive real numbers, it is well suited for modeling a diverse range of positive-valued phenomena. a) The exponential distribution is actually just a particular form of the gamma distribution. What are the values of $\alpha$ and $\beta$? b) Copy the code from our beta distribution Shiny app and modify it to simulate the gamma distribution. Then get it to show the exponential. diff --git a/docs/404.html b/docs/404.html index fd15a7d..16745c2 100644 --- a/docs/404.html +++ b/docs/404.html @@ -20,10 +20,10 @@ - + - + diff --git a/docs/A1.html b/docs/A1.html index f3b6fd8..80d8875 100644 --- a/docs/A1.html +++ b/docs/A1.html @@ -20,10 +20,10 @@ - + - + diff --git a/docs/ard.html b/docs/ard.html index 82eb67b..26a3e7c 100644 --- a/docs/ard.html +++ b/docs/ard.html @@ -20,10 +20,10 @@ - + - + diff --git a/docs/bi.html b/docs/bi.html index a164a4e..1597c8d 100644 --- a/docs/bi.html +++ b/docs/bi.html @@ -20,10 +20,10 @@ - + - + diff --git a/docs/boot.html b/docs/boot.html index 80f30f8..0a6b41e 100644 --- a/docs/boot.html +++ b/docs/boot.html @@ -20,10 +20,10 @@ - + - + diff --git a/docs/ci.html b/docs/ci.html index 4c0e9ac..10b0f07 100644 --- a/docs/ci.html +++ b/docs/ci.html @@ -20,10 +20,10 @@ - + - + diff --git a/docs/condprob.html b/docs/condprob.html index a3e195d..f8b7026 100644 --- a/docs/condprob.html +++ b/docs/condprob.html @@ -20,10 +20,10 @@ - + - + diff --git a/docs/crv.html b/docs/crv.html index 4d043c0..8db20b8 100644 --- a/docs/crv.html +++ b/docs/crv.html @@ -20,10 +20,10 @@ - + - + diff --git a/docs/distributions-intutition.html b/docs/distributions-intutition.html index da87a6e..6467605 100644 --- a/docs/distributions-intutition.html +++ b/docs/distributions-intutition.html @@ -20,10 +20,10 @@ - + - + @@ -327,10 +327,10 @@

        Chapter 18 Distributions intutiti

        18.1 Discrete distributions

        -

        Exercise 18.1 (Bernoulli intuition 1) Arguably the simplest distribution you will encounter is the Bernoulli distribution. +

        Exercise 18.1 (Bernoulli intuition 1) The simplest distribution you will encounter is the Bernoulli distribution. It is a discrete probability distribution used to represent the outcome of a yes/no -question. It has one parameter \(p\) which is the probability of success. The -probability of failure is \((1-p)\), sometimes denoted as \(q\).

        +question. It has one parameter \(0 \leq p \leq 1\), which is the probability of success. The +probability of failure is \(q = (1-p)\).

        A classic way to think about a Bernoulli trial (a yes/no experiment) is a coin flip. Real coins are fair, meaning the probability of either heads (1) or tails (0) are the same, so \(p=0.5\) as shown below in figure a. Alternatively @@ -364,7 +364,7 @@

        18.1 Discrete distributions

        Exercise 18.2 (Binomial intuition 1) The binomial distribution is a generalization of the Bernoulli distribution. -Instead of considering a single Bernoulli trial, we now consider a sequence of \(n\) trials, +Instead of considering a single Bernoulli trial, we now consider a sum of a sequence of \(n\) trials, which are independent and have the same parameter \(p\). So the binomial distribution has two parameters \(n\) - the number of trials and \(p\) - the probability of success for each trial.

        @@ -378,7 +378,7 @@

        18.1 Discrete distributions

        Take the pmf of the binomial distribution and plug in \(n=1\), check that it is in fact equivalent to a Bernoulli distribution.

      2. In our examples we show the graph of a binomial distribution over 10 trials with -\(p=0.8\). If we take a look at the graph, it appears as though the probabilities of getting 0,1,2 or 3 +\(p=0.8\). If we take a look at the graph, it appears as though the probabilities of getting 0,1, 2 or 3 heads in 10 flips are zero. Is it actually zero? Check by plugging in the values into the pmf.

      @@ -478,8 +478,8 @@

      18.1 Discrete distributions before the first success in a sequence of independent Bernoulli trials. It has a single parameter \(p\), representing the probability of success and its support is all non-negative integers \(\{0,1,2,\dots\}\).

      -

      NOTE: There are two forms of this distribution, the one we just described -and another that models the number of trials before the first success. The +

      NOTE: There is an alternative way to think about this distribution, one that models +the number of trials before the first success. The difference is subtle yet significant and you are likely to encounter both forms. The key to telling them apart is to check their support, since the number of trials has to be at least \(1\), for this case we have \(\{1,2,\dots\}\).

      @@ -492,7 +492,7 @@

      18.1 Discrete distributions
    1. Create an equivalent graph that represents the probability of rolling a 6 with a fair 6-sided die.
    2. Use the formula for the mean of the geometric distribution and determine the average number of failures before you roll a 6.
    3. -
    4. Look up the alternative form of the geometric distribtuion and again use the formula for the mean to determine the average number of trials before you roll a 6.
    5. +
    6. Look up the alternative form of the geometric distribtuion and again use the formula for the mean to determine the average number of trials up to and including rolling a 6.

    @@ -539,7 +539,7 @@

    18.2 Continuous distributionsExercise 18.6 (Uniform intuition 1) The need for a randomness is a common problem. A practical solution are so-called random number generators (RNGs). The simplest RNG one would think of is choosing a set of numbers and having the generator return a number at random, where the probability of returning any number from this set is the same. If this set is an interval of real numbers, then we’ve basically described the continuous uniform distribution.

    It has two parameters \(a\) and \(b\), which define the beginning and end of its support respectively.
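    R's built-in runif is exactly this kind of generator; a tiny illustration (the interval \([2, 7]\) is arbitrary):

```r
runif(5, min = 2, max = 7)   # five draws, uniformly distributed on [2, 7]
```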

      -
    1. Let’s think of the mean intuitively. The expected value or mean of a distribution is the pivot point on our x-axis, which “balances” the graph. Given parameters \(a\) and \(b\) what is your intuitive guess of the mean for this distribution?
    2. +
    3. Let’s think about the mean intuitively. Think of the area under the graph as a geometric shape. The expected value or mean of a distribution is the x-axis value of its center of mass. Given parameters \(a\) and \(b\), what is your intuitive guess of the mean for the uniform distribution?
    4. A special case of the uniform distribution is the standard uniform distribution with \(a=0\) and \(b=1\). Write the pdf \(f(x)\) of this particular distribution.

    @@ -548,7 +548,7 @@

    18.2 Continuous distributions

    Solution.

      -
    1. It’s the midpoint between \(a\) and \(b\), so \(\frac{a+b}{2}\)
    2. +
    3. The center of mass is the center of the rectangle that extends from \(a\) to \(b\) on the x-axis and from 0 to \(\frac{1}{b-a}\) on the y-axis. Its value on the x-axis is the midpoint between \(a\) and \(b\), so \(\frac{a+b}{2}\).
    4. Inserting the parameter values we get: \[f(x) = \begin{cases} 1 & \text{if } 0 \leq x \leq 1 \\ 0 & \text{otherwise.} \end{cases}\] @@ -630,8 +630,6 @@
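    A simulation sketch that supports the center-of-mass intuition (the endpoints 2 and 10 are arbitrary; only base R is assumed):

```r
set.seed(1)
x <- runif(1e5, min = 2, max = 10)
mean(x)        # close to (2 + 10) / 2 = 6
```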

      18.2 Continuous distributions

      Below you’ve been provided with some code that you can copy into RStudio. Once you run the code, an interactive Shiny app will appear and you will be able to manipulate the graph of the beta distribution (a plain base-R alternative is sketched after the list below).

      Play around with the parameters to get:

        -
      1. A straight line from (0,0) to (1,2)
      2. -
      3. A straight line from (0,2) to (1,0)
      4. A symmetric bell curve
      5. A bowl-shaped curve
      6. The standard uniform distribution is actually a special case of the beta distribution. Find the exact parameters \(\alpha\) and \(\beta\). Once you do, prove the equality by inserting the values into our pdf.
      7. @@ -678,8 +676,6 @@
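      If you prefer not to run a Shiny app, a minimal non-interactive sketch with base R's dbeta and curve gives the same kind of picture; the parameter values below are just examples to try:

```r
alphas <- c(2, 1, 5, 0.5)
betas  <- c(1, 2, 5, 0.5)
par(mfrow = c(2, 2))
for (i in seq_along(alphas)) {
  curve(dbeta(x, alphas[i], betas[i]), from = 0, to = 1,
        main = paste0("alpha = ", alphas[i], ", beta = ", betas[i]),
        ylab = "density")
}
```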

        18.2 Continuous distributions

        Solution.

          -
        1. \(\alpha = 2, \beta=1\)

        2. -
        3. \(\alpha = 1, \beta=2\)

        4. Possible solution \(\alpha = \beta= 5\)

        5. Possible solution \(\alpha = \beta= 0.5\)

        6. The correct parameters are \(\alpha = 1, \beta=1\); to prove the equality we insert them into the beta pdf: @@ -699,7 +695,7 @@
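        A one-line numerical cross-check of this equality in R (purely illustrative):

```r
x <- c(0.1, 0.5, 0.9)
dbeta(x, shape1 = 1, shape2 = 1)   # all equal to 1
dunif(x, min = 0, max = 1)         # identical values
```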

          18.2 Continuous distributions

          The cdf \(F(x)\) tells us what percentage of calls occur within \(x\) units of time of each other.

            -
          1. You want to take an hour-long lunch break but are worried about missing calls. Calculate the percentage of calls you are likely to miss if you’re gone for an hour. Hint: The cdf is \(F(x) = \int_{-\infty}^{x} f(t)\, dt\)
          2. +
          3. You want to take an hour-long lunch break but are worried about missing calls. Calculate the probability of missing at least one call if you’re gone for an hour. Hint: The cdf is \(F(x) = \int_{-\infty}^{x} f(t)\, dt\) (a worked sketch follows below).
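          A sketch of the calculation in R, assuming purely for illustration that calls arrive at an average rate of \(\lambda = 2\) per hour; under that assumption the probability that at least one call arrives during a one-hour break is \(F(1) = 1 - e^{-\lambda \cdot 1}\):

```r
lambda <- 2                 # illustrative rate: 2 calls per hour (assumption)
pexp(1, rate = lambda)      # P(at least one call within 1 hour)
1 - exp(-lambda * 1)        # same value, written out explicitly
```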

    -

    Exercise 18.10 (Gamma intuition 1) The gamma distribution is a continuous distribution characterized by two parameters, \(\alpha\) and \(\beta\), both greater than 0. These parameters afford the distribution a broad range of shapes, leading to it being commonly referred to as a family of distributions. Given its support over the positive real numbers, it is well suited for modeling a diverse range of positive-valued phenomena.

    +

    Exercise 18.10 (Gamma intuition 1) The gamma distribution is a continuous distribution with two parameters, \(\alpha\) and \(\beta\), both greater than 0. These parameters afford the distribution a broad range of shapes, leading to it being commonly referred to as a family of distributions. Given its support over the positive real numbers, it is well suited for modeling a diverse range of positive-valued phenomena.

    1. The exponential distribution is actually just a particular form of the gamma distribution. What are the values of \(\alpha\) and \(\beta\)?
    2. Copy the code from our beta distribution Shiny app and modify it to display the gamma distribution. Then get it to show the exponential (see also the short numeric check below the list).
    3.
diff --git a/docs/distributions.html b/docs/distributions.html
index 24ad8ae..16bfc91 100644
--- a/docs/distributions.html
+++ b/docs/distributions.html
@@ -20,10 +20,10 @@
diff --git a/docs/eb.html b/docs/eb.html
index 09cd76b..bb372f7 100644
--- a/docs/eb.html
+++ b/docs/eb.html
@@ -20,10 +20,10 @@
diff --git a/docs/ev.html b/docs/ev.html
index c3e2dec..8938503 100644
--- a/docs/ev.html
+++ b/docs/ev.html
@@ -20,10 +20,10 @@
diff --git a/docs/index.html b/docs/index.html
index 589e5e1..de01699 100644
--- a/docs/index.html
+++ b/docs/index.html
@@ -20,10 +20,10 @@
@@ -299,8 +299,8 @@
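Returning to Exercise 18.10: a quick way to see the gamma-exponential relationship numerically is to compare the two densities directly. This is only a sketch; it uses R's shape/rate parameterization and an arbitrary rate of 2:

```r
x <- seq(0.1, 5, by = 0.1)
all.equal(dgamma(x, shape = 1, rate = 2), dexp(x, rate = 2))   # TRUE
```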

      Preface

      diff --git a/docs/integ.html b/docs/integ.html index 2d3e14f..2d51736 100644 --- a/docs/integ.html +++ b/docs/integ.html @@ -20,10 +20,10 @@ - + - + diff --git a/docs/introduction.html b/docs/introduction.html index f472b1d..468df61 100644 --- a/docs/introduction.html +++ b/docs/introduction.html @@ -20,10 +20,10 @@ - + - + diff --git a/docs/lt.html b/docs/lt.html index f70caad..10f550d 100644 --- a/docs/lt.html +++ b/docs/lt.html @@ -20,10 +20,10 @@ - + - + diff --git a/docs/ml.html b/docs/ml.html index 01a8d4c..b30e532 100644 --- a/docs/ml.html +++ b/docs/ml.html @@ -20,10 +20,10 @@ - + - + diff --git a/docs/mrv.html b/docs/mrv.html index e52e8ce..0ed2084 100644 --- a/docs/mrv.html +++ b/docs/mrv.html @@ -20,10 +20,10 @@ - + - + diff --git a/docs/mrvs.html b/docs/mrvs.html index d08c574..498fbea 100644 --- a/docs/mrvs.html +++ b/docs/mrvs.html @@ -20,10 +20,10 @@ - + - + diff --git a/docs/nhst.html b/docs/nhst.html index bc849af..7256cbc 100644 --- a/docs/nhst.html +++ b/docs/nhst.html @@ -20,10 +20,10 @@ - + - + diff --git a/docs/references.html b/docs/references.html index 5415593..0e6bfb9 100644 --- a/docs/references.html +++ b/docs/references.html @@ -20,10 +20,10 @@ - + - + diff --git a/docs/rvs.html b/docs/rvs.html index a9abfa1..73bf8f3 100644 --- a/docs/rvs.html +++ b/docs/rvs.html @@ -20,10 +20,10 @@ - + - + diff --git a/docs/search_index.json b/docs/search_index.json index 1068bd3..dae68a3 100644 --- a/docs/search_index.json +++ b/docs/search_index.json @@ -1 +1 @@ -[["index.html", "Principles of Uncertainty – exercises Preface", " Principles of Uncertainty – exercises Gregor Pirš and Erik Štrumbelj 2023-10-01 Preface These are the exercises for the Principles of Uncertainty course of the Data Science Master’s at University of Ljubljana, Faculty of Computer and Information Science. This document will be extended each week as the course progresses. At the end of each exercise session, we will post the solutions to the exercises worked in class and select exercises for homework. Students are also encouraged to solve the remaining exercises to further extend their knowledge. Some exercises require the use of R. Those exercises (or parts of) are coloured blue. Students that are not familiar with R programming language should study A to learn the basics. As the course progresses, we will cover more relevant uses of R for data science. "],["introduction.html", "Chapter 1 Probability spaces 1.1 Measure and probability spaces 1.2 Properties of probability measures 1.3 Discrete probability spaces", " Chapter 1 Probability spaces This chapter deals with measures and probability spaces. At the end of the chapter, we look more closely at discrete probability spaces. The students are expected to acquire the following knowledge: Theoretical Use properties of probability to calculate probabilities. Combinatorics. Understanding of continuity of probability. R Vectors and vector operations. For loop. Estimating probability with simulation. sample function. Matrices and matrix operations. .fold-btn { float: right; margin: 5px 5px 0 0; } .fold { border: 1px solid black; min-height: 40px; } 1.1 Measure and probability spaces Exercise 1.1 (Completing a set to a sigma algebra) Let \\(\\Omega = \\{1,2,...,10\\}\\) and let \\(A = \\{\\emptyset, \\{1\\}, \\{2\\}, \\Omega \\}\\). Show that \\(A\\) is not a sigma algebra of \\(\\Omega\\). Find the minimum number of elements to complete A to a sigma algebra of \\(\\Omega\\). Solution. 
\\(1^c = \\{2,3,...,10\\} \\notin A \\implies\\) \\(A\\) is not sigma algebra. First we need the complements of all elements, so we need to add sets \\(\\{2,3,...,10\\}\\) and \\(\\{1,3,4,...,10\\}\\). Next we need unions of all sets – we add the set \\(\\{1,2\\}\\). Again we need the complement of this set, so we add \\(\\{3,4,...,10\\}\\). So the minimum number of elements we need to add is 4. Exercise 1.2 (Diversity of sigma algebras) Let \\(\\Omega\\) be a set. Find the smallest sigma algebra of \\(\\Omega\\). Find the largest sigma algebra of \\(\\Omega\\). Solution. \\(A = \\{\\emptyset, \\Omega\\}\\) \\(2^{\\Omega}\\) Exercise 1.3 Find all sigma algebras for \\(\\Omega = \\{0, 1, 2\\}\\). Solution. \\(A = \\{\\emptyset, \\Omega\\}\\) \\(A = 2^{\\Omega}\\) \\(A = \\{\\emptyset, \\{0\\}, \\{1,2\\}, \\Omega\\}\\) \\(A = \\{\\emptyset, \\{1\\}, \\{0,2\\}, \\Omega\\}\\) \\(A = \\{\\emptyset, \\{2\\}, \\{0,1\\}, \\Omega\\}\\) Exercise 1.4 (Difference between algebra and sigma algebra) Let \\(\\Omega = \\mathbb{N}\\) and \\(\\mathcal{A} = \\{A \\subseteq \\mathbb{N}: A \\text{ is finite or } A^c \\text{ is finite.} \\}\\). Show that \\(\\mathcal{A}\\) is an algebra but not a sigma algebra. Solution. \\(\\emptyset\\) is finite so \\(\\emptyset \\in \\mathcal{A}\\). Let \\(A \\in \\mathcal{A}\\) and \\(B \\in \\mathcal{A}\\). If both are finite, then their union is also finite and therefore in \\(\\mathcal{A}\\). Let at least one of them not be finite. Then their union is not finite. But \\((A \\cup B)^c = A^c \\cap B^c\\). And since at least one is infinite, then its complement is finite and the intersection is too. So finite unions are in \\(\\mathcal{A}\\). Let us look at numbers \\(2n\\). For any \\(n\\), \\(2n \\in \\mathcal{A}\\) as it is finite. But \\(\\bigcup_{k = 1}^{\\infty} 2n \\notin \\mathcal{A}\\). Exercise 1.5 We define \\(\\sigma(X) = \\cap_{\\lambda \\in I} S_\\lambda\\) to be a sigma algebra, generated by the set \\(X\\), where \\(S_\\lambda\\) are all sigma algebras such that \\(X \\subseteq S_\\lambda\\). \\(S_\\lambda\\) are indexed by \\(\\lambda \\in I\\). Let \\(A, B \\subseteq 2^{\\Omega}\\). Prove that \\(\\sigma(A) = \\sigma(B) \\iff A \\subseteq \\sigma(B) \\land B \\subseteq \\sigma(A)\\). Solution. To prove the equivalence, we need to prove that the left hand side implies the right hand side and vice versa. Proving \\(\\sigma(A) = \\sigma(B) \\Rightarrow A \\subseteq \\sigma(B) \\land B \\subseteq \\sigma(A)\\): we know \\(A \\subseteq \\sigma(A)\\) is always true, so by substituting in \\(\\sigma(B)\\) from the left hand side equality we obtain \\(A \\subseteq \\sigma(B)\\). We obtain \\(B \\subseteq \\sigma(A)\\) by symmetry. This proves the implication. Proving \\(A \\subseteq \\sigma(B) \\land B \\subseteq \\sigma(A) \\Rightarrow \\sigma(A) = \\sigma(B)\\): by definition of a sigma algebra, generated by a set, we have \\(\\sigma(B) = \\cap_{\\lambda \\in I} S_\\lambda\\) where \\(S_\\lambda\\) are all sigma algebras where \\(B \\subseteq S_\\lambda\\). But \\(\\sigma(A)\\) is one of \\(S_\\lambda\\), so we can write \\(\\sigma(B) = \\sigma(A) \\cap \\left(\\cap_{\\lambda \\in I} S_\\lambda \\right)\\), which implies \\(\\sigma(B) \\subseteq \\sigma(A)\\). By symmetry, we have \\(\\sigma(A) \\subseteq \\sigma(B)\\). Since \\(\\sigma(A) \\subseteq \\sigma(B)\\) and \\(\\sigma(B) \\subseteq \\sigma(A)\\), we obtain \\(\\sigma(A) = \\sigma(B)\\), which proves the implication and completes the equivalence proof. 
Exercise 1.6 (Intro to measure) Take the measurable space \\(\\Omega = \\{1,2\\}\\), \\(F = 2^{\\Omega}\\). Which of the following is a measure? Which is a probability measure? \\(\\mu(\\emptyset) = 0\\), \\(\\mu(\\{1\\}) = 5\\), \\(\\mu(\\{2\\}) = 6\\), \\(\\mu(\\{1,2\\}) = 11\\) \\(\\mu(\\emptyset) = 0\\), \\(\\mu(\\{1\\}) = 0\\), \\(\\mu(\\{2\\}) = 0\\), \\(\\mu(\\{1,2\\}) = 1\\) \\(\\mu(\\emptyset) = 0\\), \\(\\mu(\\{1\\}) = 0\\), \\(\\mu(\\{2\\}) = 0\\), \\(\\mu(\\{1,2\\}) = 0\\) \\(\\mu(\\emptyset) = 0\\), \\(\\mu(\\{1\\}) = 0\\), \\(\\mu(\\{2\\}) = 1\\), \\(\\mu(\\{1,2\\}) = 1\\) \\(\\mu(\\emptyset)=0\\), \\(\\mu(\\{1\\})=0\\), \\(\\mu(\\{2\\})=\\infty\\), \\(\\mu(\\{1,2\\})=\\infty\\) Solution. Measure. Not probability measure since \\(\\mu(\\Omega) > 1\\). Neither due to countable additivity. Measure. Not probability measure since \\(\\mu(\\Omega) = 0\\). Probability measure. Measure. Not probability measure since \\(\\mu(\\Omega) > 1\\). Exercise 1.7 Define a probability space that could be used to model the outcome of throwing two fair 6-sided dice. Solution. \\(\\Omega = \\{\\{i,j\\}, i = 1,...,6, j = 1,...,6\\}\\) \\(F = 2^{\\Omega}\\) \\(\\forall \\omega \\in \\Omega\\), \\(P(\\omega) = \\frac{1}{6} \\times \\frac{1}{6} = \\frac{1}{36}\\) 1.2 Properties of probability measures Exercise 1.8 A standard deck (52 cards) is distributed to two persons: 26 cards to each person. All partitions are equally likely. Find the probability that: The first person gets 4 Queens. The first person gets at least 2 Queens. R: Use simulation (sample) to check the above answers. Solution. \\(\\frac{\\binom{48}{22}}{\\binom{52}{26}}\\) 1 - \\(\\frac{\\binom{48}{26} + 4 \\times \\binom{48}{25}}{\\binom{52}{26}}\\) For the simulation, let us represent cards with numbers from 1 to 52, and let 1 through 4 represent Queens. set.seed(1) cards <- 1:52 n <- 10000 q4 <- vector(mode = "logical", length = n) q2 <- vector(mode = "logical", length = n) tmp <- vector(mode = "logical", length = n) for (i in 1:n) { p1 <- sample(1:52, 26) q4[i] <- sum(1:4 %in% p1) == 4 q2[i] <- sum(1:4 %in% p1) >= 2 } sum(q4) / n ## [1] 0.0572 sum(q2) / n ## [1] 0.6894 Exercise 1.9 Let \\(A\\) and \\(B\\) be events with probabilities \\(P(A) = \\frac{2}{3}\\) and \\(P(B) = \\frac{1}{2}\\). Show that \\(\\frac{1}{6} \\leq P(A\\cap B) \\leq \\frac{1}{2}\\), and give examples to show that both extremes are possible. Find corresponding bounds for \\(P(A\\cup B)\\). R: Draw samples from the examples and show the probability bounds of \\(P(A \\cap B)\\) . Solution. From the properties of probability we have \\[\\begin{equation} P(A \\cup B) = P(A) + P(B) - P(A \\cap B) \\leq 1. \\end{equation}\\] From this follows \\[\\begin{align} P(A \\cap B) &\\geq P(A) + P(B) - 1 \\\\ &= \\frac{2}{3} + \\frac{1}{2} - 1 \\\\ &= \\frac{1}{6}, \\end{align}\\] which is the lower bound for the intersection. Conversely, we have \\[\\begin{equation} P(A \\cup B) = P(A) + P(B) - P(A \\cap B) \\geq P(A). \\end{equation}\\] From this follows \\[\\begin{align} P(A \\cap B) &\\leq P(B) \\\\ &= \\frac{1}{2}, \\end{align}\\] which is the upper bound for the intersection. For an example take a fair die. To achieve the lower bound let \\(A = \\{3,4,5,6\\}\\) and \\(B = \\{1,2,3\\}\\), then their intersection is \\(A \\cap B = \\{3\\}\\). To achieve the upper bound take \\(A = \\{1,2,3,4\\}\\) and $B = {1,2,3} $. For the bounds of the union we will use the results from the first part. 
Again from the properties of probability we have \\[\\begin{align} P(A \\cup B) &= P(A) + P(B) - P(A \\cap B) \\\\ &\\geq P(A) + P(B) - \\frac{1}{2} \\\\ &= \\frac{2}{3}. \\end{align}\\] Conversely \\[\\begin{align} P(A \\cup B) &= P(A) + P(B) - P(A \\cap B) \\\\ &\\leq P(A) + P(B) - \\frac{1}{6} \\\\ &= 1. \\end{align}\\] Therefore \\(\\frac{2}{3} \\leq P(A \\cup B) \\leq 1\\). We use sample in R: set.seed(1) n <- 10000 samps <- sample(1:6, n, replace = TRUE) # lower bound lb <- vector(mode = "logical", length = n) A <- c(1,2,3) B <- c(3,4,5,6) for (i in 1:n) { lb[i] <- samps[i] %in% A & samps[i] %in% B } sum(lb) / n ## [1] 0.1605 # upper bound ub <- vector(mode = "logical", length = n) A <- c(1,2,3) B <- c(1,2,3,4) for (i in 1:n) { ub[i] <- samps[i] %in% A & samps[i] %in% B } sum(ub) / n ## [1] 0.4913 Exercise 1.10 A fair coin is tossed repeatedly. Show that, with probability one, a head turns up sooner or later. Show similarly that any given finite sequence of heads and tails occurs eventually with probability one. Solution. \\[\\begin{align} P(\\text{no heads}) &= \\lim_{n \\rightarrow \\infty} P(\\text{no heads in first }n \\text{ tosses}) \\\\ &= \\lim_{n \\rightarrow \\infty} \\frac{1}{2^n} \\\\ &= 0. \\end{align}\\] For the second part, let us fix the given sequence of heads and tails of length \\(k\\) as \\(s\\). A probability that this happens in \\(k\\) tosses is \\(\\frac{1}{2^k}\\). \\[\\begin{align} P(s \\text{ occurs}) &= \\lim_{n \\rightarrow \\infty} P(s \\text{ occurs in first } nk \\text{ tosses}) \\end{align}\\] The right part of the upper equation is greater than if \\(s\\) occurs either in the first \\(k\\) tosses, second \\(k\\) tosses,…, \\(n\\)-th \\(k\\) tosses. Therefore \\[\\begin{align} P(s \\text{ occurs}) &\\geq \\lim_{n \\rightarrow \\infty} P(s \\text{ occurs in first } n \\text{ disjoint sequences of length } k) \\\\ &= \\lim_{n \\rightarrow \\infty} (1 - P(s \\text{ does not occur in first } n \\text{ disjoint sequences})) \\\\ &= 1 - \\lim_{n \\rightarrow \\infty} P(s \\text{ does not occur in first } n \\text{ disjoint sequences}) \\\\ &= 1 - \\lim_{n \\rightarrow \\infty} (1 - \\frac{1}{2^k})^n \\\\ &= 1. \\end{align}\\] Exercise 1.11 An Erdos-Renyi random graph \\(G(n,p)\\) is a model with \\(n\\) nodes, where each pair of nodes is connected with probability \\(p\\). Calculate the probability that there exists a node that is not connected to any other node in \\(G(4,0.6)\\). Show that the upper bound for the probability that there exist 2 nodes that are not connected to any other node for an arbitrary \\(G(n,p)\\) is \\(\\binom{n}{2} (1-p)^{2n - 3}\\). R: Estimate the probability from the first point using simulation. Solution. Let \\(A_i\\) be the event that the \\(i\\)-th node is not connected to any other node. Then our goal is to calculate \\(P(\\cup_{i=1}^n A_i)\\). Using the inclusion-exclusion principle, we get \\[\\begin{align} P(\\cup_{i=1}^n A_i) &= \\sum_i A_i - \\sum_{i<j} P(A_i \\cap A_j) + \\sum_{i<j<k} P(A_i \\cap A_j \\cap A_k) - P(A_1 \\cap A_2 \\cap A_3 \\cap A_4) \\\\ &=4 (1 - p)^3 - \\binom{4}{2} (1 - p)^5 + \\binom{4}{3} (1 - p)^6 - (1 - p)^6 \\\\ &\\approx 0.21. \\end{align}\\] Let \\(A_{ij}\\) be the event that nodes \\(i\\) and \\(j\\) are not connected to any other node. We are interested in \\(P(\\cup_{i<j}A_{ij})\\). By using Boole`s inequality, we get \\[\\begin{align} P(\\cup_{i<j}A_{ij}) \\leq \\sum_{i<j} P(A_{ij}). \\end{align}\\] What is the probability of \\(A_{ij}\\)? 
There need to be no connections to the \\(i\\)-th node to the remaining nodes (excluding \\(j\\)), the same for the \\(j\\)-th node, and there can be no connection between them. Therefore \\[\\begin{align} P(\\cup_{i<j}A_{ij}) &\\leq \\sum_{i<j} (1 - p)^{2(n-2) + 1} \\\\ &= \\binom{n}{2} (1 - p)^{2n - 3}. \\end{align}\\] set.seed(1) n_samp <- 100000 n <- 4 p <- 0.6 conn_samp <- vector(mode = "logical", length = n_samp) for (i in 1:n_samp) { tmp_mat <- matrix(data = 0, nrow = n, ncol = n) samp_conn <- sample(c(0,1), choose(4,2), replace = TRUE, prob = c(1 - p, p)) tmp_mat[lower.tri(tmp_mat)] <- samp_conn tmp_mat[upper.tri(tmp_mat)] <- t(tmp_mat)[upper.tri(t(tmp_mat))] not_conn <- apply(tmp_mat, 1, sum) if (any(not_conn == 0)) { conn_samp[i] <- TRUE } else { conn_samp[i] <- FALSE } } sum(conn_samp) / n_samp ## [1] 0.20565 1.3 Discrete probability spaces Exercise 1.12 Show that the standard measurable space on \\(\\Omega = \\{0,1,...,n\\}\\) equipped with binomial measure is a discrete probability space. Define another probability measure on this measurable space. Show that for \\(n=1\\) the binomial measure is the same as the Bernoulli measure. R: Draw 1000 samples from the binomial distribution \\(p=0.5\\), \\(n=20\\) (rbinom) and compare relative frequencies with theoretical probability measure. Solution. We need to show that the terms of \\(\\sum_{k=0}^n \\binom{n}{k} p^k (1 - p)^{n - k}\\) sum to 1. For that we use the binomial theorem \\(\\sum_{k=0}^n \\binom{n}{k} x^k y^{n-k} = (x + y)^n\\). So \\[\\begin{equation} \\sum_{k=0}^n \\binom{n}{k} p^k (1 - p)^{n - k} = (p + 1 - p)^n = 1. \\end{equation}\\] \\(P(\\{k\\}) = \\frac{1}{n + 1}\\). When \\(n=1\\) then \\(k \\in \\{0,1\\}\\). Inserting \\(n=1\\) into the binomial measure, we get \\(\\binom{1}{k}p^k (1-p)^{1 - k}\\). Now \\(\\binom{1}{1} = \\binom{1}{0} = 1\\), so the measure is \\(p^k (1-p)^{1 - k}\\), which is the Bernoulli measure. set.seed(1) library(ggplot2) library(dplyr) bin_samp <- rbinom(n = 1000, size = 20, prob = 0.5) bin_samp <- data.frame(x = bin_samp) %>% count(x) %>% mutate(n = n / 1000, type = "empirical_frequencies") %>% bind_rows(data.frame(x = 0:20, n = dbinom(0:20, size = 20, prob = 0.5), type = "theoretical_measure")) bin_plot <- ggplot(data = bin_samp, aes(x = x, y = n, fill = type)) + geom_bar(stat="identity", position = "dodge") plot(bin_plot) Exercise 1.13 Show that the standard measurable space on \\(\\Omega = \\{0,1,...,\\infty\\}\\) equipped with geometric measure is a discrete probability space, equipped with Poisson measure is a discrete probability space. Define another probability measure on this measurable space. R: Draw 1000 samples from the Poisson distribution \\(\\lambda = 10\\) (rpois) and compare relative frequencies with theoretical probability measure. Solution. \\(\\sum_{k = 0}^{\\infty} p(1 - p)^k = p \\sum_{k = 0}^{\\infty} (1 - p)^k = p \\frac{1}{1 - 1 + p} = 1\\). We used the formula for geometric series. \\(\\sum_{k = 0}^{\\infty} \\frac{\\lambda^k e^{-\\lambda}}{k!} = e^{-\\lambda} \\sum_{k = 0}^{\\infty} \\frac{\\lambda^k}{k!} = e^{-\\lambda} e^{\\lambda} = 1.\\) We used the Taylor expansion of the exponential function. Since we only have to define a probability measure, we could only assign probabilities that sum to one to a finite number of events in \\(\\Omega\\), and probability zero to the other infinite number of events. However to make this solution more educational, we will try to find a measure that assigns a non-zero probability to all events in \\(\\Omega\\). 
A good start for this would be to find a converging infinite series, as the probabilities will have to sum to one. One simple converging series is the geometric series \\(\\sum_{k=0}^{\\infty} p^k\\) for \\(|p| < 1\\). Let us choose an arbitrary \\(p = 0.5\\). Then \\(\\sum_{k=0}^{\\infty} p^k = \\frac{1}{1 - 0.5} = 2\\). To complete the measure, we have to normalize it, so it sums to one, therefore \\(P(\\{k\\}) = \\frac{0.5^k}{2}\\) is a probability measure on \\(\\Omega\\). We could make it even more difficult by making this measure dependent on some parameter \\(\\alpha\\), but this is out of the scope of this introductory chapter. set.seed(1) pois_samp <- rpois(n = 1000, lambda = 10) pois_samp <- data.frame(x = pois_samp) %>% count(x) %>% mutate(n = n / 1000, type = "empirical_frequencies") %>% bind_rows(data.frame(x = 0:25, n = dpois(0:25, lambda = 10), type = "theoretical_measure")) pois_plot <- ggplot(data = pois_samp, aes(x = x, y = n, fill = type)) + geom_bar(stat="identity", position = "dodge") plot(pois_plot) Exercise 1.14 Define a probability measure on \\((\\Omega = \\mathbb{Z}, 2^{\\mathbb{Z}})\\). Define a probability measure such that \\(P(\\omega) > 0, \\forall \\omega \\in \\Omega\\). R: Implement a random generator that will generate samples with the relative frequency that corresponds to your probability measure. Compare relative frequencies with theoretical probability measure . Solution. \\(P(0) = 1, P(\\omega) = 0, \\forall \\omega \\neq 0\\). \\(P(\\{k\\}) = \\sum_{k = -\\infty}^{\\infty} \\frac{p(1 - p)^{|k|}}{2^{1 - 1_0(k)}}\\), where \\(1_0(k)\\) is the indicator function, which equals to one if \\(k\\) is 0, and equals to zero in every other case. n <- 1000 geom_samps <- rgeom(n, prob = 0.5) sign_samps <- sample(c(FALSE, TRUE), size = n, replace = TRUE) geom_samps[sign_samps] <- -geom_samps[sign_samps] my_pmf <- function (k, p) { indic <- rep(1, length(k)) indic[k == 0] <- 0 return ((p * (1 - p)^(abs(k))) / 2^indic) } geom_samps <- data.frame(x = geom_samps) %>% count(x) %>% mutate(n = n / 1000, type = "empirical_frequencies") %>% bind_rows(data.frame(x = -10:10, n = my_pmf(-10:10, 0.5), type = "theoretical_measure")) geom_plot <- ggplot(data = geom_samps, aes(x = x, y = n, fill = type)) + geom_bar(stat="identity", position = "dodge") plot(geom_plot) Exercise 1.15 Define a probability measure on \\(\\Omega = \\{1,2,3,4,5,6\\}\\) with parameter \\(m \\in \\{1,2,3,4,5,6\\}\\), so that the probability of outcome at distance \\(1\\) from \\(m\\) is half of the probability at distance \\(0\\), at distance \\(2\\) is half of the probability at distance \\(1\\), etc. R: Implement a random generator that will generate samples with the relative frequency that corresponds to your probability measure. Compare relative frequencies with theoretical probability measure . Solution. 
\\(P(\\{k\\}) = \\frac{\\frac{1}{2}^{|m - k|}}{\\sum_{i=1}^6 \\frac{1}{2}^{|m - i|}}\\) n <- 10000 m <- 4 my_pmf <- function (k, m) { denom <- sum(0.5^abs(m - 1:6)) return (0.5^abs(m - k) / denom) } samps <- c() for (i in 1:n) { a <- sample(1:6, 1) a_val <- my_pmf(a, m) prob <- runif(1) if (prob < a_val) { samps <- c(samps, a) } } samps <- data.frame(x = samps) %>% count(x) %>% mutate(n = n / length(samps), type = "empirical_frequencies") %>% bind_rows(data.frame(x = 1:6, n = my_pmf(1:6, m), type = "theoretical_measure")) my_plot <- ggplot(data = samps, aes(x = x, y = n, fill = type)) + geom_bar(stat="identity", position = "dodge") plot(my_plot) "],["uprobspaces.html", "Chapter 2 Uncountable probability spaces 2.1 Borel sets 2.2 Lebesgue measure", " Chapter 2 Uncountable probability spaces This chapter deals with uncountable probability spaces. The students are expected to acquire the following knowledge: Theoretical Understand Borel sets and identify them. Estimate Lebesgue measure for different sets. Know when sets are Borel-measurable. Understanding of countable and uncountable sets. R Uniform sampling. .fold-btn { float: right; margin: 5px 5px 0 0; } .fold { border: 1px solid black; min-height: 40px; } 2.1 Borel sets Exercise 2.1 Prove that the intersection of two sigma algebras on \\(\\Omega\\) is a sigma algebra. Prove that the collection of all open subsets \\((a,b)\\) on \\((0,1]\\) is not a sigma algebra of \\((0,1]\\). Solution. Empty set: \\[\\begin{equation} \\emptyset \\in \\mathcal{A} \\wedge \\emptyset \\in \\mathcal{B} \\Rightarrow \\emptyset \\in \\mathcal{A} \\cap \\mathcal{B} \\end{equation}\\] Complement: \\[\\begin{equation} \\text{Let } A \\in \\mathcal{A} \\cap \\mathcal{B} \\Rightarrow A \\in \\mathcal{A} \\wedge A \\in \\mathcal{B} \\Rightarrow A^c \\in \\mathcal{A} \\wedge A^c \\in \\mathcal{B} \\Rightarrow A^c \\in \\mathcal{A} \\cap \\mathcal{B} \\end{equation}\\] Countable additivity: Let \\(\\{A_i\\}\\) be a countable sequence of subsets in \\(\\mathcal{A} \\cap \\mathcal{B}\\). \\[\\begin{equation} \\forall i: A_i \\in \\mathcal{A} \\cap \\mathcal{B} \\Rightarrow A_i \\in \\mathcal{A} \\wedge A_i \\in \\mathcal{B} \\Rightarrow \\cup A_i \\in \\mathcal{A} \\wedge \\cup A_i \\in \\mathcal{B} \\Rightarrow \\cup A_i \\in \\mathcal{A} \\cap \\mathcal{B} \\end{equation}\\] Let \\(A\\) denote the collection of all open subsets \\((a,b)\\) on \\((0,1]\\). Then \\((0,1) \\in A\\). But \\((0,1)^c = 1 \\notin A\\). Exercise 2.2 Show that \\(\\mathcal{C} = \\sigma(\\mathcal{C})\\) if and only if \\(\\mathcal{C}\\) is a sigma algebra. Solution. “\\(\\Rightarrow\\)” This follows from the definition of a generated sigma algebra. “\\(\\Leftarrow\\)” Let \\(\\mathcal{F} = \\cap_i F_i\\) be the intersection of all sigma algebras that contain \\(\\mathcal{C}\\). Then \\(\\sigma(\\mathcal{C}) = \\mathcal{F}\\). Additionally, \\(\\forall i: \\mathcal{C} \\in F_i\\). So each \\(F_i\\) can be written as \\(F_i = \\mathcal{C} \\cup D\\), where \\(D\\) are the rest of the elements in the sigma algebra. In other words, each sigma algebra in the collection contains at least \\(\\mathcal{C}\\), but can contain other elements. Now for some \\(j\\), \\(F_j = \\mathcal{C}\\) as \\(\\{F_i\\}\\) contains all sigma algebras that contain \\(\\mathcal{C}\\) and \\(\\mathcal{C}\\) is such a sigma algebra. Since this is the smallest subset in the intersection it follows that \\(\\sigma(\\mathcal{C}) = \\mathcal{F} = \\mathcal{C}\\). 
Exercise 2.3 Let \\(\\mathcal{C}\\) and \\(\\mathcal{D}\\) be two collections of subsets on \\(\\Omega\\) such that \\(\\mathcal{C} \\subset \\mathcal{D}\\). Prove that \\(\\sigma(\\mathcal{C}) \\subseteq \\sigma(\\mathcal{D})\\). Solution. \\(\\sigma(\\mathcal{D})\\) is a sigma algebra that contains \\(\\mathcal{D}\\). It follows that \\(\\sigma(\\mathcal{D})\\) is a sigma algebra that contains \\(\\mathcal{C}\\). Let us write \\(\\sigma(\\mathcal{C}) = \\cap_i F_i\\), where \\(\\{F_i\\}\\) is the collection of all sigma algebras that contain \\(\\mathcal{C}\\). Since \\(\\sigma(\\mathcal{D})\\) is such a sigma algebra, there exists an index \\(j\\), so that \\(F_j = \\sigma(\\mathcal{D})\\). Then we can write \\[\\begin{align} \\sigma(\\mathcal{C}) &= (\\cap_{i \\neq j} F_i) \\cap \\sigma(\\mathcal{D}) \\\\ &\\subseteq \\sigma(\\mathcal{D}). \\end{align}\\] Exercise 2.4 Prove that the following subsets of \\((0,1]\\) are Borel-measurable by finding their measure. Any countable set. The set of numbers in (0,1] whose decimal expansion does not contain 7. Solution. This follows directly from the fact that every countable set is a union of singletons, whose measure is 0. Let us first look at numbers which have a 7 as the first decimal numbers. Their measure is 0.1. Then we take all the numbers with a 7 as the second decimal number (excluding those who already have it as the first). These have the measure 0.01, and there are 9 of them, so their total measure is 0.09. We can continue to do so infinitely many times. At each \\(n\\), we have the measure of the intervals which is \\(10^n\\) and the number of those intervals is \\(9^{n-1}\\). Now \\[\\begin{align} \\lambda(A) &= 1 - \\sum_{n = 0}^{\\infty} \\frac{9^n}{10^{n+1}} \\\\ &= 1 - \\frac{1}{10} \\sum_{n = 0}^{\\infty} (\\frac{9}{10})^n \\\\ &= 1 - \\frac{1}{10} \\frac{10}{1} \\\\ &= 0. \\end{align}\\] Since we have shown that the measure of the set is \\(0\\), we have also shown that the set is measurable. Exercise 2.5 Let \\(\\Omega = [0,1]\\), and let \\(\\mathcal{F}_3\\) consist of all countable subsets of \\(\\Omega\\), and all subsets of \\(\\Omega\\) having a countable complement. Show that \\(\\mathcal{F}_3\\) is a sigma algebra. Let us define \\(P(A)=0\\) if \\(A\\) is countable, and \\(P(A) = 1\\) if \\(A\\) has a countable complement. Is \\((\\Omega, \\mathcal{F}_3, P)\\) a legitimate probability space? Solution. The empty set is countable, therefore it is in \\(\\mathcal{F}_3\\). For any \\(A \\in \\mathcal{F}_3\\). If \\(A\\) is countable, then \\(A^c\\) has a countable complement and is in \\(\\mathcal{F}_3\\). If \\(A\\) is uncountable, then it has a countable complement \\(A^c\\) which is therefore also in \\(\\mathcal{F}_3\\). We are left with showing countable additivity. Let \\(\\{A_i\\}\\) be an arbitrary collection of sets in \\(\\mathcal{F}_3\\). We will look at two possibilities. First let all \\(A_i\\) be countable. A countable union of countable sets is countable, and therefore in \\(\\mathcal{F}_3\\). Second, let at least one \\(A_i\\) be uncountable. It follows that it has a countable complement. We can write \\[\\begin{equation} (\\cup_{i=1}^{\\infty} A_i)^c = \\cap_{i=1}^{\\infty} A_i^c. \\end{equation}\\] Since at least one \\(A_i^c\\) on the right side is countable, the whole intersection is countable, and therefore the union has a countable complement. It follows that the union is in \\(\\mathcal{F}_3\\). The tuple \\((\\Omega, \\mathcal{F}_3)\\) is a measurable space. 
Therefore, we only need to check whether \\(P\\) is a probability measure. The measure of the empty set is zero as it is countable. We have to check for countable additivity. Let us look at three situations. Let \\(A_i\\) be disjoint sets. First, let all \\(A_i\\) be countable. \\[\\begin{equation} P(\\cup_{i=1}^{\\infty} A_i) = \\sum_{i=1}^{\\infty}P( A_i)) = 0. \\end{equation}\\] Since the union is countable, the above equation holds. Second, let exactly one \\(A_i\\) be uncountable. W.L.O.G. let that be \\(A_1\\). Then \\[\\begin{equation} P(\\cup_{i=1}^{\\infty} A_i) = 1 + \\sum_{i=2}^{\\infty}P( A_i)) = 1. \\end{equation}\\] Since the union is uncountable, the above equation holds. Third, let at least two \\(A_i\\) be uncountable. We have to check whether it is possible for two uncountable sets in \\(\\mathcal{F}_3\\) to be disjoint. If that is possible, then their measures would sum to more than one and \\(P\\) would not be a probability measure. W.L.O.G. let \\(A_1\\) and \\(A_2\\) be uncountable. Then we have \\[\\begin{equation} A_1 \\cap A_2 = (A_1^c \\cup A_2^c)^c. \\end{equation}\\] Now \\(A_1^c\\) and \\(A_2^c\\) are countable and their union is therefore countable. Let \\(B = A_1^c \\cup A_2^c\\). So the intersection of \\(A_1\\) and \\(A_2\\) equals the complement of \\(B\\), which is countable. For the intersection to be the empty set, \\(B\\) would have to equal to \\(\\Omega\\). But \\(\\Omega\\) is uncountable and therefore \\(B\\) can not equal to \\(\\Omega\\). It follows that two uncountable sets in \\(\\mathcal{F}_3\\) can not have an empty intersection. Therefore the tuple is a legitimate probability space. 2.2 Lebesgue measure Exercise 2.6 Show that the Lebesgue measure of rational numbers on \\([0,1]\\) is 0. R: Implement a random number generator, which generates uniform samples of irrational numbers in \\([0,1]\\) by uniformly sampling from \\([0,1]\\) and rejecting a sample if it is rational. Solution. There are a countable number of rational numbers. Therefore, we can write \\[\\begin{align} \\lambda(\\mathbb{Q}) &= \\lambda(\\cup_{i = 1}^{\\infty} q_i) &\\\\ &= \\sum_{i = 1}^{\\infty} \\lambda(q_i) &\\text{ (countable additivity)} \\\\ &= \\sum_{i = 1}^{\\infty} 0 &\\text{ (Lebesgue measure of a singleton)} \\\\ &= 0. \\end{align}\\] Exercise 2.7 Prove that the Lebesgue measure of \\(\\mathbb{R}\\) is infinity. Paradox. Show that the cardinality of \\(\\mathbb{R}\\) and \\((0,1)\\) is the same, while their Lebesgue measures are infinity and one respectively. Solution. Let \\(a_i\\) be the \\(i\\)-th integer for \\(i \\in \\mathbb{Z}\\). We can write \\(\\mathbb{R} = \\cup_{-\\infty}^{\\infty} (a_i, a_{i + 1}]\\). \\[\\begin{align} \\lambda(\\mathbb{R}) &= \\lambda(\\cup_{i = -\\infty}^{\\infty} (a_i, a_{i + 1}]) \\\\ &= \\lambda(\\lim_{n \\rightarrow \\infty} \\cup_{i = -n}^{n} (a_i, a_{i + 1}]) \\\\ &= \\lim_{n \\rightarrow \\infty} \\lambda(\\cup_{i = -n}^{n} (a_i, a_{i + 1}]) \\\\ &= \\lim_{n \\rightarrow \\infty} \\sum_{i = -n}^{n} \\lambda((a_i, a_{i + 1}]) \\\\ &= \\lim_{n \\rightarrow \\infty} \\sum_{i = -n}^{n} 1 \\\\ &= \\lim_{n \\rightarrow \\infty} 2n \\\\ &= \\infty. \\end{align}\\] We need to find a bijection between \\(\\mathbb{R}\\) and \\((0,1)\\). A well-known function that maps from a bounded interval to \\(\\mathbb{R}\\) is the tangent. To make the bijection easier to achieve, we will take the inverse, which maps from \\(\\mathbb{R}\\) to \\((-\\frac{\\pi}{2}, \\frac{\\pi}{2})\\). 
However, we need to change the function so it maps to \\((0,1)\\). First we add \\(\\frac{\\pi}{2}\\), so that we move the function above zero. Then we only have to divide by the max value, which in this case is \\(\\pi\\). So our bijection is \\[\\begin{equation} f(x) = \\frac{\\tan^{-1}(x) + \\frac{\\pi}{2}}{\\pi}. \\end{equation}\\] Exercise 2.8 Take the measure space \\((\\Omega_1 = (0,1], B_{(0,1]}, \\lambda)\\) (we know that this is a probability space on \\((0,1]\\)). Define a map (function) from \\(\\Omega_1\\) to \\(\\Omega_2 = \\{1,2,3,4,5,6\\}\\) such that the measure space \\((\\Omega_2, 2^{\\Omega_2}, \\lambda(f^{-1}()))\\) will be a discrete probability space with uniform probabilities (\\(P(\\omega) = \\frac{1}{6}, \\forall \\omega \\in \\Omega_2)\\). Is the map that you defined in (a) the only such map? How would you in the same fashion define a map that would result in a probability space that can be interpreted as a coin toss with probability \\(p\\) of heads? R: Use the map in (a) as a basis for a random generator for this fair die. Solution. In other words, we have to assign disjunct intervals of the same size to each element of \\(\\Omega_2\\). Therefore \\[\\begin{equation} f(x) = \\lceil 6x \\rceil. \\end{equation}\\] No, we could for example rearrange the order in which the intervals are mapped to integers. Additionally, we could have several disjoint intervals that mapped to the same integer, as long as the Lebesgue measure of their union would be \\(\\frac{1}{6}\\) and the function would remain injective. We have \\(\\Omega_3 = \\{0,1\\}\\), where zero represents heads and one represents tails. Then \\[\\begin{equation} f(x) = 0^{I_{A}(x)}, \\end{equation}\\] where \\(A = \\{y \\in (0,1] : y < p\\}\\). set.seed(1) unif_s <- runif(1000) die_s <- ceiling(6 * unif_s) summary(as.factor(die_s)) ## 1 2 3 4 5 6 ## 166 154 200 146 166 168 "],["condprob.html", "Chapter 3 Conditional probability 3.1 Calculating conditional probabilities 3.2 Conditional independence 3.3 Monty Hall problem", " Chapter 3 Conditional probability This chapter deals with conditional probability. The students are expected to acquire the following knowledge: Theoretical Identify whether variables are independent. Calculation of conditional probabilities. Understanding of conditional dependence and independence. How to apply Bayes’ theorem to solve difficult probabilistic questions. R Simulating conditional probabilities. cumsum. apply. .fold-btn { float: right; margin: 5px 5px 0 0; } .fold { border: 1px solid black; min-height: 40px; } 3.1 Calculating conditional probabilities Exercise 3.1 A military officer is in charge of identifying enemy aircraft and shooting them down. He is able to positively identify an enemy airplane 95% of the time and positively identify a friendly airplane 90% of the time. Furthermore, 99% of the airplanes are friendly. When the officer identifies an airplane as an enemy airplane, what is the probability that it is not and they will shoot at a friendly airplane? Solution. Let \\(E = 0\\) denote that the observed plane is friendly and \\(E=1\\) that it is an enemy. Let \\(I = 0\\) denote that the officer identified it as friendly and \\(I = 1\\) as enemy. Then \\[\\begin{align} P(E = 0 | I = 1) &= \\frac{P(I = 1 | E = 0)P(E = 0)}{P(I = 1)} \\\\ &= \\frac{P(I = 1 | E = 0)P(E = 0)}{P(I = 1 | E = 0)P(E = 0) + P(I = 1 | E = 1)P(E = 1)} \\\\ &= \\frac{0.1 \\times 0.99}{0.1 \\times 0.99 + 0.95 \\times 0.01} \\\\ &= 0.91. 
\\end{align}\\] Exercise 3.2 R: Consider tossing a fair die. Let \\(A = \\{2,4,6\\}\\) and \\(B = \\{1,2,3,4\\}\\). Then \\(P(A) = \\frac{1}{2}\\), \\(P(B) = \\frac{2}{3}\\) and \\(P(AB) = \\frac{1}{3}\\). Since \\(P(AB) = P(A)P(B)\\), the events \\(A\\) and \\(B\\) are independent. Simulate draws from the sample space and verify that the proportions are the same. Then find two events \\(C\\) and \\(D\\) that are not independent and repeat the simulation. set.seed(1) nsamps <- 10000 tosses <- sample(1:6, nsamps, replace = TRUE) PA <- sum(tosses %in% c(2,4,6)) / nsamps PB <- sum(tosses %in% c(1,2,3,4)) / nsamps PA * PB ## [1] 0.3295095 sum(tosses %in% c(2,4)) / nsamps ## [1] 0.3323 # Let C = {1,2} and D = {2,3} PC <- sum(tosses %in% c(1,2)) / nsamps PD <- sum(tosses %in% c(2,3)) / nsamps PC * PD ## [1] 0.1067492 sum(tosses %in% c(2)) / nsamps ## [1] 0.1622 Exercise 3.3 A machine reports the true value of a thrown 12-sided die 5 out of 6 times. If the machine reports a 1 has been tossed, what is the probability that it is actually a 1? Now let the machine only report whether a 1 has been tossed or not. Does the probability change? R: Use simulation to check your answers to a) and b). Solution. Let \\(T = 1\\) denote that the toss is 1 and \\(M = 1\\) that the machine reports a 1. \\[\\begin{align} P(T = 1 | M = 1) &= \\frac{P(M = 1 | T = 1)P(T = 1)}{P(M = 1)} \\\\ &= \\frac{P(M = 1 | T = 1)P(T = 1)}{\\sum_{k=1}^{12} P(M = 1 | T = k)P(T = k)} \\\\ &= \\frac{\\frac{5}{6}\\frac{1}{12}}{\\frac{5}{6}\\frac{1}{12} + 11 \\frac{1}{6} \\frac{1}{11} \\frac{1}{12}} \\\\ &= \\frac{5}{6}. \\end{align}\\] Yes. \\[\\begin{align} P(T = 1 | M = 1) &= \\frac{P(M = 1 | T = 1)P(T = 1)}{P(M = 1)} \\\\ &= \\frac{P(M = 1 | T = 1)P(T = 1)}{\\sum_{k=1}^{12} P(M = 1 | T = k)P(T = k)} \\\\ &= \\frac{\\frac{5}{6}\\frac{1}{12}}{\\frac{5}{6}\\frac{1}{12} + 11 \\frac{1}{6} \\frac{1}{12}} \\\\ &= \\frac{5}{16}. \\end{align}\\] set.seed(1) nsamps <- 10000 report_a <- vector(mode = "numeric", length = nsamps) report_b <- vector(mode = "logical", length = nsamps) truths <- vector(mode = "logical", length = nsamps) for (i in 1:10000) { toss <- sample(1:12, size = 1) truth <- sample(c(TRUE, FALSE), size = 1, prob = c(5/6, 1/6)) truths[i] <- truth if (truth) { report_a[i] <- toss report_b[i] <- toss == 1 } else { remaining <- (1:12)[1:12 != toss] report_a[i] <- sample(remaining, size = 1) report_b[i] <- toss != 1 } } truth_a1 <- truths[report_a == 1] sum(truth_a1) / length(truth_a1) ## [1] 0.8300733 truth_b1 <- truths[report_b] sum(truth_b1) / length(truth_b1) ## [1] 0.3046209 Exercise 3.4 A coin is tossed independently \\(n\\) times. The probability of heads at each toss is \\(p\\). At each time \\(k\\), \\((k = 2,3,...,n)\\) we get a reward at time \\(k+1\\) if \\(k\\)-th toss was a head and the previous toss was a tail. Let \\(A_k\\) be the event that a reward is obtained at time \\(k\\). Are events \\(A_k\\) and \\(A_{k+1}\\) independent? Are events \\(A_k\\) and \\(A_{k+2}\\) independent? R: simulate 10 tosses 10000 times, where \\(p = 0.7\\). Check your answers to a) and b) by counting the frequencies of the events \\(A_5\\), \\(A_6\\), and \\(A_7\\). Solution. For \\(A_k\\) to happen, we need the tosses \\(k-2\\) and \\(k-1\\) be tails and heads respectively. For \\(A_{k+1}\\) to happen, we need tosses \\(k-1\\) and \\(k\\) be tails and heads respectively. As the toss \\(k-1\\) need to be heads for one and tails for the other, these two events can not happen simultaneously. 
Therefore the probability of their intersection is 0. But the probability of each of them separately is \\(p(1-p) > 0\\). Therefore, they are not independent. For \\(A_k\\) to happen, we need the tosses \\(k-2\\) and \\(k-1\\) be tails and heads respectively. For \\(A_{k+2}\\) to happen, we need tosses \\(k\\) and \\(k+1\\) be tails and heads respectively. So the probability of intersection is \\(p^2(1-p)^2\\). And the probability of each separately is again \\(p(1-p)\\). Therefore, they are independent. set.seed(1) nsamps <- 10000 p <- 0.7 rewardA_5 <- vector(mode = "logical", length = nsamps) rewardA_6 <- vector(mode = "logical", length = nsamps) rewardA_7 <- vector(mode = "logical", length = nsamps) rewardA_56 <- vector(mode = "logical", length = nsamps) rewardA_57 <- vector(mode = "logical", length = nsamps) for (i in 1:nsamps) { samps <- sample(c(0,1), size = 10, replace = TRUE, prob = c(0.7, 0.3)) rewardA_5[i] <- (samps[4] == 0 & samps[3] == 1) rewardA_6[i] <- (samps[5] == 0 & samps[4] == 1) rewardA_7[i] <- (samps[6] == 0 & samps[5] == 1) rewardA_56[i] <- (rewardA_5[i] & rewardA_6[i]) rewardA_57[i] <- (rewardA_5[i] & rewardA_7[i]) } sum(rewardA_5) / nsamps ## [1] 0.2141 sum(rewardA_6) / nsamps ## [1] 0.2122 sum(rewardA_7) / nsamps ## [1] 0.2107 sum(rewardA_56) / nsamps ## [1] 0 sum(rewardA_57) / nsamps ## [1] 0.0454 Exercise 3.5 A drawer contains two coins. One is an unbiased coin, the other is a biased coin, which will turn up heads with probability \\(p\\) and tails with probability \\(1-p\\). One coin is selected uniformly at random. The selected coin is tossed \\(n\\) times. The coin turns up heads \\(k\\) times and tails \\(n-k\\) times. What is the probability that the coin is biased? The selected coin is tossed repeatedly until it turns up heads \\(k\\) times. Given that it is tossed \\(n\\) times in total, what is the probability that the coin is biased? Solution. Let \\(B = 1\\) denote that the coin is biased and let \\(H = k\\) denote that we’ve seen \\(k\\) heads. \\[\\begin{align} P(B = 1 | H = k) &= \\frac{P(H = k | B = 1)P(B = 1)}{P(H = k)} \\\\ &= \\frac{P(H = k | B = 1)P(B = 1)}{P(H = k | B = 1)P(B = 1) + P(H = k | B = 0)P(B = 0)} \\\\ &= \\frac{p^k(1-p)^{n-k} 0.5}{p^k(1-p)^{n-k} 0.5 + 0.5^{n+1}} \\\\ &= \\frac{p^k(1-p)^{n-k}}{p^k(1-p)^{n-k} + 0.5^n}. \\end{align}\\] The same results as in a). The only difference between these two scenarios is that in b) the last throw must be heads. However, this holds for the biased and the unbiased coin and therefore does not affect the probability of the coin being biased. Exercise 3.6 Judy goes around the company for Women’s day and shares flowers. In every office she leaves a flower, if there is at least one woman inside. The probability that there’s a woman in the office is \\(\\frac{3}{5}\\). What is the probability that Judy leaves her first flower in the fourth office? Given that she has given away exactly three flowers in the first four offices, what is the probability that she gives her fourth flower in the eighth office? What is the probability that she leaves the second flower in the fifth office? What is the probability that she leaves the second flower in the fifth office, given that she did not leave the second flower in the second office? Judy needs a new supply of flowers immediately after the office, where she gives away her last flower. What is the probability that she visits at least five offices, if she starts with two flowers? R: simulate Judy’s walk 10000 times to check your answers a) - e). Solution. 
Let \\(X_i = k\\) denote the event that … \\(i\\)-th sample on the \\(k\\)-th run. Since the events are independent, we can multiply their probabilities to get \\[\\begin{equation} P(X_1 = 4) = 0.4^3 \\times 0.6 = 0.0384. \\end{equation}\\] Same as in a) as we have a fresh start after first four offices. For this to be possible, she had to leave the first flower in one of the first four offices. Therefore there are four possibilities, and for each of those the probability is \\(0.4^3 \\times 0.6\\). Additionally, the probability that she leaves a flower in the fifth office is \\(0.6\\). So \\[\\begin{equation} P(X_2 = 5) = \\binom{4}{1} \\times 0.4^3 \\times 0.6^2 = 0.09216. \\end{equation}\\] We use Bayes’ theorem. \\[\\begin{align} P(X_2 = 5 | X_2 \\neq 2) &= \\frac{P(X_2 \\neq 2 | X_2 = 5)P(X_2 = 5)}{P(X_2 \\neq 2)} \\\\ &= \\frac{0.09216}{0.64} \\\\ &= 0.144. \\end{align}\\] The denominator in the second equation can be calculated as follows. One of three things has to happen for the second not to be dealt in the second round. First, both are zero, so \\(0.4^2\\). Second, first is zero, and second is one, so \\(0.4 \\times 0.6\\). Third, the first is one and the second one zero, so \\(0.6 \\times 0.4\\). Summing these values we get \\(0.64\\). We will look at the complement, so the events that she gave away exactly two flowers after two, three and four offices. \\[\\begin{equation} P(X_2 \\geq 5) = 1 - 0.6^2 - 2 \\times 0.4 \\times 0.6^2 - 3 \\times 0.4^2 \\times 0.6^2 = 0.1792. \\end{equation}\\] The multiplying parts represent the possibilities of the first flower. set.seed(1) nsamps <- 100000 Judyswalks <- matrix(data = NA, nrow = nsamps, ncol = 8) for (i in 1:nsamps) { thiswalk <- sample(c(0,1), size = 8, replace = TRUE, prob = c(0.4, 0.6)) Judyswalks[i, ] <- thiswalk } csJudy <- t(apply(Judyswalks, 1, cumsum)) # a sum(csJudy[ ,4] == 1 & csJudy[ ,3] == 0) / nsamps ## [1] 0.03848 # b csJsubset <- csJudy[csJudy[ ,4] == 3 & csJudy[ ,3] == 2, ] sum(csJsubset[ ,8] == 4 & csJsubset[ ,7] == 3) / nrow(csJsubset) ## [1] 0.03665893 # c sum(csJudy[ ,5] == 2 & csJudy[ ,4] == 1) / nsamps ## [1] 0.09117 # d sum(csJudy[ ,5] == 2 & csJudy[ ,4] == 1) / sum(csJudy[ ,2] != 2) ## [1] 0.1422398 # e sum(csJudy[ ,4] < 2) / nsamps ## [1] 0.17818 3.2 Conditional independence Exercise 3.7 Describe: A real-world example of two events \\(A\\) and \\(B\\) that are dependent but become conditionally independent if conditioned on a third event \\(C\\). A real-world example of two events \\(A\\) and \\(B\\) that are independent, but become dependent if conditioned on some third event \\(C\\). Solution. Let \\(A\\) be the height of a person and let \\(B\\) be the person’s knowledge of the Dutch language. These events are dependent since the Dutch are known to be taller than average. However if \\(C\\) is the nationality of the person, then \\(A\\) and \\(B\\) are independent given \\(C\\). Let \\(A\\) be the event that Mary passes the exam and let \\(B\\) be the event that John passes the exam. These events are independent. However, if the event \\(C\\) is that Mary and John studied together, then \\(A\\) and \\(B\\) are conditionally dependent given \\(C\\). Exercise 3.8 We have two coins of identical appearance. We know that one is a fair coin and the other flips heads 80% of the time. We choose one of the two coins uniformly at random. We discard the coin that was not chosen. We now flip the chosen coin independently 10 times, producing a sequence \\(Y_1 = y_1\\), \\(Y_2 = y_2\\), …, \\(Y_{10} = y_{10}\\). 
Intuitively, without doing and computation, are these random variables independent? Compute the probability \\(P(Y_1 = 1)\\). Compute the probabilities \\(P(Y_2 = 1 | Y_1 = 1)\\) and \\(P(Y_{10} = 1 | Y_1 = 1,...,Y_9 = 1)\\). Given your answers to b) and c), would you now change your answer to a)? If so, discuss why your intuition had failed. Solution. \\(P(Y_1 = 1) = 0.5 * 0.8 + 0.5 * 0.5 = 0.65\\). Since we know that \\(Y_1 = 1\\) this should change our view of the probability of the coin being biased or not. Let \\(B = 1\\) denote the event that the coin is biased and let \\(B = 0\\) denote that the coin is unbiased. By using marginal probability, we can write \\[\\begin{align} P(Y_2 = 1 | Y_1 = 1) &= P(Y_2 = 1, B = 1 | Y_1 = 1) + P(Y_2 = 1, B = 0 | Y_1 = 1) \\\\ &= \\sum_{k=1}^2 P(Y_2 = 1 | B = k, Y_1 = 1)P(B = k | Y_1 = 1) \\\\ &= 0.8 \\frac{P(Y_1 = 1 | B = 1)P(B = 1)}{P(Y_1 = 1)} + 0.5 \\frac{P(Y_1 = 1 | B = 0)P(B = 0)}{P(Y_1 = 1)} \\\\ &= 0.8 \\frac{0.8 \\times 0.5}{0.65} + 0.5 \\frac{0.5 \\times 0.5}{0.65} \\\\ &\\approx 0.68. \\end{align}\\] For the other calculation we follow the same procedure. Let \\(X = 1\\) denote that first nine tosses are all heads (equivalent to \\(Y_1 = 1\\),…, \\(Y_9 = 1\\)). \\[\\begin{align} P(Y_{10} = 1 | X = 1) &= P(Y_2 = 1, B = 1 | X = 1) + P(Y_2 = 1, B = 0 | X = 1) \\\\ &= \\sum_{k=1}^2 P(Y_2 = 1 | B = k, X = 1)P(B = k | X = 1) \\\\ &= 0.8 \\frac{P(X = 1 | B = 1)P(B = 1)}{P(X = 1)} + 0.5 \\frac{P(X = 1 | B = 0)P(B = 0)}{P(X = 1)} \\\\ &= 0.8 \\frac{0.8^9 \\times 0.5}{0.5 \\times 0.8^9 + 0.5 \\times 0.5^9} + 0.5 \\frac{0.5^9 \\times 0.5}{0.5 \\times 0.8^9 + 0.5 \\times 0.5^9} \\\\ &\\approx 0.8. \\end{align}\\] 3.3 Monty Hall problem The Monty Hall problem is a famous probability puzzle with non-intuitive outcome. Many established mathematicians and statisticians had problems solving it and many even disregarded the correct solution until they’ve seen the proof by simulation. Here we will show how it can be solved relatively simply with the use of Bayes’ theorem if we select the variables in a smart way. Exercise 3.9 (Monty Hall problem) A prize is placed at random behind one of three doors. You pick a door. Now Monty Hall chooses one of the other two doors, opens it and shows you that it is empty. He then gives you the opportunity to keep your door or switch to the other unopened door. Should you stay or switch? Use Bayes’ theorem to calculate the probability of winning if you switch and if you do not. R: Check your answers in R. Solution. W.L.O.G. assume we always pick the first door. The host can only open door 2 or door 3, as he can not open the door we picked. Let \\(k \\in \\{2,3\\}\\). Let us first look at what happens if we do not change. Then we have \\[\\begin{align} P(\\text{car in 1} | \\text{open $k$}) &= \\frac{P(\\text{open $k$} | \\text{car in 1})P(\\text{car in 1})}{P(\\text{open $k$})} \\\\ &= \\frac{P(\\text{open $k$} | \\text{car in 1})P(\\text{car in 1})}{\\sum_{n=1}^3 P(\\text{open $k$} | \\text{car in $n$})P(\\text{car in $n$)}}. \\end{align}\\] The probability that he opened \\(k\\) if the car is in 1 is \\(\\frac{1}{2}\\), as he can choose between door 2 and 3 as both have a goat behind it. Let us look at the normalization constant. When \\(n = 1\\) we get the value in the nominator. When \\(n=k\\), we get 0, as he will not open the door if there’s a prize behind. The remaining option is that we select 1, the car is behind \\(k\\) and he opens the only door left. 
Since he can’t open 1 due to it being our pick and \\(k\\) due to having the prize, the probability of opening the remaining door is 1, and the prior probability of the car being behind this door is \\(\\frac{1}{3}\\). So we have \\[\\begin{align} P(\\text{car in 1} | \\text{open $k$}) &= \\frac{\\frac{1}{2}\\frac{1}{3}}{\\frac{1}{2}\\frac{1}{3} + \\frac{1}{3}} \\\\ &= \\frac{1}{3}. \\end{align}\\] Now let us look at what happens if we do change. Let \\(k' \\in \\{2,3\\}\\) be the door that is not opened. If we change, we select this door, so we have \\[\\begin{align} P(\\text{car in $k'$} | \\text{open $k$}) &= \\frac{P(\\text{open $k$} | \\text{car in $k'$})P(\\text{car in $k'$})}{P(\\text{open $k$})} \\\\ &= \\frac{P(\\text{open $k$} | \\text{car in $k'$})P(\\text{car in $k'$})}{\\sum_{n=1}^3 P(\\text{open $k$} | \\text{car in $n$})P(\\text{car in $n$)}}. \\end{align}\\] The denominator stays the same, the only thing that is different from before is \\(P(\\text{open $k$} | \\text{car in $k'$})\\). We have a situation where we initially selected door 1 and the car is in door \\(k'\\). The probability that the host will open door \\(k\\) is then 1, as he can not pick any other door. So we have \\[\\begin{align} P(\\text{car in $k'$} | \\text{open $k$}) &= \\frac{\\frac{1}{3}}{\\frac{1}{2}\\frac{1}{3} + \\frac{1}{3}} \\\\ &= \\frac{2}{3}. \\end{align}\\] Therefore it makes sense to change the door. set.seed(1) nsamps <- 1000 ifchange <- vector(mode = "logical", length = nsamps) ifstay <- vector(mode = "logical", length = nsamps) for (i in 1:nsamps) { where_car <- sample(c(1:3), 1) where_player <- sample(c(1:3), 1) open_samp <- (1:3)[where_car != (1:3) & where_player != (1:3)] if (length(open_samp) == 1) { where_open <- open_samp } else { where_open <- sample(open_samp, 1) } ifstay[i] <- where_car == where_player where_ifchange <- (1:3)[where_open != (1:3) & where_player != (1:3)] ifchange[i] <- where_ifchange == where_car } sum(ifstay) / nsamps ## [1] 0.328 sum(ifchange) / nsamps ## [1] 0.672 "],["rvs.html", "Chapter 4 Random variables 4.1 General properties and calculations 4.2 Discrete random variables 4.3 Continuous random variables 4.4 Singular random variables 4.5 Transformations", " Chapter 4 Random variables This chapter deals with random variables and their distributions. The students are expected to acquire the following knowledge: Theoretical Identification of random variables. Convolutions of random variables. Derivation of PDF, PMF, CDF, and quantile function. Definitions and properties of common discrete random variables. Definitions and properties of common continuous random variables. Transforming univariate random variables. R Familiarize with PDF, PMF, CDF, and quantile functions for several distributions. Visual inspection of probability distributions. Analytical and empirical calculation of probabilities based on distributions. New R functions for plotting (for example, facet_wrap). Creating random number generators based on the Uniform distribution. .fold-btn { float: right; margin: 5px 5px 0 0; } .fold { border: 1px solid black; min-height: 40px; } 4.1 General properties and calculations Exercise 4.1 Which of the functions below are valid CDFs? Find their respective densities. R: Plot the three functions. \\[\\begin{equation} F(x) = \\begin{cases} 1 - e^{-x^2} & x \\geq 0 \\\\ 0 & x < 0. \\end{cases} \\end{equation}\\] \\[\\begin{equation} F(x) = \\begin{cases} e^{-\\frac{1}{x}} & x > 0 \\\\ 0 & x \\leq 0. 
\\end{cases} \\end{equation}\\] \\[\\begin{equation} F(x) = \\begin{cases} 0 & x \\leq 0 \\\\ \\frac{1}{3} & 0 < x \\leq \\frac{1}{2} \\\\ 1 & x > \\frac{1}{2}. \\end{cases} \\end{equation}\\] Solution. Yes. First, let us check the limits. \\(\\lim_{x \\rightarrow -\\infty} (0) = 0\\). \\(\\lim_{x \\rightarrow \\infty} (1 - e^{-x^2}) = 1 - \\lim_{x \\rightarrow \\infty} e^{-x^2} = 1 - 0 = 1\\). Second, let us check whether the function is increasing. Let \\(x > y \\geq 0\\). Then \\(1 - e^{-x^2} \\geq 1 - e^{-y^2}\\). We only have to check right continuity for the point zero. \\(F(0) = 0\\) and \\(\\lim_{\\epsilon \\downarrow 0}F (0 + \\epsilon) = \\lim_{\\epsilon \\downarrow 0} 1 - e^{-\\epsilon^2} = 1 - \\lim_{\\epsilon \\downarrow 0} e^{-\\epsilon^2} = 1 - 1 = 0\\). We get the density by differentiating the CDF. \\(p(x) = \\frac{d}{dx} 1 - e^{-x^2} = 2xe^{-x^2}.\\) Students are encouraged to check that this is a proper PDF. Yes. First, let us check the limits. $_{x -} (0) = 0 and \\(\\lim_{x \\rightarrow \\infty} (e^{-\\frac{1}{x}}) = 1\\). Second, let us check whether the function is increasing. Let \\(x > y \\geq 0\\). Then \\(e^{-\\frac{1}{x}} \\geq e^{-\\frac{1}{y}}\\). We only have to check right continuity for the point zero. \\(F(0) = 0\\) and \\(\\lim_{\\epsilon \\downarrow 0}F (0 + \\epsilon) = \\lim_{\\epsilon \\downarrow 0} e^{-\\frac{1}{\\epsilon}} = 0\\). We get the density by differentiating the CDF. \\(p(x) = \\frac{d}{dx} e^{-\\frac{1}{x}} = \\frac{1}{x^2}e^{-\\frac{1}{x}}.\\) Students are encouraged to check that this is a proper PDF. No. The function is not right continuous as \\(F(\\frac{1}{2}) = \\frac{1}{3}\\), but \\(\\lim_{\\epsilon \\downarrow 0} F(\\frac{1}{2} + \\epsilon) = 1\\). f1 <- function (x) { tmp <- 1 - exp(-x^2) tmp[x < 0] <- 0 return(tmp) } f2 <- function (x) { tmp <- exp(-(1 / x)) tmp[x <= 0] <- 0 return(tmp) } f3 <- function (x) { tmp <- x tmp[x == x] <- 1 tmp[x <= 0.5] <- 1/3 tmp[x <= 0] <- 0 return(tmp) } cdf_data <- tibble(x = seq(-1, 20, by = 0.001), f1 = f1(x), f2 = f2(x), f3 = f3(x)) %>% melt(id.vars = "x") cdf_plot <- ggplot(data = cdf_data, aes(x = x, y = value, color = variable)) + geom_hline(yintercept = 1) + geom_line() plot(cdf_plot) Exercise 4.2 Let \\(X\\) be a random variable with CDF \\[\\begin{equation} F(x) = \\begin{cases} 0 & x < 0 \\\\ \\frac{x^2}{2} & 0 \\leq x < 1 \\\\ \\frac{1}{2} + \\frac{p}{2} & 1 \\leq x < 2 \\\\ \\frac{1}{2} + \\frac{p}{2} + \\frac{1 - p}{2} & x \\geq 2 \\end{cases} \\end{equation}\\] R: Plot this CDF for \\(p = 0.3\\). Is it a discrete, continuous, or mixed random varible? Find the probability density/mass of \\(X\\). f1 <- function (x, p) { tmp <- x tmp[x >= 2] <- 0.5 + (p * 0.5) + ((1-p) * 0.5) tmp[x < 2] <- 0.5 + (p * 0.5) tmp[x < 1] <- (x[x < 1])^2 / 2 tmp[x < 0] <- 0 return(tmp) } cdf_data <- tibble(x = seq(-1, 5, by = 0.001), y = f1(x, 0.3)) cdf_plot <- ggplot(data = cdf_data, aes(x = x, y = y)) + geom_hline(yintercept = 1) + geom_line(color = "blue") plot(cdf_plot) ::: {.solution} \\(X\\) is a mixed random variable. Since \\(X\\) is a mixed random variable, we have to find the PDF of the continuous part and the PMF of the discrete part. We get the continuous part by differentiating the corresponding CDF, \\(\\frac{d}{dx}\\frac{x^2}{2} = x\\). So the PDF, when \\(0 \\leq x < 1\\), is \\(p(x) = x\\). Let us look at the discrete part now. It has two steps, so this is a discrete distribution with two outcomes – numbers 1 and 2. 
The first happens with probability \\(\\frac{p}{2}\\), and the second with probability \\(\\frac{1 - p}{2}\\). This reminds us of the Bernoulli distribution. The PMF for the discrete part is \\(P(X = x) = (\\frac{p}{2})^{2 - x} (\\frac{1 - p}{2})^{x - 1}\\). ::: Exercise 4.3 (Convolutions) Convolutions are probability distributions that correspond to sums of independent random variables. Let \\(X\\) and \\(Y\\) be independent discrete variables. Find the PMF of \\(Z = X + Y\\). Hint: Use the law of total probability. Let \\(X\\) and \\(Y\\) be independent continuous variables. Find the PDF of \\(Z = X + Y\\). Hint: Start with the CDF. Solution. \\[\\begin{align} P(Z = z) &= P(X + Y = z) & \\\\ &= \\sum_{k = -\\infty}^\\infty P(X + Y = z | Y = k) P(Y = k) & \\text{ (law of total probability)} \\\\ &= \\sum_{k = -\\infty}^\\infty P(X + k = z | Y = k) P(Y = k) & \\\\ &= \\sum_{k = -\\infty}^\\infty P(X + k = z) P(Y = k) & \\text{ (independence of $X$ and $Y$)} \\\\ &= \\sum_{k = -\\infty}^\\infty P(X = z - k) P(Y = k). & \\end{align}\\] Let \\(f\\) and \\(g\\) be the PDFs of \\(X\\) and \\(Y\\) respectively. \\[\\begin{align} F(z) &= P(Z < z) \\\\ &= P(X + Y < z) \\\\ &= \\int_{-\\infty}^{\\infty} P(X + Y < z | Y = y)P(Y = y)dy \\\\ &= \\int_{-\\infty}^{\\infty} P(X + y < z | Y = y)P(Y = y)dy \\\\ &= \\int_{-\\infty}^{\\infty} P(X + y < z)P(Y = y)dy \\\\ &= \\int_{-\\infty}^{\\infty} P(X < z - y)P(Y = y)dy \\\\ &= \\int_{-\\infty}^{\\infty} (\\int_{-\\infty}^{z - y} f(x) dx) g(y) dy \\end{align}\\] Now \\[\\begin{align} p(z) &= \\frac{d}{dz} F(z) & \\\\ &= \\int_{-\\infty}^{\\infty} (\\frac{d}{dz}\\int_{-\\infty}^{z - y} f(x) dx) g(y) dy & \\\\ &= \\int_{-\\infty}^{\\infty} f(z - y) g(y) dy & \\text{ (fundamental theorem of calculus)}. \\end{align}\\] 4.2 Discrete random variables Exercise 4.4 (Binomial random variable) Let \\(X_k\\), \\(k = 1,...,n\\), be random variables with the Bernoulli measure as the PMF. Let \\(X = \\sum_{k=1}^n X_k\\). We call \\(X_k\\) a Bernoulli random variable with parameter \\(p \\in (0,1)\\). Find the CDF of \\(X_k\\). Find PMF of \\(X\\). This is a Binomial random variable with support in \\(\\{0,1,2,...,n\\}\\) and parameters \\(p \\in (0,1)\\) and \\(n \\in \\mathbb{N}_0\\). We denote \\[\\begin{equation} X | n,p \\sim \\text{binomial}(n,p). \\end{equation}\\] Find CDF of \\(X\\). R: Simulate from the binomial distribution with \\(n = 10\\) and \\(p = 0.5\\), and from \\(n\\) Bernoulli distributions with \\(p = 0.5\\). Visually compare the sum of Bernoullis and the binomial. Hint: there is no standard function like rpois for a Bernoulli random variable. Check exercise 1.12 to find out how to sample from a Bernoulli distribution. Solution. There are two outcomes – zero and one. Zero happens with probability \\(1 - p\\). Therefore \\[\\begin{equation} F(k) = \\begin{cases} 0 & k < 0 \\\\ 1 - p & 0 \\leq k < 1 \\\\ 1 & k \\geq 1. \\end{cases} \\end{equation}\\] For the probability of \\(X\\) to be equal to some \\(k \\leq n\\), exactly \\(k\\) Bernoulli variables need to be one, and the others zero. So \\(p^k(1-p)^{n-k}\\). There are \\(\\binom{n}{k}\\) such possible arrangements. Therefore \\[\\begin{align} P(X = k) = \\binom{n}{k} p^k (1 - p)^{n-k}. 
\\end{align}\\] \\[\\begin{equation} F(k) = \\sum_{i = 0}^{\\lfloor k \\rfloor} \\binom{n}{i} p^i (1 - p)^{n - i} \\end{equation}\\] set.seed(1) nsamps <- 10000 binom_samp <- rbinom(nsamps, size = 10, prob = 0.5) bernoulli_mat <- matrix(data = NA, nrow = nsamps, ncol = 10) for (i in 1:nsamps) { bernoulli_mat[i, ] <- rbinom(10, size = 1, prob = 0.5) } bern_samp <- apply(bernoulli_mat, 1, sum) b_data <- tibble(x = c(binom_samp, bern_samp), type = c(rep("binomial", 10000), rep("Bernoulli_sum", 10000))) b_plot <- ggplot(data = b_data, aes(x = x, fill = type)) + geom_bar(position = "dodge") plot(b_plot) Exercise 4.5 (Geometric random variable) A variable with PMF \\[\\begin{equation} P(k) = p(1-p)^k \\end{equation}\\] is a geometric random variable with support in non-negative integers. It has one parameter \\(p \\in (0,1]\\). We denote \\[\\begin{equation} X | p \\sim \\text{geometric}(p) \\end{equation}\\] Derive the CDF of a geometric random variable. R: Draw 1000 samples from the geometric distribution with \\(p = 0.3\\) and compare their frequencies to theoretical values. Solution. \\[\\begin{align} P(X \\leq k) &= \\sum_{i = 0}^k p(1-p)^i \\\\ &= p \\sum_{i = 0}^k (1-p)^i \\\\ &= p \\frac{1 - (1-p)^{k+1}}{1 - (1 - p)} \\\\ &= 1 - (1-p)^{k + 1} \\end{align}\\] set.seed(1) geo_samp <- rgeom(n = 1000, prob = 0.3) geo_samp <- data.frame(x = geo_samp) %>% count(x) %>% mutate(n = n / 1000, type = "empirical_frequencies") %>% bind_rows(data.frame(x = 0:20, n = dgeom(0:20, prob = 0.3), type = "theoretical_measure")) geo_plot <- ggplot(data = geo_samp, aes(x = x, y = n, fill = type)) + geom_bar(stat="identity", position = "dodge") plot(geo_plot) Exercise 4.6 (Poisson random variable) A variable with PMF \\[\\begin{equation} P(k) = \\frac{\\lambda^k e^{-\\lambda}}{k!} \\end{equation}\\] is a Poisson random variable with support in non-negative integers. It has one positive parameter \\(\\lambda\\), which also represents its mean value and variance (a measure of the deviation of the values from the mean – more on mean and variance in the next chapter). We denote \\[\\begin{equation} X | \\lambda \\sim \\text{Poisson}(\\lambda). \\end{equation}\\] This distribution is usually the default choice for modeling counts. We have already encountered a Poisson random variable in exercise 1.13, where we also sampled from this distribution. The CDF of a Poisson random variable is \\(P(X <= x) = e^{-\\lambda} \\sum_{i=0}^x \\frac{\\lambda^{i}}{i!}\\). R: Draw 1000 samples from the Poisson distribution with \\(\\lambda = 5\\) and compare their empirical cumulative distribution function with the theoretical CDF. set.seed(1) pois_samp <- rpois(n = 1000, lambda = 5) pois_samp <- data.frame(x = pois_samp) pois_plot <- ggplot(data = pois_samp, aes(x = x, colour = "ECDF")) + stat_ecdf(geom = "step") + geom_step(data = tibble(x = 0:17, y = ppois(x, 5)), aes(x = x, y = y, colour = "CDF")) + scale_colour_manual("Lgend title", values = c("black", "red")) plot(pois_plot) Exercise 4.7 (Negative binomial random variable) A variable with PMF \\[\\begin{equation} p(k) = \\binom{k + r - 1}{k}(1-p)^r p^k \\end{equation}\\] is a negative binomial random variable with support in non-negative integers. It has two parameters \\(r > 0\\) and \\(p \\in (0,1)\\). We denote \\[\\begin{equation} X | r,p \\sim \\text{NB}(r,p). \\end{equation}\\] Let us reparameterize the negative binomial distribution with \\(q = 1 - p\\). Find the PMF of \\(X \\sim \\text{NB}(1, q)\\). Do you recognize this distribution? 
Show that the sum of two negative binomial random variables with the same \\(p\\) is also a negative binomial random variable. Hint: Use the fact that the number of ways to place \\(n\\) indistinct balls into \\(k\\) boxes is \\(\\binom{n + k - 1}{n}\\). R: Draw samples from \\(X \\sim \\text{NB}(5, 0.4)\\) and \\(Y \\sim \\text{NB}(3, 0.4)\\). Draw samples from \\(Z = X + Y\\), where you use the parameters calculated in b). Plot both distributions, their sum, and \\(Z\\) using facet_wrap. Be careful, as R uses a different parameterization size=\\(r\\) and prob=\\(1 - p\\). Solution. \\[\\begin{align} P(X = k) &= \\binom{k + 1 - 1}{k}q^1 (1-q)^k \\\\ &= q(1-q)^k. \\end{align}\\] This is the geometric distribution. Let \\(X \\sim \\text{NB}(r_1, p)\\) and \\(Y \\sim \\text{NB}(r_2, p)\\). Let \\(Z = X + Y\\). \\[\\begin{align} P(Z = z) &= \\sum_{k = 0}^{\\infty} P(X = z - k)P(Y = k), \\text{ if k < 0, then the probabilities are 0} \\\\ &= \\sum_{k = 0}^{z} P(X = z - k)P(Y = k), \\text{ if k > z, then the probabilities are 0} \\\\ &= \\sum_{k = 0}^{z} \\binom{z - k + r_1 - 1}{z - k}(1 - p)^{r_1} p^{z - k} \\binom{k + r_2 - 1}{k}(1 - p)^{r_2} p^{k} & \\\\ &= \\sum_{k = 0}^{z} \\binom{z - k + r_1 - 1}{z - k} \\binom{k + r_2 - 1}{k}(1 - p)^{r_1 + r_2} p^{z} & \\\\ &= (1 - p)^{r_1 + r_2} p^{z} \\sum_{k = 0}^{z} \\binom{z - k + r_1 - 1}{z - k} \\binom{k + r_2 - 1}{k}& \\end{align}\\] The part before the sum reminds us of the negative binomial distribution with parameters \\(r_1 + r_2\\) and \\(p\\). To complete this term to the negative binomial PMF we need \\(\\binom{z + r_1 + r_2 -1}{z}\\). So the only thing we need to prove is that the sum equals this term. Both terms in the sum can be interpreted as numbers of ways to place a number of balls into boxes. For the left term it is \\(z-k\\) balls into \\(r_1\\) boxes, and for the right \\(k\\) balls into \\(r_2\\) boxes. For each \\(k\\) we are distributing \\(z\\) balls in total. By summing over all \\(k\\), we actually get all the possible placements of \\(z\\) balls into \\(r_1 + r_2\\) boxes. Therefore \\[\\begin{align} P(Z = z) &= (1 - p)^{r_1 + r_2} p^{z} \\sum_{k = 0}^{z} \\binom{z - k + r_1 - 1}{z - k} \\binom{k + r_2 - 1}{k}& \\\\ &= \\binom{z + r_1 + r_2 -1}{z} (1 - p)^{r_1 + r_2} p^{z}. \\end{align}\\] From this it also follows that the sum of geometric distributions with the same parameter is a negative binomial distribution. \\(Z \\sim \\text{NB}(8, 0.4)\\). set.seed(1) nsamps <- 10000 x <- rnbinom(nsamps, size = 5, prob = 0.6) y <- rnbinom(nsamps, size = 3, prob = 0.6) xpy <- x + y z <- rnbinom(nsamps, size = 8, prob = 0.6) samps <- tibble(x, y, xpy, z) samps <- melt(samps) ggplot(data = samps, aes(x = value)) + geom_bar() + facet_wrap(~ variable) 4.3 Continuous random variables Exercise 4.8 (Exponential random variable) A variable \\(X\\) with PDF \\(\\lambda e^{-\\lambda x}\\) is an exponential random variable with support in non-negative real numbers. It has one positive parameter \\(\\lambda\\). We denote \\[\\begin{equation} X | \\lambda \\sim \\text{Exp}(\\lambda). \\end{equation}\\] Find the CDF of an exponential random variable. Find the quantile function of an exponential random variable. Calculate the probability \\(P(1 \\leq X \\leq 3)\\), where \\(X \\sim \\text{Exp(1.5)}\\). R: Check your answer to c) with a simulation (rexp). Plot the probability in a meaningful way. R: Implement PDF, CDF, and the quantile function and compare their values with corresponding R functions visually. 
Hint: use the size parameter to make one of the curves wider. Solution. \\[\\begin{align} F(x) &= \\int_{0}^{x} \\lambda e^{-\\lambda t} dt \\\\ &= \\lambda \\int_{0}^{x} e^{-\\lambda t} dt \\\\ &= \\lambda (\\frac{1}{-\\lambda}e^{-\\lambda t} |_{0}^{x}) \\\\ &= \\lambda(\\frac{1}{\\lambda} - \\frac{1}{\\lambda} e^{-\\lambda x}) \\\\ &= 1 - e^{-\\lambda x}. \\end{align}\\] \\[\\begin{align} F(F^{-1}(x)) &= x \\\\ 1 - e^{-\\lambda F^{-1}(x)} &= x \\\\ e^{-\\lambda F^{-1}(x)} &= 1 - x \\\\ -\\lambda F^{-1}(x) &= \\ln(1 - x) \\\\ F^{-1}(x) &= - \\frac{ln(1 - x)}{\\lambda}. \\end{align}\\] \\[\\begin{align} P(1 \\leq X \\leq 3) &= P(X \\leq 3) - P(X \\leq 1) \\\\ &= P(X \\leq 3) - P(X \\leq 1) \\\\ &= 1 - e^{-1.5 \\times 3} - 1 + e^{-1.5 \\times 1} \\\\ &\\approx 0.212. \\end{align}\\] set.seed(1) nsamps <- 1000 samps <- rexp(nsamps, rate = 1.5) sum(samps >= 1 & samps <= 3) / nsamps ## [1] 0.212 exp_plot <- ggplot(data.frame(x = seq(0, 5, by = 0.01)), aes(x = x)) + stat_function(fun = dexp, args = list(rate = 1.5)) + stat_function(fun = dexp, args = list(rate = 1.5), xlim = c(1,3), geom = "area", fill = "red") plot(exp_plot) exp_pdf <- function(x, lambda) { return (lambda * exp(-lambda * x)) } exp_cdf <- function(x, lambda) { return (1 - exp(-lambda * x)) } exp_quant <- function(q, lambda) { return (-(log(1 - q) / lambda)) } ggplot(data = data.frame(x = seq(0, 5, by = 0.01)), aes(x = x)) + stat_function(fun = dexp, args = list(rate = 1.5), aes(color = "R"), size = 2.5) + stat_function(fun = exp_pdf, args = list(lambda = 1.5), aes(color = "Mine"), size = 1.2) + scale_color_manual(values = c("red", "black")) ggplot(data = data.frame(x = seq(0, 5, by = 0.01)), aes(x = x)) + stat_function(fun = pexp, args = list(rate = 1.5), aes(color = "R"), size = 2.5) + stat_function(fun = exp_cdf, args = list(lambda = 1.5), aes(color = "Mine"), size = 1.2) + scale_color_manual(values = c("red", "black")) ggplot(data = data.frame(x = seq(0, 1, by = 0.01)), aes(x = x)) + stat_function(fun = qexp, args = list(rate = 1.5), aes(color = "R"), size = 2.5) + stat_function(fun = exp_quant, args = list(lambda = 1.5), aes(color = "Mine"), size = 1.2) + scale_color_manual(values = c("red", "black")) Exercise 4.9 (Uniform random variable) Continuous uniform random variable with parameters \\(a\\) and \\(b\\) has the PDF \\[\\begin{equation} p(x) = \\begin{cases} \\frac{1}{b - a} & x \\in [a,b] \\\\ 0 & \\text{otherwise}. \\end{cases} \\end{equation}\\] Find the CDF of the uniform random variable. Find the quantile function of the uniform random variable. Let \\(X \\sim \\text{Uniform}(a,b)\\). Find the CDF of the variable \\(Y = \\frac{X - a}{b - a}\\). This is the standard uniform random variable. Let \\(X \\sim \\text{Uniform}(-1, 3)\\). Find such \\(z\\) that \\(P(X < z + \\mu_x) = \\frac{1}{5}\\). R: Check your result from d) using simulation. Solution. \\[\\begin{align} F(x) &= \\int_{a}^x \\frac{1}{b - a} dt \\\\ &= \\frac{1}{b - a} \\int_{a}^x dt \\\\ &= \\frac{x - a}{b - a}. \\end{align}\\] \\[\\begin{align} F(F^{-1}(p)) &= p \\\\ \\frac{F^{-1}(p) - a}{b - a} &= p \\\\ F^{-1}(p) &= p(b - a) + a. \\end{align}\\] \\[\\begin{align} F_Y(y) &= P(Y < y) \\\\ &= P(\\frac{X - a}{b - a} < y) \\\\ &= P(X < y(b - a) + a) \\\\ &= F_X(y(b - a) + a) \\\\ &= \\frac{(y(b - a) + a) - a}{b - a} \\\\ &= y. \\end{align}\\] \\[\\begin{align} P(X < z + 1) &= \\frac{1}{5} \\\\ F(z + 1) &= \\frac{1}{5} \\\\ z + 1 &= F^{-1}(\\frac{1}{5}) \\\\ z &= \\frac{1}{5}4 - 1 - 1 \\\\ z &= -1.2. 
\\end{align}\\] set.seed(1) a <- -1 b <- 3 nsamps <- 10000 unif_samp <- runif(nsamps, a, b) mu_x <- mean(unif_samp) new_samp <- unif_samp - mu_x quantile(new_samp, probs = 1/5) ## 20% ## -1.203192 punif(-0.2, -1, 3) ## [1] 0.2 Exercise 4.10 (Beta random variable) A variable \\(X\\) with PDF \\[\\begin{equation} p(x) = \\frac{x^{\\alpha - 1} (1 - x)^{\\beta - 1}}{\\text{B}(\\alpha, \\beta)}, \\end{equation}\\] where \\(\\text{B}(\\alpha, \\beta) = \\frac{\\Gamma(\\alpha) \\Gamma(\\beta)}{\\Gamma(\\alpha + \\beta)}\\) and \\(\\Gamma(x) = \\int_0^{\\infty} x^{z - 1} e^{-x} dx\\) is a Beta random variable with support on \\([0,1]\\). It has two positive parameters \\(\\alpha\\) and \\(\\beta\\). Notation: \\[\\begin{equation} X | \\alpha, \\beta \\sim \\text{Beta}(\\alpha, \\beta) \\end{equation}\\] It is often used in modeling rates. Calculate the PDF for \\(\\alpha = 1\\) and \\(\\beta = 1\\). What do you notice? R: Plot densities of the beta distribution for parameter pairs (2, 2), (4, 1), (1, 4), (2, 5), and (0.1, 0.1). R: Sample from \\(X \\sim \\text{Beta}(2, 5)\\) and compare the histogram with Beta PDF. Solution. \\[\\begin{equation} p(x) = \\frac{x^{1 - 1} (1 - x)^{1 - 1}}{\\text{B}(1, 1)} = 1. \\end{equation}\\] This is the standard uniform distribution. set.seed(1) ggplot(data = data.frame(x = seq(0, 1, by = 0.01)), aes(x = x)) + stat_function(fun = dbeta, args = list(shape1 = 2, shape2 = 2), aes(color = "alpha = 0.5")) + stat_function(fun = dbeta, args = list(shape1 = 4, shape2 = 1), aes(color = "alpha = 4")) + stat_function(fun = dbeta, args = list(shape1 = 1, shape2 = 4), aes(color = "alpha = 1")) + stat_function(fun = dbeta, args = list(shape1 = 2, shape2 = 5), aes(color = "alpha = 25")) + stat_function(fun = dbeta, args = list(shape1 = 0.1, shape2 = 0.1), aes(color = "alpha = 0.1")) set.seed(1) nsamps <- 1000 samps <- rbeta(nsamps, 2, 5) ggplot(data = data.frame(x = samps), aes(x = x)) + geom_histogram(aes(y = ..density..), color = "black") + stat_function(data = data.frame(x = seq(0, 1, by = 0.01)), aes(x = x), fun = dbeta, args = list(shape1 = 2, shape2 = 5), color = "red", size = 1.2) Exercise 4.11 (Gamma random variable) A random variable with PDF \\[\\begin{equation} p(x) = \\frac{\\beta^\\alpha}{\\Gamma(\\alpha)} x^{\\alpha - 1}e^{-\\beta x} \\end{equation}\\] is a Gamma random variable with support on the positive numbers and parameters shape \\(\\alpha > 0\\) and rate \\(\\beta > 0\\). We write \\[\\begin{equation} X | \\alpha, \\beta \\sim \\text{Gamma}(\\alpha, \\beta) \\end{equation}\\] and it’s CDF is \\[\\begin{equation} \\frac{\\gamma(\\alpha, \\beta x)}{\\Gamma(\\alpha)}, \\end{equation}\\] where \\(\\gamma(s, x) = \\int_0^x t^{s-1} e^{-t} dt\\). It is usually used in modeling positive phenomena (for example insurance claims and rainfalls). Let \\(X \\sim \\text{Gamma}(1, \\beta)\\). Find the PDF of \\(X\\). Do you recognize this PDF? Let \\(k = \\alpha\\) and \\(\\theta = \\frac{1}{\\beta}\\). Find the PDF of \\(X | k, \\theta \\sim \\text{Gamma}(k, \\theta)\\). Random variables can be reparameterized, and sometimes a reparameterized distribution is more suitable for certain calculations. The first parameterization is for example usually used in Bayesian statistics, while this parameterization is more common in econometrics and some other applied fields. Note that you also need to pay attention to the parameters in statistical software, so diligently read the help files when using functions like rgamma to see how the function is parameterized. 
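For instance, a short sketch of how the two parameterizations line up (this uses only base R's dgamma, which accepts either a rate argument or a scale = 1/rate argument):
x <- c(0.5, 1, 2, 5)
dgamma(x, shape = 2, rate = 4)
dgamma(x, shape = 2, scale = 1 / 4) # same values as the call above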
R: Plot gamma CDF for random variables with shape and rate parameters (1,1), (10,1), (1,10). Solution. \\[\\begin{align} p(x) &= \\frac{\\beta^1}{\\Gamma(1)} x^{1 - 1}e^{-\\beta x} \\\\ &= \\beta e^{-\\beta x} \\end{align}\\] This is the PDF of the exponential distribution with parameter \\(\\beta\\). \\[\\begin{align} p(x) &= \\frac{1}{\\Gamma(k)\\beta^k} x^{k - 1}e^{-\\frac{x}{\\theta}}. \\end{align}\\] set.seed(1) ggplot(data = data.frame(x = seq(0, 25, by = 0.01)), aes(x = x)) + stat_function(fun = pgamma, args = list(shape = 1, rate = 1), aes(color = "Gamma(1,1)")) + stat_function(fun = pgamma, args = list(shape = 10, rate = 1), aes(color = "Gamma(10,1)")) + stat_function(fun = pgamma, args = list(shape = 1, rate = 10), aes(color = "Gamma(1,10)")) Exercise 4.12 (Normal random variable) A random variable with PDF \\[\\begin{equation} p(x) = \\frac{1}{\\sqrt{2 \\pi \\sigma^2}} e^{-\\frac{(x - \\mu)^2}{2 \\sigma^2}} \\end{equation}\\] is a normal random variable with support on the real axis and parameters \\(\\mu\\) in reals and \\(\\sigma^2 > 0\\). The first is the mean parameter and the second is the variance parameter. Many statistical methods assume a normal distribution. We denote \\[\\begin{equation} X | \\mu, \\sigma \\sim \\text{N}(\\mu, \\sigma^2), \\end{equation}\\] and it’s CDF is \\[\\begin{equation} F(x) = \\int_{-\\infty}^x \\frac{1}{\\sqrt{2 \\pi \\sigma^2}} e^{-\\frac{(t - \\mu)^2}{2 \\sigma^2}} dt, \\end{equation}\\] which is intractable and is usually approximated. Due to its flexibility it is also one of the most researched distributions. For that reason statisticians often use transformations of variables or approximate distributions with the normal distribution. Show that a variable \\(\\frac{X - \\mu}{\\sigma} \\sim \\text{N}(0,1)\\). This transformation is called standardization, and \\(\\text{N}(0,1)\\) is a standard normal distribution. R: Plot the normal distribution with \\(\\mu = 0\\) and different values for the \\(\\sigma\\) parameter. R: The normal distribution provides a good approximation for the Poisson distribution with a large \\(\\lambda\\). Let \\(X \\sim \\text{Poisson}(50)\\). Approximate \\(X\\) with the normal distribution and compare its density with the Poisson histogram. What are the values of \\(\\mu\\) and \\(\\sigma^2\\) that should provide the best approximation? Note that R function rnorm takes standard deviation (\\(\\sigma\\)) as a parameter and not variance. Solution. \\[\\begin{align} P(\\frac{X - \\mu}{\\sigma} < x) &= P(X < \\sigma x + \\mu) \\\\ &= F(\\sigma x + \\mu) \\\\ &= \\int_{-\\infty}^{\\sigma x + \\mu} \\frac{1}{\\sqrt{2 \\pi \\sigma^2}} e^{-\\frac{(t - \\mu)^2}{2\\sigma^2}} dt \\end{align}\\] Now let \\(s = f(t) = \\frac{t - \\mu}{\\sigma}\\), then \\(ds = \\frac{dt}{\\sigma}\\) and \\(f(\\sigma x + \\mu) = x\\), so \\[\\begin{align} P(\\frac{X - \\mu}{\\sigma} < x) &= \\int_{-\\infty}^{x} \\frac{1}{\\sqrt{2 \\pi}} e^{-\\frac{s^2}{2}} ds. \\end{align}\\] There is no need to evaluate this integral, as we recognize it as the CDF of a normal distribution with \\(\\mu = 0\\) and \\(\\sigma^2 = 1\\). 
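As a quick numerical sanity check of the standardization in a) (a sketch; the values mu = 3 and sigma = 2 are arbitrary choices):
set.seed(2)
x <- rnorm(10000, mean = 3, sd = 2)
z <- (x - 3) / 2  # standardize the samples
c(mean(z), sd(z)) # should be close to 0 and 1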
set.seed(1) # b ggplot(data = data.frame(x = seq(-15, 15, by = 0.01)), aes(x = x)) + stat_function(fun = dnorm, args = list(mean = 0, sd = 1), aes(color = "sd = 1")) + stat_function(fun = dnorm, args = list(mean = 0, sd = 0.4), aes(color = "sd = 0.1")) + stat_function(fun = dnorm, args = list(mean = 0, sd = 2), aes(color = "sd = 2")) + stat_function(fun = dnorm, args = list(mean = 0, sd = 5), aes(color = "sd = 5")) # c mean_par <- 50 nsamps <- 100000 pois_samps <- rpois(nsamps, lambda = mean_par) norm_samps <- rnorm(nsamps, mean = mean_par, sd = sqrt(mean_par)) my_plot <- ggplot() + geom_bar(data = tibble(x = pois_samps), aes(x = x, y = (..count..)/sum(..count..))) + geom_density(data = tibble(x = norm_samps), aes(x = x), color = "red") plot(my_plot) Exercise 4.13 (Logistic random variable) A logistic random variable has CDF \\[\\begin{equation} F(x) = \\frac{1}{1 + e^{-\\frac{x - \\mu}{s}}}, \\end{equation}\\] where \\(\\mu\\) is real and \\(s > 0\\). The support is on the real axis. We denote \\[\\begin{equation} X | \\mu, s \\sim \\text{Logistic}(\\mu, s). \\end{equation}\\] The distribution of the logistic random variable resembles a normal random variable, however it has heavier tails. Find the PDF of a logistic random variable. R: Implement logistic PDF and CDF and visually compare both for \\(X \\sim \\text{N}(0, 1)\\) and \\(Y \\sim \\text{logit}(0, \\sqrt{\\frac{3}{\\pi^2}})\\). These distributions have the same mean and variance. Additionally, plot the same plot on the interval [5,10], to better see the difference in the tails. R: For the distributions in b) find the probability \\(P(|X| > 4)\\) and interpret the result. Solution. \\[\\begin{align} p(x) &= \\frac{d}{dx} \\frac{1}{1 + e^{-\\frac{x - \\mu}{s}}} \\\\ &= \\frac{- \\frac{d}{dx} (1 + e^{-\\frac{x - \\mu}{s}})}{(1 + e{-\\frac{x - \\mu}{s}})^2} \\\\ &= \\frac{e^{-\\frac{x - \\mu}{s}}}{s(1 + e{-\\frac{x - \\mu}{s}})^2}. \\end{align}\\] # b set.seed(1) logit_pdf <- function (x, mu, s) { return ((exp(-(x - mu)/(s))) / (s * (1 + exp(-(x - mu)/(s)))^2)) } nl_plot <- ggplot(data = data.frame(x = seq(-12, 12, by = 0.01)), aes(x = x)) + stat_function(fun = dnorm, args = list(mean = 0, sd = 2), aes(color = "normal")) + stat_function(fun = logit_pdf, args = list(mu = 0, s = sqrt(12/pi^2)), aes(color = "logit")) plot(nl_plot) nl_plot <- ggplot(data = data.frame(x = seq(5, 10, by = 0.01)), aes(x = x)) + stat_function(fun = dnorm, args = list(mean = 0, sd = 2), aes(color = "normal")) + stat_function(fun = logit_pdf, args = list(mu = 0, s = sqrt(12/pi^2)), aes(color = "logit")) plot(nl_plot) # c logit_cdf <- function (x, mu, s) { return (1 / (1 + exp(-(x - mu) / s))) } p_logistic <- 1 - logit_cdf(4, 0, sqrt(12/pi^2)) + logit_cdf(-4, 0, sqrt(12/pi^2)) p_norm <- 1 - pnorm(4, 0, 2) + pnorm(-4, 0, 2) p_logistic ## [1] 0.05178347 p_norm ## [1] 0.04550026 # Logistic distribution has wider tails, therefore the probability of larger # absolute values is higher. 4.4 Singular random variables Exercise 4.14 (Cantor distribution) The Cantor set is a subset of \\([0,1]\\), which we create by iteratively deleting the middle third of the interval. For example, in the first iteration, we get the sets \\([0,\\frac{1}{3}]\\) and \\([\\frac{2}{3},1]\\). In the second iteration, we get \\([0,\\frac{1}{9}]\\), \\([\\frac{2}{9},\\frac{1}{3}]\\), \\([\\frac{2}{3}, \\frac{7}{9}]\\), and \\([\\frac{8}{9}, 1]\\). 
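A small R sketch that reproduces these first iterations (an illustrative assumption: each iteration is stored as a two-column matrix of interval endpoints):
cantor_step <- function(intervals) {
  # keep the left and right thirds of every interval [a, b]
  a <- intervals[, 1]
  b <- intervals[, 2]
  rbind(cbind(a, a + (b - a) / 3), cbind(b - (b - a) / 3, b))
}
C1 <- cantor_step(matrix(c(0, 1), ncol = 2)) # [0, 1/3] and [2/3, 1]
C2 <- cantor_step(C1)                        # [0, 1/9], [2/9, 1/3], [2/3, 7/9], and [8/9, 1]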
On the \\(n\\)-th iteration, we have \\[\\begin{equation} C_n = \\frac{C_{n-1}}{3} \\cup \\bigg(\\frac{2}{3} + \\frac{C_{n-1}}{3} \\bigg), \\end{equation}\\] where \\(C_0 = [0,1]\\). The Cantor set is then defined as the intersection of these sets \\[\\begin{equation} C = \\cap_{n=1}^{\\infty} C_n. \\end{equation}\\] It has the same cardinality as \\([0,1]\\). Another way to define the Cantor set is the set of all numbers on \\([0,1]\\), that do not have a 1 in the ternary representation \\(x = \\sum_{n=1}^\\infty \\frac{x_i}{3^i}, x_i \\in \\{0,1,2\\}\\). A random variable follows the Cantor distribution, if its CDF is the Cantor function (below). You can find the implementations of random number generator, CDF, and quantile functions for the Cantor distributions at https://github.com/Henrygb/CantorDist.R. Show that the Lebesgue measure of the Cantor set is 0. (Jagannathan) Let us look at an infinite sequence of independent fair-coin tosses. If the outcome is heads, let \\(x_i = 2\\) and \\(x_i = 0\\), when tails. Then use these to create \\(x = \\sum_{n=1}^\\infty \\frac{x_i}{3^i}\\). This is a random variable with the Cantor distribution. Show that \\(X\\) has a singular distribution. Solution. \\[\\begin{align} \\lambda(C) &= 1 - \\lambda(C^c) \\\\ &= 1 - \\frac{1}{3}\\sum_{k = 0}^\\infty (\\frac{2}{3})^k \\\\ &= 1 - \\frac{\\frac{1}{3}}{1 - \\frac{2}{3}} \\\\ &= 0. \\end{align}\\] First, for every \\(x\\), the probability of observing it is \\(\\lim_{n \\rightarrow \\infty} \\frac{1}{2^n} = 0\\). Second, the probability that we observe one of all the possible sequences is 1. Therefore \\(P(C) = 1\\). So this is a singular variable. The CDF only increments on the elements of the Cantor set. 4.5 Transformations Exercise 4.15 Let \\(X\\) be a random variable that is uniformly distributed on \\(\\{-2, -1, 0, 1, 2\\}\\). Find the PMF of \\(Y = X^2\\). Solution. \\[\\begin{align} P_Y(y) = \\sum_{x \\in \\sqrt(y)} P_X(x) = \\begin{cases} 0 & y \\notin \\{0,1,4\\} \\\\ \\frac{1}{5} & y = 0 \\\\ \\frac{2}{5} & y \\in \\{1,4\\} \\end{cases} \\end{align}\\] Exercise 4.16 (Lognormal random variable) A lognormal random variable is a variable whose logarithm is normally distributed. In practice, we often encounter skewed data. Usually using a log transformation on such data makes it more symmetric and therefore more suitable for modeling with the normal distribution (more on why we wish to model data with the normal distribution in the following chapters). Let \\(X \\sim \\text{N}(\\mu,\\sigma)\\). Find the PDF of \\(Y: \\log(Y) = X\\). R: Sample from the lognormal distribution with parameters \\(\\mu = 5\\) and \\(\\sigma = 2\\). Plot a histogram of the samples. Then log-transform the samples and plot a histogram along with the theoretical normal PDF. Solution. \\[\\begin{align} p_Y(y) &= p_X(\\log(y)) \\frac{d}{dy} \\log(y) \\\\ &= \\frac{1}{\\sqrt{2 \\pi \\sigma^2}} e^{\\frac{(\\log(y) - \\mu)^2}{2 \\sigma^2}} \\frac{1}{y} \\\\ &= \\frac{1}{y \\sqrt{2 \\pi \\sigma^2}} e^{\\frac{(\\log(y) - \\mu)^2}{2 \\sigma^2}}. 
\\end{align}\\] set.seed(1) nsamps <- 10000 mu <- 0.5 sigma <- 0.4 ln_samps <- rlnorm(nsamps, mu, sigma) ln_plot <- ggplot(data = data.frame(x = ln_samps), aes(x = x)) + geom_histogram(color = "black") plot(ln_plot) norm_samps <- log(ln_samps) n_plot <- ggplot(data = data.frame(x = norm_samps), aes(x = x)) + geom_histogram(aes(y = ..density..), color = "black") + stat_function(fun = dnorm, args = list(mean = mu, sd = sigma), color = "red") plot(n_plot) Exercise 4.17 (Probability integral transform) This exercise is borrowed from Wasserman. Let \\(X\\) have a continuous, strictly increasing CDF \\(F\\). Let \\(Y = F(X)\\). Find the density of \\(Y\\). This is called the probability integral transform. Let \\(U \\sim \\text{Uniform}(0,1)\\) and let \\(X = F^{-1}(U)\\). Show that \\(X \\sim F\\). R: Implement a program that takes Uniform(0,1) random variables and generates random variables from an exponential(\\(\\beta\\)) distribution. Compare your implemented function with function rexp in R. Solution. \\[\\begin{align} F_Y(y) &= P(Y < y) \\\\ &= P(F(X) < y) \\\\ &= P(X < F_X^{-1}(y)) \\\\ &= F_X(F_X^{-1}(y)) \\\\ &= y. \\end{align}\\] From the above it follows that \\(p(y) = 1\\). Note that we need to know the inverse CDF to be able to apply this procedure. \\[\\begin{align} P(X < x) &= P(F^{-1}(U) < x) \\\\ &= P(U < F(x)) \\\\ &= F_U(F(x)) \\\\ &= F(x). \\end{align}\\] set.seed(1) nsamps <- 10000 beta <- 4 generate_exp <- function (n, beta) { tmp <- runif(n) X <- qexp(tmp, beta) return (X) } df <- tibble("R" = rexp(nsamps, beta), "myGenerator" = generate_exp(nsamps, beta)) %>% gather() ggplot(data = df, aes(x = value, fill = key)) + geom_histogram(position = "dodge") "],["mrvs.html", "Chapter 5 Multiple random variables 5.1 General 5.2 Bivariate distribution examples 5.3 Transformations", " Chapter 5 Multiple random variables This chapter deals with multiple random variables and their distributions. The students are expected to acquire the following knowledge: Theoretical Calculation of PDF of transformed multiple random variables. Finding marginal and conditional distributions. R Scatterplots of bivariate random variables. New R functions (for example, expand.grid). .fold-btn { float: right; margin: 5px 5px 0 0; } .fold { border: 1px solid black; min-height: 40px; } 5.1 General Exercise 5.1 Let \\(X \\sim \\text{N}(0,1)\\) and \\(Y \\sim \\text{N}(0,1)\\) be independent random variables. Draw 1000 samples from \\((X,Y)\\) and plot a scatterplot. Now let \\(X \\sim \\text{N}(0,1)\\) and \\(Y | X = x \\sim N(ax, 1)\\). Draw 1000 samples from \\((X,Y)\\) for \\(a = 1\\), \\(a=0\\), and \\(a=-0.5\\). Plot the scatterplots. How would you interpret parameter \\(a\\)? Plot the marginal distribution of \\(Y\\) for cases \\(a=1\\), \\(a=0\\), and \\(a=-0.5\\). Can you guess which distribution it is? set.seed(1) nsamps <- 1000 x <- rnorm(nsamps) y <- rnorm(nsamps) ggplot(data.frame(x, y), aes(x = x, y = y)) + geom_point() y1 <- rnorm(nsamps, mean = 1 * x) y2 <- rnorm(nsamps, mean = 0 * x) y3 <- rnorm(nsamps, mean = -0.5 * x) df <- tibble(x = c(x,x,x), y = c(y1,y2,y3), a = c(rep(1, nsamps), rep(0, nsamps), rep(-0.5, nsamps))) ggplot(df, aes(x = x, y = y)) + geom_point() + facet_wrap(~a) + coord_equal(ratio=1) # Parameter a controls the scale of linear dependency between X and Y. 
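# A quick numerical check (sketch): with X ~ N(0, 1) and Y | X = x ~ N(ax, 1),
# the implied correlation between X and Y is a / sqrt(a^2 + 1), so the sample
# correlations per value of a should roughly match the theoretical ones below.
sapply(split(df, df$a), function(d) cor(d$x, d$y))
sapply(c(-0.5, 0, 1), function(a) a / sqrt(a^2 + 1))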
ggplot(df, aes(x = y)) + geom_density() + facet_wrap(~a) 5.2 Bivariate distribution examples Exercise 5.2 (Discrete bivariate random variable) Let \\(X\\) represent the event that a die rolls an even number and let \\(Y\\) represent the event that a die rolls one, two, or a three. Find the marginal distributions of \\(X\\) and \\(Y\\). Find the PMF of \\((X,Y)\\). Find the CDF of \\((X,Y)\\). Find \\(P(X = 1 | Y = 1)\\). Solution. \\[\\begin{align} P(X = 1) = \\frac{1}{2} \\text{ and } P(X = 0) = \\frac{1}{2} \\\\ P(Y = 1) = \\frac{1}{2} \\text{ and } P(Y = 0) = \\frac{1}{2} \\\\ \\end{align}\\] \\[\\begin{align} P(X = 1, Y = 1) = \\frac{1}{6} \\\\ P(X = 1, Y = 0) = \\frac{2}{6} \\\\ P(X = 0, Y = 1) = \\frac{2}{6} \\\\ P(X = 0, Y = 0) = \\frac{1}{6} \\end{align}\\] \\[\\begin{align} P(X \\leq x, Y \\leq y) = \\begin{cases} \\frac{1}{6} & x = 0, y = 0 \\\\ \\frac{3}{6} & x \\neq y \\\\ 1 & x = 1, y = 1 \\end{cases} \\end{align}\\] \\[\\begin{align} P(X = 1 | Y = 1) = \\frac{1}{3} \\end{align}\\] Exercise 5.3 (Continuous bivariate random variable) Let \\(p(x,y) = 6 (x - y)^2\\) be the PDF of a bivariate random variable \\((X,Y)\\), where both variables range from zero to one. Find CDF. Find marginal distributions. Find conditional distributions. R: Plot a grid of points and colour them by value – this can help us visualize the PDF. R: Implement a random number generator, which will generate numbers from \\((X,Y)\\) and visually check the results. R: Plot the marginal distribution of \\(Y\\) and the conditional distributions of \\(X | Y = y\\), where \\(y \\in \\{0, 0.1, 0.5\\}\\). Solution. \\[\\begin{align} F(x,y) &= \\int_0^{x} \\int_0^{y} 6 (t - s)^2 ds dt\\\\ &= 6 \\int_0^{x} \\int_0^{y} t^2 - 2ts + s^2 ds dt\\\\ &= 6 \\int_0^{x} t^2y - ty^2 + \\frac{y^3}{3} dt \\\\ &= 6 (\\frac{x^3 y}{3} - \\frac{x^2y^2}{2} + \\frac{x y^3}{3}) \\\\ &= 2 x^3 y - 3 t^2y^2 + 2 x y^3 \\end{align}\\] \\[\\begin{align} p(x) &= \\int_0^{1} 6 (x - y)^2 dy\\\\ &= 6 (x^2 - x + \\frac{1}{3}) \\\\ &= 6x^2 - 6x + 2 \\end{align}\\] \\[\\begin{align} p(y) &= \\int_0^{1} 6 (x - y)^2 dx\\\\ &= 6 (y^2 - y + \\frac{1}{3}) \\\\ &= 6y^2 - 6y + 2 \\end{align}\\] \\[\\begin{align} p(x|y) &= \\frac{p(xy)}{p(y)} \\\\ &= \\frac{6 (x - y)^2}{6 (y^2 - y + \\frac{1}{3})} \\\\ &= \\frac{(x - y)^2}{y^2 - y + \\frac{1}{3}} \\end{align}\\] \\[\\begin{align} p(y|x) &= \\frac{p(xy)}{p(x)} \\\\ &= \\frac{6 (x - y)^2}{6 (x^2 - x + \\frac{1}{3})} \\\\ &= \\frac{(x - y)^2}{x^2 - x + \\frac{1}{3}} \\end{align}\\] set.seed(1) # d pxy <- function (x, y) { return ((x - y)^2) } x_axis <- seq(0, 1, length.out = 100) y_axis <- seq(0, 1, length.out = 100) df <- expand.grid(x_axis, y_axis) colnames(df) <- c("x", "y") df <- cbind(df, pdf = pxy(df$x, df$y)) ggplot(data = df, aes(x = x, y = y, color = pdf)) + geom_point() # e samps <- NULL for (i in 1:10000) { xt <- runif(1, 0, 1) yt <- runif(1, 0, 1) pdft <- pxy(xt, yt) acc <- runif(1, 0, 6) if (acc <= pdft) { samps <- rbind(samps, c(xt, yt)) } } colnames(samps) <- c("x", "y") ggplot(data = as.data.frame(samps), aes(x = x, y = y)) + geom_point() # f mar_pdf <- function (x) { return (6 * x^2 - 6 * x + 2) } cond_pdf <- function (x, y) { return (((x - y)^2) / (y^2 - y + 1/3)) } df <- tibble(x = x_axis, mar = mar_pdf(x), y0 = cond_pdf(x, 0), y0.1 = cond_pdf(x, 0.1), y0.5 = cond_pdf(x, 0.5)) %>% gather(dist, value, -x) ggplot(df, aes(x = x, y = value, color = dist)) + geom_line() Exercise 5.4 (Mixed bivariate random variable) Let \\(f(x,y) = \\frac{\\beta^{\\alpha}}{\\Gamma(\\alpha)y!} x^{y+ \\alpha -1} 
e^{-x(1 + \\beta)}\\) be the PDF of a bivariate random variable, where \\(x \\in (0, \\infty)\\) and \\(y \\in \\mathbb{N}_0\\). Find the marginal distribution of \\(X\\). Do you recognize this distribution? Find the conditional distribution of \\(Y | X\\). Do you recognize this distribution? Calculate the probability \\(P(Y = 2 | X = 2.5)\\) for \\((X,Y)\\). Find the marginal distribution of \\(Y\\). Do you recognize this distribution? R: Take 1000 random samples from \\((X,Y)\\) with parameters \\(\\beta = 1\\) and \\(\\alpha = 1\\). Plot a scatterplot. Plot a bar plot of the marginal distribution of \\(Y\\), and the theoretical PMF calculated from d) on the range from 0 to 10. Hint: Use the gamma function in R.? Solution. \\[\\begin{align} p(x) &= \\sum_{k = 0}^{\\infty} \\frac{\\beta^{\\alpha}}{\\Gamma(\\alpha)k!} x^{k + \\alpha -1} e^{-x(1 + \\beta)} & \\\\ &= \\sum_{k = 0}^{\\infty} \\frac{\\beta^{\\alpha}}{\\Gamma(\\alpha)k!} x^{k} x^{\\alpha -1} e^{-x} e^{-\\beta x} & \\\\ &= \\frac{\\beta^{\\alpha}}{\\Gamma(\\alpha)} x^{\\alpha -1} e^{-\\beta x} \\sum_{k = 0}^{\\infty} \\frac{1}{k!} x^{k} e^{-x} & \\\\ &= \\frac{\\beta^{\\alpha}}{\\Gamma(\\alpha)} x^{\\alpha -1} e^{-\\beta x} & \\text{the last term above sums to one} \\end{align}\\] This is the Gamma PDF. \\[\\begin{align} p(y|x) &= \\frac{p(x,y)}{p(x)} \\\\ &= \\frac{\\frac{\\beta^{\\alpha}}{\\Gamma(\\alpha)y!} x^{y+ \\alpha -1} e^{-x(1 + \\beta)}}{\\frac{\\beta^{\\alpha}}{\\Gamma(\\alpha)} x^{\\alpha -1} e^{-\\beta x}} \\\\ &= \\frac{x^y e^{-x}}{y!}. \\end{align}\\] This is the Poisson PMF. \\[\\begin{align} P(Y = 2 | X = 2.5) = \\frac{2.5^2 e^{-2.5}}{2!} \\approx 0.26. \\end{align}\\] \\[\\begin{align} p(y) &= \\int_{0}^{\\infty} \\frac{\\beta^{\\alpha}}{\\Gamma(\\alpha)y!} x^{y + \\alpha -1} e^{-x(1 + \\beta)} dx & \\\\ &= \\frac{1}{y!} \\int_{0}^{\\infty} \\frac{\\beta^{\\alpha}}{\\Gamma(\\alpha)} x^{(y + \\alpha) -1} e^{-(1 + \\beta)x} dx & \\\\ &= \\frac{1}{y!} \\frac{\\beta^{\\alpha}}{\\Gamma(\\alpha)} \\int_{0}^{\\infty} \\frac{\\Gamma(y + \\alpha)}{(1 + \\beta)^{y + \\alpha}} \\frac{(1 + \\beta)^{y + \\alpha}}{\\Gamma(y + \\alpha)} x^{(y + \\alpha) -1} e^{-(1 + \\beta)x} dx & \\text{complete to Gamma PDF} \\\\ &= \\frac{1}{y!} \\frac{\\beta^{\\alpha}}{\\Gamma(\\alpha)} \\frac{\\Gamma(y + \\alpha)}{(1 + \\beta)^{y + \\alpha}}. \\end{align}\\] We add the terms in the third equality to get a Gamma PDF inside the integral, which then integrates to one. We do not recognize this distribution. set.seed(1) px <- function (x, alpha, beta) { return((1 / factorial(x)) * (beta^alpha / gamma(alpha)) * (gamma(x + alpha) / (1 + beta)^(x + alpha))) } nsamps <- 1000 rx <- rgamma(nsamps, 1, 1) ryx <- rpois(nsamps, rx) ggplot(data = data.frame(x = rx, y = ryx), aes(x = x, y = y)) + geom_point() ggplot(data = data.frame(x = rx, y = ryx), aes(x = y)) + geom_bar(aes(y = (..count..)/sum(..count..))) + stat_function(fun = px, args = list(alpha = 1, beta = 1), color = "red") Exercise 5.5 Let \\(f(x,y) = cx^2y\\) for \\(x^2 \\leq y \\leq 1\\) and zero otherwise. Find such \\(c\\) that \\(f\\) is a PDF of a bivariate random variable. This exercise is borrowed from Wasserman. Solution. \\[\\begin{align} 1 &= \\int_{-1}^{1} \\int_{x^2}^1 cx^2y dy dx \\\\ &= \\int_{-1}^{1} cx^2 (\\frac{1}{2} - \\frac{x^4}{2}) dx \\\\ &= \\frac{c}{2} \\int_{-1}^{1} x^2 - x^6 dx \\\\ &= \\frac{c}{2} (\\frac{1}{3} + \\frac{1}{3} - \\frac{1}{7} - \\frac{1}{7}) \\\\ &= \\frac{c}{2} \\frac{8}{21} \\\\ &= \\frac{4c}{21} \\end{align}\\] It follows \\(c = \\frac{21}{4}\\). 
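A quick Monte Carlo sanity check of this constant (a sketch: sample uniform points on the box [-1, 1] x [0, 1], which has area 2, and average the density over the box):
set.seed(1)
x <- runif(100000, -1, 1)
y <- runif(100000, 0, 1)
f <- ifelse(x^2 <= y & y <= 1, (21 / 4) * x^2 * y, 0)
2 * mean(f) # should be close to 1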
5.3 Transformations Exercise 5.6 Let \\((X,Y)\\) be uniformly distributed on the unit ball \\(\\{(x,y,z) : x^2 + y^2 + z^2 \\leq 1\\}\\). Let \\(R = \\sqrt{X^2 + Y^2 + Z^2}\\). Find the CDF and PDF of \\(R\\). Solution. \\[\\begin{align} P(R < r) &= P(\\sqrt{X^2 + Y^2 + Z^2} < r) \\\\ &= P(X^2 + Y^2 + Z^2 < r^2) \\\\ &= \\frac{\\frac{4}{3} \\pi r^3}{\\frac{4}{3}\\pi} \\\\ &= r^3. \\end{align}\\] The second line shows us that we are looking at the probability which is represented by a smaller ball with radius \\(r\\). To get the probability, we divide it by the radius of the whole ball. We get the PDF by differentiating the CDF, so \\(p(r) = 3r^2\\). "],["integ.html", "Chapter 6 Integration 6.1 Monte Carlo integration 6.2 Lebesgue integrals", " Chapter 6 Integration This chapter deals with abstract and Monte Carlo integration. The students are expected to acquire the following knowledge: Theoretical How to calculate Lebesgue integrals for non-simple functions. R Monte Carlo integration. .fold-btn { float: right; margin: 5px 5px 0 0; } .fold { border: 1px solid black; min-height: 40px; } 6.1 Monte Carlo integration Exercise 6.1 Let \\(X\\) and \\(Y\\) be continuous random variables on the unit interval and \\(p(x,y) = 6(x - y)^2\\). Use Monte Carlo integration to estimate the probability \\(P(0.2 \\leq X \\leq 0.5, \\: 0.1 \\leq Y \\leq 0.2)\\). Can you find the exact value? set.seed(1) nsamps <- 1000 V <- (0.5 - 0.2) * (0.2 - 0.1) x1 <- runif(nsamps, 0.2, 0.5) x2 <- runif(nsamps, 0.1, 0.2) f_1 <- function (x, y) { return (6 * (x - y)^2) } mcint <- V * (1 / nsamps) * sum(f_1(x1, x2)) sdm <- sqrt((V^2 / nsamps) * var(f_1(x1, x2))) mcint ## [1] 0.008793445 sdm ## [1] 0.0002197686 F_1 <- function (x, y) { return (2 * x^3 * y - 3 * x^2 * y^2 + 2 * x * y^3) } F_1(0.5, 0.2) - F_1(0.2, 0.2) - F_1(0.5, 0.1) + F_1(0.2, 0.1) ## [1] 0.0087 6.2 Lebesgue integrals Exercise 6.2 (borrowed from Jagannathan) Find the Lebesgue integral of the following functions on (\\(\\mathbb{R}\\), \\(\\mathcal{B}(\\mathbb{R})\\), \\(\\lambda\\)). \\[\\begin{align} f(\\omega) = \\begin{cases} \\omega, & \\text{for } \\omega = 0,1,...,n \\\\ 0, & \\text{elsewhere} \\end{cases} \\end{align}\\] \\[\\begin{align} f(\\omega) = \\begin{cases} 1, & \\text{for } \\omega = \\mathbb{Q}^c \\cap [0,1] \\\\ 0, & \\text{elsewhere} \\end{cases} \\end{align}\\] \\[\\begin{align} f(\\omega) = \\begin{cases} n, & \\text{for } \\omega = \\mathbb{Q}^c \\cap [0,n] \\\\ 0, & \\text{elsewhere} \\end{cases} \\end{align}\\] Solution. \\[\\begin{align} \\int f(\\omega) d\\lambda = \\sum_{\\omega = 0}^n \\omega \\lambda(\\omega) = 0. \\end{align}\\] \\[\\begin{align} \\int f(\\omega) d\\lambda = 1 \\times \\lambda(\\mathbb{Q}^c \\cap [0,1]) = 1. \\end{align}\\] \\[\\begin{align} \\int f(\\omega) d\\lambda = n \\times \\lambda(\\mathbb{Q}^c \\cap [0,n]) = n^2. \\end{align}\\] Exercise 6.3 (borrowed from Jagannathan) Let \\(c \\in \\mathbb{R}\\) be fixed and (\\(\\mathbb{R}\\), \\(\\mathcal{B}(\\mathbb{R})\\)) a measurable space. If for any Borel set \\(A\\), \\(\\delta_c (A) = 1\\) if \\(c \\in A\\), and \\(\\delta_c (A) = 0\\) otherwise, then \\(\\delta_c\\) is called a Dirac measure. Let \\(g\\) be a non-negative, measurable function. Show that \\(\\int g d \\delta_c = g(c)\\). Solution. 
\\[\\begin{align} \\int g d \\delta_c &= \\sup_{q \\in S(g)} \\int q d \\delta_c \\\\ &= \\sup_{q \\in S(g)} \\sum_{i = 1}^n a_i \\delta_c(A_i) \\\\ &= \\sup_{q \\in S(g)} \\sum_{i = 1}^n a_i \\text{I}_{A_i}(c) \\\\ &= \\sup_{q \\in S(g)} q(c) \\\\ &= g(c) \\end{align}\\] "],["ev.html", "Chapter 7 Expected value 7.1 Discrete random variables 7.2 Continuous random variables 7.3 Sums, functions, conditional expectations 7.4 Covariance", " Chapter 7 Expected value This chapter deals with expected values of random variables. The students are expected to acquire the following knowledge: Theoretical Calculation of the expected value. Calculation of variance and covariance. Cauchy distribution. R Estimation of expected value. Estimation of variance and covariance. .fold-btn { float: right; margin: 5px 5px 0 0; } .fold { border: 1px solid black; min-height: 40px; } 7.1 Discrete random variables Exercise 7.1 (Bernoulli) Let \\(X \\sim \\text{Bernoulli}(p)\\). Find \\(E[X]\\). Find \\(Var[X]\\). R: Let \\(p = 0.4\\). Check your answers to a) and b) with a simulation. Solution. \\[\\begin{align*} E[X] = \\sum_{k=0}^1 p^k (1-p)^{1-k} k = p. \\end{align*}\\] \\[\\begin{align*} Var[X] = E[X^2] - E[X]^2 = \\sum_{k=0}^1 (p^k (1-p)^{1-k} k^2) - p^2 = p(1-p). \\end{align*}\\] set.seed(1) nsamps <- 1000 x <- rbinom(nsamps, 1, 0.4) mean(x) ## [1] 0.394 var(x) ## [1] 0.239003 0.4 * (1 - 0.4) ## [1] 0.24 Exercise 7.2 (Binomial) Let \\(X \\sim \\text{Binomial}(n,p)\\). Find \\(E[X]\\). Find \\(Var[X]\\). Solution. Let \\(X = \\sum_{i=0}^n X_i\\), where \\(X_i \\sim \\text{Bernoulli}(p)\\). Then, due to linearity of expectation \\[\\begin{align*} E[X] = E[\\sum_{i=0}^n X_i] = \\sum_{i=0}^n E[X_i] = np. \\end{align*}\\] Again let \\(X = \\sum_{i=0}^n X_i\\), where \\(X_i \\sim \\text{Bernoulli}(p)\\). Since the Bernoulli variables \\(X_i\\) are independent we have \\[\\begin{align*} Var[X] = Var[\\sum_{i=0}^n X_i] = \\sum_{i=0}^n Var[X_i] = np(1-p). \\end{align*}\\] Exercise 7.3 (Poisson) Let \\(X \\sim \\text{Poisson}(\\lambda)\\). Find \\(E[X]\\). Find \\(Var[X]\\). Solution. \\[\\begin{align*} E[X] &= \\sum_{k=0}^\\infty \\frac{\\lambda^k e^{-\\lambda}}{k!} k & \\\\ &= \\sum_{k=1}^\\infty \\frac{\\lambda^k e^{-\\lambda}}{k!} k & \\text{term at $k=0$ is 0} \\\\ &= e^{-\\lambda} \\lambda \\sum_{k=1}^\\infty \\frac{\\lambda^{k-1}}{(k - 1)!} & \\\\ &= e^{-\\lambda} \\lambda \\sum_{k=0}^\\infty \\frac{\\lambda^{k}}{k!} & \\\\ &= e^{-\\lambda} \\lambda e^\\lambda & \\\\ &= \\lambda. \\end{align*}\\] \\[\\begin{align*} Var[X] &= E[X^2] - E[X]^2 & \\\\ &= e^{-\\lambda} \\lambda \\sum_{k=1}^\\infty k \\frac{\\lambda^{k-1}}{(k - 1)!} - \\lambda^2 & \\\\ &= e^{-\\lambda} \\lambda \\sum_{k=1}^\\infty (k - 1) + 1) \\frac{\\lambda^{k-1}}{(k - 1)!} - \\lambda^2 & \\\\ &= e^{-\\lambda} \\lambda \\big(\\sum_{k=1}^\\infty (k - 1) \\frac{\\lambda^{k-1}}{(k - 1)!} + \\sum_{k=1}^\\infty \\frac{\\lambda^{k-1}}{(k - 1)!}\\Big) - \\lambda^2 & \\\\ &= e^{-\\lambda} \\lambda \\big(\\lambda\\sum_{k=2}^\\infty \\frac{\\lambda^{k-2}}{(k - 2)!} + e^\\lambda\\Big) - \\lambda^2 & \\\\ &= e^{-\\lambda} \\lambda \\big(\\lambda e^\\lambda + e^\\lambda\\Big) - \\lambda^2 & \\\\ &= \\lambda^2 + \\lambda - \\lambda^2 & \\\\ &= \\lambda. \\end{align*}\\] Exercise 7.4 (Geometric) Let \\(X \\sim \\text{Geometric}(p)\\). Find \\(E[X]\\). Hint: \\(\\frac{d}{dx} x^k = k x^{(k - 1)}\\). Solution. 
\\[\\begin{align*} E[X] &= \\sum_{k=0}^\\infty (1 - p)^k p k & \\\\ &= p (1 - p) \\sum_{k=0}^\\infty (1 - p)^{k-1} k & \\\\ &= p (1 - p) \\sum_{k=0}^\\infty -\\frac{d}{dp}(1 - p)^k & \\\\ &= p (1 - p) \\Big(-\\frac{d}{dp}\\Big) \\sum_{k=0}^\\infty (1 - p)^k & \\\\ &= p (1 - p) \\Big(-\\frac{d}{dp}\\Big) \\frac{1}{1 - (1 - p)} & \\text{geometric series} \\\\ &= \\frac{1 - p}{p} \\end{align*}\\] 7.2 Continuous random variables Exercise 7.5 (Gamma) Let \\(X \\sim \\text{Gamma}(\\alpha, \\beta)\\). Hint: \\(\\Gamma(z) = \\int_0^\\infty t^{z-1}e^{-t} dt\\) and \\(\\Gamma(z + 1) = z \\Gamma(z)\\). Find \\(E[X]\\). Find \\(Var[X]\\). R: Let \\(\\alpha = 10\\) and \\(\\beta = 2\\). Plot the density of \\(X\\). Add a horizontal line at the expected value that touches the density curve (geom_segment). Shade the area within a standard deviation of the expected value. Solution. \\[\\begin{align*} E[X] &= \\int_0^\\infty \\frac{\\beta^\\alpha}{\\Gamma(\\alpha)}x^\\alpha e^{-\\beta x} dx & \\\\ &= \\frac{\\beta^\\alpha}{\\Gamma(\\alpha)} \\int_0^\\infty x^\\alpha e^{-\\beta x} dx & \\text{ (let $t = \\beta x$)} \\\\ &= \\frac{\\beta^\\alpha}{\\Gamma(\\alpha) }\\int_0^\\infty \\frac{t^\\alpha}{\\beta^\\alpha} e^{-t} \\frac{dt}{\\beta} & \\\\ &= \\frac{1}{\\beta \\Gamma(\\alpha) }\\int_0^\\infty t^\\alpha e^{-t} dt & \\\\ &= \\frac{\\Gamma(\\alpha + 1)}{\\beta \\Gamma(\\alpha)} & \\\\ &= \\frac{\\alpha \\Gamma(\\alpha)}{\\beta \\Gamma(\\alpha)} & \\\\ &= \\frac{\\alpha}{\\beta}. & \\end{align*}\\] \\[\\begin{align*} Var[X] &= E[X^2] - E[X]^2 \\\\ &= \\int_0^\\infty \\frac{\\beta^\\alpha}{\\Gamma(\\alpha)}x^{\\alpha+1} e^{-\\beta x} dx - \\frac{\\alpha^2}{\\beta^2} \\\\ &= \\frac{\\Gamma(\\alpha + 2)}{\\beta^2 \\Gamma(\\alpha)} - \\frac{\\alpha^2}{\\beta^2} \\\\ &= \\frac{(\\alpha + 1)\\alpha\\Gamma(\\alpha)}{\\beta^2 \\Gamma(\\alpha)} - \\frac{\\alpha^2}{\\beta^2} \\\\ &= \\frac{\\alpha^2 + \\alpha}{\\beta^2} - \\frac{\\alpha^2}{\\beta^2} \\\\ &= \\frac{\\alpha}{\\beta^2}. \\end{align*}\\] set.seed(1) x <- seq(0, 25, by = 0.01) y <- dgamma(x, shape = 10, rate = 2) df <- data.frame(x = x, y = y) ggplot(df, aes(x = x, y = y)) + geom_line() + geom_segment(aes(x = 5, y = 0, xend = 5, yend = dgamma(5, shape = 10, rate = 2)), color = "red") + stat_function(fun = dgamma, args = list(shape = 10, rate = 2), xlim = c(5 - sqrt(10/4), 5 + sqrt(10/4)), geom = "area", fill = "gray", alpha = 0.4) Exercise 7.6 (Beta) Let \\(X \\sim \\text{Beta}(\\alpha, \\beta)\\). Find \\(E[X]\\). Hint 1: \\(\\text{B}(x,y) = \\int_0^1 t^{x-1} (1 - t)^{y-1} dt\\). Hint 2: \\(\\text{B}(x + 1, y) = \\text{B}(x,y)\\frac{x}{x + y}\\). Find \\(Var[X]\\). Solution. \\[\\begin{align*} E[X] &= \\int_0^1 \\frac{x^{\\alpha - 1} (1 - x)^{\\beta - 1}}{\\text{B}(\\alpha, \\beta)} x dx \\\\ &= \\frac{1}{\\text{B}(\\alpha, \\beta)}\\int_0^1 x^{\\alpha} (1 - x)^{\\beta - 1} dx \\\\ &= \\frac{1}{\\text{B}(\\alpha, \\beta)} \\text{B}(\\alpha + 1, \\beta) \\\\ &= \\frac{1}{\\text{B}(\\alpha, \\beta)} \\text{B}(\\alpha, \\beta) \\frac{\\alpha}{\\alpha + \\beta} \\\\ &= \\frac{\\alpha}{\\alpha + \\beta}. 
\\\\ \\end{align*}\\] \\[\\begin{align*} Var[X] &= E[X^2] - E[X]^2 \\\\ &= \\int_0^1 \\frac{x^{\\alpha - 1} (1 - x)^{\\beta - 1}}{\\text{B}(\\alpha, \\beta)} x^2 dx - \\frac{\\alpha^2}{(\\alpha + \\beta)^2} \\\\ &= \\frac{1}{\\text{B}(\\alpha, \\beta)}\\int_0^1 x^{\\alpha + 1} (1 - x)^{\\beta - 1} dx - \\frac{\\alpha^2}{(\\alpha + \\beta)^2} \\\\ &= \\frac{1}{\\text{B}(\\alpha, \\beta)} \\text{B}(\\alpha + 2, \\beta) - \\frac{\\alpha^2}{(\\alpha + \\beta)^2} \\\\ &= \\frac{1}{\\text{B}(\\alpha, \\beta)} \\text{B}(\\alpha + 1, \\beta) \\frac{\\alpha + 1}{\\alpha + \\beta + 1} - \\frac{\\alpha^2}{(\\alpha + \\beta)^2} \\\\ &= \\frac{\\alpha + 1}{\\alpha + \\beta + 1} \\frac{\\alpha}{\\alpha + \\beta} - \\frac{\\alpha^2}{(\\alpha + \\beta)^2}\\\\ &= \\frac{\\alpha \\beta}{(\\alpha + \\beta)^2(\\alpha + \\beta + 1)}. \\end{align*}\\] Exercise 7.7 (Exponential) Let \\(X \\sim \\text{Exp}(\\lambda)\\). Find \\(E[X]\\). Hint: \\(\\Gamma(z + 1) = z\\Gamma(z)\\) and \\(\\Gamma(1) = 1\\). Find \\(Var[X]\\). Solution. \\[\\begin{align*} E[X] &= \\int_0^\\infty \\lambda e^{-\\lambda x} x dx & \\\\ &= \\lambda \\int_0^\\infty x e^{-\\lambda x} dx & \\\\ &= \\lambda \\int_0^\\infty \\frac{t}{\\lambda} e^{-t} \\frac{dt}{\\lambda} & \\text{$t = \\lambda x$}\\\\ &= \\lambda \\lambda^{-2} \\Gamma(2) & \\text{definition of gamma function} \\\\ &= \\lambda^{-1}. \\end{align*}\\] \\[\\begin{align*} Var[X] &= E[X^2] - E[X]^2 & \\\\ &= \\int_0^\\infty \\lambda e^{-\\lambda x} x^2 dx - \\lambda^{-2} & \\\\ &= \\lambda \\int_0^\\infty \\frac{t^2}{\\lambda^2} e^{-t} \\frac{dt}{\\lambda} - \\lambda^{-2} & \\text{$t = \\lambda x$} \\\\ &= \\lambda \\lambda^{-3} \\Gamma(3) - \\lambda^{-2} & \\text{definition of gamma function} & \\\\ &= \\lambda^{-2} 2 \\Gamma(2) - \\lambda^{-2} & \\\\ &= 2 \\lambda^{-2} - \\lambda^{-2} & \\\\ &= \\lambda^{-2}. & \\\\ \\end{align*}\\] Exercise 7.8 (Normal) Let \\(X \\sim \\text{N}(\\mu, \\sigma)\\). Show that \\(E[X] = \\mu\\). Hint: Use the error function \\(\\text{erf}(x) = \\frac{1}{\\sqrt(\\pi)} \\int_{-x}^x e^{-t^2} dt\\). The statistical interpretation of this function is that if \\(Y \\sim \\text{N}(0, 0.5)\\), then the error function describes the probability of \\(Y\\) falling between \\(-x\\) and \\(x\\). Also, \\(\\text{erf}(\\infty) = 1\\). Show that \\(Var[X] = \\sigma^2\\). Hint: Start with the definition of variance. Solution. \\[\\begin{align*} E[X] &= \\int_{-\\infty}^\\infty \\frac{1}{\\sqrt{2\\pi \\sigma^2}} e^{-\\frac{(x - \\mu)^2}{2\\sigma^2}} x dx & \\\\ &= \\frac{1}{\\sqrt{2\\pi \\sigma^2}} \\int_{-\\infty}^\\infty x e^{-\\frac{(x - \\mu)^2}{2\\sigma^2}} dx & \\\\ &= \\frac{1}{\\sqrt{2\\pi \\sigma^2}} \\int_{-\\infty}^\\infty \\Big(t \\sqrt{2\\sigma^2} + \\mu\\Big)e^{-t^2} \\sqrt{2 \\sigma^2} dt & t = \\frac{x - \\mu}{\\sqrt{2}\\sigma} \\\\ &= \\frac{\\sqrt{2\\sigma^2}}{\\sqrt{\\pi}} \\int_{-\\infty}^\\infty t e^{-t^2} dt + \\frac{1}{\\sqrt{\\pi}} \\int_{-\\infty}^\\infty \\mu e^{-t^2} dt & \\\\ \\end{align*}\\] Let us calculate these integrals separately. \\[\\begin{align*} \\int t e^{-t^2} dt &= -\\frac{1}{2}\\int e^{s} ds & s = -t^2 \\\\ &= -\\frac{e^s}{2} + C \\\\ &= -\\frac{e^{-t^2}}{2} + C & \\text{undoing substitution}. \\end{align*}\\] Inserting the integration limits we get \\[\\begin{align*} \\int_{-\\infty}^\\infty t e^{-t^2} dt &= 0, \\end{align*}\\] due to the integrated function being symmetric. 
Reordering the second integral we get \\[\\begin{align*} \\mu \\frac{1}{\\sqrt{\\pi}} \\int_{-\\infty}^\\infty e^{-t^2} dt &= \\mu \\text{erf}(\\infty) & \\text{definition of error function} \\\\ &= \\mu & \\text{probability of $Y$ falling between $-\\infty$ and $\\infty$}. \\end{align*}\\] Combining all of the above we get \\[\\begin{align*} E[X] &= \\frac{\\sqrt{2\\sigma^2}}{\\sqrt{\\pi}} \\times 0 + \\mu &= \\mu.\\\\ \\end{align*}\\] \\[\\begin{align*} Var[X] &= E[(X - E[X])^2] \\\\ &= E[(X - \\mu)^2] \\\\ &= \\frac{1}{\\sqrt{2\\pi \\sigma^2}} \\int_{-\\infty}^\\infty (x - \\mu)^2 e^{-\\frac{(x - \\mu)^2}{2\\sigma^2}} dx \\\\ &= \\frac{\\sigma^2}{\\sqrt{2\\pi}} \\int_{-\\infty}^\\infty t^2 e^{-\\frac{t^2}{2}} dt \\\\ &= \\frac{\\sigma^2}{\\sqrt{2\\pi}} \\bigg(\\Big(- t e^{-\\frac{t^2}{2}} |_{-\\infty}^\\infty \\Big) + \\int_{-\\infty}^\\infty e^{-\\frac{t^2}{2}} \\bigg) dt & \\text{integration by parts} \\\\ &= \\frac{\\sigma^2}{\\sqrt{2\\pi}} \\sqrt{2 \\pi} \\int_{-\\infty}^\\infty \\frac{1}{\\sqrt(\\pi)}e^{-s^2} \\bigg) & s = \\frac{t}{\\sqrt{2}} \\text{ and evaluating the left expression at the bounds} \\\\ &= \\frac{\\sigma^2}{\\sqrt{2\\pi}} \\sqrt{2 \\pi} \\Big(\\text{erf}(\\infty) & \\text{definition of error function} \\\\ &= \\sigma^2. \\end{align*}\\] 7.3 Sums, functions, conditional expectations Exercise 7.9 (Expectation of transformations) Let \\(X\\) follow a normal distribution with mean \\(\\mu\\) and variance \\(\\sigma^2\\). Find \\(E[2X + 4]\\). Find \\(E[X^2]\\). Find \\(E[\\exp(X)]\\). Hint: Use the error function \\(\\text{erf}(x) = \\frac{1}{\\sqrt(\\pi)} \\int_{-x}^x e^{-t^2} dt\\). Also, \\(\\text{erf}(\\infty) = 1\\). R: Check your results numerically for \\(\\mu = 0.4\\) and \\(\\sigma^2 = 0.25\\) and plot the densities of all four distributions. Solution. \\[\\begin{align} E[2X + 4] &= 2E[X] + 4 & \\text{linearity of expectation} \\\\ &= 2\\mu + 4. \\\\ \\end{align}\\] \\[\\begin{align} E[X^2] &= E[X]^2 + Var[X] & \\text{definition of variance} \\\\ &= \\mu^2 + \\sigma^2. \\end{align}\\] \\[\\begin{align} E[\\exp(X)] &= \\int_{-\\infty}^\\infty \\frac{1}{\\sqrt{2\\pi \\sigma^2}} e^{-\\frac{(x - \\mu)^2}{2\\sigma^2}} e^x dx & \\\\ &= \\frac{1}{\\sqrt{2\\pi \\sigma^2}} \\int_{-\\infty}^\\infty e^{\\frac{2 \\sigma^2 x}{2\\sigma^2} -\\frac{(x - \\mu)^2}{2\\sigma^2}} dx & \\\\ &= \\frac{1}{\\sqrt{2\\pi \\sigma^2}} \\int_{-\\infty}^\\infty e^{-\\frac{x^2 - 2x(\\mu + \\sigma^2) + \\mu^2}{2\\sigma^2}} dx & \\\\ &= \\frac{1}{\\sqrt{2\\pi \\sigma^2}} \\int_{-\\infty}^\\infty e^{-\\frac{(x - (\\mu + \\sigma^2))^2 + \\mu^2 - (\\mu + \\sigma^2)^2}{2\\sigma^2}} dx & \\text{complete the square} \\\\ &= \\frac{1}{\\sqrt{2\\pi \\sigma^2}} e^{\\frac{- \\mu^2 + (\\mu + \\sigma^2)^2}{2\\sigma^2}} \\int_{-\\infty}^\\infty e^{-\\frac{(x - (\\mu + \\sigma^2))^2}{2\\sigma^2}} dx & \\\\ &= \\frac{1}{\\sqrt{2\\pi \\sigma^2}} e^{\\frac{- \\mu^2 + (\\mu + \\sigma^2)^2}{2\\sigma^2}} \\sigma \\sqrt{2 \\pi} \\text{erf}(\\infty) & \\\\ &= e^{\\frac{2\\mu + \\sigma^2}{2}}. \\end{align}\\] set.seed(1) mu <- 0.4 sigma <- 0.5 x <- rnorm(100000, mean = mu, sd = sigma) mean(2*x + 4) ## [1] 4.797756 2 * mu + 4 ## [1] 4.8 mean(x^2) ## [1] 0.4108658 mu^2 + sigma^2 ## [1] 0.41 mean(exp(x)) ## [1] 1.689794 exp((2 * mu + sigma^2) / 2) ## [1] 1.690459 Exercise 7.10 (Sum of independent random variables) Borrowed from Wasserman. Let \\(X_1, X_2,...,X_n\\) be IID random variables with expected value \\(E[X_i] = \\mu\\) and variance \\(Var[X_i] = \\sigma^2\\). 
Find the expected value and variance of \\(\\bar{X} = \\frac{1}{n} \\sum_{i=1}^n X_i\\). \\(\\bar{X}\\) is called a statistic (a function of the values in a sample). It is itself a random variable and its distribution is called a sampling distribution. R: Take \\(n = 5, 10, 100, 1000\\) samples from the N(\\(2\\), \\(6\\)) distribution 10000 times. Plot the theoretical density and the densities of \\(\\bar{X}\\) statistic for each \\(n\\). Intuitively, are the results in correspondence with your calculations? Check them numerically. Solution. Let us start with the expectation of \\(\\bar{X}\\). \\[\\begin{align} E[\\bar{X}] &= E[\\frac{1}{n} \\sum_{i=1}^n X_i] & \\\\ &= \\frac{1}{n} E[\\sum_{i=1}^n X_i] & \\text{ (multiplication with a scalar)} \\\\ &= \\frac{1}{n} \\sum_{i=1}^n E[X_i] & \\text{ (linearity)} \\\\ &= \\frac{1}{n} n \\mu & \\\\ &= \\mu. \\end{align}\\] Now the variance \\[\\begin{align} Var[\\bar{X}] &= Var[\\frac{1}{n} \\sum_{i=1}^n X_i] & \\\\ &= \\frac{1}{n^2} Var[\\sum_{i=1}^n X_i] & \\text{ (multiplication with a scalar)} \\\\ &= \\frac{1}{n^2} \\sum_{i=1}^n Var[X_i] & \\text{ (independence of samples)} \\\\ &= \\frac{1}{n^2} n \\sigma^2 & \\\\ &= \\frac{1}{n} \\sigma^2. \\end{align}\\] set.seed(1) nsamps <- 10000 mu <- 2 sigma <- sqrt(6) N <- c(5, 10, 100, 500) X <- matrix(data = NA, nrow = nsamps, ncol = length(N)) ind <- 1 for (n in N) { for (i in 1:nsamps) { X[i,ind] <- mean(rnorm(n, mu, sigma)) } ind <- ind + 1 } colnames(X) <- N X <- melt(as.data.frame(X)) ggplot(data = X, aes(x = value, colour = variable)) + geom_density() + stat_function(data = data.frame(x = seq(-2, 6, by = 0.01)), aes(x = x), fun = dnorm, args = list(mean = mu, sd = sigma), color = "black") Exercise 7.11 (Conditional expectation) Let \\(X \\in \\mathbb{R}_0^+\\) and \\(Y \\in \\mathbb{N}_0\\) be random variables with joint distribution \\(p_{XY}(X,Y) = \\frac{1}{y + 1} e^{-\\frac{x}{y + 1}} 0.5^{y + 1}\\). Find \\(E[X | Y = y]\\) by first finding \\(p_Y\\) and then \\(p_{X|Y}\\). Find \\(E[X]\\). R: check your answers to a) and b) by drawing 10000 samples from \\(p_Y\\) and \\(p_{X|Y}\\). Solution. \\[\\begin{align} p(y) &= \\int_0^\\infty \\frac{1}{y + 1} e^{-\\frac{x}{y + 1}} 0.5^{y + 1} dx \\\\ &= \\frac{0.5^{y + 1}}{y + 1} \\int_0^\\infty e^{-\\frac{x}{y + 1}} dx \\\\ &= \\frac{0.5^{y + 1}}{y + 1} (y + 1) \\\\ &= 0.5^{y + 1} \\\\ &= 0.5(1 - 0.5)^y. \\end{align}\\] We recognize this as the geometric distribution. \\[\\begin{align} p(x|y) &= \\frac{p(x,y)}{p(y)} \\\\ &= \\frac{1}{y + 1} e^{-\\frac{x}{y + 1}}. \\end{align}\\] We recognize this as the exponential distribution. \\[\\begin{align} E[X | Y = y] &= \\int_0^\\infty x \\frac{1}{y + 1} e^{-\\frac{x}{y + 1}} dx \\\\ &= y + 1 & \\text{expected value of the exponential distribution} \\end{align}\\] Use the law of iterated expectation. \\[\\begin{align} E[X] &= E[E[X | Y]] \\\\ &= E[Y + 1] \\\\ &= E[Y] + 1 \\\\ &= \\frac{1 - 0.5}{0.5} + 1 \\\\ &= 2. \\end{align}\\] set.seed(1) y <- rgeom(100000, 0.5) x <- rexp(100000, rate = 1 / (y + 1)) x2 <- x[y == 3] mean(x2) ## [1] 4.048501 3 + 1 ## [1] 4 mean(x) ## [1] 2.007639 (1 - 0.5) / 0.5 + 1 ## [1] 2 Exercise 7.12 (Cauchy distribution) Let \\(p(x | x_0, \\gamma) = \\frac{1}{\\pi \\gamma \\Big(1 + \\big(\\frac{x - x_0}{\\gamma}\\big)^2\\Big)}\\). A random variable with this PDF follows a Cauchy distribution. This distribution is symmetric and has wider tails than the normal distribution. R: Draw \\(n = 1,...,1000\\) samples from a standard normal and \\(\\text{Cauchy}(0, 1)\\). 
For each \\(n\\) plot the mean and the median of the sample using facets. Interpret the results. To get a mathematical explanation of the results in a), evaluate the integral \\(\\int_0^\\infty \\frac{x}{1 + x^2} dx\\) and consider that \\(E[X] = \\int_{-\\infty}^\\infty \\frac{x}{1 + x^2}dx\\). set.seed(1) n <- 1000 means_n <- vector(mode = "numeric", length = n) means_c <- vector(mode = "numeric", length = n) medians_n <- vector(mode = "numeric", length = n) medians_c <- vector(mode = "numeric", length = n) for (i in 1:n) { tmp_n <- rnorm(i) tmp_c <- rcauchy(i) means_n[i] <- mean(tmp_n) means_c[i] <- mean(tmp_c) medians_n[i] <- median(tmp_n) medians_c[i] <- median(tmp_c) } df <- data.frame("distribution" = c(rep("normal", 2 * n), rep("Cauchy", 2 * n)), "type" = c(rep("mean", n), rep("median", n), rep("mean", n), rep("median", n)), "value" = c(means_n, medians_n, means_c, medians_c), "n" = rep(1:n, times = 4)) ggplot(df, aes(x = n, y = value)) + geom_line(alpha = 0.5) + facet_wrap(~ type + distribution , scales = "free") Solution. \\[\\begin{align} \\int_0^\\infty \\frac{x}{1 + x^2} dx &= \\frac{1}{2} \\int_1^\\infty \\frac{1}{u} du & u = 1 + x^2 \\\\ &= \\frac{1}{2} \\ln(u) |_1^\\infty. \\end{align}\\] This integral is not finite. The same holds for the negative part. Therefore, the expectation is undefined, as \\(E[|X|] = \\infty\\). Why can we not just claim that \\(f(x) = x / (1 + x^2)\\) is odd and \\(\\int_{-\\infty}^\\infty f(x) dx = 0\\)? By definition of the Lebesgue integral \\(\\int_{-\\infty}^{\\infty} f= \\int_{-\\infty}^{\\infty} f_+-\\int_{-\\infty}^{\\infty} f_-\\). At least one of the two integrals needs to be finite for \\(\\int_{-\\infty}^{\\infty} f\\) to be well-defined. However \\(\\int_{-\\infty}^{\\infty} f_+=\\int_0^{\\infty} x/(1+x^2) dx\\) and \\(\\int_{-\\infty}^{\\infty} f_-=\\int_{-\\infty}^{0} |x|/(1+x^2) dx\\). We have just shown that both of these integrals are infinite, which implies that their sum is also infinite. 7.4 Covariance Exercise 7.13 Below are five paired observations of random variables \\(X\\) and \\(Y\\): X: 2.1, -0.5, 1, -2, 4; Y: 8, 11, 10, 12, 9. Find sample covariance of \\(X\\) and \\(Y\\). Find sample variances of \\(X\\) and \\(Y\\). Find sample correlation of \\(X\\) and \\(Y\\). Find sample variance of \\(Z = 2X - 3Y\\). Solution. \\(\\bar{X} = 0.92\\) and \\(\\bar{Y} = 10\\). \\[\\begin{align} s(X, Y) &= \\frac{1}{n - 1} \\sum_{i=1}^5 (X_i - 0.92) (Y_i - 10) \\\\ &= -3.175. \\end{align}\\] \\[\\begin{align} s(X) &= \\frac{\\sum_{i=1}^5(X_i - 0.92)^2}{5 - 1} \\\\ &= 5.357. \\end{align}\\] \\[\\begin{align} s(Y) &= \\frac{\\sum_{i=1}^5(Y_i - 10)^2}{5 - 1} \\\\ &= 2.5. \\end{align}\\] \\[\\begin{align} r(X,Y) &= \\frac{s(X, Y)}{\\sqrt{s(X)s(Y)}} \\\\ &= \\frac{-3.175}{\\sqrt{5.357 \\times 2.5}} \\\\ &= -0.868. \\end{align}\\] \\[\\begin{align} s(Z) &= 2^2 s(X) + (-3)^2 s(Y) + 2 \\times 2 \\times (-3) s(X, Y) \\\\ &= 4 \\times 5.357 + 9 \\times 2.5 - 12 \\times (-3.175) \\\\ &= 82.028. \\end{align}\\] Exercise 7.14 Let \\(X \\sim \\text{Uniform}(0,1)\\) and \\(Y | X = x \\sim \\text{Uniform}(0,x)\\). Find the covariance of \\(X\\) and \\(Y\\). Find the correlation of \\(X\\) and \\(Y\\). R: check your answers to a) and b) with simulation. Plot \\(X\\) against \\(Y\\) on a scatterplot. Solution. The joint PDF is \\(p(x,y) = p(x)p(y|x) = \\frac{1}{x}\\).
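Here \\(p(x) = 1\\) on \\([0,1]\\) and \\(p(y|x) = \\frac{1}{x}\\) on \\([0,x]\\), so the joint density equals \\(\\frac{1}{x}\\) on the triangle \\(0 \\leq y \\leq x \\leq 1\\) and is zero elsewhere.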
\\[\\begin{align} Cov(X,Y) &= E[XY] - E[X]E[Y] \\\\ \\end{align}\\] Let us first evaluate the first term: \\[\\begin{align} E[XY] &= \\int_0^1 \\int_0^x x y \\frac{1}{x} dy dx \\\\ &= \\int_0^1 \\int_0^x y dy dx \\\\ &= \\int_0^1 \\frac{x^2}{2} dx \\\\ &= \\frac{1}{6}. \\end{align}\\] Now let us find \\(E[Y]\\), \\(E[X]\\) is trivial. \\[\\begin{align} E[Y] = E[E[Y | X]] = E[\\frac{X}{2}] = \\frac{1}{2} \\int_0^1 x dx = \\frac{1}{4}. \\end{align}\\] Combining all: \\[\\begin{align} Cov(X,Y) &= \\frac{1}{6} - \\frac{1}{2} \\frac{1}{4} = \\frac{1}{24}. \\end{align}\\] \\[\\begin{align} \\rho(X,Y) &= \\frac{Cov(X,Y)}{\\sqrt{Var[X]Var[Y]}} \\\\ \\end{align}\\] Let us calculate \\(Var[X]\\). \\[\\begin{align} Var[X] &= E[X^2] - \\frac{1}{4} \\\\ &= \\int_0^1 x^2 - \\frac{1}{4} \\\\ &= \\frac{1}{3} - \\frac{1}{4} \\\\ &= \\frac{1}{12}. \\end{align}\\] Let us calculate \\(E[E[Y^2|X]]\\). \\[\\begin{align} E[E[Y^2|X]] &= E[\\frac{x^2}{3}] \\\\ &= \\frac{1}{9}. \\end{align}\\] Then \\(Var[Y] = \\frac{1}{9} - \\frac{1}{16} = \\frac{7}{144}\\). Combining all \\[\\begin{align} \\rho(X,Y) &= \\frac{\\frac{1}{24}}{\\sqrt{\\frac{1}{12}\\frac{7}{144}}} \\\\ &= 0.65. \\end{align}\\] set.seed(1) nsamps <- 10000 x <- runif(nsamps) y <- runif(nsamps, 0, x) cov(x, y) ## [1] 0.04274061 1/24 ## [1] 0.04166667 cor(x, y) ## [1] 0.6629567 (1 / 24) / (sqrt(7 / (12 * 144))) ## [1] 0.6546537 ggplot(data.frame(x = x, y = y), aes(x = x, y = y)) + geom_point(alpha = 0.2) + geom_smooth(method = "lm") "],["mrv.html", "Chapter 8 Multivariate random variables 8.1 Multinomial random variables 8.2 Multivariate normal random variables 8.3 Transformations", " Chapter 8 Multivariate random variables This chapter deals with multivariate random variables. The students are expected to acquire the following knowledge: Theoretical Multinomial distribution. Multivariate normal distribution. Cholesky decomposition. Eigendecomposition. R Sampling from the multinomial distribution. Sampling from the multivariate normal distribution. Matrix decompositions. .fold-btn { float: right; margin: 5px 5px 0 0; } .fold { border: 1px solid black; min-height: 40px; } 8.1 Multinomial random variables Exercise 8.1 Let \\(X_i\\), \\(i = 1,...,k\\) represent \\(k\\) events, and \\(p_i\\) the probabilities of these events happening in a trial. Let \\(n\\) be the number of trials, and \\(X\\) a multivariate random variable, the collection of \\(X_i\\). Then \\(p(x) = \\frac{n!}{x_1!x_2!...x_k!} p_1^{x_1} p_2^{x_2}...p_k^{x_k}\\) is the PMF of a multinomial distribution, where \\(n = \\sum_{i = 1}^k x_i\\). Show that the marginal distribution of \\(X_i\\) is a binomial distribution. Take 1000 samples from the multinomial distribution with \\(n=4\\) and probabilities \\(p = (0.2, 0.2, 0.5, 0.1)\\). Then take 1000 samples from four binomial distributions with the same parameters. Inspect the results visually. Solution. We will approach this proof from the probabilistic point of view. W.L.O.G. let \\(x_1\\) be the marginal distribution we are interested in. The term \\(p^{x_1}\\) denotes the probability that event 1 happened \\(x_1\\) times. For this event not to happen, one of the other events needs to happen. So for each of the remaining trials, the probability of another event is \\(\\sum_{i=2}^k p_i = 1 - p_1\\), and there were \\(n - x_1\\) such trials. What is left to do is to calculate the number of permutations of event 1 happening and event 1 not happening. We choose \\(x_1\\) trials, from \\(n\\) trials. 
Therefore \\(p(x_1) = \\binom{n}{x_1} p_1^{x_1} (1 - p_1)^{n - x_1}\\), which is the binomial PMF. Interested students are encouraged to prove this mathematically. set.seed(1) nsamps <- 1000 samps_mult <- rmultinom(nsamps, 4, prob = c(0.2, 0.2, 0.5, 0.1)) samps_mult <- as_tibble(t(samps_mult)) %>% gather() samps <- tibble( V1 = rbinom(nsamps, 4, 0.2), V2 = rbinom(nsamps, 4, 0.2), V3 = rbinom(nsamps, 4, 0.5), V4 = rbinom(nsamps, 4, 0.1) ) %>% gather() %>% bind_rows(samps_mult) %>% bind_cols("dist" = c(rep("binomial", 4*nsamps), rep("multinomial", 4*nsamps))) ggplot(samps, aes(x = value, fill = dist)) + geom_bar(position = "dodge") + facet_wrap(~ key) Exercise 8.2 (Multinomial expected value) Find the expected value, variance and covariance of the multinomial distribution. Hint: First find the expected value for \\(n = 1\\) and then use the fact that the trials are independent. Solution. Let us first calculate the expected value of \\(X_1\\), when \\(n = 1\\). \\[\\begin{align} E[X_1] &= \\sum_{n_1 = 0}^1 \\sum_{n_2 = 0}^1 ... \\sum_{n_k = 0}^1 \\frac{1}{n_1!n_2!...n_k!}p_1^{n_1}p_2^{n_2}...p_k^{n_k}n_1 \\\\ &= \\sum_{n_1 = 0}^1 \\frac{p_1^{n_1} n_1}{n_1!} \\sum_{n_2 = 0}^1 ... \\sum_{n_k = 0}^1 \\frac{1}{n_2!...n_k!}p_2^{n_2}...p_k^{n_k} \\end{align}\\] When \\(n_1 = 0\\) then the whole terms is zero, so we do not need to evaluate other sums. When \\(n_1 = 1\\), all other \\(n_i\\) must be zero, as we have \\(1 = \\sum_{i=1}^k n_i\\). Therefore the other sums equal \\(1\\). So \\(E[X_1] = p_1\\) and \\(E[X_i] = p_i\\) for \\(i = 1,...,k\\). Now let \\(Y_j\\), \\(j = 1,...,n\\), have a multinomial distribution with \\(n = 1\\), and let \\(X\\) have a multinomial distribution with an arbitrary \\(n\\). Then we can write \\(X = \\sum_{j=1}^n Y_j\\). And due to independence \\[\\begin{align} E[X] &= E[\\sum_{j=1}^n Y_j] \\\\ &= \\sum_{j=1}^n E[Y_j] \\\\ &= np. \\end{align}\\] For the variance, we need \\(E[X^2]\\). Let us follow the same procedure as above and first calculate \\(E[X_i]\\) for \\(n = 1\\). The only thing that changes is that the term \\(n_i\\) becomes \\(n_i^2\\). Since we only have \\(0\\) and \\(1\\) this does not change the outcome. So \\[\\begin{align} Var[X_i] &= E[X_i^2] - E[X_i]^2\\\\ &= p_i(1 - p_i). \\end{align}\\] Analogous to above for arbitrary \\(n\\) \\[\\begin{align} Var[X] &= E[X^2] - E[X]^2 \\\\ &= \\sum_{j=1}^n E[Y_j^2] - \\sum_{j=1}^n E[Y_j]^2 \\\\ &= \\sum_{j=1}^n E[Y_j^2] - E[Y_j]^2 \\\\ &= \\sum_{j=1}^n p(1-p) \\\\ &= np(1-p). \\end{align}\\] To calculate the covariance, we need \\(E[X_i X_j]\\). Again, let us start with \\(n = 1\\). Without loss of generality, let us assume \\(i = 1\\) and \\(j = 2\\). \\[\\begin{align} E[X_1 X_2] = \\sum_{n_1 = 0}^1 \\sum_{n_2 = 0}^1 \\frac{p_1^{n_1} n_1}{n_1!} \\frac{p_2^{n_2} n_2}{n_2!} \\sum_{n_3 = 0}^1 ... \\sum_{n_k = 0}^1 \\frac{1}{n_3!...n_k!}p_3^{n_3}...p_k^{n_k}. \\end{align}\\] In the above expression, at each iteration we multiply with \\(n_1\\) and \\(n_2\\). Since \\(n = 1\\), one of these always has to be zero. Therefore \\(E[X_1 X_2] = 0\\) and \\[\\begin{align} Cov(X_i, X_j) &= E[X_i X_j] - E[X_i]E[X_j] \\\\ &= - p_i p_j. \\end{align}\\] For arbitrary \\(n\\), let \\(X = \\sum_{t = 1}^n Y_t\\) be the sum of independent multinomial random variables \\(Y_t = [X_{1t}, X_{2t},...,X_{kt}]^T\\) with \\(n=1\\). Then \\(X_1 = \\sum_{t = 1}^n X_{1t}\\) and \\(X_2 = \\sum_{l = 1}^n X_{2l}\\). 
\\[\\begin{align} Cov(X_1, X_2) &= E[X_1 X_2] - E[X_1] E[X_2] \\\\ &= E[\\sum_{t = 1}^n X_{1t} \\sum_{l = 1}^n X_{2l}] - n^2 p_1 p_2 \\\\ &= \\sum_{t = 1}^n \\sum_{l = 1}^n E[X_{1t} X_{2l}] - n^2 p_1 p_2. \\end{align}\\] For \\(X_{1t}\\) and \\(X_{2l}\\) the expected value is zero when \\(t = l\\). When \\(t \\neq l\\) then they are independent, so the expected value is the product \\(p_1 p_2\\). There are \\(n^2\\) total terms, and for \\(n\\) of them \\(t = l\\) holds. So \\(E[X_1 X_2] = (n^2 - n) p_1 p_2\\). Inserting into the above \\[\\begin{align} Cov(X_1, X_2) &= (n^2 - n) p_1 p_2 - n^2 p_1 p_2 \\\\ &= - n p_1 p_2. \\end{align}\\] 8.2 Multivariate normal random variables Exercise 8.3 (Cholesky decomposition) Let \\(X\\) be a random vector of length \\(k\\) with \\(X_i \\sim \\text{N}(0, 1)\\) and \\(LL^*\\) the Cholesky decomposition of a Hermitian positive-definite matrix \\(A\\). Let \\(\\mu\\) be a vector of length \\(k\\). Find the distribution of the random vector \\(Y = \\mu + L X\\). Find the Cholesky decomposition of \\(A = \\begin{bmatrix} 2 & 1.2 \\\\ 1.2 & 1 \\end{bmatrix}\\). R: Use the results from a) and b) to sample from the MVN distribution \\(\\text{N}(\\mu, A)\\), where \\(\\mu = [1.5, -1]^T\\). Plot a scatterplot and compare it to direct samples from the multivariate normal distribution (rmvnorm). Solution. \\(X\\) has an independent normal distribution of dimension \\(k\\). Then \\[\\begin{align} Y = \\mu + L X &\\sim \\text{N}(\\mu, LL^T) \\\\ &\\sim \\text{N}(\\mu, A). \\end{align}\\] Solve \\[\\begin{align} \\begin{bmatrix} a & 0 \\\\ b & c \\end{bmatrix} \\begin{bmatrix} a & b \\\\ 0 & c \\end{bmatrix} = \\begin{bmatrix} 2 & 1.2 \\\\ 1.2 & 1 \\end{bmatrix} \\end{align}\\] # a set.seed(1) nsamps <- 1000 X <- matrix(data = rnorm(nsamps * 2), ncol = 2) mu <- c(1.5, -1) L <- matrix(data = c(sqrt(2), 0, 1.2 / sqrt(2), sqrt(1 - 1.2^2/2)), ncol = 2, byrow = TRUE) Y <- t(mu + L %*% t(X)) plot_df <- data.frame(rbind(X, Y), c(rep("X", nsamps), rep("Y", nsamps))) colnames(plot_df) <- c("D1", "D2", "var") ggplot(data = plot_df, aes(x = D1, y = D2, colour = as.factor(var))) + geom_point() Exercise 8.4 (Eigendecomposition) R: Let \\(\\Sigma = U \\Lambda U^T\\) be the eigendecomposition of covariance matrix \\(\\Sigma\\). Follow the procedure below, to sample from a multivariate normal with \\(\\mu = [-2, 1]^T\\) and \\(\\Sigma = \\begin{bmatrix} 0.3, -0.5 \\\\ -0.5, 1.6 \\end{bmatrix}\\): Sample from two independent standardized normal distributions to get \\(X\\). Find the eigendecomposition of \\(X\\) (eigen). Multiply \\(X\\) by \\(\\Lambda^{\\frac{1}{2}}\\) to get \\(X2\\). Consider how the eigendecomposition for \\(X2\\) changes compared to \\(X\\). Multiply \\(X2\\) by \\(U\\) to get \\(X3\\). Consider how the eigendecomposition for \\(X3\\) changes compared to \\(X2\\). Add \\(\\mu\\) to \\(X3\\). Consider how the eigendecomposition for \\(X4\\) changes compared to \\(X3\\). Plot the data and the eigenvectors (scaled with \\(\\Lambda^{\\frac{1}{2}}\\)) at each step. Hint: Use geom_segment for the eigenvectors. 
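Before looking at the solution, a minimal sketch of what eigen returns may be helpful (the matrix is the \\(\\Sigma\\) from this exercise; reconstructing \\(\\Sigma = U \\Lambda U^T\\) is a quick sanity check):
# sanity check: eigen() returns the eigenvectors U and the eigenvalues diag(Lambda)
sigma <- matrix(c(0.3, -0.5, -0.5, 1.6), nrow = 2, byrow = TRUE)
ed <- eigen(sigma)
ed$values # eigenvalues, i.e. the diagonal of Lambda
ed$vectors %*% diag(ed$values) %*% t(ed$vectors) # should reproduce sigma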
# a set.seed(1) sigma <- matrix(data = c(0.3, -0.5, -0.5, 1.6), nrow = 2, byrow = TRUE) ed <- eigen(sigma) e_val <- ed$values e_vec <- ed$vectors # b set.seed(1) nsamps <- 1000 X <- matrix(data = rnorm(nsamps * 2), ncol = 2) vec1 <- matrix(c(1,0,0,1), nrow = 2) X2 <- t(sqrt(diag(e_val)) %*% t(X)) vec2 <- sqrt(diag(e_val)) %*% vec1 X3 <- t(e_vec %*% t(X2)) vec3 <- e_vec %*% vec2 X4 <- t(c(-2, 1) + t(X3)) vec4 <- c(-2, 1) + vec3 vec_mat <- data.frame(matrix(c(0,0,0,0,0,0,0,0,0,0,0,0,-2,1,-2,1), ncol = 2, byrow = TRUE), t(cbind(vec1, vec2, vec3, vec4)), c(1,1,2,2,3,3,4,4)) df <- data.frame(rbind(X, X2, X3, X4), c(rep(1, nsamps), rep(2, nsamps), rep(3, nsamps), rep(4, nsamps))) colnames(df) <- c("D1", "D2", "wh") colnames(vec_mat) <- c("D1", "D2", "E1", "E2", "wh") ggplot(data = df, aes(x = D1, y = D2)) + geom_point() + geom_segment(data = vec_mat, aes(xend = E1, yend = E2), color = "red") + facet_wrap(~ wh) + coord_fixed() Exercise 8.5 (Marginal and conditional distributions) Let \\(X \\sim \\text{N}(\\mu, \\Sigma)\\), where \\(\\mu = [2, 0, -1]^T\\) and \\(\\Sigma = \\begin{bmatrix} 1 & -0.2 & 0.5 \\\\ -0.2 & 1.4 & -1.2 \\\\ 0.5 & -1.2 & 2 \\\\ \\end{bmatrix}\\). Let \\(A\\) represent the first two random variables and \\(B\\) the third random variable. R: For the calculation in the following points, you can use R. Find the marginal distribution of \\(B\\). Find the conditional distribution of \\(B | A = [a_1, a_2]^T\\). Find the marginal distribution of \\(A\\). Find the conditional distribution of \\(A | B = b\\). R: Visually compare the distributions of a) and b), and c) and d) at three different conditional values. mu <- c(2, 0, -1) Sigma <- matrix(c(1, -0.2, 0.5, -0.2, 1.4, -1.2, 0.5, -1.2, 2), nrow = 3, byrow = TRUE) mu_A <- c(2, 0) mu_B <- -1 Sigma_A <- Sigma[1:2, 1:2] Sigma_B <- Sigma[3, 3] Sigma_AB <- Sigma[1:2, 3] # b tmp_b <- t(Sigma_AB) %*% solve(Sigma_A) mu_b <- mu_B - tmp_b %*% mu_A Sigma_b <- Sigma_B - t(Sigma_AB) %*% solve(Sigma_A) %*% Sigma_AB mu_b ## [,1] ## [1,] -1.676471 tmp_b ## [,1] [,2] ## [1,] 0.3382353 -0.8088235 Sigma_b ## [,1] ## [1,] 0.8602941 # d tmp_a <- Sigma_AB * (1 / Sigma_B) mu_a <- mu_A - tmp_a * mu_B Sigma_d <- Sigma_A - (Sigma_AB * (1 / Sigma_B)) %*% t(Sigma_AB) mu_a ## [1] 2.25 -0.60 tmp_a ## [1] 0.25 -0.60 Sigma_d ## [,1] [,2] ## [1,] 0.875 0.10 ## [2,] 0.100 0.68 Solution. \\(B \\sim \\text{N}(-1, 2)\\). \\(B | A = a \\sim \\text{N}(-1.68 + [0.34, -0.81] a, 0.86)\\). \\(\\mu_A = [2, 0]^T\\) and \\(\\Sigma_A = \\begin{bmatrix} 1 & -0.2 & \\\\ -0.2 & 1.4 \\\\ \\end{bmatrix}\\). 
\\[\\begin{align} A | B = b &\\sim \\text{N}(\\mu_t, \\Sigma_t), \\\\ \\mu_t &= [2.25, -0.6]^T + [0.25, -0.6]^T b, \\\\ \\Sigma_t &= \\begin{bmatrix} 0.875 & 0.1 \\\\ 0.1 & 0.68 \\\\ \\end{bmatrix} \\end{align}\\] library(mvtnorm) set.seed(1) nsamps <- 1000 # a and b samps <- as.data.frame(matrix(data = NA, nrow = 4 * nsamps, ncol = 2)) samps[1:nsamps,1] <- rnorm(nsamps, mu_B, Sigma_B) samps[1:nsamps,2] <- "marginal" for (i in 1:3) { a <- rmvnorm(1, mu_A, Sigma_A) samps[(i*nsamps + 1):((i + 1) * nsamps), 1] <- rnorm(nsamps, mu_b + tmp_b %*% t(a), Sigma_b) samps[(i*nsamps + 1):((i + 1) * nsamps), 2] <- paste0(# "cond", round(a, digits = 2), collapse = "-") } colnames(samps) <- c("x", "dist") ggplot(samps, aes(x = x)) + geom_density() + facet_wrap(~ dist) # c and d samps <- as.data.frame(matrix(data = NA, nrow = 4 * nsamps, ncol = 3)) samps[1:nsamps,1:2] <- rmvnorm(nsamps, mu_A, Sigma_A) samps[1:nsamps,3] <- "marginal" for (i in 1:3) { b <- rnorm(1, mu_B, Sigma_B) samps[(i*nsamps + 1):((i + 1) * nsamps), 1:2] <- rmvnorm(nsamps, mu_a + tmp_a * b, Sigma_d) samps[(i*nsamps + 1):((i + 1) * nsamps), 3] <- b } colnames(samps) <- c("x", "y", "dist") ggplot(samps, aes(x = x, y = y)) + geom_point() + geom_smooth(method = "lm") + facet_wrap(~ dist) 8.3 Transformations Exercise 8.6 Let \\((U,V)\\) be a random variable with PDF \\(p(u,v) = \\frac{1}{4 \\sqrt{u}}\\), \\(U \\in [0,4]\\) and \\(V \\in [\\sqrt{U}, \\sqrt{U} + 1]\\). Let \\(X = \\sqrt{U}\\) and \\(Y = V - \\sqrt{U}\\). Find PDF of \\((X,Y)\\). What can you tell about distributions of \\(X\\) and \\(Y\\)? This exercise shows how we can simplify a probabilistic problem with a clever use of transformations. R: Take 1000 samples from \\((X,Y)\\) and transform them with inverses of the above functions to get samples from \\((U,V)\\). Plot both sets of samples. Solution. First we need to find the inverse functions. Since \\(x = \\sqrt{u}\\) it follows that \\(u = x^2\\), and that \\(x \\in [0,2]\\). Similarly \\(v = y + x\\) and \\(y \\in [0,1]\\). Let us first find the Jacobian. \\[\\renewcommand\\arraystretch{1.6} J(x,y) = \\begin{bmatrix} \\frac{\\partial u}{\\partial x} & \\frac{\\partial v}{\\partial x} \\\\%[1ex] % <-- 1ex more space between rows of matrix \\frac{\\partial u}{\\partial y} & \\frac{\\partial v}{\\partial y} \\end{bmatrix} = \\begin{bmatrix} 2x & 1 \\\\%[1ex] % <-- 1ex more space between rows of matrix 0 & 1 \\end{bmatrix}, \\] and the determinant is \\(|J(x,y)| = 2x\\). Putting everything together, we get \\[\\begin{align} p_{X,Y}(x,y) = p_{U,V}(x^2, y + x) |J(x,y)| = \\frac{1}{4 \\sqrt{x^2}} 2x = \\frac{1}{2}. \\end{align}\\] This reminds us of the Uniform distribution. Indeed we can see that \\(p_X(x) = \\frac{1}{2}\\) and \\(p_Y(y) = 1\\). So instead of dealing with an awkward PDF of \\((U,V)\\) and the corresponding dynamic bounds, we are now looking at two independent Uniform random variables. In practice, this could make modeling much easier. set.seed(1) nsamps <- 2000 x <- runif(nsamps, min = 0, max = 2) y <- runif(nsamps) orig <- tibble(x = x, y = y, vrs = "original") u <- x^2 v <- y + x transf <- tibble(x = u, y = v, vrs = "transformed") df <- bind_rows(orig, transf) ggplot(df, aes(x = x, y = y, color = vrs)) + geom_point(alpha = 0.3) Exercise 8.7 R: Write a function that will calculate the probability density of an arbitraty multivariate normal distribution, based on independent standardized normal PDFs. Compare with dmvnorm from the mvtnorm package. 
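The key identity here is the change of variables for densities: if \\(Z \\sim \\text{N}(0, I)\\) and \\(Y = \\mu + LZ\\) with \\(LL^T = \\Sigma\\), then \\(p_Y(y) = p_Z\\big(L^{-1}(y - \\mu)\\big) \\, |\\det L^{-1}|\\). The solution below takes \\(L\\) to be the lower-triangular Cholesky factor of \\(\\Sigma\\).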
library(mvtnorm) set.seed(1) mvn_dens <- function (y, mu, Sigma) { L <- chol(Sigma) L_inv <- solve(t(L)) g_inv <- L_inv %*% t(y - mu) J <- L_inv J_det <- det(J) return(prod(dnorm(g_inv)) * J_det) } mu_v <- c(-2, 0, 1) cov_m <- matrix(c(1, -0.2, 0.5, -0.2, 2, 0.3, 0.5, 0.3, 1.6), ncol = 3, byrow = TRUE) n_comp <- 20 for (i in 1:n_comp) { x <- rmvnorm(1, mean = mu_v, sigma = cov_m) print(paste0("My function: ", mvn_dens(x, mu_v, cov_m), ", dmvnorm: ", dmvnorm(x, mu_v, cov_m))) } ## [1] "My function: 0.0229514237156383, dmvnorm: 0.0229514237156383" ## [1] "My function: 0.00763138915406231, dmvnorm: 0.00763138915406231" ## [1] "My function: 0.0230688881105741, dmvnorm: 0.0230688881105741" ## [1] "My function: 0.0113616213114731, dmvnorm: 0.0113616213114731" ## [1] "My function: 0.00151808500121907, dmvnorm: 0.00151808500121907" ## [1] "My function: 0.0257658045974509, dmvnorm: 0.0257658045974509" ## [1] "My function: 0.0157963825730805, dmvnorm: 0.0157963825730805" ## [1] "My function: 0.00408856287529248, dmvnorm: 0.00408856287529248" ## [1] "My function: 0.0327793540101256, dmvnorm: 0.0327793540101256" ## [1] "My function: 0.0111606542967978, dmvnorm: 0.0111606542967978" ## [1] "My function: 0.0147636757585684, dmvnorm: 0.0147636757585684" ## [1] "My function: 0.0142948300412207, dmvnorm: 0.0142948300412207" ## [1] "My function: 0.0203093820657542, dmvnorm: 0.0203093820657542" ## [1] "My function: 0.0287533273357481, dmvnorm: 0.0287533273357481" ## [1] "My function: 0.0213402305128623, dmvnorm: 0.0213402305128623" ## [1] "My function: 0.0218356957993885, dmvnorm: 0.0218356957993885" ## [1] "My function: 0.0250750113961771, dmvnorm: 0.0250750113961771" ## [1] "My function: 0.0166498666348048, dmvnorm: 0.0166498666348048" ## [1] "My function: 0.00189725106874659, dmvnorm: 0.00189725106874659" ## [1] "My function: 0.0196697814975113, dmvnorm: 0.0196697814975113" "],["ard.html", "Chapter 9 Alternative representation of distributions 9.1 Probability generating functions (PGFs) 9.2 Moment generating functions (MGFs)", " Chapter 9 Alternative representation of distributions This chapter deals with alternative representation of distributions. The students are expected to acquire the following knowledge: Theoretical Probability generating functions. Moment generating functions. .fold-btn { float: right; margin: 5px 5px 0 0; } .fold { border: 1px solid black; min-height: 40px; } 9.1 Probability generating functions (PGFs) Exercise 9.1 Show that the sum of independent Poisson random variables is itself a Poisson random variable. R: Let \\(X\\) be a sum of three Poisson distributions with \\(\\lambda_i \\in \\{2, 5.2, 10\\}\\). Take 1000 samples and plot the three distributions and the sum. Then take 1000 samples from the theoretical distribution of \\(X\\) and compare them to the sum. Solution. Let \\(X_i \\sim \\text{Poisson}(\\lambda_i)\\) for \\(i = 1,...,n\\), and let \\(X = \\sum_{i=1}^n X_i\\). 
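Because the \\(X_i\\) are independent, the PGF of the sum factorizes: \\(\\alpha_X(t) = E[t^{\\sum_{i=1}^n X_i}] = \\prod_{i=1}^n E[t^{X_i}]\\), which gives the first equality below.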
\\[\\begin{align} \\alpha_X(t) &= \\prod_{i=1}^n \\alpha_{X_i}(t) \\\\ &= \\prod_{i=1}^n \\bigg( \\sum_{j=0}^\\infty t^j \\frac{\\lambda_i^j e^{-\\lambda_i}}{j!} \\bigg) \\\\ &= \\prod_{i=1}^n \\bigg( e^{-\\lambda_i} \\sum_{j=0}^\\infty \\frac{(t\\lambda_i)^j }{j!} \\bigg) \\\\ &= \\prod_{i=1}^n \\bigg( e^{-\\lambda_i} e^{t \\lambda_i} \\bigg) & \\text{power series} \\\\ &= \\prod_{i=1}^n \\bigg( e^{\\lambda_i(t - 1)} \\bigg) \\\\ &= e^{\\sum_{i=1}^n \\lambda_i(t - 1)} \\\\ &= e^{t \\sum_{i=1}^n \\lambda_i - \\sum_{i=1}^n \\lambda_i} \\\\ &= e^{-\\sum_{i=1}^n \\lambda_i} \\sum_{j=0}^\\infty \\frac{(t \\sum_{i=1}^n \\lambda_i)^j}{j!}\\\\ &= \\sum_{j=0}^\\infty \\frac{e^{-\\sum_{i=1}^n \\lambda_i} (t \\sum_{i=1}^n \\lambda_i)^j}{j!}\\\\ \\end{align}\\] The last term is the PGF of a Poisson random variable with parameter \\(\\sum_{i=1}^n \\lambda_i\\). Because the PGF is unique, \\(X\\) is a Poisson random variable. set.seed(1) library(tidyr) nsamps <- 1000 samps <- matrix(data = NA, nrow = nsamps, ncol = 4) samps[ ,1] <- rpois(nsamps, 2) samps[ ,2] <- rpois(nsamps, 5.2) samps[ ,3] <- rpois(nsamps, 10) samps[ ,4] <- samps[ ,1] + samps[ ,2] + samps[ ,3] colnames(samps) <- c(2, 2.5, 10, "sum") gsamps <- as_tibble(samps) gsamps <- gather(gsamps, key = "dist", value = "value") ggplot(gsamps, aes(x = value)) + geom_bar() + facet_wrap(~ dist) samps <- cbind(samps, "theoretical" = rpois(nsamps, 2 + 5.2 + 10)) gsamps <- as_tibble(samps[ ,4:5]) gsamps <- gather(gsamps, key = "dist", value = "value") ggplot(gsamps, aes(x = value, fill = dist)) + geom_bar(position = "dodge") Exercise 9.2 Find the expected value and variance of the negative binomial distribution. Hint: Find the Taylor series of \\((1 - y)^{-r}\\) at point 0. Solution. Let \\(X \\sim \\text{NB}(r, p)\\). \\[\\begin{align} \\alpha_X(t) &= E[t^X] \\\\ &= \\sum_{j=0}^\\infty t^j \\binom{j + r - 1}{j} (1 - p)^r p^j \\\\ &= (1 - p)^r \\sum_{j=0}^\\infty \\binom{j + r - 1}{j} (tp)^j \\\\ &= (1 - p)^r \\sum_{j=0}^\\infty \\frac{(j + r - 1)(j + r - 2)...r}{j!} (tp)^j. \\\\ \\end{align}\\] Let us look at the Taylor series of \\((1 - y)^{-r}\\) at 0 \\[\\begin{align} (1 - y)^{-r} = &1 + \\frac{-r(-1)}{1!}y + \\frac{-r(-r - 1)(-1)^2}{2!}y^2 + \\\\ &\\frac{-r(-r - 1)(-r - 2)(-1)^3}{3!}y^3 + ... \\\\ \\end{align}\\] How does the \\(k\\)-th term look like? We have \\(k\\) derivatives of our function so \\[\\begin{align} \\frac{d^k}{d^k y} (1 - y)^{-r} &= \\frac{-r(-r - 1)...(-r - k + 1)(-1)^k}{k!}y^k \\\\ &= \\frac{r(r + 1)...(r + k - 1)}{k!}y^k. \\end{align}\\] We observe that this equals to the \\(j\\)-th term in the sum of NB PGF. Therefore \\[\\begin{align} \\alpha_X(t) &= (1 - p)^r (1 - tp)^{-r} \\\\ &= \\Big(\\frac{1 - p}{1 - tp}\\Big)^r \\end{align}\\] To find the expected value, we need to differentiate \\[\\begin{align} \\frac{d}{dt} \\Big(\\frac{1 - p}{1 - tp}\\Big)^r &= r \\Big(\\frac{1 - p}{1 - tp}\\Big)^{r-1} \\frac{d}{dt} \\frac{1 - p}{1 - tp} \\\\ &= r \\Big(\\frac{1 - p}{1 - tp}\\Big)^{r-1} \\frac{p(1 - p)}{(1 - tp)^2}. \\\\ \\end{align}\\] Evaluating this at 1, we get: \\[\\begin{align} E[X] = \\frac{rp}{1 - p}. \\end{align}\\] For the variance we need the second derivative. 
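Recall that derivatives of the PGF at \\(t = 1\\) give factorial moments, \\(\\alpha_X'(1) = E[X]\\) and \\(\\alpha_X''(1) = E[X(X-1)]\\), so the first two derivatives are enough to recover the variance.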
\\[\\begin{align} \\frac{d^2}{d^2t} \\Big(\\frac{1 - p}{1 - tp}\\Big)^r &= \\frac{p^2 r (r + 1) (\\frac{1 - p}{1 - tp})^r}{(tp - 1)^2} \\end{align}\\] Evaluating this at 1 and inserting the first derivatives, we get: \\[\\begin{align} Var[X] &= \\frac{d^2}{dt^2} \\alpha_X(1) + \\frac{d}{dt}\\alpha_X(1) - \\Big(\\frac{d}{dt}\\alpha_X(t) \\Big)^2 \\\\ &= \\frac{p^2 r (r + 1)}{(1 - p)^2} + \\frac{rp}{1 - p} - \\frac{r^2p^2}{(1 - p)^2} \\\\ &= \\frac{rp}{(1 - p)^2}. \\end{align}\\] library(tidyr) set.seed(1) nsamps <- 100000 find_p <- function (mu, r) { return (10 / (r + 10)) } r <- c(1,2,10,20) p <- find_p(10, r) sigma <- rep(sqrt(p*r / (1 - p)^2), each = nsamps) samps <- cbind("r=1" = rnbinom(nsamps, size = r[1], prob = 1 - p[1]), "r=2" = rnbinom(nsamps, size = r[2], prob = 1 - p[2]), "r=4" = rnbinom(nsamps, size = r[3], prob = 1 - p[3]), "r=20" = rnbinom(nsamps, size = r[4], prob = 1 - p[4])) gsamps <- gather(as.data.frame(samps)) iw <- (gsamps$value > sigma + 10) | (gsamps$value < sigma - 10) ggplot(gsamps, aes(x = value, fill = iw)) + geom_bar() + # geom_density() + facet_wrap(~ key) 9.2 Moment generating functions (MGFs) Exercise 9.3 Find the variance of the geometric distribution. Solution. Let \\(X \\sim \\text{Geometric}(p)\\). The MGF of the geometric distribution is \\[\\begin{align} M_X(t) &= E[e^{tX}] \\\\ &= \\sum_{k=0}^\\infty p(1 - p)^k e^{tk} \\\\ &= p \\sum_{k=0}^\\infty ((1 - p)e^t)^k. \\end{align}\\] Let us assume that \\((1 - p)e^t < 1\\). Then, by using the geometric series we get \\[\\begin{align} M_X(t) &= \\frac{p}{1 - e^t + pe^t}. \\end{align}\\] The first derivative of the above expression is \\[\\begin{align} \\frac{d}{dt}M_X(t) &= \\frac{-p(-e^t + pe^t)}{(1 - e^t + pe^t)^2}, \\end{align}\\] and evaluating at \\(t = 0\\), we get \\(\\frac{1 - p}{p}\\), which we already recognize as the expected value of the geometric distribution. The second derivative is \\[\\begin{align} \\frac{d^2}{dt^2}M_X(t) &= \\frac{(p-1)pe^t((p-1)e^t - 1)}{((p - 1)e^t + 1)^3}, \\end{align}\\] and evaluating at \\(t = 0\\), we get \\(\\frac{(p - 1)(p - 2)}{p^2}\\). Combining we get the variance \\[\\begin{align} Var(X) &= \\frac{(p - 1)(p - 2)}{p^2} - \\frac{(1 - p)^2}{p^2} \\\\ &= \\frac{(p-1)(p-2) - (1-p)^2}{p^2} \\\\ &= \\frac{1 - p}{p^2}. \\end{align}\\] Exercise 9.4 Find the distribution of sum of two normal random variables \\(X\\) and \\(Y\\), by comparing \\(M_{X+Y}(t)\\) to \\(M_X(t)\\). R: To illustrate the result draw random samples from N\\((-3, 1)\\) and N\\((5, 1.2)\\) and calculate the empirical mean and variance of \\(X+Y\\). Plot all three histograms in one plot. Solution. Let \\(X \\sim \\text{N}(\\mu_X, 1)\\) and \\(Y \\sim \\text{N}(\\mu_Y, 1)\\). The MGF of the sum is \\[\\begin{align} M_{X+Y}(t) &= M_X(t) M_Y(t). \\end{align}\\] Let us calculate \\(M_X(t)\\), the MGF for \\(Y\\) then follows analogously. 
\\[\\begin{align} M_X(t) &= \\int_{-\\infty}^\\infty e^{tx} \\frac{1}{\\sqrt{2 \\pi \\sigma_X^2}} e^{-\\frac{(x - \\mu_X)^2}{2\\sigma_X^2}} dx \\\\ &= \\int_{-\\infty}^\\infty \\frac{1}{\\sqrt{2 \\pi \\sigma_X^2}} e^{-\\frac{(x - \\mu_X)^2 - 2\\sigma_X^2 tx}{2\\sigma_X^2}} dx \\\\ &= \\int_{-\\infty}^\\infty \\frac{1}{\\sqrt{2 \\pi \\sigma_X^2}} e^{-\\frac{x^2 - 2\\mu_X x + \\mu_X^2 - 2\\sigma_X^2 tx}{2\\sigma_X^2}} dx \\\\ &= \\int_{-\\infty}^\\infty \\frac{1}{\\sqrt{2 \\pi \\sigma_X^2}} e^{-\\frac{(x - (\\mu_X + \\sigma_X^2 t))^2 + \\mu_X^2 - (\\mu_X + \\sigma_X^2 t)^2}{2\\sigma_X^2}} dx & \\text{complete the square}\\\\ &= e^{-\\frac{\\mu_X^2 - (\\mu_X + \\sigma_X^2 t)^2}{2\\sigma_X^2}} \\int_{-\\infty}^\\infty \\frac{1}{\\sqrt{2 \\pi \\sigma_X^2}} e^{-\\frac{(x - (\\mu_X + \\sigma_X^2 t))^2}{2\\sigma_X^2}} dx & \\\\ &= e^{-\\frac{\\mu_X^2 - (\\mu_X + \\sigma_X^2 t)^2}{2\\sigma_X^2}} & \\text{normal PDF} \\\\ &= e^{\\frac{2 \\mu_X \\sigma_X^2 t + \\sigma_X^4 t^2}{2\\sigma_X^2}} \\\\ &= e^{\\mu_X t + \\frac{\\sigma_X^2 t^2}{2}}. \\\\ \\end{align}\\] The MGF of the sum is then \\[\\begin{align} M_{X+Y}(t) &= e^{\\mu_X t + \\frac{\\sigma_X^2 t^2}{2}} e^{\\mu_Y t + \\frac{\\sigma_Y^2 t^2}{2}} \\\\ &= e^{(\\mu_X + \\mu_Y)t + \\frac{(\\sigma_X^2 + \\sigma_Y^2) t^2}{2}}. \\end{align}\\] By comparing \\(M_{X+Y}(t)\\) and \\(M_X(t)\\) we observe that both have the form of a normal MGF: the term multiplying \\(t\\) is the mean and the term multiplying \\(\\frac{t^2}{2}\\) is the variance. Since MGFs are unique, we conclude that \\(Z = X + Y \\sim \\text{N}(\\mu_X + \\mu_Y, \\sigma_X^2 + \\sigma_Y^2)\\). library(tidyr) library(ggplot2) set.seed(1) nsamps <- 1000 x <- rnorm(nsamps, -3, 1) y <- rnorm(nsamps, 5, 1.2) z <- x + y mean(z) ## [1] 1.968838 var(z) ## [1] 2.645034 df <- data.frame(x = x, y = y, z = z) %>% gather() ggplot(df, aes(x = value, fill = key)) + geom_histogram(position = "dodge") "],["ci.html", "Chapter 10 Concentration inequalities 10.1 Comparison 10.2 Practical", " Chapter 10 Concentration inequalities This chapter deals with concentration inequalities. The students are expected to acquire the following knowledge: Theoretical More assumptions produce closer bounds. R Optimization. Estimating probability inequalities. 10.1 Comparison Exercise 10.1 R: Let \\(X\\) be a geometric random variable with \\(p = 0.7\\). Visually compare the Markov bound, Chernoff bound, and the theoretical probabilities for \\(x = 1,...,12\\). To get the best fitting Chernoff bound, you will need to optimize the bound depending on \\(t\\). Use either analytical or numerical optimization.
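For reference, the Chernoff bound is obtained by applying the Markov inequality to \\(e^{tX}\\): for any \\(t > 0\\), \\(P(X \\geq a) = P(e^{tX} \\geq e^{ta}) \\leq \\frac{E[e^{tX}]}{e^{ta}} = \\frac{M_X(t)}{e^{ta}}\\), and the bound is then minimized over \\(t\\). The solution below plugs in the geometric MGF derived in Exercise 9.3.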
bound_chernoff <- function (t, p, a) { return ((p / (1 - exp(t) + p * exp(t))) / exp(a * t)) } set.seed(1) p <- 0.7 a <- seq(1, 12, by = 1) ci_markov <- (1 - p) / p / a t <- vector(mode = "numeric", length = length(a)) for (i in 1:length(t)) { t[i] <- optimize(bound_chernoff, interval = c(0, log(1 / (1 - p))), p = p, a = a[i])$minimum } t ## [1] 0.5108267 0.7984981 0.9162927 0.9808238 1.0216635 1.0498233 1.0704327 ## [8] 1.0861944 1.0986159 1.1086800 1.1169653 1.1239426 ci_chernoff <- (p / (1 - exp(t) + p * exp(t))) / exp(a * t) actual <- 1 - pgeom(a, 0.7) plot_df <- rbind( data.frame(x = a, y = ci_markov, type = "Markov"), data.frame(x = a, y = ci_chernoff, type = "Chernoff"), data.frame(x = a, y = actual, type = "Actual") ) ggplot(plot_df, aes(x = x, y = y, color = type)) + geom_line() Exercise 10.2 R: Let \\(X\\) be a sum of 100 Beta distributions with random parameters. Take 1000 samples and plot the Chebyshev bound, Hoeffding bound, and the empirical probabilities. set.seed(1) nvars <- 100 nsamps <- 1000 samps <- matrix(data = NA, nrow = nsamps, ncol = nvars) Sn_mean <- 0 Sn_var <- 0 for (i in 1:nvars) { alpha1 <- rgamma(1, 10, 1) beta1 <- rgamma(1, 10, 1) X <- rbeta(nsamps, alpha1, beta1) Sn_mean <- Sn_mean + alpha1 / (alpha1 + beta1) Sn_var <- Sn_var + alpha1 * beta1 / ((alpha1 + beta1)^2 * (alpha1 + beta1 + 1)) samps[ ,i] <- X } mean(apply(samps, 1, sum)) ## [1] 51.12511 Sn_mean ## [1] 51.15723 var(apply(samps, 1, sum)) ## [1] 1.170652 Sn_var ## [1] 1.166183 a <- 1:30 b <- a / sqrt(Sn_var) ci_chebyshev <- 1 / b^2 ci_hoeffding <- 2 * exp(- 2 * a^2 / nvars) empirical <- NULL for (i in 1:length(a)) { empirical[i] <- sum(abs((apply(samps, 1, sum)) - Sn_mean) >= a[i])/ nsamps } plot_df <- rbind( data.frame(x = a, y = ci_chebyshev, type = "Chebyshev"), data.frame(x = a, y = ci_hoeffding, type = "Hoeffding"), data.frame(x = a, y = empirical, type = "Empirical") ) ggplot(plot_df, aes(x = x, y = y, color = type)) + geom_line() ggplot(plot_df, aes(x = x, y = y, color = type)) + geom_line() + coord_cartesian(xlim = c(15, 25), ylim = c(0, 0.05)) 10.2 Practical Exercise 10.3 From Jagannathan. Let \\(X_i\\), \\(i = 1,...n\\), be a random sample of size \\(n\\) of a random variable \\(X\\). Let \\(X\\) have mean \\(\\mu\\) and variance \\(\\sigma^2\\). Find the size of the sample \\(n\\) required so that the probability that the difference between sample mean and true mean is smaller than \\(\\frac{\\sigma}{10}\\) is at least 0.95. Hint: Derive a version of the Chebyshev inequality for \\(P(|X - \\mu| \\geq a)\\) using Markov inequality. Solution. Let \\(\\bar{X} = \\sum_{i=1}^n X_i\\). Then \\(E[\\bar{X}] = \\mu\\) and \\(Var[\\bar{X}] = \\frac{\\sigma^2}{n}\\). Let us first derive another representation of Chebyshev inequality. \\[\\begin{align} P(|X - \\mu| \\geq a) = P(|X - \\mu|^2 \\geq a^2) \\leq \\frac{E[|X - \\mu|^2]}{a^2} = \\frac{Var[X]}{a^2}. \\end{align}\\] Let us use that on our sampling distribution: \\[\\begin{align} P(|\\bar{X} - \\mu| \\geq \\frac{\\sigma}{10}) \\leq \\frac{100 Var[\\bar{X}]}{\\sigma^2} = \\frac{100 Var[X]}{n \\sigma^2} = \\frac{100}{n}. \\end{align}\\] We are interested in the difference being smaller, therefore \\[\\begin{align} P(|\\bar{X} - \\mu| < \\frac{\\sigma}{10}) = 1 - P(|\\bar{X} - \\mu| \\geq \\frac{\\sigma}{10}) \\geq 1 - \\frac{100}{n} \\geq 0.95. \\end{align}\\] It follows that we need a sample size of \\(n \\geq \\frac{100}{0.05} = 2000\\). 
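A quick simulation can also be used to check this requirement numerically (a sketch only; the exponential distribution, the seed, and the number of repetitions are arbitrary choices):
# sketch: with n = 2000 the sample mean should fall within sigma/10 of the
# true mean in at least 95% of repetitions (Chebyshev is conservative)
set.seed(1)
n <- 2000
nreps <- 10000
mu <- 1 # mean of Exp(1)
sigma <- 1 # standard deviation of Exp(1)
within <- replicate(nreps, abs(mean(rexp(n, rate = 1)) - mu) < sigma / 10)
mean(within) # empirical coverage, well above 0.95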
"],["crv.html", "Chapter 11 Convergence of random variables", " Chapter 11 Convergence of random variables This chapter deals with convergence of random variables. The students are expected to acquire the following knowledge: Theoretical Finding convergences of random variables. .fold-btn { float: right; margin: 5px 5px 0 0; } .fold { border: 1px solid black; min-height: 40px; } Exercise 11.1 Let \\(X_1\\), \\(X_2\\),…, \\(X_n\\) be a sequence of Bernoulli random variables. Let \\(Y_n = \\frac{X_1 + X_2 + ... + X_n}{n^2}\\). Show that this sequence converges point-wise to the zero random variable. R: Use a simulation to check your answer. Solution. Let \\(\\epsilon\\) be arbitrary. We need to find such \\(n_0\\), that for every \\(n\\) greater than \\(n_0\\) \\(|Y_n| < \\epsilon\\) holds. \\[\\begin{align} |Y_n| &= |\\frac{X_1 + X_2 + ... + X_n}{n^2}| \\\\ &\\leq |\\frac{n}{n^2}| \\\\ &= \\frac{1}{n}. \\end{align}\\] So we need to find such \\(n_0\\), that for every \\(n > n_0\\) we will have \\(\\frac{1}{n} < \\epsilon\\). So \\(n_0 > \\frac{1}{\\epsilon}\\). x <- 1:1000 X <- matrix(data = NA, nrow = length(x), ncol = 100) y <- vector(mode = "numeric", length = length(x)) for (i in 1:length(x)) { X[i, ] <- rbinom(100, size = 1, prob = 0.5) } X <- apply(X, 2, cumsum) tmp_mat <- matrix(data = (1:1000)^2, nrow = 1000, ncol = 100) X <- X / tmp_mat y <- apply(X, 1, mean) ggplot(data.frame(x = x, y = y), aes(x = x, y = y)) + geom_line() Exercise 11.2 Let \\(\\Omega = [0,1]\\) and let \\(X_n\\) be a sequence of random variables, defined as \\[\\begin{align} X_n(\\omega) = \\begin{cases} \\omega^3, &\\omega = \\frac{i}{n}, &0 \\leq i \\leq 1 \\\\ 1, & \\text{otherwise.} \\end{cases} \\end{align}\\] Show that \\(X_n\\) converges almost surely to \\(X \\sim \\text{Uniform}(0,1)\\). Solution. We need to show \\(P(\\{\\omega: X_n(\\omega) \\rightarrow X(\\omega)\\}) = 1\\). Let \\(\\omega \\neq \\frac{i}{n}\\). Then for any \\(\\omega\\), \\(X_n\\) converges pointwise to \\(X\\): \\[\\begin{align} X_n(\\omega) = 1 \\implies |X_n(\\omega) - X(s)| = |1 - 1| < \\epsilon. \\end{align}\\] The above is independent of \\(n\\). Since there are countably infinite number of elements in the complement (\\(\\frac{i}{n}\\)), the probability of this set is 1. Exercise 11.3 Borrowed from Wasserman. Let \\(X_n \\sim \\text{N}(0, \\frac{1}{n})\\) and let \\(X\\) be a random variable with CDF \\[\\begin{align} F_X(x) = \\begin{cases} 0, &x < 0 \\\\ 1, &x \\geq 0. \\end{cases} \\end{align}\\] Does \\(X_n\\) converge to \\(X\\) in distribution? How about in probability? Prove or disprove these statement. R: Plot the CDF of \\(X_n\\) for \\(n = 1, 2, 5, 10, 100, 1000\\). Solution. Let us first check convergence in distribution. \\[\\begin{align} \\lim_{n \\rightarrow \\infty} F_{X_n}(x) &= \\lim_{n \\rightarrow \\infty} \\phi (\\sqrt(n) x). \\end{align}\\] We have two cases, for \\(x < 0\\) and \\(x > 0\\). We do not need to check for \\(x = 0\\), since \\(F_X\\) is not continuous in that point. \\[\\begin{align} \\lim_{n \\rightarrow \\infty} \\phi (\\sqrt(n) x) = \\begin{cases} 0, & x < 0 \\\\ 1, & x > 0. \\end{cases} \\end{align}\\] This is the same as \\(F_X\\). Let us now check convergence in probability. 
Since \\(X\\) is a point-mass distribution at zero, we have \\[\\begin{align} \\lim_{n \\rightarrow \\infty} P(|X_n| > \\epsilon) &= \\lim_{n \\rightarrow \\infty} (P(X_n > \\epsilon) + P(X_n < -\\epsilon)) \\\\ &= \\lim_{n \\rightarrow \\infty} (1 - P(X_n < \\epsilon) + P(X_n < -\\epsilon)) \\\\ &= \\lim_{n \\rightarrow \\infty} (1 - \\phi(\\sqrt{n} \\epsilon) + \\phi(- \\sqrt{n} \\epsilon)) \\\\ &= 0. \\end{align}\\] n <- c(1,2,5,10,100,1000) ggplot(data = data.frame(x = seq(-5, 5, by = 0.01)), aes(x = x)) + stat_function(fun = pnorm, args = list(mean = 0, sd = 1/1), aes(color = "sd = 1/1")) + stat_function(fun = pnorm, args = list(mean = 0, sd = 1/2), aes(color = "sd = 1/2")) + stat_function(fun = pnorm, args = list(mean = 0, sd = 1/5), aes(color = "sd = 1/5")) + stat_function(fun = pnorm, args = list(mean = 0, sd = 1/10), aes(color = "sd = 1/10")) + stat_function(fun = pnorm, args = list(mean = 0, sd = 1/100), aes(color = "sd = 1/100")) + stat_function(fun = pnorm, args = list(mean = 0, sd = 1/1000), aes(color = "sd = 1/1000")) + stat_function(fun = pnorm, args = list(mean = 0, sd = 1/10000), aes(color = "sd = 1/10000")) Exercise 11.4 Let \\(X_i\\) be i.i.d. and \\(\\mu = E(X_1)\\). Let variance of \\(X_1\\) be finite. Show that the mean of \\(X_i\\), \\(\\bar{X}_n = \\frac{1}{n}\\sum_{i=1}^n X_i\\) converges in quadratic mean to \\(\\mu\\). Solution. \\[\\begin{align} \\lim_{n \\rightarrow \\infty} E(|\\bar{X_n} - \\mu|^2) &= \\lim_{n \\rightarrow \\infty} E(\\bar{X_n}^2 - 2 \\bar{X_n} \\mu + \\mu^2) \\\\ &= \\lim_{n \\rightarrow \\infty} (E(\\bar{X_n}^2) - 2 \\mu E(\\frac{\\sum_{i=1}^n X_i}{n}) + \\mu^2) \\\\ &= \\lim_{n \\rightarrow \\infty} E(\\bar{X_n})^2 + \\lim_{n \\rightarrow \\infty} Var(\\bar{X_n}) - 2 \\mu^2 + \\mu^2 \\\\ &= \\lim_{n \\rightarrow \\infty} \\frac{n^2 \\mu^2}{n^2} + \\lim_{n \\rightarrow \\infty} \\frac{\\sigma^2}{n} - \\mu^2 \\\\ &= \\mu^2 - \\mu^2 + \\lim_{n \\rightarrow \\infty} \\frac{\\sigma^2}{n} \\\\ &= 0. \\end{align}\\] "],["lt.html", "Chapter 12 Limit theorems", " Chapter 12 Limit theorems This chapter deals with limit theorems. The students are expected to acquire the following knowledge: Theoretical Monte Carlo integration convergence. Difference between weak and strong law of large numbers. .fold-btn { float: right; margin: 5px 5px 0 0; } .fold { border: 1px solid black; min-height: 40px; } Exercise 12.1 Show that Monte Carlo integration converges almost surely to the true integral of a bounded function. Solution. Let \\(g\\) be a function defined on \\(\\Omega\\). Let \\(X_i\\), \\(i = 1,...,n\\) be i.i.d. (multivariate) uniform random variables with bounds defined on \\(\\Omega\\). Let \\(Y_i\\) = \\(g(X_i)\\). Then it follows that \\(Y_i\\) are also i.i.d. random variables and their expected value is \\(E[g(X)] = \\int_{\\Omega} g(x) f_X(x) dx = \\frac{1}{V_{\\Omega}} \\int_{\\Omega} g(x) dx\\). By the strong law of large numbers, we have \\[\\begin{equation} \\frac{1}{n}\\sum_{i=1}^n Y_i \\xrightarrow{\\text{a.s.}} E[g(X)]. \\end{equation}\\] It follows that \\[\\begin{equation} V_{\\Omega} \\frac{1}{n}\\sum_{i=1}^n Y_i \\xrightarrow{\\text{a.s.}} \\int_{\\Omega} g(x) dx. \\end{equation}\\] Exercise 12.2 Let \\(X\\) be a geometric random variable with probability 0.5 and support in positive integers. Let \\(Y = 2^X (-1)^X X^{-1}\\). Find the expected value of \\(Y\\) by using conditional convergence (this variable does not have an expected value in the conventional sense – the series is not absolutely convergent). 
R: Draw \\(10000\\) samples from a geometric distribution with probability 0.5 and support in positive integers to get \\(X\\). Then calculate \\(Y\\) and plot the means at each iteration (sample). Additionally, plot the expected value calculated in a. Try it with different seeds. What do you notice? Solution. \\[\\begin{align*} E[Y] &= \\sum_{x=1}^{\\infty} \\frac{2^x (-1)^x}{x} 0.5^x \\\\ &= \\sum_{x=1}^{\\infty} \\frac{(-1)^x}{x} \\\\ &= - \\sum_{x=1}^{\\infty} \\frac{(-1)^{x+1}}{x} \\\\ &= - \\ln(2) \\end{align*}\\] set.seed(3) x <- rgeom(100000, prob = 0.5) + 1 y <- 2^x * (-1)^x * x^{-1} y_means <- cumsum(y) / seq_along(y) df <- data.frame(x = 1:length(y_means), y = y_means) ggplot(data = df, aes(x = x, y = y)) + geom_line() + geom_hline(yintercept = -log(2)) "],["eb.html", "Chapter 13 Estimation basics 13.1 ECDF 13.2 Properties of estimators", " Chapter 13 Estimation basics This chapter deals with estimation basics. The students are expected to acquire the following knowledge: Biased and unbiased estimators. Consistent estimators. Empirical cumulative distribution function. .fold-btn { float: right; margin: 5px 5px 0 0; } .fold { border: 1px solid black; min-height: 40px; } 13.1 ECDF Exercise 13.1 (ECDF intuition) Take any univariate continuous distribution that is readily available in R and plot its CDF (\\(F\\)). Draw one sample (\\(n = 1\\)) from the chosen distribution and draw the ECDF (\\(F_n\\)) of that one sample. Use the definition of the ECDF, not an existing function in R. Implementation hint: ECDFs are always piecewise constant - they only jump at the sampled values and by \\(1/n\\). Repeat (b) for \\(n = 5, 10, 100, 1000...\\) Theory says that \\(F_n\\) should converge to \\(F\\). Can you observe that? For \\(n = 100\\) repeat the process \\(m = 20\\) times and plot every \\(F_n^{(m)}\\). Theory says that \\(F_n\\) will converge to \\(F\\) the slowest where \\(F\\) is close to 0.5 (where the variance is largest). Can you observe that? library(ggplot2) set.seed(1) ggplot(data = data.frame(x = seq(-5, 5, by = 0.01))) + # stat_function(aes(x = x), fun = pbeta, args = list(shape1 = 1, shape2 = 2)) stat_function(aes(x = x), fun = pnorm, args = list(mean = 0, sd = 1)) one_samp <- rnorm(1) X <- data.frame(x = c(-5, one_samp, 5), y = c(0,1,1)) ggplot(data = data.frame(x = seq(-5, 5, by = 0.01))) + # stat_function(aes(x = x), fun = pbeta, args = list(shape1 = 1, shape2 = 2)) stat_function(aes(x = x), fun = pnorm, args = list(mean = 0, sd = 1)) + geom_step(data = X, aes(x = x, y = y)) N <- c(5, 10, 100, 1000) X <- NULL for (n in N) { tmp <- rnorm(n) tmp_X <- data.frame(x = c(-5, sort(tmp), 5), y = c(0, seq(1/n, 1, by = 1/n), 1), n = n) X <- rbind(X, tmp_X) } ggplot(data = data.frame(x = seq(-5, 5, by = 0.01))) + # stat_function(aes(x = x), fun = pbeta, args = list(shape1 = 1, shape2 = 2)) stat_function(aes(x = x), fun = pnorm, args = list(mean = 0, sd = 1)) + geom_step(data = X, aes(x = x, y = y, color = as.factor(n))) + labs(color = "N") 13.2 Properties of estimators Exercise 13.2 Show that the sample average is, as an estimator of the mean: unbiased, consistent, asymptotically normal. Solution. \\[\\begin{align*} E[\\frac{1}{n} \\sum_{i=1}^n X_i] &= \\frac{1}{n} \\sum_{i=i}^n E[X_i] \\\\ &= E[X]. 
\\end{align*}\\] \\[\\begin{align*} \\lim_{n \\rightarrow \\infty} P(|\\frac{1}{n} \\sum_{i=1}^n X_i - E[X]| > \\epsilon) &= \\lim_{n \\rightarrow \\infty} P((\\frac{1}{n} \\sum_{i=1}^n X_i - E[X])^2 > \\epsilon^2) \\\\ & \\leq \\lim_{n \\rightarrow \\infty} \\frac{E[(\\frac{1}{n} \\sum_{i=1}^n X_i - E[X])^2]}{\\epsilon^2} & \\text{Markov inequality} \\\\ & = \\lim_{n \\rightarrow \\infty} \\frac{E[(\\frac{1}{n} \\sum_{i=1}^n X_i)^2 - 2 \\frac{1}{n} \\sum_{i=1}^n X_i E[X] + E[X]^2]}{\\epsilon^2} \\\\ & = \\lim_{n \\rightarrow \\infty} \\frac{E[(\\frac{1}{n} \\sum_{i=1}^n X_i)^2] - 2 E[X]^2 + E[X]^2}{\\epsilon^2} \\\\ &= 0 \\end{align*}\\] For the last equality see the solution to ??. Follows directly from the CLT. Exercise 13.3 (Consistent but biased estimator) Show that sample variance (the plug-in estimator of variance) is a biased estimator of variance. Show that sample variance is a consistent estimator of variance. Show that the estimator with (\\(N-1\\)) (Bessel correction) is unbiased. Solution. \\[\\begin{align*} E[\\frac{1}{n} \\sum_{i=1}^n (Y_i - \\bar{Y})^2] &= \\frac{1}{n} \\sum_{i=1}^n E[(Y_i - \\bar{Y})^2] \\\\ &= \\frac{1}{n} \\sum_{i=1}^n E[Y_i^2] - 2 E[Y_i \\bar{Y}] + \\bar{Y}^2)] \\\\ &= \\frac{1}{n} \\sum_{i=1}^n E[Y_i^2 - 2 Y_i \\bar{Y} + \\bar{Y}^2] \\\\ &= \\frac{1}{n} \\sum_{i=1}^n E[Y_i^2 - \\frac{2}{n} Y_i^2 - \\frac{2}{n} \\sum_{i \\neq j} Y_i Y_j + \\frac{1}{n^2}\\sum_j \\sum_{k \\neq j} Y_j Y_k + \\frac{1}{n^2} \\sum_j Y_j^2] \\\\ &= \\frac{1}{n} \\sum_{i=1}^n \\frac{n - 2}{n} (\\sigma^2 + \\mu^2) - \\frac{2}{n} (n - 1) \\mu^2 + \\frac{1}{n^2}n(n-1)\\mu^2 + \\frac{1}{n^2}n(\\sigma^2 + \\mu^2) \\\\ &= \\frac{n-1}{n}\\sigma^2 \\\\ < \\sigma^2. \\end{align*}\\] Let \\(S_n\\) denote the sample variance. Then we can write it as \\[\\begin{align*} S_n &= \\frac{1}{n} \\sum_{i=1}^n (X_i - \\bar{X})^2 = \\frac{1}{n} \\sum_{i=1}^n (X_i - \\mu)^2 + 2(X_i - \\mu)(\\mu - \\bar{X}) + (\\mu - \\bar{X})^2. \\end{align*}\\] Now \\(\\bar{X}\\) converges in probability (by WLLN) to \\(\\mu\\) therefore the right terms converge in probability to zero. The left term converges in probability to \\(\\sigma^2\\), also by WLLN. Therefore the sample variance is a consistent estimatior of the variance. The denominator changes in the second-to-last line of a., therefore the last line is now equality. Exercise 13.4 (Estimating the median) Show that the sample median is an unbiased estimator of the median for N\\((\\mu, \\sigma^2)\\). Show that the sample median is an unbiased estimator of the mean for any distribution with symmetric density. Hint 1: The pdf of an order statistic is \\(f_{X_{(k)}}(x) = \\frac{n!}{(n - k)!(k - 1)!}f_X(x)\\Big(F_X(x)^{k-1} (1 - F_X(x)^{n - k}) \\Big)\\). Hint 2: A distribution is symmetric when \\(X\\) and \\(2a - X\\) have the same distribution for some \\(a\\). Solution. Let \\(Z_i\\), \\(i = 1,...,n\\) be i.i.d. variables with a symmetric distribution and let \\(Z_{k:n}\\) denote the \\(k\\)-th order statistic. We will distinguish two cases, when \\(n\\) is odd and when \\(n\\) is even. Let first \\(n = 2m + 1\\) be odd. Then the sample median is \\(M = Z_{m+1:2m+1}\\). Its PDF is \\[\\begin{align*} f_M(x) = (m+1)\\binom{2m + 1}{m}f_Z(x)\\Big(F_Z(x)^m (1 - F_Z(x)^m) \\Big). \\end{align*}\\] For every symmetric distribution, it holds that \\(F_X(x) = 1 - F(2a - x)\\). Let \\(a = \\mu\\), the population mean. Plugging this into the PDF, we get that \\(f_M(x) = f_M(2\\mu -x)\\). 
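In other words, the density of the sample median is symmetric about \\(\\mu\\), so \\(M\\) and \\(2\\mu - M\\) have the same distribution.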
It follows that \\[\\begin{align*} E[M] &= E[2\\mu - M] \\\\ 2E[M] &= 2\\mu \\\\ E[M] &= \\mu. \\end{align*}\\] Now let \\(n = 2m\\) be even. Then the sample median is \\(M = \\frac{Z_{m:2m} + Z_{m+1:2m}}{2}\\). It can be shown, that the joint PDF of these terms is also symmetric. Therefore, similar to the above \\[\\begin{align*} E[M] &= E[\\frac{Z_{m:2m} + Z_{m+1:2m}}{2}] \\\\ &= E[\\frac{2\\mu - M + 2\\mu - M}{2}] \\\\ &= E[2\\mu - M]. \\end{align*}\\] The above also proves point a. as the median and the mean are the same in normal distribution. Exercise 13.5 (Matrix trace estimation) The Hutchinson trace estimator [1] is an estimator of the trace of a symmetric positive semidefinite matrix A that relies on Monte Carlo sampling. The estimator is defined as \\[\\begin{align*} \\textrm{tr}(A) \\approx \\frac{1}{n} \\Sigma_{i=1}^n z_i^T A z_i, &\\\\ z_i \\sim_{\\mathrm{IID}} \\textrm{Uniform}(\\{-1, 1\\}^m), & \\end{align*}\\] where \\(A \\in \\mathbb{R}^{m \\times m}\\) is a symmetric positive semidefinite matrix. Elements of each vector \\(z_i\\) are either \\(-1\\) or \\(1\\) with equal probability. This is also called a Rademacher distribution. Data scientists often want the trace of a Hessian to obtain valuable curvature information for a loss function. Per [2], an example is classifying ten digits based on \\((28,28)\\) grayscale images (i.e. MNIST data) using logistic regression. The number of parameters is \\(m = 28^2 \\cdot 10 = 7840\\) and the size of the Hessian is \\(m^2\\), roughly \\(6 \\cdot 10^6\\). The diagonal average is equal to the average eigenvalue, which may be useful for optimization; in MCMC contexts, this would be useful for preconditioners and step size optimization. Computing Hessians (as a means of getting eigenvalue information) is often intractable, but Hessian-vector products can be computed faster by autodifferentiation (with e.g. Tensorflow, Pytorch, Jax). This is one motivation for the use of a stochastic trace estimator as outlined above. References: A stochastic estimator of the trace of the influence matrix for laplacian smoothing splines (Hutchinson, 1990) A Modern Analysis of Hutchinson’s Trace Estimator (Skorski, 2020) Prove that the Hutchinson trace estimator is an unbiased estimator of the trace. Solution. We first simplify our task: \\[\\begin{align} \\mathbb{E}\\left[\\frac{1}{n} \\Sigma_{i=1}^n z_i^T A z_i \\right] &= \\frac{1}{n} \\Sigma_{i=1}^n \\mathbb{E}\\left[z_i^T A z_i \\right] \\\\ &= \\mathbb{E}\\left[z_i^T A z_i \\right], \\end{align}\\] where the second equality is due to having \\(n\\) IID vectors \\(z_i\\). We now only need to show that \\(\\mathbb{E}\\left[z^T A z \\right] = \\mathrm{tr}(A)\\). We omit the index due to all vectors being IID: \\[\\begin{align} \\mathrm{tr}(A) &= \\mathrm{tr}(AI) \\\\ &= \\mathrm{tr}(A\\mathbb{E}[zz^T]) \\\\ &= \\mathbb{E}[\\mathrm{tr}(Azz^T)] \\\\ &= \\mathbb{E}[\\mathrm{tr}(z^TAz)] \\\\ &= \\mathbb{E}[z^TAz]. \\end{align}\\] This concludes the proof. We clarify some equalities below. The second equality assumes that \\(\\mathbb{E}[zz^T] = I\\). By noting that the mean of the Rademacher distribution is 0, we have \\[\\begin{align} \\mathrm{Cov}[z, z] &= \\mathbb{E}[(z - \\mathbb{E}[z])(z - \\mathbb{E}[z])^T] \\\\ &= \\mathbb{E}[zz^T]. \\end{align}\\] Dimensions of \\(z\\) are independent, so \\(\\mathrm{Cov}[z, z]_{ij} = 0\\) for \\(i \\neq j\\). 
The diagonal will contain variances, which are equal to \\(1\\) for all dimensions \\(k = 1 \\dots m\\): \\(\\mathrm{Var}[z^{(k)}] = \\mathbb{E}[z^{(k)}z^{(k)}] - \\mathbb{E}[z^{(k)}]^2 = 1 - 0 = 1\\). It follows that the covariance is an identity matrix. Note that this is a general result for vectors with IID dimensions sampled from a distribution with mean 0 and variance 1. We could therefore use something else instead of the Rademacher, e.g. \\(z ~ N(0, I)\\). The third equality uses the fact that the expectation of a trace equals the trace of an expectation. If \\(X\\) is a random matrix, then \\(\\mathbb{E}[X]_{ij} = \\mathbb{E}[X_{ij}]\\). Therefore: \\[\\begin{align} \\mathrm{tr}(\\mathbb{E}[X]) &= \\Sigma_{i=1}^m(\\mathbb{E}[X]_{ii}) \\\\ &= \\Sigma_{i=1}^m(\\mathbb{E}[X_{ii}]) \\\\ &= \\mathbb{E}[\\Sigma_{i=1}^m(X_{ii})] \\\\ &= \\mathbb{E}[\\mathrm{tr}(X)], \\end{align}\\] where we used the linearity of the expectation in the third step. The fourth equality uses the fact that \\(\\mathrm{tr}(AB) = \\mathrm{tr}(BA)\\) for any matrices \\(A \\in \\mathbb{R}^{n \\times m}, B \\in \\mathbb{R}^{m \\times n}\\). The last inequality uses the fact that the trace of a \\(1 \\times 1\\) matrix is just its element. "],["boot.html", "Chapter 14 Bootstrap", " Chapter 14 Bootstrap This chapter deals with bootstrap. The students are expected to acquire the following knowledge: How to use bootstrap to generate coverage intervals. .fold-btn { float: right; margin: 5px 5px 0 0; } .fold { border: 1px solid black; min-height: 40px; } Exercise 14.1 Ideally, a \\(1-\\alpha\\) CI would have \\(1-\\alpha\\) coverage. That is, say a 95% CI should, in the long run, contain the true value of the parameter 95% of the time. In practice, it is impossible to assess the coverage of our CI method, because we rarely know the true parameter. In simulation, however, we can. Let’s assess the coverage of bootstrap percentile intervals. Pick a univariate distribution with readily available mean and one that you can easily sample from. Draw \\(n = 30\\) random samples from the chosen distribution and use the bootstrap (with large enough m) and percentile CI method to construct 95% CI. Repeat the process many times and count how many times the CI contains the true mean. That is, compute the actual coverage probability (don’t forget to include the standard error of the coverage probability!). What can you observe? Try one or two different distributions. What can you observe? Repeat (b) and (c) using BCa intervals (R package boot). How does the coverage compare to percentile intervals? As (d) but using intervals based on asymptotic normality (+/- 1.96 SE). How do results from (b), (d), and (e) change if we increase the sample size to n = 200? What about n = 5? library(boot) set.seed(0) nit <- 1000 # Repeat the process "many times" alpha <- 0.05 # CI parameter nboot <- 100 # m parameter for bootstrap ("large enough m") # f: change this to 200 or 5. nsample <- 30 # n = 30 random samples from the chosen distribution. Comment out BCa code if it breaks. 
covers <- matrix(nrow = nit, ncol = 3) covers_BCa <- matrix(nrow = nit, ncol = 3) covers_asymp_norm <- matrix(nrow = nit, ncol = 3) isin <- function (x, lower, upper) { (x > lower) & (x < upper) } for (j in 1:nit) { # Repeating many times # a: pick a univariate distribution - standard normal x1 <- rnorm(nsample) # c: one or two different distributions - beta and poisson x2 <- rbeta(nsample, 1, 2) x3 <- rpois(nsample, 5) X1 <- matrix(data = NA, nrow = nsample, ncol = nboot) X2 <- matrix(data = NA, nrow = nsample, ncol = nboot) X3 <- matrix(data = NA, nrow = nsample, ncol = nboot) for (i in 1:nboot) { X1[ ,i] <- sample(x1, nsample, replace = T) X2[ ,i] <- sample(x2, nsample, T) X3[ ,i] <- sample(x3, nsample, T) } X1_func <- apply(X1, 2, mean) X2_func <- apply(X2, 2, mean) X3_func <- apply(X3, 2, mean) X1_quant <- quantile(X1_func, probs = c(alpha / 2, 1 - alpha / 2)) X2_quant <- quantile(X2_func, probs = c(alpha / 2, 1 - alpha / 2)) X3_quant <- quantile(X3_func, probs = c(alpha / 2, 1 - alpha / 2)) covers[j,1] <- (0 > X1_quant[1]) & (0 < X1_quant[2]) covers[j,2] <- ((1 / 3) > X2_quant[1]) & ((1 / 3) < X2_quant[2]) covers[j,3] <- (5 > X3_quant[1]) & (5 < X3_quant[2]) mf <- function (x, i) return(mean(x[i])) bootX1 <- boot(x1, statistic = mf, R = nboot) bootX2 <- boot(x2, statistic = mf, R = nboot) bootX3 <- boot(x3, statistic = mf, R = nboot) X1_quant_BCa <- boot.ci(bootX1, type = "bca")$bca X2_quant_BCa <- boot.ci(bootX2, type = "bca")$bca X3_quant_BCa <- boot.ci(bootX3, type = "bca")$bca covers_BCa[j,1] <- (0 > X1_quant_BCa[4]) & (0 < X1_quant_BCa[5]) covers_BCa[j,2] <- ((1 / 3) > X2_quant_BCa[4]) & ((1 / 3) < X2_quant_BCa[5]) covers_BCa[j,3] <- (5 > X3_quant_BCa[4]) & (5 < X3_quant_BCa[5]) # e: estimate mean and standard error # sample mean: x1_bar <- mean(x1) x2_bar <- mean(x2) x3_bar <- mean(x3) # standard error (of the sample mean) estimate: sample standard deviation / sqrt(n) x1_bar_SE <- sd(x1) / sqrt(nsample) x2_bar_SE <- sd(x2) / sqrt(nsample) x3_bar_SE <- sd(x3) / sqrt(nsample) covers_asymp_norm[j,1] <- isin(0, x1_bar - 1.96 * x1_bar_SE, x1_bar + 1.96 * x1_bar_SE) covers_asymp_norm[j,2] <- isin(1/3, x2_bar - 1.96 * x2_bar_SE, x2_bar + 1.96 * x2_bar_SE) covers_asymp_norm[j,3] <- isin(5, x3_bar - 1.96 * x3_bar_SE, x3_bar + 1.96 * x3_bar_SE) } apply(covers, 2, mean) ## [1] 0.918 0.925 0.905 apply(covers, 2, sd) / sqrt(nit) ## [1] 0.008680516 0.008333333 0.009276910 apply(covers_BCa, 2, mean) ## [1] 0.927 0.944 0.927 apply(covers_BCa, 2, sd) / sqrt(nit) ## [1] 0.008230355 0.007274401 0.008230355 apply(covers_asymp_norm, 2, mean) ## [1] 0.939 0.937 0.930 apply(covers_asymp_norm, 2, sd) / sqrt(nit) ## [1] 0.007572076 0.007687008 0.008072494 Exercise 14.2 You are given a sample of independent observations from a process of interest: Index 1 2 3 4 5 6 7 8 X 7 2 4 6 4 5 9 10 Compute the plug-in estimate of mean and 95% symmetric CI based on asymptotic normality. Use the plug-in estimate of SE. Same as (a), but use the unbiased estimate of SE. Apply nonparametric bootstrap with 1000 bootstrap replications and estimate the 95% CI for the mean with percentile-based CI. 
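For parts (a) and (b) the only difference is the variance estimate inside the standard error: the plug-in estimate divides by \\(n\\), \\(\\hat{\\sigma}^2 = \\frac{1}{n}\\sum_{i=1}^n (x_i - \\bar{x})^2\\), while the unbiased estimate divides by \\(n - 1\\); in both cases \\(\\hat{SE} = \\hat{\\sigma} / \\sqrt{n}\\). In the code below this corresponds to mean((x - mu)^2) versus sd(x)^2.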
# a x <- c(7, 2, 4, 6, 4, 5, 9, 10) n <- length(x) mu <- mean(x) SE <- sqrt(mean((x - mu)^2)) / sqrt(n) SE ## [1] 0.8915839 z <- qnorm(1 - 0.05 / 2) c(mu - z * SE, mu + z * SE) ## [1] 4.127528 7.622472 # b SE <- sd(x) / sqrt(n) SE ## [1] 0.9531433 c(mu - z * SE, mu + z * SE) ## [1] 4.006873 7.743127 # c set.seed(0) m <- 1000 T_mean <- function(x) {mean(x)} est_boot <- array(NA, m) for (i in 1:m) { x_boot <- x[sample(1:n, n, rep = T)] est_boot[i] <- T_mean(x_boot) } quantile(est_boot, p = c(0.025, 0.975)) ## 2.5% 97.5% ## 4.250 7.625 Exercise 14.3 We are given a sample of 10 independent paired (bivariate) observations: Index 1 2 3 4 5 6 7 8 9 10 X 1.26 -0.33 1.33 1.27 0.41 -1.54 -0.93 -0.29 -0.01 2.40 Y 2.64 0.33 0.48 0.06 -0.88 -2.14 -2.21 0.95 0.83 1.45 Compute Pearson correlation between X and Y. Use the cor.test() from R to estimate a 95% CI for the estimate from (a). Apply nonparametric bootstrap with 1000 bootstrap replications and estimate the 95% CI for the Pearson correlation with percentile-based CI. Compare CI from (b) and (c). Are they similar? How would the bootstrap estimation of CI change if we were interested in Spearman or Kendall correlation instead? x <- c(1.26, -0.33, 1.33, 1.27, 0.41, -1.54, -0.93, -0.29, -0.01, 2.40) y <- c(2.64, 0.33, 0.48, 0.06, -0.88, -2.14, -2.21, 0.95, 0.83, 1.45) # a cor(x, y) ## [1] 0.6991247 # b res <- cor.test(x, y) res$conf.int[1:2] ## [1] 0.1241458 0.9226238 # c set.seed(0) m <- 1000 n <- length(x) T_cor <- function(x, y) {cor(x, y)} est_boot <- array(NA, m) for (i in 1:m) { idx <- sample(1:n, n, rep = T) # !!! important to use same indices to keep dependency between x and y est_boot[i] <- T_cor(x[idx], y[idx]) } quantile(est_boot, p = c(0.025, 0.975)) ## 2.5% 97.5% ## 0.2565537 0.9057664 # d # Yes, but the bootstrap CI is more narrow. # e # We just use the functions for Kendall/Spearman coefficients instead: T_kendall <- function(x, y) {cor(x, y, method = "kendall")} T_spearman <- function(x, y) {cor(x, y, method = "spearman")} # Put this in a function that returns the CI bootstrap_95_ci <- function(x, y, t, m = 1000) { n <- length(x) est_boot <- array(NA, m) for (i in 1:m) { idx <- sample(1:n, n, rep = T) # !!! important to use same indices to keep dependency between x and y est_boot[i] <- t(x[idx], y[idx]) } quantile(est_boot, p = c(0.025, 0.975)) } bootstrap_95_ci(x, y, T_kendall) ## 2.5% 97.5% ## -0.08108108 0.78378378 bootstrap_95_ci(x, y, T_spearman) ## 2.5% 97.5% ## -0.1701115 0.8867925 Exercise 14.4 In this problem we will illustrate the use of the nonparametric bootstrap for estimating CIs of regression model coefficients. Load the longley dataset from base R with data(longley). Use lm() to apply linear regression using “Employed” as the target (dependent) variable and all other variables as the predictors (independent). Using lm() results, print the estimated regression coefficients and standard errors. Estimate 95% CI for the coefficients using +/- 1.96 * SE. Use nonparametric bootstrap with 100 replications to estimate the SE of the coefficients from (b). Compare the SE from (c) with those from (b). # a data(longley) # b res <- lm(Employed ~ . 
, longley) tmp <- data.frame(summary(res)$coefficients[,1:2]) tmp$LB <- tmp[,1] - 1.96 * tmp[,2] tmp$UB <- tmp[,1] + 1.96 * tmp[,2] tmp ## Estimate Std..Error LB UB ## (Intercept) -3.482259e+03 8.904204e+02 -5.227483e+03 -1.737035e+03 ## GNP.deflator 1.506187e-02 8.491493e-02 -1.513714e-01 1.814951e-01 ## GNP -3.581918e-02 3.349101e-02 -1.014616e-01 2.982320e-02 ## Unemployed -2.020230e-02 4.883997e-03 -2.977493e-02 -1.062966e-02 ## Armed.Forces -1.033227e-02 2.142742e-03 -1.453204e-02 -6.132495e-03 ## Population -5.110411e-02 2.260732e-01 -4.942076e-01 3.919994e-01 ## Year 1.829151e+00 4.554785e-01 9.364136e-01 2.721889e+00 # c set.seed(0) m <- 100 n <- nrow(longley) T_coef <- function(x) { lm(Employed ~ . , x)$coefficients } est_boot <- array(NA, c(m, ncol(longley))) for (i in 1:m) { idx <- sample(1:n, n, rep = T) est_boot[i,] <- T_coef(longley[idx,]) } SE <- apply(est_boot, 2, sd) SE ## [1] 1.826011e+03 1.605981e-01 5.693746e-02 8.204892e-03 3.802225e-03 ## [6] 3.907527e-01 9.414436e-01 # Show the standard errors around coefficients library(ggplot2) library(reshape2) df <- data.frame(index = 1:7, bootstrap_SE = SE, lm_SE = tmp$Std..Error) melted_df <- melt(df[2:nrow(df), ], id.vars = "index") # Ignore bias which has a really large magnitude ggplot(melted_df, aes(x = index, y = value, fill = variable)) + geom_bar(stat="identity", position="dodge") + xlab("Coefficient") + ylab("Standard error") # + scale_y_continuous(trans = "log") # If you want to also plot bias Exercise 14.5 This exercise shows a shortcoming of the bootstrap method when using the plug in estimator for the maximum. Compute the 95% bootstrap CI for the maximum of a standard normal distribution. Compute the 95% bootstrap CI for the maximum of a binomial distribution with n = 15 and p = 0.2. Repeat (b) using p = 0.9. Why is the result different? # bootstrap CI for maximum alpha <- 0.05 T_max <- function(x) {max(x)} # Equal to T_max = max bootstrap <- function(x, t, m = 1000) { n <- length(x) values <- rep(0, m) for (i in 1:m) { values[i] <- t(sample(x, n, replace = T)) } quantile(values, probs = c(alpha / 2, 1 - alpha / 2)) } # a # Meaningless, as the normal distribution can yield arbitrarily large values. x <- rnorm(100) bootstrap(x, T_max) ## 2.5% 97.5% ## 1.819425 2.961743 # b x <- rbinom(100, size = 15, prob = 0.2) # min = 0, max = 15 bootstrap(x, T_max) ## 2.5% 97.5% ## 6 7 # c x <- rbinom(100, size = 15, prob = 0.9) # min = 0, max = 15 bootstrap(x, T_max) ## 2.5% 97.5% ## 15 15 # Observation: to estimate the maximum, we need sufficient probability mass near the maximum value the distribution can yield. # Using bootstrap is pointless when there is too little mass near the true maximum. # In general, bootstrap will fail when estimating the CI for the maximum. "],["ml.html", "Chapter 15 Maximum likelihood 15.1 Deriving MLE 15.2 Fisher information 15.3 The German tank problem", " Chapter 15 Maximum likelihood This chapter deals with maximum likelihood estimation. The students are expected to acquire the following knowledge: How to derive MLE. Applying MLE in R. Calculating and interpreting Fisher information. Practical use of MLE. .fold-btn { float: right; margin: 5px 5px 0 0; } .fold { border: 1px solid black; min-height: 40px; } 15.1 Deriving MLE Exercise 15.1 Derive the maximum likelihood estimator of variance for N\\((\\mu, \\sigma^2)\\). Compare with results from 13.3. What does that say about the MLE estimator? Solution. 
The mean is assumed constant, so we have the likelihood \\[\\begin{align} L(\\sigma^2; y) &= \\prod_{i=1}^n \\frac{1}{\\sqrt{2 \\pi \\sigma^2}} e^{-\\frac{(y_i - \\mu)^2}{2 \\sigma^2}} \\\\ &= \\frac{1}{\\sqrt{2 \\pi \\sigma^2}^n} e^{\\frac{-\\sum_{i=1}^n (y_i - \\mu)^2}{2 \\sigma^2}} \\end{align}\\] We need to find the maximum of this function. We first observe that we can replace \\(\\frac{-\\sum_{i=1}^n (y_i - \\mu)^2}{2}\\) with a constant \\(c\\), since none of the terms are dependent on \\(\\sigma^2\\). Additionally, the term \\(\\frac{1}{\\sqrt{2 \\pi}^n}\\) does not affect the calculation of the maximum. So now we have \\[\\begin{align} L(\\sigma^2; y) &= (\\sigma^2)^{-\\frac{n}{2}} e^{\\frac{c}{\\sigma^2}}. \\end{align}\\] Differentiating we get \\[\\begin{align} \\frac{d}{d \\sigma^2} L(\\sigma^2; y) &= (\\sigma^2)^{-\\frac{n}{2}} \\frac{d}{d \\sigma^2} e^{\\frac{c}{\\sigma^2}} + e^{\\frac{c}{\\sigma^2}} \\frac{d}{d \\sigma^2} (\\sigma^2)^{-\\frac{n}{2}} \\\\ &= - (\\sigma^2)^{-\\frac{n}{2}} e^{\\frac{c}{\\sigma^2}} \\frac{c}{(\\sigma^2)^2} - e^{\\frac{c}{\\sigma^2}} \\frac{n}{2} (\\sigma^2)^{-\\frac{n + 2}{2}} \\\\ &= - (\\sigma^2)^{-\\frac{n + 4}{2}} e^{\\frac{c}{\\sigma^2}} c - e^{\\frac{c}{\\sigma^2}} \\frac{n}{2} (\\sigma^2)^{-\\frac{n + 2}{2}} \\\\ &= - e^{\\frac{c}{\\sigma^2}} (\\sigma^2)^{-\\frac{n + 4}{2}} \\Big(c + \\frac{n}{2}\\sigma^2 \\Big). \\end{align}\\] To get the maximum, this has to equal to 0, so \\[\\begin{align} c + \\frac{n}{2}\\sigma^2 &= 0 \\\\ \\sigma^2 &= -\\frac{2c}{n} \\\\ \\sigma^2 &= \\frac{\\sum_{i=1}^n (Y_i - \\mu)^2}{n}. \\end{align}\\] The MLE estimator is biased. Exercise 15.2 (Multivariate normal distribution) Derive the maximum likelihood estimate for the mean and covariance matrix of the multivariate normal. Simulate \\(n = 40\\) samples from a bivariate normal distribution (choose non-trivial parameters, that is, mean \\(\\neq 0\\) and covariance \\(\\neq 0\\)). Compute the MLE for the sample. Overlay the data with an ellipse that is determined by the MLE and an ellipse that is determined by the chosen true parameters. Repeat b. several times and observe how the estimates (ellipses) vary around the true value. Hint: For the derivation of MLE, these identities will be helpful: \\(\\frac{\\partial b^T a}{\\partial a} = \\frac{\\partial a^T b}{\\partial a} = b\\), \\(\\frac{\\partial a^T A a}{\\partial a} = (A + A^T)a\\), \\(\\frac{\\partial \\text{tr}(BA)}{\\partial A} = B^T\\), \\(\\frac{\\partial \\ln |A|}{\\partial A} = (A^{-1})^T\\), \\(a^T A a = \\text{tr}(a^T A a) = \\text{tr}(a a^T A) = \\text{tr}(Aaa^T)\\). Solution. The log likelihood of the MVN distribution is \\[\\begin{align*} l(\\mu, \\Sigma ; x) &= -\\frac{1}{2}\\Big(\\sum_{i=1}^n k\\ln(2\\pi) + |\\Sigma| + (x_i - \\mu)^T \\Sigma^{-1} (x_i - \\mu)\\Big) \\\\ &= -\\frac{n}{2}\\ln|\\Sigma| + -\\frac{1}{2}\\Big(\\sum_{i=1}^n(x_i - \\mu)^T \\Sigma^{-1} (x_i - \\mu)\\Big) + c, \\end{align*}\\] where \\(c\\) is a constant with respect to \\(\\mu\\) and \\(\\Sigma\\). To find the MLE we first need to find partial derivatives. Let us start with \\(\\mu\\). \\[\\begin{align*} \\frac{\\partial}{\\partial \\mu}l(\\mu, \\Sigma ; x) &= \\frac{\\partial}{\\partial \\mu} -\\frac{1}{2}\\Big(\\sum_{i=1}^n x_i^T \\Sigma^{-1} x_i - x_i^T \\Sigma^{-1} \\mu - \\mu^T \\Sigma^{-1} x_i + \\mu^T \\Sigma^{-1} \\mu \\Big) \\\\ &= -\\frac{1}{2}\\Big(\\sum_{i=1}^n - \\Sigma^{-1} x_i - \\Sigma^{-1} x_i + 2 \\Sigma^{-1} \\mu \\Big) \\\\ &= -\\Sigma^{-1}\\Big(\\sum_{i=1}^n - x_i + \\mu \\Big). 
\\end{align*}\\] Equating above with zero, we get \\[\\begin{align*} \\sum_{i=1}^n - x_i + \\mu &= 0 \\\\ \\hat{\\mu} = \\frac{1}{n} \\sum_{i=1}^n x_i, \\end{align*}\\] which is the dimension-wise empirical mean. Now for the covariance matrix \\[\\begin{align*} \\frac{\\partial}{\\partial \\Sigma^{-1}}l(\\mu, \\Sigma ; x) &= \\frac{\\partial}{\\partial \\Sigma^{-1}} -\\frac{n}{2}\\ln|\\Sigma| + -\\frac{1}{2}\\Big(\\sum_{i=1}^n(x_i - \\mu)^T \\Sigma^{-1} (x_i - \\mu)\\Big) \\\\ &= \\frac{\\partial}{\\partial \\Sigma^{-1}} -\\frac{n}{2}\\ln|\\Sigma| + -\\frac{1}{2}\\Big(\\sum_{i=1}^n \\text{tr}((x_i - \\mu)^T \\Sigma^{-1} (x_i - \\mu))\\Big) \\\\ &= \\frac{\\partial}{\\partial \\Sigma^{-1}} -\\frac{n}{2}\\ln|\\Sigma| + -\\frac{1}{2}\\Big(\\sum_{i=1}^n \\text{tr}((\\Sigma^{-1} (x_i - \\mu) (x_i - \\mu)^T )\\Big) \\\\ &= \\frac{n}{2}\\Sigma + -\\frac{1}{2}\\Big(\\sum_{i=1}^n (x_i - \\mu) (x_i - \\mu)^T \\Big). \\end{align*}\\] Equating above with zero, we get \\[\\begin{align*} \\hat{\\Sigma} = \\frac{1}{n}\\sum_{i=1}^n (x_i - \\mu) (x_i - \\mu)^T. \\end{align*}\\] set.seed(1) n <- 40 mu <- c(1, -2) Sigma <- matrix(data = c(2, -1.6, -1.6, 1.8), ncol = 2) X <- mvrnorm(n = n, mu = mu, Sigma = Sigma) colnames(X) <- c("X1", "X2") X <- as.data.frame(X) # plot.new() tru_ellip <- ellipse(mu, Sigma, draw = FALSE) colnames(tru_ellip) <- c("X1", "X2") tru_ellip <- as.data.frame(tru_ellip) mu_est <- apply(X, 2, mean) tmp <- as.matrix(sweep(X, 2, mu_est)) Sigma_est <- (1 / n) * t(tmp) %*% tmp est_ellip <- ellipse(mu_est, Sigma_est, draw = FALSE) colnames(est_ellip) <- c("X1", "X2") est_ellip <- as.data.frame(est_ellip) ggplot(data = X, aes(x = X1, y = X2)) + geom_point() + geom_path(data = tru_ellip, aes(x = X1, y = X2, color = "truth")) + geom_path(data = est_ellip, aes(x = X1, y = X2, color = "estimated")) + labs(color = "type") Exercise 15.3 (Logistic regression) Logistic regression is a popular discriminative model when our target variable is binary (categorical with 2 values). One of the ways of looking at logistic regression is that it is linear regression but instead of using the linear term as the mean of a normal RV, we use it as the mean of a Bernoulli RV. Of course, the mean of a Bernoulli is bounded on \\([0,1]\\), so, to avoid non-sensical values, we squeeze the linear between 0 and 1 with the inverse logit function inv_logit\\((z) = 1 / (1 + e^{-z})\\). This leads to the following model: \\(y_i | \\beta, x_i \\sim \\text{Bernoulli}(\\text{inv_logit}(\\beta x_i))\\). Explicitly write the likelihood function of beta. Implement the likelihood function in R. Use black-box box-constraint optimization (for example, optim() with L-BFGS) to find the maximum likelihood estimate for beta for \\(x\\) and \\(y\\) defined below. Plot the estimated probability as a function of the independent variable. Compare with the truth. Let \\(y2\\) be a response defined below. Will logistic regression work well on this dataset? Why not? How can we still use the model, without changing it? inv_log <- function (z) { return (1 / (1 + exp(-z))) } set.seed(1) x <- rnorm(100) y <- rbinom(100, size = 1, prob = inv_log(1.2 * x)) y2 <- rbinom(100, size = 1, prob = inv_log(1.2 * x + 1.4 * x^2)) Solution. \\[\\begin{align*} l(\\beta; x, y) &= p(y | x, \\beta) \\\\ &= \\ln(\\prod_{i=1}^n \\text{inv_logit}(\\beta x_i)^{y_i} (1 - \\text{inv_logit}(\\beta x_i))^{1 - y_i}) \\\\ &= \\sum_{i=1}^n y_i \\ln(\\text{inv_logit}(\\beta x_i)) + (1 - y_i) \\ln(1 - \\text{inv_logit}(\\beta x_i)). 
\\end{align*}\\] set.seed(1) inv_log <- function (z) { return (1 / (1 + exp(-z))) } x <- rnorm(100) y <- x y <- rbinom(100, size = 1, prob = inv_log(1.2 * x)) l_logistic <- function (beta, X, y) { logl <- -sum(y * log(inv_log(as.vector(beta %*% X))) + (1 - y) * log((1 - inv_log(as.vector(beta %*% X))))) return(logl) } my_optim <- optim(par = 0.5, fn = l_logistic, method = "L-BFGS-B", lower = 0, upper = 10, X = x, y = y) my_optim$par ## [1] 1.166558 truth_p <- data.frame(x = x, prob = inv_log(1.2 * x), type = "truth") est_p <- data.frame(x = x, prob = inv_log(my_optim$par * x), type = "estimated") plot_df <- rbind(truth_p, est_p) ggplot(data = plot_df, aes(x = x, y = prob, color = type)) + geom_point(alpha = 0.3) y2 <- rbinom(2000, size = 1, prob = inv_log(1.2 * x + 1.4 * x^2)) X2 <- cbind(x, x^2) my_optim2 <- optim(par = c(0, 0), fn = l_logistic, method = "L-BFGS-B", lower = c(0, 0), upper = c(2, 2), X = t(X2), y = y2) my_optim2$par ## [1] 1.153656 1.257649 tmp <- sweep(data.frame(x = x, x2 = x^2), 2, my_optim2$par, FUN = "*") tmp <- tmp[ ,1] + tmp[ ,2] truth_p <- data.frame(x = x, prob = inv_log(1.2 * x + 1.4 * x^2), type = "truth") est_p <- data.frame(x = x, prob = inv_log(tmp), type = "estimated") plot_df <- rbind(truth_p, est_p) ggplot(data = plot_df, aes(x = x, y = prob, color = type)) + geom_point(alpha = 0.3) Exercise 15.4 (Linear regression) For the data generated below, do the following: Compute the least squares (MLE) estimate of coefficients beta using the matrix exact solution. Compute the MLE by minimizing the sum of squared residuals using black-box optimization (optim()). Compute the MLE by using the output built-in linear regression (lm() ). Compare (a-c and the true coefficients). Compute 95% CI on the beta coefficients using the output of built-in linear regression. Compute 95% CI on the beta coefficients by using (a or b) and the bootstrap with percentile method for CI. Compare with d. 
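Parts (a) and (d) rest on the closed-form least-squares solution and its classical standard errors; the sketch below collects them in one place. The helper name ols_fit is ours, and the formulas assume a model without an intercept, matching the data-generating process in this exercise.

```r
# Closed-form least squares / MLE and classical standard errors (sketch).
ols_fit <- function(X, y) {
  XtX_inv  <- solve(t(X) %*% X)
  beta_hat <- XtX_inv %*% t(X) %*% y                 # (X'X)^{-1} X'y
  resid    <- y - X %*% beta_hat
  sigma2   <- sum(resid^2) / (nrow(X) - ncol(X))     # residual variance, n - p df
  se       <- sqrt(diag(sigma2 * XtX_inv))           # classical SEs, as in summary(lm)
  list(beta = as.vector(beta_hat), se = se,
       ci = cbind(lower = as.vector(beta_hat) - 1.96 * se,
                  upper = as.vector(beta_hat) + 1.96 * se))
}
```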
set.seed(1) n <- 100 x1 <- rnorm(n) x2 <- rnorm(n) x3 <- rnorm(n) X <- cbind(x1, x2, x3) beta <- c(0.2, 0.6, -1.2) y <- as.vector(t(beta %*% t(X))) + rnorm(n, sd = 0.2) set.seed(1) n <- 100 x1 <- rnorm(n) x2 <- rnorm(n) x3 <- rnorm(n) X <- cbind(x1, x2, x3) beta <- c(0.2, 0.6, -1.2) y <- as.vector(t(beta %*% t(X))) + rnorm(n, sd = 0.2) LS_fun <- function (beta, X, y) { return(sum((y - beta %*% t(X))^2)) } my_optim <- optim(par = c(0, 0, 0), fn = LS_fun, lower = -5, upper = 5, X = X, y = y, method = "L-BFGS-B") my_optim$par ## [1] 0.1898162 0.5885946 -1.1788264 df <- data.frame(y = y, x1 = x1, x2 = x2, x3 = x3) my_lm <- lm(y ~ x1 + x2 + x3 - 1, data = df) my_lm ## ## Call: ## lm(formula = y ~ x1 + x2 + x3 - 1, data = df) ## ## Coefficients: ## x1 x2 x3 ## 0.1898 0.5886 -1.1788 # matrix solution beta_hat <- solve(t(X) %*% X) %*% t(X) %*% y beta_hat ## [,1] ## x1 0.1898162 ## x2 0.5885946 ## x3 -1.1788264 out <- summary(my_lm) out$coefficients[ ,2] ## x1 x2 x3 ## 0.02209328 0.02087542 0.01934506 # bootstrap CI nboot <- 1000 beta_boot <- matrix(data = NA, ncol = length(beta), nrow = nboot) for (i in 1:nboot) { inds <- sample(1:n, n, replace = T) new_df <- df[inds, ] X_tmp <- as.matrix(new_df[ ,-1]) y_tmp <- new_df[ ,1] # print(nrow(new_df)) tmp_beta <- solve(t(X_tmp) %*% X_tmp) %*% t(X_tmp) %*% y_tmp beta_boot[i, ] <- tmp_beta } apply(beta_boot, 2, mean) ## [1] 0.1893281 0.5887068 -1.1800738 apply(beta_boot, 2, quantile, probs = c(0.025, 0.975)) ## [,1] [,2] [,3] ## 2.5% 0.1389441 0.5436911 -1.221560 ## 97.5% 0.2386295 0.6363102 -1.140416 out$coefficients[ ,2] ## x1 x2 x3 ## 0.02209328 0.02087542 0.01934506 Exercise 15.5 (Principal component analysis) Load the olympic data set from package ade4. The data show decathlon results for 33 men in 1988 Olympic Games. This data set serves as a great example of finding the latent structure in the data, as there are certain characteristics of the athletes that make them excel at different events. For example an explosive athlete will do particulary well in sprints and long jumps. Perform PCA (prcomp) on the data set and interpret the first 2 latent dimensions. Hint: Standardize the data first to get meaningful results. Use MLE to estimate the covariance of the standardized multivariate distribution. Decompose the estimated covariance matrix with the eigendecomposition. Compare the eigenvectors to the output of PCA. 
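Before turning to the olympic data, a small simulated check of the link the exercise is after: for standardized data, prcomp and the eigendecomposition of the MLE covariance estimate give the same directions, possibly with flipped signs, and dividing by \(n\) instead of \(n-1\) only rescales the eigenvalues. The toy data below are ours, not the olympic set.

```r
# Toy check: PCA loadings vs. eigenvectors of the MLE covariance of scaled data.
set.seed(42)
Z <- scale(matrix(rnorm(200 * 4), ncol = 4))
pca <- prcomp(Z)
eig <- eigen((1 / nrow(Z)) * t(Z) %*% Z)        # MLE covariance: divide by n
max(abs(abs(pca$rotation) - abs(eig$vectors)))  # ~ 0; columns match up to sign
```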
data(olympic) X <- olympic$tab X_scaled <- scale(X) my_pca <- prcomp(X_scaled) summary(my_pca) ## Importance of components: ## PC1 PC2 PC3 PC4 PC5 PC6 PC7 ## Standard deviation 1.8488 1.6144 0.97123 0.9370 0.74607 0.70088 0.65620 ## Proportion of Variance 0.3418 0.2606 0.09433 0.0878 0.05566 0.04912 0.04306 ## Cumulative Proportion 0.3418 0.6025 0.69679 0.7846 0.84026 0.88938 0.93244 ## PC8 PC9 PC10 ## Standard deviation 0.55389 0.51667 0.31915 ## Proportion of Variance 0.03068 0.02669 0.01019 ## Cumulative Proportion 0.96312 0.98981 1.00000 autoplot(my_pca, data = X, loadings = TRUE, loadings.colour = 'blue', loadings.label = TRUE, loadings.label.size = 3) Sigma_est <- (1 / nrow(X_scaled)) * t(X_scaled) %*% X_scaled Sigma_dec <- eigen(Sigma_est) Sigma_dec$vectors ## [,1] [,2] [,3] [,4] [,5] [,6] ## [1,] 0.4158823 0.1488081 -0.26747198 -0.08833244 -0.442314456 0.03071237 ## [2,] -0.3940515 -0.1520815 -0.16894945 -0.24424963 0.368913901 -0.09378242 ## [3,] -0.2691057 0.4835374 0.09853273 -0.10776276 -0.009754680 0.23002054 ## [4,] -0.2122818 0.0278985 -0.85498656 0.38794393 -0.001876311 0.07454380 ## [5,] 0.3558474 0.3521598 -0.18949642 0.08057457 0.146965351 -0.32692886 ## [6,] 0.4334816 0.0695682 -0.12616012 -0.38229029 -0.088802794 0.21049130 ## [7,] -0.1757923 0.5033347 0.04609969 0.02558404 0.019358607 0.61491241 ## [8,] -0.3840821 0.1495820 0.13687235 0.14396548 -0.716743474 -0.34776037 ## [9,] -0.1799436 0.3719570 -0.19232803 -0.60046566 0.095582043 -0.43744387 ## [10,] 0.1701426 0.4209653 0.22255233 0.48564231 0.339772188 -0.30032419 ## [,7] [,8] [,9] [,10] ## [1,] 0.2543985 0.663712826 -0.10839531 0.10948045 ## [2,] 0.7505343 0.141264141 0.04613910 0.05580431 ## [3,] -0.1106637 0.072505560 0.42247611 0.65073655 ## [4,] -0.1351242 -0.155435871 -0.10206505 0.11941181 ## [5,] 0.1413388 -0.146839303 0.65076229 -0.33681395 ## [6,] 0.2725296 -0.639003579 -0.20723854 0.25971800 ## [7,] 0.1439726 0.009400445 -0.16724055 -0.53450315 ## [8,] 0.2732665 -0.276873049 -0.01766443 -0.06589572 ## [9,] -0.3419099 0.058519366 -0.30619617 -0.13093187 ## [10,] 0.1868704 0.007310045 -0.45688227 0.24311846 my_pca$rotation ## PC1 PC2 PC3 PC4 PC5 PC6 ## 100 -0.4158823 0.1488081 0.26747198 -0.08833244 -0.442314456 0.03071237 ## long 0.3940515 -0.1520815 0.16894945 -0.24424963 0.368913901 -0.09378242 ## poid 0.2691057 0.4835374 -0.09853273 -0.10776276 -0.009754680 0.23002054 ## haut 0.2122818 0.0278985 0.85498656 0.38794393 -0.001876311 0.07454380 ## 400 -0.3558474 0.3521598 0.18949642 0.08057457 0.146965351 -0.32692886 ## 110 -0.4334816 0.0695682 0.12616012 -0.38229029 -0.088802794 0.21049130 ## disq 0.1757923 0.5033347 -0.04609969 0.02558404 0.019358607 0.61491241 ## perc 0.3840821 0.1495820 -0.13687235 0.14396548 -0.716743474 -0.34776037 ## jave 0.1799436 0.3719570 0.19232803 -0.60046566 0.095582043 -0.43744387 ## 1500 -0.1701426 0.4209653 -0.22255233 0.48564231 0.339772188 -0.30032419 ## PC7 PC8 PC9 PC10 ## 100 0.2543985 -0.663712826 0.10839531 -0.10948045 ## long 0.7505343 -0.141264141 -0.04613910 -0.05580431 ## poid -0.1106637 -0.072505560 -0.42247611 -0.65073655 ## haut -0.1351242 0.155435871 0.10206505 -0.11941181 ## 400 0.1413388 0.146839303 -0.65076229 0.33681395 ## 110 0.2725296 0.639003579 0.20723854 -0.25971800 ## disq 0.1439726 -0.009400445 0.16724055 0.53450315 ## perc 0.2732665 0.276873049 0.01766443 0.06589572 ## jave -0.3419099 -0.058519366 0.30619617 0.13093187 ## 1500 0.1868704 -0.007310045 0.45688227 -0.24311846 15.2 Fisher information Exercise 15.6 Let us assume a Poisson likelihood. 
Derive the MLE estimate of the mean. Derive the Fisher information. For the data below compute the MLE and construct confidence intervals. Use bootstrap to construct the CI for the mean. Compare with c) and discuss. x <- c(2, 5, 3, 1, 2, 1, 0, 3, 0, 2) Solution. The log likelihood of the Poisson is \\[\\begin{align*} l(\\lambda; x) = \\sum_{i=1}^n x_i \\ln \\lambda - n \\lambda - \\sum_{i=1}^n \\ln x_i! \\end{align*}\\] Taking the derivative and equating with 0 we get \\[\\begin{align*} \\frac{1}{\\hat{\\lambda}}\\sum_{i=1}^n x_i - n &= 0 \\\\ \\hat{\\lambda} &= \\frac{1}{n} \\sum_{i=1}^n x_i. \\end{align*}\\] Since \\(\\lambda\\) is the mean parameter, this was expected. For the Fischer information, we first need the second derivative, which is \\[\\begin{align*} - \\lambda^{-2} \\sum_{i=1}^n x_i. \\\\ \\end{align*}\\] Now taking the expectation of the negative of the above, we get \\[\\begin{align*} E[\\lambda^{-2} \\sum_{i=1}^n x_i] &= \\lambda^{-2} E[\\sum_{i=1}^n x_i] \\\\ &= \\lambda^{-2} n \\lambda \\\\ &= \\frac{n}{\\lambda}. \\end{align*}\\] set.seed(1) x <- c(2, 5, 3, 1, 2, 1, 0, 3, 0, 2) lambda_hat <- mean(x) finfo <- length(x) / lambda_hat mle_CI <- c(lambda_hat - 1.96 * sqrt(1 / finfo), lambda_hat + 1.96 * sqrt(1 / finfo)) boot_lambda <- c() nboot <- 1000 for (i in 1:nboot) { tmp_x <- sample(x, length(x), replace = T) boot_lambda[i] <- mean(tmp_x) } boot_CI <- c(quantile(boot_lambda, 0.025), quantile(boot_lambda, 0.975)) mle_CI ## [1] 1.045656 2.754344 boot_CI ## 2.5% 97.5% ## 1.0 2.7 Exercise 15.7 Find the Fisher information matrix for the Gamma distribution. Generate 20 samples from a Gamma distribution and plot a confidence ellipse of the inverse of Fisher information matrix around the ML estimates of the parameters. Also plot the theoretical values. Repeat the sampling several times. What do you observe? Discuss what a non-diagonal Fisher matrix implies. Hint: The digamma function is defined as \\(\\psi(x) = \\frac{\\frac{d}{dx} \\Gamma(x)}{\\Gamma(x)}\\). Additionally, you do not need to evaluate \\(\\frac{d}{dx} \\psi(x)\\). To calculate its value in R, use package numDeriv. Solution. The log likelihood of the Gamma is \\[\\begin{equation*} l(\\alpha, \\beta; x) = n \\alpha \\ln \\beta - n \\ln \\Gamma(\\alpha) + (\\alpha - 1) \\sum_{i=1}^n \\ln x_i - \\beta \\sum_{i=1}^n x_i. \\end{equation*}\\] Let us calculate the derivatives. \\[\\begin{align*} \\frac{\\partial}{\\partial \\alpha} l(\\alpha, \\beta; x) &= n \\ln \\beta - n \\psi(\\alpha) + \\sum_{i=1}^n \\ln x_i, \\\\ \\frac{\\partial}{\\partial \\beta} l(\\alpha, \\beta; x) &= \\frac{n \\alpha}{\\beta} - \\sum_{i=1}^n x_i, \\\\ \\frac{\\partial^2}{\\partial \\alpha \\beta} l(\\alpha, \\beta; x) &= \\frac{n}{\\beta}, \\\\ \\frac{\\partial^2}{\\partial \\alpha^2} l(\\alpha, \\beta; x) &= - n \\frac{\\partial}{\\partial \\alpha} \\psi(\\alpha), \\\\ \\frac{\\partial^2}{\\partial \\beta^2} l(\\alpha, \\beta; x) &= - \\frac{n \\alpha}{\\beta^2}. \\end{align*}\\] The Fisher information matrix is then \\[\\begin{align*} I(\\alpha, \\beta) = - E[ \\begin{bmatrix} - n \\psi'(\\alpha) & \\frac{n}{\\beta} \\\\ \\frac{n}{\\beta} & - \\frac{n \\alpha}{\\beta^2} \\end{bmatrix} ] = \\begin{bmatrix} n \\psi'(\\alpha) & - \\frac{n}{\\beta} \\\\ - \\frac{n}{\\beta} & \\frac{n \\alpha}{\\beta^2} \\end{bmatrix} \\end{align*}\\] A non-diagonal Fisher matrix implies that the parameter estimates are linearly dependent. 
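One way to read that last statement more precisely: the exercise asks for a confidence ellipse built from the inverse of the Fisher information, and that inverse is (asymptotically) the covariance matrix of the ML estimates, so non-zero off-diagonal entries mean \(\hat{\alpha}\) and \(\hat{\beta}\) are correlated rather than independent. A quick sketch at the same values used in the simulation below (\(\alpha = 5\), \(\beta = 2\), \(n = 20\)):

```r
# Read the off-diagonal terms through the inverse Fisher information,
# i.e. the asymptotic covariance of the ML estimates.
library(numDeriv)
alpha <- 5; beta <- 2; n <- 20
I_mat <- matrix(c(n * grad(digamma, alpha), -n / beta,
                  -n / beta,                n * alpha / beta^2), nrow = 2)
cov2cor(solve(I_mat))   # off-diagonal ~ 0.95: alpha-hat and beta-hat move together
```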
set.seed(1) n <- 20 pars_theor <- c(5, 2) x <- rgamma(n, 5, 2) # MLE for alpha and beta log_lik <- function (pars, x) { n <- length(x) return (- (n * pars[1] * log(pars[2]) - n * log(gamma(pars[1])) + (pars[1] - 1) * sum(log(x)) - pars[2] * sum(x))) } my_optim <- optim(par = c(1,1), fn = log_lik, method = "L-BFGS-B", lower = c(0.001, 0.001), upper = c(8, 8), x = x) pars_mle <- my_optim$par fish_mat <- matrix(data = NA, nrow = 2, ncol = 2) fish_mat[1,2] <- - n / pars_mle[2] fish_mat[2,1] <- - n / pars_mle[2] fish_mat[2,2] <- (n * pars_mle[1]) / (pars_mle[2]^2) fish_mat[1,1] <- n * grad(digamma, pars_mle[1]) fish_mat_inv <- solve(fish_mat) est_ellip <- ellipse(pars_mle, fish_mat_inv, draw = FALSE) colnames(est_ellip) <- c("X1", "X2") est_ellip <- as.data.frame(est_ellip) ggplot() + geom_point(data = data.frame(x = pars_mle[1], y = pars_mle[2]), aes(x = x, y = y)) + geom_path(data = est_ellip, aes(x = X1, y = X2)) + geom_point(aes(x = pars_theor[1], y = pars_theor[2]), color = "red") + geom_text(aes(x = pars_theor[1], y = pars_theor[2], label = "Theoretical parameters"), color = "red", nudge_y = -0.2) 15.3 The German tank problem Exercise 15.8 (The German tank problem) During WWII the allied intelligence were faced with an important problem of estimating the total production of certain German tanks, such as the Panther. What turned out to be a successful approach was to estimate the maximum from the serial numbers of the small sample of captured or destroyed tanks (describe the statistical model used). What assumptions were made by using the above model? Do you think they are reasonable assumptions in practice? Show that the plug-in estimate for the maximum (i.e. the maximum of the sample) is a biased estimator. Derive the maximum likelihood estimate of the maximum. Check that the following estimator is not biased: \\(\\hat{n} = \\frac{k + 1}{k}m - 1\\). Solution. The data are the serial numbers of the tanks. The parameter is \\(n\\), the total production of the tank. The distribution of the serial numbers is a discrete uniform distribution over all serial numbers. One of the assumptions is that we have i.i.d samples, however in practice this might not be true, as some tanks produced later could be sent to the field later, therefore already in theory we would not be able to recover some values from the population. To find the expected value we first need to find the distribution of \\(m\\). Let us start with the CDF. \\[\\begin{align*} F_m(x) = P(Y_1 < x,...,Y_k < x). \\end{align*}\\] If \\(x < k\\) then \\(F_m(x) = 0\\) and if \\(x \\geq 1\\) then \\(F_m(x) = 1\\). What about between those values. So the probability that the maximum value is less than or equal to \\(m\\) is just the number of possible draws from \\(Y\\) that are all smaller than \\(m\\), divided by all possible draws. This is \\(\\frac{{x}\\choose{k}}{{n}\\choose{k}}\\). The PDF on the suitable bounds is then \\[\\begin{align*} P(m = x) = F_m(x) - F_m(x - 1) = \\frac{\\binom{x}{k} - \\binom{x - 1}{k}}{\\binom{n}{k}} = \\frac{\\binom{x - 1}{k - 1}}{\\binom{n}{k}}. \\end{align*}\\] Now we can calculate the expected value of \\(m\\) using some combinatorial identities. \\[\\begin{align*} E[m] &= \\sum_{i = k}^n i \\frac{{i - 1}\\choose{k - 1}}{{n}\\choose{k}} \\\\ &= \\sum_{i = k}^n i \\frac{\\frac{(i - 1)!}{(k - 1)!(i - k)!}}{{n}\\choose{k}} \\\\ &= \\frac{k}{\\binom{n}{k}}\\sum_{i = k}^n \\binom{i}{k} \\\\ &= \\frac{k}{\\binom{n}{k}} \\binom{n + 1}{k + 1} \\\\ &= \\frac{k(n + 1)}{k + 1}. 
\\end{align*}\\] The bias of this estimator is then \\[\\begin{align*} E[m] - n = \\frac{k(n + 1)}{k + 1} - n = \\frac{k - n}{k + 1}. \\end{align*}\\] The probability that we observed our sample \\(Y = {Y_1, Y_2,...,,Y_k}\\) given \\(n\\) is \\(\\frac{1}{{n}\\choose{k}}\\). We need to find such \\(n^*\\) that this function is maximized. Additionally, we have a constraint that \\(n^* \\geq m = \\max{(Y)}\\). Let us plot this function for \\(m = 10\\) and \\(k = 4\\). library(ggplot2) my_fun <- function (x, m, k) { tmp <- 1 / (choose(x, k)) tmp[x < m] <- 0 return (tmp) } x <- 1:20 y <- my_fun(x, 10, 4) df <- data.frame(x = x, y = y) ggplot(data = df, aes(x = x, y = y)) + geom_line() ::: {.solution} (continued) We observe that the maximum of this function lies at the maximum value of the sample. Therefore \\(n^* = m\\) and ML estimate equals the plug-in estimate. \\[\\begin{align*} E[\\hat{n}] &= \\frac{k + 1}{k} E[m] - 1 \\\\ &= \\frac{k + 1}{k} \\frac{k(n + 1)}{k + 1} - 1 \\\\ &= n. \\end{align*}\\] ::: "],["nhst.html", "Chapter 16 Null hypothesis significance testing", " Chapter 16 Null hypothesis significance testing This chapter deals with null hypothesis significance testing. The students are expected to acquire the following knowledge: Binomial test. t-test. Chi-squared test. .fold-btn { float: right; margin: 5px 5px 0 0; } .fold { border: 1px solid black; min-height: 40px; } Exercise 16.1 (Binomial test) We assume \\(y_i \\in \\{0,1\\}\\), \\(i = 1,...,n\\) and \\(y_i | \\theta = 0.5 \\sim i.i.d.\\) Bernoulli\\((\\theta)\\). The test statistic is \\(X = \\sum_{i=1}^n\\) and the rejection region R is defined as the region where the probability of obtaining such or more extreme \\(X\\) given \\(\\theta = 0.5\\) is less than 0.05. Derive and plot the power function of the test for \\(n=100\\). What is the significance level of this test if \\(H0: \\theta = 0.5\\)? At which values of X will we reject the null hypothesis? # a # First we need the rejection region, so we need to find X_min and X_max n <- 100 qbinom(0.025, n, 0.5) ## [1] 40 qbinom(0.975, n, 0.5) ## [1] 60 pbinom(40, n, 0.5) ## [1] 0.02844397 pbinom(60, n, 0.5) ## [1] 0.9823999 X_min <- 39 X_max <- 60 thetas <- seq(0, 1, by = 0.01) beta_t <- 1 - pbinom(X_max, size = n, prob = thetas) + pbinom(X_min, size = n, prob = thetas) plot(beta_t) # b # The significance level is beta_t[51] ## [1] 0.0352002 # We will reject the null hypothesis at X values below X_min and above X_max. Exercise 16.2 (Long-run guarantees of the t-test) Generate a sample of size \\(n = 10\\) from the standard normal. Use the two-sided t-test with \\(H0: \\mu = 0\\) and record the p-value. Can you reject H0 at 0.05 significance level? (before simulating) If we repeated (b) many times, what would be the relative frequency of false positives/Type I errors (rejecting the null that is true)? What would be the relative frequency of false negatives /Type II errors (retaining the null when the null is false)? (now simulate b and check if the simulation results match your answer in b) Similar to (a-c) but now we generate data from N(-0.5, 1). Similar to (a-c) but now we generate data from N(\\(\\mu\\), 1) where we every time pick a different \\(\\mu < 0\\) and use a one-sided test \\(H0: \\mu <= 0\\). 
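Before simulating, the Type II error rate for the \(N(-0.5, 1)\) scenario can be anticipated analytically with power.t.test from base R; the simulation below should land close to this value.

```r
# Analytic power of a two-sided one-sample t-test, n = 10, true mean -0.5, sd 1.
pw <- power.t.test(n = 10, delta = 0.5, sd = 1, sig.level = 0.05,
                   type = "one.sample", alternative = "two.sided")
1 - pw$power   # Type II error rate, roughly 0.7
```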
set.seed(2) # a x <- rnorm(10) my_test <- t.test(x, alternative = "two.sided", mu = 0) my_test ## ## One Sample t-test ## ## data: x ## t = 0.6779, df = 9, p-value = 0.5149 ## alternative hypothesis: true mean is not equal to 0 ## 95 percent confidence interval: ## -0.4934661 0.9157694 ## sample estimates: ## mean of x ## 0.2111516 # we can not reject the null hypothesis # b # The expected value of false positives would be 0.05. The expected value of # true negatives would be 0, as there are no negatives (the null hypothesis is # always the truth). nit <- 1000 typeIerr <- vector(mode = "logical", length = nit) typeIIerr <- vector(mode = "logical", length = nit) for (i in 1:nit) { x <- rnorm(10) my_test <- t.test(x, alternative = "two.sided", mu = 0) if (my_test$p.value < 0.05) { typeIerr[i] <- T } else { typeIerr[i] <- F } } mean(typeIerr) ## [1] 0.052 sd(typeIerr) / sqrt(nit) ## [1] 0.007024624 # d # We can not estimate the percentage of true negatives, but it will probably be # higher than 0.05. There will be no false positives as the null hypothesis is # always false. typeIIerr <- vector(mode = "logical", length = nit) for (i in 1:nit) { x <- rnorm(10, -0.5) my_test <- t.test(x, alternative = "two.sided", mu = 0) if (my_test$p.value < 0.05) { typeIIerr[i] <- F } else { typeIIerr[i] <- T } } mean(typeIIerr) ## [1] 0.719 sd(typeIIerr) / sqrt(nit) ## [1] 0.01422115 # e # The expected value of false positives would be lower than 0.05. The expected # value of true negatives would be 0, as there are no negatives (the null # hypothesis is always the truth). typeIerr <- vector(mode = "logical", length = nit) for (i in 1:nit) { u <- runif(1, -1, 0) x <- rnorm(10, u) my_test <- t.test(x, alternative = "greater", mu = 0) if (my_test$p.value < 0.05) { typeIerr[i] <- T } else { typeIerr[i] <- F } } mean(typeIerr) ## [1] 0.012 sd(typeIerr) / sqrt(nit) ## [1] 0.003444977 Exercise 16.3 (T-test, confidence intervals, and bootstrap) Sample \\(n=20\\) from a standard normal distribution and calculate the p-value using t-test, confidence intervals based on normal distribution, and bootstrap. Repeat this several times and check how many times we rejected the null hypothesis (made a type I error). Hint: For the confidence intervals you can use function CI from the Rmisc package. set.seed(1) library(Rmisc) nit <- 1000 n_boot <- 100 t_logic <- rep(F, nit) boot_logic <- rep(F, nit) norm_logic <- rep(F, nit) for (i in 1:nit) { x <- rnorm(20) my_test <- t.test(x) my_CI <- CI(x) if (my_test$p.value <= 0.05) t_logic[i] <- T boot_tmp <- vector(mode = "numeric", length = n_boot) for (j in 1:n_boot) { tmp_samp <- sample(x, size = 20, replace = T) boot_tmp[j] <- mean(tmp_samp) } if ((quantile(boot_tmp, 0.025) >= 0) | (quantile(boot_tmp, 0.975) <= 0)) { boot_logic[i] <- T } if ((my_CI[3] >= 0) | (my_CI[1] <= 0)) { norm_logic[i] <- T } } mean(t_logic) ## [1] 0.053 sd(t_logic) / sqrt(nit) ## [1] 0.007088106 mean(boot_logic) ## [1] 0.093 sd(boot_logic) / sqrt(nit) ## [1] 0.009188876 mean(norm_logic) ## [1] 0.053 sd(norm_logic) / sqrt(nit) ## [1] 0.007088106 Exercise 16.4 (Chi-squared test) Show that the \\(\\chi^2 = \\sum_{i=1}^k \\frac{(O_i - E_i)^2}{E_i}\\) test statistic is approximately \\(\\chi^2\\) distributed when we have two categories. Let us look at the US voting data here. Compare the number of voters who voted for Trump or Hillary depending on their income (less or more than 100.000 dollars per year). Manually calculate the chi-squared statistic, compare to the chisq.test in R, and discuss the results. 
Visualize the test. Solution. Let \\(X_i\\) be binary variables, \\(i = 1,...,n\\). We can then express the test statistic as \\[\\begin{align} \\chi^2 = &\\frac{(O_i - np)^2}{np} + \\frac{(n - O_i - n(1 - p))^2}{n(1 - p)} \\\\ &= \\frac{(O_i - np)^2}{np(1 - p)} \\\\ &= (\\frac{O_i - np}{\\sqrt{np(1 - p)}})^2. \\end{align}\\] When \\(n\\) is large, this distrbution is approximately normal with \\(\\mu = np\\) and \\(\\sigma^2 = np(1 - p)\\) (binomial converges in distribution to standard normal). By definition, the chi-squared distribution with \\(k\\) degrees of freedom is a sum of squares of \\(k\\) independent standard normal random variables. n <- 24588 less100 <- round(0.66 * n * c(0.49, 0.45, 0.06)) # some rounding, but it should not affect results more100 <- round(0.34 * n * c(0.47, 0.47, 0.06)) x <- rbind(less100, more100) colnames(x) <- c("Clinton", "Trump", "other/no answer") print(x) ## Clinton Trump other/no answer ## less100 7952 7303 974 ## more100 3929 3929 502 chisq.test(x) ## ## Pearson's Chi-squared test ## ## data: x ## X-squared = 9.3945, df = 2, p-value = 0.00912 x ## Clinton Trump other/no answer ## less100 7952 7303 974 ## more100 3929 3929 502 csum <- apply(x, 2, sum) rsum <- apply(x, 1, sum) chi2 <- (x[1,1] - csum[1] * rsum[1] / sum(x))^2 / (csum[1] * rsum[1] / sum(x)) + (x[1,2] - csum[2] * rsum[1] / sum(x))^2 / (csum[2] * rsum[1] / sum(x)) + (x[1,3] - csum[3] * rsum[1] / sum(x))^2 / (csum[3] * rsum[1] / sum(x)) + (x[2,1] - csum[1] * rsum[2] / sum(x))^2 / (csum[1] * rsum[2] / sum(x)) + (x[2,2] - csum[2] * rsum[2] / sum(x))^2 / (csum[2] * rsum[2] / sum(x)) + (x[2,3] - csum[3] * rsum[2] / sum(x))^2 / (csum[3] * rsum[2] / sum(x)) chi2 ## Clinton ## 9.394536 1 - pchisq(chi2, df = 2) ## Clinton ## 0.009120161 x <- seq(0, 15, by = 0.01) df <- data.frame(x = x) ggplot(data = df, aes(x = x)) + stat_function(fun = dchisq, args = list(df = 2)) + geom_segment(aes(x = chi2, y = 0, xend = chi2, yend = dchisq(chi2, df = 2))) + stat_function(fun = dchisq, args = list(df = 2), xlim = c(chi2, 15), geom = "area", fill = "red") "],["bi.html", "Chapter 17 Bayesian inference 17.1 Conjugate priors 17.2 Posterior sampling", " Chapter 17 Bayesian inference This chapter deals with Bayesian inference. The students are expected to acquire the following knowledge: How to set prior distribution. Compute posterior distribution. Compute posterior predictive distribution. Use sampling for inference. .fold-btn { float: right; margin: 5px 5px 0 0; } .fold { border: 1px solid black; min-height: 40px; } 17.1 Conjugate priors Exercise 17.1 (Poisson-gamma model) Let us assume a Poisson likelihood and a gamma prior on the Poisson mean parameter (this is a conjugate prior). Derive posterior Below we have some data, which represents number of goals in a football match. Choose sensible prior for this data (draw the gamma density if necessary), justify it. Compute the posterior. Compute an interval such that the probability that the true mean is in there is 95%. What is the probability that the true mean is greater than 2.5? Back to theory: Compute prior predictive and posterior predictive. Discuss why the posterior predictive is overdispersed and not Poisson? Draw a histogram of the prior predictive and posterior predictive for the data from (b). Discuss. Generate 10 and 100 random samples from a Poisson distribution and compare the posteriors with a flat prior, and a prior concentrated away from the truth. x <- c(3, 2, 1, 1, 5, 4, 0, 0, 4, 3) Solution. 
\[\begin{align*} p(\lambda | X) &= \frac{p(X | \lambda) p(\lambda)}{\int_0^\infty p(X | \lambda) p(\lambda) d\lambda} \\ &\propto p(X | \lambda) p(\lambda) \\ &= \Big(\prod_{i=1}^n \frac{1}{x_i!} \lambda^{x_i} e^{-\lambda}\Big) \frac{\beta^\alpha}{\Gamma(\alpha)} \lambda^{\alpha - 1} e^{-\beta \lambda} \\ &\propto \lambda^{\sum_{i=1}^n x_i + \alpha - 1} e^{- \lambda (n + \beta)}. \end{align*}\] We recognize this as the kernel of a gamma distribution, therefore \[\begin{align*} \lambda | X \sim \text{gamma}(\alpha + \sum_{i=1}^n x_i, \beta + n). \end{align*}\] For the prior predictive, we have \[\begin{align*} p(x^*) &= \int_0^\infty p(x^*, \lambda) d\lambda \\ &= \int_0^\infty p(x^* | \lambda) p(\lambda) d\lambda \\ &= \int_0^\infty \frac{1}{x^*!} \lambda^{x^*} e^{-\lambda} \frac{\beta^\alpha}{\Gamma(\alpha)} \lambda^{\alpha - 1} e^{-\beta \lambda} d\lambda \\ &= \frac{\beta^\alpha}{\Gamma(x^* + 1)\Gamma(\alpha)} \int_0^\infty \lambda^{x^* + \alpha - 1} e^{-\lambda (1 + \beta)} d\lambda \\ &= \frac{\beta^\alpha}{\Gamma(x^* + 1)\Gamma(\alpha)} \frac{\Gamma(x^* + \alpha)}{(1 + \beta)^{x^* + \alpha}} \int_0^\infty \frac{(1 + \beta)^{x^* + \alpha}}{\Gamma(x^* + \alpha)} \lambda^{x^* + \alpha - 1} e^{-\lambda (1 + \beta)} d\lambda \\ &= \frac{\beta^\alpha}{\Gamma(x^* + 1)\Gamma(\alpha)} \frac{\Gamma(x^* + \alpha)}{(1 + \beta)^{x^* + \alpha}} \\ &= \frac{\Gamma(x^* + \alpha)}{\Gamma(x^* + 1)\Gamma(\alpha)} \Big(\frac{\beta}{1 + \beta}\Big)^\alpha \Big(\frac{1}{1 + \beta}\Big)^{x^*}, \end{align*}\] where in the penultimate step the remaining integral equals one because its integrand is a normalized gamma density. We recognize the result as the negative binomial distribution with \(r = \alpha\) and \(p = \frac{1}{\beta + 1}\). For the posterior predictive the calculation is identical, only with the posterior parameters, so \(r = \alpha + \sum_{i=1}^n x_i\) and \(p = \frac{1}{\beta + n + 1}\). The predictive distribution combines two sources of uncertainty: the posterior uncertainty about the rate \(\lambda\) and the Poisson sampling variability of a new observation. The sampling variability does not depend on \(n\), but the uncertainty about \(\lambda\) shrinks as \(n\) grows. For large \(n\) the posterior predictive is therefore close to a Poisson, while for small \(n\) the remaining parameter uncertainty makes it overdispersed relative to a Poisson with the same mean.
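A quick numerical check of this overdispersion, using the same prior \(\alpha = \beta = 1\) as the worked solution below and the goals data from this exercise (the size/prob parametrisation matches R's rnbinom):

```r
# Posterior-predictive mean and variance vs. a Poisson with the same mean.
x <- c(3, 2, 1, 1, 5, 4, 0, 0, 4, 3)
alpha0 <- 1; beta0 <- 1; n <- length(x)
r <- alpha0 + sum(x)                # negative binomial size
p <- 1 - 1 / (beta0 + n + 1)        # negative binomial prob (rnbinom parametrisation)
c(pred_mean = r * (1 - p) / p,      # equals the posterior mean of lambda
  pred_var  = r * (1 - p) / p^2)    # exceeds the mean, i.e. overdispersed
```

For a Poisson the two numbers would be equal; the gap shrinks as \(n\) grows.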
x <- c(3, 2, 1, 1, 5, 4, 0, 0, 4, 3) # b # quick visual check of the prior ggplot(data = data.frame(x = seq(0, 5, by = 0.01)), aes(x = x)) + stat_function(fun = dgamma, args = list(shape = 1, rate = 1)) palpha <- 1 pbeta <- 1 alpha_post <- palpha + sum(x) beta_post <- pbeta + length(x) ggplot(data = data.frame(x = seq(0, 5, by = 0.01)), aes(x = x)) + stat_function(fun = dgamma, args = list(shape = alpha_post, rate = beta_post)) # probability of being higher than 2.5 1 - pgamma(2.5, alpha_post, beta_post) ## [1] 0.2267148 # interval qgamma(c(0.025, 0.975), alpha_post, beta_post) ## [1] 1.397932 3.137390 # d prior_pred <- rnbinom(1000, size = palpha, prob = 1 - 1 / (pbeta + 1)) post_pred <- rnbinom(1000, size = palpha + sum(x), prob = 1 - 1 / (pbeta + 10 + 1)) df <- data.frame(prior = prior_pred, posterior = post_pred) df <- gather(df) ggplot(df, aes(x = value, fill = key)) + geom_histogram(position = "dodge") # e set.seed(1) x1 <- rpois(10, 2.5) x2 <- rpois(100, 2.5) alpha_flat <- 1 beta_flat <- 0.1 alpha_conc <- 50 beta_conc <- 10 n <- 10000 df_flat <- data.frame(x1 = rgamma(n, alpha_flat + sum(x1), beta_flat + 10), x2 = rgamma(n, alpha_flat + sum(x2), beta_flat + 100), type = "flat") df_flat <- tidyr::gather(df_flat, key = "key", value = "value", - type) df_conc <- data.frame(x1 = rgamma(n, alpha_conc + sum(x1), beta_conc + 10), x2 = rgamma(n, alpha_conc + sum(x2), beta_conc + 100), type = "conc") df_conc <- tidyr::gather(df_conc, key = "key", value = "value", - type) df <- rbind(df_flat, df_conc) ggplot(data = df, aes(x = value, color = type)) + facet_wrap(~ key) + geom_density() 17.2 Posterior sampling Exercise 17.2 (Bayesian logistic regression) In Chapter 15 we implemented a MLE for logistic regression (see the code below). For this model, conjugate priors do not exist, which complicates the calculation of the posterior. However, we can use sampling from the numerator of the posterior, using rejection sampling. Set a sensible prior distribution on \\(\\beta\\) and use rejection sampling to find the posterior distribution. In a) you will get a distribution of parameter \\(\\beta\\). Plot the probabilities (as in exercise 15.3) for each sample of \\(\\beta\\) and compare to the truth. Hint: We can use rejection sampling even for functions which are not PDFs – they do not have to sum/integrate to 1. We just need to use a suitable envelope that we know how to sample from. For example, here we could use a uniform distribution and scale it suitably. set.seed(1) inv_log <- function (z) { return (1 / (1 + exp(-z))) } x <- rnorm(100) y <- x y <- rbinom(100, size = 1, prob = inv_log(1.2 * x)) l_logistic <- function (beta, X, y) { logl <- -sum(y * log(inv_log(as.vector(beta %*% X))) + (1 - y) * log((1 - inv_log(as.vector(beta %*% X))))) return(logl) } my_optim <- optim(par = 0.5, fn = l_logistic, method = "L-BFGS-B", lower = 0, upper = 10, X = x, y = y) my_optim$par # Let's say we believe that the mean of beta is 0.5. Since we are not very sure # about this, we will give it a relatively high variance. So a normal prior with # mean 0.5 and standard deviation 5. But there is no right solution to this, # this is basically us expressing our prior belief in the parameter values. 
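To see how weakly this prior constrains \(\beta\), look at its central 95% interval:

```r
# Central 95% interval of the N(0.5, 5) prior on beta: roughly (-9.3, 10.3).
qnorm(c(0.025, 0.975), mean = 0.5, sd = 5)
```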
set.seed(1) inv_log <- function (z) { return (1 / (1 + exp(-z))) } x <- rnorm(100) y <- x y <- rbinom(100, size = 1, prob = inv_log(1.2 * x)) l_logistic <- function (beta, X, y) { logl <- -sum(y * log(inv_log(as.vector(beta %*% X))) + (1 - y) * log((1 - inv_log(as.vector(beta %*% X))))) if (is.nan(logl)) logl <- Inf return(logl) } my_optim <- optim(par = 0.5, fn = l_logistic, method = "L-BFGS-B", lower = 0, upper = 10, X = x, y = y) my_optim$par ## [1] 1.166558 f_logistic <- function (beta, X, y) { logl <- prod(inv_log(as.vector(beta %*% X))^y * (1 - inv_log(as.vector(beta %*% X)))^(1 - y)) return(logl) } a <- seq(0, 3, by = 0.01) my_l <- c() for (i in a) { my_l <- c(my_l, f_logistic(i, x, y) * dnorm(i, 0.5, 5)) } plot(my_l) envlp <- 10^(-25.8) * dunif(a, -5, 5) # found by trial and error tmp <- data.frame(envel = envlp, l = my_l, t = a) tmp <- gather(tmp, key = "key", value = "value", - t) ggplot(tmp, aes(x = t, y = value, color = key)) + geom_line() # envelope OK set.seed(1) nsamps <- 1000 samps <- c() for (i in 1:nsamps) { tmp <- runif(1, -5, 5) u <- runif(1, 0, 1) if (u < (f_logistic(tmp, x, y) * dnorm(tmp, 0.5, 5)) / (10^(-25.8) * dunif(tmp, -5, 5))) { samps <- c(samps, tmp) } } plot(density(samps)) mean(samps) ## [1] 1.211578 median(samps) ## [1] 1.204279 truth_p <- data.frame(x = x, prob = inv_log(1.2 * x), type = "truth") preds <- inv_log(x %*% t(samps)) preds <- gather(cbind(as.data.frame(preds), x = x), key = "key", "value" = value, - x) ggplot(preds, aes(x = x, y = value)) + geom_line(aes(group = key), color = "gray", alpha = 0.7) + geom_point(data = truth_p, aes(y = prob), color = "red", alpha = 0.7) + theme_bw() "],["distributions-intutition.html", "Chapter 18 Distributions intutition 18.1 Discrete distributions 18.2 Continuous distributions", " Chapter 18 Distributions intutition This chapter is intended to help you familiarize yourself with the different probability distributions you will encounter in this course. You will need to use Appendix B extensively as a reference for the basic properties of distributions, so keep it close! .fold-btn { float: right; margin: 5px 5px 0 0; } .fold { border: 1px solid black; min-height: 40px; } 18.1 Discrete distributions Exercise 18.1 (Bernoulli intuition 1) Arguably the simplest distribution you will encounter is the Bernoulli distribution. It is a discrete probability distribution used to represent the outcome of a yes/no question. It has one parameter \\(p\\) which is the probability of success. The probability of failure is \\((1-p)\\), sometimes denoted as \\(q\\). A classic way to think about a Bernoulli trial (a yes/no experiment) is a coin flip. Real coins are fair, meaning the probability of either heads (1) or tails (0) are the same, so \\(p=0.5\\) as shown below in figure a. Alternatively we may want to represent a process that doesn’t have equal probabilities of outcomes like “Will a throw of a fair die result in a 6?”. In this case \\(p=\\frac{1}{6}\\), shown in figure b. Using your knowledge of the Bernoulli distribution use the throw of a fair die to think of events, such that: \\(p = 0.5\\) \\(p = \\frac{5}{6}\\) \\(q = \\frac{2}{3}\\) Solution. An event that is equally likely to happen or not happen i.e. \\(p = 0.5\\) would be throwing an even number. More formally we can name this event \\(A\\) and write: \\(A = \\{2,4,6\\}\\), its probability being \\(P(A) = 0.5\\) An example of an event with \\(p = \\frac{5}{6}\\) would be throwing a number greater than 1. Defined as \\(B = \\{2,3,4,5,6\\}\\). 
We need an event that fails \\(\\frac{2}{3}\\) of the time. Alternatively we can reverse the problem and find an event that succeeds \\(\\frac{1}{3}\\) of the time, since: \\(q = 1 - p \\implies p = 1 - q = \\frac{1}{3}\\). The event that our outcome is divisible by 3: \\(C = \\{3, 6\\}\\) satisfies this condition. Exercise 18.2 (Binomial intuition 1) The binomial distribution is a generalization of the Bernoulli distribution. Instead of considering a single Bernoulli trial, we now consider a sequence of \\(n\\) trials, which are independent and have the same parameter \\(p\\). So the binomial distribution has two parameters \\(n\\) - the number of trials and \\(p\\) - the probability of success for each trial. If we return to our coin flip representation, we now flip a coin several times. The binomial distribution will give us the probabilities of all possible outcomes. Below we show the distribution for a series of 10 coin flips with a fair coin (left) and a biased coin (right). The numbers on the x axis represent the number of times the coin landed heads. Using your knowledge of the binomial distribution: Take the pmf of the binomial distribution and plug in \\(n=1\\), check that it is in fact equivalent to a Bernoulli distribution. In our examples we show the graph of a binomial distribution over 10 trials with \\(p=0.8\\). If we take a look at the graph, it appears as though the probabilities of getting 0,1,2 or 3 heads in 10 flips are zero. Is it actually zero? Check by plugging in the values into the pmf. Solution. The pmf of a binomial distribution is \\(\\binom{n}{k} p^k (1 - p)^{n - k}\\), now we insert \\(n=1\\) to get: \\[\\binom{1}{k} p^k (1 - p)^{1 - k}\\] Not quite equivalent to a Bernoulli, however note that the support of the binomial distribution is defined as \\(k \\in \\{0,1,\\dots,n\\}\\), so in our case \\(k = \\{0,1\\}\\), then: \\[\\binom{1}{0} = \\binom{1}{1} = 1\\] we get: \\(p^k (1 - p)^{1 - k}\\) ,the Bernoulli distribution. As we already know \\(p=0.8, n=10\\), so: \\[\\binom{10}{0} 0.8^0 (1 - 0.8)^{10 - 0} = 1.024 \\cdot 10^{-7}\\] \\[\\binom{10}{1} 0.8^1 (1 - 0.8)^{10 - 1} = 4.096 \\cdot 10^{-6}\\] \\[\\binom{10}{2} 0.8^2 (1 - 0.8)^{10 - 2} = 7.3728 \\cdot 10^{-5}\\] \\[\\binom{10}{3} 0.8^3 (1 - 0.8)^{10 - 3} = 7.86432\\cdot 10^{-4}\\] So the probabilities are not zero, just very small. Exercise 18.3 (Poisson intuition 1) Below are shown 3 different graphs of the Poisson distribution. Your task is to replicate them on your own in R by varying the \\(\\lambda\\) parameter. Hint: You can use dpois() to get the probabilities. library(ggplot2) library(gridExtra) x = 0:15 # Create Poisson data data1 <- data.frame(x = x, y = dpois(x, lambda = 0.1)) data2 <- data.frame(x = x, y = dpois(x, lambda = 1)) data3 <- data.frame(x = x, y = dpois(x, lambda = 7.5)) # Create individual ggplot objects plot1 <- ggplot(data1, aes(x, y)) + geom_col() + xlab("x") + ylab("Probability") + ylim(0,1) plot2 <- ggplot(data2, aes(x, y)) + geom_col() + xlab("x") + ylab(NULL) + ylim(0,1) plot3 <- ggplot(data3, aes(x, y)) + geom_col() + xlab("x") + ylab(NULL) + ylim(0,1) # Combine the plots grid.arrange(plot1, plot2, plot3, ncol = 3) Exercise 18.4 (Poisson intuition 2) The Poisson distribution is a discrete probability distribution that models the probability of a given number of events occuring within processes where events occur at a constant mean rate and independently of each other - a Poisson process. It has a single parameter \\(\\lambda\\), which represents the constant mean rate. 
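A useful aside (not part of the exercise): the Poisson also arises as the limit of a binomial with many trials and a small per-trial probability, with \(\lambda = np\), which is one reason it appears so often for counts of rare events. A quick numerical comparison:

```r
# Binomial with large n and small p is close to Poisson with lambda = n * p.
rbind(binomial = dbinom(0:5, size = 1000, prob = 2.5 / 1000),
      poisson  = dpois(0:5, lambda = 2.5))
```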
A classic example of a scenario that can be modeled using the Poisson distribution is the number of calls received at a call center in a day (or in fact any other time interval). Suppose you work in a call center and have some understanding of probability distributions. You overhear your supervisor mentioning that the call center receives an average of 2.5 calls per day. Using your knowledge of the Poisson distribution, calculate: The probability you will get no calls today. The probability you will get more than 5 calls today. Solution. First recall the Poisson pmf: \\[p(k) = \\frac{\\lambda^k e^{-\\lambda}}{k!}\\] as stated previously our parameter \\(\\lambda = 2.5\\) To get the probability of no calls we simply plug in \\(k = 0\\), so: \\[p(0) = \\frac{2.5^0 e^{-2.5}}{0!} = e^{-2.5} \\approx 0.082\\] The support of the Poisson distribution is non-negative integers. So if we wanted to calculate the probability of getting more than 5 calls we would need to add up the probabilities of getting 6 calls and 7 calls and so on up to infinity. Let us instead remember that the sum of all probabilties will be 1, we will reverse the problem and instead ask “What is the probability we get 5 calls or less?”. We can subtract the probability of the opposite outcome (the complement) from 1 to get the probability of our original question. \\[P(k > 5) = 1 - P(k \\leq 5)\\] \\[P(k \\leq 5) = \\sum_{i=0}^{5} p(i) = p(0) + p(1) + p(2) + p(3) + p(4) + p(5) =\\] \\[= \\frac{2.5^0 e^{-2.5}}{0!} + \\frac{2.5^1 e^{-2.5}}{1!} + \\dots =\\] \\[=0.957979\\] So the probability of geting more than 5 calls will be \\(1 - 0.957979 = 0.042021\\) Exercise 18.5 (Geometric intuition 1) The geometric distribution is a discrete distribution that models the number of failures before the first success in a sequence of independent Bernoulli trials. It has a single parameter \\(p\\), representing the probability of success and its support is all non-negative integers \\(\\{0,1,2,\\dots\\}\\). NOTE: There are two forms of this distribution, the one we just described and another that models the number of trials before the first success. The difference is subtle yet significant and you are likely to encounter both forms. The key to telling them apart is to check their support, since the number of trials has to be at least \\(1\\), for this case we have \\(\\{1,2,\\dots\\}\\). In the graph below we show the pmf of a geometric distribution with \\(p=0.5\\). This can be thought of as the number of successive failures (tails) in the flip of a fair coin. You can see that there’s a 50% chance you will have zero failures i.e. you will flip a heads on your very first attempt. But there is some smaller chance that you will flip a sequence of tails in a row, with longer sequences having ever lower probability. Create an equivalent graph that represents the probability of rolling a 6 with a fair 6-sided die. Use the formula for the mean of the geometric distribution and determine the average number of failures before you roll a 6. Look up the alternative form of the geometric distribtuion and again use the formula for the mean to determine the average number of trials before you roll a 6. Solution. Parameter p (the probability of success) for rolling a 6 is \\(p=\\frac{1}{6}\\). 
library(ggplot2) # Parameters p <- 1/6 x_vals <- 0:9 # Starting from 0 probs <- dgeom(x_vals, p) # Data data <- data.frame(x_vals, probs) # Plot ggplot(data, aes(x=x_vals, y=probs)) + geom_segment(aes(xend=x_vals, yend=0), color="black", size=1) + geom_point(color="red", size=2) + labs(x = "Number of trials", y = "Probability") + theme_minimal() + scale_x_continuous(breaks = x_vals) # This line ensures integer x-axis labels ::: {.solution} b) The expected value of a random variable (the mean) is denoted as \\(E[X]\\). \\[E[X] = \\frac{1-p}{p}= \\frac{1- \\frac{1}{6}}{\\frac{1}{6}} = \\frac{5}{6}\\cdot 6 = 5\\] On average we will fail 5 times before we roll our first 6. The alternative form of this distribution (with support on all positive integers) has a slightly different formula for the mean. This change reflects the difference in the way we posed our question: \\[E[X] = \\frac{1}{p} = \\frac{1}{\\frac{1}{6}} = 6\\] On average we will have to throw the die 6 times before we roll a 6. ::: 18.2 Continuous distributions Exercise 18.6 (Uniform intuition 1) The need for a randomness is a common problem. A practical solution are so-called random number generators (RNGs). The simplest RNG one would think of is choosing a set of numbers and having the generator return a number at random, where the probability of returning any number from this set is the same. If this set is an interval of real numbers, then we’ve basically described the continuous uniform distribution. It has two parameters \\(a\\) and \\(b\\), which define the beginning and end of its support respectively. Let’s think of the mean intuitively. The expected value or mean of a distribution is the pivot point on our x-axis, which “balances” the graph. Given parameters \\(a\\) and \\(b\\) what is your intuitive guess of the mean for this distribution? A special case of the uniform distribution is the standard uniform distribution with \\(a=0\\) and \\(b=1\\). Write the pdf \\(f(x)\\) of this particular distribution. Solution. It’s the midpoint between \\(a\\) and \\(b\\), so \\(\\frac{a+b}{2}\\) Inserting the parameter values we get:\\[f(x) = \\begin{cases} 1 & \\text{if } 0 \\leq x \\leq 1 \\\\ 0 & \\text{otherwise} \\end{cases} \\] Notice how the pdf is just a constant \\(1\\) across all values of \\(x \\in [0,1]\\). Here it is important to distinguish between probability and probability density. The density may be 1, but the probability is not and while discrete distributions never exceed 1 on the y-axis, continuous distributions can go as high as you like. Exercise 18.7 (Normal intuition 1) The normal distribution, also known as the Gaussian distribution, is a continuous distribution that encompasses the entire real number line. It has two parameters: the mean, denoted by \\(\\mu\\), and the variance, represented by \\(\\sigma^2\\). Its shape resembles the iconic bell curve. The position of its peak is determined by the parameter \\(\\mu\\), while the variance determines the spread or width of the curve. A smaller variance results in a sharper, narrower peak, while a larger variance leads to a broader, more spread-out curve. Below, we graph the distribution of IQ scores for two different populations. We aim to identify individuals with an IQ at or above 140 for an experiment. We can identify them reliably; however, we only have time to examine one of the two groups. Which group should we investigate to have the best chance of finding such individuals? 
NOTE: The graph below displays the parameter \\(\\sigma\\), which is the square root of the variance, more commonly referred to as the standard deviation. Keep this in mind when solving the problems. Insert the values of either population into the pdf of a normal distribution and determine which one has a higher density at \\(x=140\\). Generate the graph yourself and zoom into the relevant area to graphically verify your answer. To determine probability density, we can use the pdf. However, if we wish to know the proportion of the population that falls within certain bounds, we need to integrate the pdf. Fortunately, the integrals of common distributions are well established. This integral gives us the cumulative distribution function \\(F(x)\\) (CDF). BONUS: Look up the CDF of the normal distribution and input the appropriate values to determine the percentage of each population made up of individuals with an IQ of 140 or higher. Solution. Group 1: \\(\\mu = 100, \\sigma=10 \\rightarrow \\sigma^2 = 100\\) \\[\\frac{1}{\\sqrt{2 \\pi \\sigma^2}} e^{-\\frac{(x - \\mu)^2}{2 \\sigma^2}} = \\frac{1}{\\sqrt{2 \\pi 100}} e^{-\\frac{(140 - 100)^2}{2 \\cdot 100}} \\approx 1.34e-05\\] Group 2: \\(\\mu = 105, \\sigma=8 \\rightarrow \\sigma^2 = 64\\) \\[\\frac{1}{\\sqrt{2 \\pi \\sigma^2}} e^{-\\frac{(x - \\mu)^2}{2 \\sigma^2}} = \\frac{1}{\\sqrt{2 \\pi 64}} e^{-\\frac{(140 - 105)^2}{2 \\cdot 64}} \\approx 3.48e-06\\] So despite the fact that group 1 has a lower average IQ, we are more likely to find individuals with an IQ of 140 in this group. library(ggplot2) library(tidyr) # Create data x <- seq(135, 145, by = 0.01) # Zoom into the relevant area around x = 140 df <- data.frame(x = x) # Define the IQ distributions df$IQ_mu100_sd10 <- dnorm(df$x, mean = 100, sd = 10) df$IQ_mu105_sd8 <- dnorm(df$x, mean = 105, sd = 8) # Convert from wide to long format for ggplot2 df_long <- gather(df, distribution, density, -x) # Ensure the levels of the 'distribution' factor match our desired order df_long$distribution <- factor(df_long$distribution, levels = c("IQ_mu100_sd10", "IQ_mu105_sd8")) # Plot ggplot(df_long, aes(x = x, y = density, color = distribution)) + geom_line() + labs(x = "IQ Score", y = "Density") + scale_color_manual( name = "IQ Distribution", values = c(IQ_mu100_sd10 = "red", IQ_mu105_sd8 = "blue"), labels = c("Group 1 (µ=100, σ=10)", "Group 2 (µ=105, σ=8)") ) + theme_minimal() ::: {.solution} c. The CDF of the normal distribution is \\(\\Phi(x) = \\frac{1}{2} \\left[ 1 + \\text{erf} \\left( \\frac{x - \\mu}{\\sigma \\sqrt{2}} \\right) \\right]\\). The CDF is defined as the integral of the density up to \\(x\\). So to get the total percentage of individuals with an IQ at 140 or higher, we need to subtract this value from 1. Group 1: \\[1 - \\Phi(140) = 1 - \\frac{1}{2} \\left[ 1 + \\text{erf} \\left( \\frac{140 - 100}{10 \\sqrt{2}} \\right) \\right] \\approx 3.17e-05 \\] Group 2: \\[1 - \\Phi(140) = 1 - \\frac{1}{2} \\left[ 1 + \\text{erf} \\left( \\frac{140 - 105}{8 \\sqrt{2}} \\right) \\right] \\approx 6.07e-06 \\] So roughly 0.003% and 0.0006% of individuals in groups 1 and 2 respectively have an IQ at or above 140. ::: Exercise 18.8 (Beta intuition 1) The beta distribution is a continuous distribution defined on the unit interval \\([0,1]\\). It has two strictly positive parameters \\(\\alpha\\) and \\(\\beta\\), which determine its shape. Its support makes it especially suitable to model distributions of percentages and proportions.
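The hand calculations from the normal-distribution exercise above can be verified with R's built-in normal density and CDF functions:

```r
# Densities at x = 140 (part a) and upper-tail probabilities (bonus part)
dnorm(140, mean = 100, sd = 10)                      # group 1 density, approx 1.34e-05
dnorm(140, mean = 105, sd = 8)                       # group 2 density, approx 3.48e-06
pnorm(140, mean = 100, sd = 10, lower.tail = FALSE)  # group 1 tail, approx 3.17e-05
pnorm(140, mean = 105, sd = 8, lower.tail = FALSE)   # group 2 tail, approx 6.07e-06
```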
Below you’ve been provided with some code that you can copy into RStudio. Once you run the code, an interactive Shiny app will appear and you will be able to manipulate the graph of the beta distribution. Play around with the parameters to get: A straight line from (0,0) to (1,2) A straight line from (0,2) to (1,0) A symmetric bell curve A bowl-shaped curve The standard uniform distribution is actually a special case of the beta distribution. Find the exact parameters \\(\\alpha\\) and \\(\\beta\\). Once you do, prove the equality by inserting the values into our pdf. Hint: The beta function is evaluated as \\(\\text{B}(a,b) = \\frac{\\Gamma(a)\\Gamma(b)}{\\Gamma(a+b)}\\); the gamma function for positive integers \\(n\\) is evaluated as \\(\\Gamma(n)= (n-1)!\\) # Install and load necessary packages install.packages(c("shiny", "ggplot2")) library(shiny) library(ggplot2) # The Shiny App ui <- fluidPage( titlePanel("Beta Distribution Viewer"), sidebarLayout( sidebarPanel( sliderInput("alpha", "Alpha:", min = 0.1, max = 10, value = 2, step = 0.1), sliderInput("beta", "Beta:", min = 0.1, max = 10, value = 2, step = 0.1) ), mainPanel( plotOutput("betaPlot") ) ) ) server <- function(input, output) { output$betaPlot <- renderPlot({ x <- seq(0, 1, by = 0.01) y <- dbeta(x, shape1 = input$alpha, shape2 = input$beta) ggplot(data.frame(x = x, y = y), aes(x = x, y = y)) + geom_line() + labs(x = "Value", y = "Density") + theme_minimal() }) } shinyApp(ui = ui, server = server) Solution. \\(\\alpha = 2, \\beta=1\\) \\(\\alpha = 1, \\beta=2\\) Possible solution \\(\\alpha = \\beta= 5\\) Possible solution \\(\\alpha = \\beta= 0.5\\) The correct parameters are \\(\\alpha = 1, \\beta=1\\). To prove the equality, we insert them into the beta pdf: \\[\\frac{x^{\\alpha - 1} (1 - x)^{\\beta - 1}}{\\text{B}(\\alpha, \\beta)} = \\frac{x^{1 - 1} (1 - x)^{1 - 1}}{\\text{B}(1, 1)} = \\frac{1}{\\frac{\\Gamma(1)\\Gamma(1)}{\\Gamma(1+1)}}= \\frac{1}{\\frac{(1-1)!(1-1)!}{(2-1)!}} = 1\\] Exercise 18.9 (Exponential intuition 1) The exponential distribution represents the distribution of time between events in a Poisson process. It is the continuous analogue of the geometric distribution. It has a single parameter \\(\\lambda\\), which is strictly positive and represents the constant rate of the corresponding Poisson process. The support is all positive reals, since the time between events is non-negative but not bounded above. Let’s revisit the call center from our Poisson problem. We get 2.5 calls per day on average; this is our rate parameter \\(\\lambda\\). A work day is 8 hours. What is the mean time between phone calls? The cdf \\(F(x)\\) tells us what percentage of calls occur within \\(x\\) amount of time of each other. You want to take an hour-long lunch break but are worried about missing calls. Calculate the percentage of calls you are likely to miss if you’re gone for an hour. Hint: The cdf is \\(F(x) = \\int_{-\\infty}^{x} f(t)\\, dt\\) Solution.
Taking \\(\\lambda = \\frac{2.5 \\text{ calls}}{8 \\text{ hours}} = \\frac{1 \\text{ call}}{3.2 \\text{ hours}}\\) we get \\[E[X] = \\frac{1}{\\lambda} = \\frac{3.2 \\text{ hours}}{\\text{call}}\\] First we derive the CDF. We can integrate from 0 instead of \\(-\\infty\\), since the distribution has no support on the negatives: \\[\\begin{align} F(x) &= \\int_{0}^{x} \\lambda e^{-\\lambda t} dt \\\\ &= \\lambda \\int_{0}^{x} e^{-\\lambda t} dt \\\\ &= \\lambda (\\frac{1}{-\\lambda}e^{-\\lambda t} |_{0}^{x}) \\\\ &= \\lambda(\\frac{1}{\\lambda} - \\frac{1}{\\lambda} e^{-\\lambda x}) \\\\ &= 1 - e^{-\\lambda x}. \\end{align}\\] Then we just evaluate it for a time of 1 hour: \\[F(1 \\text{ hour}) = 1 - e^{-\\frac{1 \\text{ call}}{3.2 \\text{ hours}} \\cdot 1 \\text{ hour}}= 1 - e^{-\\frac{1 \\text{ call}}{3.2 \\text{ hours}}} \\approx 0.268\\] So we have about a 27% chance of missing a call if we’re gone for an hour. Exercise 18.10 (Gamma intuition 1) The gamma distribution is a continuous distribution characterized by two parameters, \\(\\alpha\\) and \\(\\beta\\), both greater than 0. These parameters afford the distribution a broad range of shapes, leading to it being commonly referred to as a family of distributions. Given its support over the positive real numbers, it is well suited for modeling a diverse range of positive-valued phenomena. The exponential distribution is actually just a particular form of the gamma distribution. What are the values of \\(\\alpha\\) and \\(\\beta\\)? Copy the code from our beta distribution Shiny app and modify it to plot the gamma distribution. Then get it to show the exponential. Solution. Let’s start by taking a look at the pdfs of the two distributions side by side: \\[\\frac{\\beta^\\alpha}{\\Gamma(\\alpha)} x^{\\alpha - 1}e^{-\\beta x} = \\lambda e^{-\\lambda x}\\] The \\(x^{\\alpha - 1}\\) term is not found anywhere in the pdf of the exponential, so we need to eliminate it by setting \\(\\alpha = 1\\). This also makes the fraction evaluate to \\(\\frac{\\beta^1}{\\Gamma(1)} = \\beta\\), which leaves us with \\[\\beta \\cdot e^{-\\beta x}\\] Now we can see that \\(\\beta = \\lambda\\) and \\(\\alpha = 1\\). # Install and load necessary packages install.packages(c("shiny", "ggplot2")) library(shiny) library(ggplot2) # The Shiny App ui <- fluidPage( titlePanel("Gamma Distribution Viewer"), sidebarLayout( sidebarPanel( sliderInput("shape", "Shape (α):", min = 0.1, max = 10, value = 2, step = 0.1), sliderInput("rate", "Rate (β):", min = 0.1, max = 10, value = 2, step = 0.1) ), mainPanel( plotOutput("gammaPlot") ) ) ) server <- function(input, output) { output$gammaPlot <- renderPlot({ x <- seq(0, 25, by = 0.1) y <- dgamma(x, shape = input$shape, rate = input$rate) ggplot(data.frame(x = x, y = y), aes(x = x, y = y)) + geom_line() + labs(x = "Value", y = "Density") + theme_minimal() }) } shinyApp(ui = ui, server = server) "],["A1.html", "A R programming language A.1 Basic characteristics A.2 Why R? A.3 Setting up A.4 R basics A.5 Functions A.6 Other tips A.7 Further reading and references", " A R programming language A.1 Basic characteristics R is free software for statistical computing and graphics. It is widely used by statisticians, scientists, and other professionals for software development and data analysis. It is an interpreted language, so programs do not need compilation. A.2 Why R? R is one of the two main languages used for statistics and machine learning (the other being Python). Pros Libraries.
Comprehensive collection of statistical and machine learning packages. Easy to code. Open source. Anyone can access R and develop new methods. Additionally, it is relatively simple to get the source code of established methods. Large community. The use of R has been rising for some time, in industry and academia. Therefore a large collection of blogs and tutorials exists, along with people offering help on pages like StackExchange and CrossValidated. Integration with other languages and LaTeX. New methods. Many researchers develop R packages based on their research, so new methods are often available soon after development. Cons Slow. Programs run slower than in other programming languages; however, this can be somewhat amended by effective coding or integration with other languages. Memory intensive. This can become a problem with large data sets, as they need to be stored in memory, along with all the information the models produce. Some packages are not as good as they should be, or have poor documentation. Object oriented programming in R can be very confusing and complex. A.3 Setting up https://www.r-project.org/. A.3.1 RStudio RStudio is the most widely used IDE for R. It is free; you can download it from https://rstudio.com/. While console R is sufficient for the requirements of this course, we recommend that students install RStudio for its better user interface. A.3.2 Libraries for data science Listed below are some of the more useful libraries (packages) for data science. Students are also encouraged to find other useful packages. dplyr Efficient data manipulation. Part of the wider package collection called tidyverse. ggplot2 Plotting based on the grammar of graphics. stats Several statistical models. rstan Bayesian inference using Hamiltonian Monte Carlo. Very flexible model building. MCMCpack Bayesian inference. rmarkdown, knitr, and bookdown Dynamic reports (such as this one). devtools Package development.
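If you want the packages from this list in one go, one possible way is a single install call (stats ships with base R and is omitted; rstan may additionally require a C++ toolchain):

```r
# Install the data science packages listed above in one call
install.packages(c("dplyr", "ggplot2", "rstan", "MCMCpack",
                   "rmarkdown", "knitr", "bookdown", "devtools"))
```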
A.4 R basics A.4.1 Variables and types Important information and tips: no type declaration define variables with <- instead of = (although both work, there is a slight difference, additionally most of the packages use the arrow) for strings use \"\" for comments use # change types with as.type() functions no special type for single character like C++ for example n <- 20 x <- 2.7 m <- n # m gets value 20 my_flag <- TRUE student_name <- "Luka" typeof(n) ## [1] "double" typeof(student_name) ## [1] "character" typeof(my_flag) ## [1] "logical" typeof(as.integer(n)) ## [1] "integer" typeof(as.character(n)) ## [1] "character" A.4.2 Basic operations n + x ## [1] 22.7 n - x ## [1] 17.3 diff <- n - x # variable diff gets the difference between n and x diff ## [1] 17.3 n * x ## [1] 54 n / x ## [1] 7.407407 x^2 ## [1] 7.29 sqrt(x) ## [1] 1.643168 n > 2 * n ## [1] FALSE n == n ## [1] TRUE n == 2 * n ## [1] FALSE n != n ## [1] FALSE paste(student_name, "is", n, "years old") ## [1] "Luka is 20 years old" A.4.3 Vectors use c() to combine elements into vectors can only contain one type of variable if different types are provided, all are transformed to the most basic type in the vector access elements by indexes or logical vectors of the same length a scalar value is regarded as a vector of length 1 1:4 # creates a vector of integers from 1 to 4 ## [1] 1 2 3 4 student_ages <- c(20, 23, 21) student_names <- c("Luke", "Jen", "Mike") passed <- c(TRUE, TRUE, FALSE) length(student_ages) ## [1] 3 # access by index student_ages[2] ## [1] 23 student_ages[1:2] ## [1] 20 23 student_ages[2] <- 24 # change values # access by logical vectors student_ages[passed == TRUE] # same as student_ages[passed] ## [1] 20 24 student_ages[student_names %in% c("Luke", "Mike")] ## [1] 20 21 student_names[student_ages > 20] ## [1] "Jen" "Mike" A.4.3.1 Operations with vectors most operations are element-wise if we operate on vectors of different lengths, the shorter vector periodically repeats its elements until it reaches the length of the longer one a <- c(1, 3, 5) b <- c(2, 2, 1) d <- c(6, 7) a + b ## [1] 3 5 6 a * b ## [1] 2 6 5 a + d ## Warning in a + d: longer object length is not a multiple of shorter object ## length ## [1] 7 10 11 a + 2 * b ## [1] 5 7 7 a > b ## [1] FALSE TRUE TRUE b == a ## [1] FALSE FALSE FALSE a %*% b # vector multiplication, not element-wise ## [,1] ## [1,] 13 A.4.4 Factors vectors of finite predetermined classes suitable for categorical variables ordinal (ordered) or nominal (unordered) car_brand <- factor(c("Audi", "BMW", "Mercedes", "BMW"), ordered = FALSE) car_brand ## [1] Audi BMW Mercedes BMW ## Levels: Audi BMW Mercedes freq <- factor(x = NA, levels = c("never","rarely","sometimes","often","always"), ordered = TRUE) freq[1:3] <- c("rarely", "sometimes", "rarely") freq ## [1] rarely sometimes rarely ## Levels: never < rarely < sometimes < often < always freq[4] <- "quite_often" # non-existing level, returns NA ## Warning in `[<-.factor`(`*tmp*`, 4, value = "quite_often"): invalid factor ## level, NA generated freq ## [1] rarely sometimes rarely <NA> ## Levels: never < rarely < sometimes < often < always A.4.5 Matrices two-dimensional generalizations of vectors my_matrix <- matrix(c(1, 2, 1, 5, 4, 2), nrow = 2, byrow = TRUE) my_matrix ## [,1] [,2] [,3] ## [1,] 1 2 1 ## [2,] 5 4 2 my_square_matrix <- matrix(c(1, 3, 2, 3), nrow = 2) my_square_matrix ## [,1] [,2] ## [1,] 1 2 ## [2,] 3 3 my_matrix[1,2] # first row, second column ## [1] 2 my_matrix[2, ] # second row ## [1] 5 4 2 my_matrix[ ,3] # third 
column ## [1] 1 2 A.4.5.1 Matrix functions and operations most operation element-wise mind the dimensions when using matrix multiplication %*% nrow(my_matrix) # number of matrix rows ## [1] 2 ncol(my_matrix) # number of matrix columns ## [1] 3 dim(my_matrix) # matrix dimension ## [1] 2 3 t(my_matrix) # transpose ## [,1] [,2] ## [1,] 1 5 ## [2,] 2 4 ## [3,] 1 2 diag(my_matrix) # the diagonal of the matrix as vector ## [1] 1 4 diag(1, nrow = 3) # creates a diagonal matrix ## [,1] [,2] [,3] ## [1,] 1 0 0 ## [2,] 0 1 0 ## [3,] 0 0 1 det(my_square_matrix) # matrix determinant ## [1] -3 my_matrix + 2 * my_matrix ## [,1] [,2] [,3] ## [1,] 3 6 3 ## [2,] 15 12 6 my_matrix * my_matrix # element-wise multiplication ## [,1] [,2] [,3] ## [1,] 1 4 1 ## [2,] 25 16 4 my_matrix %*% t(my_matrix) # matrix multiplication ## [,1] [,2] ## [1,] 6 15 ## [2,] 15 45 my_vec <- as.vector(my_matrix) # transform to vector my_vec ## [1] 1 5 2 4 1 2 A.4.6 Arrays multi-dimensional generalizations of matrices my_array <- array(c(1, 2, 3, 4, 5, 6, 7, 8), dim = c(2, 2, 2)) my_array[1, 1, 1] ## [1] 1 my_array[2, 2, 1] ## [1] 4 my_array[1, , ] ## [,1] [,2] ## [1,] 1 5 ## [2,] 3 7 dim(my_array) ## [1] 2 2 2 A.4.7 Data frames basic data structure for analysis differ from matrices as columns can be of different types student_data <- data.frame("Name" = student_names, "Age" = student_ages, "Pass" = passed) student_data ## Name Age Pass ## 1 Luke 20 TRUE ## 2 Jen 24 TRUE ## 3 Mike 21 FALSE colnames(student_data) <- c("name", "age", "pass") # change column names student_data[1, ] ## name age pass ## 1 Luke 20 TRUE student_data[ ,colnames(student_data) %in% c("name", "pass")] ## name pass ## 1 Luke TRUE ## 2 Jen TRUE ## 3 Mike FALSE student_data$pass # access column by name ## [1] TRUE TRUE FALSE student_data[student_data$pass == TRUE, ] ## name age pass ## 1 Luke 20 TRUE ## 2 Jen 24 TRUE A.4.8 Lists useful for storing different data structures access elements with double square brackets elements can be named first_list <- list(student_ages, my_matrix, student_data) second_list <- list(student_ages, my_matrix, student_data, first_list) first_list[[1]] ## [1] 20 24 21 second_list[[4]] ## [[1]] ## [1] 20 24 21 ## ## [[2]] ## [,1] [,2] [,3] ## [1,] 1 2 1 ## [2,] 5 4 2 ## ## [[3]] ## name age pass ## 1 Luke 20 TRUE ## 2 Jen 24 TRUE ## 3 Mike 21 FALSE second_list[[4]][[1]] # first element of the fourth element of second_list ## [1] 20 24 21 length(second_list) ## [1] 4 second_list[[length(second_list) + 1]] <- "add_me" # append an element names(first_list) <- c("Age", "Matrix", "Data") first_list$Age ## [1] 20 24 21 A.4.9 Loops mostly for loop for loop can iterate over an arbitrary vector # iterate over consecutive natural numbers my_sum <- 0 for (i in 1:10) { my_sum <- my_sum + i } my_sum ## [1] 55 # iterate over an arbirary vector my_sum <- 0 some_numbers <- c(2, 3.5, 6, 100) for (i in some_numbers) { my_sum <- my_sum + i } my_sum ## [1] 111.5 A.5 Functions for help use ?function_name A.5.1 Writing functions We can write our own functions with function(). In the brackets, we define the parameters the function gets, and in curly brackets we define what the function does. We use return() to return values. sum_first_n_elements <- function (n) { my_sum <- 0 for (i in 1:n) { my_sum <- my_sum + i } return (my_sum) } sum_first_n_elements(10) ## [1] 55 A.6 Other tips Use set.seed(arbitrary_number) at the beginning of a script to set the seed and ensure replication. 
To dynamically set the working directory in R Studio to the parent folder of a R script use setwd(dirname(rstudioapi::getSourceEditorContext()$path)). To avoid slow R loops use the apply family of functions. See ?apply and ?lapply. To make your data manipulation (and therefore your life) a whole lot easier, use the dplyr package. Use getAnywhere(function_name) to get the source code of any function. Use browser for debugging. See ?browser. A.7 Further reading and references Getting started with R Studio: https://www.youtube.com/watch?v=lVKMsaWju8w Official R manuals: https://cran.r-project.org/manuals.html Cheatsheets: https://www.rstudio.com/resources/cheatsheets/ Workshop on R, dplyr, ggplot2, and R Markdown: https://github.com/bstatcomp/Rworkshop "],["distributions.html", "B Probability distributions", " B Probability distributions Name parameters support pdf/pmf mean variance Bernoulli \\(p \\in [0,1]\\) \\(k \\in \\{0,1\\}\\) \\(p^k (1 - p)^{1 - k}\\) 1.12 \\(p\\) 7.1 \\(p(1-p)\\) 7.1 binomial \\(n \\in \\mathbb{N}\\), \\(p \\in [0,1]\\) \\(k \\in \\{0,1,\\dots,n\\}\\) \\(\\binom{n}{k} p^k (1 - p)^{n - k}\\) 4.4 \\(np\\) 7.2 \\(np(1-p)\\) 7.2 Poisson \\(\\lambda > 0\\) \\(k \\in \\mathbb{N}_0\\) \\(\\frac{\\lambda^k e^{-\\lambda}}{k!}\\) 4.6 \\(\\lambda\\) 7.3 \\(\\lambda\\) 7.3 geometric \\(p \\in (0,1]\\) \\(k \\in \\mathbb{N}_0\\) \\(p(1-p)^k\\) 4.5 \\(\\frac{1 - p}{p}\\) 7.4 \\(\\frac{1 - p}{p^2}\\) 9.3 normal \\(\\mu \\in \\mathbb{R}\\), \\(\\sigma^2 > 0\\) \\(x \\in \\mathbb{R}\\) \\(\\frac{1}{\\sqrt{2 \\pi \\sigma^2}} e^{-\\frac{(x - \\mu)^2}{2 \\sigma^2}}\\) 4.12 \\(\\mu\\) 7.8 \\(\\sigma^2\\) 7.8 uniform \\(a,b \\in \\mathbb{R}\\), \\(a < b\\) \\(x \\in [a,b]\\) \\(\\frac{1}{b-a}\\) 4.9 \\(\\frac{a+b}{2}\\) \\(\\frac{(b-a)^2}{12}\\) beta \\(\\alpha,\\beta > 0\\) \\(x \\in [0,1]\\) \\(\\frac{x^{\\alpha - 1} (1 - x)^{\\beta - 1}}{\\text{B}(\\alpha, \\beta)}\\) 4.10 \\(\\frac{\\alpha}{\\alpha + \\beta}\\) 7.6 \\(\\frac{\\alpha \\beta}{(\\alpha + \\beta)^2(\\alpha + \\beta + 1)}\\) 7.6 gamma \\(\\alpha,\\beta > 0\\) \\(x \\in (0, \\infty)\\) \\(\\frac{\\beta^\\alpha}{\\Gamma(\\alpha)} x^{\\alpha - 1}e^{-\\beta x}\\) 4.11 \\(\\frac{\\alpha}{\\beta}\\) 7.5 \\(\\frac{\\alpha}{\\beta^2}\\) 7.5 exponential \\(\\lambda > 0\\) \\(x \\in [0, \\infty)\\) \\(\\lambda e^{-\\lambda x}\\) 4.8 \\(\\frac{1}{\\lambda}\\) 7.7 \\(\\frac{1}{\\lambda^2}\\) 7.7 logistic \\(\\mu \\in \\mathbb{R}\\), \\(s > 0\\) \\(x \\in \\mathbb{R}\\) \\(\\frac{e^{-\\frac{x - \\mu}{s}}}{s(1 + e^{-\\frac{x - \\mu}{s}})^2}\\) 4.13 \\(\\mu\\) \\(\\frac{s^2 \\pi^2}{3}\\) negative binomial \\(r \\in \\mathbb{N}\\), \\(p \\in [0,1]\\) \\(k \\in \\mathbb{N}_0\\) \\(\\binom{k + r - 1}{k}(1-p)^r p^k\\) 4.7 \\(\\frac{rp}{1 - p}\\) 9.2 \\(\\frac{rp}{(1 - p)^2}\\) 9.2 multinomial \\(n \\in \\mathbb{N}\\), \\(k \\in \\mathbb{N}\\) \\(p_i \\in [0,1]\\), \\(\\sum p_i = 1\\) \\(x_i \\in \\{0,..., n\\}\\), \\(i \\in \\{1,...,k\\}\\), \\(\\sum{x_i} = n\\) \\(\\frac{n!}{x_1!x_2!...x_k!} p_1^{x_1} p_2^{x_2}...p_k^{x_k}\\) 8.1 \\(np_i\\) \\(np_i(1-p_i)\\) "],["references.html", "References", " References "],["404.html", "Page not found", " Page not found The page you requested cannot be found (perhaps it was moved or renamed). You may want to try searching to find the page's new location, or use the table of contents to find the page you are looking for. 
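As a practical companion to the distribution table in Appendix B above, base R exposes a d/p/q/r function family for each of these distributions (density or pmf, CDF, quantile function, and random sampling). A few illustrative calls, using the same parameterizations as the table:

```r
dbinom(3, size = 10, prob = 0.5)  # binomial pmf at k = 3
ppois(5, lambda = 2.5)            # Poisson CDF at k = 5
qnorm(0.975, mean = 0, sd = 1)    # standard normal quantile, approx 1.96
rgamma(5, shape = 2, rate = 1)    # five draws from a gamma(alpha = 2, beta = 1)
```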
"]] +[["index.html", "Principles of Uncertainty – exercises Preface", " Principles of Uncertainty – exercises Gregor Pirš, Erik Štrumbelj, David Nabergoj and Leon Hvastja 2023-10-02 Preface These are the exercises for the Principles of Uncertainty course of the Data Science Master’s at University of Ljubljana, Faculty of Computer and Information Science. This document will be extended each week as the course progresses. At the end of each exercise session, we will post the solutions to the exercises worked in class and select exercises for homework. Students are also encouraged to solve the remaining exercises to further extend their knowledge. Some exercises require the use of R. Those exercises (or parts of) are coloured blue. Students that are not familiar with R programming language should study A to learn the basics. As the course progresses, we will cover more relevant uses of R for data science. "],["introduction.html", "Chapter 1 Probability spaces 1.1 Measure and probability spaces 1.2 Properties of probability measures 1.3 Discrete probability spaces", " Chapter 1 Probability spaces This chapter deals with measures and probability spaces. At the end of the chapter, we look more closely at discrete probability spaces. The students are expected to acquire the following knowledge: Theoretical Use properties of probability to calculate probabilities. Combinatorics. Understanding of continuity of probability. R Vectors and vector operations. For loop. Estimating probability with simulation. sample function. Matrices and matrix operations. .fold-btn { float: right; margin: 5px 5px 0 0; } .fold { border: 1px solid black; min-height: 40px; } 1.1 Measure and probability spaces Exercise 1.1 (Completing a set to a sigma algebra) Let \\(\\Omega = \\{1,2,...,10\\}\\) and let \\(A = \\{\\emptyset, \\{1\\}, \\{2\\}, \\Omega \\}\\). Show that \\(A\\) is not a sigma algebra of \\(\\Omega\\). Find the minimum number of elements to complete A to a sigma algebra of \\(\\Omega\\). Solution. \\(1^c = \\{2,3,...,10\\} \\notin A \\implies\\) \\(A\\) is not sigma algebra. First we need the complements of all elements, so we need to add sets \\(\\{2,3,...,10\\}\\) and \\(\\{1,3,4,...,10\\}\\). Next we need unions of all sets – we add the set \\(\\{1,2\\}\\). Again we need the complement of this set, so we add \\(\\{3,4,...,10\\}\\). So the minimum number of elements we need to add is 4. Exercise 1.2 (Diversity of sigma algebras) Let \\(\\Omega\\) be a set. Find the smallest sigma algebra of \\(\\Omega\\). Find the largest sigma algebra of \\(\\Omega\\). Solution. \\(A = \\{\\emptyset, \\Omega\\}\\) \\(2^{\\Omega}\\) Exercise 1.3 Find all sigma algebras for \\(\\Omega = \\{0, 1, 2\\}\\). Solution. \\(A = \\{\\emptyset, \\Omega\\}\\) \\(A = 2^{\\Omega}\\) \\(A = \\{\\emptyset, \\{0\\}, \\{1,2\\}, \\Omega\\}\\) \\(A = \\{\\emptyset, \\{1\\}, \\{0,2\\}, \\Omega\\}\\) \\(A = \\{\\emptyset, \\{2\\}, \\{0,1\\}, \\Omega\\}\\) Exercise 1.4 (Difference between algebra and sigma algebra) Let \\(\\Omega = \\mathbb{N}\\) and \\(\\mathcal{A} = \\{A \\subseteq \\mathbb{N}: A \\text{ is finite or } A^c \\text{ is finite.} \\}\\). Show that \\(\\mathcal{A}\\) is an algebra but not a sigma algebra. Solution. \\(\\emptyset\\) is finite so \\(\\emptyset \\in \\mathcal{A}\\). Let \\(A \\in \\mathcal{A}\\) and \\(B \\in \\mathcal{A}\\). If both are finite, then their union is also finite and therefore in \\(\\mathcal{A}\\). Let at least one of them not be finite. Then their union is not finite. 
But \\((A \\cup B)^c = A^c \\cap B^c\\). And since at least one is infinite, then its complement is finite and the intersection is too. So finite unions are in \\(\\mathcal{A}\\). Let us look at numbers \\(2n\\). For any \\(n\\), \\(2n \\in \\mathcal{A}\\) as it is finite. But \\(\\bigcup_{k = 1}^{\\infty} 2n \\notin \\mathcal{A}\\). Exercise 1.5 We define \\(\\sigma(X) = \\cap_{\\lambda \\in I} S_\\lambda\\) to be a sigma algebra, generated by the set \\(X\\), where \\(S_\\lambda\\) are all sigma algebras such that \\(X \\subseteq S_\\lambda\\). \\(S_\\lambda\\) are indexed by \\(\\lambda \\in I\\). Let \\(A, B \\subseteq 2^{\\Omega}\\). Prove that \\(\\sigma(A) = \\sigma(B) \\iff A \\subseteq \\sigma(B) \\land B \\subseteq \\sigma(A)\\). Solution. To prove the equivalence, we need to prove that the left hand side implies the right hand side and vice versa. Proving \\(\\sigma(A) = \\sigma(B) \\Rightarrow A \\subseteq \\sigma(B) \\land B \\subseteq \\sigma(A)\\): we know \\(A \\subseteq \\sigma(A)\\) is always true, so by substituting in \\(\\sigma(B)\\) from the left hand side equality we obtain \\(A \\subseteq \\sigma(B)\\). We obtain \\(B \\subseteq \\sigma(A)\\) by symmetry. This proves the implication. Proving \\(A \\subseteq \\sigma(B) \\land B \\subseteq \\sigma(A) \\Rightarrow \\sigma(A) = \\sigma(B)\\): by definition of a sigma algebra, generated by a set, we have \\(\\sigma(B) = \\cap_{\\lambda \\in I} S_\\lambda\\) where \\(S_\\lambda\\) are all sigma algebras where \\(B \\subseteq S_\\lambda\\). But \\(\\sigma(A)\\) is one of \\(S_\\lambda\\), so we can write \\(\\sigma(B) = \\sigma(A) \\cap \\left(\\cap_{\\lambda \\in I} S_\\lambda \\right)\\), which implies \\(\\sigma(B) \\subseteq \\sigma(A)\\). By symmetry, we have \\(\\sigma(A) \\subseteq \\sigma(B)\\). Since \\(\\sigma(A) \\subseteq \\sigma(B)\\) and \\(\\sigma(B) \\subseteq \\sigma(A)\\), we obtain \\(\\sigma(A) = \\sigma(B)\\), which proves the implication and completes the equivalence proof. Exercise 1.6 (Intro to measure) Take the measurable space \\(\\Omega = \\{1,2\\}\\), \\(F = 2^{\\Omega}\\). Which of the following is a measure? Which is a probability measure? \\(\\mu(\\emptyset) = 0\\), \\(\\mu(\\{1\\}) = 5\\), \\(\\mu(\\{2\\}) = 6\\), \\(\\mu(\\{1,2\\}) = 11\\) \\(\\mu(\\emptyset) = 0\\), \\(\\mu(\\{1\\}) = 0\\), \\(\\mu(\\{2\\}) = 0\\), \\(\\mu(\\{1,2\\}) = 1\\) \\(\\mu(\\emptyset) = 0\\), \\(\\mu(\\{1\\}) = 0\\), \\(\\mu(\\{2\\}) = 0\\), \\(\\mu(\\{1,2\\}) = 0\\) \\(\\mu(\\emptyset) = 0\\), \\(\\mu(\\{1\\}) = 0\\), \\(\\mu(\\{2\\}) = 1\\), \\(\\mu(\\{1,2\\}) = 1\\) \\(\\mu(\\emptyset)=0\\), \\(\\mu(\\{1\\})=0\\), \\(\\mu(\\{2\\})=\\infty\\), \\(\\mu(\\{1,2\\})=\\infty\\) Solution. Measure. Not probability measure since \\(\\mu(\\Omega) > 1\\). Neither due to countable additivity. Measure. Not probability measure since \\(\\mu(\\Omega) = 0\\). Probability measure. Measure. Not probability measure since \\(\\mu(\\Omega) > 1\\). Exercise 1.7 Define a probability space that could be used to model the outcome of throwing two fair 6-sided dice. Solution. \\(\\Omega = \\{\\{i,j\\}, i = 1,...,6, j = 1,...,6\\}\\) \\(F = 2^{\\Omega}\\) \\(\\forall \\omega \\in \\Omega\\), \\(P(\\omega) = \\frac{1}{6} \\times \\frac{1}{6} = \\frac{1}{36}\\) 1.2 Properties of probability measures Exercise 1.8 A standard deck (52 cards) is distributed to two persons: 26 cards to each person. All partitions are equally likely. Find the probability that: The first person gets 4 Queens. The first person gets at least 2 Queens. 
R: Use simulation (sample) to check the above answers. Solution. \\(\\frac{\\binom{48}{22}}{\\binom{52}{26}}\\) 1 - \\(\\frac{\\binom{48}{26} + 4 \\times \\binom{48}{25}}{\\binom{52}{26}}\\) For the simulation, let us represent cards with numbers from 1 to 52, and let 1 through 4 represent Queens. set.seed(1) cards <- 1:52 n <- 10000 q4 <- vector(mode = "logical", length = n) q2 <- vector(mode = "logical", length = n) tmp <- vector(mode = "logical", length = n) for (i in 1:n) { p1 <- sample(1:52, 26) q4[i] <- sum(1:4 %in% p1) == 4 q2[i] <- sum(1:4 %in% p1) >= 2 } sum(q4) / n ## [1] 0.0572 sum(q2) / n ## [1] 0.6894 Exercise 1.9 Let \\(A\\) and \\(B\\) be events with probabilities \\(P(A) = \\frac{2}{3}\\) and \\(P(B) = \\frac{1}{2}\\). Show that \\(\\frac{1}{6} \\leq P(A\\cap B) \\leq \\frac{1}{2}\\), and give examples to show that both extremes are possible. Find corresponding bounds for \\(P(A\\cup B)\\). R: Draw samples from the examples and show the probability bounds of \\(P(A \\cap B)\\) . Solution. From the properties of probability we have \\[\\begin{equation} P(A \\cup B) = P(A) + P(B) - P(A \\cap B) \\leq 1. \\end{equation}\\] From this follows \\[\\begin{align} P(A \\cap B) &\\geq P(A) + P(B) - 1 \\\\ &= \\frac{2}{3} + \\frac{1}{2} - 1 \\\\ &= \\frac{1}{6}, \\end{align}\\] which is the lower bound for the intersection. Conversely, we have \\[\\begin{equation} P(A \\cup B) = P(A) + P(B) - P(A \\cap B) \\geq P(A). \\end{equation}\\] From this follows \\[\\begin{align} P(A \\cap B) &\\leq P(B) \\\\ &= \\frac{1}{2}, \\end{align}\\] which is the upper bound for the intersection. For an example take a fair die. To achieve the lower bound let \\(A = \\{3,4,5,6\\}\\) and \\(B = \\{1,2,3\\}\\), then their intersection is \\(A \\cap B = \\{3\\}\\). To achieve the upper bound take \\(A = \\{1,2,3,4\\}\\) and $B = {1,2,3} $. For the bounds of the union we will use the results from the first part. Again from the properties of probability we have \\[\\begin{align} P(A \\cup B) &= P(A) + P(B) - P(A \\cap B) \\\\ &\\geq P(A) + P(B) - \\frac{1}{2} \\\\ &= \\frac{2}{3}. \\end{align}\\] Conversely \\[\\begin{align} P(A \\cup B) &= P(A) + P(B) - P(A \\cap B) \\\\ &\\leq P(A) + P(B) - \\frac{1}{6} \\\\ &= 1. \\end{align}\\] Therefore \\(\\frac{2}{3} \\leq P(A \\cup B) \\leq 1\\). We use sample in R: set.seed(1) n <- 10000 samps <- sample(1:6, n, replace = TRUE) # lower bound lb <- vector(mode = "logical", length = n) A <- c(1,2,3) B <- c(3,4,5,6) for (i in 1:n) { lb[i] <- samps[i] %in% A & samps[i] %in% B } sum(lb) / n ## [1] 0.1605 # upper bound ub <- vector(mode = "logical", length = n) A <- c(1,2,3) B <- c(1,2,3,4) for (i in 1:n) { ub[i] <- samps[i] %in% A & samps[i] %in% B } sum(ub) / n ## [1] 0.4913 Exercise 1.10 A fair coin is tossed repeatedly. Show that, with probability one, a head turns up sooner or later. Show similarly that any given finite sequence of heads and tails occurs eventually with probability one. Solution. \\[\\begin{align} P(\\text{no heads}) &= \\lim_{n \\rightarrow \\infty} P(\\text{no heads in first }n \\text{ tosses}) \\\\ &= \\lim_{n \\rightarrow \\infty} \\frac{1}{2^n} \\\\ &= 0. \\end{align}\\] For the second part, let us fix the given sequence of heads and tails of length \\(k\\) as \\(s\\). A probability that this happens in \\(k\\) tosses is \\(\\frac{1}{2^k}\\). 
\\[\\begin{align} P(s \\text{ occurs}) &= \\lim_{n \\rightarrow \\infty} P(s \\text{ occurs in first } nk \\text{ tosses}) \\end{align}\\] The right part of the upper equation is greater than if \\(s\\) occurs either in the first \\(k\\) tosses, second \\(k\\) tosses,…, \\(n\\)-th \\(k\\) tosses. Therefore \\[\\begin{align} P(s \\text{ occurs}) &\\geq \\lim_{n \\rightarrow \\infty} P(s \\text{ occurs in first } n \\text{ disjoint sequences of length } k) \\\\ &= \\lim_{n \\rightarrow \\infty} (1 - P(s \\text{ does not occur in first } n \\text{ disjoint sequences})) \\\\ &= 1 - \\lim_{n \\rightarrow \\infty} P(s \\text{ does not occur in first } n \\text{ disjoint sequences}) \\\\ &= 1 - \\lim_{n \\rightarrow \\infty} (1 - \\frac{1}{2^k})^n \\\\ &= 1. \\end{align}\\] Exercise 1.11 An Erdos-Renyi random graph \\(G(n,p)\\) is a model with \\(n\\) nodes, where each pair of nodes is connected with probability \\(p\\). Calculate the probability that there exists a node that is not connected to any other node in \\(G(4,0.6)\\). Show that the upper bound for the probability that there exist 2 nodes that are not connected to any other node for an arbitrary \\(G(n,p)\\) is \\(\\binom{n}{2} (1-p)^{2n - 3}\\). R: Estimate the probability from the first point using simulation. Solution. Let \\(A_i\\) be the event that the \\(i\\)-th node is not connected to any other node. Then our goal is to calculate \\(P(\\cup_{i=1}^n A_i)\\). Using the inclusion-exclusion principle, we get \\[\\begin{align} P(\\cup_{i=1}^n A_i) &= \\sum_i A_i - \\sum_{i<j} P(A_i \\cap A_j) + \\sum_{i<j<k} P(A_i \\cap A_j \\cap A_k) - P(A_1 \\cap A_2 \\cap A_3 \\cap A_4) \\\\ &=4 (1 - p)^3 - \\binom{4}{2} (1 - p)^5 + \\binom{4}{3} (1 - p)^6 - (1 - p)^6 \\\\ &\\approx 0.21. \\end{align}\\] Let \\(A_{ij}\\) be the event that nodes \\(i\\) and \\(j\\) are not connected to any other node. We are interested in \\(P(\\cup_{i<j}A_{ij})\\). By using Boole`s inequality, we get \\[\\begin{align} P(\\cup_{i<j}A_{ij}) \\leq \\sum_{i<j} P(A_{ij}). \\end{align}\\] What is the probability of \\(A_{ij}\\)? There need to be no connections to the \\(i\\)-th node to the remaining nodes (excluding \\(j\\)), the same for the \\(j\\)-th node, and there can be no connection between them. Therefore \\[\\begin{align} P(\\cup_{i<j}A_{ij}) &\\leq \\sum_{i<j} (1 - p)^{2(n-2) + 1} \\\\ &= \\binom{n}{2} (1 - p)^{2n - 3}. \\end{align}\\] set.seed(1) n_samp <- 100000 n <- 4 p <- 0.6 conn_samp <- vector(mode = "logical", length = n_samp) for (i in 1:n_samp) { tmp_mat <- matrix(data = 0, nrow = n, ncol = n) samp_conn <- sample(c(0,1), choose(4,2), replace = TRUE, prob = c(1 - p, p)) tmp_mat[lower.tri(tmp_mat)] <- samp_conn tmp_mat[upper.tri(tmp_mat)] <- t(tmp_mat)[upper.tri(t(tmp_mat))] not_conn <- apply(tmp_mat, 1, sum) if (any(not_conn == 0)) { conn_samp[i] <- TRUE } else { conn_samp[i] <- FALSE } } sum(conn_samp) / n_samp ## [1] 0.20565 1.3 Discrete probability spaces Exercise 1.12 Show that the standard measurable space on \\(\\Omega = \\{0,1,...,n\\}\\) equipped with binomial measure is a discrete probability space. Define another probability measure on this measurable space. Show that for \\(n=1\\) the binomial measure is the same as the Bernoulli measure. R: Draw 1000 samples from the binomial distribution \\(p=0.5\\), \\(n=20\\) (rbinom) and compare relative frequencies with theoretical probability measure. Solution. We need to show that the terms of \\(\\sum_{k=0}^n \\binom{n}{k} p^k (1 - p)^{n - k}\\) sum to 1. 
For that we use the binomial theorem \\(\\sum_{k=0}^n \\binom{n}{k} x^k y^{n-k} = (x + y)^n\\). So \\[\\begin{equation} \\sum_{k=0}^n \\binom{n}{k} p^k (1 - p)^{n - k} = (p + 1 - p)^n = 1. \\end{equation}\\] \\(P(\\{k\\}) = \\frac{1}{n + 1}\\). When \\(n=1\\) then \\(k \\in \\{0,1\\}\\). Inserting \\(n=1\\) into the binomial measure, we get \\(\\binom{1}{k}p^k (1-p)^{1 - k}\\). Now \\(\\binom{1}{1} = \\binom{1}{0} = 1\\), so the measure is \\(p^k (1-p)^{1 - k}\\), which is the Bernoulli measure. set.seed(1) library(ggplot2) library(dplyr) bin_samp <- rbinom(n = 1000, size = 20, prob = 0.5) bin_samp <- data.frame(x = bin_samp) %>% count(x) %>% mutate(n = n / 1000, type = "empirical_frequencies") %>% bind_rows(data.frame(x = 0:20, n = dbinom(0:20, size = 20, prob = 0.5), type = "theoretical_measure")) bin_plot <- ggplot(data = bin_samp, aes(x = x, y = n, fill = type)) + geom_bar(stat="identity", position = "dodge") plot(bin_plot) Exercise 1.13 Show that the standard measurable space on \\(\\Omega = \\{0,1,...,\\infty\\}\\) equipped with geometric measure is a discrete probability space, equipped with Poisson measure is a discrete probability space. Define another probability measure on this measurable space. R: Draw 1000 samples from the Poisson distribution \\(\\lambda = 10\\) (rpois) and compare relative frequencies with theoretical probability measure. Solution. \\(\\sum_{k = 0}^{\\infty} p(1 - p)^k = p \\sum_{k = 0}^{\\infty} (1 - p)^k = p \\frac{1}{1 - 1 + p} = 1\\). We used the formula for geometric series. \\(\\sum_{k = 0}^{\\infty} \\frac{\\lambda^k e^{-\\lambda}}{k!} = e^{-\\lambda} \\sum_{k = 0}^{\\infty} \\frac{\\lambda^k}{k!} = e^{-\\lambda} e^{\\lambda} = 1.\\) We used the Taylor expansion of the exponential function. Since we only have to define a probability measure, we could only assign probabilities that sum to one to a finite number of events in \\(\\Omega\\), and probability zero to the other infinite number of events. However to make this solution more educational, we will try to find a measure that assigns a non-zero probability to all events in \\(\\Omega\\). A good start for this would be to find a converging infinite series, as the probabilities will have to sum to one. One simple converging series is the geometric series \\(\\sum_{k=0}^{\\infty} p^k\\) for \\(|p| < 1\\). Let us choose an arbitrary \\(p = 0.5\\). Then \\(\\sum_{k=0}^{\\infty} p^k = \\frac{1}{1 - 0.5} = 2\\). To complete the measure, we have to normalize it, so it sums to one, therefore \\(P(\\{k\\}) = \\frac{0.5^k}{2}\\) is a probability measure on \\(\\Omega\\). We could make it even more difficult by making this measure dependent on some parameter \\(\\alpha\\), but this is out of the scope of this introductory chapter. set.seed(1) pois_samp <- rpois(n = 1000, lambda = 10) pois_samp <- data.frame(x = pois_samp) %>% count(x) %>% mutate(n = n / 1000, type = "empirical_frequencies") %>% bind_rows(data.frame(x = 0:25, n = dpois(0:25, lambda = 10), type = "theoretical_measure")) pois_plot <- ggplot(data = pois_samp, aes(x = x, y = n, fill = type)) + geom_bar(stat="identity", position = "dodge") plot(pois_plot) Exercise 1.14 Define a probability measure on \\((\\Omega = \\mathbb{Z}, 2^{\\mathbb{Z}})\\). Define a probability measure such that \\(P(\\omega) > 0, \\forall \\omega \\in \\Omega\\). R: Implement a random generator that will generate samples with the relative frequency that corresponds to your probability measure. Compare relative frequencies with theoretical probability measure . Solution. 
\\(P(0) = 1, P(\\omega) = 0, \\forall \\omega \\neq 0\\). \\(P(\\{k\\}) = \\sum_{k = -\\infty}^{\\infty} \\frac{p(1 - p)^{|k|}}{2^{1 - 1_0(k)}}\\), where \\(1_0(k)\\) is the indicator function, which equals to one if \\(k\\) is 0, and equals to zero in every other case. n <- 1000 geom_samps <- rgeom(n, prob = 0.5) sign_samps <- sample(c(FALSE, TRUE), size = n, replace = TRUE) geom_samps[sign_samps] <- -geom_samps[sign_samps] my_pmf <- function (k, p) { indic <- rep(1, length(k)) indic[k == 0] <- 0 return ((p * (1 - p)^(abs(k))) / 2^indic) } geom_samps <- data.frame(x = geom_samps) %>% count(x) %>% mutate(n = n / 1000, type = "empirical_frequencies") %>% bind_rows(data.frame(x = -10:10, n = my_pmf(-10:10, 0.5), type = "theoretical_measure")) geom_plot <- ggplot(data = geom_samps, aes(x = x, y = n, fill = type)) + geom_bar(stat="identity", position = "dodge") plot(geom_plot) Exercise 1.15 Define a probability measure on \\(\\Omega = \\{1,2,3,4,5,6\\}\\) with parameter \\(m \\in \\{1,2,3,4,5,6\\}\\), so that the probability of outcome at distance \\(1\\) from \\(m\\) is half of the probability at distance \\(0\\), at distance \\(2\\) is half of the probability at distance \\(1\\), etc. R: Implement a random generator that will generate samples with the relative frequency that corresponds to your probability measure. Compare relative frequencies with theoretical probability measure . Solution. \\(P(\\{k\\}) = \\frac{\\frac{1}{2}^{|m - k|}}{\\sum_{i=1}^6 \\frac{1}{2}^{|m - i|}}\\) n <- 10000 m <- 4 my_pmf <- function (k, m) { denom <- sum(0.5^abs(m - 1:6)) return (0.5^abs(m - k) / denom) } samps <- c() for (i in 1:n) { a <- sample(1:6, 1) a_val <- my_pmf(a, m) prob <- runif(1) if (prob < a_val) { samps <- c(samps, a) } } samps <- data.frame(x = samps) %>% count(x) %>% mutate(n = n / length(samps), type = "empirical_frequencies") %>% bind_rows(data.frame(x = 1:6, n = my_pmf(1:6, m), type = "theoretical_measure")) my_plot <- ggplot(data = samps, aes(x = x, y = n, fill = type)) + geom_bar(stat="identity", position = "dodge") plot(my_plot) "],["uprobspaces.html", "Chapter 2 Uncountable probability spaces 2.1 Borel sets 2.2 Lebesgue measure", " Chapter 2 Uncountable probability spaces This chapter deals with uncountable probability spaces. The students are expected to acquire the following knowledge: Theoretical Understand Borel sets and identify them. Estimate Lebesgue measure for different sets. Know when sets are Borel-measurable. Understanding of countable and uncountable sets. R Uniform sampling. .fold-btn { float: right; margin: 5px 5px 0 0; } .fold { border: 1px solid black; min-height: 40px; } 2.1 Borel sets Exercise 2.1 Prove that the intersection of two sigma algebras on \\(\\Omega\\) is a sigma algebra. Prove that the collection of all open subsets \\((a,b)\\) on \\((0,1]\\) is not a sigma algebra of \\((0,1]\\). Solution. Empty set: \\[\\begin{equation} \\emptyset \\in \\mathcal{A} \\wedge \\emptyset \\in \\mathcal{B} \\Rightarrow \\emptyset \\in \\mathcal{A} \\cap \\mathcal{B} \\end{equation}\\] Complement: \\[\\begin{equation} \\text{Let } A \\in \\mathcal{A} \\cap \\mathcal{B} \\Rightarrow A \\in \\mathcal{A} \\wedge A \\in \\mathcal{B} \\Rightarrow A^c \\in \\mathcal{A} \\wedge A^c \\in \\mathcal{B} \\Rightarrow A^c \\in \\mathcal{A} \\cap \\mathcal{B} \\end{equation}\\] Countable additivity: Let \\(\\{A_i\\}\\) be a countable sequence of subsets in \\(\\mathcal{A} \\cap \\mathcal{B}\\). 
\\[\\begin{equation} \\forall i: A_i \\in \\mathcal{A} \\cap \\mathcal{B} \\Rightarrow A_i \\in \\mathcal{A} \\wedge A_i \\in \\mathcal{B} \\Rightarrow \\cup A_i \\in \\mathcal{A} \\wedge \\cup A_i \\in \\mathcal{B} \\Rightarrow \\cup A_i \\in \\mathcal{A} \\cap \\mathcal{B} \\end{equation}\\] Let \\(A\\) denote the collection of all open subsets \\((a,b)\\) on \\((0,1]\\). Then \\((0,1) \\in A\\). But \\((0,1)^c = 1 \\notin A\\). Exercise 2.2 Show that \\(\\mathcal{C} = \\sigma(\\mathcal{C})\\) if and only if \\(\\mathcal{C}\\) is a sigma algebra. Solution. “\\(\\Rightarrow\\)” This follows from the definition of a generated sigma algebra. “\\(\\Leftarrow\\)” Let \\(\\mathcal{F} = \\cap_i F_i\\) be the intersection of all sigma algebras that contain \\(\\mathcal{C}\\). Then \\(\\sigma(\\mathcal{C}) = \\mathcal{F}\\). Additionally, \\(\\forall i: \\mathcal{C} \\in F_i\\). So each \\(F_i\\) can be written as \\(F_i = \\mathcal{C} \\cup D\\), where \\(D\\) are the rest of the elements in the sigma algebra. In other words, each sigma algebra in the collection contains at least \\(\\mathcal{C}\\), but can contain other elements. Now for some \\(j\\), \\(F_j = \\mathcal{C}\\) as \\(\\{F_i\\}\\) contains all sigma algebras that contain \\(\\mathcal{C}\\) and \\(\\mathcal{C}\\) is such a sigma algebra. Since this is the smallest subset in the intersection it follows that \\(\\sigma(\\mathcal{C}) = \\mathcal{F} = \\mathcal{C}\\). Exercise 2.3 Let \\(\\mathcal{C}\\) and \\(\\mathcal{D}\\) be two collections of subsets on \\(\\Omega\\) such that \\(\\mathcal{C} \\subset \\mathcal{D}\\). Prove that \\(\\sigma(\\mathcal{C}) \\subseteq \\sigma(\\mathcal{D})\\). Solution. \\(\\sigma(\\mathcal{D})\\) is a sigma algebra that contains \\(\\mathcal{D}\\). It follows that \\(\\sigma(\\mathcal{D})\\) is a sigma algebra that contains \\(\\mathcal{C}\\). Let us write \\(\\sigma(\\mathcal{C}) = \\cap_i F_i\\), where \\(\\{F_i\\}\\) is the collection of all sigma algebras that contain \\(\\mathcal{C}\\). Since \\(\\sigma(\\mathcal{D})\\) is such a sigma algebra, there exists an index \\(j\\), so that \\(F_j = \\sigma(\\mathcal{D})\\). Then we can write \\[\\begin{align} \\sigma(\\mathcal{C}) &= (\\cap_{i \\neq j} F_i) \\cap \\sigma(\\mathcal{D}) \\\\ &\\subseteq \\sigma(\\mathcal{D}). \\end{align}\\] Exercise 2.4 Prove that the following subsets of \\((0,1]\\) are Borel-measurable by finding their measure. Any countable set. The set of numbers in (0,1] whose decimal expansion does not contain 7. Solution. This follows directly from the fact that every countable set is a union of singletons, whose measure is 0. Let us first look at numbers which have a 7 as the first decimal numbers. Their measure is 0.1. Then we take all the numbers with a 7 as the second decimal number (excluding those who already have it as the first). These have the measure 0.01, and there are 9 of them, so their total measure is 0.09. We can continue to do so infinitely many times. At each \\(n\\), we have the measure of the intervals which is \\(10^n\\) and the number of those intervals is \\(9^{n-1}\\). Now \\[\\begin{align} \\lambda(A) &= 1 - \\sum_{n = 0}^{\\infty} \\frac{9^n}{10^{n+1}} \\\\ &= 1 - \\frac{1}{10} \\sum_{n = 0}^{\\infty} (\\frac{9}{10})^n \\\\ &= 1 - \\frac{1}{10} \\frac{10}{1} \\\\ &= 0. \\end{align}\\] Since we have shown that the measure of the set is \\(0\\), we have also shown that the set is measurable. 
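The series appearing in the last step of the solution above can also be checked numerically; a minimal sketch, truncating the sum at n = 100:

```r
# Check that 1 - sum_{n>=0} 9^n / 10^(n+1) is (numerically) 0; truncating at
# n = 100 leaves only the tail 0.9^101, which is about 2.4e-05.
n <- 0:100
1 - sum(9^n / 10^(n + 1))
```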
Exercise 2.5 Let \\(\\Omega = [0,1]\\), and let \\(\\mathcal{F}_3\\) consist of all countable subsets of \\(\\Omega\\), and all subsets of \\(\\Omega\\) having a countable complement. Show that \\(\\mathcal{F}_3\\) is a sigma algebra. Let us define \\(P(A)=0\\) if \\(A\\) is countable, and \\(P(A) = 1\\) if \\(A\\) has a countable complement. Is \\((\\Omega, \\mathcal{F}_3, P)\\) a legitimate probability space? Solution. The empty set is countable, therefore it is in \\(\\mathcal{F}_3\\). For any \\(A \\in \\mathcal{F}_3\\). If \\(A\\) is countable, then \\(A^c\\) has a countable complement and is in \\(\\mathcal{F}_3\\). If \\(A\\) is uncountable, then it has a countable complement \\(A^c\\) which is therefore also in \\(\\mathcal{F}_3\\). We are left with showing countable additivity. Let \\(\\{A_i\\}\\) be an arbitrary collection of sets in \\(\\mathcal{F}_3\\). We will look at two possibilities. First let all \\(A_i\\) be countable. A countable union of countable sets is countable, and therefore in \\(\\mathcal{F}_3\\). Second, let at least one \\(A_i\\) be uncountable. It follows that it has a countable complement. We can write \\[\\begin{equation} (\\cup_{i=1}^{\\infty} A_i)^c = \\cap_{i=1}^{\\infty} A_i^c. \\end{equation}\\] Since at least one \\(A_i^c\\) on the right side is countable, the whole intersection is countable, and therefore the union has a countable complement. It follows that the union is in \\(\\mathcal{F}_3\\). The tuple \\((\\Omega, \\mathcal{F}_3)\\) is a measurable space. Therefore, we only need to check whether \\(P\\) is a probability measure. The measure of the empty set is zero as it is countable. We have to check for countable additivity. Let us look at three situations. Let \\(A_i\\) be disjoint sets. First, let all \\(A_i\\) be countable. \\[\\begin{equation} P(\\cup_{i=1}^{\\infty} A_i) = \\sum_{i=1}^{\\infty}P( A_i)) = 0. \\end{equation}\\] Since the union is countable, the above equation holds. Second, let exactly one \\(A_i\\) be uncountable. W.L.O.G. let that be \\(A_1\\). Then \\[\\begin{equation} P(\\cup_{i=1}^{\\infty} A_i) = 1 + \\sum_{i=2}^{\\infty}P( A_i)) = 1. \\end{equation}\\] Since the union is uncountable, the above equation holds. Third, let at least two \\(A_i\\) be uncountable. We have to check whether it is possible for two uncountable sets in \\(\\mathcal{F}_3\\) to be disjoint. If that is possible, then their measures would sum to more than one and \\(P\\) would not be a probability measure. W.L.O.G. let \\(A_1\\) and \\(A_2\\) be uncountable. Then we have \\[\\begin{equation} A_1 \\cap A_2 = (A_1^c \\cup A_2^c)^c. \\end{equation}\\] Now \\(A_1^c\\) and \\(A_2^c\\) are countable and their union is therefore countable. Let \\(B = A_1^c \\cup A_2^c\\). So the intersection of \\(A_1\\) and \\(A_2\\) equals the complement of \\(B\\), which is countable. For the intersection to be the empty set, \\(B\\) would have to equal to \\(\\Omega\\). But \\(\\Omega\\) is uncountable and therefore \\(B\\) can not equal to \\(\\Omega\\). It follows that two uncountable sets in \\(\\mathcal{F}_3\\) can not have an empty intersection. Therefore the tuple is a legitimate probability space. 2.2 Lebesgue measure Exercise 2.6 Show that the Lebesgue measure of rational numbers on \\([0,1]\\) is 0. R: Implement a random number generator, which generates uniform samples of irrational numbers in \\([0,1]\\) by uniformly sampling from \\([0,1]\\) and rejecting a sample if it is rational. Solution. There are a countable number of rational numbers. 
Therefore, we can write \\[\\begin{align} \\lambda(\\mathbb{Q}) &= \\lambda(\\cup_{i = 1}^{\\infty} q_i) &\\\\ &= \\sum_{i = 1}^{\\infty} \\lambda(q_i) &\\text{ (countable additivity)} \\\\ &= \\sum_{i = 1}^{\\infty} 0 &\\text{ (Lebesgue measure of a singleton)} \\\\ &= 0. \\end{align}\\] Exercise 2.7 Prove that the Lebesgue measure of \\(\\mathbb{R}\\) is infinity. Paradox. Show that the cardinality of \\(\\mathbb{R}\\) and \\((0,1)\\) is the same, while their Lebesgue measures are infinity and one respectively. Solution. Let \\(a_i\\) be the \\(i\\)-th integer for \\(i \\in \\mathbb{Z}\\). We can write \\(\\mathbb{R} = \\cup_{-\\infty}^{\\infty} (a_i, a_{i + 1}]\\). \\[\\begin{align} \\lambda(\\mathbb{R}) &= \\lambda(\\cup_{i = -\\infty}^{\\infty} (a_i, a_{i + 1}]) \\\\ &= \\lambda(\\lim_{n \\rightarrow \\infty} \\cup_{i = -n}^{n} (a_i, a_{i + 1}]) \\\\ &= \\lim_{n \\rightarrow \\infty} \\lambda(\\cup_{i = -n}^{n} (a_i, a_{i + 1}]) \\\\ &= \\lim_{n \\rightarrow \\infty} \\sum_{i = -n}^{n} \\lambda((a_i, a_{i + 1}]) \\\\ &= \\lim_{n \\rightarrow \\infty} \\sum_{i = -n}^{n} 1 \\\\ &= \\lim_{n \\rightarrow \\infty} 2n \\\\ &= \\infty. \\end{align}\\] We need to find a bijection between \\(\\mathbb{R}\\) and \\((0,1)\\). A well-known function that maps from a bounded interval to \\(\\mathbb{R}\\) is the tangent. To make the bijection easier to achieve, we will take the inverse, which maps from \\(\\mathbb{R}\\) to \\((-\\frac{\\pi}{2}, \\frac{\\pi}{2})\\). However, we need to change the function so it maps to \\((0,1)\\). First we add \\(\\frac{\\pi}{2}\\), so that we move the function above zero. Then we only have to divide by the max value, which in this case is \\(\\pi\\). So our bijection is \\[\\begin{equation} f(x) = \\frac{\\tan^{-1}(x) + \\frac{\\pi}{2}}{\\pi}. \\end{equation}\\] Exercise 2.8 Take the measure space \\((\\Omega_1 = (0,1], B_{(0,1]}, \\lambda)\\) (we know that this is a probability space on \\((0,1]\\)). Define a map (function) from \\(\\Omega_1\\) to \\(\\Omega_2 = \\{1,2,3,4,5,6\\}\\) such that the measure space \\((\\Omega_2, 2^{\\Omega_2}, \\lambda(f^{-1}()))\\) will be a discrete probability space with uniform probabilities (\\(P(\\omega) = \\frac{1}{6}, \\forall \\omega \\in \\Omega_2)\\). Is the map that you defined in (a) the only such map? How would you in the same fashion define a map that would result in a probability space that can be interpreted as a coin toss with probability \\(p\\) of heads? R: Use the map in (a) as a basis for a random generator for this fair die. Solution. In other words, we have to assign disjunct intervals of the same size to each element of \\(\\Omega_2\\). Therefore \\[\\begin{equation} f(x) = \\lceil 6x \\rceil. \\end{equation}\\] No, we could for example rearrange the order in which the intervals are mapped to integers. Additionally, we could have several disjoint intervals that mapped to the same integer, as long as the Lebesgue measure of their union would be \\(\\frac{1}{6}\\) and the function would remain injective. We have \\(\\Omega_3 = \\{0,1\\}\\), where zero represents heads and one represents tails. Then \\[\\begin{equation} f(x) = 0^{I_{A}(x)}, \\end{equation}\\] where \\(A = \\{y \\in (0,1] : y < p\\}\\). 
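In the same spirit as the die generator given below, the coin-toss map from part c) can be sketched directly; p = 0.3 here is an arbitrary choice for illustration:

```r
# Coin toss via the map f(x) = 0^{I_A(x)}: heads (0) whenever the uniform draw falls below p
set.seed(1)
p <- 0.3                     # assumed probability of heads, for illustration only
u <- runif(1000)
coin <- ifelse(u < p, 0, 1)  # 0 = heads, 1 = tails
mean(coin == 0)              # relative frequency of heads, should be close to p
```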
set.seed(1) unif_s <- runif(1000) die_s <- ceiling(6 * unif_s) summary(as.factor(die_s)) ## 1 2 3 4 5 6 ## 166 154 200 146 166 168 "],["condprob.html", "Chapter 3 Conditional probability 3.1 Calculating conditional probabilities 3.2 Conditional independence 3.3 Monty Hall problem", " Chapter 3 Conditional probability This chapter deals with conditional probability. The students are expected to acquire the following knowledge: Theoretical Identify whether variables are independent. Calculation of conditional probabilities. Understanding of conditional dependence and independence. How to apply Bayes’ theorem to solve difficult probabilistic questions. R Simulating conditional probabilities. cumsum. apply. .fold-btn { float: right; margin: 5px 5px 0 0; } .fold { border: 1px solid black; min-height: 40px; } 3.1 Calculating conditional probabilities Exercise 3.1 A military officer is in charge of identifying enemy aircraft and shooting them down. He is able to positively identify an enemy airplane 95% of the time and positively identify a friendly airplane 90% of the time. Furthermore, 99% of the airplanes are friendly. When the officer identifies an airplane as an enemy airplane, what is the probability that it is not and they will shoot at a friendly airplane? Solution. Let \\(E = 0\\) denote that the observed plane is friendly and \\(E=1\\) that it is an enemy. Let \\(I = 0\\) denote that the officer identified it as friendly and \\(I = 1\\) as enemy. Then \\[\\begin{align} P(E = 0 | I = 1) &= \\frac{P(I = 1 | E = 0)P(E = 0)}{P(I = 1)} \\\\ &= \\frac{P(I = 1 | E = 0)P(E = 0)}{P(I = 1 | E = 0)P(E = 0) + P(I = 1 | E = 1)P(E = 1)} \\\\ &= \\frac{0.1 \\times 0.99}{0.1 \\times 0.99 + 0.95 \\times 0.01} \\\\ &= 0.91. \\end{align}\\] Exercise 3.2 R: Consider tossing a fair die. Let \\(A = \\{2,4,6\\}\\) and \\(B = \\{1,2,3,4\\}\\). Then \\(P(A) = \\frac{1}{2}\\), \\(P(B) = \\frac{2}{3}\\) and \\(P(AB) = \\frac{1}{3}\\). Since \\(P(AB) = P(A)P(B)\\), the events \\(A\\) and \\(B\\) are independent. Simulate draws from the sample space and verify that the proportions are the same. Then find two events \\(C\\) and \\(D\\) that are not independent and repeat the simulation. set.seed(1) nsamps <- 10000 tosses <- sample(1:6, nsamps, replace = TRUE) PA <- sum(tosses %in% c(2,4,6)) / nsamps PB <- sum(tosses %in% c(1,2,3,4)) / nsamps PA * PB ## [1] 0.3295095 sum(tosses %in% c(2,4)) / nsamps ## [1] 0.3323 # Let C = {1,2} and D = {2,3} PC <- sum(tosses %in% c(1,2)) / nsamps PD <- sum(tosses %in% c(2,3)) / nsamps PC * PD ## [1] 0.1067492 sum(tosses %in% c(2)) / nsamps ## [1] 0.1622 Exercise 3.3 A machine reports the true value of a thrown 12-sided die 5 out of 6 times. If the machine reports a 1 has been tossed, what is the probability that it is actually a 1? Now let the machine only report whether a 1 has been tossed or not. Does the probability change? R: Use simulation to check your answers to a) and b). Solution. Let \\(T = 1\\) denote that the toss is 1 and \\(M = 1\\) that the machine reports a 1. \\[\\begin{align} P(T = 1 | M = 1) &= \\frac{P(M = 1 | T = 1)P(T = 1)}{P(M = 1)} \\\\ &= \\frac{P(M = 1 | T = 1)P(T = 1)}{\\sum_{k=1}^{12} P(M = 1 | T = k)P(T = k)} \\\\ &= \\frac{\\frac{5}{6}\\frac{1}{12}}{\\frac{5}{6}\\frac{1}{12} + 11 \\frac{1}{6} \\frac{1}{11} \\frac{1}{12}} \\\\ &= \\frac{5}{6}. \\end{align}\\] Yes. 
\\[\\begin{align} P(T = 1 | M = 1) &= \\frac{P(M = 1 | T = 1)P(T = 1)}{P(M = 1)} \\\\ &= \\frac{P(M = 1 | T = 1)P(T = 1)}{\\sum_{k=1}^{12} P(M = 1 | T = k)P(T = k)} \\\\ &= \\frac{\\frac{5}{6}\\frac{1}{12}}{\\frac{5}{6}\\frac{1}{12} + 11 \\frac{1}{6} \\frac{1}{12}} \\\\ &= \\frac{5}{16}. \\end{align}\\] set.seed(1) nsamps <- 10000 report_a <- vector(mode = "numeric", length = nsamps) report_b <- vector(mode = "logical", length = nsamps) truths <- vector(mode = "logical", length = nsamps) for (i in 1:10000) { toss <- sample(1:12, size = 1) truth <- sample(c(TRUE, FALSE), size = 1, prob = c(5/6, 1/6)) truths[i] <- truth if (truth) { report_a[i] <- toss report_b[i] <- toss == 1 } else { remaining <- (1:12)[1:12 != toss] report_a[i] <- sample(remaining, size = 1) report_b[i] <- toss != 1 } } truth_a1 <- truths[report_a == 1] sum(truth_a1) / length(truth_a1) ## [1] 0.8300733 truth_b1 <- truths[report_b] sum(truth_b1) / length(truth_b1) ## [1] 0.3046209 Exercise 3.4 A coin is tossed independently \\(n\\) times. The probability of heads at each toss is \\(p\\). At each time \\(k\\), \\((k = 2,3,...,n)\\) we get a reward at time \\(k+1\\) if \\(k\\)-th toss was a head and the previous toss was a tail. Let \\(A_k\\) be the event that a reward is obtained at time \\(k\\). Are events \\(A_k\\) and \\(A_{k+1}\\) independent? Are events \\(A_k\\) and \\(A_{k+2}\\) independent? R: simulate 10 tosses 10000 times, where \\(p = 0.7\\). Check your answers to a) and b) by counting the frequencies of the events \\(A_5\\), \\(A_6\\), and \\(A_7\\). Solution. For \\(A_k\\) to happen, we need the tosses \\(k-2\\) and \\(k-1\\) be tails and heads respectively. For \\(A_{k+1}\\) to happen, we need tosses \\(k-1\\) and \\(k\\) be tails and heads respectively. As the toss \\(k-1\\) need to be heads for one and tails for the other, these two events can not happen simultaneously. Therefore the probability of their intersection is 0. But the probability of each of them separately is \\(p(1-p) > 0\\). Therefore, they are not independent. For \\(A_k\\) to happen, we need the tosses \\(k-2\\) and \\(k-1\\) be tails and heads respectively. For \\(A_{k+2}\\) to happen, we need tosses \\(k\\) and \\(k+1\\) be tails and heads respectively. So the probability of intersection is \\(p^2(1-p)^2\\). And the probability of each separately is again \\(p(1-p)\\). Therefore, they are independent. set.seed(1) nsamps <- 10000 p <- 0.7 rewardA_5 <- vector(mode = "logical", length = nsamps) rewardA_6 <- vector(mode = "logical", length = nsamps) rewardA_7 <- vector(mode = "logical", length = nsamps) rewardA_56 <- vector(mode = "logical", length = nsamps) rewardA_57 <- vector(mode = "logical", length = nsamps) for (i in 1:nsamps) { samps <- sample(c(0,1), size = 10, replace = TRUE, prob = c(0.7, 0.3)) rewardA_5[i] <- (samps[4] == 0 & samps[3] == 1) rewardA_6[i] <- (samps[5] == 0 & samps[4] == 1) rewardA_7[i] <- (samps[6] == 0 & samps[5] == 1) rewardA_56[i] <- (rewardA_5[i] & rewardA_6[i]) rewardA_57[i] <- (rewardA_5[i] & rewardA_7[i]) } sum(rewardA_5) / nsamps ## [1] 0.2141 sum(rewardA_6) / nsamps ## [1] 0.2122 sum(rewardA_7) / nsamps ## [1] 0.2107 sum(rewardA_56) / nsamps ## [1] 0 sum(rewardA_57) / nsamps ## [1] 0.0454 Exercise 3.5 A drawer contains two coins. One is an unbiased coin, the other is a biased coin, which will turn up heads with probability \\(p\\) and tails with probability \\(1-p\\). One coin is selected uniformly at random. The selected coin is tossed \\(n\\) times. 
The coin turns up heads \\(k\\) times and tails \\(n-k\\) times. What is the probability that the coin is biased? The selected coin is tossed repeatedly until it turns up heads \\(k\\) times. Given that it is tossed \\(n\\) times in total, what is the probability that the coin is biased? Solution. Let \\(B = 1\\) denote that the coin is biased and let \\(H = k\\) denote that we’ve seen \\(k\\) heads. \\[\\begin{align} P(B = 1 | H = k) &= \\frac{P(H = k | B = 1)P(B = 1)}{P(H = k)} \\\\ &= \\frac{P(H = k | B = 1)P(B = 1)}{P(H = k | B = 1)P(B = 1) + P(H = k | B = 0)P(B = 0)} \\\\ &= \\frac{p^k(1-p)^{n-k} 0.5}{p^k(1-p)^{n-k} 0.5 + 0.5^{n+1}} \\\\ &= \\frac{p^k(1-p)^{n-k}}{p^k(1-p)^{n-k} + 0.5^n}. \\end{align}\\] The same results as in a). The only difference between these two scenarios is that in b) the last throw must be heads. However, this holds for the biased and the unbiased coin and therefore does not affect the probability of the coin being biased. Exercise 3.6 Judy goes around the company for Women’s day and shares flowers. In every office she leaves a flower, if there is at least one woman inside. The probability that there’s a woman in the office is \\(\\frac{3}{5}\\). What is the probability that Judy leaves her first flower in the fourth office? Given that she has given away exactly three flowers in the first four offices, what is the probability that she gives her fourth flower in the eighth office? What is the probability that she leaves the second flower in the fifth office? What is the probability that she leaves the second flower in the fifth office, given that she did not leave the second flower in the second office? Judy needs a new supply of flowers immediately after the office, where she gives away her last flower. What is the probability that she visits at least five offices, if she starts with two flowers? R: simulate Judy’s walk 10000 times to check your answers a) - e). Solution. Let \\(X_i = k\\) denote the event that … \\(i\\)-th sample on the \\(k\\)-th run. Since the events are independent, we can multiply their probabilities to get \\[\\begin{equation} P(X_1 = 4) = 0.4^3 \\times 0.6 = 0.0384. \\end{equation}\\] Same as in a) as we have a fresh start after first four offices. For this to be possible, she had to leave the first flower in one of the first four offices. Therefore there are four possibilities, and for each of those the probability is \\(0.4^3 \\times 0.6\\). Additionally, the probability that she leaves a flower in the fifth office is \\(0.6\\). So \\[\\begin{equation} P(X_2 = 5) = \\binom{4}{1} \\times 0.4^3 \\times 0.6^2 = 0.09216. \\end{equation}\\] We use Bayes’ theorem. \\[\\begin{align} P(X_2 = 5 | X_2 \\neq 2) &= \\frac{P(X_2 \\neq 2 | X_2 = 5)P(X_2 = 5)}{P(X_2 \\neq 2)} \\\\ &= \\frac{0.09216}{0.64} \\\\ &= 0.144. \\end{align}\\] The denominator in the second equation can be calculated as follows. One of three things has to happen for the second not to be dealt in the second round. First, both are zero, so \\(0.4^2\\). Second, first is zero, and second is one, so \\(0.4 \\times 0.6\\). Third, the first is one and the second one zero, so \\(0.6 \\times 0.4\\). Summing these values we get \\(0.64\\). We will look at the complement, so the events that she gave away exactly two flowers after two, three and four offices. \\[\\begin{equation} P(X_2 \\geq 5) = 1 - 0.6^2 - 2 \\times 0.4 \\times 0.6^2 - 3 \\times 0.4^2 \\times 0.6^2 = 0.1792. \\end{equation}\\] The multiplying parts represent the possibilities of the first flower. 
set.seed(1) nsamps <- 100000 Judyswalks <- matrix(data = NA, nrow = nsamps, ncol = 8) for (i in 1:nsamps) { thiswalk <- sample(c(0,1), size = 8, replace = TRUE, prob = c(0.4, 0.6)) Judyswalks[i, ] <- thiswalk } csJudy <- t(apply(Judyswalks, 1, cumsum)) # a sum(csJudy[ ,4] == 1 & csJudy[ ,3] == 0) / nsamps ## [1] 0.03848 # b csJsubset <- csJudy[csJudy[ ,4] == 3 & csJudy[ ,3] == 2, ] sum(csJsubset[ ,8] == 4 & csJsubset[ ,7] == 3) / nrow(csJsubset) ## [1] 0.03665893 # c sum(csJudy[ ,5] == 2 & csJudy[ ,4] == 1) / nsamps ## [1] 0.09117 # d sum(csJudy[ ,5] == 2 & csJudy[ ,4] == 1) / sum(csJudy[ ,2] != 2) ## [1] 0.1422398 # e sum(csJudy[ ,4] < 2) / nsamps ## [1] 0.17818 3.2 Conditional independence Exercise 3.7 Describe: A real-world example of two events \\(A\\) and \\(B\\) that are dependent but become conditionally independent if conditioned on a third event \\(C\\). A real-world example of two events \\(A\\) and \\(B\\) that are independent, but become dependent if conditioned on some third event \\(C\\). Solution. Let \\(A\\) be the height of a person and let \\(B\\) be the person’s knowledge of the Dutch language. These events are dependent since the Dutch are known to be taller than average. However if \\(C\\) is the nationality of the person, then \\(A\\) and \\(B\\) are independent given \\(C\\). Let \\(A\\) be the event that Mary passes the exam and let \\(B\\) be the event that John passes the exam. These events are independent. However, if the event \\(C\\) is that Mary and John studied together, then \\(A\\) and \\(B\\) are conditionally dependent given \\(C\\). Exercise 3.8 We have two coins of identical appearance. We know that one is a fair coin and the other flips heads 80% of the time. We choose one of the two coins uniformly at random. We discard the coin that was not chosen. We now flip the chosen coin independently 10 times, producing a sequence \\(Y_1 = y_1\\), \\(Y_2 = y_2\\), …, \\(Y_{10} = y_{10}\\). Intuitively, without doing and computation, are these random variables independent? Compute the probability \\(P(Y_1 = 1)\\). Compute the probabilities \\(P(Y_2 = 1 | Y_1 = 1)\\) and \\(P(Y_{10} = 1 | Y_1 = 1,...,Y_9 = 1)\\). Given your answers to b) and c), would you now change your answer to a)? If so, discuss why your intuition had failed. Solution. \\(P(Y_1 = 1) = 0.5 * 0.8 + 0.5 * 0.5 = 0.65\\). Since we know that \\(Y_1 = 1\\) this should change our view of the probability of the coin being biased or not. Let \\(B = 1\\) denote the event that the coin is biased and let \\(B = 0\\) denote that the coin is unbiased. By using marginal probability, we can write \\[\\begin{align} P(Y_2 = 1 | Y_1 = 1) &= P(Y_2 = 1, B = 1 | Y_1 = 1) + P(Y_2 = 1, B = 0 | Y_1 = 1) \\\\ &= \\sum_{k=1}^2 P(Y_2 = 1 | B = k, Y_1 = 1)P(B = k | Y_1 = 1) \\\\ &= 0.8 \\frac{P(Y_1 = 1 | B = 1)P(B = 1)}{P(Y_1 = 1)} + 0.5 \\frac{P(Y_1 = 1 | B = 0)P(B = 0)}{P(Y_1 = 1)} \\\\ &= 0.8 \\frac{0.8 \\times 0.5}{0.65} + 0.5 \\frac{0.5 \\times 0.5}{0.65} \\\\ &\\approx 0.68. \\end{align}\\] For the other calculation we follow the same procedure. Let \\(X = 1\\) denote that first nine tosses are all heads (equivalent to \\(Y_1 = 1\\),…, \\(Y_9 = 1\\)). 
\\[\\begin{align} P(Y_{10} = 1 | X = 1) &= P(Y_{10} = 1, B = 1 | X = 1) + P(Y_{10} = 1, B = 0 | X = 1) \\\\ &= \\sum_{k=1}^2 P(Y_{10} = 1 | B = k, X = 1)P(B = k | X = 1) \\\\ &= 0.8 \\frac{P(X = 1 | B = 1)P(B = 1)}{P(X = 1)} + 0.5 \\frac{P(X = 1 | B = 0)P(B = 0)}{P(X = 1)} \\\\ &= 0.8 \\frac{0.8^9 \\times 0.5}{0.5 \\times 0.8^9 + 0.5 \\times 0.5^9} + 0.5 \\frac{0.5^9 \\times 0.5}{0.5 \\times 0.8^9 + 0.5 \\times 0.5^9} \\\\ &\\approx 0.8. \\end{align}\\] 3.3 Monty Hall problem The Monty Hall problem is a famous probability puzzle with a counter-intuitive solution. Many established mathematicians and statisticians had trouble solving it, and some even rejected the correct solution until they saw a proof by simulation. Here we will show how it can be solved relatively simply with Bayes’ theorem if we select the variables in a smart way. Exercise 3.9 (Monty Hall problem) A prize is placed at random behind one of three doors. You pick a door. Now Monty Hall chooses one of the other two doors, opens it and shows you that it is empty. He then gives you the opportunity to keep your door or switch to the other unopened door. Should you stay or switch? Use Bayes’ theorem to calculate the probability of winning if you switch and if you do not. R: Check your answers in R. Solution. W.L.O.G. assume we always pick the first door. The host can only open door 2 or door 3, as he cannot open the door we picked. Let \\(k \\in \\{2,3\\}\\). Let us first look at what happens if we do not change. Then we have \\[\\begin{align} P(\\text{car in 1} | \\text{open $k$}) &= \\frac{P(\\text{open $k$} | \\text{car in 1})P(\\text{car in 1})}{P(\\text{open $k$})} \\\\ &= \\frac{P(\\text{open $k$} | \\text{car in 1})P(\\text{car in 1})}{\\sum_{n=1}^3 P(\\text{open $k$} | \\text{car in $n$})P(\\text{car in $n$})}. \\end{align}\\] The probability that he opened \\(k\\) if the car is in 1 is \\(\\frac{1}{2}\\), as he can choose between doors 2 and 3, both of which have a goat behind them. Let us look at the normalization constant. When \\(n = 1\\) we get the value in the numerator. When \\(n=k\\), we get 0, as he will not open the door if there’s a prize behind it. The remaining option is that we select door 1 and the car is behind the other closed door, the one that is neither 1 nor \\(k\\). Since the host can’t open door 1 (our pick) or the door hiding the prize, he must open \\(k\\), so this conditional probability is 1, and the prior probability of the car being behind that door is \\(\\frac{1}{3}\\). So we have \\[\\begin{align} P(\\text{car in 1} | \\text{open $k$}) &= \\frac{\\frac{1}{2}\\frac{1}{3}}{\\frac{1}{2}\\frac{1}{3} + \\frac{1}{3}} \\\\ &= \\frac{1}{3}. \\end{align}\\] Now let us look at what happens if we do change. Let \\(k' \\in \\{2,3\\}\\) be the door that is not opened. If we change, we select this door, so we have \\[\\begin{align} P(\\text{car in $k'$} | \\text{open $k$}) &= \\frac{P(\\text{open $k$} | \\text{car in $k'$})P(\\text{car in $k'$})}{P(\\text{open $k$})} \\\\ &= \\frac{P(\\text{open $k$} | \\text{car in $k'$})P(\\text{car in $k'$})}{\\sum_{n=1}^3 P(\\text{open $k$} | \\text{car in $n$})P(\\text{car in $n$})}. \\end{align}\\] The denominator stays the same; the only thing that differs from before is \\(P(\\text{open $k$} | \\text{car in $k'$})\\). We have a situation where we initially selected door 1 and the car is in door \\(k'\\). The probability that the host will open door \\(k\\) is then 1, as he cannot pick any other door.
So we have \\[\\begin{align} P(\\text{car in $k'$} | \\text{open $k$}) &= \\frac{\\frac{1}{3}}{\\frac{1}{2}\\frac{1}{3} + \\frac{1}{3}} \\\\ &= \\frac{2}{3}. \\end{align}\\] Therefore it makes sense to change the door. set.seed(1) nsamps <- 1000 ifchange <- vector(mode = "logical", length = nsamps) ifstay <- vector(mode = "logical", length = nsamps) for (i in 1:nsamps) { where_car <- sample(c(1:3), 1) where_player <- sample(c(1:3), 1) open_samp <- (1:3)[where_car != (1:3) & where_player != (1:3)] if (length(open_samp) == 1) { where_open <- open_samp } else { where_open <- sample(open_samp, 1) } ifstay[i] <- where_car == where_player where_ifchange <- (1:3)[where_open != (1:3) & where_player != (1:3)] ifchange[i] <- where_ifchange == where_car } sum(ifstay) / nsamps ## [1] 0.328 sum(ifchange) / nsamps ## [1] 0.672 "],["rvs.html", "Chapter 4 Random variables 4.1 General properties and calculations 4.2 Discrete random variables 4.3 Continuous random variables 4.4 Singular random variables 4.5 Transformations", " Chapter 4 Random variables This chapter deals with random variables and their distributions. The students are expected to acquire the following knowledge: Theoretical Identification of random variables. Convolutions of random variables. Derivation of PDF, PMF, CDF, and quantile function. Definitions and properties of common discrete random variables. Definitions and properties of common continuous random variables. Transforming univariate random variables. R Familiarize with PDF, PMF, CDF, and quantile functions for several distributions. Visual inspection of probability distributions. Analytical and empirical calculation of probabilities based on distributions. New R functions for plotting (for example, facet_wrap). Creating random number generators based on the Uniform distribution. .fold-btn { float: right; margin: 5px 5px 0 0; } .fold { border: 1px solid black; min-height: 40px; } 4.1 General properties and calculations Exercise 4.1 Which of the functions below are valid CDFs? Find their respective densities. R: Plot the three functions. \\[\\begin{equation} F(x) = \\begin{cases} 1 - e^{-x^2} & x \\geq 0 \\\\ 0 & x < 0. \\end{cases} \\end{equation}\\] \\[\\begin{equation} F(x) = \\begin{cases} e^{-\\frac{1}{x}} & x > 0 \\\\ 0 & x \\leq 0. \\end{cases} \\end{equation}\\] \\[\\begin{equation} F(x) = \\begin{cases} 0 & x \\leq 0 \\\\ \\frac{1}{3} & 0 < x \\leq \\frac{1}{2} \\\\ 1 & x > \\frac{1}{2}. \\end{cases} \\end{equation}\\] Solution. Yes. First, let us check the limits. \\(\\lim_{x \\rightarrow -\\infty} (0) = 0\\). \\(\\lim_{x \\rightarrow \\infty} (1 - e^{-x^2}) = 1 - \\lim_{x \\rightarrow \\infty} e^{-x^2} = 1 - 0 = 1\\). Second, let us check whether the function is increasing. Let \\(x > y \\geq 0\\). Then \\(1 - e^{-x^2} \\geq 1 - e^{-y^2}\\). We only have to check right continuity for the point zero. \\(F(0) = 0\\) and \\(\\lim_{\\epsilon \\downarrow 0}F (0 + \\epsilon) = \\lim_{\\epsilon \\downarrow 0} 1 - e^{-\\epsilon^2} = 1 - \\lim_{\\epsilon \\downarrow 0} e^{-\\epsilon^2} = 1 - 1 = 0\\). We get the density by differentiating the CDF. \\(p(x) = \\frac{d}{dx} 1 - e^{-x^2} = 2xe^{-x^2}.\\) Students are encouraged to check that this is a proper PDF. Yes. First, let us check the limits. \\(\\lim_{x \\rightarrow -\\infty} (0) = 0\\) and \\(\\lim_{x \\rightarrow \\infty} (e^{-\\frac{1}{x}}) = 1\\). Second, let us check whether the function is increasing. Let \\(x > y \\geq 0\\). Then \\(e^{-\\frac{1}{x}} \\geq e^{-\\frac{1}{y}}\\). We only have to check right continuity for the point zero.
\\(F(0) = 0\\) and \\(\\lim_{\\epsilon \\downarrow 0}F (0 + \\epsilon) = \\lim_{\\epsilon \\downarrow 0} e^{-\\frac{1}{\\epsilon}} = 0\\). We get the density by differentiating the CDF. \\(p(x) = \\frac{d}{dx} e^{-\\frac{1}{x}} = \\frac{1}{x^2}e^{-\\frac{1}{x}}.\\) Students are encouraged to check that this is a proper PDF. No. The function is not right continuous as \\(F(\\frac{1}{2}) = \\frac{1}{3}\\), but \\(\\lim_{\\epsilon \\downarrow 0} F(\\frac{1}{2} + \\epsilon) = 1\\). f1 <- function (x) { tmp <- 1 - exp(-x^2) tmp[x < 0] <- 0 return(tmp) } f2 <- function (x) { tmp <- exp(-(1 / x)) tmp[x <= 0] <- 0 return(tmp) } f3 <- function (x) { tmp <- x tmp[x == x] <- 1 tmp[x <= 0.5] <- 1/3 tmp[x <= 0] <- 0 return(tmp) } cdf_data <- tibble(x = seq(-1, 20, by = 0.001), f1 = f1(x), f2 = f2(x), f3 = f3(x)) %>% melt(id.vars = "x") cdf_plot <- ggplot(data = cdf_data, aes(x = x, y = value, color = variable)) + geom_hline(yintercept = 1) + geom_line() plot(cdf_plot) Exercise 4.2 Let \\(X\\) be a random variable with CDF \\[\\begin{equation} F(x) = \\begin{cases} 0 & x < 0 \\\\ \\frac{x^2}{2} & 0 \\leq x < 1 \\\\ \\frac{1}{2} + \\frac{p}{2} & 1 \\leq x < 2 \\\\ \\frac{1}{2} + \\frac{p}{2} + \\frac{1 - p}{2} & x \\geq 2 \\end{cases} \\end{equation}\\] R: Plot this CDF for \\(p = 0.3\\). Is it a discrete, continuous, or mixed random varible? Find the probability density/mass of \\(X\\). f1 <- function (x, p) { tmp <- x tmp[x >= 2] <- 0.5 + (p * 0.5) + ((1-p) * 0.5) tmp[x < 2] <- 0.5 + (p * 0.5) tmp[x < 1] <- (x[x < 1])^2 / 2 tmp[x < 0] <- 0 return(tmp) } cdf_data <- tibble(x = seq(-1, 5, by = 0.001), y = f1(x, 0.3)) cdf_plot <- ggplot(data = cdf_data, aes(x = x, y = y)) + geom_hline(yintercept = 1) + geom_line(color = "blue") plot(cdf_plot) ::: {.solution} \\(X\\) is a mixed random variable. Since \\(X\\) is a mixed random variable, we have to find the PDF of the continuous part and the PMF of the discrete part. We get the continuous part by differentiating the corresponding CDF, \\(\\frac{d}{dx}\\frac{x^2}{2} = x\\). So the PDF, when \\(0 \\leq x < 1\\), is \\(p(x) = x\\). Let us look at the discrete part now. It has two steps, so this is a discrete distribution with two outcomes – numbers 1 and 2. The first happens with probability \\(\\frac{p}{2}\\), and the second with probability \\(\\frac{1 - p}{2}\\). This reminds us of the Bernoulli distribution. The PMF for the discrete part is \\(P(X = x) = (\\frac{p}{2})^{2 - x} (\\frac{1 - p}{2})^{x - 1}\\). ::: Exercise 4.3 (Convolutions) Convolutions are probability distributions that correspond to sums of independent random variables. Let \\(X\\) and \\(Y\\) be independent discrete variables. Find the PMF of \\(Z = X + Y\\). Hint: Use the law of total probability. Let \\(X\\) and \\(Y\\) be independent continuous variables. Find the PDF of \\(Z = X + Y\\). Hint: Start with the CDF. Solution. \\[\\begin{align} P(Z = z) &= P(X + Y = z) & \\\\ &= \\sum_{k = -\\infty}^\\infty P(X + Y = z | Y = k) P(Y = k) & \\text{ (law of total probability)} \\\\ &= \\sum_{k = -\\infty}^\\infty P(X + k = z | Y = k) P(Y = k) & \\\\ &= \\sum_{k = -\\infty}^\\infty P(X + k = z) P(Y = k) & \\text{ (independence of $X$ and $Y$)} \\\\ &= \\sum_{k = -\\infty}^\\infty P(X = z - k) P(Y = k). & \\end{align}\\] Let \\(f\\) and \\(g\\) be the PDFs of \\(X\\) and \\(Y\\) respectively. 
\\[\\begin{align} F(z) &= P(Z < z) \\\\ &= P(X + Y < z) \\\\ &= \\int_{-\\infty}^{\\infty} P(X + Y < z | Y = y)P(Y = y)dy \\\\ &= \\int_{-\\infty}^{\\infty} P(X + y < z | Y = y)P(Y = y)dy \\\\ &= \\int_{-\\infty}^{\\infty} P(X + y < z)P(Y = y)dy \\\\ &= \\int_{-\\infty}^{\\infty} P(X < z - y)P(Y = y)dy \\\\ &= \\int_{-\\infty}^{\\infty} (\\int_{-\\infty}^{z - y} f(x) dx) g(y) dy \\end{align}\\] Now \\[\\begin{align} p(z) &= \\frac{d}{dz} F(z) & \\\\ &= \\int_{-\\infty}^{\\infty} (\\frac{d}{dz}\\int_{-\\infty}^{z - y} f(x) dx) g(y) dy & \\\\ &= \\int_{-\\infty}^{\\infty} f(z - y) g(y) dy & \\text{ (fundamental theorem of calculus)}. \\end{align}\\] 4.2 Discrete random variables Exercise 4.4 (Binomial random variable) Let \\(X_k\\), \\(k = 1,...,n\\), be random variables with the Bernoulli measure as the PMF. Let \\(X = \\sum_{k=1}^n X_k\\). We call \\(X_k\\) a Bernoulli random variable with parameter \\(p \\in (0,1)\\). Find the CDF of \\(X_k\\). Find PMF of \\(X\\). This is a Binomial random variable with support in \\(\\{0,1,2,...,n\\}\\) and parameters \\(p \\in (0,1)\\) and \\(n \\in \\mathbb{N}_0\\). We denote \\[\\begin{equation} X | n,p \\sim \\text{binomial}(n,p). \\end{equation}\\] Find CDF of \\(X\\). R: Simulate from the binomial distribution with \\(n = 10\\) and \\(p = 0.5\\), and from \\(n\\) Bernoulli distributions with \\(p = 0.5\\). Visually compare the sum of Bernoullis and the binomial. Hint: there is no standard function like rpois for a Bernoulli random variable. Check exercise 1.12 to find out how to sample from a Bernoulli distribution. Solution. There are two outcomes – zero and one. Zero happens with probability \\(1 - p\\). Therefore \\[\\begin{equation} F(k) = \\begin{cases} 0 & k < 0 \\\\ 1 - p & 0 \\leq k < 1 \\\\ 1 & k \\geq 1. \\end{cases} \\end{equation}\\] For the probability of \\(X\\) to be equal to some \\(k \\leq n\\), exactly \\(k\\) Bernoulli variables need to be one, and the others zero. So \\(p^k(1-p)^{n-k}\\). There are \\(\\binom{n}{k}\\) such possible arrangements. Therefore \\[\\begin{align} P(X = k) = \\binom{n}{k} p^k (1 - p)^{n-k}. \\end{align}\\] \\[\\begin{equation} F(k) = \\sum_{i = 0}^{\\lfloor k \\rfloor} \\binom{n}{i} p^i (1 - p)^{n - i} \\end{equation}\\] set.seed(1) nsamps <- 10000 binom_samp <- rbinom(nsamps, size = 10, prob = 0.5) bernoulli_mat <- matrix(data = NA, nrow = nsamps, ncol = 10) for (i in 1:nsamps) { bernoulli_mat[i, ] <- rbinom(10, size = 1, prob = 0.5) } bern_samp <- apply(bernoulli_mat, 1, sum) b_data <- tibble(x = c(binom_samp, bern_samp), type = c(rep("binomial", 10000), rep("Bernoulli_sum", 10000))) b_plot <- ggplot(data = b_data, aes(x = x, fill = type)) + geom_bar(position = "dodge") plot(b_plot) Exercise 4.5 (Geometric random variable) A variable with PMF \\[\\begin{equation} P(k) = p(1-p)^k \\end{equation}\\] is a geometric random variable with support in non-negative integers. It has one parameter \\(p \\in (0,1]\\). We denote \\[\\begin{equation} X | p \\sim \\text{geometric}(p) \\end{equation}\\] Derive the CDF of a geometric random variable. R: Draw 1000 samples from the geometric distribution with \\(p = 0.3\\) and compare their frequencies to theoretical values. Solution. 
\\[\\begin{align} P(X \\leq k) &= \\sum_{i = 0}^k p(1-p)^i \\\\ &= p \\sum_{i = 0}^k (1-p)^i \\\\ &= p \\frac{1 - (1-p)^{k+1}}{1 - (1 - p)} \\\\ &= 1 - (1-p)^{k + 1} \\end{align}\\] set.seed(1) geo_samp <- rgeom(n = 1000, prob = 0.3) geo_samp <- data.frame(x = geo_samp) %>% count(x) %>% mutate(n = n / 1000, type = "empirical_frequencies") %>% bind_rows(data.frame(x = 0:20, n = dgeom(0:20, prob = 0.3), type = "theoretical_measure")) geo_plot <- ggplot(data = geo_samp, aes(x = x, y = n, fill = type)) + geom_bar(stat="identity", position = "dodge") plot(geo_plot) Exercise 4.6 (Poisson random variable) A variable with PMF \\[\\begin{equation} P(k) = \\frac{\\lambda^k e^{-\\lambda}}{k!} \\end{equation}\\] is a Poisson random variable with support in non-negative integers. It has one positive parameter \\(\\lambda\\), which also represents its mean value and variance (a measure of the deviation of the values from the mean – more on mean and variance in the next chapter). We denote \\[\\begin{equation} X | \\lambda \\sim \\text{Poisson}(\\lambda). \\end{equation}\\] This distribution is usually the default choice for modeling counts. We have already encountered a Poisson random variable in exercise 1.13, where we also sampled from this distribution. The CDF of a Poisson random variable is \\(P(X <= x) = e^{-\\lambda} \\sum_{i=0}^x \\frac{\\lambda^{i}}{i!}\\). R: Draw 1000 samples from the Poisson distribution with \\(\\lambda = 5\\) and compare their empirical cumulative distribution function with the theoretical CDF. set.seed(1) pois_samp <- rpois(n = 1000, lambda = 5) pois_samp <- data.frame(x = pois_samp) pois_plot <- ggplot(data = pois_samp, aes(x = x, colour = "ECDF")) + stat_ecdf(geom = "step") + geom_step(data = tibble(x = 0:17, y = ppois(x, 5)), aes(x = x, y = y, colour = "CDF")) + scale_colour_manual("Lgend title", values = c("black", "red")) plot(pois_plot) Exercise 4.7 (Negative binomial random variable) A variable with PMF \\[\\begin{equation} p(k) = \\binom{k + r - 1}{k}(1-p)^r p^k \\end{equation}\\] is a negative binomial random variable with support in non-negative integers. It has two parameters \\(r > 0\\) and \\(p \\in (0,1)\\). We denote \\[\\begin{equation} X | r,p \\sim \\text{NB}(r,p). \\end{equation}\\] Let us reparameterize the negative binomial distribution with \\(q = 1 - p\\). Find the PMF of \\(X \\sim \\text{NB}(1, q)\\). Do you recognize this distribution? Show that the sum of two negative binomial random variables with the same \\(p\\) is also a negative binomial random variable. Hint: Use the fact that the number of ways to place \\(n\\) indistinct balls into \\(k\\) boxes is \\(\\binom{n + k - 1}{n}\\). R: Draw samples from \\(X \\sim \\text{NB}(5, 0.4)\\) and \\(Y \\sim \\text{NB}(3, 0.4)\\). Draw samples from \\(Z = X + Y\\), where you use the parameters calculated in b). Plot both distributions, their sum, and \\(Z\\) using facet_wrap. Be careful, as R uses a different parameterization size=\\(r\\) and prob=\\(1 - p\\). Solution. \\[\\begin{align} P(X = k) &= \\binom{k + 1 - 1}{k}q^1 (1-q)^k \\\\ &= q(1-q)^k. \\end{align}\\] This is the geometric distribution. Let \\(X \\sim \\text{NB}(r_1, p)\\) and \\(Y \\sim \\text{NB}(r_2, p)\\). Let \\(Z = X + Y\\). 
\\[\\begin{align} P(Z = z) &= \\sum_{k = 0}^{\\infty} P(X = z - k)P(Y = k), \\text{ if k < 0, then the probabilities are 0} \\\\ &= \\sum_{k = 0}^{z} P(X = z - k)P(Y = k), \\text{ if k > z, then the probabilities are 0} \\\\ &= \\sum_{k = 0}^{z} \\binom{z - k + r_1 - 1}{z - k}(1 - p)^{r_1} p^{z - k} \\binom{k + r_2 - 1}{k}(1 - p)^{r_2} p^{k} & \\\\ &= \\sum_{k = 0}^{z} \\binom{z - k + r_1 - 1}{z - k} \\binom{k + r_2 - 1}{k}(1 - p)^{r_1 + r_2} p^{z} & \\\\ &= (1 - p)^{r_1 + r_2} p^{z} \\sum_{k = 0}^{z} \\binom{z - k + r_1 - 1}{z - k} \\binom{k + r_2 - 1}{k}& \\end{align}\\] The part before the sum reminds us of the negative binomial distribution with parameters \\(r_1 + r_2\\) and \\(p\\). To complete this term to the negative binomial PMF we need \\(\\binom{z + r_1 + r_2 -1}{z}\\). So the only thing we need to prove is that the sum equals this term. Both terms in the sum can be interpreted as numbers of ways to place a number of balls into boxes. For the left term it is \\(z-k\\) balls into \\(r_1\\) boxes, and for the right \\(k\\) balls into \\(r_2\\) boxes. For each \\(k\\) we are distributing \\(z\\) balls in total. By summing over all \\(k\\), we actually get all the possible placements of \\(z\\) balls into \\(r_1 + r_2\\) boxes. Therefore \\[\\begin{align} P(Z = z) &= (1 - p)^{r_1 + r_2} p^{z} \\sum_{k = 0}^{z} \\binom{z - k + r_1 - 1}{z - k} \\binom{k + r_2 - 1}{k}& \\\\ &= \\binom{z + r_1 + r_2 -1}{z} (1 - p)^{r_1 + r_2} p^{z}. \\end{align}\\] From this it also follows that the sum of geometric distributions with the same parameter is a negative binomial distribution. \\(Z \\sim \\text{NB}(8, 0.4)\\). set.seed(1) nsamps <- 10000 x <- rnbinom(nsamps, size = 5, prob = 0.6) y <- rnbinom(nsamps, size = 3, prob = 0.6) xpy <- x + y z <- rnbinom(nsamps, size = 8, prob = 0.6) samps <- tibble(x, y, xpy, z) samps <- melt(samps) ggplot(data = samps, aes(x = value)) + geom_bar() + facet_wrap(~ variable) 4.3 Continuous random variables Exercise 4.8 (Exponential random variable) A variable \\(X\\) with PDF \\(\\lambda e^{-\\lambda x}\\) is an exponential random variable with support in non-negative real numbers. It has one positive parameter \\(\\lambda\\). We denote \\[\\begin{equation} X | \\lambda \\sim \\text{Exp}(\\lambda). \\end{equation}\\] Find the CDF of an exponential random variable. Find the quantile function of an exponential random variable. Calculate the probability \\(P(1 \\leq X \\leq 3)\\), where \\(X \\sim \\text{Exp(1.5)}\\). R: Check your answer to c) with a simulation (rexp). Plot the probability in a meaningful way. R: Implement PDF, CDF, and the quantile function and compare their values with corresponding R functions visually. Hint: use the size parameter to make one of the curves wider. Solution. \\[\\begin{align} F(x) &= \\int_{0}^{x} \\lambda e^{-\\lambda t} dt \\\\ &= \\lambda \\int_{0}^{x} e^{-\\lambda t} dt \\\\ &= \\lambda (\\frac{1}{-\\lambda}e^{-\\lambda t} |_{0}^{x}) \\\\ &= \\lambda(\\frac{1}{\\lambda} - \\frac{1}{\\lambda} e^{-\\lambda x}) \\\\ &= 1 - e^{-\\lambda x}. \\end{align}\\] \\[\\begin{align} F(F^{-1}(x)) &= x \\\\ 1 - e^{-\\lambda F^{-1}(x)} &= x \\\\ e^{-\\lambda F^{-1}(x)} &= 1 - x \\\\ -\\lambda F^{-1}(x) &= \\ln(1 - x) \\\\ F^{-1}(x) &= - \\frac{ln(1 - x)}{\\lambda}. \\end{align}\\] \\[\\begin{align} P(1 \\leq X \\leq 3) &= P(X \\leq 3) - P(X \\leq 1) \\\\ &= P(X \\leq 3) - P(X \\leq 1) \\\\ &= 1 - e^{-1.5 \\times 3} - 1 + e^{-1.5 \\times 1} \\\\ &\\approx 0.212. 
\\end{align}\\] set.seed(1) nsamps <- 1000 samps <- rexp(nsamps, rate = 1.5) sum(samps >= 1 & samps <= 3) / nsamps ## [1] 0.212 exp_plot <- ggplot(data.frame(x = seq(0, 5, by = 0.01)), aes(x = x)) + stat_function(fun = dexp, args = list(rate = 1.5)) + stat_function(fun = dexp, args = list(rate = 1.5), xlim = c(1,3), geom = "area", fill = "red") plot(exp_plot) exp_pdf <- function(x, lambda) { return (lambda * exp(-lambda * x)) } exp_cdf <- function(x, lambda) { return (1 - exp(-lambda * x)) } exp_quant <- function(q, lambda) { return (-(log(1 - q) / lambda)) } ggplot(data = data.frame(x = seq(0, 5, by = 0.01)), aes(x = x)) + stat_function(fun = dexp, args = list(rate = 1.5), aes(color = "R"), size = 2.5) + stat_function(fun = exp_pdf, args = list(lambda = 1.5), aes(color = "Mine"), size = 1.2) + scale_color_manual(values = c("red", "black")) ggplot(data = data.frame(x = seq(0, 5, by = 0.01)), aes(x = x)) + stat_function(fun = pexp, args = list(rate = 1.5), aes(color = "R"), size = 2.5) + stat_function(fun = exp_cdf, args = list(lambda = 1.5), aes(color = "Mine"), size = 1.2) + scale_color_manual(values = c("red", "black")) ggplot(data = data.frame(x = seq(0, 1, by = 0.01)), aes(x = x)) + stat_function(fun = qexp, args = list(rate = 1.5), aes(color = "R"), size = 2.5) + stat_function(fun = exp_quant, args = list(lambda = 1.5), aes(color = "Mine"), size = 1.2) + scale_color_manual(values = c("red", "black")) Exercise 4.9 (Uniform random variable) Continuous uniform random variable with parameters \\(a\\) and \\(b\\) has the PDF \\[\\begin{equation} p(x) = \\begin{cases} \\frac{1}{b - a} & x \\in [a,b] \\\\ 0 & \\text{otherwise}. \\end{cases} \\end{equation}\\] Find the CDF of the uniform random variable. Find the quantile function of the uniform random variable. Let \\(X \\sim \\text{Uniform}(a,b)\\). Find the CDF of the variable \\(Y = \\frac{X - a}{b - a}\\). This is the standard uniform random variable. Let \\(X \\sim \\text{Uniform}(-1, 3)\\). Find such \\(z\\) that \\(P(X < z + \\mu_x) = \\frac{1}{5}\\). R: Check your result from d) using simulation. Solution. \\[\\begin{align} F(x) &= \\int_{a}^x \\frac{1}{b - a} dt \\\\ &= \\frac{1}{b - a} \\int_{a}^x dt \\\\ &= \\frac{x - a}{b - a}. \\end{align}\\] \\[\\begin{align} F(F^{-1}(p)) &= p \\\\ \\frac{F^{-1}(p) - a}{b - a} &= p \\\\ F^{-1}(p) &= p(b - a) + a. \\end{align}\\] \\[\\begin{align} F_Y(y) &= P(Y < y) \\\\ &= P(\\frac{X - a}{b - a} < y) \\\\ &= P(X < y(b - a) + a) \\\\ &= F_X(y(b - a) + a) \\\\ &= \\frac{(y(b - a) + a) - a}{b - a} \\\\ &= y. \\end{align}\\] \\[\\begin{align} P(X < z + 1) &= \\frac{1}{5} \\\\ F(z + 1) &= \\frac{1}{5} \\\\ z + 1 &= F^{-1}(\\frac{1}{5}) \\\\ z &= \\frac{1}{5}4 - 1 - 1 \\\\ z &= -1.2. \\end{align}\\] set.seed(1) a <- -1 b <- 3 nsamps <- 10000 unif_samp <- runif(nsamps, a, b) mu_x <- mean(unif_samp) new_samp <- unif_samp - mu_x quantile(new_samp, probs = 1/5) ## 20% ## -1.203192 punif(-0.2, -1, 3) ## [1] 0.2 Exercise 4.10 (Beta random variable) A variable \\(X\\) with PDF \\[\\begin{equation} p(x) = \\frac{x^{\\alpha - 1} (1 - x)^{\\beta - 1}}{\\text{B}(\\alpha, \\beta)}, \\end{equation}\\] where \\(\\text{B}(\\alpha, \\beta) = \\frac{\\Gamma(\\alpha) \\Gamma(\\beta)}{\\Gamma(\\alpha + \\beta)}\\) and \\(\\Gamma(x) = \\int_0^{\\infty} x^{z - 1} e^{-x} dx\\) is a Beta random variable with support on \\([0,1]\\). It has two positive parameters \\(\\alpha\\) and \\(\\beta\\). 
Notation: \\[\\begin{equation} X | \\alpha, \\beta \\sim \\text{Beta}(\\alpha, \\beta) \\end{equation}\\] It is often used in modeling rates. Calculate the PDF for \\(\\alpha = 1\\) and \\(\\beta = 1\\). What do you notice? R: Plot densities of the beta distribution for parameter pairs (2, 2), (4, 1), (1, 4), (2, 5), and (0.1, 0.1). R: Sample from \\(X \\sim \\text{Beta}(2, 5)\\) and compare the histogram with the Beta PDF. Solution. \\[\\begin{equation} p(x) = \\frac{x^{1 - 1} (1 - x)^{1 - 1}}{\\text{B}(1, 1)} = 1. \\end{equation}\\] This is the standard uniform distribution. set.seed(1) ggplot(data = data.frame(x = seq(0, 1, by = 0.01)), aes(x = x)) + stat_function(fun = dbeta, args = list(shape1 = 2, shape2 = 2), aes(color = "Beta(2,2)")) + stat_function(fun = dbeta, args = list(shape1 = 4, shape2 = 1), aes(color = "Beta(4,1)")) + stat_function(fun = dbeta, args = list(shape1 = 1, shape2 = 4), aes(color = "Beta(1,4)")) + stat_function(fun = dbeta, args = list(shape1 = 2, shape2 = 5), aes(color = "Beta(2,5)")) + stat_function(fun = dbeta, args = list(shape1 = 0.1, shape2 = 0.1), aes(color = "Beta(0.1,0.1)")) set.seed(1) nsamps <- 1000 samps <- rbeta(nsamps, 2, 5) ggplot(data = data.frame(x = samps), aes(x = x)) + geom_histogram(aes(y = ..density..), color = "black") + stat_function(data = data.frame(x = seq(0, 1, by = 0.01)), aes(x = x), fun = dbeta, args = list(shape1 = 2, shape2 = 5), color = "red", size = 1.2) Exercise 4.11 (Gamma random variable) A random variable with PDF \\[\\begin{equation} p(x) = \\frac{\\beta^\\alpha}{\\Gamma(\\alpha)} x^{\\alpha - 1}e^{-\\beta x} \\end{equation}\\] is a Gamma random variable with support on the positive numbers and parameters shape \\(\\alpha > 0\\) and rate \\(\\beta > 0\\). We write \\[\\begin{equation} X | \\alpha, \\beta \\sim \\text{Gamma}(\\alpha, \\beta) \\end{equation}\\] and its CDF is \\[\\begin{equation} \\frac{\\gamma(\\alpha, \\beta x)}{\\Gamma(\\alpha)}, \\end{equation}\\] where \\(\\gamma(s, x) = \\int_0^x t^{s-1} e^{-t} dt\\). It is usually used in modeling positive phenomena (for example, insurance claims and rainfall). Let \\(X \\sim \\text{Gamma}(1, \\beta)\\). Find the PDF of \\(X\\). Do you recognize this PDF? Let \\(k = \\alpha\\) and \\(\\theta = \\frac{1}{\\beta}\\). Find the PDF of \\(X | k, \\theta \\sim \\text{Gamma}(k, \\theta)\\). Random variables can be reparameterized, and sometimes a reparameterized distribution is more suitable for certain calculations. The first parameterization is, for example, usually used in Bayesian statistics, while this parameterization is more common in econometrics and some other applied fields. Note that you also need to pay attention to the parameters in statistical software, so diligently read the help files when using functions like rgamma to see how the function is parameterized. R: Plot the gamma CDF for random variables with shape and rate parameters (1,1), (10,1), (1,10). Solution. \\[\\begin{align} p(x) &= \\frac{\\beta^1}{\\Gamma(1)} x^{1 - 1}e^{-\\beta x} \\\\ &= \\beta e^{-\\beta x} \\end{align}\\] This is the PDF of the exponential distribution with parameter \\(\\beta\\). \\[\\begin{align} p(x) &= \\frac{1}{\\Gamma(k)\\theta^k} x^{k - 1}e^{-\\frac{x}{\\theta}}.
\\end{align}\\] set.seed(1) ggplot(data = data.frame(x = seq(0, 25, by = 0.01)), aes(x = x)) + stat_function(fun = pgamma, args = list(shape = 1, rate = 1), aes(color = "Gamma(1,1)")) + stat_function(fun = pgamma, args = list(shape = 10, rate = 1), aes(color = "Gamma(10,1)")) + stat_function(fun = pgamma, args = list(shape = 1, rate = 10), aes(color = "Gamma(1,10)")) Exercise 4.12 (Normal random variable) A random variable with PDF \\[\\begin{equation} p(x) = \\frac{1}{\\sqrt{2 \\pi \\sigma^2}} e^{-\\frac{(x - \\mu)^2}{2 \\sigma^2}} \\end{equation}\\] is a normal random variable with support on the real axis and parameters \\(\\mu\\) in reals and \\(\\sigma^2 > 0\\). The first is the mean parameter and the second is the variance parameter. Many statistical methods assume a normal distribution. We denote \\[\\begin{equation} X | \\mu, \\sigma \\sim \\text{N}(\\mu, \\sigma^2), \\end{equation}\\] and it’s CDF is \\[\\begin{equation} F(x) = \\int_{-\\infty}^x \\frac{1}{\\sqrt{2 \\pi \\sigma^2}} e^{-\\frac{(t - \\mu)^2}{2 \\sigma^2}} dt, \\end{equation}\\] which is intractable and is usually approximated. Due to its flexibility it is also one of the most researched distributions. For that reason statisticians often use transformations of variables or approximate distributions with the normal distribution. Show that a variable \\(\\frac{X - \\mu}{\\sigma} \\sim \\text{N}(0,1)\\). This transformation is called standardization, and \\(\\text{N}(0,1)\\) is a standard normal distribution. R: Plot the normal distribution with \\(\\mu = 0\\) and different values for the \\(\\sigma\\) parameter. R: The normal distribution provides a good approximation for the Poisson distribution with a large \\(\\lambda\\). Let \\(X \\sim \\text{Poisson}(50)\\). Approximate \\(X\\) with the normal distribution and compare its density with the Poisson histogram. What are the values of \\(\\mu\\) and \\(\\sigma^2\\) that should provide the best approximation? Note that R function rnorm takes standard deviation (\\(\\sigma\\)) as a parameter and not variance. Solution. \\[\\begin{align} P(\\frac{X - \\mu}{\\sigma} < x) &= P(X < \\sigma x + \\mu) \\\\ &= F(\\sigma x + \\mu) \\\\ &= \\int_{-\\infty}^{\\sigma x + \\mu} \\frac{1}{\\sqrt{2 \\pi \\sigma^2}} e^{-\\frac{(t - \\mu)^2}{2\\sigma^2}} dt \\end{align}\\] Now let \\(s = f(t) = \\frac{t - \\mu}{\\sigma}\\), then \\(ds = \\frac{dt}{\\sigma}\\) and \\(f(\\sigma x + \\mu) = x\\), so \\[\\begin{align} P(\\frac{X - \\mu}{\\sigma} < x) &= \\int_{-\\infty}^{x} \\frac{1}{\\sqrt{2 \\pi}} e^{-\\frac{s^2}{2}} ds. \\end{align}\\] There is no need to evaluate this integral, as we recognize it as the CDF of a normal distribution with \\(\\mu = 0\\) and \\(\\sigma^2 = 1\\). 
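As a quick empirical check of a) (a supplementary sketch, not part of the original solution; the values of \\(\\mu\\) and \\(\\sigma\\) below are arbitrary), we can standardize simulated normal draws and compare their empirical quantiles with the quantiles of the standard normal distribution.
set.seed(1)
mu    <- 3   # arbitrary mean, chosen only for illustration
sigma <- 2   # arbitrary standard deviation
x     <- rnorm(100000, mean = mu, sd = sigma)
z     <- (x - mu) / sigma                  # standardization
probs <- c(0.1, 0.25, 0.5, 0.75, 0.9)
round(quantile(z, probs), 2)               # empirical quantiles of the standardized sample
round(qnorm(probs), 2)                     # quantiles of N(0,1); the two should roughly agree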
set.seed(1) # b ggplot(data = data.frame(x = seq(-15, 15, by = 0.01)), aes(x = x)) + stat_function(fun = dnorm, args = list(mean = 0, sd = 1), aes(color = "sd = 1")) + stat_function(fun = dnorm, args = list(mean = 0, sd = 0.4), aes(color = "sd = 0.1")) + stat_function(fun = dnorm, args = list(mean = 0, sd = 2), aes(color = "sd = 2")) + stat_function(fun = dnorm, args = list(mean = 0, sd = 5), aes(color = "sd = 5")) # c mean_par <- 50 nsamps <- 100000 pois_samps <- rpois(nsamps, lambda = mean_par) norm_samps <- rnorm(nsamps, mean = mean_par, sd = sqrt(mean_par)) my_plot <- ggplot() + geom_bar(data = tibble(x = pois_samps), aes(x = x, y = (..count..)/sum(..count..))) + geom_density(data = tibble(x = norm_samps), aes(x = x), color = "red") plot(my_plot) Exercise 4.13 (Logistic random variable) A logistic random variable has CDF \\[\\begin{equation} F(x) = \\frac{1}{1 + e^{-\\frac{x - \\mu}{s}}}, \\end{equation}\\] where \\(\\mu\\) is real and \\(s > 0\\). The support is on the real axis. We denote \\[\\begin{equation} X | \\mu, s \\sim \\text{Logistic}(\\mu, s). \\end{equation}\\] The distribution of the logistic random variable resembles a normal random variable, however it has heavier tails. Find the PDF of a logistic random variable. R: Implement logistic PDF and CDF and visually compare both for \\(X \\sim \\text{N}(0, 1)\\) and \\(Y \\sim \\text{logit}(0, \\sqrt{\\frac{3}{\\pi^2}})\\). These distributions have the same mean and variance. Additionally, plot the same plot on the interval [5,10], to better see the difference in the tails. R: For the distributions in b) find the probability \\(P(|X| > 4)\\) and interpret the result. Solution. \\[\\begin{align} p(x) &= \\frac{d}{dx} \\frac{1}{1 + e^{-\\frac{x - \\mu}{s}}} \\\\ &= \\frac{- \\frac{d}{dx} (1 + e^{-\\frac{x - \\mu}{s}})}{(1 + e{-\\frac{x - \\mu}{s}})^2} \\\\ &= \\frac{e^{-\\frac{x - \\mu}{s}}}{s(1 + e{-\\frac{x - \\mu}{s}})^2}. \\end{align}\\] # b set.seed(1) logit_pdf <- function (x, mu, s) { return ((exp(-(x - mu)/(s))) / (s * (1 + exp(-(x - mu)/(s)))^2)) } nl_plot <- ggplot(data = data.frame(x = seq(-12, 12, by = 0.01)), aes(x = x)) + stat_function(fun = dnorm, args = list(mean = 0, sd = 2), aes(color = "normal")) + stat_function(fun = logit_pdf, args = list(mu = 0, s = sqrt(12/pi^2)), aes(color = "logit")) plot(nl_plot) nl_plot <- ggplot(data = data.frame(x = seq(5, 10, by = 0.01)), aes(x = x)) + stat_function(fun = dnorm, args = list(mean = 0, sd = 2), aes(color = "normal")) + stat_function(fun = logit_pdf, args = list(mu = 0, s = sqrt(12/pi^2)), aes(color = "logit")) plot(nl_plot) # c logit_cdf <- function (x, mu, s) { return (1 / (1 + exp(-(x - mu) / s))) } p_logistic <- 1 - logit_cdf(4, 0, sqrt(12/pi^2)) + logit_cdf(-4, 0, sqrt(12/pi^2)) p_norm <- 1 - pnorm(4, 0, 2) + pnorm(-4, 0, 2) p_logistic ## [1] 0.05178347 p_norm ## [1] 0.04550026 # Logistic distribution has wider tails, therefore the probability of larger # absolute values is higher. 4.4 Singular random variables Exercise 4.14 (Cantor distribution) The Cantor set is a subset of \\([0,1]\\), which we create by iteratively deleting the middle third of the interval. For example, in the first iteration, we get the sets \\([0,\\frac{1}{3}]\\) and \\([\\frac{2}{3},1]\\). In the second iteration, we get \\([0,\\frac{1}{9}]\\), \\([\\frac{2}{9},\\frac{1}{3}]\\), \\([\\frac{2}{3}, \\frac{7}{9}]\\), and \\([\\frac{8}{9}, 1]\\). 
On the \\(n\\)-th iteration, we have \\[\\begin{equation} C_n = \\frac{C_{n-1}}{3} \\cup \\bigg(\\frac{2}{3} + \\frac{C_{n-1}}{3} \\bigg), \\end{equation}\\] where \\(C_0 = [0,1]\\). The Cantor set is then defined as the intersection of these sets \\[\\begin{equation} C = \\cap_{n=1}^{\\infty} C_n. \\end{equation}\\] It has the same cardinality as \\([0,1]\\). Another way to define the Cantor set is as the set of all numbers on \\([0,1]\\) that do not have a 1 in their ternary representation \\(x = \\sum_{i=1}^\\infty \\frac{x_i}{3^i}, x_i \\in \\{0,1,2\\}\\). A random variable follows the Cantor distribution if its CDF is the Cantor function (below). You can find implementations of the random number generator, CDF, and quantile functions for the Cantor distribution at https://github.com/Henrygb/CantorDist.R. Show that the Lebesgue measure of the Cantor set is 0. (Jagannathan) Let us look at an infinite sequence of independent fair-coin tosses. If the outcome of the \\(i\\)-th toss is heads, let \\(x_i = 2\\), and let \\(x_i = 0\\) when it is tails. Then use these to create \\(x = \\sum_{i=1}^\\infty \\frac{x_i}{3^i}\\). This is a random variable with the Cantor distribution. Show that \\(X\\) has a singular distribution. Solution. \\[\\begin{align} \\lambda(C) &= 1 - \\lambda(C^c) \\\\ &= 1 - \\frac{1}{3}\\sum_{k = 0}^\\infty (\\frac{2}{3})^k \\\\ &= 1 - \\frac{\\frac{1}{3}}{1 - \\frac{2}{3}} \\\\ &= 0. \\end{align}\\] First, for every \\(x\\), the probability of observing it is \\(\\lim_{n \\rightarrow \\infty} \\frac{1}{2^n} = 0\\). Second, the probability that we observe one of all the possible sequences is 1. Therefore \\(P(C) = 1\\). So this is a singular variable. The CDF only increments on the elements of the Cantor set. 4.5 Transformations Exercise 4.15 Let \\(X\\) be a random variable that is uniformly distributed on \\(\\{-2, -1, 0, 1, 2\\}\\). Find the PMF of \\(Y = X^2\\). Solution. \\[\\begin{align} P_Y(y) = \\sum_{x \\in \\{-\\sqrt{y}, \\sqrt{y}\\}} P_X(x) = \\begin{cases} 0 & y \\notin \\{0,1,4\\} \\\\ \\frac{1}{5} & y = 0 \\\\ \\frac{2}{5} & y \\in \\{1,4\\} \\end{cases} \\end{align}\\] Exercise 4.16 (Lognormal random variable) A lognormal random variable is a variable whose logarithm is normally distributed. In practice, we often encounter skewed data. Usually using a log transformation on such data makes it more symmetric and therefore more suitable for modeling with the normal distribution (more on why we wish to model data with the normal distribution in the following chapters). Let \\(X \\sim \\text{N}(\\mu,\\sigma)\\). Find the PDF of \\(Y: \\log(Y) = X\\). R: Sample from the lognormal distribution with parameters \\(\\mu = 5\\) and \\(\\sigma = 2\\). Plot a histogram of the samples. Then log-transform the samples and plot a histogram along with the theoretical normal PDF. Solution. \\[\\begin{align} p_Y(y) &= p_X(\\log(y)) \\frac{d}{dy} \\log(y) \\\\ &= \\frac{1}{\\sqrt{2 \\pi \\sigma^2}} e^{-\\frac{(\\log(y) - \\mu)^2}{2 \\sigma^2}} \\frac{1}{y} \\\\ &= \\frac{1}{y \\sqrt{2 \\pi \\sigma^2}} e^{-\\frac{(\\log(y) - \\mu)^2}{2 \\sigma^2}}.
\\end{align}\\] set.seed(1) nsamps <- 10000 mu <- 0.5 sigma <- 0.4 ln_samps <- rlnorm(nsamps, mu, sigma) ln_plot <- ggplot(data = data.frame(x = ln_samps), aes(x = x)) + geom_histogram(color = "black") plot(ln_plot) norm_samps <- log(ln_samps) n_plot <- ggplot(data = data.frame(x = norm_samps), aes(x = x)) + geom_histogram(aes(y = ..density..), color = "black") + stat_function(fun = dnorm, args = list(mean = mu, sd = sigma), color = "red") plot(n_plot) Exercise 4.17 (Probability integral transform) This exercise is borrowed from Wasserman. Let \\(X\\) have a continuous, strictly increasing CDF \\(F\\). Let \\(Y = F(X)\\). Find the density of \\(Y\\). This is called the probability integral transform. Let \\(U \\sim \\text{Uniform}(0,1)\\) and let \\(X = F^{-1}(U)\\). Show that \\(X \\sim F\\). R: Implement a program that takes Uniform(0,1) random variables and generates random variables from an exponential(\\(\\beta\\)) distribution. Compare your implemented function with function rexp in R. Solution. \\[\\begin{align} F_Y(y) &= P(Y < y) \\\\ &= P(F(X) < y) \\\\ &= P(X < F_X^{-1}(y)) \\\\ &= F_X(F_X^{-1}(y)) \\\\ &= y. \\end{align}\\] From the above it follows that \\(p(y) = 1\\). Note that we need to know the inverse CDF to be able to apply this procedure. \\[\\begin{align} P(X < x) &= P(F^{-1}(U) < x) \\\\ &= P(U < F(x)) \\\\ &= F_U(F(x)) \\\\ &= F(x). \\end{align}\\] set.seed(1) nsamps <- 10000 beta <- 4 generate_exp <- function (n, beta) { tmp <- runif(n) X <- qexp(tmp, beta) return (X) } df <- tibble("R" = rexp(nsamps, beta), "myGenerator" = generate_exp(nsamps, beta)) %>% gather() ggplot(data = df, aes(x = value, fill = key)) + geom_histogram(position = "dodge") "],["mrvs.html", "Chapter 5 Multiple random variables 5.1 General 5.2 Bivariate distribution examples 5.3 Transformations", " Chapter 5 Multiple random variables This chapter deals with multiple random variables and their distributions. The students are expected to acquire the following knowledge: Theoretical Calculation of PDF of transformed multiple random variables. Finding marginal and conditional distributions. R Scatterplots of bivariate random variables. New R functions (for example, expand.grid). .fold-btn { float: right; margin: 5px 5px 0 0; } .fold { border: 1px solid black; min-height: 40px; } 5.1 General Exercise 5.1 Let \\(X \\sim \\text{N}(0,1)\\) and \\(Y \\sim \\text{N}(0,1)\\) be independent random variables. Draw 1000 samples from \\((X,Y)\\) and plot a scatterplot. Now let \\(X \\sim \\text{N}(0,1)\\) and \\(Y | X = x \\sim N(ax, 1)\\). Draw 1000 samples from \\((X,Y)\\) for \\(a = 1\\), \\(a=0\\), and \\(a=-0.5\\). Plot the scatterplots. How would you interpret parameter \\(a\\)? Plot the marginal distribution of \\(Y\\) for cases \\(a=1\\), \\(a=0\\), and \\(a=-0.5\\). Can you guess which distribution it is? set.seed(1) nsamps <- 1000 x <- rnorm(nsamps) y <- rnorm(nsamps) ggplot(data.frame(x, y), aes(x = x, y = y)) + geom_point() y1 <- rnorm(nsamps, mean = 1 * x) y2 <- rnorm(nsamps, mean = 0 * x) y3 <- rnorm(nsamps, mean = -0.5 * x) df <- tibble(x = c(x,x,x), y = c(y1,y2,y3), a = c(rep(1, nsamps), rep(0, nsamps), rep(-0.5, nsamps))) ggplot(df, aes(x = x, y = y)) + geom_point() + facet_wrap(~a) + coord_equal(ratio=1) # Parameter a controls the scale of linear dependency between X and Y. 
ggplot(df, aes(x = y)) + geom_density() + facet_wrap(~a) 5.2 Bivariate distribution examples Exercise 5.2 (Discrete bivariate random variable) Let \\(X\\) represent the event that a die rolls an even number and let \\(Y\\) represent the event that a die rolls one, two, or a three. Find the marginal distributions of \\(X\\) and \\(Y\\). Find the PMF of \\((X,Y)\\). Find the CDF of \\((X,Y)\\). Find \\(P(X = 1 | Y = 1)\\). Solution. \\[\\begin{align} P(X = 1) = \\frac{1}{2} \\text{ and } P(X = 0) = \\frac{1}{2} \\\\ P(Y = 1) = \\frac{1}{2} \\text{ and } P(Y = 0) = \\frac{1}{2} \\\\ \\end{align}\\] \\[\\begin{align} P(X = 1, Y = 1) = \\frac{1}{6} \\\\ P(X = 1, Y = 0) = \\frac{2}{6} \\\\ P(X = 0, Y = 1) = \\frac{2}{6} \\\\ P(X = 0, Y = 0) = \\frac{1}{6} \\end{align}\\] \\[\\begin{align} P(X \\leq x, Y \\leq y) = \\begin{cases} \\frac{1}{6} & x = 0, y = 0 \\\\ \\frac{3}{6} & x \\neq y \\\\ 1 & x = 1, y = 1 \\end{cases} \\end{align}\\] \\[\\begin{align} P(X = 1 | Y = 1) = \\frac{1}{3} \\end{align}\\] Exercise 5.3 (Continuous bivariate random variable) Let \\(p(x,y) = 6 (x - y)^2\\) be the PDF of a bivariate random variable \\((X,Y)\\), where both variables range from zero to one. Find CDF. Find marginal distributions. Find conditional distributions. R: Plot a grid of points and colour them by value – this can help us visualize the PDF. R: Implement a random number generator, which will generate numbers from \\((X,Y)\\) and visually check the results. R: Plot the marginal distribution of \\(Y\\) and the conditional distributions of \\(X | Y = y\\), where \\(y \\in \\{0, 0.1, 0.5\\}\\). Solution. \\[\\begin{align} F(x,y) &= \\int_0^{x} \\int_0^{y} 6 (t - s)^2 ds dt\\\\ &= 6 \\int_0^{x} \\int_0^{y} t^2 - 2ts + s^2 ds dt\\\\ &= 6 \\int_0^{x} t^2y - ty^2 + \\frac{y^3}{3} dt \\\\ &= 6 (\\frac{x^3 y}{3} - \\frac{x^2y^2}{2} + \\frac{x y^3}{3}) \\\\ &= 2 x^3 y - 3 t^2y^2 + 2 x y^3 \\end{align}\\] \\[\\begin{align} p(x) &= \\int_0^{1} 6 (x - y)^2 dy\\\\ &= 6 (x^2 - x + \\frac{1}{3}) \\\\ &= 6x^2 - 6x + 2 \\end{align}\\] \\[\\begin{align} p(y) &= \\int_0^{1} 6 (x - y)^2 dx\\\\ &= 6 (y^2 - y + \\frac{1}{3}) \\\\ &= 6y^2 - 6y + 2 \\end{align}\\] \\[\\begin{align} p(x|y) &= \\frac{p(xy)}{p(y)} \\\\ &= \\frac{6 (x - y)^2}{6 (y^2 - y + \\frac{1}{3})} \\\\ &= \\frac{(x - y)^2}{y^2 - y + \\frac{1}{3}} \\end{align}\\] \\[\\begin{align} p(y|x) &= \\frac{p(xy)}{p(x)} \\\\ &= \\frac{6 (x - y)^2}{6 (x^2 - x + \\frac{1}{3})} \\\\ &= \\frac{(x - y)^2}{x^2 - x + \\frac{1}{3}} \\end{align}\\] set.seed(1) # d pxy <- function (x, y) { return ((x - y)^2) } x_axis <- seq(0, 1, length.out = 100) y_axis <- seq(0, 1, length.out = 100) df <- expand.grid(x_axis, y_axis) colnames(df) <- c("x", "y") df <- cbind(df, pdf = pxy(df$x, df$y)) ggplot(data = df, aes(x = x, y = y, color = pdf)) + geom_point() # e samps <- NULL for (i in 1:10000) { xt <- runif(1, 0, 1) yt <- runif(1, 0, 1) pdft <- pxy(xt, yt) acc <- runif(1, 0, 6) if (acc <= pdft) { samps <- rbind(samps, c(xt, yt)) } } colnames(samps) <- c("x", "y") ggplot(data = as.data.frame(samps), aes(x = x, y = y)) + geom_point() # f mar_pdf <- function (x) { return (6 * x^2 - 6 * x + 2) } cond_pdf <- function (x, y) { return (((x - y)^2) / (y^2 - y + 1/3)) } df <- tibble(x = x_axis, mar = mar_pdf(x), y0 = cond_pdf(x, 0), y0.1 = cond_pdf(x, 0.1), y0.5 = cond_pdf(x, 0.5)) %>% gather(dist, value, -x) ggplot(df, aes(x = x, y = value, color = dist)) + geom_line() Exercise 5.4 (Mixed bivariate random variable) Let \\(f(x,y) = \\frac{\\beta^{\\alpha}}{\\Gamma(\\alpha)y!} x^{y+ \\alpha -1} 
e^{-x(1 + \\beta)}\\) be the PDF of a bivariate random variable, where \\(x \\in (0, \\infty)\\) and \\(y \\in \\mathbb{N}_0\\). Find the marginal distribution of \\(X\\). Do you recognize this distribution? Find the conditional distribution of \\(Y | X\\). Do you recognize this distribution? Calculate the probability \\(P(Y = 2 | X = 2.5)\\) for \\((X,Y)\\). Find the marginal distribution of \\(Y\\). Do you recognize this distribution? R: Take 1000 random samples from \\((X,Y)\\) with parameters \\(\\beta = 1\\) and \\(\\alpha = 1\\). Plot a scatterplot. Plot a bar plot of the marginal distribution of \\(Y\\), and the theoretical PMF calculated from d) on the range from 0 to 10. Hint: Use the gamma function in R.? Solution. \\[\\begin{align} p(x) &= \\sum_{k = 0}^{\\infty} \\frac{\\beta^{\\alpha}}{\\Gamma(\\alpha)k!} x^{k + \\alpha -1} e^{-x(1 + \\beta)} & \\\\ &= \\sum_{k = 0}^{\\infty} \\frac{\\beta^{\\alpha}}{\\Gamma(\\alpha)k!} x^{k} x^{\\alpha -1} e^{-x} e^{-\\beta x} & \\\\ &= \\frac{\\beta^{\\alpha}}{\\Gamma(\\alpha)} x^{\\alpha -1} e^{-\\beta x} \\sum_{k = 0}^{\\infty} \\frac{1}{k!} x^{k} e^{-x} & \\\\ &= \\frac{\\beta^{\\alpha}}{\\Gamma(\\alpha)} x^{\\alpha -1} e^{-\\beta x} & \\text{the last term above sums to one} \\end{align}\\] This is the Gamma PDF. \\[\\begin{align} p(y|x) &= \\frac{p(x,y)}{p(x)} \\\\ &= \\frac{\\frac{\\beta^{\\alpha}}{\\Gamma(\\alpha)y!} x^{y+ \\alpha -1} e^{-x(1 + \\beta)}}{\\frac{\\beta^{\\alpha}}{\\Gamma(\\alpha)} x^{\\alpha -1} e^{-\\beta x}} \\\\ &= \\frac{x^y e^{-x}}{y!}. \\end{align}\\] This is the Poisson PMF. \\[\\begin{align} P(Y = 2 | X = 2.5) = \\frac{2.5^2 e^{-2.5}}{2!} \\approx 0.26. \\end{align}\\] \\[\\begin{align} p(y) &= \\int_{0}^{\\infty} \\frac{\\beta^{\\alpha}}{\\Gamma(\\alpha)y!} x^{y + \\alpha -1} e^{-x(1 + \\beta)} dx & \\\\ &= \\frac{1}{y!} \\int_{0}^{\\infty} \\frac{\\beta^{\\alpha}}{\\Gamma(\\alpha)} x^{(y + \\alpha) -1} e^{-(1 + \\beta)x} dx & \\\\ &= \\frac{1}{y!} \\frac{\\beta^{\\alpha}}{\\Gamma(\\alpha)} \\int_{0}^{\\infty} \\frac{\\Gamma(y + \\alpha)}{(1 + \\beta)^{y + \\alpha}} \\frac{(1 + \\beta)^{y + \\alpha}}{\\Gamma(y + \\alpha)} x^{(y + \\alpha) -1} e^{-(1 + \\beta)x} dx & \\text{complete to Gamma PDF} \\\\ &= \\frac{1}{y!} \\frac{\\beta^{\\alpha}}{\\Gamma(\\alpha)} \\frac{\\Gamma(y + \\alpha)}{(1 + \\beta)^{y + \\alpha}}. \\end{align}\\] We add the terms in the third equality to get a Gamma PDF inside the integral, which then integrates to one. We do not recognize this distribution. set.seed(1) px <- function (x, alpha, beta) { return((1 / factorial(x)) * (beta^alpha / gamma(alpha)) * (gamma(x + alpha) / (1 + beta)^(x + alpha))) } nsamps <- 1000 rx <- rgamma(nsamps, 1, 1) ryx <- rpois(nsamps, rx) ggplot(data = data.frame(x = rx, y = ryx), aes(x = x, y = y)) + geom_point() ggplot(data = data.frame(x = rx, y = ryx), aes(x = y)) + geom_bar(aes(y = (..count..)/sum(..count..))) + stat_function(fun = px, args = list(alpha = 1, beta = 1), color = "red") Exercise 5.5 Let \\(f(x,y) = cx^2y\\) for \\(x^2 \\leq y \\leq 1\\) and zero otherwise. Find such \\(c\\) that \\(f\\) is a PDF of a bivariate random variable. This exercise is borrowed from Wasserman. Solution. \\[\\begin{align} 1 &= \\int_{-1}^{1} \\int_{x^2}^1 cx^2y dy dx \\\\ &= \\int_{-1}^{1} cx^2 (\\frac{1}{2} - \\frac{x^4}{2}) dx \\\\ &= \\frac{c}{2} \\int_{-1}^{1} x^2 - x^6 dx \\\\ &= \\frac{c}{2} (\\frac{1}{3} + \\frac{1}{3} - \\frac{1}{7} - \\frac{1}{7}) \\\\ &= \\frac{c}{2} \\frac{8}{21} \\\\ &= \\frac{4c}{21} \\end{align}\\] It follows \\(c = \\frac{21}{4}\\). 
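As a rough numerical check of the constant (a supplementary sketch, not part of the original exercise), we can estimate the integral of \\(x^2 y\\) over the region \\(x^2 \\leq y \\leq 1\\) by Monte Carlo sampling on the bounding box \\([-1,1] \\times [0,1]\\) and compare the reciprocal of the estimate with \\(\\frac{21}{4}\\).
set.seed(1)
n <- 100000
x <- runif(n, -1, 1)                # sample uniformly on the box [-1,1] x [0,1]
y <- runif(n, 0, 1)
inside <- y >= x^2                  # indicator of the support x^2 <= y <= 1
est <- 2 * mean(x^2 * y * inside)   # box area is 2, so this estimates the integral
est                                 # should be close to 4/21 (about 0.19)
1 / est                             # should be close to 21/4 = 5.25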
5.3 Transformations Exercise 5.6 Let \\((X,Y,Z)\\) be uniformly distributed on the unit ball \\(\\{(x,y,z) : x^2 + y^2 + z^2 \\leq 1\\}\\). Let \\(R = \\sqrt{X^2 + Y^2 + Z^2}\\). Find the CDF and PDF of \\(R\\). Solution. \\[\\begin{align} P(R < r) &= P(\\sqrt{X^2 + Y^2 + Z^2} < r) \\\\ &= P(X^2 + Y^2 + Z^2 < r^2) \\\\ &= \\frac{\\frac{4}{3} \\pi r^3}{\\frac{4}{3}\\pi} \\\\ &= r^3. \\end{align}\\] The second line shows us that the probability corresponds to the volume of a smaller ball with radius \\(r\\). To get the probability, we divide this volume by the volume of the whole unit ball. We get the PDF by differentiating the CDF, so \\(p(r) = 3r^2\\). "],["integ.html", "Chapter 6 Integration 6.1 Monte Carlo integration 6.2 Lebesgue integrals", " Chapter 6 Integration This chapter deals with abstract and Monte Carlo integration. The students are expected to acquire the following knowledge: Theoretical How to calculate Lebesgue integrals for non-simple functions. R Monte Carlo integration. .fold-btn { float: right; margin: 5px 5px 0 0; } .fold { border: 1px solid black; min-height: 40px; } 6.1 Monte Carlo integration Exercise 6.1 Let \\(X\\) and \\(Y\\) be continuous random variables on the unit interval and \\(p(x,y) = 6(x - y)^2\\). Use Monte Carlo integration to estimate the probability \\(P(0.2 \\leq X \\leq 0.5, \\: 0.1 \\leq Y \\leq 0.2)\\). Can you find the exact value? set.seed(1) nsamps <- 1000 V <- (0.5 - 0.2) * (0.2 - 0.1) x1 <- runif(nsamps, 0.2, 0.5) x2 <- runif(nsamps, 0.1, 0.2) f_1 <- function (x, y) { return (6 * (x - y)^2) } mcint <- V * (1 / nsamps) * sum(f_1(x1, x2)) sdm <- sqrt((V^2 / nsamps) * var(f_1(x1, x2))) mcint ## [1] 0.008793445 sdm ## [1] 0.0002197686 F_1 <- function (x, y) { return (2 * x^3 * y - 3 * x^2 * y^2 + 2 * x * y^3) } F_1(0.5, 0.2) - F_1(0.2, 0.2) - F_1(0.5, 0.1) + F_1(0.2, 0.1) ## [1] 0.0087 6.2 Lebesgue integrals Exercise 6.2 (borrowed from Jagannathan) Find the Lebesgue integral of the following functions on (\\(\\mathbb{R}\\), \\(\\mathcal{B}(\\mathbb{R})\\), \\(\\lambda\\)). \\[\\begin{align} f(\\omega) = \\begin{cases} \\omega, & \\text{for } \\omega = 0,1,...,n \\\\ 0, & \\text{elsewhere} \\end{cases} \\end{align}\\] \\[\\begin{align} f(\\omega) = \\begin{cases} 1, & \\text{for } \\omega \\in \\mathbb{Q}^c \\cap [0,1] \\\\ 0, & \\text{elsewhere} \\end{cases} \\end{align}\\] \\[\\begin{align} f(\\omega) = \\begin{cases} n, & \\text{for } \\omega \\in \\mathbb{Q}^c \\cap [0,n] \\\\ 0, & \\text{elsewhere} \\end{cases} \\end{align}\\] Solution. \\[\\begin{align} \\int f(\\omega) d\\lambda = \\sum_{\\omega = 0}^n \\omega \\lambda(\\omega) = 0. \\end{align}\\] \\[\\begin{align} \\int f(\\omega) d\\lambda = 1 \\times \\lambda(\\mathbb{Q}^c \\cap [0,1]) = 1. \\end{align}\\] \\[\\begin{align} \\int f(\\omega) d\\lambda = n \\times \\lambda(\\mathbb{Q}^c \\cap [0,n]) = n^2. \\end{align}\\] Exercise 6.3 (borrowed from Jagannathan) Let \\(c \\in \\mathbb{R}\\) be fixed and (\\(\\mathbb{R}\\), \\(\\mathcal{B}(\\mathbb{R})\\)) a measurable space. If for any Borel set \\(A\\), \\(\\delta_c (A) = 1\\) if \\(c \\in A\\), and \\(\\delta_c (A) = 0\\) otherwise, then \\(\\delta_c\\) is called a Dirac measure. Let \\(g\\) be a non-negative, measurable function. Show that \\(\\int g d \\delta_c = g(c)\\). Solution.
\\[\\begin{align} \\int g d \\delta_c &= \\sup_{q \\in S(g)} \\int q d \\delta_c \\\\ &= \\sup_{q \\in S(g)} \\sum_{i = 1}^n a_i \\delta_c(A_i) \\\\ &= \\sup_{q \\in S(g)} \\sum_{i = 1}^n a_i \\text{I}_{A_i}(c) \\\\ &= \\sup_{q \\in S(g)} q(c) \\\\ &= g(c) \\end{align}\\] "],["ev.html", "Chapter 7 Expected value 7.1 Discrete random variables 7.2 Continuous random variables 7.3 Sums, functions, conditional expectations 7.4 Covariance", " Chapter 7 Expected value This chapter deals with expected values of random variables. The students are expected to acquire the following knowledge: Theoretical Calculation of the expected value. Calculation of variance and covariance. Cauchy distribution. R Estimation of expected value. Estimation of variance and covariance. .fold-btn { float: right; margin: 5px 5px 0 0; } .fold { border: 1px solid black; min-height: 40px; } 7.1 Discrete random variables Exercise 7.1 (Bernoulli) Let \\(X \\sim \\text{Bernoulli}(p)\\). Find \\(E[X]\\). Find \\(Var[X]\\). R: Let \\(p = 0.4\\). Check your answers to a) and b) with a simulation. Solution. \\[\\begin{align*} E[X] = \\sum_{k=0}^1 p^k (1-p)^{1-k} k = p. \\end{align*}\\] \\[\\begin{align*} Var[X] = E[X^2] - E[X]^2 = \\sum_{k=0}^1 (p^k (1-p)^{1-k} k^2) - p^2 = p(1-p). \\end{align*}\\] set.seed(1) nsamps <- 1000 x <- rbinom(nsamps, 1, 0.4) mean(x) ## [1] 0.394 var(x) ## [1] 0.239003 0.4 * (1 - 0.4) ## [1] 0.24 Exercise 7.2 (Binomial) Let \\(X \\sim \\text{Binomial}(n,p)\\). Find \\(E[X]\\). Find \\(Var[X]\\). Solution. Let \\(X = \\sum_{i=0}^n X_i\\), where \\(X_i \\sim \\text{Bernoulli}(p)\\). Then, due to linearity of expectation \\[\\begin{align*} E[X] = E[\\sum_{i=0}^n X_i] = \\sum_{i=0}^n E[X_i] = np. \\end{align*}\\] Again let \\(X = \\sum_{i=0}^n X_i\\), where \\(X_i \\sim \\text{Bernoulli}(p)\\). Since the Bernoulli variables \\(X_i\\) are independent we have \\[\\begin{align*} Var[X] = Var[\\sum_{i=0}^n X_i] = \\sum_{i=0}^n Var[X_i] = np(1-p). \\end{align*}\\] Exercise 7.3 (Poisson) Let \\(X \\sim \\text{Poisson}(\\lambda)\\). Find \\(E[X]\\). Find \\(Var[X]\\). Solution. \\[\\begin{align*} E[X] &= \\sum_{k=0}^\\infty \\frac{\\lambda^k e^{-\\lambda}}{k!} k & \\\\ &= \\sum_{k=1}^\\infty \\frac{\\lambda^k e^{-\\lambda}}{k!} k & \\text{term at $k=0$ is 0} \\\\ &= e^{-\\lambda} \\lambda \\sum_{k=1}^\\infty \\frac{\\lambda^{k-1}}{(k - 1)!} & \\\\ &= e^{-\\lambda} \\lambda \\sum_{k=0}^\\infty \\frac{\\lambda^{k}}{k!} & \\\\ &= e^{-\\lambda} \\lambda e^\\lambda & \\\\ &= \\lambda. \\end{align*}\\] \\[\\begin{align*} Var[X] &= E[X^2] - E[X]^2 & \\\\ &= e^{-\\lambda} \\lambda \\sum_{k=1}^\\infty k \\frac{\\lambda^{k-1}}{(k - 1)!} - \\lambda^2 & \\\\ &= e^{-\\lambda} \\lambda \\sum_{k=1}^\\infty (k - 1) + 1) \\frac{\\lambda^{k-1}}{(k - 1)!} - \\lambda^2 & \\\\ &= e^{-\\lambda} \\lambda \\big(\\sum_{k=1}^\\infty (k - 1) \\frac{\\lambda^{k-1}}{(k - 1)!} + \\sum_{k=1}^\\infty \\frac{\\lambda^{k-1}}{(k - 1)!}\\Big) - \\lambda^2 & \\\\ &= e^{-\\lambda} \\lambda \\big(\\lambda\\sum_{k=2}^\\infty \\frac{\\lambda^{k-2}}{(k - 2)!} + e^\\lambda\\Big) - \\lambda^2 & \\\\ &= e^{-\\lambda} \\lambda \\big(\\lambda e^\\lambda + e^\\lambda\\Big) - \\lambda^2 & \\\\ &= \\lambda^2 + \\lambda - \\lambda^2 & \\\\ &= \\lambda. \\end{align*}\\] Exercise 7.4 (Geometric) Let \\(X \\sim \\text{Geometric}(p)\\). Find \\(E[X]\\). Hint: \\(\\frac{d}{dx} x^k = k x^{(k - 1)}\\). Solution. 
\\[\\begin{align*} E[X] &= \\sum_{k=0}^\\infty (1 - p)^k p k & \\\\ &= p (1 - p) \\sum_{k=0}^\\infty (1 - p)^{k-1} k & \\\\ &= p (1 - p) \\sum_{k=0}^\\infty -\\frac{d}{dp}(1 - p)^k & \\\\ &= p (1 - p) \\Big(-\\frac{d}{dp}\\Big) \\sum_{k=0}^\\infty (1 - p)^k & \\\\ &= p (1 - p) \\Big(-\\frac{d}{dp}\\Big) \\frac{1}{1 - (1 - p)} & \\text{geometric series} \\\\ &= \\frac{1 - p}{p} \\end{align*}\\] 7.2 Continuous random variables Exercise 7.5 (Gamma) Let \\(X \\sim \\text{Gamma}(\\alpha, \\beta)\\). Hint: \\(\\Gamma(z) = \\int_0^\\infty t^{z-1}e^{-t} dt\\) and \\(\\Gamma(z + 1) = z \\Gamma(z)\\). Find \\(E[X]\\). Find \\(Var[X]\\). R: Let \\(\\alpha = 10\\) and \\(\\beta = 2\\). Plot the density of \\(X\\). Add a horizontal line at the expected value that touches the density curve (geom_segment). Shade the area within a standard deviation of the expected value. Solution. \\[\\begin{align*} E[X] &= \\int_0^\\infty \\frac{\\beta^\\alpha}{\\Gamma(\\alpha)}x^\\alpha e^{-\\beta x} dx & \\\\ &= \\frac{\\beta^\\alpha}{\\Gamma(\\alpha)} \\int_0^\\infty x^\\alpha e^{-\\beta x} dx & \\text{ (let $t = \\beta x$)} \\\\ &= \\frac{\\beta^\\alpha}{\\Gamma(\\alpha) }\\int_0^\\infty \\frac{t^\\alpha}{\\beta^\\alpha} e^{-t} \\frac{dt}{\\beta} & \\\\ &= \\frac{1}{\\beta \\Gamma(\\alpha) }\\int_0^\\infty t^\\alpha e^{-t} dt & \\\\ &= \\frac{\\Gamma(\\alpha + 1)}{\\beta \\Gamma(\\alpha)} & \\\\ &= \\frac{\\alpha \\Gamma(\\alpha)}{\\beta \\Gamma(\\alpha)} & \\\\ &= \\frac{\\alpha}{\\beta}. & \\end{align*}\\] \\[\\begin{align*} Var[X] &= E[X^2] - E[X]^2 \\\\ &= \\int_0^\\infty \\frac{\\beta^\\alpha}{\\Gamma(\\alpha)}x^{\\alpha+1} e^{-\\beta x} dx - \\frac{\\alpha^2}{\\beta^2} \\\\ &= \\frac{\\Gamma(\\alpha + 2)}{\\beta^2 \\Gamma(\\alpha)} - \\frac{\\alpha^2}{\\beta^2} \\\\ &= \\frac{(\\alpha + 1)\\alpha\\Gamma(\\alpha)}{\\beta^2 \\Gamma(\\alpha)} - \\frac{\\alpha^2}{\\beta^2} \\\\ &= \\frac{\\alpha^2 + \\alpha}{\\beta^2} - \\frac{\\alpha^2}{\\beta^2} \\\\ &= \\frac{\\alpha}{\\beta^2}. \\end{align*}\\] set.seed(1) x <- seq(0, 25, by = 0.01) y <- dgamma(x, shape = 10, rate = 2) df <- data.frame(x = x, y = y) ggplot(df, aes(x = x, y = y)) + geom_line() + geom_segment(aes(x = 5, y = 0, xend = 5, yend = dgamma(5, shape = 10, rate = 2)), color = "red") + stat_function(fun = dgamma, args = list(shape = 10, rate = 2), xlim = c(5 - sqrt(10/4), 5 + sqrt(10/4)), geom = "area", fill = "gray", alpha = 0.4) Exercise 7.6 (Beta) Let \\(X \\sim \\text{Beta}(\\alpha, \\beta)\\). Find \\(E[X]\\). Hint 1: \\(\\text{B}(x,y) = \\int_0^1 t^{x-1} (1 - t)^{y-1} dt\\). Hint 2: \\(\\text{B}(x + 1, y) = \\text{B}(x,y)\\frac{x}{x + y}\\). Find \\(Var[X]\\). Solution. \\[\\begin{align*} E[X] &= \\int_0^1 \\frac{x^{\\alpha - 1} (1 - x)^{\\beta - 1}}{\\text{B}(\\alpha, \\beta)} x dx \\\\ &= \\frac{1}{\\text{B}(\\alpha, \\beta)}\\int_0^1 x^{\\alpha} (1 - x)^{\\beta - 1} dx \\\\ &= \\frac{1}{\\text{B}(\\alpha, \\beta)} \\text{B}(\\alpha + 1, \\beta) \\\\ &= \\frac{1}{\\text{B}(\\alpha, \\beta)} \\text{B}(\\alpha, \\beta) \\frac{\\alpha}{\\alpha + \\beta} \\\\ &= \\frac{\\alpha}{\\alpha + \\beta}. 
\\\\ \\end{align*}\\] \\[\\begin{align*} Var[X] &= E[X^2] - E[X]^2 \\\\ &= \\int_0^1 \\frac{x^{\\alpha - 1} (1 - x)^{\\beta - 1}}{\\text{B}(\\alpha, \\beta)} x^2 dx - \\frac{\\alpha^2}{(\\alpha + \\beta)^2} \\\\ &= \\frac{1}{\\text{B}(\\alpha, \\beta)}\\int_0^1 x^{\\alpha + 1} (1 - x)^{\\beta - 1} dx - \\frac{\\alpha^2}{(\\alpha + \\beta)^2} \\\\ &= \\frac{1}{\\text{B}(\\alpha, \\beta)} \\text{B}(\\alpha + 2, \\beta) - \\frac{\\alpha^2}{(\\alpha + \\beta)^2} \\\\ &= \\frac{1}{\\text{B}(\\alpha, \\beta)} \\text{B}(\\alpha + 1, \\beta) \\frac{\\alpha + 1}{\\alpha + \\beta + 1} - \\frac{\\alpha^2}{(\\alpha + \\beta)^2} \\\\ &= \\frac{\\alpha + 1}{\\alpha + \\beta + 1} \\frac{\\alpha}{\\alpha + \\beta} - \\frac{\\alpha^2}{(\\alpha + \\beta)^2}\\\\ &= \\frac{\\alpha \\beta}{(\\alpha + \\beta)^2(\\alpha + \\beta + 1)}. \\end{align*}\\] Exercise 7.7 (Exponential) Let \\(X \\sim \\text{Exp}(\\lambda)\\). Find \\(E[X]\\). Hint: \\(\\Gamma(z + 1) = z\\Gamma(z)\\) and \\(\\Gamma(1) = 1\\). Find \\(Var[X]\\). Solution. \\[\\begin{align*} E[X] &= \\int_0^\\infty \\lambda e^{-\\lambda x} x dx & \\\\ &= \\lambda \\int_0^\\infty x e^{-\\lambda x} dx & \\\\ &= \\lambda \\int_0^\\infty \\frac{t}{\\lambda} e^{-t} \\frac{dt}{\\lambda} & \\text{$t = \\lambda x$}\\\\ &= \\lambda \\lambda^{-2} \\Gamma(2) & \\text{definition of gamma function} \\\\ &= \\lambda^{-1}. \\end{align*}\\] \\[\\begin{align*} Var[X] &= E[X^2] - E[X]^2 & \\\\ &= \\int_0^\\infty \\lambda e^{-\\lambda x} x^2 dx - \\lambda^{-2} & \\\\ &= \\lambda \\int_0^\\infty \\frac{t^2}{\\lambda^2} e^{-t} \\frac{dt}{\\lambda} - \\lambda^{-2} & \\text{$t = \\lambda x$} \\\\ &= \\lambda \\lambda^{-3} \\Gamma(3) - \\lambda^{-2} & \\text{definition of gamma function} & \\\\ &= \\lambda^{-2} 2 \\Gamma(2) - \\lambda^{-2} & \\\\ &= 2 \\lambda^{-2} - \\lambda^{-2} & \\\\ &= \\lambda^{-2}. & \\\\ \\end{align*}\\] Exercise 7.8 (Normal) Let \\(X \\sim \\text{N}(\\mu, \\sigma)\\). Show that \\(E[X] = \\mu\\). Hint: Use the error function \\(\\text{erf}(x) = \\frac{1}{\\sqrt(\\pi)} \\int_{-x}^x e^{-t^2} dt\\). The statistical interpretation of this function is that if \\(Y \\sim \\text{N}(0, 0.5)\\), then the error function describes the probability of \\(Y\\) falling between \\(-x\\) and \\(x\\). Also, \\(\\text{erf}(\\infty) = 1\\). Show that \\(Var[X] = \\sigma^2\\). Hint: Start with the definition of variance. Solution. \\[\\begin{align*} E[X] &= \\int_{-\\infty}^\\infty \\frac{1}{\\sqrt{2\\pi \\sigma^2}} e^{-\\frac{(x - \\mu)^2}{2\\sigma^2}} x dx & \\\\ &= \\frac{1}{\\sqrt{2\\pi \\sigma^2}} \\int_{-\\infty}^\\infty x e^{-\\frac{(x - \\mu)^2}{2\\sigma^2}} dx & \\\\ &= \\frac{1}{\\sqrt{2\\pi \\sigma^2}} \\int_{-\\infty}^\\infty \\Big(t \\sqrt{2\\sigma^2} + \\mu\\Big)e^{-t^2} \\sqrt{2 \\sigma^2} dt & t = \\frac{x - \\mu}{\\sqrt{2}\\sigma} \\\\ &= \\frac{\\sqrt{2\\sigma^2}}{\\sqrt{\\pi}} \\int_{-\\infty}^\\infty t e^{-t^2} dt + \\frac{1}{\\sqrt{\\pi}} \\int_{-\\infty}^\\infty \\mu e^{-t^2} dt & \\\\ \\end{align*}\\] Let us calculate these integrals separately. \\[\\begin{align*} \\int t e^{-t^2} dt &= -\\frac{1}{2}\\int e^{s} ds & s = -t^2 \\\\ &= -\\frac{e^s}{2} + C \\\\ &= -\\frac{e^{-t^2}}{2} + C & \\text{undoing substitution}. \\end{align*}\\] Inserting the integration limits we get \\[\\begin{align*} \\int_{-\\infty}^\\infty t e^{-t^2} dt &= 0, \\end{align*}\\] due to the integrated function being symmetric. 
Reordering the second integral we get \\[\\begin{align*} \\mu \\frac{1}{\\sqrt{\\pi}} \\int_{-\\infty}^\\infty e^{-t^2} dt &= \\mu \\text{erf}(\\infty) & \\text{definition of error function} \\\\ &= \\mu & \\text{probability of $Y$ falling between $-\\infty$ and $\\infty$}. \\end{align*}\\] Combining all of the above we get \\[\\begin{align*} E[X] &= \\frac{\\sqrt{2\\sigma^2}}{\\sqrt{\\pi}} \\times 0 + \\mu &= \\mu.\\\\ \\end{align*}\\] \\[\\begin{align*} Var[X] &= E[(X - E[X])^2] \\\\ &= E[(X - \\mu)^2] \\\\ &= \\frac{1}{\\sqrt{2\\pi \\sigma^2}} \\int_{-\\infty}^\\infty (x - \\mu)^2 e^{-\\frac{(x - \\mu)^2}{2\\sigma^2}} dx \\\\ &= \\frac{\\sigma^2}{\\sqrt{2\\pi}} \\int_{-\\infty}^\\infty t^2 e^{-\\frac{t^2}{2}} dt \\\\ &= \\frac{\\sigma^2}{\\sqrt{2\\pi}} \\bigg(\\Big(- t e^{-\\frac{t^2}{2}} |_{-\\infty}^\\infty \\Big) + \\int_{-\\infty}^\\infty e^{-\\frac{t^2}{2}} \\bigg) dt & \\text{integration by parts} \\\\ &= \\frac{\\sigma^2}{\\sqrt{2\\pi}} \\sqrt{2 \\pi} \\int_{-\\infty}^\\infty \\frac{1}{\\sqrt(\\pi)}e^{-s^2} \\bigg) & s = \\frac{t}{\\sqrt{2}} \\text{ and evaluating the left expression at the bounds} \\\\ &= \\frac{\\sigma^2}{\\sqrt{2\\pi}} \\sqrt{2 \\pi} \\Big(\\text{erf}(\\infty) & \\text{definition of error function} \\\\ &= \\sigma^2. \\end{align*}\\] 7.3 Sums, functions, conditional expectations Exercise 7.9 (Expectation of transformations) Let \\(X\\) follow a normal distribution with mean \\(\\mu\\) and variance \\(\\sigma^2\\). Find \\(E[2X + 4]\\). Find \\(E[X^2]\\). Find \\(E[\\exp(X)]\\). Hint: Use the error function \\(\\text{erf}(x) = \\frac{1}{\\sqrt(\\pi)} \\int_{-x}^x e^{-t^2} dt\\). Also, \\(\\text{erf}(\\infty) = 1\\). R: Check your results numerically for \\(\\mu = 0.4\\) and \\(\\sigma^2 = 0.25\\) and plot the densities of all four distributions. Solution. \\[\\begin{align} E[2X + 4] &= 2E[X] + 4 & \\text{linearity of expectation} \\\\ &= 2\\mu + 4. \\\\ \\end{align}\\] \\[\\begin{align} E[X^2] &= E[X]^2 + Var[X] & \\text{definition of variance} \\\\ &= \\mu^2 + \\sigma^2. \\end{align}\\] \\[\\begin{align} E[\\exp(X)] &= \\int_{-\\infty}^\\infty \\frac{1}{\\sqrt{2\\pi \\sigma^2}} e^{-\\frac{(x - \\mu)^2}{2\\sigma^2}} e^x dx & \\\\ &= \\frac{1}{\\sqrt{2\\pi \\sigma^2}} \\int_{-\\infty}^\\infty e^{\\frac{2 \\sigma^2 x}{2\\sigma^2} -\\frac{(x - \\mu)^2}{2\\sigma^2}} dx & \\\\ &= \\frac{1}{\\sqrt{2\\pi \\sigma^2}} \\int_{-\\infty}^\\infty e^{-\\frac{x^2 - 2x(\\mu + \\sigma^2) + \\mu^2}{2\\sigma^2}} dx & \\\\ &= \\frac{1}{\\sqrt{2\\pi \\sigma^2}} \\int_{-\\infty}^\\infty e^{-\\frac{(x - (\\mu + \\sigma^2))^2 + \\mu^2 - (\\mu + \\sigma^2)^2}{2\\sigma^2}} dx & \\text{complete the square} \\\\ &= \\frac{1}{\\sqrt{2\\pi \\sigma^2}} e^{\\frac{- \\mu^2 + (\\mu + \\sigma^2)^2}{2\\sigma^2}} \\int_{-\\infty}^\\infty e^{-\\frac{(x - (\\mu + \\sigma^2))^2}{2\\sigma^2}} dx & \\\\ &= \\frac{1}{\\sqrt{2\\pi \\sigma^2}} e^{\\frac{- \\mu^2 + (\\mu + \\sigma^2)^2}{2\\sigma^2}} \\sigma \\sqrt{2 \\pi} \\text{erf}(\\infty) & \\\\ &= e^{\\frac{2\\mu + \\sigma^2}{2}}. \\end{align}\\] set.seed(1) mu <- 0.4 sigma <- 0.5 x <- rnorm(100000, mean = mu, sd = sigma) mean(2*x + 4) ## [1] 4.797756 2 * mu + 4 ## [1] 4.8 mean(x^2) ## [1] 0.4108658 mu^2 + sigma^2 ## [1] 0.41 mean(exp(x)) ## [1] 1.689794 exp((2 * mu + sigma^2) / 2) ## [1] 1.690459 Exercise 7.10 (Sum of independent random variables) Borrowed from Wasserman. Let \\(X_1, X_2,...,X_n\\) be IID random variables with expected value \\(E[X_i] = \\mu\\) and variance \\(Var[X_i] = \\sigma^2\\). 
Find the expected value and variance of \\(\\bar{X} = \\frac{1}{n} \\sum_{i=1}^n X_i\\). \\(\\bar{X}\\) is called a statistic (a function of the values in a sample). It is itself a random variable and its distribution is called a sampling distribution. R: Take \\(n = 5, 10, 100, 1000\\) samples from the N(\\(2\\), \\(6\\)) distribution 10000 times. Plot the theoretical density and the densities of \\(\\bar{X}\\) statistic for each \\(n\\). Intuitively, are the results in correspondence with your calculations? Check them numerically. Solution. Let us start with the expectation of \\(\\bar{X}\\). \\[\\begin{align} E[\\bar{X}] &= E[\\frac{1}{n} \\sum_{i=1}^n X_i] & \\\\ &= \\frac{1}{n} E[\\sum_{i=1}^n X_i] & \\text{ (multiplication with a scalar)} \\\\ &= \\frac{1}{n} \\sum_{i=1}^n E[X_i] & \\text{ (linearity)} \\\\ &= \\frac{1}{n} n \\mu & \\\\ &= \\mu. \\end{align}\\] Now the variance \\[\\begin{align} Var[\\bar{X}] &= Var[\\frac{1}{n} \\sum_{i=1}^n X_i] & \\\\ &= \\frac{1}{n^2} Var[\\sum_{i=1}^n X_i] & \\text{ (multiplication with a scalar)} \\\\ &= \\frac{1}{n^2} \\sum_{i=1}^n Var[X_i] & \\text{ (independence of samples)} \\\\ &= \\frac{1}{n^2} n \\sigma^2 & \\\\ &= \\frac{1}{n} \\sigma^2. \\end{align}\\] set.seed(1) nsamps <- 10000 mu <- 2 sigma <- sqrt(6) N <- c(5, 10, 100, 500) X <- matrix(data = NA, nrow = nsamps, ncol = length(N)) ind <- 1 for (n in N) { for (i in 1:nsamps) { X[i,ind] <- mean(rnorm(n, mu, sigma)) } ind <- ind + 1 } colnames(X) <- N X <- melt(as.data.frame(X)) ggplot(data = X, aes(x = value, colour = variable)) + geom_density() + stat_function(data = data.frame(x = seq(-2, 6, by = 0.01)), aes(x = x), fun = dnorm, args = list(mean = mu, sd = sigma), color = "black") Exercise 7.11 (Conditional expectation) Let \\(X \\in \\mathbb{R}_0^+\\) and \\(Y \\in \\mathbb{N}_0\\) be random variables with joint distribution \\(p_{XY}(X,Y) = \\frac{1}{y + 1} e^{-\\frac{x}{y + 1}} 0.5^{y + 1}\\). Find \\(E[X | Y = y]\\) by first finding \\(p_Y\\) and then \\(p_{X|Y}\\). Find \\(E[X]\\). R: check your answers to a) and b) by drawing 10000 samples from \\(p_Y\\) and \\(p_{X|Y}\\). Solution. \\[\\begin{align} p(y) &= \\int_0^\\infty \\frac{1}{y + 1} e^{-\\frac{x}{y + 1}} 0.5^{y + 1} dx \\\\ &= \\frac{0.5^{y + 1}}{y + 1} \\int_0^\\infty e^{-\\frac{x}{y + 1}} dx \\\\ &= \\frac{0.5^{y + 1}}{y + 1} (y + 1) \\\\ &= 0.5^{y + 1} \\\\ &= 0.5(1 - 0.5)^y. \\end{align}\\] We recognize this as the geometric distribution. \\[\\begin{align} p(x|y) &= \\frac{p(x,y)}{p(y)} \\\\ &= \\frac{1}{y + 1} e^{-\\frac{x}{y + 1}}. \\end{align}\\] We recognize this as the exponential distribution. \\[\\begin{align} E[X | Y = y] &= \\int_0^\\infty x \\frac{1}{y + 1} e^{-\\frac{x}{y + 1}} dx \\\\ &= y + 1 & \\text{expected value of the exponential distribution} \\end{align}\\] Use the law of iterated expectation. \\[\\begin{align} E[X] &= E[E[X | Y]] \\\\ &= E[Y + 1] \\\\ &= E[Y] + 1 \\\\ &= \\frac{1 - 0.5}{0.5} + 1 \\\\ &= 2. \\end{align}\\] set.seed(1) y <- rgeom(100000, 0.5) x <- rexp(100000, rate = 1 / (y + 1)) x2 <- x[y == 3] mean(x2) ## [1] 4.048501 3 + 1 ## [1] 4 mean(x) ## [1] 2.007639 (1 - 0.5) / 0.5 + 1 ## [1] 2 Exercise 7.12 (Cauchy distribution) Let \\(p(x | x_0, \\gamma) = \\frac{1}{\\pi \\gamma \\Big(1 + \\big(\\frac{x - x_0}{\\gamma}\\big)^2\\Big)}\\). A random variable with this PDF follows a Cauchy distribution. This distribution is symmetric and has wider tails than the normal distribution. R: Draw \\(n = 1,...,1000\\) samples from a standard normal and \\(\\text{Cauchy}(0, 1)\\). 
For each \\(n\\) plot the mean and the median of the sample using facets. Interpret the results. To get a mathematical explanation of the results in a), evaluate the integral \\(\\int_0^\\infty \\frac{x}{1 + x^2} dx\\) and consider that \\(E[X] = \\int_{-\\infty}^\\infty \\frac{x}{1 + x^2}dx\\). set.seed(1) n <- 1000 means_n <- vector(mode = "numeric", length = n) means_c <- vector(mode = "numeric", length = n) medians_n <- vector(mode = "numeric", length = n) medians_c <- vector(mode = "numeric", length = n) for (i in 1:n) { tmp_n <- rnorm(i) tmp_c <- rcauchy(i) means_n[i] <- mean(tmp_n) means_c[i] <- mean(tmp_c) medians_n[i] <- median(tmp_n) medians_c[i] <- median(tmp_c) } df <- data.frame("distribution" = c(rep("normal", 2 * n), rep("Cauchy", 2 * n)), "type" = c(rep("mean", n), rep("median", n), rep("mean", n), rep("median", n)), "value" = c(means_n, medians_n, means_c, medians_c), "n" = rep(1:n, times = 4)) ggplot(df, aes(x = n, y = value)) + geom_line(alpha = 0.5) + facet_wrap(~ type + distribution , scales = "free") Solution. \\[\\begin{align} \\int_0^\\infty \\frac{x}{1 + x^2} dx &= \\frac{1}{2} \\int_1^\\infty \\frac{1}{u} du & u = 1 + x^2 \\\\ &= \\frac{1}{2} \\ln(u) |_1^\\infty. \\end{align}\\] This integral is not finite. The same holds for the negative part. Therefore, the expectation is undefined, as \\(E[|X|] = \\infty\\). Why can we not just claim that \\(f(x) = x / (1 + x^2)\\) is odd and \\(\\int_{-\\infty}^\\infty f(x) = 0\\)? By definition of the Lebesgue integral \\(\\int_{-\\infty}^{\\infty} f= \\int_{-\\infty}^{\\infty} f_+-\\int_{-\\infty}^{\\infty} f_-\\). At least one of the two integrals needs to be finite for \\(\\int_{-\\infty}^{\\infty} f\\) to be well-defined. However \\(\\int_{-\\infty}^{\\infty} f_+=\\int_0^{\\infty} x/(1+x^2)\\) and \\(\\int_{-\\infty}^{\\infty} f_-=\\int_{-\\infty}^{0} |x|/(1+x^2)\\). We have just shown that both of these integrals are infinite, which implies that their sum is also infinite. 7.4 Covariance Exercise 7.13 Below is a table of values for random variables \\(X\\) and \\(Y\\). X Y 2.1 8 -0.5 11 1 10 -2 12 4 9 Find sample covariance of \\(X\\) and \\(Y\\). Find sample variances of \\(X\\) and \\(Y\\). Find sample correlation of \\(X\\) and \\(Y\\). Find sample variance of \\(Z = 2X - 3Y\\). Solution. \\(\\bar{X} = 0.92\\) and \\(\\bar{Y} = 10\\). \\[\\begin{align} s(X, Y) &= \\frac{1}{n - 1} \\sum_{i=1}^5 (X_i - 0.92) (Y_i - 10) \\\\ &= -3.175. \\end{align}\\] \\[\\begin{align} s(X) &= \\frac{\\sum_{i=1}^5(X_i - 0.92)^2}{5 - 1} \\\\ &= 5.357. \\end{align}\\] \\[\\begin{align} s(Y) &= \\frac{\\sum_{i=1}^5(Y_i - 10)^2}{5 - 1} \\\\ &= 2.5. \\end{align}\\] \\[\\begin{align} r(X,Y) &= \\frac{Cov(X,Y)}{\\sqrt{Var[X]Var[Y]}} \\\\ &= \\frac{-3.175}{\\sqrt{5.357 \\times 2.5}} \\\\ &= -0.868. \\end{align}\\] \\[\\begin{align} s(Z) &= 2^2 s(X) + 3^2 s(Y) - 2 \\times 2 \\times 3 s(X, Y) \\\\ &= 4 \\times 5.357 + 9 \\times 2.5 - 12 \\times (-3.175) \\\\ &= 82.028. \\end{align}\\] Exercise 7.14 Let \\(X \\sim \\text{Uniform}(0,1)\\) and \\(Y | X = x \\sim \\text{Uniform}(0,x)\\). Find the covariance of \\(X\\) and \\(Y\\). Find the correlation of \\(X\\) and \\(Y\\). R: check your answers to a) and b) with simulation. Plot \\(X\\) against \\(Y\\) on a scatterplot. Solution. The joint PDF is \\(p(x,y) = p(x)p(y|x) = \\frac{1}{x}\\).
\\[\\begin{align} Cov(X,Y) &= E[XY] - E[X]E[Y] \\\\ \\end{align}\\] Let us first evaluate the first term: \\[\\begin{align} E[XY] &= \\int_0^1 \\int_0^x x y \\frac{1}{x} dy dx \\\\ &= \\int_0^1 \\int_0^x y dy dx \\\\ &= \\int_0^1 \\frac{x^2}{2} dx \\\\ &= \\frac{1}{6}. \\end{align}\\] Now let us find \\(E[Y]\\), \\(E[X]\\) is trivial. \\[\\begin{align} E[Y] = E[E[Y | X]] = E[\\frac{X}{2}] = \\frac{1}{2} \\int_0^1 x dx = \\frac{1}{4}. \\end{align}\\] Combining all: \\[\\begin{align} Cov(X,Y) &= \\frac{1}{6} - \\frac{1}{2} \\frac{1}{4} = \\frac{1}{24}. \\end{align}\\] \\[\\begin{align} \\rho(X,Y) &= \\frac{Cov(X,Y)}{\\sqrt{Var[X]Var[Y]}} \\\\ \\end{align}\\] Let us calculate \\(Var[X]\\). \\[\\begin{align} Var[X] &= E[X^2] - \\frac{1}{4} \\\\ &= \\int_0^1 x^2 - \\frac{1}{4} \\\\ &= \\frac{1}{3} - \\frac{1}{4} \\\\ &= \\frac{1}{12}. \\end{align}\\] Let us calculate \\(E[E[Y^2|X]]\\). \\[\\begin{align} E[E[Y^2|X]] &= E[\\frac{x^2}{3}] \\\\ &= \\frac{1}{9}. \\end{align}\\] Then \\(Var[Y] = \\frac{1}{9} - \\frac{1}{16} = \\frac{7}{144}\\). Combining all \\[\\begin{align} \\rho(X,Y) &= \\frac{\\frac{1}{24}}{\\sqrt{\\frac{1}{12}\\frac{7}{144}}} \\\\ &= 0.65. \\end{align}\\] set.seed(1) nsamps <- 10000 x <- runif(nsamps) y <- runif(nsamps, 0, x) cov(x, y) ## [1] 0.04274061 1/24 ## [1] 0.04166667 cor(x, y) ## [1] 0.6629567 (1 / 24) / (sqrt(7 / (12 * 144))) ## [1] 0.6546537 ggplot(data.frame(x = x, y = y), aes(x = x, y = y)) + geom_point(alpha = 0.2) + geom_smooth(method = "lm") "],["mrv.html", "Chapter 8 Multivariate random variables 8.1 Multinomial random variables 8.2 Multivariate normal random variables 8.3 Transformations", " Chapter 8 Multivariate random variables This chapter deals with multivariate random variables. The students are expected to acquire the following knowledge: Theoretical Multinomial distribution. Multivariate normal distribution. Cholesky decomposition. Eigendecomposition. R Sampling from the multinomial distribution. Sampling from the multivariate normal distribution. Matrix decompositions. .fold-btn { float: right; margin: 5px 5px 0 0; } .fold { border: 1px solid black; min-height: 40px; } 8.1 Multinomial random variables Exercise 8.1 Let \\(X_i\\), \\(i = 1,...,k\\) represent \\(k\\) events, and \\(p_i\\) the probabilities of these events happening in a trial. Let \\(n\\) be the number of trials, and \\(X\\) a multivariate random variable, the collection of \\(X_i\\). Then \\(p(x) = \\frac{n!}{x_1!x_2!...x_k!} p_1^{x_1} p_2^{x_2}...p_k^{x_k}\\) is the PMF of a multinomial distribution, where \\(n = \\sum_{i = 1}^k x_i\\). Show that the marginal distribution of \\(X_i\\) is a binomial distribution. Take 1000 samples from the multinomial distribution with \\(n=4\\) and probabilities \\(p = (0.2, 0.2, 0.5, 0.1)\\). Then take 1000 samples from four binomial distributions with the same parameters. Inspect the results visually. Solution. We will approach this proof from the probabilistic point of view. W.L.O.G. let \\(x_1\\) be the marginal distribution we are interested in. The term \\(p^{x_1}\\) denotes the probability that event 1 happened \\(x_1\\) times. For this event not to happen, one of the other events needs to happen. So for each of the remaining trials, the probability of another event is \\(\\sum_{i=2}^k p_i = 1 - p_1\\), and there were \\(n - x_1\\) such trials. What is left to do is to calculate the number of permutations of event 1 happening and event 1 not happening. We choose \\(x_1\\) trials, from \\(n\\) trials. 
Therefore \\(p(x_1) = \\binom{n}{x_1} p_1^{x_1} (1 - p_1)^{n - x_1}\\), which is the binomial PMF. Interested students are encouraged to prove this mathematically. set.seed(1) nsamps <- 1000 samps_mult <- rmultinom(nsamps, 4, prob = c(0.2, 0.2, 0.5, 0.1)) samps_mult <- as_tibble(t(samps_mult)) %>% gather() samps <- tibble( V1 = rbinom(nsamps, 4, 0.2), V2 = rbinom(nsamps, 4, 0.2), V3 = rbinom(nsamps, 4, 0.5), V4 = rbinom(nsamps, 4, 0.1) ) %>% gather() %>% bind_rows(samps_mult) %>% bind_cols("dist" = c(rep("binomial", 4*nsamps), rep("multinomial", 4*nsamps))) ggplot(samps, aes(x = value, fill = dist)) + geom_bar(position = "dodge") + facet_wrap(~ key) Exercise 8.2 (Multinomial expected value) Find the expected value, variance and covariance of the multinomial distribution. Hint: First find the expected value for \\(n = 1\\) and then use the fact that the trials are independent. Solution. Let us first calculate the expected value of \\(X_1\\), when \\(n = 1\\). \\[\\begin{align} E[X_1] &= \\sum_{n_1 = 0}^1 \\sum_{n_2 = 0}^1 ... \\sum_{n_k = 0}^1 \\frac{1}{n_1!n_2!...n_k!}p_1^{n_1}p_2^{n_2}...p_k^{n_k}n_1 \\\\ &= \\sum_{n_1 = 0}^1 \\frac{p_1^{n_1} n_1}{n_1!} \\sum_{n_2 = 0}^1 ... \\sum_{n_k = 0}^1 \\frac{1}{n_2!...n_k!}p_2^{n_2}...p_k^{n_k} \\end{align}\\] When \\(n_1 = 0\\) then the whole terms is zero, so we do not need to evaluate other sums. When \\(n_1 = 1\\), all other \\(n_i\\) must be zero, as we have \\(1 = \\sum_{i=1}^k n_i\\). Therefore the other sums equal \\(1\\). So \\(E[X_1] = p_1\\) and \\(E[X_i] = p_i\\) for \\(i = 1,...,k\\). Now let \\(Y_j\\), \\(j = 1,...,n\\), have a multinomial distribution with \\(n = 1\\), and let \\(X\\) have a multinomial distribution with an arbitrary \\(n\\). Then we can write \\(X = \\sum_{j=1}^n Y_j\\). And due to independence \\[\\begin{align} E[X] &= E[\\sum_{j=1}^n Y_j] \\\\ &= \\sum_{j=1}^n E[Y_j] \\\\ &= np. \\end{align}\\] For the variance, we need \\(E[X^2]\\). Let us follow the same procedure as above and first calculate \\(E[X_i]\\) for \\(n = 1\\). The only thing that changes is that the term \\(n_i\\) becomes \\(n_i^2\\). Since we only have \\(0\\) and \\(1\\) this does not change the outcome. So \\[\\begin{align} Var[X_i] &= E[X_i^2] - E[X_i]^2\\\\ &= p_i(1 - p_i). \\end{align}\\] Analogous to above for arbitrary \\(n\\) \\[\\begin{align} Var[X] &= E[X^2] - E[X]^2 \\\\ &= \\sum_{j=1}^n E[Y_j^2] - \\sum_{j=1}^n E[Y_j]^2 \\\\ &= \\sum_{j=1}^n E[Y_j^2] - E[Y_j]^2 \\\\ &= \\sum_{j=1}^n p(1-p) \\\\ &= np(1-p). \\end{align}\\] To calculate the covariance, we need \\(E[X_i X_j]\\). Again, let us start with \\(n = 1\\). Without loss of generality, let us assume \\(i = 1\\) and \\(j = 2\\). \\[\\begin{align} E[X_1 X_2] = \\sum_{n_1 = 0}^1 \\sum_{n_2 = 0}^1 \\frac{p_1^{n_1} n_1}{n_1!} \\frac{p_2^{n_2} n_2}{n_2!} \\sum_{n_3 = 0}^1 ... \\sum_{n_k = 0}^1 \\frac{1}{n_3!...n_k!}p_3^{n_3}...p_k^{n_k}. \\end{align}\\] In the above expression, at each iteration we multiply with \\(n_1\\) and \\(n_2\\). Since \\(n = 1\\), one of these always has to be zero. Therefore \\(E[X_1 X_2] = 0\\) and \\[\\begin{align} Cov(X_i, X_j) &= E[X_i X_j] - E[X_i]E[X_j] \\\\ &= - p_i p_j. \\end{align}\\] For arbitrary \\(n\\), let \\(X = \\sum_{t = 1}^n Y_t\\) be the sum of independent multinomial random variables \\(Y_t = [X_{1t}, X_{2t},...,X_{kt}]^T\\) with \\(n=1\\). Then \\(X_1 = \\sum_{t = 1}^n X_{1t}\\) and \\(X_2 = \\sum_{l = 1}^n X_{2l}\\). 
\\[\\begin{align} Cov(X_1, X_2) &= E[X_1 X_2] - E[X_1] E[X_2] \\\\ &= E[\\sum_{t = 1}^n X_{1t} \\sum_{l = 1}^n X_{2l}] - n^2 p_1 p_2 \\\\ &= \\sum_{t = 1}^n \\sum_{l = 1}^n E[X_{1t} X_{2l}] - n^2 p_1 p_2. \\end{align}\\] For \\(X_{1t}\\) and \\(X_{2l}\\) the expected value is zero when \\(t = l\\). When \\(t \\neq l\\) then they are independent, so the expected value is the product \\(p_1 p_2\\). There are \\(n^2\\) total terms, and for \\(n\\) of them \\(t = l\\) holds. So \\(E[X_1 X_2] = (n^2 - n) p_1 p_2\\). Inserting into the above \\[\\begin{align} Cov(X_1, X_2) &= (n^2 - n) p_1 p_2 - n^2 p_1 p_2 \\\\ &= - n p_1 p_2. \\end{align}\\] 8.2 Multivariate normal random variables Exercise 8.3 (Cholesky decomposition) Let \\(X\\) be a random vector of length \\(k\\) with \\(X_i \\sim \\text{N}(0, 1)\\) and \\(LL^*\\) the Cholesky decomposition of a Hermitian positive-definite matrix \\(A\\). Let \\(\\mu\\) be a vector of length \\(k\\). Find the distribution of the random vector \\(Y = \\mu + L X\\). Find the Cholesky decomposition of \\(A = \\begin{bmatrix} 2 & 1.2 \\\\ 1.2 & 1 \\end{bmatrix}\\). R: Use the results from a) and b) to sample from the MVN distribution \\(\\text{N}(\\mu, A)\\), where \\(\\mu = [1.5, -1]^T\\). Plot a scatterplot and compare it to direct samples from the multivariate normal distribution (rmvnorm). Solution. \\(X\\) has an independent normal distribution of dimension \\(k\\). Then \\[\\begin{align} Y = \\mu + L X &\\sim \\text{N}(\\mu, LL^T) \\\\ &\\sim \\text{N}(\\mu, A). \\end{align}\\] Solve \\[\\begin{align} \\begin{bmatrix} a & 0 \\\\ b & c \\end{bmatrix} \\begin{bmatrix} a & b \\\\ 0 & c \\end{bmatrix} = \\begin{bmatrix} 2 & 1.2 \\\\ 1.2 & 1 \\end{bmatrix} \\end{align}\\] # a set.seed(1) nsamps <- 1000 X <- matrix(data = rnorm(nsamps * 2), ncol = 2) mu <- c(1.5, -1) L <- matrix(data = c(sqrt(2), 0, 1.2 / sqrt(2), sqrt(1 - 1.2^2/2)), ncol = 2, byrow = TRUE) Y <- t(mu + L %*% t(X)) plot_df <- data.frame(rbind(X, Y), c(rep("X", nsamps), rep("Y", nsamps))) colnames(plot_df) <- c("D1", "D2", "var") ggplot(data = plot_df, aes(x = D1, y = D2, colour = as.factor(var))) + geom_point() Exercise 8.4 (Eigendecomposition) R: Let \\(\\Sigma = U \\Lambda U^T\\) be the eigendecomposition of covariance matrix \\(\\Sigma\\). Follow the procedure below, to sample from a multivariate normal with \\(\\mu = [-2, 1]^T\\) and \\(\\Sigma = \\begin{bmatrix} 0.3, -0.5 \\\\ -0.5, 1.6 \\end{bmatrix}\\): Sample from two independent standardized normal distributions to get \\(X\\). Find the eigendecomposition of \\(X\\) (eigen). Multiply \\(X\\) by \\(\\Lambda^{\\frac{1}{2}}\\) to get \\(X2\\). Consider how the eigendecomposition for \\(X2\\) changes compared to \\(X\\). Multiply \\(X2\\) by \\(U\\) to get \\(X3\\). Consider how the eigendecomposition for \\(X3\\) changes compared to \\(X2\\). Add \\(\\mu\\) to \\(X3\\). Consider how the eigendecomposition for \\(X4\\) changes compared to \\(X3\\). Plot the data and the eigenvectors (scaled with \\(\\Lambda^{\\frac{1}{2}}\\)) at each step. Hint: Use geom_segment for the eigenvectors. 
# a set.seed(1) sigma <- matrix(data = c(0.3, -0.5, -0.5, 1.6), nrow = 2, byrow = TRUE) ed <- eigen(sigma) e_val <- ed$values e_vec <- ed$vectors # b set.seed(1) nsamps <- 1000 X <- matrix(data = rnorm(nsamps * 2), ncol = 2) vec1 <- matrix(c(1,0,0,1), nrow = 2) X2 <- t(sqrt(diag(e_val)) %*% t(X)) vec2 <- sqrt(diag(e_val)) %*% vec1 X3 <- t(e_vec %*% t(X2)) vec3 <- e_vec %*% vec2 X4 <- t(c(-2, 1) + t(X3)) vec4 <- c(-2, 1) + vec3 vec_mat <- data.frame(matrix(c(0,0,0,0,0,0,0,0,0,0,0,0,-2,1,-2,1), ncol = 2, byrow = TRUE), t(cbind(vec1, vec2, vec3, vec4)), c(1,1,2,2,3,3,4,4)) df <- data.frame(rbind(X, X2, X3, X4), c(rep(1, nsamps), rep(2, nsamps), rep(3, nsamps), rep(4, nsamps))) colnames(df) <- c("D1", "D2", "wh") colnames(vec_mat) <- c("D1", "D2", "E1", "E2", "wh") ggplot(data = df, aes(x = D1, y = D2)) + geom_point() + geom_segment(data = vec_mat, aes(xend = E1, yend = E2), color = "red") + facet_wrap(~ wh) + coord_fixed() Exercise 8.5 (Marginal and conditional distributions) Let \\(X \\sim \\text{N}(\\mu, \\Sigma)\\), where \\(\\mu = [2, 0, -1]^T\\) and \\(\\Sigma = \\begin{bmatrix} 1 & -0.2 & 0.5 \\\\ -0.2 & 1.4 & -1.2 \\\\ 0.5 & -1.2 & 2 \\\\ \\end{bmatrix}\\). Let \\(A\\) represent the first two random variables and \\(B\\) the third random variable. R: For the calculation in the following points, you can use R. Find the marginal distribution of \\(B\\). Find the conditional distribution of \\(B | A = [a_1, a_2]^T\\). Find the marginal distribution of \\(A\\). Find the conditional distribution of \\(A | B = b\\). R: Visually compare the distributions of a) and b), and c) and d) at three different conditional values. mu <- c(2, 0, -1) Sigma <- matrix(c(1, -0.2, 0.5, -0.2, 1.4, -1.2, 0.5, -1.2, 2), nrow = 3, byrow = TRUE) mu_A <- c(2, 0) mu_B <- -1 Sigma_A <- Sigma[1:2, 1:2] Sigma_B <- Sigma[3, 3] Sigma_AB <- Sigma[1:2, 3] # b tmp_b <- t(Sigma_AB) %*% solve(Sigma_A) mu_b <- mu_B - tmp_b %*% mu_A Sigma_b <- Sigma_B - t(Sigma_AB) %*% solve(Sigma_A) %*% Sigma_AB mu_b ## [,1] ## [1,] -1.676471 tmp_b ## [,1] [,2] ## [1,] 0.3382353 -0.8088235 Sigma_b ## [,1] ## [1,] 0.8602941 # d tmp_a <- Sigma_AB * (1 / Sigma_B) mu_a <- mu_A - tmp_a * mu_B Sigma_d <- Sigma_A - (Sigma_AB * (1 / Sigma_B)) %*% t(Sigma_AB) mu_a ## [1] 2.25 -0.60 tmp_a ## [1] 0.25 -0.60 Sigma_d ## [,1] [,2] ## [1,] 0.875 0.10 ## [2,] 0.100 0.68 Solution. \\(B \\sim \\text{N}(-1, 2)\\). \\(B | A = a \\sim \\text{N}(-1.68 + [0.34, -0.81] a, 0.86)\\). \\(\\mu_A = [2, 0]^T\\) and \\(\\Sigma_A = \\begin{bmatrix} 1 & -0.2 & \\\\ -0.2 & 1.4 \\\\ \\end{bmatrix}\\). 
\\[\\begin{align} A | B = b &\\sim \\text{N}(\\mu_t, \\Sigma_t), \\\\ \\mu_t &= [2.25, -0.6]^T + [0.25, -0.6]^T b, \\\\ \\Sigma_t &= \\begin{bmatrix} 0.875 & 0.1 \\\\ 0.1 & 0.68 \\\\ \\end{bmatrix} \\end{align}\\] library(mvtnorm) set.seed(1) nsamps <- 1000 # a and b samps <- as.data.frame(matrix(data = NA, nrow = 4 * nsamps, ncol = 2)) samps[1:nsamps,1] <- rnorm(nsamps, mu_B, Sigma_B) samps[1:nsamps,2] <- "marginal" for (i in 1:3) { a <- rmvnorm(1, mu_A, Sigma_A) samps[(i*nsamps + 1):((i + 1) * nsamps), 1] <- rnorm(nsamps, mu_b + tmp_b %*% t(a), Sigma_b) samps[(i*nsamps + 1):((i + 1) * nsamps), 2] <- paste0(# "cond", round(a, digits = 2), collapse = "-") } colnames(samps) <- c("x", "dist") ggplot(samps, aes(x = x)) + geom_density() + facet_wrap(~ dist) # c and d samps <- as.data.frame(matrix(data = NA, nrow = 4 * nsamps, ncol = 3)) samps[1:nsamps,1:2] <- rmvnorm(nsamps, mu_A, Sigma_A) samps[1:nsamps,3] <- "marginal" for (i in 1:3) { b <- rnorm(1, mu_B, Sigma_B) samps[(i*nsamps + 1):((i + 1) * nsamps), 1:2] <- rmvnorm(nsamps, mu_a + tmp_a * b, Sigma_d) samps[(i*nsamps + 1):((i + 1) * nsamps), 3] <- b } colnames(samps) <- c("x", "y", "dist") ggplot(samps, aes(x = x, y = y)) + geom_point() + geom_smooth(method = "lm") + facet_wrap(~ dist) 8.3 Transformations Exercise 8.6 Let \\((U,V)\\) be a random variable with PDF \\(p(u,v) = \\frac{1}{4 \\sqrt{u}}\\), \\(U \\in [0,4]\\) and \\(V \\in [\\sqrt{U}, \\sqrt{U} + 1]\\). Let \\(X = \\sqrt{U}\\) and \\(Y = V - \\sqrt{U}\\). Find PDF of \\((X,Y)\\). What can you tell about distributions of \\(X\\) and \\(Y\\)? This exercise shows how we can simplify a probabilistic problem with a clever use of transformations. R: Take 1000 samples from \\((X,Y)\\) and transform them with inverses of the above functions to get samples from \\((U,V)\\). Plot both sets of samples. Solution. First we need to find the inverse functions. Since \\(x = \\sqrt{u}\\) it follows that \\(u = x^2\\), and that \\(x \\in [0,2]\\). Similarly \\(v = y + x\\) and \\(y \\in [0,1]\\). Let us first find the Jacobian. \\[\\renewcommand\\arraystretch{1.6} J(x,y) = \\begin{bmatrix} \\frac{\\partial u}{\\partial x} & \\frac{\\partial v}{\\partial x} \\\\%[1ex] % <-- 1ex more space between rows of matrix \\frac{\\partial u}{\\partial y} & \\frac{\\partial v}{\\partial y} \\end{bmatrix} = \\begin{bmatrix} 2x & 1 \\\\%[1ex] % <-- 1ex more space between rows of matrix 0 & 1 \\end{bmatrix}, \\] and the determinant is \\(|J(x,y)| = 2x\\). Putting everything together, we get \\[\\begin{align} p_{X,Y}(x,y) = p_{U,V}(x^2, y + x) |J(x,y)| = \\frac{1}{4 \\sqrt{x^2}} 2x = \\frac{1}{2}. \\end{align}\\] This reminds us of the Uniform distribution. Indeed we can see that \\(p_X(x) = \\frac{1}{2}\\) and \\(p_Y(y) = 1\\). So instead of dealing with an awkward PDF of \\((U,V)\\) and the corresponding dynamic bounds, we are now looking at two independent Uniform random variables. In practice, this could make modeling much easier. set.seed(1) nsamps <- 2000 x <- runif(nsamps, min = 0, max = 2) y <- runif(nsamps) orig <- tibble(x = x, y = y, vrs = "original") u <- x^2 v <- y + x transf <- tibble(x = u, y = v, vrs = "transformed") df <- bind_rows(orig, transf) ggplot(df, aes(x = x, y = y, color = vrs)) + geom_point(alpha = 0.3) Exercise 8.7 R: Write a function that will calculate the probability density of an arbitraty multivariate normal distribution, based on independent standardized normal PDFs. Compare with dmvnorm from the mvtnorm package. 
library(mvtnorm) set.seed(1) mvn_dens <- function (y, mu, Sigma) { L <- chol(Sigma) L_inv <- solve(t(L)) g_inv <- L_inv %*% t(y - mu) J <- L_inv J_det <- det(J) return(prod(dnorm(g_inv)) * J_det) } mu_v <- c(-2, 0, 1) cov_m <- matrix(c(1, -0.2, 0.5, -0.2, 2, 0.3, 0.5, 0.3, 1.6), ncol = 3, byrow = TRUE) n_comp <- 20 for (i in 1:n_comp) { x <- rmvnorm(1, mean = mu_v, sigma = cov_m) print(paste0("My function: ", mvn_dens(x, mu_v, cov_m), ", dmvnorm: ", dmvnorm(x, mu_v, cov_m))) } ## [1] "My function: 0.0229514237156383, dmvnorm: 0.0229514237156383" ## [1] "My function: 0.00763138915406231, dmvnorm: 0.00763138915406231" ## [1] "My function: 0.0230688881105741, dmvnorm: 0.0230688881105741" ## [1] "My function: 0.0113616213114731, dmvnorm: 0.0113616213114731" ## [1] "My function: 0.00151808500121907, dmvnorm: 0.00151808500121907" ## [1] "My function: 0.0257658045974509, dmvnorm: 0.0257658045974509" ## [1] "My function: 0.0157963825730805, dmvnorm: 0.0157963825730805" ## [1] "My function: 0.00408856287529248, dmvnorm: 0.00408856287529248" ## [1] "My function: 0.0327793540101256, dmvnorm: 0.0327793540101256" ## [1] "My function: 0.0111606542967978, dmvnorm: 0.0111606542967978" ## [1] "My function: 0.0147636757585684, dmvnorm: 0.0147636757585684" ## [1] "My function: 0.0142948300412207, dmvnorm: 0.0142948300412207" ## [1] "My function: 0.0203093820657542, dmvnorm: 0.0203093820657542" ## [1] "My function: 0.0287533273357481, dmvnorm: 0.0287533273357481" ## [1] "My function: 0.0213402305128623, dmvnorm: 0.0213402305128623" ## [1] "My function: 0.0218356957993885, dmvnorm: 0.0218356957993885" ## [1] "My function: 0.0250750113961771, dmvnorm: 0.0250750113961771" ## [1] "My function: 0.0166498666348048, dmvnorm: 0.0166498666348048" ## [1] "My function: 0.00189725106874659, dmvnorm: 0.00189725106874659" ## [1] "My function: 0.0196697814975113, dmvnorm: 0.0196697814975113" "],["ard.html", "Chapter 9 Alternative representation of distributions 9.1 Probability generating functions (PGFs) 9.2 Moment generating functions (MGFs)", " Chapter 9 Alternative representation of distributions This chapter deals with alternative representation of distributions. The students are expected to acquire the following knowledge: Theoretical Probability generating functions. Moment generating functions. .fold-btn { float: right; margin: 5px 5px 0 0; } .fold { border: 1px solid black; min-height: 40px; } 9.1 Probability generating functions (PGFs) Exercise 9.1 Show that the sum of independent Poisson random variables is itself a Poisson random variable. R: Let \\(X\\) be a sum of three Poisson distributions with \\(\\lambda_i \\in \\{2, 5.2, 10\\}\\). Take 1000 samples and plot the three distributions and the sum. Then take 1000 samples from the theoretical distribution of \\(X\\) and compare them to the sum. Solution. Let \\(X_i \\sim \\text{Poisson}(\\lambda_i)\\) for \\(i = 1,...,n\\), and let \\(X = \\sum_{i=1}^n X_i\\). 
\\[\\begin{align} \\alpha_X(t) &= \\prod_{i=1}^n \\alpha_{X_i}(t) \\\\ &= \\prod_{i=1}^n \\bigg( \\sum_{j=0}^\\infty t^j \\frac{\\lambda_i^j e^{-\\lambda_i}}{j!} \\bigg) \\\\ &= \\prod_{i=1}^n \\bigg( e^{-\\lambda_i} \\sum_{j=0}^\\infty \\frac{(t\\lambda_i)^j }{j!} \\bigg) \\\\ &= \\prod_{i=1}^n \\bigg( e^{-\\lambda_i} e^{t \\lambda_i} \\bigg) & \\text{power series} \\\\ &= \\prod_{i=1}^n \\bigg( e^{\\lambda_i(t - 1)} \\bigg) \\\\ &= e^{\\sum_{i=1}^n \\lambda_i(t - 1)} \\\\ &= e^{t \\sum_{i=1}^n \\lambda_i - \\sum_{i=1}^n \\lambda_i} \\\\ &= e^{-\\sum_{i=1}^n \\lambda_i} \\sum_{j=0}^\\infty \\frac{(t \\sum_{i=1}^n \\lambda_i)^j}{j!}\\\\ &= \\sum_{j=0}^\\infty \\frac{e^{-\\sum_{i=1}^n \\lambda_i} (t \\sum_{i=1}^n \\lambda_i)^j}{j!}\\\\ \\end{align}\\] The last term is the PGF of a Poisson random variable with parameter \\(\\sum_{i=1}^n \\lambda_i\\). Because the PGF is unique, \\(X\\) is a Poisson random variable. set.seed(1) library(tidyr) nsamps <- 1000 samps <- matrix(data = NA, nrow = nsamps, ncol = 4) samps[ ,1] <- rpois(nsamps, 2) samps[ ,2] <- rpois(nsamps, 5.2) samps[ ,3] <- rpois(nsamps, 10) samps[ ,4] <- samps[ ,1] + samps[ ,2] + samps[ ,3] colnames(samps) <- c(2, 2.5, 10, "sum") gsamps <- as_tibble(samps) gsamps <- gather(gsamps, key = "dist", value = "value") ggplot(gsamps, aes(x = value)) + geom_bar() + facet_wrap(~ dist) samps <- cbind(samps, "theoretical" = rpois(nsamps, 2 + 5.2 + 10)) gsamps <- as_tibble(samps[ ,4:5]) gsamps <- gather(gsamps, key = "dist", value = "value") ggplot(gsamps, aes(x = value, fill = dist)) + geom_bar(position = "dodge") Exercise 9.2 Find the expected value and variance of the negative binomial distribution. Hint: Find the Taylor series of \\((1 - y)^{-r}\\) at point 0. Solution. Let \\(X \\sim \\text{NB}(r, p)\\). \\[\\begin{align} \\alpha_X(t) &= E[t^X] \\\\ &= \\sum_{j=0}^\\infty t^j \\binom{j + r - 1}{j} (1 - p)^r p^j \\\\ &= (1 - p)^r \\sum_{j=0}^\\infty \\binom{j + r - 1}{j} (tp)^j \\\\ &= (1 - p)^r \\sum_{j=0}^\\infty \\frac{(j + r - 1)(j + r - 2)...r}{j!} (tp)^j. \\\\ \\end{align}\\] Let us look at the Taylor series of \\((1 - y)^{-r}\\) at 0 \\[\\begin{align} (1 - y)^{-r} = &1 + \\frac{-r(-1)}{1!}y + \\frac{-r(-r - 1)(-1)^2}{2!}y^2 + \\\\ &\\frac{-r(-r - 1)(-r - 2)(-1)^3}{3!}y^3 + ... \\\\ \\end{align}\\] How does the \\(k\\)-th term look like? We have \\(k\\) derivatives of our function so \\[\\begin{align} \\frac{d^k}{d^k y} (1 - y)^{-r} &= \\frac{-r(-r - 1)...(-r - k + 1)(-1)^k}{k!}y^k \\\\ &= \\frac{r(r + 1)...(r + k - 1)}{k!}y^k. \\end{align}\\] We observe that this equals to the \\(j\\)-th term in the sum of NB PGF. Therefore \\[\\begin{align} \\alpha_X(t) &= (1 - p)^r (1 - tp)^{-r} \\\\ &= \\Big(\\frac{1 - p}{1 - tp}\\Big)^r \\end{align}\\] To find the expected value, we need to differentiate \\[\\begin{align} \\frac{d}{dt} \\Big(\\frac{1 - p}{1 - tp}\\Big)^r &= r \\Big(\\frac{1 - p}{1 - tp}\\Big)^{r-1} \\frac{d}{dt} \\frac{1 - p}{1 - tp} \\\\ &= r \\Big(\\frac{1 - p}{1 - tp}\\Big)^{r-1} \\frac{p(1 - p)}{(1 - tp)^2}. \\\\ \\end{align}\\] Evaluating this at 1, we get: \\[\\begin{align} E[X] = \\frac{rp}{1 - p}. \\end{align}\\] For the variance we need the second derivative. 
\\[\\begin{align} \\frac{d^2}{d^2t} \\Big(\\frac{1 - p}{1 - tp}\\Big)^r &= \\frac{p^2 r (r + 1) (\\frac{1 - p}{1 - tp})^r}{(tp - 1)^2} \\end{align}\\] Evaluating this at 1 and inserting the first derivatives, we get: \\[\\begin{align} Var[X] &= \\frac{d^2}{dt^2} \\alpha_X(1) + \\frac{d}{dt}\\alpha_X(1) - \\Big(\\frac{d}{dt}\\alpha_X(t) \\Big)^2 \\\\ &= \\frac{p^2 r (r + 1)}{(1 - p)^2} + \\frac{rp}{1 - p} - \\frac{r^2p^2}{(1 - p)^2} \\\\ &= \\frac{rp}{(1 - p)^2}. \\end{align}\\] library(tidyr) set.seed(1) nsamps <- 100000 find_p <- function (mu, r) { return (10 / (r + 10)) } r <- c(1,2,10,20) p <- find_p(10, r) sigma <- rep(sqrt(p*r / (1 - p)^2), each = nsamps) samps <- cbind("r=1" = rnbinom(nsamps, size = r[1], prob = 1 - p[1]), "r=2" = rnbinom(nsamps, size = r[2], prob = 1 - p[2]), "r=4" = rnbinom(nsamps, size = r[3], prob = 1 - p[3]), "r=20" = rnbinom(nsamps, size = r[4], prob = 1 - p[4])) gsamps <- gather(as.data.frame(samps)) iw <- (gsamps$value > sigma + 10) | (gsamps$value < sigma - 10) ggplot(gsamps, aes(x = value, fill = iw)) + geom_bar() + # geom_density() + facet_wrap(~ key) 9.2 Moment generating functions (MGFs) Exercise 9.3 Find the variance of the geometric distribution. Solution. Let \\(X \\sim \\text{Geometric}(p)\\). The MGF of the geometric distribution is \\[\\begin{align} M_X(t) &= E[e^{tX}] \\\\ &= \\sum_{k=0}^\\infty p(1 - p)^k e^{tk} \\\\ &= p \\sum_{k=0}^\\infty ((1 - p)e^t)^k. \\end{align}\\] Let us assume that \\((1 - p)e^t < 1\\). Then, by using the geometric series we get \\[\\begin{align} M_X(t) &= \\frac{p}{1 - e^t + pe^t}. \\end{align}\\] The first derivative of the above expression is \\[\\begin{align} \\frac{d}{dt}M_X(t) &= \\frac{-p(-e^t + pe^t)}{(1 - e^t + pe^t)^2}, \\end{align}\\] and evaluating at \\(t = 0\\), we get \\(\\frac{1 - p}{p}\\), which we already recognize as the expected value of the geometric distribution. The second derivative is \\[\\begin{align} \\frac{d^2}{dt^2}M_X(t) &= \\frac{(p-1)pe^t((p-1)e^t - 1)}{((p - 1)e^t + 1)^3}, \\end{align}\\] and evaluating at \\(t = 0\\), we get \\(\\frac{(p - 1)(p - 2)}{p^2}\\). Combining we get the variance \\[\\begin{align} Var(X) &= \\frac{(p - 1)(p - 2)}{p^2} - \\frac{(1 - p)^2}{p^2} \\\\ &= \\frac{(p-1)(p-2) - (1-p)^2}{p^2} \\\\ &= \\frac{1 - p}{p^2}. \\end{align}\\] Exercise 9.4 Find the distribution of sum of two normal random variables \\(X\\) and \\(Y\\), by comparing \\(M_{X+Y}(t)\\) to \\(M_X(t)\\). R: To illustrate the result draw random samples from N\\((-3, 1)\\) and N\\((5, 1.2)\\) and calculate the empirical mean and variance of \\(X+Y\\). Plot all three histograms in one plot. Solution. Let \\(X \\sim \\text{N}(\\mu_X, 1)\\) and \\(Y \\sim \\text{N}(\\mu_Y, 1)\\). The MGF of the sum is \\[\\begin{align} M_{X+Y}(t) &= M_X(t) M_Y(t). \\end{align}\\] Let us calculate \\(M_X(t)\\), the MGF for \\(Y\\) then follows analogously. 
\\[\\begin{align} M_X(t) &= \\int_{-\\infty}^\\infty e^{tx} \\frac{1}{\\sqrt{2 \\pi \\sigma_X^2}} e^{-\\frac{(x - \\mu_X)^2}{2\\sigma_X^2}} dx \\\\ &= \\int_{-\\infty}^\\infty \\frac{1}{\\sqrt{2 \\pi \\sigma_X^2}} e^{-\\frac{(x - \\mu_X)^2 - 2\\sigma_X^2 tx}{2\\sigma_X^2}} dx \\\\ &= \\int_{-\\infty}^\\infty \\frac{1}{\\sqrt{2 \\pi \\sigma_X^2}} e^{-\\frac{x^2 - 2\\mu_X x + \\mu_X^2 - 2\\sigma_X^2 tx}{2\\sigma_X^2}} dx \\\\ &= \\int_{-\\infty}^\\infty \\frac{1}{\\sqrt{2 \\pi \\sigma_X^2}} e^{-\\frac{(x - (\\mu_X + \\sigma_X^2 t))^2 + \\mu_X^2 - (\\mu_X + \\sigma_X^2 t)^2}{2\\sigma_X^2}} dx & \\text{complete the square}\\\\ &= e^{-\\frac{\\mu_X^2 - (\\mu_X + \\sigma_X^2 t)^2}{2\\sigma_X^2}} \\int_{-\\infty}^\\infty \\frac{1}{\\sqrt{2 \\pi \\sigma_X^2}} e^{-\\frac{(x - (\\mu_X + \\sigma_X^2 t))^2}{2\\sigma_X^2}} dx & \\\\ &= e^{-\\frac{\\mu_X^2 - (\\mu_X + \\sigma_X^2 t)^2}{2\\sigma_X^2}} & \\text{normal PDF} \\\\ &= e^{-\\frac{\\mu_X^2 - \\mu_X^2 - 2\\mu_X \\sigma_X^2 t - \\sigma_X^4 t^2}{2\\sigma_X^2}} \\\\ &= e^{\\mu_X t + \\frac{\\sigma_X^2 t^2}{2}}. \\\\ \\end{align}\\] The MGF of the sum is then \\[\\begin{align} M_{X+Y}(t) &= e^{\\mu_X t + 0.5\\sigma_X^2 t^2} e^{\\mu_Y t + 0.5\\sigma_Y^2 t^2} \\\\ &= e^{t(\\mu_X + \\mu_Y) + 0.5 t^2(\\sigma_X^2 + \\sigma_Y^2)}. \\end{align}\\] By comparing \\(M_{X+Y}(t)\\) and \\(M_X(t)\\) we observe that both have the same form: the coefficient of \\(t\\) is the mean and the coefficient of \\(\\frac{t^2}{2}\\) is the variance. Since MGFs are unique, we conclude that \\(Z = X + Y \\sim \\text{N}(\\mu_X + \\mu_Y, \\sigma_X^2 + \\sigma_Y^2)\\). library(tidyr) library(ggplot2) set.seed(1) nsamps <- 1000 x <- rnorm(nsamps, -3, 1) y <- rnorm(nsamps, 5, 1.2) z <- x + y mean(z) ## [1] 1.968838 var(z) ## [1] 2.645034 df <- data.frame(x = x, y = y, z = z) %>% gather() ggplot(df, aes(x = value, fill = key)) + geom_histogram(position = "dodge") "],["ci.html", "Chapter 10 Concentration inequalities 10.1 Comparison 10.2 Practical", " Chapter 10 Concentration inequalities This chapter deals with concentration inequalities. The students are expected to acquire the following knowledge: Theoretical More assumptions produce closer bounds. R Optimization. Estimating probability inequalities. .fold-btn { float: right; margin: 5px 5px 0 0; } .fold { border: 1px solid black; min-height: 40px; } 10.1 Comparison Exercise 10.1 R: Let \\(X\\) be a geometric random variable with \\(p = 0.7\\). Visually compare the Markov bound, Chernoff bound, and the theoretical probabilities for \\(x = 1,...,12\\). To get the best-fitting Chernoff bound, you will need to optimize the bound depending on \\(t\\). Use either analytical or numerical optimization.
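Before the numerical solution below, note that the optimal \\(t\\) can also be found analytically (a short cross-check we add here, using the geometric MGF \\(M_X(t) = \\frac{p}{1 - (1 - p)e^t}\\) from Exercise 9.3): minimizing the exponent of the Chernoff bound \\(e^{-at} M_X(t)\\) over \\(t\\) gives \\[\\begin{align} \\frac{d}{dt}\\big(\\log M_X(t) - at\\big) = \\frac{(1 - p)e^t}{1 - (1 - p)e^t} - a = 0 \\implies t^* = \\log \\frac{a}{(a + 1)(1 - p)}. \\end{align}\\] For \\(p = 0.7\\) this gives \\(t^* \\approx 0.511\\) at \\(a = 1\\) and \\(t^* \\approx 1.124\\) at \\(a = 12\\), in agreement with the values returned by optimize in the code below.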
bound_chernoff <- function (t, p, a) { return ((p / (1 - exp(t) + p * exp(t))) / exp(a * t)) } set.seed(1) p <- 0.7 a <- seq(1, 12, by = 1) ci_markov <- (1 - p) / p / a t <- vector(mode = "numeric", length = length(a)) for (i in 1:length(t)) { t[i] <- optimize(bound_chernoff, interval = c(0, log(1 / (1 - p))), p = p, a = a[i])$minimum } t ## [1] 0.5108267 0.7984981 0.9162927 0.9808238 1.0216635 1.0498233 1.0704327 ## [8] 1.0861944 1.0986159 1.1086800 1.1169653 1.1239426 ci_chernoff <- (p / (1 - exp(t) + p * exp(t))) / exp(a * t) actual <- 1 - pgeom(a, 0.7) plot_df <- rbind( data.frame(x = a, y = ci_markov, type = "Markov"), data.frame(x = a, y = ci_chernoff, type = "Chernoff"), data.frame(x = a, y = actual, type = "Actual") ) ggplot(plot_df, aes(x = x, y = y, color = type)) + geom_line() Exercise 10.2 R: Let \\(X\\) be a sum of 100 Beta distributions with random parameters. Take 1000 samples and plot the Chebyshev bound, Hoeffding bound, and the empirical probabilities. set.seed(1) nvars <- 100 nsamps <- 1000 samps <- matrix(data = NA, nrow = nsamps, ncol = nvars) Sn_mean <- 0 Sn_var <- 0 for (i in 1:nvars) { alpha1 <- rgamma(1, 10, 1) beta1 <- rgamma(1, 10, 1) X <- rbeta(nsamps, alpha1, beta1) Sn_mean <- Sn_mean + alpha1 / (alpha1 + beta1) Sn_var <- Sn_var + alpha1 * beta1 / ((alpha1 + beta1)^2 * (alpha1 + beta1 + 1)) samps[ ,i] <- X } mean(apply(samps, 1, sum)) ## [1] 51.12511 Sn_mean ## [1] 51.15723 var(apply(samps, 1, sum)) ## [1] 1.170652 Sn_var ## [1] 1.166183 a <- 1:30 b <- a / sqrt(Sn_var) ci_chebyshev <- 1 / b^2 ci_hoeffding <- 2 * exp(- 2 * a^2 / nvars) empirical <- NULL for (i in 1:length(a)) { empirical[i] <- sum(abs((apply(samps, 1, sum)) - Sn_mean) >= a[i])/ nsamps } plot_df <- rbind( data.frame(x = a, y = ci_chebyshev, type = "Chebyshev"), data.frame(x = a, y = ci_hoeffding, type = "Hoeffding"), data.frame(x = a, y = empirical, type = "Empirical") ) ggplot(plot_df, aes(x = x, y = y, color = type)) + geom_line() ggplot(plot_df, aes(x = x, y = y, color = type)) + geom_line() + coord_cartesian(xlim = c(15, 25), ylim = c(0, 0.05)) 10.2 Practical Exercise 10.3 From Jagannathan. Let \\(X_i\\), \\(i = 1,...n\\), be a random sample of size \\(n\\) of a random variable \\(X\\). Let \\(X\\) have mean \\(\\mu\\) and variance \\(\\sigma^2\\). Find the size of the sample \\(n\\) required so that the probability that the difference between sample mean and true mean is smaller than \\(\\frac{\\sigma}{10}\\) is at least 0.95. Hint: Derive a version of the Chebyshev inequality for \\(P(|X - \\mu| \\geq a)\\) using Markov inequality. Solution. Let \\(\\bar{X} = \\sum_{i=1}^n X_i\\). Then \\(E[\\bar{X}] = \\mu\\) and \\(Var[\\bar{X}] = \\frac{\\sigma^2}{n}\\). Let us first derive another representation of Chebyshev inequality. \\[\\begin{align} P(|X - \\mu| \\geq a) = P(|X - \\mu|^2 \\geq a^2) \\leq \\frac{E[|X - \\mu|^2]}{a^2} = \\frac{Var[X]}{a^2}. \\end{align}\\] Let us use that on our sampling distribution: \\[\\begin{align} P(|\\bar{X} - \\mu| \\geq \\frac{\\sigma}{10}) \\leq \\frac{100 Var[\\bar{X}]}{\\sigma^2} = \\frac{100 Var[X]}{n \\sigma^2} = \\frac{100}{n}. \\end{align}\\] We are interested in the difference being smaller, therefore \\[\\begin{align} P(|\\bar{X} - \\mu| < \\frac{\\sigma}{10}) = 1 - P(|\\bar{X} - \\mu| \\geq \\frac{\\sigma}{10}) \\geq 1 - \\frac{100}{n} \\geq 0.95. \\end{align}\\] It follows that we need a sample size of \\(n \\geq \\frac{100}{0.05} = 2000\\). 
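The derivation above only uses the (distribution-free) Chebyshev bound, so as an optional sanity check (our own addition, not part of the original exercise) here is a small simulation sketch; the Exp(1) population, for which \\(\\mu = \\sigma = 1\\), is an arbitrary illustrative choice.
set.seed(1)
n <- 2000      # sample size suggested by the Chebyshev bound
reps <- 10000  # number of simulated samples
# proportion of samples whose mean lies within sigma/10 of the true mean
within <- replicate(reps, abs(mean(rexp(n, rate = 1)) - 1) < 1 / 10)
mean(within)   # should be at least 0.95; for this population it is in fact very close to 1
The bound guarantees at least 0.95 for any distribution with finite variance; for a specific, well-behaved population it is typically very conservative.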
"],["crv.html", "Chapter 11 Convergence of random variables", " Chapter 11 Convergence of random variables This chapter deals with convergence of random variables. The students are expected to acquire the following knowledge: Theoretical Finding convergences of random variables. .fold-btn { float: right; margin: 5px 5px 0 0; } .fold { border: 1px solid black; min-height: 40px; } Exercise 11.1 Let \\(X_1\\), \\(X_2\\),…, \\(X_n\\) be a sequence of Bernoulli random variables. Let \\(Y_n = \\frac{X_1 + X_2 + ... + X_n}{n^2}\\). Show that this sequence converges point-wise to the zero random variable. R: Use a simulation to check your answer. Solution. Let \\(\\epsilon\\) be arbitrary. We need to find such \\(n_0\\), that for every \\(n\\) greater than \\(n_0\\) \\(|Y_n| < \\epsilon\\) holds. \\[\\begin{align} |Y_n| &= |\\frac{X_1 + X_2 + ... + X_n}{n^2}| \\\\ &\\leq |\\frac{n}{n^2}| \\\\ &= \\frac{1}{n}. \\end{align}\\] So we need to find such \\(n_0\\), that for every \\(n > n_0\\) we will have \\(\\frac{1}{n} < \\epsilon\\). So \\(n_0 > \\frac{1}{\\epsilon}\\). x <- 1:1000 X <- matrix(data = NA, nrow = length(x), ncol = 100) y <- vector(mode = "numeric", length = length(x)) for (i in 1:length(x)) { X[i, ] <- rbinom(100, size = 1, prob = 0.5) } X <- apply(X, 2, cumsum) tmp_mat <- matrix(data = (1:1000)^2, nrow = 1000, ncol = 100) X <- X / tmp_mat y <- apply(X, 1, mean) ggplot(data.frame(x = x, y = y), aes(x = x, y = y)) + geom_line() Exercise 11.2 Let \\(\\Omega = [0,1]\\) and let \\(X_n\\) be a sequence of random variables, defined as \\[\\begin{align} X_n(\\omega) = \\begin{cases} \\omega^3, &\\omega = \\frac{i}{n}, &0 \\leq i \\leq 1 \\\\ 1, & \\text{otherwise.} \\end{cases} \\end{align}\\] Show that \\(X_n\\) converges almost surely to \\(X \\sim \\text{Uniform}(0,1)\\). Solution. We need to show \\(P(\\{\\omega: X_n(\\omega) \\rightarrow X(\\omega)\\}) = 1\\). Let \\(\\omega \\neq \\frac{i}{n}\\). Then for any \\(\\omega\\), \\(X_n\\) converges pointwise to \\(X\\): \\[\\begin{align} X_n(\\omega) = 1 \\implies |X_n(\\omega) - X(s)| = |1 - 1| < \\epsilon. \\end{align}\\] The above is independent of \\(n\\). Since there are countably infinite number of elements in the complement (\\(\\frac{i}{n}\\)), the probability of this set is 1. Exercise 11.3 Borrowed from Wasserman. Let \\(X_n \\sim \\text{N}(0, \\frac{1}{n})\\) and let \\(X\\) be a random variable with CDF \\[\\begin{align} F_X(x) = \\begin{cases} 0, &x < 0 \\\\ 1, &x \\geq 0. \\end{cases} \\end{align}\\] Does \\(X_n\\) converge to \\(X\\) in distribution? How about in probability? Prove or disprove these statement. R: Plot the CDF of \\(X_n\\) for \\(n = 1, 2, 5, 10, 100, 1000\\). Solution. Let us first check convergence in distribution. \\[\\begin{align} \\lim_{n \\rightarrow \\infty} F_{X_n}(x) &= \\lim_{n \\rightarrow \\infty} \\phi (\\sqrt(n) x). \\end{align}\\] We have two cases, for \\(x < 0\\) and \\(x > 0\\). We do not need to check for \\(x = 0\\), since \\(F_X\\) is not continuous in that point. \\[\\begin{align} \\lim_{n \\rightarrow \\infty} \\phi (\\sqrt(n) x) = \\begin{cases} 0, & x < 0 \\\\ 1, & x > 0. \\end{cases} \\end{align}\\] This is the same as \\(F_X\\). Let us now check convergence in probability. 
Since \\(X\\) is a point-mass distribution at zero, we have \\[\\begin{align} \\lim_{n \\rightarrow \\infty} P(|X_n| > \\epsilon) &= \\lim_{n \\rightarrow \\infty} (P(X_n > \\epsilon) + P(X_n < -\\epsilon)) \\\\ &= \\lim_{n \\rightarrow \\infty} (1 - P(X_n < \\epsilon) + P(X_n < -\\epsilon)) \\\\ &= \\lim_{n \\rightarrow \\infty} (1 - \\phi(\\sqrt{n} \\epsilon) + \\phi(- \\sqrt{n} \\epsilon)) \\\\ &= 0. \\end{align}\\] n <- c(1,2,5,10,100,1000) ggplot(data = data.frame(x = seq(-5, 5, by = 0.01)), aes(x = x)) + stat_function(fun = pnorm, args = list(mean = 0, sd = 1/1), aes(color = "sd = 1/1")) + stat_function(fun = pnorm, args = list(mean = 0, sd = 1/2), aes(color = "sd = 1/2")) + stat_function(fun = pnorm, args = list(mean = 0, sd = 1/5), aes(color = "sd = 1/5")) + stat_function(fun = pnorm, args = list(mean = 0, sd = 1/10), aes(color = "sd = 1/10")) + stat_function(fun = pnorm, args = list(mean = 0, sd = 1/100), aes(color = "sd = 1/100")) + stat_function(fun = pnorm, args = list(mean = 0, sd = 1/1000), aes(color = "sd = 1/1000")) + stat_function(fun = pnorm, args = list(mean = 0, sd = 1/10000), aes(color = "sd = 1/10000")) Exercise 11.4 Let \\(X_i\\) be i.i.d. and \\(\\mu = E(X_1)\\). Let variance of \\(X_1\\) be finite. Show that the mean of \\(X_i\\), \\(\\bar{X}_n = \\frac{1}{n}\\sum_{i=1}^n X_i\\) converges in quadratic mean to \\(\\mu\\). Solution. \\[\\begin{align} \\lim_{n \\rightarrow \\infty} E(|\\bar{X_n} - \\mu|^2) &= \\lim_{n \\rightarrow \\infty} E(\\bar{X_n}^2 - 2 \\bar{X_n} \\mu + \\mu^2) \\\\ &= \\lim_{n \\rightarrow \\infty} (E(\\bar{X_n}^2) - 2 \\mu E(\\frac{\\sum_{i=1}^n X_i}{n}) + \\mu^2) \\\\ &= \\lim_{n \\rightarrow \\infty} E(\\bar{X_n})^2 + \\lim_{n \\rightarrow \\infty} Var(\\bar{X_n}) - 2 \\mu^2 + \\mu^2 \\\\ &= \\lim_{n \\rightarrow \\infty} \\frac{n^2 \\mu^2}{n^2} + \\lim_{n \\rightarrow \\infty} \\frac{\\sigma^2}{n} - \\mu^2 \\\\ &= \\mu^2 - \\mu^2 + \\lim_{n \\rightarrow \\infty} \\frac{\\sigma^2}{n} \\\\ &= 0. \\end{align}\\] "],["lt.html", "Chapter 12 Limit theorems", " Chapter 12 Limit theorems This chapter deals with limit theorems. The students are expected to acquire the following knowledge: Theoretical Monte Carlo integration convergence. Difference between weak and strong law of large numbers. .fold-btn { float: right; margin: 5px 5px 0 0; } .fold { border: 1px solid black; min-height: 40px; } Exercise 12.1 Show that Monte Carlo integration converges almost surely to the true integral of a bounded function. Solution. Let \\(g\\) be a function defined on \\(\\Omega\\). Let \\(X_i\\), \\(i = 1,...,n\\) be i.i.d. (multivariate) uniform random variables with bounds defined on \\(\\Omega\\). Let \\(Y_i\\) = \\(g(X_i)\\). Then it follows that \\(Y_i\\) are also i.i.d. random variables and their expected value is \\(E[g(X)] = \\int_{\\Omega} g(x) f_X(x) dx = \\frac{1}{V_{\\Omega}} \\int_{\\Omega} g(x) dx\\). By the strong law of large numbers, we have \\[\\begin{equation} \\frac{1}{n}\\sum_{i=1}^n Y_i \\xrightarrow{\\text{a.s.}} E[g(X)]. \\end{equation}\\] It follows that \\[\\begin{equation} V_{\\Omega} \\frac{1}{n}\\sum_{i=1}^n Y_i \\xrightarrow{\\text{a.s.}} \\int_{\\Omega} g(x) dx. \\end{equation}\\] Exercise 12.2 Let \\(X\\) be a geometric random variable with probability 0.5 and support in positive integers. Let \\(Y = 2^X (-1)^X X^{-1}\\). Find the expected value of \\(Y\\) by using conditional convergence (this variable does not have an expected value in the conventional sense – the series is not absolutely convergent). 
R: Draw \\(10000\\) samples from a geometric distribution with probability 0.5 and support in positive integers to get \\(X\\). Then calculate \\(Y\\) and plot the means at each iteration (sample). Additionally, plot the expected value calculated in a. Try it with different seeds. What do you notice? Solution. \\[\\begin{align*} E[Y] &= \\sum_{x=1}^{\\infty} \\frac{2^x (-1)^x}{x} 0.5^x \\\\ &= \\sum_{x=1}^{\\infty} \\frac{(-1)^x}{x} \\\\ &= - \\sum_{x=1}^{\\infty} \\frac{(-1)^{x+1}}{x} \\\\ &= - \\ln(2) \\end{align*}\\] set.seed(3) x <- rgeom(100000, prob = 0.5) + 1 y <- 2^x * (-1)^x * x^{-1} y_means <- cumsum(y) / seq_along(y) df <- data.frame(x = 1:length(y_means), y = y_means) ggplot(data = df, aes(x = x, y = y)) + geom_line() + geom_hline(yintercept = -log(2)) "],["eb.html", "Chapter 13 Estimation basics 13.1 ECDF 13.2 Properties of estimators", " Chapter 13 Estimation basics This chapter deals with estimation basics. The students are expected to acquire the following knowledge: Biased and unbiased estimators. Consistent estimators. Empirical cumulative distribution function. .fold-btn { float: right; margin: 5px 5px 0 0; } .fold { border: 1px solid black; min-height: 40px; } 13.1 ECDF Exercise 13.1 (ECDF intuition) Take any univariate continuous distribution that is readily available in R and plot its CDF (\\(F\\)). Draw one sample (\\(n = 1\\)) from the chosen distribution and draw the ECDF (\\(F_n\\)) of that one sample. Use the definition of the ECDF, not an existing function in R. Implementation hint: ECDFs are always piecewise constant - they only jump at the sampled values and by \\(1/n\\). Repeat (b) for \\(n = 5, 10, 100, 1000...\\) Theory says that \\(F_n\\) should converge to \\(F\\). Can you observe that? For \\(n = 100\\) repeat the process \\(m = 20\\) times and plot every \\(F_n^{(m)}\\). Theory says that \\(F_n\\) will converge to \\(F\\) the slowest where \\(F\\) is close to 0.5 (where the variance is largest). Can you observe that? library(ggplot2) set.seed(1) ggplot(data = data.frame(x = seq(-5, 5, by = 0.01))) + # stat_function(aes(x = x), fun = pbeta, args = list(shape1 = 1, shape2 = 2)) stat_function(aes(x = x), fun = pnorm, args = list(mean = 0, sd = 1)) one_samp <- rnorm(1) X <- data.frame(x = c(-5, one_samp, 5), y = c(0,1,1)) ggplot(data = data.frame(x = seq(-5, 5, by = 0.01))) + # stat_function(aes(x = x), fun = pbeta, args = list(shape1 = 1, shape2 = 2)) stat_function(aes(x = x), fun = pnorm, args = list(mean = 0, sd = 1)) + geom_step(data = X, aes(x = x, y = y)) N <- c(5, 10, 100, 1000) X <- NULL for (n in N) { tmp <- rnorm(n) tmp_X <- data.frame(x = c(-5, sort(tmp), 5), y = c(0, seq(1/n, 1, by = 1/n), 1), n = n) X <- rbind(X, tmp_X) } ggplot(data = data.frame(x = seq(-5, 5, by = 0.01))) + # stat_function(aes(x = x), fun = pbeta, args = list(shape1 = 1, shape2 = 2)) stat_function(aes(x = x), fun = pnorm, args = list(mean = 0, sd = 1)) + geom_step(data = X, aes(x = x, y = y, color = as.factor(n))) + labs(color = "N") 13.2 Properties of estimators Exercise 13.2 Show that the sample average is, as an estimator of the mean: unbiased, consistent, asymptotically normal. Solution. \\[\\begin{align*} E[\\frac{1}{n} \\sum_{i=1}^n X_i] &= \\frac{1}{n} \\sum_{i=i}^n E[X_i] \\\\ &= E[X]. 
\\end{align*}\\] \\[\\begin{align*} \\lim_{n \\rightarrow \\infty} P(|\\frac{1}{n} \\sum_{i=1}^n X_i - E[X]| > \\epsilon) &= \\lim_{n \\rightarrow \\infty} P((\\frac{1}{n} \\sum_{i=1}^n X_i - E[X])^2 > \\epsilon^2) \\\\ & \\leq \\lim_{n \\rightarrow \\infty} \\frac{E[(\\frac{1}{n} \\sum_{i=1}^n X_i - E[X])^2]}{\\epsilon^2} & \\text{Markov inequality} \\\\ & = \\lim_{n \\rightarrow \\infty} \\frac{E[(\\frac{1}{n} \\sum_{i=1}^n X_i)^2 - 2 \\frac{1}{n} \\sum_{i=1}^n X_i E[X] + E[X]^2]}{\\epsilon^2} \\\\ & = \\lim_{n \\rightarrow \\infty} \\frac{E[(\\frac{1}{n} \\sum_{i=1}^n X_i)^2] - 2 E[X]^2 + E[X]^2}{\\epsilon^2} \\\\ &= 0 \\end{align*}\\] For the last equality see the solution to ??. Follows directly from the CLT. Exercise 13.3 (Consistent but biased estimator) Show that sample variance (the plug-in estimator of variance) is a biased estimator of variance. Show that sample variance is a consistent estimator of variance. Show that the estimator with (\\(N-1\\)) (Bessel correction) is unbiased. Solution. \\[\\begin{align*} E[\\frac{1}{n} \\sum_{i=1}^n (Y_i - \\bar{Y})^2] &= \\frac{1}{n} \\sum_{i=1}^n E[(Y_i - \\bar{Y})^2] \\\\ &= \\frac{1}{n} \\sum_{i=1}^n E[Y_i^2] - 2 E[Y_i \\bar{Y}] + \\bar{Y}^2)] \\\\ &= \\frac{1}{n} \\sum_{i=1}^n E[Y_i^2 - 2 Y_i \\bar{Y} + \\bar{Y}^2] \\\\ &= \\frac{1}{n} \\sum_{i=1}^n E[Y_i^2 - \\frac{2}{n} Y_i^2 - \\frac{2}{n} \\sum_{i \\neq j} Y_i Y_j + \\frac{1}{n^2}\\sum_j \\sum_{k \\neq j} Y_j Y_k + \\frac{1}{n^2} \\sum_j Y_j^2] \\\\ &= \\frac{1}{n} \\sum_{i=1}^n \\frac{n - 2}{n} (\\sigma^2 + \\mu^2) - \\frac{2}{n} (n - 1) \\mu^2 + \\frac{1}{n^2}n(n-1)\\mu^2 + \\frac{1}{n^2}n(\\sigma^2 + \\mu^2) \\\\ &= \\frac{n-1}{n}\\sigma^2 \\\\ < \\sigma^2. \\end{align*}\\] Let \\(S_n\\) denote the sample variance. Then we can write it as \\[\\begin{align*} S_n &= \\frac{1}{n} \\sum_{i=1}^n (X_i - \\bar{X})^2 = \\frac{1}{n} \\sum_{i=1}^n (X_i - \\mu)^2 + 2(X_i - \\mu)(\\mu - \\bar{X}) + (\\mu - \\bar{X})^2. \\end{align*}\\] Now \\(\\bar{X}\\) converges in probability (by WLLN) to \\(\\mu\\) therefore the right terms converge in probability to zero. The left term converges in probability to \\(\\sigma^2\\), also by WLLN. Therefore the sample variance is a consistent estimatior of the variance. The denominator changes in the second-to-last line of a., therefore the last line is now equality. Exercise 13.4 (Estimating the median) Show that the sample median is an unbiased estimator of the median for N\\((\\mu, \\sigma^2)\\). Show that the sample median is an unbiased estimator of the mean for any distribution with symmetric density. Hint 1: The pdf of an order statistic is \\(f_{X_{(k)}}(x) = \\frac{n!}{(n - k)!(k - 1)!}f_X(x)\\Big(F_X(x)^{k-1} (1 - F_X(x)^{n - k}) \\Big)\\). Hint 2: A distribution is symmetric when \\(X\\) and \\(2a - X\\) have the same distribution for some \\(a\\). Solution. Let \\(Z_i\\), \\(i = 1,...,n\\) be i.i.d. variables with a symmetric distribution and let \\(Z_{k:n}\\) denote the \\(k\\)-th order statistic. We will distinguish two cases, when \\(n\\) is odd and when \\(n\\) is even. Let first \\(n = 2m + 1\\) be odd. Then the sample median is \\(M = Z_{m+1:2m+1}\\). Its PDF is \\[\\begin{align*} f_M(x) = (m+1)\\binom{2m + 1}{m}f_Z(x)\\Big(F_Z(x)^m (1 - F_Z(x)^m) \\Big). \\end{align*}\\] For every symmetric distribution, it holds that \\(F_X(x) = 1 - F(2a - x)\\). Let \\(a = \\mu\\), the population mean. Plugging this into the PDF, we get that \\(f_M(x) = f_M(2\\mu -x)\\). 
It follows that \\[\\begin{align*} E[M] &= E[2\\mu - M] \\\\ 2E[M] &= 2\\mu \\\\ E[M] &= \\mu. \\end{align*}\\] Now let \\(n = 2m\\) be even. Then the sample median is \\(M = \\frac{Z_{m:2m} + Z_{m+1:2m}}{2}\\). It can be shown, that the joint PDF of these terms is also symmetric. Therefore, similar to the above \\[\\begin{align*} E[M] &= E[\\frac{Z_{m:2m} + Z_{m+1:2m}}{2}] \\\\ &= E[\\frac{2\\mu - M + 2\\mu - M}{2}] \\\\ &= E[2\\mu - M]. \\end{align*}\\] The above also proves point a. as the median and the mean are the same in normal distribution. Exercise 13.5 (Matrix trace estimation) The Hutchinson trace estimator [1] is an estimator of the trace of a symmetric positive semidefinite matrix A that relies on Monte Carlo sampling. The estimator is defined as \\[\\begin{align*} \\textrm{tr}(A) \\approx \\frac{1}{n} \\Sigma_{i=1}^n z_i^T A z_i, &\\\\ z_i \\sim_{\\mathrm{IID}} \\textrm{Uniform}(\\{-1, 1\\}^m), & \\end{align*}\\] where \\(A \\in \\mathbb{R}^{m \\times m}\\) is a symmetric positive semidefinite matrix. Elements of each vector \\(z_i\\) are either \\(-1\\) or \\(1\\) with equal probability. This is also called a Rademacher distribution. Data scientists often want the trace of a Hessian to obtain valuable curvature information for a loss function. Per [2], an example is classifying ten digits based on \\((28,28)\\) grayscale images (i.e. MNIST data) using logistic regression. The number of parameters is \\(m = 28^2 \\cdot 10 = 7840\\) and the size of the Hessian is \\(m^2\\), roughly \\(6 \\cdot 10^6\\). The diagonal average is equal to the average eigenvalue, which may be useful for optimization; in MCMC contexts, this would be useful for preconditioners and step size optimization. Computing Hessians (as a means of getting eigenvalue information) is often intractable, but Hessian-vector products can be computed faster by autodifferentiation (with e.g. Tensorflow, Pytorch, Jax). This is one motivation for the use of a stochastic trace estimator as outlined above. References: A stochastic estimator of the trace of the influence matrix for laplacian smoothing splines (Hutchinson, 1990) A Modern Analysis of Hutchinson’s Trace Estimator (Skorski, 2020) Prove that the Hutchinson trace estimator is an unbiased estimator of the trace. Solution. We first simplify our task: \\[\\begin{align} \\mathbb{E}\\left[\\frac{1}{n} \\Sigma_{i=1}^n z_i^T A z_i \\right] &= \\frac{1}{n} \\Sigma_{i=1}^n \\mathbb{E}\\left[z_i^T A z_i \\right] \\\\ &= \\mathbb{E}\\left[z_i^T A z_i \\right], \\end{align}\\] where the second equality is due to having \\(n\\) IID vectors \\(z_i\\). We now only need to show that \\(\\mathbb{E}\\left[z^T A z \\right] = \\mathrm{tr}(A)\\). We omit the index due to all vectors being IID: \\[\\begin{align} \\mathrm{tr}(A) &= \\mathrm{tr}(AI) \\\\ &= \\mathrm{tr}(A\\mathbb{E}[zz^T]) \\\\ &= \\mathbb{E}[\\mathrm{tr}(Azz^T)] \\\\ &= \\mathbb{E}[\\mathrm{tr}(z^TAz)] \\\\ &= \\mathbb{E}[z^TAz]. \\end{align}\\] This concludes the proof. We clarify some equalities below. The second equality assumes that \\(\\mathbb{E}[zz^T] = I\\). By noting that the mean of the Rademacher distribution is 0, we have \\[\\begin{align} \\mathrm{Cov}[z, z] &= \\mathbb{E}[(z - \\mathbb{E}[z])(z - \\mathbb{E}[z])^T] \\\\ &= \\mathbb{E}[zz^T]. \\end{align}\\] Dimensions of \\(z\\) are independent, so \\(\\mathrm{Cov}[z, z]_{ij} = 0\\) for \\(i \\neq j\\). 
The diagonal will contain variances, which are equal to \\(1\\) for all dimensions \\(k = 1 \\dots m\\): \\(\\mathrm{Var}[z^{(k)}] = \\mathbb{E}[z^{(k)}z^{(k)}] - \\mathbb{E}[z^{(k)}]^2 = 1 - 0 = 1\\). It follows that the covariance is an identity matrix. Note that this is a general result for vectors with IID dimensions sampled from a distribution with mean 0 and variance 1. We could therefore use something else instead of the Rademacher, e.g. \\(z ~ N(0, I)\\). The third equality uses the fact that the expectation of a trace equals the trace of an expectation. If \\(X\\) is a random matrix, then \\(\\mathbb{E}[X]_{ij} = \\mathbb{E}[X_{ij}]\\). Therefore: \\[\\begin{align} \\mathrm{tr}(\\mathbb{E}[X]) &= \\Sigma_{i=1}^m(\\mathbb{E}[X]_{ii}) \\\\ &= \\Sigma_{i=1}^m(\\mathbb{E}[X_{ii}]) \\\\ &= \\mathbb{E}[\\Sigma_{i=1}^m(X_{ii})] \\\\ &= \\mathbb{E}[\\mathrm{tr}(X)], \\end{align}\\] where we used the linearity of the expectation in the third step. The fourth equality uses the fact that \\(\\mathrm{tr}(AB) = \\mathrm{tr}(BA)\\) for any matrices \\(A \\in \\mathbb{R}^{n \\times m}, B \\in \\mathbb{R}^{m \\times n}\\). The last inequality uses the fact that the trace of a \\(1 \\times 1\\) matrix is just its element. "],["boot.html", "Chapter 14 Bootstrap", " Chapter 14 Bootstrap This chapter deals with bootstrap. The students are expected to acquire the following knowledge: How to use bootstrap to generate coverage intervals. .fold-btn { float: right; margin: 5px 5px 0 0; } .fold { border: 1px solid black; min-height: 40px; } Exercise 14.1 Ideally, a \\(1-\\alpha\\) CI would have \\(1-\\alpha\\) coverage. That is, say a 95% CI should, in the long run, contain the true value of the parameter 95% of the time. In practice, it is impossible to assess the coverage of our CI method, because we rarely know the true parameter. In simulation, however, we can. Let’s assess the coverage of bootstrap percentile intervals. Pick a univariate distribution with readily available mean and one that you can easily sample from. Draw \\(n = 30\\) random samples from the chosen distribution and use the bootstrap (with large enough m) and percentile CI method to construct 95% CI. Repeat the process many times and count how many times the CI contains the true mean. That is, compute the actual coverage probability (don’t forget to include the standard error of the coverage probability!). What can you observe? Try one or two different distributions. What can you observe? Repeat (b) and (c) using BCa intervals (R package boot). How does the coverage compare to percentile intervals? As (d) but using intervals based on asymptotic normality (+/- 1.96 SE). How do results from (b), (d), and (e) change if we increase the sample size to n = 200? What about n = 5? library(boot) set.seed(0) nit <- 1000 # Repeat the process "many times" alpha <- 0.05 # CI parameter nboot <- 100 # m parameter for bootstrap ("large enough m") # f: change this to 200 or 5. nsample <- 30 # n = 30 random samples from the chosen distribution. Comment out BCa code if it breaks. 
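# Descriptive comments added for orientation (the simulation itself is unchanged):
#  - each iteration draws three samples: N(0,1), Beta(1,2) and Pois(5), whose true
#    means are 0, 1/3 and 5 respectively
#  - three CI methods are compared: bootstrap percentile, BCa (package boot) and
#    asymptotic normality (mean +/- 1.96 * SE)
#  - the covers* matrices record, per iteration and distribution, whether the 95% CI
#    contained the true mean; their column means at the end estimate the coverage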
covers <- matrix(nrow = nit, ncol = 3) covers_BCa <- matrix(nrow = nit, ncol = 3) covers_asymp_norm <- matrix(nrow = nit, ncol = 3) isin <- function (x, lower, upper) { (x > lower) & (x < upper) } for (j in 1:nit) { # Repeating many times # a: pick a univariate distribution - standard normal x1 <- rnorm(nsample) # c: one or two different distributions - beta and poisson x2 <- rbeta(nsample, 1, 2) x3 <- rpois(nsample, 5) X1 <- matrix(data = NA, nrow = nsample, ncol = nboot) X2 <- matrix(data = NA, nrow = nsample, ncol = nboot) X3 <- matrix(data = NA, nrow = nsample, ncol = nboot) for (i in 1:nboot) { X1[ ,i] <- sample(x1, nsample, replace = T) X2[ ,i] <- sample(x2, nsample, T) X3[ ,i] <- sample(x3, nsample, T) } X1_func <- apply(X1, 2, mean) X2_func <- apply(X2, 2, mean) X3_func <- apply(X3, 2, mean) X1_quant <- quantile(X1_func, probs = c(alpha / 2, 1 - alpha / 2)) X2_quant <- quantile(X2_func, probs = c(alpha / 2, 1 - alpha / 2)) X3_quant <- quantile(X3_func, probs = c(alpha / 2, 1 - alpha / 2)) covers[j,1] <- (0 > X1_quant[1]) & (0 < X1_quant[2]) covers[j,2] <- ((1 / 3) > X2_quant[1]) & ((1 / 3) < X2_quant[2]) covers[j,3] <- (5 > X3_quant[1]) & (5 < X3_quant[2]) mf <- function (x, i) return(mean(x[i])) bootX1 <- boot(x1, statistic = mf, R = nboot) bootX2 <- boot(x2, statistic = mf, R = nboot) bootX3 <- boot(x3, statistic = mf, R = nboot) X1_quant_BCa <- boot.ci(bootX1, type = "bca")$bca X2_quant_BCa <- boot.ci(bootX2, type = "bca")$bca X3_quant_BCa <- boot.ci(bootX3, type = "bca")$bca covers_BCa[j,1] <- (0 > X1_quant_BCa[4]) & (0 < X1_quant_BCa[5]) covers_BCa[j,2] <- ((1 / 3) > X2_quant_BCa[4]) & ((1 / 3) < X2_quant_BCa[5]) covers_BCa[j,3] <- (5 > X3_quant_BCa[4]) & (5 < X3_quant_BCa[5]) # e: estimate mean and standard error # sample mean: x1_bar <- mean(x1) x2_bar <- mean(x2) x3_bar <- mean(x3) # standard error (of the sample mean) estimate: sample standard deviation / sqrt(n) x1_bar_SE <- sd(x1) / sqrt(nsample) x2_bar_SE <- sd(x2) / sqrt(nsample) x3_bar_SE <- sd(x3) / sqrt(nsample) covers_asymp_norm[j,1] <- isin(0, x1_bar - 1.96 * x1_bar_SE, x1_bar + 1.96 * x1_bar_SE) covers_asymp_norm[j,2] <- isin(1/3, x2_bar - 1.96 * x2_bar_SE, x2_bar + 1.96 * x2_bar_SE) covers_asymp_norm[j,3] <- isin(5, x3_bar - 1.96 * x3_bar_SE, x3_bar + 1.96 * x3_bar_SE) } apply(covers, 2, mean) ## [1] 0.918 0.925 0.905 apply(covers, 2, sd) / sqrt(nit) ## [1] 0.008680516 0.008333333 0.009276910 apply(covers_BCa, 2, mean) ## [1] 0.927 0.944 0.927 apply(covers_BCa, 2, sd) / sqrt(nit) ## [1] 0.008230355 0.007274401 0.008230355 apply(covers_asymp_norm, 2, mean) ## [1] 0.939 0.937 0.930 apply(covers_asymp_norm, 2, sd) / sqrt(nit) ## [1] 0.007572076 0.007687008 0.008072494 Exercise 14.2 You are given a sample of independent observations from a process of interest: Index 1 2 3 4 5 6 7 8 X 7 2 4 6 4 5 9 10 Compute the plug-in estimate of mean and 95% symmetric CI based on asymptotic normality. Use the plug-in estimate of SE. Same as (a), but use the unbiased estimate of SE. Apply nonparametric bootstrap with 1000 bootstrap replications and estimate the 95% CI for the mean with percentile-based CI. 
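A short note added for clarity before the solution: the plug-in SE in (a) uses the MLE-style variance estimate \\(\\frac{1}{n}\\sum_{i=1}^n (x_i - \\bar{x})^2\\), while (b) uses the Bessel-corrected \\(\\frac{1}{n-1}\\sum_{i=1}^n (x_i - \\bar{x})^2\\), so the two standard errors differ only by a factor of \\(\\sqrt{(n-1)/n}\\). A minimal check with the data from the exercise:
x <- c(7, 2, 4, 6, 4, 5, 9, 10)
n <- length(x)
se_plugin   <- sqrt(mean((x - mean(x))^2)) / sqrt(n)  # plug-in: divide by n
se_unbiased <- sd(x) / sqrt(n)                        # Bessel correction: divide by n - 1
se_plugin / se_unbiased                               # equals sqrt((n - 1) / n), about 0.935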
# a x <- c(7, 2, 4, 6, 4, 5, 9, 10) n <- length(x) mu <- mean(x) SE <- sqrt(mean((x - mu)^2)) / sqrt(n) SE ## [1] 0.8915839 z <- qnorm(1 - 0.05 / 2) c(mu - z * SE, mu + z * SE) ## [1] 4.127528 7.622472 # b SE <- sd(x) / sqrt(n) SE ## [1] 0.9531433 c(mu - z * SE, mu + z * SE) ## [1] 4.006873 7.743127 # c set.seed(0) m <- 1000 T_mean <- function(x) {mean(x)} est_boot <- array(NA, m) for (i in 1:m) { x_boot <- x[sample(1:n, n, rep = T)] est_boot[i] <- T_mean(x_boot) } quantile(est_boot, p = c(0.025, 0.975)) ## 2.5% 97.5% ## 4.250 7.625 Exercise 14.3 We are given a sample of 10 independent paired (bivariate) observations: Index 1 2 3 4 5 6 7 8 9 10 X 1.26 -0.33 1.33 1.27 0.41 -1.54 -0.93 -0.29 -0.01 2.40 Y 2.64 0.33 0.48 0.06 -0.88 -2.14 -2.21 0.95 0.83 1.45 Compute Pearson correlation between X and Y. Use the cor.test() from R to estimate a 95% CI for the estimate from (a). Apply nonparametric bootstrap with 1000 bootstrap replications and estimate the 95% CI for the Pearson correlation with percentile-based CI. Compare CI from (b) and (c). Are they similar? How would the bootstrap estimation of CI change if we were interested in Spearman or Kendall correlation instead? x <- c(1.26, -0.33, 1.33, 1.27, 0.41, -1.54, -0.93, -0.29, -0.01, 2.40) y <- c(2.64, 0.33, 0.48, 0.06, -0.88, -2.14, -2.21, 0.95, 0.83, 1.45) # a cor(x, y) ## [1] 0.6991247 # b res <- cor.test(x, y) res$conf.int[1:2] ## [1] 0.1241458 0.9226238 # c set.seed(0) m <- 1000 n <- length(x) T_cor <- function(x, y) {cor(x, y)} est_boot <- array(NA, m) for (i in 1:m) { idx <- sample(1:n, n, rep = T) # !!! important to use same indices to keep dependency between x and y est_boot[i] <- T_cor(x[idx], y[idx]) } quantile(est_boot, p = c(0.025, 0.975)) ## 2.5% 97.5% ## 0.2565537 0.9057664 # d # Yes, but the bootstrap CI is more narrow. # e # We just use the functions for Kendall/Spearman coefficients instead: T_kendall <- function(x, y) {cor(x, y, method = "kendall")} T_spearman <- function(x, y) {cor(x, y, method = "spearman")} # Put this in a function that returns the CI bootstrap_95_ci <- function(x, y, t, m = 1000) { n <- length(x) est_boot <- array(NA, m) for (i in 1:m) { idx <- sample(1:n, n, rep = T) # !!! important to use same indices to keep dependency between x and y est_boot[i] <- t(x[idx], y[idx]) } quantile(est_boot, p = c(0.025, 0.975)) } bootstrap_95_ci(x, y, T_kendall) ## 2.5% 97.5% ## -0.08108108 0.78378378 bootstrap_95_ci(x, y, T_spearman) ## 2.5% 97.5% ## -0.1701115 0.8867925 Exercise 14.4 In this problem we will illustrate the use of the nonparametric bootstrap for estimating CIs of regression model coefficients. Load the longley dataset from base R with data(longley). Use lm() to apply linear regression using “Employed” as the target (dependent) variable and all other variables as the predictors (independent). Using lm() results, print the estimated regression coefficients and standard errors. Estimate 95% CI for the coefficients using +/- 1.96 * SE. Use nonparametric bootstrap with 100 replications to estimate the SE of the coefficients from (b). Compare the SE from (c) with those from (b). # a data(longley) # b res <- lm(Employed ~ . 
, longley) tmp <- data.frame(summary(res)$coefficients[,1:2]) tmp$LB <- tmp[,1] - 1.96 * tmp[,2] tmp$UB <- tmp[,1] + 1.96 * tmp[,2] tmp ## Estimate Std..Error LB UB ## (Intercept) -3.482259e+03 8.904204e+02 -5.227483e+03 -1.737035e+03 ## GNP.deflator 1.506187e-02 8.491493e-02 -1.513714e-01 1.814951e-01 ## GNP -3.581918e-02 3.349101e-02 -1.014616e-01 2.982320e-02 ## Unemployed -2.020230e-02 4.883997e-03 -2.977493e-02 -1.062966e-02 ## Armed.Forces -1.033227e-02 2.142742e-03 -1.453204e-02 -6.132495e-03 ## Population -5.110411e-02 2.260732e-01 -4.942076e-01 3.919994e-01 ## Year 1.829151e+00 4.554785e-01 9.364136e-01 2.721889e+00 # c set.seed(0) m <- 100 n <- nrow(longley) T_coef <- function(x) { lm(Employed ~ . , x)$coefficients } est_boot <- array(NA, c(m, ncol(longley))) for (i in 1:m) { idx <- sample(1:n, n, rep = T) est_boot[i,] <- T_coef(longley[idx,]) } SE <- apply(est_boot, 2, sd) SE ## [1] 1.826011e+03 1.605981e-01 5.693746e-02 8.204892e-03 3.802225e-03 ## [6] 3.907527e-01 9.414436e-01 # Show the standard errors around coefficients library(ggplot2) library(reshape2) df <- data.frame(index = 1:7, bootstrap_SE = SE, lm_SE = tmp$Std..Error) melted_df <- melt(df[2:nrow(df), ], id.vars = "index") # Ignore bias which has a really large magnitude ggplot(melted_df, aes(x = index, y = value, fill = variable)) + geom_bar(stat="identity", position="dodge") + xlab("Coefficient") + ylab("Standard error") # + scale_y_continuous(trans = "log") # If you want to also plot bias Exercise 14.5 This exercise shows a shortcoming of the bootstrap method when using the plug in estimator for the maximum. Compute the 95% bootstrap CI for the maximum of a standard normal distribution. Compute the 95% bootstrap CI for the maximum of a binomial distribution with n = 15 and p = 0.2. Repeat (b) using p = 0.9. Why is the result different? # bootstrap CI for maximum alpha <- 0.05 T_max <- function(x) {max(x)} # Equal to T_max = max bootstrap <- function(x, t, m = 1000) { n <- length(x) values <- rep(0, m) for (i in 1:m) { values[i] <- t(sample(x, n, replace = T)) } quantile(values, probs = c(alpha / 2, 1 - alpha / 2)) } # a # Meaningless, as the normal distribution can yield arbitrarily large values. x <- rnorm(100) bootstrap(x, T_max) ## 2.5% 97.5% ## 1.819425 2.961743 # b x <- rbinom(100, size = 15, prob = 0.2) # min = 0, max = 15 bootstrap(x, T_max) ## 2.5% 97.5% ## 6 7 # c x <- rbinom(100, size = 15, prob = 0.9) # min = 0, max = 15 bootstrap(x, T_max) ## 2.5% 97.5% ## 15 15 # Observation: to estimate the maximum, we need sufficient probability mass near the maximum value the distribution can yield. # Using bootstrap is pointless when there is too little mass near the true maximum. # In general, bootstrap will fail when estimating the CI for the maximum. "],["ml.html", "Chapter 15 Maximum likelihood 15.1 Deriving MLE 15.2 Fisher information 15.3 The German tank problem", " Chapter 15 Maximum likelihood This chapter deals with maximum likelihood estimation. The students are expected to acquire the following knowledge: How to derive MLE. Applying MLE in R. Calculating and interpreting Fisher information. Practical use of MLE. .fold-btn { float: right; margin: 5px 5px 0 0; } .fold { border: 1px solid black; min-height: 40px; } 15.1 Deriving MLE Exercise 15.1 Derive the maximum likelihood estimator of variance for N\\((\\mu, \\sigma^2)\\). Compare with results from 13.3. What does that say about the MLE estimator? Solution. 
The mean is assumed constant, so we have the likelihood \\[\\begin{align} L(\\sigma^2; y) &= \\prod_{i=1}^n \\frac{1}{\\sqrt{2 \\pi \\sigma^2}} e^{-\\frac{(y_i - \\mu)^2}{2 \\sigma^2}} \\\\ &= \\frac{1}{\\sqrt{2 \\pi \\sigma^2}^n} e^{\\frac{-\\sum_{i=1}^n (y_i - \\mu)^2}{2 \\sigma^2}} \\end{align}\\] We need to find the maximum of this function. We first observe that we can replace \\(\\frac{-\\sum_{i=1}^n (y_i - \\mu)^2}{2}\\) with a constant \\(c\\), since none of the terms are dependent on \\(\\sigma^2\\). Additionally, the term \\(\\frac{1}{\\sqrt{2 \\pi}^n}\\) does not affect the calculation of the maximum. So now we have \\[\\begin{align} L(\\sigma^2; y) &= (\\sigma^2)^{-\\frac{n}{2}} e^{\\frac{c}{\\sigma^2}}. \\end{align}\\] Differentiating we get \\[\\begin{align} \\frac{d}{d \\sigma^2} L(\\sigma^2; y) &= (\\sigma^2)^{-\\frac{n}{2}} \\frac{d}{d \\sigma^2} e^{\\frac{c}{\\sigma^2}} + e^{\\frac{c}{\\sigma^2}} \\frac{d}{d \\sigma^2} (\\sigma^2)^{-\\frac{n}{2}} \\\\ &= - (\\sigma^2)^{-\\frac{n}{2}} e^{\\frac{c}{\\sigma^2}} \\frac{c}{(\\sigma^2)^2} - e^{\\frac{c}{\\sigma^2}} \\frac{n}{2} (\\sigma^2)^{-\\frac{n + 2}{2}} \\\\ &= - (\\sigma^2)^{-\\frac{n + 4}{2}} e^{\\frac{c}{\\sigma^2}} c - e^{\\frac{c}{\\sigma^2}} \\frac{n}{2} (\\sigma^2)^{-\\frac{n + 2}{2}} \\\\ &= - e^{\\frac{c}{\\sigma^2}} (\\sigma^2)^{-\\frac{n + 4}{2}} \\Big(c + \\frac{n}{2}\\sigma^2 \\Big). \\end{align}\\] To get the maximum, this has to equal to 0, so \\[\\begin{align} c + \\frac{n}{2}\\sigma^2 &= 0 \\\\ \\sigma^2 &= -\\frac{2c}{n} \\\\ \\sigma^2 &= \\frac{\\sum_{i=1}^n (Y_i - \\mu)^2}{n}. \\end{align}\\] The MLE estimator is biased. Exercise 15.2 (Multivariate normal distribution) Derive the maximum likelihood estimate for the mean and covariance matrix of the multivariate normal. Simulate \\(n = 40\\) samples from a bivariate normal distribution (choose non-trivial parameters, that is, mean \\(\\neq 0\\) and covariance \\(\\neq 0\\)). Compute the MLE for the sample. Overlay the data with an ellipse that is determined by the MLE and an ellipse that is determined by the chosen true parameters. Repeat b. several times and observe how the estimates (ellipses) vary around the true value. Hint: For the derivation of MLE, these identities will be helpful: \\(\\frac{\\partial b^T a}{\\partial a} = \\frac{\\partial a^T b}{\\partial a} = b\\), \\(\\frac{\\partial a^T A a}{\\partial a} = (A + A^T)a\\), \\(\\frac{\\partial \\text{tr}(BA)}{\\partial A} = B^T\\), \\(\\frac{\\partial \\ln |A|}{\\partial A} = (A^{-1})^T\\), \\(a^T A a = \\text{tr}(a^T A a) = \\text{tr}(a a^T A) = \\text{tr}(Aaa^T)\\). Solution. The log likelihood of the MVN distribution is \\[\\begin{align*} l(\\mu, \\Sigma ; x) &= -\\frac{1}{2}\\Big(\\sum_{i=1}^n k\\ln(2\\pi) + |\\Sigma| + (x_i - \\mu)^T \\Sigma^{-1} (x_i - \\mu)\\Big) \\\\ &= -\\frac{n}{2}\\ln|\\Sigma| + -\\frac{1}{2}\\Big(\\sum_{i=1}^n(x_i - \\mu)^T \\Sigma^{-1} (x_i - \\mu)\\Big) + c, \\end{align*}\\] where \\(c\\) is a constant with respect to \\(\\mu\\) and \\(\\Sigma\\). To find the MLE we first need to find partial derivatives. Let us start with \\(\\mu\\). \\[\\begin{align*} \\frac{\\partial}{\\partial \\mu}l(\\mu, \\Sigma ; x) &= \\frac{\\partial}{\\partial \\mu} -\\frac{1}{2}\\Big(\\sum_{i=1}^n x_i^T \\Sigma^{-1} x_i - x_i^T \\Sigma^{-1} \\mu - \\mu^T \\Sigma^{-1} x_i + \\mu^T \\Sigma^{-1} \\mu \\Big) \\\\ &= -\\frac{1}{2}\\Big(\\sum_{i=1}^n - \\Sigma^{-1} x_i - \\Sigma^{-1} x_i + 2 \\Sigma^{-1} \\mu \\Big) \\\\ &= -\\Sigma^{-1}\\Big(\\sum_{i=1}^n - x_i + \\mu \\Big). 
\\end{align*}\\] Equating above with zero, we get \\[\\begin{align*} \\sum_{i=1}^n - x_i + \\mu &= 0 \\\\ \\hat{\\mu} = \\frac{1}{n} \\sum_{i=1}^n x_i, \\end{align*}\\] which is the dimension-wise empirical mean. Now for the covariance matrix \\[\\begin{align*} \\frac{\\partial}{\\partial \\Sigma^{-1}}l(\\mu, \\Sigma ; x) &= \\frac{\\partial}{\\partial \\Sigma^{-1}} -\\frac{n}{2}\\ln|\\Sigma| + -\\frac{1}{2}\\Big(\\sum_{i=1}^n(x_i - \\mu)^T \\Sigma^{-1} (x_i - \\mu)\\Big) \\\\ &= \\frac{\\partial}{\\partial \\Sigma^{-1}} -\\frac{n}{2}\\ln|\\Sigma| + -\\frac{1}{2}\\Big(\\sum_{i=1}^n \\text{tr}((x_i - \\mu)^T \\Sigma^{-1} (x_i - \\mu))\\Big) \\\\ &= \\frac{\\partial}{\\partial \\Sigma^{-1}} -\\frac{n}{2}\\ln|\\Sigma| + -\\frac{1}{2}\\Big(\\sum_{i=1}^n \\text{tr}((\\Sigma^{-1} (x_i - \\mu) (x_i - \\mu)^T )\\Big) \\\\ &= \\frac{n}{2}\\Sigma + -\\frac{1}{2}\\Big(\\sum_{i=1}^n (x_i - \\mu) (x_i - \\mu)^T \\Big). \\end{align*}\\] Equating above with zero, we get \\[\\begin{align*} \\hat{\\Sigma} = \\frac{1}{n}\\sum_{i=1}^n (x_i - \\mu) (x_i - \\mu)^T. \\end{align*}\\] set.seed(1) n <- 40 mu <- c(1, -2) Sigma <- matrix(data = c(2, -1.6, -1.6, 1.8), ncol = 2) X <- mvrnorm(n = n, mu = mu, Sigma = Sigma) colnames(X) <- c("X1", "X2") X <- as.data.frame(X) # plot.new() tru_ellip <- ellipse(mu, Sigma, draw = FALSE) colnames(tru_ellip) <- c("X1", "X2") tru_ellip <- as.data.frame(tru_ellip) mu_est <- apply(X, 2, mean) tmp <- as.matrix(sweep(X, 2, mu_est)) Sigma_est <- (1 / n) * t(tmp) %*% tmp est_ellip <- ellipse(mu_est, Sigma_est, draw = FALSE) colnames(est_ellip) <- c("X1", "X2") est_ellip <- as.data.frame(est_ellip) ggplot(data = X, aes(x = X1, y = X2)) + geom_point() + geom_path(data = tru_ellip, aes(x = X1, y = X2, color = "truth")) + geom_path(data = est_ellip, aes(x = X1, y = X2, color = "estimated")) + labs(color = "type") Exercise 15.3 (Logistic regression) Logistic regression is a popular discriminative model when our target variable is binary (categorical with 2 values). One of the ways of looking at logistic regression is that it is linear regression but instead of using the linear term as the mean of a normal RV, we use it as the mean of a Bernoulli RV. Of course, the mean of a Bernoulli is bounded on \\([0,1]\\), so, to avoid non-sensical values, we squeeze the linear between 0 and 1 with the inverse logit function inv_logit\\((z) = 1 / (1 + e^{-z})\\). This leads to the following model: \\(y_i | \\beta, x_i \\sim \\text{Bernoulli}(\\text{inv_logit}(\\beta x_i))\\). Explicitly write the likelihood function of beta. Implement the likelihood function in R. Use black-box box-constraint optimization (for example, optim() with L-BFGS) to find the maximum likelihood estimate for beta for \\(x\\) and \\(y\\) defined below. Plot the estimated probability as a function of the independent variable. Compare with the truth. Let \\(y2\\) be a response defined below. Will logistic regression work well on this dataset? Why not? How can we still use the model, without changing it? inv_log <- function (z) { return (1 / (1 + exp(-z))) } set.seed(1) x <- rnorm(100) y <- rbinom(100, size = 1, prob = inv_log(1.2 * x)) y2 <- rbinom(100, size = 1, prob = inv_log(1.2 * x + 1.4 * x^2)) Solution. \\[\\begin{align*} l(\\beta; x, y) &= p(y | x, \\beta) \\\\ &= \\ln(\\prod_{i=1}^n \\text{inv_logit}(\\beta x_i)^{y_i} (1 - \\text{inv_logit}(\\beta x_i))^{1 - y_i}) \\\\ &= \\sum_{i=1}^n y_i \\ln(\\text{inv_logit}(\\beta x_i)) + (1 - y_i) \\ln(1 - \\text{inv_logit}(\\beta x_i)). 
\\end{align*}\\] set.seed(1) inv_log <- function (z) { return (1 / (1 + exp(-z))) } x <- rnorm(100) y <- x y <- rbinom(100, size = 1, prob = inv_log(1.2 * x)) l_logistic <- function (beta, X, y) { logl <- -sum(y * log(inv_log(as.vector(beta %*% X))) + (1 - y) * log((1 - inv_log(as.vector(beta %*% X))))) return(logl) } my_optim <- optim(par = 0.5, fn = l_logistic, method = "L-BFGS-B", lower = 0, upper = 10, X = x, y = y) my_optim$par ## [1] 1.166558 truth_p <- data.frame(x = x, prob = inv_log(1.2 * x), type = "truth") est_p <- data.frame(x = x, prob = inv_log(my_optim$par * x), type = "estimated") plot_df <- rbind(truth_p, est_p) ggplot(data = plot_df, aes(x = x, y = prob, color = type)) + geom_point(alpha = 0.3) y2 <- rbinom(2000, size = 1, prob = inv_log(1.2 * x + 1.4 * x^2)) X2 <- cbind(x, x^2) my_optim2 <- optim(par = c(0, 0), fn = l_logistic, method = "L-BFGS-B", lower = c(0, 0), upper = c(2, 2), X = t(X2), y = y2) my_optim2$par ## [1] 1.153656 1.257649 tmp <- sweep(data.frame(x = x, x2 = x^2), 2, my_optim2$par, FUN = "*") tmp <- tmp[ ,1] + tmp[ ,2] truth_p <- data.frame(x = x, prob = inv_log(1.2 * x + 1.4 * x^2), type = "truth") est_p <- data.frame(x = x, prob = inv_log(tmp), type = "estimated") plot_df <- rbind(truth_p, est_p) ggplot(data = plot_df, aes(x = x, y = prob, color = type)) + geom_point(alpha = 0.3) Exercise 15.4 (Linear regression) For the data generated below, do the following: Compute the least squares (MLE) estimate of coefficients beta using the matrix exact solution. Compute the MLE by minimizing the sum of squared residuals using black-box optimization (optim()). Compute the MLE by using the output built-in linear regression (lm() ). Compare (a-c and the true coefficients). Compute 95% CI on the beta coefficients using the output of built-in linear regression. Compute 95% CI on the beta coefficients by using (a or b) and the bootstrap with percentile method for CI. Compare with d. 
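For part (a), a reminder of where the matrix closed form used in the solution comes from: minimizing the residual sum of squares \\((y - X\\beta)^T(y - X\\beta)\\) by setting its gradient to zero gives the normal equations \\(-2X^Ty + 2X^TX\\beta = 0\\), hence \\(\\hat{\\beta} = (X^TX)^{-1}X^Ty\\), which is exactly what solve(t(X) %*% X) %*% t(X) %*% y computes below.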
set.seed(1) n <- 100 x1 <- rnorm(n) x2 <- rnorm(n) x3 <- rnorm(n) X <- cbind(x1, x2, x3) beta <- c(0.2, 0.6, -1.2) y <- as.vector(t(beta %*% t(X))) + rnorm(n, sd = 0.2) set.seed(1) n <- 100 x1 <- rnorm(n) x2 <- rnorm(n) x3 <- rnorm(n) X <- cbind(x1, x2, x3) beta <- c(0.2, 0.6, -1.2) y <- as.vector(t(beta %*% t(X))) + rnorm(n, sd = 0.2) LS_fun <- function (beta, X, y) { return(sum((y - beta %*% t(X))^2)) } my_optim <- optim(par = c(0, 0, 0), fn = LS_fun, lower = -5, upper = 5, X = X, y = y, method = "L-BFGS-B") my_optim$par ## [1] 0.1898162 0.5885946 -1.1788264 df <- data.frame(y = y, x1 = x1, x2 = x2, x3 = x3) my_lm <- lm(y ~ x1 + x2 + x3 - 1, data = df) my_lm ## ## Call: ## lm(formula = y ~ x1 + x2 + x3 - 1, data = df) ## ## Coefficients: ## x1 x2 x3 ## 0.1898 0.5886 -1.1788 # matrix solution beta_hat <- solve(t(X) %*% X) %*% t(X) %*% y beta_hat ## [,1] ## x1 0.1898162 ## x2 0.5885946 ## x3 -1.1788264 out <- summary(my_lm) out$coefficients[ ,2] ## x1 x2 x3 ## 0.02209328 0.02087542 0.01934506 # bootstrap CI nboot <- 1000 beta_boot <- matrix(data = NA, ncol = length(beta), nrow = nboot) for (i in 1:nboot) { inds <- sample(1:n, n, replace = T) new_df <- df[inds, ] X_tmp <- as.matrix(new_df[ ,-1]) y_tmp <- new_df[ ,1] # print(nrow(new_df)) tmp_beta <- solve(t(X_tmp) %*% X_tmp) %*% t(X_tmp) %*% y_tmp beta_boot[i, ] <- tmp_beta } apply(beta_boot, 2, mean) ## [1] 0.1893281 0.5887068 -1.1800738 apply(beta_boot, 2, quantile, probs = c(0.025, 0.975)) ## [,1] [,2] [,3] ## 2.5% 0.1389441 0.5436911 -1.221560 ## 97.5% 0.2386295 0.6363102 -1.140416 out$coefficients[ ,2] ## x1 x2 x3 ## 0.02209328 0.02087542 0.01934506 Exercise 15.5 (Principal component analysis) Load the olympic data set from package ade4. The data show decathlon results for 33 men in 1988 Olympic Games. This data set serves as a great example of finding the latent structure in the data, as there are certain characteristics of the athletes that make them excel at different events. For example an explosive athlete will do particulary well in sprints and long jumps. Perform PCA (prcomp) on the data set and interpret the first 2 latent dimensions. Hint: Standardize the data first to get meaningful results. Use MLE to estimate the covariance of the standardized multivariate distribution. Decompose the estimated covariance matrix with the eigendecomposition. Compare the eigenvectors to the output of PCA. 
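Before the solution, a minimal self-contained sketch (on simulated toy data, not the olympic set) of why parts b and c line up with prcomp: the eigenvectors of the MLE covariance of standardized, centered data coincide with prcomp's rotation up to sign, since the \\(1/n\\) versus \\(1/(n-1)\\) scaling changes only the eigenvalues.
set.seed(1)
Z <- scale(matrix(rnorm(200), ncol = 4))  # standardized toy data, 50 x 4
S_mle <- crossprod(Z) / nrow(Z)           # MLE covariance: (1/n) Z^T Z
ev  <- eigen(S_mle)$vectors               # eigenvectors, ordered by eigenvalue
rot <- prcomp(Z)$rotation                 # PCA loadings
max(abs(abs(ev) - abs(rot)))              # ~0: same directions, possibly with flipped signs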
data(olympic) X <- olympic$tab X_scaled <- scale(X) my_pca <- prcomp(X_scaled) summary(my_pca) ## Importance of components: ## PC1 PC2 PC3 PC4 PC5 PC6 PC7 ## Standard deviation 1.8488 1.6144 0.97123 0.9370 0.74607 0.70088 0.65620 ## Proportion of Variance 0.3418 0.2606 0.09433 0.0878 0.05566 0.04912 0.04306 ## Cumulative Proportion 0.3418 0.6025 0.69679 0.7846 0.84026 0.88938 0.93244 ## PC8 PC9 PC10 ## Standard deviation 0.55389 0.51667 0.31915 ## Proportion of Variance 0.03068 0.02669 0.01019 ## Cumulative Proportion 0.96312 0.98981 1.00000 autoplot(my_pca, data = X, loadings = TRUE, loadings.colour = 'blue', loadings.label = TRUE, loadings.label.size = 3) Sigma_est <- (1 / nrow(X_scaled)) * t(X_scaled) %*% X_scaled Sigma_dec <- eigen(Sigma_est) Sigma_dec$vectors ## [,1] [,2] [,3] [,4] [,5] [,6] ## [1,] 0.4158823 0.1488081 -0.26747198 -0.08833244 -0.442314456 0.03071237 ## [2,] -0.3940515 -0.1520815 -0.16894945 -0.24424963 0.368913901 -0.09378242 ## [3,] -0.2691057 0.4835374 0.09853273 -0.10776276 -0.009754680 0.23002054 ## [4,] -0.2122818 0.0278985 -0.85498656 0.38794393 -0.001876311 0.07454380 ## [5,] 0.3558474 0.3521598 -0.18949642 0.08057457 0.146965351 -0.32692886 ## [6,] 0.4334816 0.0695682 -0.12616012 -0.38229029 -0.088802794 0.21049130 ## [7,] -0.1757923 0.5033347 0.04609969 0.02558404 0.019358607 0.61491241 ## [8,] -0.3840821 0.1495820 0.13687235 0.14396548 -0.716743474 -0.34776037 ## [9,] -0.1799436 0.3719570 -0.19232803 -0.60046566 0.095582043 -0.43744387 ## [10,] 0.1701426 0.4209653 0.22255233 0.48564231 0.339772188 -0.30032419 ## [,7] [,8] [,9] [,10] ## [1,] 0.2543985 0.663712826 -0.10839531 0.10948045 ## [2,] 0.7505343 0.141264141 0.04613910 0.05580431 ## [3,] -0.1106637 0.072505560 0.42247611 0.65073655 ## [4,] -0.1351242 -0.155435871 -0.10206505 0.11941181 ## [5,] 0.1413388 -0.146839303 0.65076229 -0.33681395 ## [6,] 0.2725296 -0.639003579 -0.20723854 0.25971800 ## [7,] 0.1439726 0.009400445 -0.16724055 -0.53450315 ## [8,] 0.2732665 -0.276873049 -0.01766443 -0.06589572 ## [9,] -0.3419099 0.058519366 -0.30619617 -0.13093187 ## [10,] 0.1868704 0.007310045 -0.45688227 0.24311846 my_pca$rotation ## PC1 PC2 PC3 PC4 PC5 PC6 ## 100 -0.4158823 0.1488081 0.26747198 -0.08833244 -0.442314456 0.03071237 ## long 0.3940515 -0.1520815 0.16894945 -0.24424963 0.368913901 -0.09378242 ## poid 0.2691057 0.4835374 -0.09853273 -0.10776276 -0.009754680 0.23002054 ## haut 0.2122818 0.0278985 0.85498656 0.38794393 -0.001876311 0.07454380 ## 400 -0.3558474 0.3521598 0.18949642 0.08057457 0.146965351 -0.32692886 ## 110 -0.4334816 0.0695682 0.12616012 -0.38229029 -0.088802794 0.21049130 ## disq 0.1757923 0.5033347 -0.04609969 0.02558404 0.019358607 0.61491241 ## perc 0.3840821 0.1495820 -0.13687235 0.14396548 -0.716743474 -0.34776037 ## jave 0.1799436 0.3719570 0.19232803 -0.60046566 0.095582043 -0.43744387 ## 1500 -0.1701426 0.4209653 -0.22255233 0.48564231 0.339772188 -0.30032419 ## PC7 PC8 PC9 PC10 ## 100 0.2543985 -0.663712826 0.10839531 -0.10948045 ## long 0.7505343 -0.141264141 -0.04613910 -0.05580431 ## poid -0.1106637 -0.072505560 -0.42247611 -0.65073655 ## haut -0.1351242 0.155435871 0.10206505 -0.11941181 ## 400 0.1413388 0.146839303 -0.65076229 0.33681395 ## 110 0.2725296 0.639003579 0.20723854 -0.25971800 ## disq 0.1439726 -0.009400445 0.16724055 0.53450315 ## perc 0.2732665 0.276873049 0.01766443 0.06589572 ## jave -0.3419099 -0.058519366 0.30619617 0.13093187 ## 1500 0.1868704 -0.007310045 0.45688227 -0.24311846 15.2 Fisher information Exercise 15.6 Let us assume a Poisson likelihood. 
Derive the MLE estimate of the mean. Derive the Fisher information. For the data below compute the MLE and construct confidence intervals. Use bootstrap to construct the CI for the mean. Compare with c) and discuss. x <- c(2, 5, 3, 1, 2, 1, 0, 3, 0, 2) Solution. The log likelihood of the Poisson is \\[\\begin{align*} l(\\lambda; x) = \\sum_{i=1}^n x_i \\ln \\lambda - n \\lambda - \\sum_{i=1}^n \\ln x_i! \\end{align*}\\] Taking the derivative and equating with 0 we get \\[\\begin{align*} \\frac{1}{\\hat{\\lambda}}\\sum_{i=1}^n x_i - n &= 0 \\\\ \\hat{\\lambda} &= \\frac{1}{n} \\sum_{i=1}^n x_i. \\end{align*}\\] Since \\(\\lambda\\) is the mean parameter, this was expected. For the Fischer information, we first need the second derivative, which is \\[\\begin{align*} - \\lambda^{-2} \\sum_{i=1}^n x_i. \\\\ \\end{align*}\\] Now taking the expectation of the negative of the above, we get \\[\\begin{align*} E[\\lambda^{-2} \\sum_{i=1}^n x_i] &= \\lambda^{-2} E[\\sum_{i=1}^n x_i] \\\\ &= \\lambda^{-2} n \\lambda \\\\ &= \\frac{n}{\\lambda}. \\end{align*}\\] set.seed(1) x <- c(2, 5, 3, 1, 2, 1, 0, 3, 0, 2) lambda_hat <- mean(x) finfo <- length(x) / lambda_hat mle_CI <- c(lambda_hat - 1.96 * sqrt(1 / finfo), lambda_hat + 1.96 * sqrt(1 / finfo)) boot_lambda <- c() nboot <- 1000 for (i in 1:nboot) { tmp_x <- sample(x, length(x), replace = T) boot_lambda[i] <- mean(tmp_x) } boot_CI <- c(quantile(boot_lambda, 0.025), quantile(boot_lambda, 0.975)) mle_CI ## [1] 1.045656 2.754344 boot_CI ## 2.5% 97.5% ## 1.0 2.7 Exercise 15.7 Find the Fisher information matrix for the Gamma distribution. Generate 20 samples from a Gamma distribution and plot a confidence ellipse of the inverse of Fisher information matrix around the ML estimates of the parameters. Also plot the theoretical values. Repeat the sampling several times. What do you observe? Discuss what a non-diagonal Fisher matrix implies. Hint: The digamma function is defined as \\(\\psi(x) = \\frac{\\frac{d}{dx} \\Gamma(x)}{\\Gamma(x)}\\). Additionally, you do not need to evaluate \\(\\frac{d}{dx} \\psi(x)\\). To calculate its value in R, use package numDeriv. Solution. The log likelihood of the Gamma is \\[\\begin{equation*} l(\\alpha, \\beta; x) = n \\alpha \\ln \\beta - n \\ln \\Gamma(\\alpha) + (\\alpha - 1) \\sum_{i=1}^n \\ln x_i - \\beta \\sum_{i=1}^n x_i. \\end{equation*}\\] Let us calculate the derivatives. \\[\\begin{align*} \\frac{\\partial}{\\partial \\alpha} l(\\alpha, \\beta; x) &= n \\ln \\beta - n \\psi(\\alpha) + \\sum_{i=1}^n \\ln x_i, \\\\ \\frac{\\partial}{\\partial \\beta} l(\\alpha, \\beta; x) &= \\frac{n \\alpha}{\\beta} - \\sum_{i=1}^n x_i, \\\\ \\frac{\\partial^2}{\\partial \\alpha \\beta} l(\\alpha, \\beta; x) &= \\frac{n}{\\beta}, \\\\ \\frac{\\partial^2}{\\partial \\alpha^2} l(\\alpha, \\beta; x) &= - n \\frac{\\partial}{\\partial \\alpha} \\psi(\\alpha), \\\\ \\frac{\\partial^2}{\\partial \\beta^2} l(\\alpha, \\beta; x) &= - \\frac{n \\alpha}{\\beta^2}. \\end{align*}\\] The Fisher information matrix is then \\[\\begin{align*} I(\\alpha, \\beta) = - E[ \\begin{bmatrix} - n \\psi'(\\alpha) & \\frac{n}{\\beta} \\\\ \\frac{n}{\\beta} & - \\frac{n \\alpha}{\\beta^2} \\end{bmatrix} ] = \\begin{bmatrix} n \\psi'(\\alpha) & - \\frac{n}{\\beta} \\\\ - \\frac{n}{\\beta} & \\frac{n \\alpha}{\\beta^2} \\end{bmatrix} \\end{align*}\\] A non-diagonal Fisher matrix implies that the parameter estimates are linearly dependent. 
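As a complement to the solution below, which evaluates the information at the MLE, here is a small sketch that evaluates the theoretical Fisher information at the true parameters \\(\\alpha = 5\\) and \\(\\beta = 2\\) used in the simulation and reads off the asymptotic correlation between the two estimates from its inverse; trigamma() computes \\(\\frac{d}{d\\alpha}\\psi(\\alpha)\\), so numDeriv is not needed for this check.
n <- 20; alpha <- 5; beta <- 2           # true parameters from the simulation below
I_theor <- matrix(c(n * trigamma(alpha), -n / beta,
                    -n / beta,           n * alpha / beta^2),
                  nrow = 2, byrow = TRUE)
V <- solve(I_theor)                      # asymptotic covariance of (alpha_hat, beta_hat)
cov2cor(V)                               # off-diagonal about 0.95: the two estimates are strongly correlated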
set.seed(1) n <- 20 pars_theor <- c(5, 2) x <- rgamma(n, 5, 2) # MLE for alpha and beta log_lik <- function (pars, x) { n <- length(x) return (- (n * pars[1] * log(pars[2]) - n * log(gamma(pars[1])) + (pars[1] - 1) * sum(log(x)) - pars[2] * sum(x))) } my_optim <- optim(par = c(1,1), fn = log_lik, method = "L-BFGS-B", lower = c(0.001, 0.001), upper = c(8, 8), x = x) pars_mle <- my_optim$par fish_mat <- matrix(data = NA, nrow = 2, ncol = 2) fish_mat[1,2] <- - n / pars_mle[2] fish_mat[2,1] <- - n / pars_mle[2] fish_mat[2,2] <- (n * pars_mle[1]) / (pars_mle[2]^2) fish_mat[1,1] <- n * grad(digamma, pars_mle[1]) fish_mat_inv <- solve(fish_mat) est_ellip <- ellipse(pars_mle, fish_mat_inv, draw = FALSE) colnames(est_ellip) <- c("X1", "X2") est_ellip <- as.data.frame(est_ellip) ggplot() + geom_point(data = data.frame(x = pars_mle[1], y = pars_mle[2]), aes(x = x, y = y)) + geom_path(data = est_ellip, aes(x = X1, y = X2)) + geom_point(aes(x = pars_theor[1], y = pars_theor[2]), color = "red") + geom_text(aes(x = pars_theor[1], y = pars_theor[2], label = "Theoretical parameters"), color = "red", nudge_y = -0.2) 15.3 The German tank problem Exercise 15.8 (The German tank problem) During WWII the allied intelligence were faced with an important problem of estimating the total production of certain German tanks, such as the Panther. What turned out to be a successful approach was to estimate the maximum from the serial numbers of the small sample of captured or destroyed tanks (describe the statistical model used). What assumptions were made by using the above model? Do you think they are reasonable assumptions in practice? Show that the plug-in estimate for the maximum (i.e. the maximum of the sample) is a biased estimator. Derive the maximum likelihood estimate of the maximum. Check that the following estimator is not biased: \\(\\hat{n} = \\frac{k + 1}{k}m - 1\\). Solution. The data are the serial numbers of the tanks. The parameter is \\(n\\), the total production of the tank. The distribution of the serial numbers is a discrete uniform distribution over all serial numbers. One of the assumptions is that we have i.i.d samples, however in practice this might not be true, as some tanks produced later could be sent to the field later, therefore already in theory we would not be able to recover some values from the population. To find the expected value we first need to find the distribution of \\(m\\). Let us start with the CDF. \\[\\begin{align*} F_m(x) = P(Y_1 < x,...,Y_k < x). \\end{align*}\\] If \\(x < k\\) then \\(F_m(x) = 0\\) and if \\(x \\geq 1\\) then \\(F_m(x) = 1\\). What about between those values. So the probability that the maximum value is less than or equal to \\(m\\) is just the number of possible draws from \\(Y\\) that are all smaller than \\(m\\), divided by all possible draws. This is \\(\\frac{{x}\\choose{k}}{{n}\\choose{k}}\\). The PDF on the suitable bounds is then \\[\\begin{align*} P(m = x) = F_m(x) - F_m(x - 1) = \\frac{\\binom{x}{k} - \\binom{x - 1}{k}}{\\binom{n}{k}} = \\frac{\\binom{x - 1}{k - 1}}{\\binom{n}{k}}. \\end{align*}\\] Now we can calculate the expected value of \\(m\\) using some combinatorial identities. \\[\\begin{align*} E[m] &= \\sum_{i = k}^n i \\frac{{i - 1}\\choose{k - 1}}{{n}\\choose{k}} \\\\ &= \\sum_{i = k}^n i \\frac{\\frac{(i - 1)!}{(k - 1)!(i - k)!}}{{n}\\choose{k}} \\\\ &= \\frac{k}{\\binom{n}{k}}\\sum_{i = k}^n \\binom{i}{k} \\\\ &= \\frac{k}{\\binom{n}{k}} \\binom{n + 1}{k + 1} \\\\ &= \\frac{k(n + 1)}{k + 1}. 
\\end{align*}\\] The bias of this estimator is then \\[\\begin{align*} E[m] - n = \\frac{k(n + 1)}{k + 1} - n = \\frac{k - n}{k + 1}. \\end{align*}\\] The probability that we observed our sample \\(Y = {Y_1, Y_2,...,,Y_k}\\) given \\(n\\) is \\(\\frac{1}{{n}\\choose{k}}\\). We need to find such \\(n^*\\) that this function is maximized. Additionally, we have a constraint that \\(n^* \\geq m = \\max{(Y)}\\). Let us plot this function for \\(m = 10\\) and \\(k = 4\\). library(ggplot2) my_fun <- function (x, m, k) { tmp <- 1 / (choose(x, k)) tmp[x < m] <- 0 return (tmp) } x <- 1:20 y <- my_fun(x, 10, 4) df <- data.frame(x = x, y = y) ggplot(data = df, aes(x = x, y = y)) + geom_line() ::: {.solution} (continued) We observe that the maximum of this function lies at the maximum value of the sample. Therefore \\(n^* = m\\) and ML estimate equals the plug-in estimate. \\[\\begin{align*} E[\\hat{n}] &= \\frac{k + 1}{k} E[m] - 1 \\\\ &= \\frac{k + 1}{k} \\frac{k(n + 1)}{k + 1} - 1 \\\\ &= n. \\end{align*}\\] ::: "],["nhst.html", "Chapter 16 Null hypothesis significance testing", " Chapter 16 Null hypothesis significance testing This chapter deals with null hypothesis significance testing. The students are expected to acquire the following knowledge: Binomial test. t-test. Chi-squared test. .fold-btn { float: right; margin: 5px 5px 0 0; } .fold { border: 1px solid black; min-height: 40px; } Exercise 16.1 (Binomial test) We assume \\(y_i \\in \\{0,1\\}\\), \\(i = 1,...,n\\) and \\(y_i | \\theta = 0.5 \\sim i.i.d.\\) Bernoulli\\((\\theta)\\). The test statistic is \\(X = \\sum_{i=1}^n\\) and the rejection region R is defined as the region where the probability of obtaining such or more extreme \\(X\\) given \\(\\theta = 0.5\\) is less than 0.05. Derive and plot the power function of the test for \\(n=100\\). What is the significance level of this test if \\(H0: \\theta = 0.5\\)? At which values of X will we reject the null hypothesis? # a # First we need the rejection region, so we need to find X_min and X_max n <- 100 qbinom(0.025, n, 0.5) ## [1] 40 qbinom(0.975, n, 0.5) ## [1] 60 pbinom(40, n, 0.5) ## [1] 0.02844397 pbinom(60, n, 0.5) ## [1] 0.9823999 X_min <- 39 X_max <- 60 thetas <- seq(0, 1, by = 0.01) beta_t <- 1 - pbinom(X_max, size = n, prob = thetas) + pbinom(X_min, size = n, prob = thetas) plot(beta_t) # b # The significance level is beta_t[51] ## [1] 0.0352002 # We will reject the null hypothesis at X values below X_min and above X_max. Exercise 16.2 (Long-run guarantees of the t-test) Generate a sample of size \\(n = 10\\) from the standard normal. Use the two-sided t-test with \\(H0: \\mu = 0\\) and record the p-value. Can you reject H0 at 0.05 significance level? (before simulating) If we repeated (b) many times, what would be the relative frequency of false positives/Type I errors (rejecting the null that is true)? What would be the relative frequency of false negatives /Type II errors (retaining the null when the null is false)? (now simulate b and check if the simulation results match your answer in b) Similar to (a-c) but now we generate data from N(-0.5, 1). Similar to (a-c) but now we generate data from N(\\(\\mu\\), 1) where we every time pick a different \\(\\mu < 0\\) and use a one-sided test \\(H0: \\mu <= 0\\). 
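Two reference points, added before the simulation: the Monte Carlo standard error of an estimated rejection rate over nit independent repetitions is \\(\\sqrt{p(1-p)/nit}\\), and for part (d) the theoretical power of the two-sided one-sample t-test against \\(\\mu = -0.5\\) with \\(n = 10\\) is available in closed form, giving a check of the simulated Type II error rate.
nit <- 1000
sqrt(0.05 * (1 - 0.05) / nit)  # about 0.007: MC standard error of a true 5% rejection rate
# theoretical power for part d (two-sided one-sample t-test, mu = -0.5, n = 10):
power.t.test(n = 10, delta = 0.5, sd = 1, sig.level = 0.05, type = "one.sample")$power
# about 0.29, so the expected Type II error rate is about 0.71, consistent with the simulation below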
set.seed(2) # a x <- rnorm(10) my_test <- t.test(x, alternative = "two.sided", mu = 0) my_test ## ## One Sample t-test ## ## data: x ## t = 0.6779, df = 9, p-value = 0.5149 ## alternative hypothesis: true mean is not equal to 0 ## 95 percent confidence interval: ## -0.4934661 0.9157694 ## sample estimates: ## mean of x ## 0.2111516 # we can not reject the null hypothesis # b # The expected value of false positives would be 0.05. The expected value of # true negatives would be 0, as there are no negatives (the null hypothesis is # always the truth). nit <- 1000 typeIerr <- vector(mode = "logical", length = nit) typeIIerr <- vector(mode = "logical", length = nit) for (i in 1:nit) { x <- rnorm(10) my_test <- t.test(x, alternative = "two.sided", mu = 0) if (my_test$p.value < 0.05) { typeIerr[i] <- T } else { typeIerr[i] <- F } } mean(typeIerr) ## [1] 0.052 sd(typeIerr) / sqrt(nit) ## [1] 0.007024624 # d # We can not estimate the percentage of true negatives, but it will probably be # higher than 0.05. There will be no false positives as the null hypothesis is # always false. typeIIerr <- vector(mode = "logical", length = nit) for (i in 1:nit) { x <- rnorm(10, -0.5) my_test <- t.test(x, alternative = "two.sided", mu = 0) if (my_test$p.value < 0.05) { typeIIerr[i] <- F } else { typeIIerr[i] <- T } } mean(typeIIerr) ## [1] 0.719 sd(typeIIerr) / sqrt(nit) ## [1] 0.01422115 # e # The expected value of false positives would be lower than 0.05. The expected # value of true negatives would be 0, as there are no negatives (the null # hypothesis is always the truth). typeIerr <- vector(mode = "logical", length = nit) for (i in 1:nit) { u <- runif(1, -1, 0) x <- rnorm(10, u) my_test <- t.test(x, alternative = "greater", mu = 0) if (my_test$p.value < 0.05) { typeIerr[i] <- T } else { typeIerr[i] <- F } } mean(typeIerr) ## [1] 0.012 sd(typeIerr) / sqrt(nit) ## [1] 0.003444977 Exercise 16.3 (T-test, confidence intervals, and bootstrap) Sample \\(n=20\\) from a standard normal distribution and calculate the p-value using t-test, confidence intervals based on normal distribution, and bootstrap. Repeat this several times and check how many times we rejected the null hypothesis (made a type I error). Hint: For the confidence intervals you can use function CI from the Rmisc package. set.seed(1) library(Rmisc) nit <- 1000 n_boot <- 100 t_logic <- rep(F, nit) boot_logic <- rep(F, nit) norm_logic <- rep(F, nit) for (i in 1:nit) { x <- rnorm(20) my_test <- t.test(x) my_CI <- CI(x) if (my_test$p.value <= 0.05) t_logic[i] <- T boot_tmp <- vector(mode = "numeric", length = n_boot) for (j in 1:n_boot) { tmp_samp <- sample(x, size = 20, replace = T) boot_tmp[j] <- mean(tmp_samp) } if ((quantile(boot_tmp, 0.025) >= 0) | (quantile(boot_tmp, 0.975) <= 0)) { boot_logic[i] <- T } if ((my_CI[3] >= 0) | (my_CI[1] <= 0)) { norm_logic[i] <- T } } mean(t_logic) ## [1] 0.053 sd(t_logic) / sqrt(nit) ## [1] 0.007088106 mean(boot_logic) ## [1] 0.093 sd(boot_logic) / sqrt(nit) ## [1] 0.009188876 mean(norm_logic) ## [1] 0.053 sd(norm_logic) / sqrt(nit) ## [1] 0.007088106 Exercise 16.4 (Chi-squared test) Show that the \\(\\chi^2 = \\sum_{i=1}^k \\frac{(O_i - E_i)^2}{E_i}\\) test statistic is approximately \\(\\chi^2\\) distributed when we have two categories. Let us look at the US voting data here. Compare the number of voters who voted for Trump or Hillary depending on their income (less or more than 100.000 dollars per year). Manually calculate the chi-squared statistic, compare to the chisq.test in R, and discuss the results. 
Visualize the test. Solution. Let \\(X_i\\) be binary variables, \\(i = 1,...,n\\). We can then express the test statistic as \\[\\begin{align} \\chi^2 = &\\frac{(O_i - np)^2}{np} + \\frac{(n - O_i - n(1 - p))^2}{n(1 - p)} \\\\ &= \\frac{(O_i - np)^2}{np(1 - p)} \\\\ &= (\\frac{O_i - np}{\\sqrt{np(1 - p)}})^2. \\end{align}\\] When \\(n\\) is large, this distrbution is approximately normal with \\(\\mu = np\\) and \\(\\sigma^2 = np(1 - p)\\) (binomial converges in distribution to standard normal). By definition, the chi-squared distribution with \\(k\\) degrees of freedom is a sum of squares of \\(k\\) independent standard normal random variables. n <- 24588 less100 <- round(0.66 * n * c(0.49, 0.45, 0.06)) # some rounding, but it should not affect results more100 <- round(0.34 * n * c(0.47, 0.47, 0.06)) x <- rbind(less100, more100) colnames(x) <- c("Clinton", "Trump", "other/no answer") print(x) ## Clinton Trump other/no answer ## less100 7952 7303 974 ## more100 3929 3929 502 chisq.test(x) ## ## Pearson's Chi-squared test ## ## data: x ## X-squared = 9.3945, df = 2, p-value = 0.00912 x ## Clinton Trump other/no answer ## less100 7952 7303 974 ## more100 3929 3929 502 csum <- apply(x, 2, sum) rsum <- apply(x, 1, sum) chi2 <- (x[1,1] - csum[1] * rsum[1] / sum(x))^2 / (csum[1] * rsum[1] / sum(x)) + (x[1,2] - csum[2] * rsum[1] / sum(x))^2 / (csum[2] * rsum[1] / sum(x)) + (x[1,3] - csum[3] * rsum[1] / sum(x))^2 / (csum[3] * rsum[1] / sum(x)) + (x[2,1] - csum[1] * rsum[2] / sum(x))^2 / (csum[1] * rsum[2] / sum(x)) + (x[2,2] - csum[2] * rsum[2] / sum(x))^2 / (csum[2] * rsum[2] / sum(x)) + (x[2,3] - csum[3] * rsum[2] / sum(x))^2 / (csum[3] * rsum[2] / sum(x)) chi2 ## Clinton ## 9.394536 1 - pchisq(chi2, df = 2) ## Clinton ## 0.009120161 x <- seq(0, 15, by = 0.01) df <- data.frame(x = x) ggplot(data = df, aes(x = x)) + stat_function(fun = dchisq, args = list(df = 2)) + geom_segment(aes(x = chi2, y = 0, xend = chi2, yend = dchisq(chi2, df = 2))) + stat_function(fun = dchisq, args = list(df = 2), xlim = c(chi2, 15), geom = "area", fill = "red") "],["bi.html", "Chapter 17 Bayesian inference 17.1 Conjugate priors 17.2 Posterior sampling", " Chapter 17 Bayesian inference This chapter deals with Bayesian inference. The students are expected to acquire the following knowledge: How to set prior distribution. Compute posterior distribution. Compute posterior predictive distribution. Use sampling for inference. .fold-btn { float: right; margin: 5px 5px 0 0; } .fold { border: 1px solid black; min-height: 40px; } 17.1 Conjugate priors Exercise 17.1 (Poisson-gamma model) Let us assume a Poisson likelihood and a gamma prior on the Poisson mean parameter (this is a conjugate prior). Derive posterior Below we have some data, which represents number of goals in a football match. Choose sensible prior for this data (draw the gamma density if necessary), justify it. Compute the posterior. Compute an interval such that the probability that the true mean is in there is 95%. What is the probability that the true mean is greater than 2.5? Back to theory: Compute prior predictive and posterior predictive. Discuss why the posterior predictive is overdispersed and not Poisson? Draw a histogram of the prior predictive and posterior predictive for the data from (b). Discuss. Generate 10 and 100 random samples from a Poisson distribution and compare the posteriors with a flat prior, and a prior concentrated away from the truth. x <- c(3, 2, 1, 1, 5, 4, 0, 0, 4, 3) Solution. 
\\[\\begin{align*} p(\\lambda | X) &= \\frac{p(X | \\lambda) p(\\lambda)}{\\int_0^\\infty p(X | \\lambda) p(\\lambda) d\\lambda} \\\\ &\\propto p(X | \\lambda) p(\\lambda) \\\\ &= \\Big(\\prod_{i=1}^n \\frac{1}{x_i!} \\lambda^{x_i} e^{-\\lambda}\\Big) \\frac{\\beta^\\alpha}{\\Gamma(\\alpha)} \\lambda^{\\alpha - 1} e^{-\\beta \\lambda} \\\\ &\\propto \\lambda^{\\sum_{i=1}^n x_i + \\alpha - 1} e^{- \\lambda (n + \\beta)} \\\\ \\end{align*}\\] We recognize this as the shape of a gamma distribution, therefore \\[\\begin{align*} \\lambda | X \\sim \\text{gamma}(\\alpha + \\sum_{i=1}^n x_i, \\beta + n) \\end{align*}\\] For the prior predictive, we have \\[\\begin{align*} p(x^*) &= \\int_0^\\infty p(x^*, \\lambda) d\\lambda \\\\ &= \\int_0^\\infty p(x^* | \\lambda) p(\\lambda) d\\lambda \\\\ &= \\int_0^\\infty \\frac{1}{x^*!} \\lambda^{x^*} e^{-\\lambda} \\frac{\\beta^\\alpha}{\\Gamma(\\alpha)} \\lambda^{\\alpha - 1} e^{-\\beta \\lambda} d\\lambda \\\\ &= \\frac{\\beta^\\alpha}{\\Gamma(x^* + 1)\\Gamma(\\alpha)} \\int_0^\\infty \\lambda^{x^* + \\alpha - 1} e^{-\\lambda (1 + \\beta)} d\\lambda \\\\ &= \\frac{\\beta^\\alpha}{\\Gamma(x^* + 1)\\Gamma(\\alpha)} \\frac{\\Gamma(x^* + \\alpha)}{(1 + \\beta)^{x^* + \\alpha}} \\int_0^\\infty \\frac{(1 + \\beta)^{x^* + \\alpha}}{\\Gamma(x^* + \\alpha)} \\lambda^{x^* + \\alpha - 1} e^{-\\lambda (1 + \\beta)} d\\lambda \\\\ &= \\frac{\\beta^\\alpha}{\\Gamma(x^* + 1)\\Gamma(\\alpha)} \\frac{\\Gamma(x^* + \\alpha)}{(1 + \\beta)^{x^* + \\alpha}} \\\\ &= \\frac{\\Gamma(x^* + \\alpha)}{\\Gamma(x^* + 1)\\Gamma(\\alpha)} (\\frac{\\beta}{1 + \\beta})^\\alpha (\\frac{1}{1 + \\beta})^{x^*}, \\end{align*}\\] which we recognize as the negative binomial distribution with \\(r = \\alpha\\) and \\(p = \\frac{1}{\\beta + 1}\\). For the posterior predictive, the calculation is the same, only now the parameters are \\(r = \\alpha + \\sum_{i=1}^n x_i\\) and \\(p = \\frac{1}{\\beta + n + 1}\\). There are two sources of uncertainty in the predictive distribution. First is the uncertainty about the population. Second is the variability in sampling from the population. When \\(n\\) is large, the latter is going to be very small. But when \\(n\\) is small, the latter is going to be higher, resulting in an overdispersed predictive distribution. 
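The posterior predictive derivation above can be checked numerically. A minimal sketch, assuming the gamma(1, 1) prior that the solution below also uses and the data x from part (b): drawing \\(\\lambda\\) from the gamma posterior and then a new observation from a Poisson with that mean should reproduce the derived negative binomial; note that R's rnbinom parameterizes prob as \\(1 - p\\) relative to the derivation above.
set.seed(123)
x <- c(3, 2, 1, 1, 5, 4, 0, 0, 4, 3)
alpha <- 1; beta <- 1; n <- length(x)
lambda_samp <- rgamma(1e5, alpha + sum(x), beta + n)  # lambda drawn from the posterior
pred_sim    <- rpois(1e5, lambda_samp)                # x* | lambda ~ Poisson(lambda)
pred_nb     <- rnbinom(1e5, size = alpha + sum(x), prob = 1 - 1 / (beta + n + 1))
rbind(simulated = table(factor(pred_sim, levels = 0:8)) / 1e5,
      negbin    = table(factor(pred_nb,  levels = 0:8)) / 1e5)  # the two rows should agree closely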
x <- c(3, 2, 1, 1, 5, 4, 0, 0, 4, 3) # b # quick visual check of the prior ggplot(data = data.frame(x = seq(0, 5, by = 0.01)), aes(x = x)) + stat_function(fun = dgamma, args = list(shape = 1, rate = 1)) palpha <- 1 pbeta <- 1 alpha_post <- palpha + sum(x) beta_post <- pbeta + length(x) ggplot(data = data.frame(x = seq(0, 5, by = 0.01)), aes(x = x)) + stat_function(fun = dgamma, args = list(shape = alpha_post, rate = beta_post)) # probability of being higher than 2.5 1 - pgamma(2.5, alpha_post, beta_post) ## [1] 0.2267148 # interval qgamma(c(0.025, 0.975), alpha_post, beta_post) ## [1] 1.397932 3.137390 # d prior_pred <- rnbinom(1000, size = palpha, prob = 1 - 1 / (pbeta + 1)) post_pred <- rnbinom(1000, size = palpha + sum(x), prob = 1 - 1 / (pbeta + 10 + 1)) df <- data.frame(prior = prior_pred, posterior = post_pred) df <- gather(df) ggplot(df, aes(x = value, fill = key)) + geom_histogram(position = "dodge") # e set.seed(1) x1 <- rpois(10, 2.5) x2 <- rpois(100, 2.5) alpha_flat <- 1 beta_flat <- 0.1 alpha_conc <- 50 beta_conc <- 10 n <- 10000 df_flat <- data.frame(x1 = rgamma(n, alpha_flat + sum(x1), beta_flat + 10), x2 = rgamma(n, alpha_flat + sum(x2), beta_flat + 100), type = "flat") df_flat <- tidyr::gather(df_flat, key = "key", value = "value", - type) df_conc <- data.frame(x1 = rgamma(n, alpha_conc + sum(x1), beta_conc + 10), x2 = rgamma(n, alpha_conc + sum(x2), beta_conc + 100), type = "conc") df_conc <- tidyr::gather(df_conc, key = "key", value = "value", - type) df <- rbind(df_flat, df_conc) ggplot(data = df, aes(x = value, color = type)) + facet_wrap(~ key) + geom_density() 17.2 Posterior sampling Exercise 17.2 (Bayesian logistic regression) In Chapter 15 we implemented a MLE for logistic regression (see the code below). For this model, conjugate priors do not exist, which complicates the calculation of the posterior. However, we can use sampling from the numerator of the posterior, using rejection sampling. Set a sensible prior distribution on \\(\\beta\\) and use rejection sampling to find the posterior distribution. In a) you will get a distribution of parameter \\(\\beta\\). Plot the probabilities (as in exercise 15.3) for each sample of \\(\\beta\\) and compare to the truth. Hint: We can use rejection sampling even for functions which are not PDFs – they do not have to sum/integrate to 1. We just need to use a suitable envelope that we know how to sample from. For example, here we could use a uniform distribution and scale it suitably. set.seed(1) inv_log <- function (z) { return (1 / (1 + exp(-z))) } x <- rnorm(100) y <- x y <- rbinom(100, size = 1, prob = inv_log(1.2 * x)) l_logistic <- function (beta, X, y) { logl <- -sum(y * log(inv_log(as.vector(beta %*% X))) + (1 - y) * log((1 - inv_log(as.vector(beta %*% X))))) return(logl) } my_optim <- optim(par = 0.5, fn = l_logistic, method = "L-BFGS-B", lower = 0, upper = 10, X = x, y = y) my_optim$par # Let's say we believe that the mean of beta is 0.5. Since we are not very sure # about this, we will give it a relatively high variance. So a normal prior with # mean 0.5 and standard deviation 5. But there is no right solution to this, # this is basically us expressing our prior belief in the parameter values. 
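An optional sketch before the solution: instead of finding the envelope scaling constant by trial and error (the solution below uses 10^(-25.8)), one can bound the unnormalized posterior by its maximum over a grid and accept proposals from Unif(-5, 5) with probability target/M. This assumes the data x, y and the function inv_log defined in the chunk above and the N(0.5, 5) prior described there; the grid maximum with a small safety margin is a heuristic bound, not a guaranteed one.
target <- function (beta, X, y) {
  prod(inv_log(beta * X)^y * (1 - inv_log(beta * X))^(1 - y)) * dnorm(beta, 0.5, 5)
}
grid <- seq(-5, 5, by = 0.01)
M <- 1.05 * max(sapply(grid, target, X = x, y = y))  # heuristic bound with a 5% margin
set.seed(2)
samps_auto <- c()
for (i in 1:5000) {
  prop <- runif(1, -5, 5)
  if (runif(1) < target(prop, x, y) / M) samps_auto <- c(samps_auto, prop)
}
mean(samps_auto)  # should be close to the posterior mean obtained in the solution below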
set.seed(1) inv_log <- function (z) { return (1 / (1 + exp(-z))) } x <- rnorm(100) y <- x y <- rbinom(100, size = 1, prob = inv_log(1.2 * x)) l_logistic <- function (beta, X, y) { logl <- -sum(y * log(inv_log(as.vector(beta %*% X))) + (1 - y) * log((1 - inv_log(as.vector(beta %*% X))))) if (is.nan(logl)) logl <- Inf return(logl) } my_optim <- optim(par = 0.5, fn = l_logistic, method = "L-BFGS-B", lower = 0, upper = 10, X = x, y = y) my_optim$par ## [1] 1.166558 f_logistic <- function (beta, X, y) { logl <- prod(inv_log(as.vector(beta %*% X))^y * (1 - inv_log(as.vector(beta %*% X)))^(1 - y)) return(logl) } a <- seq(0, 3, by = 0.01) my_l <- c() for (i in a) { my_l <- c(my_l, f_logistic(i, x, y) * dnorm(i, 0.5, 5)) } plot(my_l) envlp <- 10^(-25.8) * dunif(a, -5, 5) # found by trial and error tmp <- data.frame(envel = envlp, l = my_l, t = a) tmp <- gather(tmp, key = "key", value = "value", - t) ggplot(tmp, aes(x = t, y = value, color = key)) + geom_line() # envelope OK set.seed(1) nsamps <- 1000 samps <- c() for (i in 1:nsamps) { tmp <- runif(1, -5, 5) u <- runif(1, 0, 1) if (u < (f_logistic(tmp, x, y) * dnorm(tmp, 0.5, 5)) / (10^(-25.8) * dunif(tmp, -5, 5))) { samps <- c(samps, tmp) } } plot(density(samps)) mean(samps) ## [1] 1.211578 median(samps) ## [1] 1.204279 truth_p <- data.frame(x = x, prob = inv_log(1.2 * x), type = "truth") preds <- inv_log(x %*% t(samps)) preds <- gather(cbind(as.data.frame(preds), x = x), key = "key", "value" = value, - x) ggplot(preds, aes(x = x, y = value)) + geom_line(aes(group = key), color = "gray", alpha = 0.7) + geom_point(data = truth_p, aes(y = prob), color = "red", alpha = 0.7) + theme_bw() "],["distributions-intutition.html", "Chapter 18 Distributions intuition 18.1 Discrete distributions 18.2 Continuous distributions", " Chapter 18 Distributions intuition This chapter is intended to help you familiarize yourself with the different probability distributions you will encounter in this course. You will need to use Appendix B extensively as a reference for the basic properties of distributions, so keep it close! 18.1 Discrete distributions Exercise 18.1 (Bernoulli intuition 1) The simplest distribution you will encounter is the Bernoulli distribution. It is a discrete probability distribution used to represent the outcome of a yes/no question. It has one parameter \\(0 \\leq p \\leq 1\\), which is the probability of success. The probability of failure is \\(q = (1-p)\\). A classic way to think about a Bernoulli trial (a yes/no experiment) is a coin flip. An ideal coin is fair, meaning the probability of heads (1) and tails (0) is the same, so \\(p=0.5\\) as shown below in figure a. Alternatively we may want to represent a process that doesn’t have equal probabilities of outcomes, like “Will a throw of a fair die result in a 6?”. In this case \\(p=\\frac{1}{6}\\), shown in figure b. Using your knowledge of the Bernoulli distribution, use the throw of a fair die to think of events such that: \\(p = 0.5\\) \\(p = \\frac{5}{6}\\) \\(q = \\frac{2}{3}\\) Solution. An event that is equally likely to happen or not happen, i.e. \\(p = 0.5\\), would be throwing an even number. More formally we can name this event \\(A\\) and write: \\(A = \\{2,4,6\\}\\), its probability being \\(P(A) = 0.5\\). An example of an event with \\(p = \\frac{5}{6}\\) would be throwing a number greater than 1, defined as \\(B = \\{2,3,4,5,6\\}\\).
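The probabilities in the first two parts can also be checked by simulating die throws; this is an illustrative check assuming a fair die, not part of the original solution.

```r
set.seed(1)
throws <- sample(1:6, 1e5, replace = TRUE)  # simulated fair-die throws

mean(throws %% 2 == 0)  # event A (even number), should be close to 0.5
mean(throws > 1)        # event B (greater than 1), should be close to 5/6
```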
We need an event that fails \\(\\frac{2}{3}\\) of the time. Alternatively we can reverse the problem and find an event that succeeds \\(\\frac{1}{3}\\) of the time, since \\(q = 1 - p \\implies p = 1 - q = \\frac{1}{3}\\). The event that our outcome is divisible by 3, \\(C = \\{3, 6\\}\\), satisfies this condition. Exercise 18.2 (Binomial intuition 1) The binomial distribution is a generalization of the Bernoulli distribution. Instead of considering a single Bernoulli trial, we now consider a sum of a sequence of \\(n\\) trials, which are independent and have the same parameter \\(p\\). So the binomial distribution has two parameters: \\(n\\), the number of trials, and \\(p\\), the probability of success for each trial. If we return to our coin flip representation, we now flip a coin several times. The binomial distribution will give us the probabilities of all possible outcomes. Below we show the distribution for a series of 10 coin flips with a fair coin (left) and a biased coin (right). The numbers on the x axis represent the number of times the coin landed heads. Using your knowledge of the binomial distribution: Take the pmf of the binomial distribution, plug in \\(n=1\\) and check that it is in fact equivalent to a Bernoulli distribution. In our examples we show the graph of a binomial distribution over 10 trials with \\(p=0.8\\). If we take a look at the graph, it appears as though the probabilities of getting 0, 1, 2 or 3 heads in 10 flips are zero. Are they actually zero? Check by plugging the values into the pmf. Solution. The pmf of a binomial distribution is \\(\\binom{n}{k} p^k (1 - p)^{n - k}\\). Now we insert \\(n=1\\) to get: \\[\\binom{1}{k} p^k (1 - p)^{1 - k}\\] This is not quite the Bernoulli pmf yet; however, note that the support of the binomial distribution is defined as \\(k \\in \\{0,1,\\dots,n\\}\\), so in our case \\(k \\in \\{0,1\\}\\). Then \\[\\binom{1}{0} = \\binom{1}{1} = 1\\] and we get \\(p^k (1 - p)^{1 - k}\\), the Bernoulli distribution. We already know \\(p=0.8\\) and \\(n=10\\), so: \\[\\binom{10}{0} 0.8^0 (1 - 0.8)^{10 - 0} = 1.024 \\cdot 10^{-7}\\] \\[\\binom{10}{1} 0.8^1 (1 - 0.8)^{10 - 1} = 4.096 \\cdot 10^{-6}\\] \\[\\binom{10}{2} 0.8^2 (1 - 0.8)^{10 - 2} = 7.3728 \\cdot 10^{-5}\\] \\[\\binom{10}{3} 0.8^3 (1 - 0.8)^{10 - 3} = 7.86432 \\cdot 10^{-4}\\] So the probabilities are not zero, just very small. Exercise 18.3 (Poisson intuition 1) Below are shown 3 different graphs of the Poisson distribution. Your task is to replicate them on your own in R by varying the \\(\\lambda\\) parameter. Hint: You can use dpois() to get the probabilities. library(ggplot2) library(gridExtra) x = 0:15 # Create Poisson data data1 <- data.frame(x = x, y = dpois(x, lambda = 0.1)) data2 <- data.frame(x = x, y = dpois(x, lambda = 1)) data3 <- data.frame(x = x, y = dpois(x, lambda = 7.5)) # Create individual ggplot objects plot1 <- ggplot(data1, aes(x, y)) + geom_col() + xlab("x") + ylab("Probability") + ylim(0,1) plot2 <- ggplot(data2, aes(x, y)) + geom_col() + xlab("x") + ylab(NULL) + ylim(0,1) plot3 <- ggplot(data3, aes(x, y)) + geom_col() + xlab("x") + ylab(NULL) + ylim(0,1) # Combine the plots grid.arrange(plot1, plot2, plot3, ncol = 3) Exercise 18.4 (Poisson intuition 2) The Poisson distribution is a discrete probability distribution that models the probability of a given number of events occurring within processes where events occur at a constant mean rate and independently of each other - a Poisson process.
It has a single parameter \\(\\lambda\\), which represents the constant mean rate. A classic example of a scenario that can be modeled using the Poisson distribution is the number of calls received at a call center in a day (or in fact any other time interval). Suppose you work in a call center and have some understanding of probability distributions. You overhear your supervisor mentioning that the call center receives an average of 2.5 calls per day. Using your knowledge of the Poisson distribution, calculate: The probability you will get no calls today. The probability you will get more than 5 calls today. Solution. First recall the Poisson pmf: \\[p(k) = \\frac{\\lambda^k e^{-\\lambda}}{k!}\\] As stated previously, our parameter is \\(\\lambda = 2.5\\). To get the probability of no calls we simply plug in \\(k = 0\\), so: \\[p(0) = \\frac{2.5^0 e^{-2.5}}{0!} = e^{-2.5} \\approx 0.082\\] The support of the Poisson distribution is the non-negative integers. So if we wanted to calculate the probability of getting more than 5 calls directly, we would need to add up the probabilities of getting 6 calls, 7 calls and so on up to infinity. Let us instead remember that the sum of all probabilities is 1. We can reverse the problem and ask “What is the probability we get 5 calls or fewer?”, then subtract the probability of this opposite outcome (the complement) from 1 to get the probability of our original question. \\[P(k > 5) = 1 - P(k \\leq 5)\\] \\[P(k \\leq 5) = \\sum_{i=0}^{5} p(i) = p(0) + p(1) + p(2) + p(3) + p(4) + p(5) =\\] \\[= \\frac{2.5^0 e^{-2.5}}{0!} + \\frac{2.5^1 e^{-2.5}}{1!} + \\dots =\\] \\[=0.957979\\] So the probability of getting more than 5 calls is \\(1 - 0.957979 = 0.042021\\). Exercise 18.5 (Geometric intuition 1) The geometric distribution is a discrete distribution that models the number of failures before the first success in a sequence of independent Bernoulli trials. It has a single parameter \\(p\\), representing the probability of success, and its support is all non-negative integers \\(\\{0,1,2,\\dots\\}\\). NOTE: There is an alternative way to think about this distribution, one that models the number of trials needed to get the first success. The difference is subtle yet significant and you are likely to encounter both forms. The key to telling them apart is to check their support: since the number of trials has to be at least \\(1\\), in that case the support is \\(\\{1,2,\\dots\\}\\). In the graph below we show the pmf of a geometric distribution with \\(p=0.5\\). This can be thought of as the number of successive failures (tails) in the flip of a fair coin. You can see that there’s a 50% chance you will have zero failures, i.e. you will flip heads on your very first attempt. But there is some smaller chance that you will flip a sequence of tails in a row, with longer sequences having ever lower probability. Create an equivalent graph that represents the probability of rolling a 6 with a fair 6-sided die. Use the formula for the mean of the geometric distribution and determine the average number of failures before you roll a 6. Look up the alternative form of the geometric distribution and again use the formula for the mean to determine the average number of trials up to and including rolling a 6. Solution. The parameter \\(p\\) (the probability of success) for rolling a 6 is \\(p=\\frac{1}{6}\\).
library(ggplot2) # Parameters p <- 1/6 x_vals <- 0:9 # Starting from 0 probs <- dgeom(x_vals, p) # Data data <- data.frame(x_vals, probs) # Plot ggplot(data, aes(x=x_vals, y=probs)) + geom_segment(aes(xend=x_vals, yend=0), color="black", size=1) + geom_point(color="red", size=2) + labs(x = "Number of failures", y = "Probability") + theme_minimal() + scale_x_continuous(breaks = x_vals) # This line ensures integer x-axis labels Solution. b) The expected value of a random variable (the mean) is denoted as \\(E[X]\\). \\[E[X] = \\frac{1-p}{p}= \\frac{1- \\frac{1}{6}}{\\frac{1}{6}} = \\frac{5}{6}\\cdot 6 = 5\\] On average we will fail 5 times before we roll our first 6. The alternative form of this distribution (with support on all positive integers) has a slightly different formula for the mean. This change reflects the difference in the way we posed our question: \\[E[X] = \\frac{1}{p} = \\frac{1}{\\frac{1}{6}} = 6\\] On average we will have to throw the die 6 times to roll our first 6. 18.2 Continuous distributions Exercise 18.6 (Uniform intuition 1) The need for randomness is common in practice. A practical solution is to use so-called random number generators (RNGs). The simplest RNG one would think of is choosing a set of numbers and having the generator return a number at random, where the probability of returning any number from this set is the same. If this set is an interval of real numbers, then we’ve basically described the continuous uniform distribution. It has two parameters, \\(a\\) and \\(b\\), which define the beginning and end of its support respectively. Let’s think about the mean intuitively. Think of the area under the graph as a geometric shape. The expected value or mean of a distribution is the x-axis value of its center of mass. Given parameters \\(a\\) and \\(b\\), what is your intuitive guess of the mean for the uniform distribution? A special case of the uniform distribution is the standard uniform distribution with \\(a=0\\) and \\(b=1\\). Write the pdf \\(f(x)\\) of this particular distribution. Solution. The center of mass is the center of the rectangle from \\(a\\) to \\(b\\) and from 0 to \\(\\frac{1}{b-a}\\). Its value on the x-axis is the midpoint between \\(a\\) and \\(b\\), so \\(\\frac{a+b}{2}\\). Inserting the parameter values we get: \\[f(x) = \\begin{cases} 1 & \\text{if } 0 \\leq x \\leq 1 \\\\ 0 & \\text{otherwise} \\end{cases} \\] Notice how the pdf is just a constant \\(1\\) across all values of \\(x \\in [0,1]\\). Here it is important to distinguish between probability and probability density. The density may be 1, but the probability is not: while discrete distributions never exceed 1 on the y-axis, continuous densities can go as high as you like. Exercise 18.7 (Normal intuition 1) The normal distribution, also known as the Gaussian distribution, is a continuous distribution that encompasses the entire real number line. It has two parameters: the mean, denoted by \\(\\mu\\), and the variance, represented by \\(\\sigma^2\\). Its shape resembles the iconic bell curve. The position of its peak is determined by the parameter \\(\\mu\\), while the variance determines the spread or width of the curve. A smaller variance results in a sharper, narrower peak, while a larger variance leads to a broader, more spread-out curve. Below, we graph the distribution of IQ scores for two different populations. We aim to identify individuals with an IQ at or above 140 for an experiment.
We can identify them reliably; however, we only have time to examine one of the two groups. Which group should we investigate to have the best chance of finding such individuals? NOTE: The graph below displays the parameter \\(\\sigma\\), which is the square root of the variance, more commonly referred to as the standard deviation. Keep this in mind when solving the problems. Insert the values of either population into the pdf of a normal distribution and determine which one has a higher density at \\(x=140\\). Generate the graph yourself and zoom into the relevant area to graphically verify your answer. To determine probability density, we can use the pdf. However, if we wish to know the proportion of the population that falls within certain parameters, we would need to integrate the pdf. Fortunately, the integrals of common distributions are well-established. This integral gives us the cumulative distribution function \\(F(x)\\) (CDF). BONUS: Look up the CDF of the normal distribution and input the appropriate values to determine the percentage of each population with an IQ of 140 or higher. Solution. Group 1: \\(\\mu = 100, \\sigma=10 \\rightarrow \\sigma^2 = 100\\) \\[\\frac{1}{\\sqrt{2 \\pi \\sigma^2}} e^{-\\frac{(x - \\mu)^2}{2 \\sigma^2}} = \\frac{1}{\\sqrt{2 \\pi \\cdot 100}} e^{-\\frac{(140 - 100)^2}{2 \\cdot 100}} \\approx 1.34 \\cdot 10^{-5}\\] Group 2: \\(\\mu = 105, \\sigma=8 \\rightarrow \\sigma^2 = 64\\) \\[\\frac{1}{\\sqrt{2 \\pi \\sigma^2}} e^{-\\frac{(x - \\mu)^2}{2 \\sigma^2}} = \\frac{1}{\\sqrt{2 \\pi \\cdot 64}} e^{-\\frac{(140 - 105)^2}{2 \\cdot 64}} \\approx 3.48 \\cdot 10^{-6}\\] So despite the fact that group 1 has a lower average IQ, we are more likely to find individuals with an IQ of 140 in this group. library(ggplot2) library(tidyr) # Create data x <- seq(135, 145, by = 0.01) # Adjusting the x range to account for the larger standard deviations df <- data.frame(x = x) # Define the IQ distributions df$IQ_mu100_sd10 <- dnorm(df$x, mean = 100, sd = 10) df$IQ_mu105_sd8 <- dnorm(df$x, mean = 105, sd = 8) # Convert from wide to long format for ggplot2 df_long <- gather(df, distribution, density, -x) # Ensure the levels of the 'distribution' factor match our desired order df_long$distribution <- factor(df_long$distribution, levels = c("IQ_mu100_sd10", "IQ_mu105_sd8")) # Plot ggplot(df_long, aes(x = x, y = density, color = distribution)) + geom_line() + labs(x = "IQ Score", y = "Density") + scale_color_manual( name = "IQ Distribution", values = c(IQ_mu100_sd10 = "red", IQ_mu105_sd8 = "blue"), labels = c("Group 1 (µ=100, σ=10)", "Group 2 (µ=105, σ=8)") ) + theme_minimal() Solution. c. The CDF of the normal distribution is \\(\\Phi(x) = \\frac{1}{2} \\left[ 1 + \\text{erf} \\left( \\frac{x - \\mu}{\\sigma \\sqrt{2}} \\right) \\right]\\). The CDF is defined as the integral of the density up to \\(x\\). So to get the percentage of individuals with an IQ of 140 or higher we subtract its value from 1. Group 1: \\[1 - \\Phi(140) = \\frac{1}{2} \\left[ 1 - \\text{erf} \\left( \\frac{140 - 100}{10 \\sqrt{2}} \\right) \\right] \\approx 3.17 \\cdot 10^{-5} \\] Group 2: \\[1 - \\Phi(140) = \\frac{1}{2} \\left[ 1 - \\text{erf} \\left( \\frac{140 - 105}{8 \\sqrt{2}} \\right) \\right] \\approx 6.07 \\cdot 10^{-6} \\] So roughly 0.003% and 0.0006% of individuals in groups 1 and 2 respectively have an IQ at or above 140. Exercise 18.8 (Beta intuition 1) The beta distribution is a continuous distribution defined on the unit interval \\([0,1]\\).
It has two strictly positive parameters, \\(\\alpha\\) and \\(\\beta\\), which determine its shape. Its support makes it especially suitable for modeling distributions of percentages and proportions. Below you’ve been provided with some code that you can copy into RStudio. Once you run the code, an interactive Shiny app will appear and you will be able to manipulate the graph of the beta distribution. Play around with the parameters to get: A symmetric bell curve A bowl-shaped curve The standard uniform distribution is actually a special case of the beta distribution. Find the exact parameters \\(\\alpha\\) and \\(\\beta\\). Once you do, prove the equality by inserting the values into our pdf. Hint: The beta function is evaluated as \\(\\text{B}(a,b) = \\frac{\\Gamma(a)\\Gamma(b)}{\\Gamma(a+b)}\\), and the gamma function for positive integers \\(n\\) is evaluated as \\(\\Gamma(n)= (n-1)!\\) # Install and load necessary packages install.packages(c("shiny", "ggplot2")) library(shiny) library(ggplot2) # The Shiny App ui <- fluidPage( titlePanel("Beta Distribution Viewer"), sidebarLayout( sidebarPanel( sliderInput("alpha", "Alpha:", min = 0.1, max = 10, value = 2, step = 0.1), sliderInput("beta", "Beta:", min = 0.1, max = 10, value = 2, step = 0.1) ), mainPanel( plotOutput("betaPlot") ) ) ) server <- function(input, output) { output$betaPlot <- renderPlot({ x <- seq(0, 1, by = 0.01) y <- dbeta(x, shape1 = input$alpha, shape2 = input$beta) ggplot(data.frame(x = x, y = y), aes(x = x, y = y)) + geom_line() + labs(x = "Value", y = "Density") + theme_minimal() }) } shinyApp(ui = ui, server = server) Solution. Possible solution \\(\\alpha = \\beta= 5\\) Possible solution \\(\\alpha = \\beta= 0.5\\) The correct parameters are \\(\\alpha = 1, \\beta=1\\). To prove the equality we insert them into the beta pdf: \\[\\frac{x^{\\alpha - 1} (1 - x)^{\\beta - 1}}{\\text{B}(\\alpha, \\beta)} = \\frac{x^{1 - 1} (1 - x)^{1 - 1}}{\\text{B}(1, 1)} = \\frac{1}{\\frac{\\Gamma(1)\\Gamma(1)}{\\Gamma(1+1)}}= \\frac{1}{\\frac{(1-1)!(1-1)!}{(2-1)!}} = 1\\] Exercise 18.9 (Exponential intuition 1) The exponential distribution represents the distribution of time between events in a Poisson process. It is the continuous analogue of the geometric distribution. It has a single parameter \\(\\lambda\\), which is strictly positive and represents the constant rate of the corresponding Poisson process. The support is all positive reals, since the time between events is non-negative but not bounded above. Let’s revisit the call center from our Poisson problem. We get 2.5 calls per day on average; this is our rate parameter \\(\\lambda\\). A work day is 8 hours. What is the mean time between phone calls? The cdf \\(F(x)\\) tells us what percentage of calls occur within \\(x\\) amount of time of each other. You want to take an hour-long lunch break but are worried about missing calls. Calculate the probability of missing at least one call if you’re gone for an hour. Hint: The cdf is \\(F(x) = \\int_{-\\infty}^{x} f(t) dt\\) Solution.
Taking \\(\\lambda = \\frac{2.5 \\text{ calls}}{8 \\text{ hours}} = \\frac{1 \\text{ call}}{3.2 \\text{ hours}}\\), we get \\[E[X] = \\frac{1}{\\lambda} = \\frac{3.2 \\text{ hours}}{\\text{call}}\\] First we derive the CDF. We can integrate from 0 instead of \\(-\\infty\\), since there is no support in the negatives: \\[\\begin{align} F(x) &= \\int_{0}^{x} \\lambda e^{-\\lambda t} dt \\\\ &= \\lambda \\int_{0}^{x} e^{-\\lambda t} dt \\\\ &= \\lambda (\\frac{1}{-\\lambda}e^{-\\lambda t} |_{0}^{x}) \\\\ &= \\lambda(\\frac{1}{\\lambda} - \\frac{1}{\\lambda} e^{-\\lambda x}) \\\\ &= 1 - e^{-\\lambda x}. \\end{align}\\] Then we just evaluate it for a time of 1 hour: \\[F(1 \\text{ hour}) = 1 - e^{-\\frac{1 \\text{ call}}{3.2 \\text{ hours}} \\cdot 1 \\text{ hour}}= 1 - e^{-\\frac{1 \\text{ call}}{3.2 \\text{ hours}}} \\approx 0.268\\] So we have about a 27% chance of missing at least one call if we’re gone for an hour. Exercise 18.10 (Gamma intuition 1) The gamma distribution is a continuous distribution defined by two parameters, \\(\\alpha\\) and \\(\\beta\\), both greater than 0. These parameters afford the distribution a broad range of shapes, leading to it being commonly referred to as a family of distributions. Given its support over the positive real numbers, it is well suited for modeling a diverse range of positive-valued phenomena. The exponential distribution is actually just a particular form of the gamma distribution. What are the values of \\(\\alpha\\) and \\(\\beta\\)? Copy the code from our beta distribution Shiny app and modify it to display the gamma distribution. Then get it to show the exponential. Solution. Let’s start by taking a look at the pdfs of the two distributions side by side: \\[\\frac{\\beta^\\alpha}{\\Gamma(\\alpha)} x^{\\alpha - 1}e^{-\\beta x} = \\lambda e^{-\\lambda x}\\] The \\(x^{\\alpha - 1}\\) term is not found anywhere in the pdf of the exponential so we need to eliminate it by setting \\(\\alpha = 1\\). This also makes the fraction evaluate to \\(\\frac{\\beta^1}{\\Gamma(1)} = \\beta\\), which leaves us with \\[\\beta \\cdot e^{-\\beta x}\\] Now we can see that \\(\\beta = \\lambda\\) and \\(\\alpha = 1\\). # Install and load necessary packages install.packages(c("shiny", "ggplot2")) library(shiny) library(ggplot2) # The Shiny App ui <- fluidPage( titlePanel("Gamma Distribution Viewer"), sidebarLayout( sidebarPanel( sliderInput("shape", "Shape (α):", min = 0.1, max = 10, value = 2, step = 0.1), sliderInput("rate", "Rate (β):", min = 0.1, max = 10, value = 2, step = 0.1) ), mainPanel( plotOutput("gammaPlot") ) ) ) server <- function(input, output) { output$gammaPlot <- renderPlot({ x <- seq(0, 25, by = 0.1) y <- dgamma(x, shape = input$shape, rate = input$rate) ggplot(data.frame(x = x, y = y), aes(x = x, y = y)) + geom_line() + labs(x = "Value", y = "Density") + theme_minimal() }) } shinyApp(ui = ui, server = server) "],["A1.html", "A R programming language A.1 Basic characteristics A.2 Why R? A.3 Setting up A.4 R basics A.5 Functions A.6 Other tips A.7 Further reading and references", " A R programming language A.1 Basic characteristics R is free software for statistical computing and graphics. It is widely used by statisticians, scientists, and other professionals for software development and data analysis. It is an interpreted language and therefore programs do not need compilation. A.2 Why R? R is one of the two main languages used for statistics and machine learning (the other being Python). Pros Libraries.
Comprehensive collection of statistical and machine learning packages. Easy to code. Open source. Anyone can access R and develop new methods. Additionally, it is relatively simple to get the source code of established methods. Large community. The use of R has been rising for some time, in industry and academia. Therefore a large collection of blogs and tutorials exists, along with people offering help on pages like StackExchange and CrossValidated. Integration with other languages and LaTeX. New methods. Many researchers develop R packages based on their research; therefore, new methods are available soon after development. Cons Slow. Programs run slower than in other programming languages; however, this can be somewhat mitigated by effective coding or integration with other languages. Memory intensive. This can become a problem with large data sets, as they need to be stored in memory, along with all the information the models produce. Some packages are not as good as they should be, or have poor documentation. Object oriented programming in R can be very confusing and complex. A.3 Setting up https://www.r-project.org/. A.3.1 RStudio RStudio is the most widely used IDE for R. It is free; you can download it from https://rstudio.com/. While console R is sufficient for the requirements of this course, we recommend that students install RStudio for its better user interface. A.3.2 Libraries for data science Listed below are some of the more useful libraries (packages) for data science. Students are also encouraged to find other useful packages. dplyr Efficient data manipulation. Part of the wider package collection called tidyverse. ggplot2 Plotting based on the grammar of graphics. stats Several statistical models. rstan Bayesian inference using Hamiltonian Monte Carlo. Very flexible model building. MCMCpack Bayesian inference. rmarkdown, knitr, and bookdown Dynamic reports (such as this one). devtools Package development.
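A minimal way to get these packages onto a machine is sketched below; the package names are the ones listed above, installation only needs to happen once, and loading is done per session.

```r
# Install once (uncomment on first use)
# install.packages(c("dplyr", "ggplot2", "rstan", "MCMCpack",
#                    "rmarkdown", "knitr", "bookdown", "devtools"))

# Load the packages needed in the current session
library(dplyr)
library(ggplot2)
```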
A.4 R basics A.4.1 Variables and types Important information and tips: no type declaration define variables with <- instead of = (although both work, there is a slight difference, additionally most of the packages use the arrow) for strings use \"\" for comments use # change types with as.type() functions no special type for single character like C++ for example n <- 20 x <- 2.7 m <- n # m gets value 20 my_flag <- TRUE student_name <- "Luka" typeof(n) ## [1] "double" typeof(student_name) ## [1] "character" typeof(my_flag) ## [1] "logical" typeof(as.integer(n)) ## [1] "integer" typeof(as.character(n)) ## [1] "character" A.4.2 Basic operations n + x ## [1] 22.7 n - x ## [1] 17.3 diff <- n - x # variable diff gets the difference between n and x diff ## [1] 17.3 n * x ## [1] 54 n / x ## [1] 7.407407 x^2 ## [1] 7.29 sqrt(x) ## [1] 1.643168 n > 2 * n ## [1] FALSE n == n ## [1] TRUE n == 2 * n ## [1] FALSE n != n ## [1] FALSE paste(student_name, "is", n, "years old") ## [1] "Luka is 20 years old" A.4.3 Vectors use c() to combine elements into vectors can only contain one type of variable if different types are provided, all are transformed to the most basic type in the vector access elements by indexes or logical vectors of the same length a scalar value is regarded as a vector of length 1 1:4 # creates a vector of integers from 1 to 4 ## [1] 1 2 3 4 student_ages <- c(20, 23, 21) student_names <- c("Luke", "Jen", "Mike") passed <- c(TRUE, TRUE, FALSE) length(student_ages) ## [1] 3 # access by index student_ages[2] ## [1] 23 student_ages[1:2] ## [1] 20 23 student_ages[2] <- 24 # change values # access by logical vectors student_ages[passed == TRUE] # same as student_ages[passed] ## [1] 20 24 student_ages[student_names %in% c("Luke", "Mike")] ## [1] 20 21 student_names[student_ages > 20] ## [1] "Jen" "Mike" A.4.3.1 Operations with vectors most operations are element-wise if we operate on vectors of different lengths, the shorter vector periodically repeats its elements until it reaches the length of the longer one a <- c(1, 3, 5) b <- c(2, 2, 1) d <- c(6, 7) a + b ## [1] 3 5 6 a * b ## [1] 2 6 5 a + d ## Warning in a + d: longer object length is not a multiple of shorter object ## length ## [1] 7 10 11 a + 2 * b ## [1] 5 7 7 a > b ## [1] FALSE TRUE TRUE b == a ## [1] FALSE FALSE FALSE a %*% b # vector multiplication, not element-wise ## [,1] ## [1,] 13 A.4.4 Factors vectors of finite predetermined classes suitable for categorical variables ordinal (ordered) or nominal (unordered) car_brand <- factor(c("Audi", "BMW", "Mercedes", "BMW"), ordered = FALSE) car_brand ## [1] Audi BMW Mercedes BMW ## Levels: Audi BMW Mercedes freq <- factor(x = NA, levels = c("never","rarely","sometimes","often","always"), ordered = TRUE) freq[1:3] <- c("rarely", "sometimes", "rarely") freq ## [1] rarely sometimes rarely ## Levels: never < rarely < sometimes < often < always freq[4] <- "quite_often" # non-existing level, returns NA ## Warning in `[<-.factor`(`*tmp*`, 4, value = "quite_often"): invalid factor ## level, NA generated freq ## [1] rarely sometimes rarely <NA> ## Levels: never < rarely < sometimes < often < always A.4.5 Matrices two-dimensional generalizations of vectors my_matrix <- matrix(c(1, 2, 1, 5, 4, 2), nrow = 2, byrow = TRUE) my_matrix ## [,1] [,2] [,3] ## [1,] 1 2 1 ## [2,] 5 4 2 my_square_matrix <- matrix(c(1, 3, 2, 3), nrow = 2) my_square_matrix ## [,1] [,2] ## [1,] 1 2 ## [2,] 3 3 my_matrix[1,2] # first row, second column ## [1] 2 my_matrix[2, ] # second row ## [1] 5 4 2 my_matrix[ ,3] # third 
column ## [1] 1 2 A.4.5.1 Matrix functions and operations most operation element-wise mind the dimensions when using matrix multiplication %*% nrow(my_matrix) # number of matrix rows ## [1] 2 ncol(my_matrix) # number of matrix columns ## [1] 3 dim(my_matrix) # matrix dimension ## [1] 2 3 t(my_matrix) # transpose ## [,1] [,2] ## [1,] 1 5 ## [2,] 2 4 ## [3,] 1 2 diag(my_matrix) # the diagonal of the matrix as vector ## [1] 1 4 diag(1, nrow = 3) # creates a diagonal matrix ## [,1] [,2] [,3] ## [1,] 1 0 0 ## [2,] 0 1 0 ## [3,] 0 0 1 det(my_square_matrix) # matrix determinant ## [1] -3 my_matrix + 2 * my_matrix ## [,1] [,2] [,3] ## [1,] 3 6 3 ## [2,] 15 12 6 my_matrix * my_matrix # element-wise multiplication ## [,1] [,2] [,3] ## [1,] 1 4 1 ## [2,] 25 16 4 my_matrix %*% t(my_matrix) # matrix multiplication ## [,1] [,2] ## [1,] 6 15 ## [2,] 15 45 my_vec <- as.vector(my_matrix) # transform to vector my_vec ## [1] 1 5 2 4 1 2 A.4.6 Arrays multi-dimensional generalizations of matrices my_array <- array(c(1, 2, 3, 4, 5, 6, 7, 8), dim = c(2, 2, 2)) my_array[1, 1, 1] ## [1] 1 my_array[2, 2, 1] ## [1] 4 my_array[1, , ] ## [,1] [,2] ## [1,] 1 5 ## [2,] 3 7 dim(my_array) ## [1] 2 2 2 A.4.7 Data frames basic data structure for analysis differ from matrices as columns can be of different types student_data <- data.frame("Name" = student_names, "Age" = student_ages, "Pass" = passed) student_data ## Name Age Pass ## 1 Luke 20 TRUE ## 2 Jen 24 TRUE ## 3 Mike 21 FALSE colnames(student_data) <- c("name", "age", "pass") # change column names student_data[1, ] ## name age pass ## 1 Luke 20 TRUE student_data[ ,colnames(student_data) %in% c("name", "pass")] ## name pass ## 1 Luke TRUE ## 2 Jen TRUE ## 3 Mike FALSE student_data$pass # access column by name ## [1] TRUE TRUE FALSE student_data[student_data$pass == TRUE, ] ## name age pass ## 1 Luke 20 TRUE ## 2 Jen 24 TRUE A.4.8 Lists useful for storing different data structures access elements with double square brackets elements can be named first_list <- list(student_ages, my_matrix, student_data) second_list <- list(student_ages, my_matrix, student_data, first_list) first_list[[1]] ## [1] 20 24 21 second_list[[4]] ## [[1]] ## [1] 20 24 21 ## ## [[2]] ## [,1] [,2] [,3] ## [1,] 1 2 1 ## [2,] 5 4 2 ## ## [[3]] ## name age pass ## 1 Luke 20 TRUE ## 2 Jen 24 TRUE ## 3 Mike 21 FALSE second_list[[4]][[1]] # first element of the fourth element of second_list ## [1] 20 24 21 length(second_list) ## [1] 4 second_list[[length(second_list) + 1]] <- "add_me" # append an element names(first_list) <- c("Age", "Matrix", "Data") first_list$Age ## [1] 20 24 21 A.4.9 Loops mostly for loop for loop can iterate over an arbitrary vector # iterate over consecutive natural numbers my_sum <- 0 for (i in 1:10) { my_sum <- my_sum + i } my_sum ## [1] 55 # iterate over an arbirary vector my_sum <- 0 some_numbers <- c(2, 3.5, 6, 100) for (i in some_numbers) { my_sum <- my_sum + i } my_sum ## [1] 111.5 A.5 Functions for help use ?function_name A.5.1 Writing functions We can write our own functions with function(). In the brackets, we define the parameters the function gets, and in curly brackets we define what the function does. We use return() to return values. sum_first_n_elements <- function (n) { my_sum <- 0 for (i in 1:n) { my_sum <- my_sum + i } return (my_sum) } sum_first_n_elements(10) ## [1] 55 A.6 Other tips Use set.seed(arbitrary_number) at the beginning of a script to set the seed and ensure replication. 
To dynamically set the working directory in R Studio to the parent folder of a R script use setwd(dirname(rstudioapi::getSourceEditorContext()$path)). To avoid slow R loops use the apply family of functions. See ?apply and ?lapply. To make your data manipulation (and therefore your life) a whole lot easier, use the dplyr package. Use getAnywhere(function_name) to get the source code of any function. Use browser for debugging. See ?browser. A.7 Further reading and references Getting started with R Studio: https://www.youtube.com/watch?v=lVKMsaWju8w Official R manuals: https://cran.r-project.org/manuals.html Cheatsheets: https://www.rstudio.com/resources/cheatsheets/ Workshop on R, dplyr, ggplot2, and R Markdown: https://github.com/bstatcomp/Rworkshop "],["distributions.html", "B Probability distributions", " B Probability distributions Name parameters support pdf/pmf mean variance Bernoulli \\(p \\in [0,1]\\) \\(k \\in \\{0,1\\}\\) \\(p^k (1 - p)^{1 - k}\\) 1.12 \\(p\\) 7.1 \\(p(1-p)\\) 7.1 binomial \\(n \\in \\mathbb{N}\\), \\(p \\in [0,1]\\) \\(k \\in \\{0,1,\\dots,n\\}\\) \\(\\binom{n}{k} p^k (1 - p)^{n - k}\\) 4.4 \\(np\\) 7.2 \\(np(1-p)\\) 7.2 Poisson \\(\\lambda > 0\\) \\(k \\in \\mathbb{N}_0\\) \\(\\frac{\\lambda^k e^{-\\lambda}}{k!}\\) 4.6 \\(\\lambda\\) 7.3 \\(\\lambda\\) 7.3 geometric \\(p \\in (0,1]\\) \\(k \\in \\mathbb{N}_0\\) \\(p(1-p)^k\\) 4.5 \\(\\frac{1 - p}{p}\\) 7.4 \\(\\frac{1 - p}{p^2}\\) 9.3 normal \\(\\mu \\in \\mathbb{R}\\), \\(\\sigma^2 > 0\\) \\(x \\in \\mathbb{R}\\) \\(\\frac{1}{\\sqrt{2 \\pi \\sigma^2}} e^{-\\frac{(x - \\mu)^2}{2 \\sigma^2}}\\) 4.12 \\(\\mu\\) 7.8 \\(\\sigma^2\\) 7.8 uniform \\(a,b \\in \\mathbb{R}\\), \\(a < b\\) \\(x \\in [a,b]\\) \\(\\frac{1}{b-a}\\) 4.9 \\(\\frac{a+b}{2}\\) \\(\\frac{(b-a)^2}{12}\\) beta \\(\\alpha,\\beta > 0\\) \\(x \\in [0,1]\\) \\(\\frac{x^{\\alpha - 1} (1 - x)^{\\beta - 1}}{\\text{B}(\\alpha, \\beta)}\\) 4.10 \\(\\frac{\\alpha}{\\alpha + \\beta}\\) 7.6 \\(\\frac{\\alpha \\beta}{(\\alpha + \\beta)^2(\\alpha + \\beta + 1)}\\) 7.6 gamma \\(\\alpha,\\beta > 0\\) \\(x \\in (0, \\infty)\\) \\(\\frac{\\beta^\\alpha}{\\Gamma(\\alpha)} x^{\\alpha - 1}e^{-\\beta x}\\) 4.11 \\(\\frac{\\alpha}{\\beta}\\) 7.5 \\(\\frac{\\alpha}{\\beta^2}\\) 7.5 exponential \\(\\lambda > 0\\) \\(x \\in [0, \\infty)\\) \\(\\lambda e^{-\\lambda x}\\) 4.8 \\(\\frac{1}{\\lambda}\\) 7.7 \\(\\frac{1}{\\lambda^2}\\) 7.7 logistic \\(\\mu \\in \\mathbb{R}\\), \\(s > 0\\) \\(x \\in \\mathbb{R}\\) \\(\\frac{e^{-\\frac{x - \\mu}{s}}}{s(1 + e^{-\\frac{x - \\mu}{s}})^2}\\) 4.13 \\(\\mu\\) \\(\\frac{s^2 \\pi^2}{3}\\) negative binomial \\(r \\in \\mathbb{N}\\), \\(p \\in [0,1]\\) \\(k \\in \\mathbb{N}_0\\) \\(\\binom{k + r - 1}{k}(1-p)^r p^k\\) 4.7 \\(\\frac{rp}{1 - p}\\) 9.2 \\(\\frac{rp}{(1 - p)^2}\\) 9.2 multinomial \\(n \\in \\mathbb{N}\\), \\(k \\in \\mathbb{N}\\) \\(p_i \\in [0,1]\\), \\(\\sum p_i = 1\\) \\(x_i \\in \\{0,..., n\\}\\), \\(i \\in \\{1,...,k\\}\\), \\(\\sum{x_i} = n\\) \\(\\frac{n!}{x_1!x_2!...x_k!} p_1^{x_1} p_2^{x_2}...p_k^{x_k}\\) 8.1 \\(np_i\\) \\(np_i(1-p_i)\\) "],["references.html", "References", " References "],["404.html", "Page not found", " Page not found The page you requested cannot be found (perhaps it was moved or renamed). You may want to try searching to find the page's new location, or use the table of contents to find the page you are looking for. 
"]] diff --git a/docs/uprobspaces.html b/docs/uprobspaces.html index 4989568..53ba810 100644 --- a/docs/uprobspaces.html +++ b/docs/uprobspaces.html @@ -20,10 +20,10 @@ - + - + diff --git a/index.Rmd b/index.Rmd index 09b1aca..41c8501 100644 --- a/index.Rmd +++ b/index.Rmd @@ -1,6 +1,6 @@ --- title: "Principles of Uncertainty -- exercises" -author: "Gregor Pirš and Erik Štrumbelj" +author: "Gregor Pirš, Erik Štrumbelj, David Nabergoj and Leon Hvastja" date: "`r Sys.Date()`" site: bookdown::bookdown_site documentclass: book