Merge pull request #3 from LeonHvastja/distributions_chapter
minor fixes
LeonHvastja authored Oct 2, 2023
2 parents 287668d + 73c8043 commit a5185e1
Showing 3 changed files with 22 additions and 16 deletions.
17 changes: 10 additions & 7 deletions 18-distributions_intuition.Rmd
@@ -41,7 +41,7 @@ $(document).ready(function() {
## Discrete distributions

```{exercise, name = "Bernoulli intuition 1"}
-Arguably the simplest distribution you will enocounter is the Bernoulli distribution.
+Arguably the simplest distribution you will encounter is the Bernoulli distribution.
It is a discrete probability distribution used to represent the outcome of a yes/no
question. It has one parameter $p$ which is the probability of success. The
probability of failure is $(1-p)$, sometimes denoted as $q$.
@@ -127,7 +127,7 @@ into the pmf.
```{solution, echo = togs}
a. The pmf of a binomial distribution is $\binom{n}{k} p^k (1 - p)^{n - k}$, now
we insert $n=1$ to get:
-$$\binom{1}{k} p^k (1 - p)^{1 - k}$$.
+$$\binom{1}{k} p^k (1 - p)^{1 - k}$$
Not quite equivalent to
a Bernoulli, however note that the support of the binomial distribution is
defined as $k \in \{0,1,\dots,n\}$, so in our case $k = \{0,1\}$, then:
@@ -240,7 +240,7 @@ occur at a constant mean rate and independently of each other - a **Poisson proc
It has a single parameter $\lambda$, which represents the constant mean rate.
A classic example of a scenario that can be modeled using the Poisson distribution
-is the number of calls received by a call center in a day (or in fact any other
+is the number of calls received at a call center in a day (or in fact any other
time interval).
Suppose you work in a call center and have some understanding of probability
@@ -281,11 +281,14 @@ to get the probability of our original question.
```{exercise, name = "Geometric intuition 1"}
The geometric distribution is a discrete distribution that models the **number of
failures** before the first success in a sequence of independent Bernoulli trials.
-It has a single parameter $p$, representing the probability of success.
+It has a single parameter $p$, representing the probability of success and its
+support is all non-negative integers $\{0,1,2,\dots\}$.
NOTE: There are two forms of this distribution, the one we just described
and another that models the **number of trials** before the first success. The
difference is subtle yet significant and you are likely to encounter both forms.
+The key to telling them apart is to check their support, since the number of trials
+has to be at least $1$, for this case we have $\{1,2,\dots\}$.
In the graph below we show the pmf of a geometric distribution with $p=0.5$. This
can be thought of as the number of successive failures (tails) in the flip of a fair coin.
@@ -402,7 +405,7 @@ b. Inserting the parameter values we get:$$f(x) =
0 & \text{otherwise}
\end{cases}
$$
-Notice how the pdf is just a constant $1$ across all values of $x \in [0,1]$. Here it is important to distinguish between probability and **probability density**. The density may be 1, but the probability is not and while discreet distributions never exceed 1 on the y-axis, continuous distributions can go as high as you like.
+Notice how the pdf is just a constant $1$ across all values of $x \in [0,1]$. Here it is important to distinguish between probability and **probability density**. The density may be 1, but the probability is not and while discrete distributions never exceed 1 on the y-axis, continuous distributions can go as high as you like.
```
</div>
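
The difference between density and probability noted in the solution above can be checked numerically. This is a minimal sketch, not part of the original chapter: it relies on base R's `dunif` and `punif`, the variable names are hypothetical, and the narrower interval $[0, 0.5]$ is an assumption picked only to produce a density above 1.

```r
# Density of Uniform(0, 1) at a point inside the support
dens_unit <- dunif(0.3, min = 0, max = 1)      # 1

# Uniform(0, 0.5) is half as wide, so its density doubles and exceeds 1
dens_narrow <- dunif(0.3, min = 0, max = 0.5)  # 2

# Probabilities are areas under the pdf, so they never exceed 1
prob_unit   <- punif(0.6, 0, 1) - punif(0.2, 0, 1)      # P(0.2 <= X <= 0.6) = 0.4
prob_narrow <- punif(0.5, 0, 0.5) - punif(0, 0, 0.5)    # total probability = 1
```

A density of 2 is legitimate here because the area under the curve, $2 \cdot 0.5$, is still 1.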

@@ -412,7 +415,7 @@ The normal distribution, also known as the Gaussian distribution, is a continuou
Below, we graph the distribution of IQ scores for two different populations.
-We aim to identify individuals with an IQ at or above 140 for an experiment. We can identify them reliably; however, we only have time to examine one of the two groups. Which group should we investigate to have the best chance?
+We aim to identify individuals with an IQ at or above 140 for an experiment. We can identify them reliably; however, we only have time to examine one of the two groups. Which group should we investigate to have the best chance of finding such individuals?
NOTE: The graph below displays the parameter $\sigma$, which is the square root of the variance, more commonly referred to as the **standard deviation**. Keep this in mind when solving the problems.
@@ -466,7 +469,7 @@ a. Group 1: $\mu = 100, \sigma=10 \rightarrow \sigma^2 = 100$
$$\frac{1}{\sqrt{2 \pi \sigma^2}} e^{-\frac{(x - \mu)^2}{2 \sigma^2}} =
\frac{1}{\sqrt{2 \pi 64}} e^{-\frac{(140 - 105)^2}{2 \cdot 64}} \approx 3.48e-06$$
-So despite the fact that group 1 has a lower average IQ, we are more likely to find 140 IQ individuals in group 2.
+So despite the fact that group 1 has a lower average IQ, we are more likely to find 140 IQ individuals in this group.
```
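
The densities from part a can be reproduced as a quick check. This sketch is not from the original chapter and its variable names are hypothetical. It uses base R's `dnorm`, which takes the standard deviation $\sigma$ rather than the variance $\sigma^2$, with the parameters shown in the hunk above (group 1: $\mu = 100, \sigma = 10$; group 2: $\mu = 105$, $\sigma^2 = 64$, so $\sigma = 8$).

```r
x <- 140

# dnorm expects the standard deviation, not the variance
dens_group1 <- dnorm(x, mean = 100, sd = 10)  # about 1.34e-05
dens_group2 <- dnorm(x, mean = 105, sd = 8)   # about 3.48e-06

# TRUE: group 1 has the higher density at x = 140
group1_higher <- dens_group1 > dens_group2
```

The wider spread of group 1 outweighs its lower mean this far out in the tail, which is why its density at 140 is the larger of the two.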
b.
```{r, echo=togs}
19 changes: 11 additions & 8 deletions docs/distributions-intutition.html
@@ -327,7 +327,7 @@ <h1><span class="header-section-number">Chapter 18</span> Distributions intutiti
<div id="discrete-distributions" class="section level2 hasAnchor" number="18.1">
<h2><span class="header-section-number">18.1</span> Discrete distributions<a href="distributions-intutition.html#discrete-distributions" class="anchor-section" aria-label="Anchor link to header"></a></h2>
<div class="exercise">
-<p><span id="exr:unnamed-chunk-277" class="exercise"><strong>Exercise 18.1 (Bernoulli intuition 1) </strong></span>Arguably the simplest distribution you will enocounter is the Bernoulli distribution.
+<p><span id="exr:unnamed-chunk-277" class="exercise"><strong>Exercise 18.1 (Bernoulli intuition 1) </strong></span>Arguably the simplest distribution you will encounter is the Bernoulli distribution.
It is a discrete probability distribution used to represent the outcome of a yes/no
question. It has one parameter <span class="math inline">\(p\)</span> which is the probability of success. The
probability of failure is <span class="math inline">\((1-p)\)</span>, sometimes denoted as <span class="math inline">\(q\)</span>.</p>
@@ -389,7 +389,7 @@ <h2><span class="header-section-number">18.1</span> Discrete distributions<a hre
<ol style="list-style-type: lower-alpha">
<li><p>The pmf of a binomial distribution is <span class="math inline">\(\binom{n}{k} p^k (1 - p)^{n - k}\)</span>, now
we insert <span class="math inline">\(n=1\)</span> to get:
-<span class="math display">\[\binom{1}{k} p^k (1 - p)^{1 - k}\]</span>.
+<span class="math display">\[\binom{1}{k} p^k (1 - p)^{1 - k}\]</span>
Not quite equivalent to
a Bernoulli, however note that the support of the binomial distribution is
defined as <span class="math inline">\(k \in \{0,1,\dots,n\}\)</span>, so in our case <span class="math inline">\(k = \{0,1\}\)</span>, then:
@@ -441,7 +441,7 @@ <h2><span class="header-section-number">18.1</span> Discrete distributions<a hre
occur at a constant mean rate and independently of each other - a <strong>Poisson process</strong>.</p>
<p>It has a single parameter <span class="math inline">\(\lambda\)</span>, which represents the constant mean rate.</p>
<p>A classic example of a scenario that can be modeled using the Poisson distribution
-is the number of calls received by a call center in a day (or in fact any other
+is the number of calls received at a call center in a day (or in fact any other
time interval).</p>
<p>Suppose you work in a call center and have some understanding of probability
distributions. You overhear your supervisor mentioning that the call center
@@ -476,10 +476,13 @@ <h2><span class="header-section-number">18.1</span> Discrete distributions<a hre
<div class="exercise">
<p><span id="exr:unnamed-chunk-288" class="exercise"><strong>Exercise 18.5 (Geometric intuition 1) </strong></span>The geometric distribution is a discrete distribution that models the <strong>number of
failures</strong> before the first success in a sequence of independent Bernoulli trials.
-It has a single parameter <span class="math inline">\(p\)</span>, representing the probability of success.</p>
+It has a single parameter <span class="math inline">\(p\)</span>, representing the probability of success and its
+support is all non-negative integers <span class="math inline">\(\{0,1,2,\dots\}\)</span>.</p>
<p>NOTE: There are two forms of this distribution, the one we just described
and another that models the <strong>number of trials</strong> before the first success. The
-difference is subtle yet significant and you are likely to encounter both forms.</p>
+difference is subtle yet significant and you are likely to encounter both forms.
+The key to telling them apart is to check their support, since the number of trials
+has to be at least <span class="math inline">\(1\)</span>, for this case we have <span class="math inline">\(\{1,2,\dots\}\)</span>.</p>
<p>In the graph below we show the pmf of a geometric distribution with <span class="math inline">\(p=0.5\)</span>. This
can be thought of as the number of successive failures (tails) in the flip of a fair coin.
You can see that there’s a 50% chance you will have zero failures i.e. you will
@@ -552,14 +555,14 @@ <h2><span class="header-section-number">18.2</span> Continuous distributions<a h
0 &amp; \text{otherwise}
\end{cases}
\]</span>
-Notice how the pdf is just a constant <span class="math inline">\(1\)</span> across all values of <span class="math inline">\(x \in [0,1]\)</span>. Here it is important to distinguish between probability and <strong>probability density</strong>. The density may be 1, but the probability is not and while discreet distributions never exceed 1 on the y-axis, continuous distributions can go as high as you like.</li>
+Notice how the pdf is just a constant <span class="math inline">\(1\)</span> across all values of <span class="math inline">\(x \in [0,1]\)</span>. Here it is important to distinguish between probability and <strong>probability density</strong>. The density may be 1, but the probability is not and while discrete distributions never exceed 1 on the y-axis, continuous distributions can go as high as you like.</li>
</ol>
</div>
</div>
<div class="exercise">
<p><span id="exr:unnamed-chunk-296" class="exercise"><strong>Exercise 18.7 (Normal intuition 1) </strong></span>The normal distribution, also known as the Gaussian distribution, is a continuous distribution that encompasses the entire real number line. It has two parameters: the mean, denoted by <span class="math inline">\(\mu\)</span>, and the variance, represented by <span class="math inline">\(\sigma^2\)</span>. Its shape resembles the iconic bell curve. The position of its peak is determined by the parameter <span class="math inline">\(\mu\)</span>, while the variance determines the spread or width of the curve. A smaller variance results in a sharper, narrower peak, while a larger variance leads to a broader, more spread-out curve.</p>
<p>Below, we graph the distribution of IQ scores for two different populations.</p>
-<p>We aim to identify individuals with an IQ at or above 140 for an experiment. We can identify them reliably; however, we only have time to examine one of the two groups. Which group should we investigate to have the best chance?</p>
+<p>We aim to identify individuals with an IQ at or above 140 for an experiment. We can identify them reliably; however, we only have time to examine one of the two groups. Which group should we investigate to have the best chance of finding such individuals?</p>
<p>NOTE: The graph below displays the parameter <span class="math inline">\(\sigma\)</span>, which is the square root of the variance, more commonly referred to as the <strong>standard deviation</strong>. Keep this in mind when solving the problems.</p>
<ol style="list-style-type: lower-alpha">
<li><p>Insert the values of either population into the pdf of a normal distribution and determine which one has a higher density at <span class="math inline">\(x=140\)</span>.</p></li>
@@ -582,7 +585,7 @@ <h2><span class="header-section-number">18.2</span> Continuous distributions<a h
<span class="math display">\[\frac{1}{\sqrt{2 \pi \sigma^2}} e^{-\frac{(x - \mu)^2}{2 \sigma^2}} =
\frac{1}{\sqrt{2 \pi 64}} e^{-\frac{(140 - 105)^2}{2 \cdot 64}} \approx 3.48e-06\]</span></li>
</ol>
-<p>So despite the fact that group 1 has a lower average IQ, we are more likely to find 140 IQ individuals in group 2.</p>
+<p>So despite the fact that group 1 has a lower average IQ, we are more likely to find 140 IQ individuals in this group.</p>
</div>
<ol start="2" style="list-style-type: lower-alpha">
<li></li>
2 changes: 1 addition & 1 deletion docs/search_index.json

Large diffs are not rendered by default.