From 1fc0b8cef5510b7bc9330caa0ea6c8822d99fd55 Mon Sep 17 00:00:00 2001 From: poldrack Date: Tue, 31 Aug 2021 15:00:31 +0000 Subject: [PATCH] deploy: 688ecbb088b7cf2d0dcd6fe03323a3a4790674ae --- 01-IntroductionToPython.html | 10 +-- 02-SummarizingData.html | 4 +- 03-DataVisualization.html | 18 ++--- 04-FittingSimpleModels.html | 22 +++--- 05-Probability.html | 4 +- 06-Sampling.html | 14 ++-- 07-ResamplingAndSimulation.html | 12 +-- 08-HypothesisTesting.html | 18 ++--- 09-StatisticalPower.html | 6 +- 10-BayesianStatistics.html | 12 +-- 11-ModelingCategoricalRelationships.html | 8 +- 13-GeneralLinearModel.html | 69 ++++-------------- _images/03-DataVisualization_23_1.png | Bin 26135 -> 26110 bytes _images/03-DataVisualization_30_1.png | Bin 27631 -> 27976 bytes _images/04-FittingSimpleModels_46_1.png | Bin 5536 -> 5218 bytes _images/04-FittingSimpleModels_48_1.png | Bin 7113 -> 5321 bytes _images/06-Sampling_11_0.png | Bin 4692 -> 5028 bytes _images/06-Sampling_13_1.png | Bin 26605 -> 26734 bytes _images/06-Sampling_5_1.png | Bin 10947 -> 11088 bytes _images/07-ResamplingAndSimulation_1_0.png | Bin 18534 -> 18811 bytes _images/07-ResamplingAndSimulation_7_0.png | Bin 5866 -> 5884 bytes ..._9_1.png => 13-GeneralLinearModel_9_0.png} | Bin _sources/03-DataVisualization.ipynb | 4 +- _sources/06-Sampling.ipynb | 2 +- _sources/07-ResamplingAndSimulation.ipynb | 5 +- _sources/08-HypothesisTesting.ipynb | 5 +- _sources/09-StatisticalPower.ipynb | 2 +- _sources/10-BayesianStatistics.ipynb | 4 +- _sources/13-GeneralLinearModel.ipynb | 4 +- genindex.html | 4 +- index.html | 4 +- objects.inv | Bin 533 -> 529 bytes search.html | 4 +- searchindex.js | 2 +- 34 files changed, 100 insertions(+), 137 deletions(-) rename _images/{13-GeneralLinearModel_9_1.png => 13-GeneralLinearModel_9_0.png} (100%) diff --git a/01-IntroductionToPython.html b/01-IntroductionToPython.html index 71470ce..2be3045 100644 --- a/01-IntroductionToPython.html +++ b/01-IntroductionToPython.html @@ -96,7 +96,7 @@

Python Companion to Statistical Thinking i
  • - Resampling and simulation in R + Resampling and simulation
  • @@ -121,7 +121,7 @@

    Python Companion to Statistical Thinking i

  • - The General Linear Model in Python + The General Linear Model
  • @@ -660,7 +660,7 @@

    Functions -
    -0.7781456934580733
    +
    -0.16732079539678754
     
    @@ -674,8 +674,8 @@

    Functions -
    array([110.59242224, 120.2608977 ,  92.63319893,  78.17154929,
    -        86.39200147])
    +
    array([ 93.87826084, 107.02436875,  87.70148102, 107.76083003,
    +       116.53109624])
     
    diff --git a/02-SummarizingData.html b/02-SummarizingData.html index 3948d91..f02a9ad 100644 --- a/02-SummarizingData.html +++ b/02-SummarizingData.html @@ -96,7 +96,7 @@

    Python Companion to Statistical Thinking i
  • - Resampling and simulation in R + Resampling and simulation
  • @@ -121,7 +121,7 @@

    Python Companion to Statistical Thinking i

  • - The General Linear Model in Python + The General Linear Model
  • diff --git a/03-DataVisualization.html b/03-DataVisualization.html index 6eb2669..ecf3ed9 100644 --- a/03-DataVisualization.html +++ b/03-DataVisualization.html @@ -96,7 +96,7 @@

    Python Companion to Statistical Thinking i
  • - Resampling and simulation in R + Resampling and simulation
  • @@ -121,7 +121,7 @@

    Python Companion to Statistical Thinking i

  • - The General Linear Model in Python + The General Linear Model
  • @@ -318,7 +318,7 @@

    Bar vs. line plots -
    [<matplotlib.lines.Line2D at 0x7f1b2ec4d6d0>]
    +
    [<matplotlib.lines.Line2D at 0x7f288ee47190>]
     
    _images/03-DataVisualization_14_1.png @@ -333,7 +333,7 @@

    Bar vs. line plots -
    <matplotlib.axes._subplots.AxesSubplot at 0x7f1b2ebfd760>
    +
    <matplotlib.axes._subplots.AxesSubplot at 0x7f288edf3d60>
     
    _images/03-DataVisualization_16_1.png @@ -376,7 +376,7 @@

    Plots with two variables -
    <matplotlib.axes._subplots.AxesSubplot at 0x7f1b2ea6df10>
    +
    <matplotlib.axes._subplots.AxesSubplot at 0x7f288ec89580>
     
    _images/03-DataVisualization_23_1.png @@ -396,7 +396,7 @@

    Plotting dispersion -
    <matplotlib.axes._subplots.AxesSubplot at 0x7f1b2e92c7f0>
    +
    <matplotlib.axes._subplots.AxesSubplot at 0x7f288eb371c0>
     
    _images/03-DataVisualization_26_1.png @@ -412,7 +412,7 @@

    Plotting dispersion -
    <matplotlib.axes._subplots.AxesSubplot at 0x7f1b2c05c4f0>
    +
    <matplotlib.axes._subplots.AxesSubplot at 0x7f288c441c40>
     
    _images/03-DataVisualization_28_1.png @@ -428,8 +428,8 @@

    Scatter plotdata=adult_nhanes_data) plt.plot([adult_nhanes_data['SystolicBloodPres1StRdgMmHg'].min(), adult_nhanes_data['SystolicBloodPres1StRdgMmHg'].max()], - [adult_nhanes_data['SystolicBloodPres2NdRdgMmHg'].min(), - adult_nhanes_data['SystolicBloodPres2NdRdgMmHg'].max()], + [adult_nhanes_data['SystolicBloodPres1StRdgMmHg'].min(), + adult_nhanes_data['SystolicBloodPres1StRdgMmHg'].max()], color='k') plt.xlabel('Systolic BP - First reading') plt.ylabel('Systolic BP - Second reading') diff --git a/04-FittingSimpleModels.html b/04-FittingSimpleModels.html index 6b9bd31..82d3656 100644 --- a/04-FittingSimpleModels.html +++ b/04-FittingSimpleModels.html @@ -96,7 +96,7 @@

    Python Companion to Statistical Thinking i
  • - Resampling and simulation in R + Resampling and simulation
  • @@ -121,7 +121,7 @@

    Python Companion to Statistical Thinking i

  • - The General Linear Model in Python + The General Linear Model
  • @@ -485,7 +485,7 @@

    Variability -
    99.32116733780761
    +
    118.0052013422819
     
    @@ -498,7 +498,7 @@

    Variability -
    99.3211673378076
    +
    118.00520134228196
     
    @@ -512,7 +512,7 @@

    Variability -
    9.966000568824366
    +
    10.863019899746197
     
    @@ -525,7 +525,7 @@

    Variability -
    9.966000568824366
    +
    10.8630198997462
     
    @@ -559,10 +559,10 @@

    Z-scores -
    (array([10., 21., 15., 27., 16., 26.,  6., 15., 10.,  4.]),
    - array([-1.78132306, -1.37393798, -0.96655289, -0.55916781, -0.15178272,
    -         0.25560237,  0.66298745,  1.07037254,  1.47775763,  1.88514271,
    -         2.2925278 ]),
    +
    (array([ 6., 15., 21., 29., 20., 21., 21., 10.,  5.,  2.]),
    + array([-2.05559782, -1.58611511, -1.1166324 , -0.64714969, -0.17766699,
    +         0.29181572,  0.76129843,  1.23078114,  1.70026385,  2.16974655,
    +         2.63922926]),
      <a list of 10 Patch objects>)
     
    @@ -577,7 +577,7 @@

    Z-scores -
    <matplotlib.collections.PathCollection at 0x7f174d259520>
    +
    <matplotlib.collections.PathCollection at 0x7f1ffead45e0>
     
    _images/04-FittingSimpleModels_48_1.png diff --git a/05-Probability.html b/05-Probability.html index 73ab3a6..fa1e638 100644 --- a/05-Probability.html +++ b/05-Probability.html @@ -96,7 +96,7 @@

    Python Companion to Statistical Thinking i
  • - Resampling and simulation in R + Resampling and simulation
  • @@ -121,7 +121,7 @@

    Python Companion to Statistical Thinking i

  • - The General Linear Model in Python + The General Linear Model
  • diff --git a/06-Sampling.html b/06-Sampling.html index 76033b5..4b86ee9 100644 --- a/06-Sampling.html +++ b/06-Sampling.html @@ -31,7 +31,7 @@ - + @@ -96,7 +96,7 @@

    Python Companion to Statistical Thinking i
  • - Resampling and simulation in R + Resampling and simulation
  • @@ -121,7 +121,7 @@

    Python Companion to Statistical Thinking i

  • - The General Linear Model in Python + The General Linear Model
  • @@ -267,7 +267,7 @@

    Sampling error# we need to use the maximum of those data to set # the height of the vertical line that shows the mean plt.axvline(x=adult_nhanes_data['Height'].mean(), - ymax=np.max(hist[0]), color='k') + ymax=1, color='k') # draw the normal distribution with same mean and standard deviation # as the sampling distribution @@ -285,7 +285,7 @@

    Sampling error -
    -

    - diff --git a/09-StatisticalPower.html b/09-StatisticalPower.html index 8a18a77..cbdc848 100644 --- a/09-StatisticalPower.html +++ b/09-StatisticalPower.html @@ -96,7 +96,7 @@

    Python Companion to Statistical Thinking i
  • - Resampling and simulation in R + Resampling and simulation
  • @@ -121,7 +121,7 @@

    Python Companion to Statistical Thinking i

  • - The General Linear Model in Python + The General Linear Model
  • @@ -237,7 +237,7 @@

    Statistical Power Analysis in Python

    Power analysis

    -

    We can compute a power analysis using functions from the statsmodels.stats.power package. Let’s focus on the power for an independent samples t-test in order to determine a difference in the mean between two groups. Let’s say that we think than an effect size of Cohen’s d=0.5 is realistic for the study in question (based on previous research) and would be of scientific interest. We wish to have 80% power to find the effect if it exists. We can compute the sample size needed for adequate power using the TTestIndPower() function:

    +

    We can compute a power analysis using functions from the statsmodels.stats.power package. Let’s focus on the power for an independent samples t-test in order to determine a difference in the mean between two groups. Let’s say that we think that an effect size of Cohen’s d=0.5 is realistic for the study in question (based on previous research) and would be of scientific interest. We wish to have 80% power to find the effect if it exists. We can compute the sample size needed for adequate power using the TTestIndPower() function:
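For reference, a minimal sketch of the power analysis this paragraph describes (assuming the statsmodels defaults of alpha = 0.05, a two-sided test, and equal group sizes, since the hunk does not show the call itself):

```python
# Sketch of the power analysis described above, assuming
# d = 0.5, power = 0.8, alpha = 0.05 (two-sided), equal group sizes.
from statsmodels.stats.power import TTestIndPower

analysis = TTestIndPower()
sample_size = analysis.solve_power(effect_size=0.5, power=0.8, alpha=0.05)
print(sample_size)  # required sample size per group (approx. 64)
```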

    
    diff --git a/10-BayesianStatistics.html b/10-BayesianStatistics.html
    index f624e71..23b99b1 100644
    --- a/10-BayesianStatistics.html
    +++ b/10-BayesianStatistics.html
    @@ -96,7 +96,7 @@ 

    Python Companion to Statistical Thinking i
  • - Resampling and simulation in R + Resampling and simulation
  • @@ -121,7 +121,7 @@

    Python Companion to Statistical Thinking i

  • - The General Linear Model in Python + The General Linear Model
  • @@ -221,7 +221,7 @@

    Bayesian Statistics in Python

    Applying Bayes’ theorem: A simple example

    TBD: MOVE TO MULTIPLE TESTING EXAMPLE SO WE CAN USE BINOMIAL LIKELIHOOD -A person has a cough and flu-like symptoms, and gets a PCR test for COVID-19, which comes back postiive. What is the likelihood that they actually have COVID-19, as opposed a regular cold or flu? We can use Bayes’ theorem to compute this. Let’s say that the local rate of symptomatic individuals who actually are infected with COVID-19 is 7.4% (as reported on July 10, 2020 for San Francisco); thus, our prior probability that someone with symptoms actually has COVID-19 is .074. The RT-PCR test used to identify COVID-19 RNA is highly specific (that is, it very rarelly reports the presence of the virus when it is not present); for our example, we will say that the specificity is 99%. Its sensitivity is not known, but probably is no higher than 90%.
+A person has a cough and flu-like symptoms, and gets a PCR test for COVID-19, which comes back positive. What is the likelihood that they actually have COVID-19, as opposed to a regular cold or flu? We can use Bayes’ theorem to compute this. Let’s say that the local rate of symptomatic individuals who actually are infected with COVID-19 is 7.4% (as reported on July 10, 2020 for San Francisco); thus, our prior probability that someone with symptoms actually has COVID-19 is .074. The RT-PCR test used to identify COVID-19 RNA is highly specific (that is, it very rarely reports the presence of the virus when it is not present); for our example, we will say that the specificity is 99%. Its sensitivity is not known, but probably is no higher than 90%.
    First let’s look at the probability of disease given a single positive test.
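A minimal sketch of that single-positive-test computation, using the values stated above (prior = .074, specificity = 99%, sensitivity = 90%); the definitions of `likelihood` and `marginal_likelihood` here are an assumption, since the hunk only shows the final `posterior` line:

```python
# Bayes' theorem for a single positive test, using the values from the text.
# Assumed definitions (not shown in the hunk): likelihood is the test's
# sensitivity, and the marginal likelihood sums over disease/no-disease.
prior = 0.074
sensitivity = 0.90   # P(positive | disease)
specificity = 0.99   # P(positive | no disease) = 1 - specificity

likelihood = sensitivity
marginal_likelihood = sensitivity * prior + (1 - specificity) * (1 - prior)
posterior = (likelihood * prior) / marginal_likelihood
print(posterior)  # approx. 0.88
```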

    @@ -234,6 +234,8 @@

    Applying Bayes’ theorem: A simple exampleposterior = (likelihood * prior) / marginal_likelihood posterior + +

    @@ -316,7 +318,7 @@

    Estimating posterior distributions -
    <matplotlib.legend.Legend at 0x7f4d3cdf0b80>
    +
    <matplotlib.legend.Legend at 0x7f1ad373bb80>
     
    _images/10-BayesianStatistics_5_1.png @@ -360,7 +362,7 @@

    Estimating posterior distributions -