Sometimes one has an ECDF or random sample and wants to compare it to the CDF of some known distribution. One could plot the ECDF of the sample and the CDF of the distribution and compare them, but they will always deviate, and it's difficult to tell if the difference is meaningful.
There are two ways to handle this: plotting the ECDF difference, and/or plotting a confidence band.
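As a rough illustration of the first option, the ECDF difference can be computed at the observed points by subtracting the reference CDF from the ECDF. The sketch below uses Python with only the standard library; the helper name `ecdf_difference` and the standard normal reference distribution are assumptions made for illustration, not part of any proposed interface.

```python
import random
from bisect import bisect_right
from statistics import NormalDist


def ecdf_difference(sample, cdf):
    """Return the sorted sample and ECDF(x) - cdf(x) at each sorted point.

    `ecdf_difference` is a hypothetical helper name used only in this sketch.
    """
    xs = sorted(sample)
    n = len(xs)
    # ECDF(x) is the fraction of observations <= x; bisect_right handles ties.
    diffs = [bisect_right(xs, x) / n - cdf(x) for x in xs]
    return xs, diffs


# Assumed reference distribution for the demo: a standard normal.
d = NormalDist(0.0, 1.0)
rng = random.Random(0)
sample = [rng.gauss(0.0, 1.0) for _ in range(200)]

xs, diffs = ecdf_difference(sample, d.cdf)
print("largest absolute ECDF difference:", max(abs(v) for v in diffs))
```

Plotting `diffs` against `xs` gives a curve centered on zero when the sample matches the reference distribution, which tends to make small deviations easier to see than overlaying the two CDFs.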
If `x` is exactly sampled from the distribution `d`, then the marginal (i.e. pointwise) distribution of the ECDF at each observed `x` value has a known form `ecdf_marginal(d, N, x) = Binomial(N, cdf(d, x)) / N`, where `N` is the size of the sample. One can use this to plot a confidence band that gives a sense for what deviation from `d` should be expected given the sample size `N`. An example:
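A minimal numerical sketch of this pointwise band, in Python with only the standard library, follows. It takes the quantiles of `Binomial(N, cdf(d, x))` at `alpha/2` and `1 - alpha/2` and divides them by `N`. The names `binomial_quantile` and `pointwise_band` are hypothetical helpers, the reference distribution is assumed to be a standard normal, and the direct summation of the binomial PMF is only suitable for moderate `N`.

```python
from math import comb
from statistics import NormalDist


def binomial_quantile(n, p, q):
    """Smallest k such that P(Binomial(n, p) <= k) >= q.

    Sums the PMF directly, which is fine for moderate n (hypothetical helper).
    """
    total = 0.0
    for k in range(n + 1):
        total += comb(n, k) * p**k * (1.0 - p) ** (n - k)
        if total >= q:
            return k
    return n  # guard against floating-point rounding in the sum


def pointwise_band(cdf, n, xs, level=0.95):
    """Central `level` interval of Binomial(n, cdf(x)) / n at each x in xs."""
    alpha = 1.0 - level
    band = []
    for x in xs:
        p = cdf(x)
        lo = binomial_quantile(n, p, alpha / 2) / n
        hi = binomial_quantile(n, p, 1.0 - alpha / 2) / n
        band.append((lo, hi))
    return band


# Assumed setup for the demo: standard normal reference, sample size 200.
d = NormalDist(0.0, 1.0)
for x, (lo, hi) in zip([-1.0, 0.0, 1.0], pointwise_band(d.cdf, 200, [-1.0, 0.0, 1.0])):
    print(f"x={x:+.1f}  cdf={d.cdf(x):.3f}  band=({lo:.3f}, {hi:.3f})")
```

For an ECDF difference plot, the same band can be shifted by subtracting `cdf(x)` from both `lo` and `hi`, so that it is centered near zero.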
It's worth noting that because the confidence interval is pointwise, the envelope should not be expected to contain the entire ECDF 95% of the time; the probability that the whole curve stays inside it is in fact lower than that. It might be worth having an option to generate simultaneous confidence bands, which we could probably do by adapting the simulation-based method in (ref). These are more easily interpretable, as they should contain the entire ECDF curve 95% of the time.
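The gap between pointwise and simultaneous coverage can be checked with a small Monte Carlo experiment. The sketch below is not the method in (ref); it only estimates how often the whole ECDF stays inside the 95% pointwise band at every observed point. It relies on the fact that `cdf(d, x)` is uniform on [0, 1] when `x` is sampled from `d`, so uniform draws stand in for the CDF values. All function names are hypothetical.

```python
import random
from math import comb


def binomial_interval_counts(n, p, level):
    """Counts (lo, hi): Binomial(n, p) quantiles at alpha/2 and 1 - alpha/2."""
    alpha = 1.0 - level
    lo, hi = None, n  # hi defaults to n to guard against rounding in the sum
    total = 0.0
    for k in range(n + 1):
        total += comb(n, k) * p**k * (1.0 - p) ** (n - k)
        if lo is None and total >= alpha / 2:
            lo = k
        if total >= 1.0 - alpha / 2:
            hi = k
            break
    return (n if lo is None else lo), hi


def curve_inside_band(us, level=0.95):
    """us: sorted values cdf(d, x_i) for one sample.

    Returns True if the ECDF lies within the pointwise band at every x_i.
    """
    n = len(us)
    for i, u in enumerate(us):
        lo, hi = binomial_interval_counts(n, u, level)
        # The ECDF at the (i + 1)-th smallest point equals (i + 1) / n.
        if not lo <= i + 1 <= hi:
            return False
    return True


def simultaneous_coverage(n, level=0.95, reps=200, seed=0):
    """Estimate the fraction of samples whose ECDF stays inside the band."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(reps):
        us = sorted(rng.random() for _ in range(n))
        hits += curve_inside_band(us, level)
    return hits / reps


print("estimated simultaneous coverage:", simultaneous_coverage(20))
```

One simple way to turn this into a simultaneous band would be to raise the pointwise `level` until the estimated coverage reaches 95%, though whether that matches the referenced method is an open question here.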
sethaxen changed the title from "Add ecdf difference plot with confidence band" to "ECDF difference plot with confidence band" on Jun 24, 2021
I propose that this be callable with an interface like