diff --git a/docs/notebooks/Ensemble.ipynb b/docs/notebooks/Ensemble.ipynb
index 1fd4e23..3b9621b 100644
--- a/docs/notebooks/Ensemble.ipynb
+++ b/docs/notebooks/Ensemble.ipynb
@@ -401,72 +401,73 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "We could go on to do inference using `NUTS` or `ADVI` or any of the other samplers supported by `pymc3`. But that would take a few hours (at least). Since we have many light curves, we can actually get good point estimates by just optimizing the log probability function (which should take only a minute or two). We can then approximate the posterior using a Laplace approximation at the mode."
+    "We could go on to do inference using `NUTS` or `ADVI` or any of the other samplers supported by `pymc3`. But that would take a few hours (at least). Since we have many light curves, let's just optimize the log probability function to get the MAP (maximum a posteriori) solution -- that will be a good estimate of the true spot properties."
    ]
   },
   {
-   "cell_type": "raw",
-   "metadata": {
-    "raw_mimetype": "text/restructuredtext"
-   },
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
    "source": [
-    ".. warning::\n",
-    "\n",
-    "    When working with real data, you almost certainly want to use a proper sampling scheme! The estimate of the inverse of the Hessian, which we interpret as the posterior covariance, is often numerically unstable. The posterior could also be non-Gaussian or even multimodal. Our trick works here, but we only know that because we know what the true answer is."
+    "map_soln = pmx.optimize(model=model)"
    ]
   },
   {
-   "cell_type": "code",
-   "execution_count": null,
+   "cell_type": "markdown",
    "metadata": {},
-   "outputs": [],
    "source": [
-    "def get_samples(varnames, model=None, nsamples=3000):\n",
-    "    # Find the maximum a posteriori (MAP) point\n",
-    "    soln, info = pmx.optimize(model=model, return_info=True)\n",
-    "\n",
-    "    # If it failed, abort\n",
-    "    if info[\"nfev\"] == 1:\n",
-    "        return None\n",
-    "\n",
-    "    # MAP solution and approximate covariance\n",
-    "    map_mu = info[\"x\"]\n",
-    "    map_cov = info[\"hess_inv\"]\n",
-    "\n",
-    "    # Draw samples from the Laplace approximation to the posterior\n",
-    "    u = np.random.multivariate_normal(map_mu, map_cov, size=nsamples)\n",
-    "\n",
-    "    # We need to transform from the internal pymc3 parametrization\n",
-    "    # (where all variables have infinite support)\n",
-    "    # to our own parametrization (where the variables have\n",
-    "    # priors with finite bounds). This is the easiest way\n",
-    "    # of doing this!\n",
-    "    samples = np.zeros_like(u)\n",
-    "    wrapper = pmx.optim.ModelWrapper(model=model)\n",
-    "    for k in tqdm(range(nsamples)):\n",
-    "        point = pmx.optim.get_point(wrapper, u[k])\n",
-    "        for j, name in enumerate(varnames):\n",
-    "            samples[k, j] = point[name]\n",
-    "\n",
-    "    return samples"
+    "Here's what we got:"
    ]
   },
   {
    "cell_type": "code",
    "execution_count": null,
-   "metadata": {},
+   "metadata": {
+    "tags": [
+     "hide_input"
+    ]
+   },
    "outputs": [],
    "source": [
-    "samples = get_samples(varnames, model=model)\n",
-    "corner(samples, labels=varnames, truths=[truths[name] for name in varnames])\n",
-    "plt.show()"
+    "from IPython.display import display, Markdown\n",
+    "from starry_process.defaults import defaults\n",
+    "\n",
+    "display(\n",
+    "    Markdown(\n",
+    "        \"\"\"\n",
+    "| parameter | description | true value | inferred value\n",
+    "| - | :- | :-: | :-:\n",
+    "| `r` | mean radius in degrees | `{r}` | `{{r:.2f}}`\n",
+    "| `mu` | latitude distribution mode in degrees | `{mu}` | `{{mu:.2f}}`\n",
+    "| `sigma` | latitude distribution standard deviation in degrees | `{sigma}` | `{{sigma:.2f}}`\n",
+    "| `c` | fractional spot contrast | `{c}` | `{{c:.4f}}`\n",
+    "| `n` | number of spots | `{n}` | `{{n:.2f}}`\n",
+    "\"\"\".format(\n",
+    "            **truths\n",
+    "        ).format(\n",
+    "            **map_soln\n",
+    "        )\n",
+    "    )\n",
+    ")"
    ]
   },
   {
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "Not bad! We correctly inferred *all* the hyperparameters of the GP! Note, importantly, that our constraint on the number of spots `n` is very poor. It is also very correlated with the spot contrast. Constraining the total number of spots, or the total spot coverage, from single-band photometric measurements is very difficult!"
+    "Not bad! We correctly inferred *all* the hyperparameters of the GP! Note, importantly, that we don't yet have any estimate of the uncertainty on any of these parameters. To get that, we need to actually sample the posterior. Stay tuned!"
    ]
-  }
+  },
+  {
+   "cell_type": "raw",
+   "metadata": {
+    "raw_mimetype": "text/restructuredtext"
+   },
+   "source": [
+    ".. warning::\n",
+    "\n",
+    "    Note that our inferred value for the number of spots is pretty close to the true value. But as we will see, the uncertainty is very large! In general, it's extremely difficult to constrain the total number of spots from stellar light curves."
+   ]
+  }
  ],