Add implementation details docs page, note on propto=True for log_density #192
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -15,4 +15,4 @@ going on "behind the scenes". | |
internals/development | ||
internals/testing | ||
internals/documentation | ||
|
||
internals/details |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,46 @@ | ||
Stan Implementation Details | ||
=========================== | ||
|
||
|
||
.. _log_density_propto: | ||
|
||
Speed of the ``propto`` argument | ||
-------------------------------- | ||
|
||
The log density function provided by a Stan model has | ||
the ability to drop additive constants from the calculation. | ||
This is indicated by the ``propto`` ("``prop``\ortional ``to``") | ||
argument to the function. | |
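As a toy illustration of what "dropping additive constants" means, the sketch below evaluates a normal log density as a function of ``mu`` only, treating ``y`` and ``sigma`` as fixed data. The function ``normal_lpdf`` and its ``propto`` flag are hypothetical stand-ins written for this example; they are not Stan's implementation.

```python
import math


def normal_lpdf(y, mu, sigma, propto=False):
    """Toy Normal(mu, sigma) log density at y, viewed as a function of mu only."""
    # The only term that depends on the parameter mu.
    kernel = -0.5 * ((y - mu) / sigma) ** 2
    if propto:
        # Drop everything that is constant with respect to mu.
        return kernel
    # Full log density: add the terms that do not depend on mu.
    return kernel - math.log(sigma) - 0.5 * math.log(2.0 * math.pi)


y, sigma = 1.3, 2.0
# The gap between the two versions is the same for every value of mu.
diffs = [
    normal_lpdf(y, mu, sigma) - normal_lpdf(y, mu, sigma, propto=True)
    for mu in (-1.0, 0.0, 2.5)
]
```

Because the gap does not change with ``mu``, an algorithm that only compares log densities or uses their gradients gets the same answer from either version.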
|
||
If you are using an application such as an MCMC algorithm which requires | ||
gradients and only needs the log density up to an additive constant, setting | |
``propto=True`` will be at least as fast as setting ``propto=False`` | ||
|
||
and is generally recommended (and the default value). | ||
|
||
However, in the case of the ``log_density`` function (which does not calculate | ||
I think it would be good to restate the "only needs the log density up to a proportion" bit again in this paragraph.
Why? We're recommending setting it to |
||
derivatives), this argument has the potential to *slow down* computation, and we | ||
recommend setting it to ``False`` or timing it for your model of interest before | ||
proceeding. Note that the default value of ``propto`` is ``True`` for consistency | ||
with the versions of the function that do calculate gradients, so extra care is needed. | ||
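One way to follow the "time it for your model" advice is a small harness like the sketch below. It assumes only that you have a callable accepting the parameter vector and a ``propto`` keyword argument; ``time_propto`` and ``toy_log_density`` are hypothetical names introduced for this example, and the toy function stands in for a real model's ``log_density``.

```python
import timeit


def time_propto(log_density, theta, repeats=5, number=200):
    """Return the best observed seconds per call for propto=True and propto=False."""
    results = {}
    for flag in (True, False):
        timings = timeit.repeat(
            lambda: log_density(theta, propto=flag),
            repeat=repeats,
            number=number,
        )
        # The minimum over repeats is the least noisy estimate of the cost.
        results[flag] = min(timings) / number
    return results


def toy_log_density(theta, propto=True):
    """Stand-in for a model's log density (assumption: same keyword name)."""
    kernel = -0.5 * sum(t * t for t in theta)
    # Pretend the constant term costs a little extra work when it is kept.
    return kernel if propto else kernel - 0.5 * len(theta) * 1.8378770664093453


timings = time_propto(toy_log_density, [0.1, -0.4, 2.0], repeats=3, number=100)
```

With a real model, pass its log density function in place of ``toy_log_density`` and compare ``timings[True]`` against ``timings[False]`` before choosing a setting.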
|
||
Why is ``log_density`` different? | ||
_________________________________ | ||
|
||
The implementation of the ``propto`` argument relies on the presence | ||
of autodiff types (``var``\s, in the terminology of Stan's math library) | ||
to determine what is or is not constant with respect to the parameters. | ||
When the argument is ``False``, the log density can be | |
computed using only variables of type ``double``. | |
|
||
The consequence of this is that, if the ``propto`` argument is set to ``True``, | |
the ``log_density`` function will at a minimum need to perform more allocations | |
than if it were set to ``False``. There may be an even higher cost, because functions | |
such as |reduce_sum|_ or Stan's ODE integrators change their behavior when applied | |
to autodiff types and perform additional work that is thrown away when gradients | |
are not needed. In practice, these additional computations can quickly overwhelm any | |
speedup gained by dropping additive constants. | |
|
||
|
||
.. |reduce_sum| replace:: ``reduce_sum`` | ||
.. _reduce_sum: https://mc-stan.org/docs/stan-users-guide/reduce-sum.html | ||
|
I would phrase this as having the ability to drop constants. Then I'd give simple recommendations:
`propto = true`. Setting `propto=true` will be at least as fast.
`double` values to match, use `propto=true`. Setting `propto=true` may be slower or faster, depending on the cost of calculating normalizing constants (`propto=false`) and the cost of autodiff (required to get the right answer if `propto=true`).
I don't think we need to say much more than that.
Why the double back ticks?
I couldn't understand what lines 34/35 were doing.
I'd just give simple recommendations:
The double back ticks are how reStructuredText (which Sphinx uses) wants code formatted. They're equivalent to single backticks in Markdown. Lines 34/35 are also an RST detail to get a link that is also `code formatted`. The result is that "|reduce_sum|" above gets rendered as `reduce_sum`.
I agree phrasing it in terms of a suggestion for each case is clearer. I left the explanation in, but under a sub-heading for the curious.