Merge branch 'master' into avehtari-patch-1

stan-dev · Nov 18, 2024 · 4e8ca56
2 parents 6a2e556 + fe5f556
commit 4e8ca56

Showing 9 changed files with 247 additions and 116 deletions.
diff --git a/src/bibtex/all.bib b/src/bibtex/all.bib
@@ -1867,4 +1867,30 @@ @article{Magnusson+etal:2024:posteriordb
   author={Magnusson, M{\aa}ns and Torgander, Jakob and B{\"u}rkner, Paul-Christian and Zhang, Lu and Carpenter, Bob and Vehtari, Aki},
   journal={arXiv preprint arXiv:2407.04967},
   year={2024}
+
+@article{egozcue+etal:2003,
+  title={Isometric logratio transformations for compositional data analysis},
+  author={Egozcue, Juan Jos{\'e} and Pawlowsky-Glahn, Vera and Mateu-Figueras, Gl{\`o}ria and Barcel{\'o}-Vidal, Carles},
+  journal={Mathematical Geology},
+  volume={35},
+  number={3},
+  pages={279--300},
+  year={2003}
+}
+
+@incollection{filzmoser+etal:2018,
+  title={Geometrical properties of compositional data},
+  author={Filzmoser, Peter and Hron, Karel and Templ, Matthias},
+  booktitle={Applied Compositional Data Analysis: With Worked Examples in R},
+  pages={35--68},
+  year={2018},
+  publisher={Springer}
+}
+
+@misc{seyboldt:2024,
+  author="Seyboldt, Adrian",
+  title="Add ZeroSumNormal distribution",
+  note="Comment on pull request \#1751, pyro-ppl/numpyro GitHub repository",
+  year = "2024",
+  url ="https://github.com/pyro-ppl/numpyro/pull/1751#issuecomment-1980569811"
 }
diff --git a/src/functions-reference/real-valued_basic_functions.qmd b/src/functions-reference/real-valued_basic_functions.qmd
@@ -783,8 +783,10 @@ calculations, but the result is likely to be reduced acceptance
 probabilities and less efficient sampling.
 
 The rounding functions cannot be used as indices to arrays because
-they return real values.  Stan may introduce integer-valued versions
-of these in the future, but as of now, there is no good workaround.
+they return real values. For operations over `data` or in the
+`generated quantities` block, the
+[`to_int()` function](integer-valued_basic_functions.qmd#casting-functions)
+can be used.
 
 <!-- R; floor; (T x); -->
 \index{{\tt \bfseries floor }!{\tt (T x): R}|hyperpage}
@@ -1636,7 +1638,8 @@ proportion theta, defined by \begin{eqnarray*}
 Calculates the log mixture density given `thetas`,
 mixing proportions which should be between 0 and 1 and sum to 1,
 and `lps`, log densities.
-These two containers must have the same length.
+The `lps` variable must be either a 1-d container of the same
+length as `thetas`, or an array of such.
 
 \begin{eqnarray*}
 \mathrm{log\_mix}(\theta, \lambda)

diff --git a/src/reference-manual/expressions.qmd b/src/reference-manual/expressions.qmd
@@ -149,9 +149,9 @@ any of the following.
 
 ```
 int, real, complex, vector, simplex, unit_vector,
-ordered, positive_ordered, row_vector, matrix,
-cholesky_factor_corr, cholesky_factor_cov,
-corr_matrix, cov_matrix, array
+sum_to_zero_vector, ordered, positive_ordered,
+row_vector, matrix, cholesky_factor_corr,
+cholesky_factor_cov, corr_matrix, cov_matrix, array
 ```
 
 The following built in functions are also reserved and
@@ -810,9 +810,9 @@ In addition to single integer indexes, as described in
 [the language indexing section](#language-indexing.section), Stan supports multiple indexing.
 Multiple indexes can be integer arrays of indexes, lower
 bounds, upper bounds, lower and upper bounds, or simply shorthand for
-all of the indexes.  If the upper bound is smaller than the lower bound, 
-the range is empty (unlike, e.g., in R). The upper bound and lower bound can be 
-expressions that evaluate to integer. A complete list of index types is 
+all of the indexes.  If the upper bound is smaller than the lower bound,
+the range is empty (unlike, e.g., in R). The upper bound and lower bound can be
+expressions that evaluate to integer. A complete list of index types is
 given in the following table.
 
 ##### Indexing Options Table {- #index-types-table}
@@ -1078,6 +1078,7 @@ the following table shows the mapping from types to their primitive types.
    | `vector`               | `vector`             |
    | `simplex`              | `vector`             |
    | `unit_vector`          | `vector`             |
+   | `sum_to_zero_vector`   | `vector`             |
    | `ordered`              | `vector`             |
    | `positive_ordered`     | `vector`             |
    | `row_vector`           | `row_vector`         |
@@ -1378,7 +1379,7 @@ model {
 }
 ```
 
-Algebraically, 
+Algebraically,
 [the distribution statement](statements.qmd#distribution-statements.section)
 in the model could be reduced to
 

diff --git a/src/reference-manual/grammar.txt b/src/reference-manual/grammar.txt
@@ -24,6 +24,7 @@
 
 <identifier> ::= IDENTIFIER
                | TRUNCATE
+               | JACOBIAN
 
 <decl_identifier> ::= <identifier>
                     | <reserved_word>
@@ -57,10 +58,13 @@
                   | POSITIVEORDERED
                   | SIMPLEX
                   | UNITVECTOR
+                  | SUMTOZERO
                   | CHOLESKYFACTORCORR
                   | CHOLESKYFACTORCOV
                   | CORRMATRIX
                   | COVMATRIX
+                  | STOCHASTICCOLUMNMATRIX
+                  | STOCHASTICROWMATRIX
                   | PRINT
                   | REJECT
                   | FATAL_ERROR
@@ -165,11 +169,16 @@
                  | POSITIVEORDERED LBRACK <expression> RBRACK
                  | SIMPLEX LBRACK <expression> RBRACK
                  | UNITVECTOR LBRACK <expression> RBRACK
+                 | SUMTOZERO LBRACK <expression> RBRACK
                  | CHOLESKYFACTORCORR LBRACK <expression> RBRACK
                  | CHOLESKYFACTORCOV LBRACK <expression> [COMMA <expression>]
                    RBRACK
                  | CORRMATRIX LBRACK <expression> RBRACK
                  | COVMATRIX LBRACK <expression> RBRACK
+                 | STOCHASTICCOLUMNMATRIX LBRACK <expression> COMMA
+                   <expression> RBRACK
+                 | STOCHASTICROWMATRIX LBRACK <expression> COMMA <expression>
+                   RBRACK
 
 <type_constraint> ::= [LABRACK <range> RABRACK]
                     | LABRACK <offset_mult> RABRACK

diff --git a/src/reference-manual/syntax.qmd b/src/reference-manual/syntax.qmd
@@ -113,6 +113,7 @@ The raw output is available [here](https://raw.githubusercontent.com/stan-dev/do
 ```
 <identifier> ::= IDENTIFIER
                | TRUNCATE
+               | JACOBIAN
 
 <decl_identifier> ::= <identifier>
 
@@ -175,11 +176,16 @@ The raw output is available [here](https://raw.githubusercontent.com/stan-dev/do
                  | POSITIVEORDERED LBRACK <expression> RBRACK
                  | SIMPLEX LBRACK <expression> RBRACK
                  | UNITVECTOR LBRACK <expression> RBRACK
+                 | SUMTOZERO LBRACK <expression> RBRACK
                  | CHOLESKYFACTORCORR LBRACK <expression> RBRACK
                  | CHOLESKYFACTORCOV LBRACK <expression> [COMMA <expression>]
                    RBRACK
                  | CORRMATRIX LBRACK <expression> RBRACK
                  | COVMATRIX LBRACK <expression> RBRACK
+                 | STOCHASTICCOLUMNMATRIX LBRACK <expression> COMMA
+                   <expression> RBRACK
+                 | STOCHASTICROWMATRIX LBRACK <expression> COMMA <expression>
+                   RBRACK
 
 <type_constraint> ::= [LABRACK <range> RABRACK]
                     | LABRACK <offset_mult> RABRACK

diff --git a/src/reference-manual/transforms.qmd b/src/reference-manual/transforms.qmd
@@ -15,7 +15,7 @@ constrained to be ordered, positive ordered, or simplexes.  Matrices
 may be constrained to be correlation matrices or covariance matrices.
 This chapter provides a definition of the transforms used for each
 type of variable.   For examples of how to declare and define these
-variables in a Stan program, see section 
+variables in a Stan program, see section
 [Variable declaration](types.qmd#variable-declaration.section).
 
 Stan converts models to C++ classes which define probability
@@ -39,7 +39,7 @@ the exact arithmetic:
     rounded to the boundary. This may cause unexpected warnings or
     errors, if in other parts of the code the boundary value is
     invalid. For example, we may observe floating-point value 0 for
-    a variance parameter that has been declared to be larger than 0. 
+    a variance parameter that has been declared to be larger than 0.
 	See more about [Floating point Arithmetic in Stan user's guide](../stan-users-guide/floating-point.qmd)).
   - CmdStan stores the output to CSV files with 6 significant digits
     accuracy by default, but the constraints are checked with 8
@@ -462,6 +462,89 @@ p_Y(y)
 $$
 
 
+
+## Zero sum vector
+
+Vectors that are constrained to sum to zero are useful for, among
+other things, additive varying effects, such as varying slopes or
+intercepts in a regression model (e.g., for income deciles).
+
+A zero sum $K$-vector $x \in \mathbb{R}^K$ satisfies the constraint
+$$
+\sum_{k=1}^K x_k = 0.
+$$
+
+For the transform, Stan uses the first part of an isometric log ratio
+transform; see [@egozcue+etal:2003] for the basic definitions and
+Chapter 3 of [@filzmoser+etal:2018] for the pivot coordinate version
+used here.  Stan uses the isometric log ratio transform because it
+induces a geometry with zero correlation among the dimensions, making
+it easier for HMC to explore than simpler alternatives such as setting
+the final element to the negative sum of the first elements; see, e.g.,
+[@seyboldt:2024].
+
+
+
+
+### Zero sum transform {-}
+
+The (unconstraining) transform is defined iteratively.  Given an $x \in
+\mathbb{R}^{N + 1}$ that sums to zero (i.e., $\sum_{n=1}^{N+1} x_n =
+0$), the transform proceeds as follows to produce an unconstrained $y
+\in \mathbb{R}^N$.
+
+The transform is initialized by setting
+$$
+S_N = 0
+$$
+and
+$$
+y_N = -x_{N + 1} \cdot \frac{\sqrt{N \cdot (N + 1)}}{N}.
+$$
+Then, for each $n$ from $N - 1$ down to $1$, let
+$$
+w_{n + 1} = \frac{y_{n + 1}}{\sqrt{(n + 1) \cdot (n + 2)}},
+$$
+$$
+S_n = S_{n + 1} + w_{n + 1},
+$$
+and
+$$
+y_n = (S_n - x_{n + 1}) \cdot \frac{\sqrt{n \cdot (n + 1)}}{n}.
+$$
+
+### Zero sum inverse transform {-}
+
+The inverse (constraining) transform follows the isometric logratio transform.
+It maps an unconstrained vector $y \in \mathbb{R}^N$ to a zero-sum vector $x \in
+\mathbb{R}^{N + 1}$ such that
+$$
+\sum_{n=1}^{N + 1} x_n = 0.
+$$
+The values are defined inductively, starting with 
+$$
+x_1 = \sum_{n=1}^N \frac{y_n}{\sqrt{n \cdot (n + 1)}}
+$$
+and then setting
+$$
+x_{n + 1} = \sum_{i = n + 1}^N \frac{y_i}{\sqrt{i \cdot (i + 1)}}
+- n \cdot \frac{y_n}{\sqrt{n \cdot (n + 1)}}
+$$
+for $n \in 1{:}N$.
+
+The definition is such that
+$$
+\sum_{n = 1}^{N + 1} x_n = 0
+$$
+by construction, because each term $y_n / \sqrt{n \cdot (n + 1)}$ is added to
+each of $x_1, \ldots, x_n$ and then subtracted $n$ times in $x_{n + 1}$.
+
+### Absolute Jacobian determinant of the zero sum inverse transform {-}
+
+The inverse transform is a linear operation, leading to a constant Jacobian
+determinant which is therefore not included.
+
+
 ## Unit simplex {#simplex-transform.section}
 
 Variables constrained to the unit simplex show up in multivariate

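The zero sum transform and its inverse added in `transforms.qmd` above can be checked numerically. The following is a minimal Python sketch written directly from those formulas; it is not Stan's C++ implementation, and the function names `zero_sum_constrain` and `zero_sum_unconstrain` are hypothetical, chosen only for this illustration.

```python
import math


def zero_sum_constrain(y):
    """Inverse (constraining) transform: y in R^N -> x in R^(N+1), sum(x) == 0.

    Hypothetical helper; follows the formulas in the doc text, not Stan's source.
    """
    N = len(y)
    # w[n - 1] holds y_n / sqrt(n * (n + 1)) for n = 1..N
    w = [y[n - 1] / math.sqrt(n * (n + 1)) for n in range(1, N + 1)]
    x = [0.0] * (N + 1)
    x[0] = sum(w)  # x_1
    for n in range(1, N + 1):
        # x_{n+1} = sum_{i=n+1}^N w_i - n * w_n
        x[n] = sum(w[n:]) - n * w[n - 1]
    return x


def zero_sum_unconstrain(x):
    """(Unconstraining) transform: zero-sum x in R^(N+1) -> y in R^N."""
    N = len(x) - 1
    y = [0.0] * N
    if N == 0:
        return y
    S = 0.0  # S_N = 0
    y[N - 1] = -x[N] * math.sqrt(N * (N + 1)) / N  # y_N
    for n in range(N - 1, 0, -1):
        # S_n = S_{n+1} + y_{n+1} / sqrt((n + 1) * (n + 2))
        S += y[n] / math.sqrt((n + 1) * (n + 2))
        # y_n = (S_n - x_{n+1}) * sqrt(n * (n + 1)) / n
        y[n - 1] = (S - x[n]) * math.sqrt(n * (n + 1)) / n
    return y


y = [0.3, -1.2, 2.0]
x = zero_sum_constrain(y)
y_back = zero_sum_unconstrain(x)
```

Under these assumptions, `x` has one more element than `y`, sums to zero up to floating-point error, and `y_back` recovers `y`. The squared norms of `x` and `y` also agree, which is consistent with the map being an isometry and with the constant Jacobian determinant noted in the text.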
diff --git a/src/reference-manual/types.qmd b/src/reference-manual/types.qmd
@@ -117,9 +117,10 @@ vector<lower=-1, upper=1>[3] rho;
 ```
 
 There are also special data types for structured vectors and
-matrices. There are four constrained vector data types, `simplex`
+matrices. There are five constrained vector data types, `simplex`
 for unit simplexes, `unit_vector` for unit-length vectors,
-`ordered` for ordered vectors of scalars and
+`sum_to_zero_vector` for vectors that sum to zero,
+`ordered` for ordered vectors of scalars, and
 `positive_ordered` for vectors of positive ordered
 scalars. There are specialized matrix data types `corr_matrix`
 and `cov_matrix` for correlation matrices (symmetric, positive
@@ -692,6 +693,26 @@ unit vectors, this is only done up to a statically specified accuracy
 threshold $\epsilon$ to account for errors arising from floating-point
 imprecision.
 
+### Vectors that sum to zero {-}
+
+A zero-sum vector is constrained such that the
+sum of its elements is always $0$. These are sometimes useful
+for resolving identifiability issues in regression models.
+While the underlying vector has only $N - 1$ degrees of freedom,
+zero sum vectors are declared with their full dimensionality.
+For instance, `beta` is declared to be a zero-sum $5$-vector (4 DoF) by
+
+```stan
+sum_to_zero_vector[5] beta;
+```
+
+Zero sum vectors are implemented as vectors and may be assigned to other
+vectors and vice versa.  Zero sum vector variables, like other constrained
+variables, are validated to ensure that they do indeed sum to zero; for
+zero sum vectors, this is only done up to a statically specified accuracy
+threshold $\epsilon$ to account for errors arising from floating-point
+imprecision.
+
 ### Ordered vectors {-}
 
 An ordered vector type in Stan represents a vector whose entries are
@@ -1636,6 +1657,8 @@ dimensions `matrix[K, K]` types.
 +---------------------+-------------------------+----------------------------------------------+
 |                     |                         | `unit_vector[N]`                             |
 +---------------------+-------------------------+----------------------------------------------+
+|                     |                         | `sum_to_zero_vector[N]`                      |
++---------------------+-------------------------+----------------------------------------------+
 | `row_vector`        | `row_vector[N]`         | `row_vector[N]`                              |
 +---------------------+-------------------------+----------------------------------------------+
 |                     |                         | `row_vector[N]<lower=L>`                     |
@@ -1700,6 +1723,8 @@ dimensions `matrix[K, K]` types.
 +---------------------+-------------------------+----------------------------------------------+
 |                     |                         | `array[M] unit_vector[N]`                    |
 +---------------------+-------------------------+----------------------------------------------+
+|                     |                         | `array[M] sum_to_zero_vector[N]`             |
++---------------------+-------------------------+----------------------------------------------+
 
 <a name="id:constrained-types.figure"></a>