Document binomial-logit GLM #678

andrjohns · 2023-10-06T05:45:25Z

Submission Checklist

Builds locally
New functions marked with `r since("VERSION")`
Declare copyright holder and open-source license: see below

Summary

This PR adds documentation for the `binomial_logit_glm` distribution, which was added in a separate PR. The implementation and likelihood/gradients are (unsurprisingly) very similar to those of the `bernoulli_logit_glm` distribution, so I've based the documentation on that entry.

Let me know if I've missed anything!

Copyright and Licensing

Please list the copyright holder for the work you are submitting (this will be you or your assignee, such as a university or company): Andrew Johnson

By submitting this pull request, the copyright holder is agreeing to license the submitted work under the following licenses:

Code: BSD 3-clause (https://opensource.org/licenses/BSD-3-Clause)
Documentation: CC BY-ND 4.0 (https://creativecommons.org/licenses/by-nd/4.0/)

bob-carpenter

This one needs a fair amount of work to get it up to the standard of our other doc in terms of naming arguments. There are also some inconsistencies in notation to iron out (little vs. big N in sizing, for example).

bob-carpenter · 2024-01-02T22:19:01Z

src/functions-reference/bounded_discrete_distributions.Rmd

+
+### Probability mass function
+
+Suppose $N \in \mathbb{N}$, $x\in \mathbb{R}^{n\cdot m}, \alpha \in \mathbb{R}^n, \beta \in \mathbb{R}^m$, and $n \in


I would start by saying that M is the number of predictors and N is the number of data items.

Then we need the notation to match with caps and to use \times not \cdot

x is in R^(M x N) (with \times, not \cdot)

\alpha in \mathbb{R}

\beta \in \mathbb{R}^M

andrjohns · 2024-01-11

> Then we need the notation to match with caps and to use \times not \cdot

That's also inconsistent with the other _glm docs:

normal_id_glm

bernoulli_logit_glm

bob-carpenter · 2024-01-02T22:20:15Z

src/functions-reference/bounded_discrete_distributions.Rmd

+### Probability mass function
+
+Suppose $N \in \mathbb{N}$, $x\in \mathbb{R}^{n\cdot m}, \alpha \in \mathbb{R}^n, \beta \in \mathbb{R}^m$, and $n \in
+\{0,\ldots,N\}$.  Then \begin{align*}


Start the `\begin{align*}` on its own line and line up the text under it so that it's readable. It looks like this accidentally got collapsed into a paragraph.

In the Stan doc, if we have an M x N matrix, we've conventionally used [m, n] for indexing, not [i, j].

andrjohns · 2024-01-11

> In the Stan doc, if we have an M x N matrix, we've conventionally used [m, n] for indexing, not [i, j].

That's not the case for the _glm docs.

bob-carpenter · 2024-01-12

Then we should change the _glm docs to be consistent with the rest of the User's Guide. I never wrote down hard-and-fast style rules, and things like the symbol for expectation start drifting under multiple authors. I mean to go and do a consistency fix of the entire doc set soon.
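As a layout sketch of the formatting request in this thread, the fragment below puts `\begin{align*}` on its own line with the equation aligned beneath it. The "Suppose" sentence is copied from the diff hunk above with its notation left unchanged. The equation body is an assumption written for illustration (a binomial with a logit-linked linear predictor, applied row by row), not the text that was merged.

```latex
Suppose $N \in \mathbb{N}$, $x\in \mathbb{R}^{n\cdot m}, \alpha \in \mathbb{R}^n, \beta \in \mathbb{R}^m$, and $n \in
\{0,\ldots,N\}$.  Then
% Assumed equation body, for layout illustration only (not taken from the PR).
\begin{align*}
  \text{BinomialLogitGLM}(n \mid N, x, \alpha, \beta)
    &= \text{Binomial}\left(n \mid N, \text{logit}^{-1}(\alpha + x \beta)\right).
\end{align*}
```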

bob-carpenter · 2024-01-02T22:21:47Z

src/functions-reference/bounded_discrete_distributions.Rmd

+<!-- real; binomial_logit_glm_lpmf; (int n | int N, matrix x, real alpha, vector beta); -->
+\index{{\tt \bfseries binomial\_logit\_glm\_lpmf  }!{\tt (int n \textbar\ int N, matrix x, real alpha, vector beta): real}|hyperpage}
+
+`real` **`binomial_logit_glm_lpmf`**`(int n | int N, matrix x, real alpha, vector beta)`<br>\newline


Like our other function doc, this should name the arguments. alpha is an intercept, beta is a vector of slopes, x is the data matrix, N is the total count, and n is the count of successes.

Now if x is a matrix, don't n and N have to be 1D arrays?

andrjohns · 2024-01-11

> Like our other function doc, this should name the arguments. alpha is an intercept, beta is a vector of slopes, x is the data matrix, N is the total count, and n is the count of successes.

That would be inconsistent with the current doc for other _glm functions:

normal_id_glm

bernoulli_logit_glm

> Now if x is a matrix, don't n and N have to be 1D arrays?

No, they would be broadcast to match. This is the same as in the bernoulli_logit_glm doc.

bob-carpenter · 2024-01-12

Then normal_id_glm and bernoulli_logit_glm should be fixed to match the rest of our doc. Given that we've thrown out consistency, there's no need for any new GLM to be consistent with the other GLM doc. I'd rather it be consistent with the rest of our doc, as that's where it's going to go in the end.

bob-carpenter · 2024-01-12

I should've added that this doesn't need to be done as part of this PR. If you leave the new _glm like the other GLM code, I can just fix it in a pass to make everything consistent again.
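The broadcasting described in this thread can be sketched in plain Python. This is an illustrative reimplementation under stated assumptions, not Stan's code: the function names ending in `_sketch`, the `broadcast` helper, and the list-of-rows representation of `x` are all hypothetical. It only shows that scalar `n`, `N`, and `alpha` repeated across the rows of `x` give the same log probability as passing equal-valued arrays.

```python
import math


def log_inv_logit(u):
    """Numerically stable log(1 / (1 + exp(-u)))."""
    if u >= 0:
        return -math.log1p(math.exp(-u))
    return u - math.log1p(math.exp(u))


def binomial_logit_lpmf_sketch(n, N, eta):
    """Log Binomial(n | N, inv_logit(eta)) for a single observation."""
    log_choose = math.lgamma(N + 1) - math.lgamma(n + 1) - math.lgamma(N - n + 1)
    # log(1 - inv_logit(eta)) equals log_inv_logit(-eta)
    return log_choose + n * log_inv_logit(eta) + (N - n) * log_inv_logit(-eta)


def broadcast(value, rows):
    """Hypothetical helper: repeat a scalar once per row, or check a list's length."""
    if isinstance(value, (list, tuple)):
        if len(value) != rows:
            raise ValueError("length must match the number of rows of x")
        return list(value)
    return [value] * rows


def binomial_logit_glm_lpmf_sketch(n, N, x, alpha, beta):
    """Sum of per-row binomial log probabilities, with scalar arguments broadcast."""
    rows = len(x)
    ns, Ns, alphas = broadcast(n, rows), broadcast(N, rows), broadcast(alpha, rows)
    total = 0.0
    for n_r, N_r, a_r, x_r in zip(ns, Ns, alphas, x):
        # Linear predictor for this row: intercept plus dot product with the slopes.
        eta = a_r + sum(x_rc * b_c for x_rc, b_c in zip(x_r, beta))
        total += binomial_logit_lpmf_sketch(n_r, N_r, eta)
    return total


x = [[0.5, -1.0], [1.5, 0.25], [0.0, 2.0]]  # 3 data items, 2 predictors
beta = [0.3, -0.2]

# Scalars are broadcast across all three rows of x ...
scalar_call = binomial_logit_glm_lpmf_sketch(2, 5, x, 0.1, beta)
# ... which matches passing the same values as explicit arrays.
array_call = binomial_logit_glm_lpmf_sketch([2, 2, 2], [5, 5, 5], x, [0.1] * 3, beta)
```

Under this reading, a scalar `n` with a multi-row `x` means "the same success count for every row", which is the interpretation the review asks to have spelled out in the doc.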

bob-carpenter · 2024-01-02T22:22:37Z

src/functions-reference/bounded_discrete_distributions.Rmd

+\index{{\tt \bfseries binomial\_logit\_glm\_lpmf  }!{\tt (int n \textbar\ int N, matrix x, vector alpha, vector beta): real}|hyperpage}
+
+`real` **`binomial_logit_glm_lpmf`**`(int n | int N, matrix x, vector alpha, vector beta)`<br>\newline
+The log Binomial probability mass of n given N trials and chance of success


Don't need to say "mass" here; it's just a probability. Or it's a "log probability mass function" if you want to spell it all out.

andrjohns · 2024-01-12

"probability mass" is used throughout the other function docs:

binomial_lpmf

bernoulli_lpmf

ordered_logistic_lpmf

bob-carpenter · 2024-01-12

I'd argue that "probability mass" is not idiomatic in this context because we just call the resulting quantity a "probability", not a "probability mass". It's a "probability mass function", but it returns a probability, not a probability mass. So this is another case where the binomial, Bernoulli, etc. need to be fixed. No worries if you can't get to it in this PR.
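The terminology point can be checked numerically: exponentiating a log probability mass function at an outcome yields a probability, so the values lie in [0, 1] and sum to 1 over the support. The sketch below is a plain Python reimplementation for illustration only. The name `binomial_lpmf` mirrors the Stan function mentioned above but this is not Stan's implementation, and the values `N = 10` and `theta = 0.3` are arbitrary.

```python
import math


def binomial_lpmf(n, N, theta):
    """Log of Binomial(n | N, theta), valid for 0 < theta < 1."""
    log_choose = math.lgamma(N + 1) - math.lgamma(n + 1) - math.lgamma(N - n + 1)
    return log_choose + n * math.log(theta) + (N - n) * math.log1p(-theta)


N, theta = 10, 0.3  # arbitrary illustrative values

# Exponentiating the lpmf gives the probability of each outcome n in 0..N.
probs = [math.exp(binomial_lpmf(n, N, theta)) for n in range(N + 1)]
total = sum(probs)
```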

bob-carpenter · 2024-01-02T22:23:29Z

src/functions-reference/bounded_discrete_distributions.Rmd

+<!-- real; binomial_logit_glm_lupmf; (int n | int N, matrix x, vector alpha, vector beta); -->
+\index{{\tt \bfseries binomial\_logit\_glm\_lupmf  }!{\tt (int n \textbar\ int N, matrix x, vector alpha, vector beta): real}|hyperpage}
+
+`real` **`binomial_logit_glm_lupmf`**`(int n | int N, matrix x, vector alpha, vector beta)`<br>\newline


It doesn't make sense for x to be a matrix and n and N to be scalars. Is the interpretation here that you are broadcasting n and N across all of the rows of x?

andrjohns · 2024-01-11

That's right, this is consistent with the signatures for bernoulli_logit_glm and normal_id_glm.

bob-carpenter · 2024-01-12

Thanks. I think it'd help to clarify this in the doc. For instance, line 315 says "The log normal probability density of y given location alpha + x * beta and scale sigma." but in this case y is a vector and alpha + x * beta is a vector, so calling a vector a location seems to violate agreement (plural/singular).

WardBrian · 2024-01-11T14:37:49Z

It would be nice if an updated version of this was merged before the release next week. @andrjohns do you have the time?

WardBrian · 2024-01-16T15:50:27Z

@bob-carpenter are there changes you still think are required for this PR, or are they all things which can/should be done in follow-ons?

WardBrian · 2024-01-16T17:50:28Z

I confirmed with @bob-carpenter that the remaining issues here can be a follow-on. I've opened #705 for them.

andrjohns added 2 commits October 6, 2023 08:39

Document new binomial_logit_glm distribution

b0b088d

Missed versioning

9f7e9a5

andrjohns mentioned this pull request Oct 6, 2023

Add signatures for binomial_logit_glm dist stan-dev/stanc3#1367

Merged

3 tasks

bob-carpenter requested changes Jan 2, 2024

Capitalisation and eqn formatting

ba7cdc2

This was referenced Jan 16, 2024

Add section documenting tuple unpacking #702

Merged

Bring GLM function doc in line with other functions #705

Open

WardBrian merged commit d339525 into master Jan 16, 2024

WardBrian deleted the binomial_logit_glm branch January 16, 2024 17:50

