Fix log_mix signature #832

WardBrian · 2024-11-14T21:26:53Z

Submission Checklist

Builds locally
New functions marked with <<{ since VERSION }>>
Declare copyright holder and open-source license: see below

Summary

Closes #831. I'm happy to take further suggestions on the wording here.

Copyright and Licensing

Please list the copyright holder for the work you are submitting (this will be you or your assignee, such as a university or company):

Simons Foundation

By submitting this pull request, the copyright holder is agreeing to license the submitted work under the following licenses:

Code: BSD 3-clause (https://opensource.org/licenses/BSD-3-Clause)
Documentation: CC BY-ND 4.0 (https://creativecommons.org/licenses/by-nd/4.0/)

bob-carpenter

Just some minor clarifications.

bob-carpenter · 2024-11-14T21:39:10Z

src/functions-reference/real-valued_basic_functions.qmd

+Extension of the two-mixture case above to more densities.
+Return the log mixture of the log densities stored in `lps`,
+where each log density has the mixing proportion given by
+the same index in `thetas`.


thetas is presumably required to be a simplex, so that should be mentioned.

thetas and lps have to be the same length, so that should be mentioned.

The result should be given explicitly in math, e.g., log(SUM_n thetas[n] * exp(lps[n])) or the more efficient but also more opaque log_sum_exp(log(thetas) + lps) (it'd be OK to include both if you think just one would be confusing).
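As a rough numerical illustration of the two forms mentioned above (plain Python, not Stan; the helper names `log_sum_exp`, `log_mix_direct`, and `log_mix_stable` are made up for this sketch and are not the math library's implementation), the following shows that they agree on ordinary inputs and why the `log_sum_exp` form is preferable when the log densities are very negative. It assumes `thetas` is a simplex and has the same length as `lps`.

```python
import math

def log_sum_exp(xs):
    """Numerically stable log(sum(exp(x) for x in xs))."""
    m = max(xs)
    if m == -math.inf:
        return -math.inf
    return m + math.log(sum(math.exp(x - m) for x in xs))

def log_mix_direct(thetas, lps):
    """Textbook form: log(SUM_n thetas[n] * exp(lps[n]))."""
    return math.log(sum(t * math.exp(lp) for t, lp in zip(thetas, lps)))

def log_mix_stable(thetas, lps):
    """Equivalent form: log_sum_exp(log(thetas) + lps)."""
    if len(thetas) != len(lps):
        raise ValueError("thetas and lps must have the same length")
    # Components with zero weight contribute nothing, so skip them
    # rather than evaluating log(0).
    terms = [math.log(t) + lp for t, lp in zip(thetas, lps) if t > 0]
    return log_sum_exp(terms)

thetas = [0.2, 0.5, 0.3]   # assumed to be a simplex
lps = [-1.0, -2.5, -0.7]   # arbitrary log densities for illustration

print(log_mix_direct(thetas, lps))
print(log_mix_stable(thetas, lps))

# With very negative log densities, exp() underflows to 0 in the direct
# form, while the stable form still returns a finite value.
print(log_mix_stable([0.5, 0.5], [-1000.0, -1001.0]))
```

The direct form is easier to read in documentation; the stable form is what one would want to compute in practice.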

Oddly enough, it doesn't seem that the math library is checking that thetas is a simplex, just that it is bounded between 0 and 1. Possibly worthy of a math issue?

Extension of the two-mixture case above to more densities. Return the log mixture of the log densities stored in lps, where each log density has the mixing proportion given by the same index in thetas.

I would prefer this to say something about how this is intended to be looped across each density observation but is generalized to more than two densities. This connects to the warning in the mixture modeling section about why we need to loop.

src/functions-reference/real-valued_basic_functions.qmd

bob-carpenter · 2024-11-14T22:00:14Z

Yes, this should definitely be checked. I created an issue:

stan-dev/math#3125

bob-carpenter · 2024-11-14T22:00:56Z

I guess we shouldn't say it's checking, just that the first argument must be non-negative values that sum to 1.

spinkney · 2024-11-15T12:41:18Z

Thanks @WardBrian for fixing this, I have been annoyed by this for a long time and was constantly looking up what this does.

In fact, I have a similar issue with log_sum_exp: I wanted vector[n] a = log_sum_exp(vector[n] b, vector[n] c) to apply across each element, but it doesn't do this type of elementwise sum. Maybe we could add a log_add_exp, which I've seen in other languages.
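For readers unfamiliar with the idea, here is a minimal Python sketch of what an elementwise `log_add_exp` could look like. The function name follows the comment above; the list-based signature and the length check are assumptions made for illustration and do not describe any existing Stan function. NumPy's `logaddexp` is one example of this operation in another language.

```python
import math

def log_add_exp(b, c):
    """Elementwise log(exp(b[i]) + exp(c[i])) for two equal-length lists."""
    if len(b) != len(c):
        raise ValueError("b and c must have the same length")
    out = []
    for x, y in zip(b, c):
        hi, lo = max(x, y), min(x, y)
        if hi == -math.inf:
            # Both inputs are log(0), so the sum is log(0) as well.
            out.append(-math.inf)
        else:
            # Factor out the larger term so exp() never overflows.
            out.append(hi + math.log1p(math.exp(lo - hi)))
    return out

b = [0.0, math.log(2.0)]
c = [0.0, math.log(3.0)]
print(log_add_exp(b, c))  # elementwise: log(1 + 1) and log(2 + 3)
```

In contrast, a reducing `log_sum_exp` over the same inputs would collapse everything to a single real.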

WardBrian · 2024-11-15T15:05:14Z

Here's how the rendered text now appears:

bob-carpenter

Thanks for the updates.

pmahling · 2024-11-15T19:05:24Z

Hi, is there a place where we can look up the types allowed for this function? This example here allowed theta as a vector (size K) and ps, the component densities, as an array of size N*K, basically using the same weights for every data point. I am curious whether this function allows theta to be an N*K array as well, so we can apply different weights to the densities of each data point. https://github.com/stan-dev/example-models/blob/master/basic_estimators/normal_mixture_k.stan

bob-carpenter · 2024-11-15T19:17:02Z

is there a place where we can look up types allowed for this function?

That's what the doc update is addressing.

This example here allowed theta as a vector (size K), and the ps the densities component as an array size N*K, basically using the same weight.

This is not supported according to the documentation Brian wrote above. It's just log_mix(ArrayLike, ArrayLike), to put it in Python terms, where ArrayLike for Stan means a 1D array of reals, array[] real, a vector vector[], or a row vector row_vector[]. We could eventually vectorize this to deal with an array of ArrayLike things.

pmahling · 2024-11-15T21:31:04Z

Thank you Bob!
Do you mean the normal_mixture_k.stan example is not supported? Or do you mean that "theta" is not allowed to be an N*K array like ps in the normal example?

By "the doc update", do you mean the user guide or the math library? I was hoping for something like R functions' argument specifications, as in any R package, but it is nice to have you available to clarify.

bob-carpenter · 2024-11-15T21:41:22Z

Do you mean the normal_mixture_k.stan example is not supported?

Yes, that's what I meant because that's what the doc said. I just tried that example and it compiled just fine. It also works with an array of different theta values.

@WardBrian---the doc should mention that this can be vectorized in both arguments. It's not clear from the types, variable names, or example as rendered above.

Yes, I meant we're updating the functions reference doc to reflect what's really going on. I just misunderstood the existing signatures.

I'm not sure what you mean by "like R functions".

WardBrian · 2024-11-15T21:48:16Z

the doc should mention that this can be vectorized in both arguments. It's not clear from the types, variable names, or example as rendered above.

I do not believe it can be. It appears the second argument (lps) supports exactly one level of "array-wrapping".

So log_mix(vector, vector) is valid, as is log_mix(vector, array[] vector), but the following are both invalid: log_mix(array[] vector) or log_mix(vector, array[,] vector)

We don't really have any other functions quite like this, so how to describe it exactly might require some new convention

bob-carpenter · 2024-11-15T22:11:28Z

Ah, I forgot to change my variable in my program. So these both compile:

log_mix(vector, vector);
log_mix(vector, array[] vector);

But these two don't:

log_mix(array[] vector, vector);
log_mix(array[] vector, array[] vector);

I think we should just describe it right in the doc for this function rather than trying to come up with an indirect convention.
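To make the accepted and rejected shapes concrete, here is a small Python model of the behavior described in this thread. It is a sketch under a stated assumption: that when the second argument is an array of vectors, the result is the sum of the per-element log mixtures, all sharing the same `thetas`. The thread does not confirm that detail, so it is worth checking against the rendered documentation. The helper `_log_mix_1d` is hypothetical.

```python
import math

def _log_mix_1d(thetas, lps):
    """log(sum_k thetas[k] * exp(lps[k])), computed stably."""
    if len(thetas) != len(lps):
        raise ValueError("thetas and lps must have the same length")
    terms = [math.log(t) + lp for t, lp in zip(thetas, lps) if t > 0]
    m = max(terms)
    return m + math.log(sum(math.exp(x - m) for x in terms))

def log_mix(thetas, lps):
    """Model of log_mix(vector, vector) and log_mix(vector, array[] vector)."""
    if thetas and isinstance(thetas[0], (list, tuple)):
        # Mirrors the rejected signatures: the first argument cannot be
        # an array of vectors.
        raise TypeError("thetas must be one-dimensional")
    if lps and isinstance(lps[0], (list, tuple)):
        # Assumption: one level of array wrapping sums the log mixtures.
        return sum(_log_mix_1d(thetas, row) for row in lps)
    return _log_mix_1d(thetas, lps)

theta = [0.3, 0.7]
lp = [[-1.0, -2.0], [-0.5, -3.0], [-2.2, -0.1]]  # N = 3 rows, K = 2 components

looped = sum(log_mix(theta, row) for row in lp)
vectorized = log_mix(theta, lp)
print(looped, vectorized)
```

Under that assumption, the looped and array forms give the same value, and different weights per data point would still require an explicit loop.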

pmahling · 2024-12-04T19:15:07Z

If I use the following:

log_mix(lambda, poisson_lpmf(y | pe), poisson_lpmf(y | pc))
where lambda, y, pe, and pc are all vectors of length n,

does this give the desired per-observation mixture {lambda[1]*f(y[1]|pe[1]) + (1 - lambda[1])*f(y[1]|pc[1])} ... {lambda[n]*f(y[n]|pe[n]) + (1 - lambda[n])*f(y[n]|pc[n])}?

bob-carpenter · 2024-12-04T19:33:59Z

log_mix(lambda, A, B) = log_sum_exp(log(lambda) + A, log1m(lambda) + B).

For Stan, poisson_lpmf(y | pe) = SUM_n poisson_lpmf(y[n] | pe[n]).
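The consequence of those two identities can be checked numerically. The Python sketch below uses made-up data and, for simplicity, a scalar `lam` rather than a vector (an assumption of this example, not something stated above). It compares mixing the two summed log likelihoods against summing per-observation mixtures; `poisson_lpmf` and `log_mix2` are illustrative stand-ins, not Stan's implementations.

```python
import math

def poisson_lpmf(y, rate):
    """Log Poisson pmf for a single observation."""
    return y * math.log(rate) - rate - math.lgamma(y + 1)

def log_mix2(lam, a, b):
    """log_sum_exp(log(lam) + a, log1m(lam) + b)."""
    x = math.log(lam) + a
    z = math.log1p(-lam) + b
    hi = max(x, z)
    return hi + math.log(math.exp(x - hi) + math.exp(z - hi))

# Made-up data for illustration only.
y = [0, 3, 7]
pe = [0.5, 2.0, 6.0]
pc = [4.0, 4.0, 4.0]
lam = 0.3

# Vectorized lpmf calls sum over observations before mixing.
A = sum(poisson_lpmf(yi, ei) for yi, ei in zip(y, pe))
B = sum(poisson_lpmf(yi, ci) for yi, ci in zip(y, pc))
whole_sample = log_mix2(lam, A, B)

# Mixing inside the loop treats each observation as its own mixture.
per_observation = sum(
    log_mix2(lam, poisson_lpmf(yi, ei), poisson_lpmf(yi, ci))
    for yi, ei, ci in zip(y, pe, pc)
)

print(whole_sample, per_observation)
```

The two quantities differ: the first says all observations come from the same component, while the second lets each observation come from either component, which is the per-observation form written out in the question.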

pmahling · 2024-12-04T19:44:27Z

Thank you Bob!
for (i in 1:n) {
  lp[i][1] = poisson_lpmf(y[i] | pe[i]);
  lp[i][2] = poisson_lpmf(y[i] | pc[i]);
  target += log_mix(lambda[i], lp[i]);
}
Can the following be used instead of the loop?
target += log_mix(lambda, lp);

bob-carpenter · 2024-12-04T20:05:11Z

@pmahling: We have a very active forum which is a better place to ask Stan questions than closed pull requests: https://discourse.mc-stan.org

I think those would be equivalent, but I'm not 100% sure, so I'd try both to make sure they yield the same answer.

Note: you can use lp[i, 1] instead of lp[i][1]---they should compile to the same C++ code.

WardBrian added 2 commits November 14, 2024 16:23

Fix log_mix signature

2a9da9e

Example

48d6b88

WardBrian requested review from mitzimorris and spinkney November 14, 2024 21:26

bob-carpenter requested changes Nov 14, 2024


WardBrian mentioned this pull request Nov 15, 2024

A log_add_exp function that returns containers stan-dev/math#3127

Open

Update prose for log_mix

dbf2884

WardBrian requested a review from bob-carpenter November 15, 2024 15:05

bob-carpenter approved these changes Nov 15, 2024


bob-carpenter merged commit a5a4915 into master Nov 15, 2024

WardBrian deleted the fix-log-mix branch November 15, 2024 15:10

WardBrian mentioned this pull request Nov 18, 2024

Mention second argument of log_mix can be array #834

Merged

3 tasks
