
Generalize operator-like functions #1628

Merged (34 commits) Feb 14, 2020

Conversation

@t4c1 (Contributor) commented Jan 20, 2020

Summary

Generalizes the signatures of functions that act somewhat like operators (add, subtract, multiply, divide, elt_multiply, elt_divide). That is, they can now accept and return general Eigen expressions.
EDIT: I also had to generalize functions that are called directly with the return value of one of the generalized functions. These were: trace, trace_gen_quad_form, transpose and matrix_exp.

In many cases I replaced function specializations with overloads. This results in more requires, but it also allowed me to replace multiple specializations with one general overload, reducing the total amount of code.

There are also some changes to the autodiff testing framework that allow it to work with Eigen expressions. I made some of these changes in the earlier PR #1471, but I missed some places.

Tests

This is mostly a refactor, so there are no new tests.

Side Effects

None.

Checklist

  • Math issue Generalize matrix function signatures #1470

  • Copyright holder: Tadej Ciglarič

    The copyright holder is typically you or your assignee, such as a university or company. By submitting this pull request, the copyright holder is agreeing to license the submitted work under the following licenses:
    - Code: BSD 3-clause (https://opensource.org/licenses/BSD-3-Clause)
    - Documentation: CC-BY 4.0 (https://creativecommons.org/licenses/by/4.0/)

  • the basic tests are passing

    • unit tests pass (to run, use: ./runTests.py test/unit)
    • header checks pass, (make test-headers)
    • docs build, (make doxygen)
    • code passes the built in C++ standards checks (make cpplint)
  • the code is written in idiomatic C++ and changes are documented in the doxygen

  • the new changes are tested

= require_t<bool_constant<T1::ColsAtCompileTime == T2::RowsAtCompileTime>>,
typename = require_not_t<
conjunction<is_eigen_row_vector<T1>, is_eigen_col_vector<T2>>>,
int = 0>
Contributor Author:

This int template parameter is required to make the compiler treat these as different overloads (if they have the same arguments and template parameters, it treats them as the same overload and gives a lot of errors about redefining default template parameter values).

Contributor:

Oh, that's how you do that! I was fighting with that compile error like a week ago. Thank you. :)

Contributor Author:

To me this solution seems a bit hacky, but I don't know of a better one.

Collaborator:

I'm not a fan of this approach. It feels like it fights against the language. I think it would be good to post a Stack Overflow question about this.

Contributor Author:

I found a nice solution here: https://stackoverflow.com/questions/38305222/default-template-argument-when-using-stdenable-if-as-templ-param-why-ok-wit

The trick is to use the requires as the types of template parameters instead of as defaults for typename parameters: require_t<...>* = nullptr instead of typename = require_t<...>

stan/math/fwd/fun/multiply.hpp (resolved)
Comment on lines 23 to 25
check_vector("dot_product", "v1", v1);
check_vector("dot_product", "v2", v2);
template <typename T1, typename T2,
typename = require_all_eigen_vector_t<T1, T2>>
inline auto dot_product(const T1 &v1, const T2 &v2) {
check_matching_sizes("dot_product", "v1", v1, "v2", v2);
return v1.dot(v2);
Contributor Author:

These check_vector checks were redundant: Eigen's .dot() is only defined for compile-time vectors, so there is no way this could compile and still fail the check.

stan/math/prim/fun/multiply.hpp (resolved)
@@ -68,10 +60,11 @@ inline return_type_t<TD, TA, TB> trace_gen_quad_form(
* @throw std::domain_error if A cannot be multiplied by B or B cannot
* be multiplied by D.
*/
template <int RD, int CD, int RA, int CA, typename TB, int RB, int CB>
Contributor Author:

typename TB is never used and therefore cannot be deduced. Due to this bug, this overload was never chosen.

tols.hessian_hessian_ = 1e-1;
tols.hessian_fvar_hessian_ = 1e-1;
tols.hessian_hessian_ = 0.15;
tols.hessian_fvar_hessian_ = 0.15;
Contributor Author:

I fixed a bug that prevented the all-double overload of trace_gen_quad_form() from ever being chosen. While faster, that overload seems to be a bit less accurate, so I had to increase these tolerances.

@SteveBronder (Collaborator) left a comment:

Just looked over the metaprogramming. I like it! I'll sit down and look at more once the tests pass

const Eigen::Matrix<fvar<T>, R2, C2>& m2) {
template <typename T1, typename T2,
typename = require_all_eigen_vt<is_fvar, T1, T2>,
typename = require_same_t<typename value_type_t<T1>::Scalar,
Collaborator:

Contributor Author:

The problem here is that scalar_t<T1> is fvar<X> and I want to compare those Xs.

Contributor Author:

Whoops, I was wrong. Of course require_same_vt is fine.

typename = require_same_t<typename value_type_t<T1>::Scalar,
typename value_type_t<T2>::Scalar>,
typename = require_not_t<
conjunction<is_eigen_row_vector<T1>, is_eigen_col_vector<T2>>>,
Collaborator:

[optional]
We could have something like `require_eigen_dot_product` (I'm not sure if there is some other math term for general ops with a row and a column vector).

Also, I think this could use require_any_not_t

Collaborator:

Also, names like RowVec and ColVec may be good here.

Contributor Author:

Here we have a require_NOT, so these are not row and col vectors.

const Eigen::Matrix<double, R2, C2>& m2) {
template <typename T1, typename T2, typename = require_all_eigen_t<T1, T2>,
typename = require_fvar_t<value_type_t<T1>>,
typename = require_same_vt<double, T2>,
Collaborator:

Could shorten this up with

template <typename Mat1, typename Mat2, typename = require_eigen_vt<is_fvar, Mat1>,
  typename = require_eigen_vt<std::is_floating_point, Mat2>

Contributor Author:

done

Comment on lines 29 to 30
typename = require_not_t<
conjunction<is_eigen_row_vector<T1>, is_eigen_col_vector<T2>>>,
Collaborator:

Maybe just a meta function like is_dot_product for these instead of a requires specialization

stan/math/fwd/fun/multiply.hpp (outdated, resolved)
typename = require_all_not_var_t<value_type_t<TD>, value_type_t<TA>,
value_type_t<TB>>,
typename = require_any_not_same_t<double, value_type_t<TD>,
value_type_t<TA>, value_type_t<TB>>,
Collaborator:

It might be a weird way to express it, but one way to look at this is

require_not_same_t<double, return_type_t<TD, TA, TB>>

Contributor Author:

While this is shorter, I find it less clear.

template <typename TD, typename TA, typename TB,
typename = require_all_same_t<double, value_type_t<TD>,
value_type_t<TA>, value_type_t<TB>>,
typename = require_all_eigen_t<TD, TA, TB>>
Collaborator:

require_all_eigen_vt<std::is_floating_point, TD, TA, TB>

If we ever worry about mixing types of floats we could also have

require_same_same_vt<TD, TA, TB>

Collaborator:

Though the bottom one may be better as a static assert for the user

Contributor Author:

A static assert is not what we want here: if that require is not satisfied, the other overload must be chosen.

Contributor Author:

Anyway, I would not complicate things here, since this overload can be removed once transpose is generalized too.

stan/math/prim/fun/transpose.hpp (resolved)
stan/math/rev/fun/divide.hpp (outdated, resolved)
= new internal::matrix_scalar_divide_vd_vari<R, C>(m, c);
Eigen::Matrix<var, R, C> result(m.rows(), m.cols());
template <typename T, typename = require_eigen_vt<is_var, T>>
inline Eigen::Matrix<var, T::RowsAtCompileTime, T::ColsAtCompileTime> divide(
Collaborator:

Does auto not work here?

Contributor Author:

It does. I missed that one.

@SteveBronder (Collaborator):

For the errors up in the stan repo I started working on generalizing the rvalue and assign functions. Should those all take in Eigen::MatrixBase<> objects (or just an EigMat template with a require_*?) and call eval() if the object is being assigned to or we need coefficient-level access?

https://github.com/stan-dev/stan/blob/refactor/rvalue-pass/src/stan/model/indexing/rvalue.hpp#L114

If that branch is helpful and you want to make any changes to it, feel free to push straight to it.

@SteveBronder (Collaborator):

I think the lockstep here is updating the stan repo first, then merging this.

@t4c1 (Contributor Author) commented Jan 23, 2020:

I agree. Eigen::MatrixBase<> is not as general. I think we should either have a general template with a require, or use Eigen::EigenBase<>. Let's just go with template + require.

Looking over the branch you linked I see many changes, none of which seem like they could fix the test failures in this PR. Did you link the right branch?

@SteveBronder (Collaborator):

> Looking over the branch you linked I see many changes, none of which seem like they could fix the test failures in this PR. Did you link the right branch?

Crap, sorry, this branch doesn't cover that. I thought I had a branch with these changes, but looking at my local copies I must not have saved it.

If we use the generic template with a require, should each function that does coefficient access have something like

if (is_eigen_expression<EigMat>::value) {
  the_mat.eval();
}

Or is there some better pattern?

@t4c1 (Contributor Author) commented Jan 23, 2020:

What you suggested has multiple issues. The correct pattern would be:

const auto& mat = expr_argument.eval();

where expr_argument is the function argument and mat is used for the remainder of the function (the if is not necessary). However, we can do even better and use Eigen::Ref, which avoids evaluating the expression when it is just a sub-block operation, since that already allows element access. So the pattern becomes:

const Eigen::Ref<const typename T::PlainObject>& mat = expr_argument;

where T is the type of expr_argument.

@stan-buildbot (Contributor):


| Name | Old Result | New Result | Ratio | Performance change (1 - new/old) |
| --- | --- | --- | --- | --- |
| gp_pois_regr/gp_pois_regr.stan | 4.89 | 4.98 | 0.98 | -1.8% slower |
| low_dim_corr_gauss/low_dim_corr_gauss.stan | 0.02 | 0.02 | 0.99 | -1.42% slower |
| eight_schools/eight_schools.stan | 0.09 | 0.09 | 1.01 | 0.9% faster |
| gp_regr/gp_regr.stan | 0.22 | 0.23 | 0.96 | -3.82% slower |
| irt_2pl/irt_2pl.stan | 6.08 | 6.07 | 1.0 | 0.3% faster |
| performance.compilation | 89.16 | 87.17 | 1.02 | 2.23% faster |
| low_dim_gauss_mix_collapse/low_dim_gauss_mix_collapse.stan | 7.48 | 7.32 | 1.02 | 2.13% faster |
| pkpd/one_comp_mm_elim_abs.stan | 21.66 | 21.07 | 1.03 | 2.72% faster |
| sir/sir.stan | 92.92 | 91.67 | 1.01 | 1.35% faster |
| gp_regr/gen_gp_data.stan | 0.05 | 0.05 | 1.01 | 1.13% faster |
| low_dim_gauss_mix/low_dim_gauss_mix.stan | 2.99 | 2.99 | 1.0 | -0.25% slower |
| pkpd/sim_one_comp_mm_elim_abs.stan | 0.33 | 0.3 | 1.08 | 7.32% faster |
| arK/arK.stan | 1.76 | 1.76 | 1.0 | 0.19% faster |
| arma/arma.stan | 0.8 | 0.74 | 1.08 | 7.76% faster |
| garch/garch.stan | 0.62 | 0.63 | 0.99 | -1.12% slower |

Mean result: 1.01283994857

Jenkins Console Log
Blue Ocean
Commit hash: b160720


Machine information: Mac OS X 10.11.6 (BuildVersion 15G22010)

CPU:
Intel(R) Xeon(R) CPU E5-1680 v2 @ 3.00GHz

G++:
Configured with: --prefix=/Applications/Xcode.app/Contents/Developer/usr --with-gxx-include-dir=/usr/include/c++/4.2.1
Apple LLVM version 7.0.2 (clang-700.1.81)
Target: x86_64-apple-darwin15.6.0
Thread model: posix

Clang:
Apple LLVM version 7.0.2 (clang-700.1.81)
Target: x86_64-apple-darwin15.6.0
Thread model: posix

# Conflicts:
#	test/unit/math/mix/fun/trace_gen_quad_form_test.cpp
@stan-buildbot (Contributor):


| Name | Old Result | New Result | Ratio | Performance change (1 - new/old) |
| --- | --- | --- | --- | --- |
| gp_pois_regr/gp_pois_regr.stan | 4.86 | 4.87 | 1.0 | -0.19% slower |
| low_dim_corr_gauss/low_dim_corr_gauss.stan | 0.02 | 0.02 | 0.96 | -4.6% slower |
| eight_schools/eight_schools.stan | 0.09 | 0.09 | 0.99 | -0.88% slower |
| gp_regr/gp_regr.stan | 0.22 | 0.22 | 1.0 | -0.33% slower |
| irt_2pl/irt_2pl.stan | 6.08 | 6.05 | 1.0 | 0.33% faster |
| performance.compilation | 88.12 | 86.97 | 1.01 | 1.31% faster |
| low_dim_gauss_mix_collapse/low_dim_gauss_mix_collapse.stan | 7.49 | 7.48 | 1.0 | 0.09% faster |
| pkpd/one_comp_mm_elim_abs.stan | 22.46 | 21.4 | 1.05 | 4.73% faster |
| sir/sir.stan | 93.01 | 98.56 | 0.94 | -5.96% slower |
| gp_regr/gen_gp_data.stan | 0.05 | 0.05 | 1.0 | 0.03% faster |
| low_dim_gauss_mix/low_dim_gauss_mix.stan | 3.0 | 2.99 | 1.0 | 0.24% faster |
| pkpd/sim_one_comp_mm_elim_abs.stan | 0.32 | 0.32 | 1.0 | -0.03% slower |
| arK/arK.stan | 1.74 | 1.74 | 1.0 | -0.0% slower |
| arma/arma.stan | 0.79 | 0.79 | 1.01 | 0.5% faster |
| garch/garch.stan | 0.59 | 0.58 | 1.0 | 0.16% faster |

Mean result: 0.997466204586

Jenkins Console Log
Blue Ocean
Commit hash: f88d89a


Machine information: Mac OS X 10.11.6 (BuildVersion 15G22010)

CPU:
Intel(R) Xeon(R) CPU E5-1680 v2 @ 3.00GHz

G++:
Configured with: --prefix=/Applications/Xcode.app/Contents/Developer/usr --with-gxx-include-dir=/usr/include/c++/4.2.1
Apple LLVM version 7.0.2 (clang-700.1.81)
Target: x86_64-apple-darwin15.6.0
Thread model: posix

Clang:
Apple LLVM version 7.0.2 (clang-700.1.81)
Target: x86_64-apple-darwin15.6.0
Thread model: posix

@t4c1 (Contributor Author) commented Feb 10, 2020:

@SteveBronder This is still waiting for a review.

@SteveBronder (Collaborator):

Sorry for the delay, it's been a very busy week. I should have time tomorrow night to look at this.

# Conflicts:
#	stan/math/fwd/fun/multiply.hpp
#	stan/math/prim/fun/multiply.hpp
#	stan/math/rev/fun/multiply.hpp
Comment on lines 25 to 29
template <typename Mat1, typename Mat2,
typename = require_eigen_vt<is_fvar, Mat1>,
typename = require_eigen_vt<std::is_floating_point, Mat2>,
typename = require_not_eigen_row_and_col_t<Mat1, Mat2>, int = 0>
inline auto multiply(const Mat1& m1, const Mat2& m2) {
Collaborator:

I think we need to do these multiplies differently. This one and the ones in prim are pretty confusing.

Contributor Author:

done

Comment on lines +43 to +46
template <typename Mat, typename Scal, typename = require_eigen_t<Mat>,
typename = require_stan_scalar_t<Scal>,
typename = require_all_not_var_t<scalar_type_t<Mat>, Scal>>
inline auto divide(const Mat& m, Scal c) {
Collaborator:

Is this so that divide works for fvar?

@t4c1 (Contributor Author), Feb 12, 2020:

This one does work for fvar; I just generalized it the same way as the other functions.

Comment on lines 64 to 66
typename = require_any_not_same_t<double, value_type_t<Mat1>,
value_type_t<Mat2>>,
typename = require_not_eigen_row_and_col_t<Mat1, Mat2>>
Collaborator:

[sidenote]

We should probably have a require_*_vt for checking the value types of any arbitrary container

const Eigen::Matrix<double, RB, CB> &B) {
template <typename TD, typename TA, typename TB,
typename = require_all_eigen_t<TD, TA, TB>,
typename = require_all_same_vt<double, TD, TA, TB>>
Collaborator:

Should we not do this for all arithmetics?

Contributor Author:

You are right. Done.

@SteveBronder (Collaborator):

I looked this over and like a lot of it! But the multiply needs to be cleaned up. The templates there are very complicated, and the requires for them are pretty hard to read. Do you have any ideas on how else to do those? I can look at this again on Thursday and see if I can hack up something else.

@SteveBronder (Collaborator):

If @syclik or @bob-carpenter sign off on the multiply templates then I'm fine with it, but I'm not sure about them.

@stan-buildbot (Contributor):


| Name | Old Result | New Result | Ratio | Performance change (1 - new/old) |
| --- | --- | --- | --- | --- |
| gp_pois_regr/gp_pois_regr.stan | 4.94 | 4.93 | 1.0 | 0.22% faster |
| low_dim_corr_gauss/low_dim_corr_gauss.stan | 0.02 | 0.02 | 0.97 | -3.37% slower |
| eight_schools/eight_schools.stan | 0.09 | 0.09 | 1.0 | 0.41% faster |
| gp_regr/gp_regr.stan | 0.22 | 0.22 | 1.01 | 0.76% faster |
| irt_2pl/irt_2pl.stan | 6.11 | 6.05 | 1.01 | 0.99% faster |
| performance.compilation | 88.37 | 87.51 | 1.01 | 0.97% faster |
| low_dim_gauss_mix_collapse/low_dim_gauss_mix_collapse.stan | 7.5 | 7.49 | 1.0 | 0.2% faster |
| pkpd/one_comp_mm_elim_abs.stan | 20.16 | 21.02 | 0.96 | -4.26% slower |
| sir/sir.stan | 93.71 | 91.84 | 1.02 | 2.0% faster |
| gp_regr/gen_gp_data.stan | 0.05 | 0.05 | 1.01 | 0.9% faster |
| low_dim_gauss_mix/low_dim_gauss_mix.stan | 3.01 | 3.0 | 1.01 | 0.64% faster |
| pkpd/sim_one_comp_mm_elim_abs.stan | 0.32 | 0.35 | 0.9 | -11.44% slower |
| arK/arK.stan | 1.75 | 1.73 | 1.01 | 1.18% faster |
| arma/arma.stan | 0.79 | 0.8 | 0.99 | -0.66% slower |
| garch/garch.stan | 0.59 | 0.58 | 1.01 | 0.78% faster |

Mean result: 0.993918240423

Jenkins Console Log
Blue Ocean
Commit hash: 9529ccc


Machine information: Mac OS X 10.11.6 (BuildVersion 15G22010)

CPU:
Intel(R) Xeon(R) CPU E5-1680 v2 @ 3.00GHz

G++:
Configured with: --prefix=/Applications/Xcode.app/Contents/Developer/usr --with-gxx-include-dir=/usr/include/c++/4.2.1
Apple LLVM version 7.0.2 (clang-700.1.81)
Target: x86_64-apple-darwin15.6.0
Thread model: posix

Clang:
Apple LLVM version 7.0.2 (clang-700.1.81)
Target: x86_64-apple-darwin15.6.0
Thread model: posix

@t4c1 (Contributor Author) commented Feb 13, 2020:

@SteveBronder Everything you mentioned is now fixed, including the multiply templates, and the tests pass, so this is ready for the next review.

@SteveBronder (Collaborator) left a comment:

lgtm!

@t4c1 t4c1 merged commit dde9b47 into stan-dev:develop Feb 14, 2020
@rok-cesnovar rok-cesnovar mentioned this pull request Feb 14, 2020
4 tasks
@t4c1 t4c1 deleted the generalize_operators branch November 30, 2020 09:25

Successfully merging this pull request may close these issues.

double version of trace_gen_quad_form is never used
6 participants