Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Eigen expressions #23

Open
wants to merge 4 commits into
base: master
Choose a base branch
from
Open
Changes from 1 commit
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
63 changes: 63 additions & 0 deletions designs/0005-eigen_expressions.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,63 @@
- Feature Name: eigen_expressions
- Start Date: 2020-05-22
- RFC PR: (leave this empty)
- Stan Issue: (leave this empty)

# Summary
[summary]: #summary

Allow all functions in Stan Math to accept and return Eigen expressions.

# Motivation
[motivation]: #motivation

If all functions only accept and return matrices, every function must evaluate all operations it uses within the function. Evaluation of an operation means loading all matrices involved from memory to CPU, computing the opration and storing the result back in memory. For simple operations loading and storing take more time than computations.

If we instead allow functions to accept and return expressions, the expresssions are only evaluated when strictly necessary. So more operations can be combined into a single expression, before it is evaluated. This results in less memory transfers and faster execution.

# Guide-level explanation
[guide-level-explanation]: #guide-level-explanation

Eigen, the library we are using for matrix computations supports lazy evaluation. That means operations are not imediately evaluated. Instead each operation returns an object that represents this operation and references the arguments to the operation. If those arguments are operations themselves we can build arbitrarily complex expressions. These expressions are evaluated when assigned to a matrix (or `.eval()` is called on it). For example:

```
Eigen MatrixXd a, b;
auto expr = a + 3 * b; //not evaluated yet
MatrixXd c = expr; //now it evaluates
```

General functions must accept their arguments as templates. Which types are accepted and which are not can be restricted with requires, such as `require_eigen_t<T>`. Now these arguments can be either matrices as before or more general expressions. All functions must be updated to handle this.

# Reference-level explanation
[reference-level-explanation]: #reference-level-explanation

Some expressions can be very expensive to compute (for example matrix multiplication), so they should never be evaluated more than once. So if an expression argument is used more than once in a function, it should be evaluated first and than the computed value should be used. On the other hand some expressions are trivial to compute (for example block). For these it would be better if they are not evaluated, as that would make a copy of underlying data. Eigen solved this dilemma by introducing `Eigen::Ref` class. If a trivial expression is assigned to `Ref` it is just referenced by it. If the expression involves actual computations assigning it to `Ref` evaluates it and stores the result in the `Ref`. In either case `Ref` than can be used more or less the same as a matrix.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the difference in an Eigen::Ref and an expression? Are Eigen::Ref s expressions?

Copy link
Contributor Author

@t4c1 t4c1 May 22, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think there is any mathematically exact definition of what is expression. For the functions accepting expressions it is. So are Eigen::Matrix and Eigen::Map.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh so Eigen::Ref will hold things like "this is the sum of two matrices, A, and B".

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sum is not trivial. So Ref would evaluate it and hold the result.


However, `Ref`, behaves weirdly in some situations. Namely its copy constructor does nothing. Instead of using `Ref` we can make a trait metaprogram `to_ref_t<T>` that determines appropriate type to assign an expression to that will behave the same as `Ref`, while also handle copying, moving etc. We can also make a helper function `to_ref` for converting expressions to that type.

So whenever an input argument, which might be an expression is more than once in a function it should be assigned to a variable of type `to_ref_t<T>`, where `T` is a the type of input argument (or alternatively call `to_ref` on the argument). That variable should be used everywhere in the function instead of directly using the input argument.

Steps to implement:
- Generalize all functions to accept general Eigen expressions (already halfway complete at the time of writing this)
- Test that all functions work with general expressions (this can probably be automated with a script).
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This got merged yesterday.

The design doc should link to what actual docs exist for using this: https://github.com/stan-dev/stan/wiki/Testing:-Unit-Tests#6-expression-testing-framework

Could also mention that these tests are in Stan and run when stanc3 or math change.

- Make functions return Eigen expressions wherever it makes sense to do so. (This is for the functions that are exposed to Stan language. Other function can return expresions earlier.)

# Drawbacks
[drawbacks]: #drawbacks

Code complexity increases. Performance affecting bugs are more likely due to some input expression being evaluated multiple times.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, somehow the drawback is that bugs keep popping up in unexpected places.

I guess if the code magically got more complex to handle that, that's no problem. It seems like the problem is the code doesn't magically get more complicated and we get bugs lol.

Anyway I'm not really adding anything here just thinking out loud.


# Rationale and alternatives
[rationale-and-alternatives]: #rationale-and-alternatives

- Only alternative I know of is not changing anything and still evaluatign expressions everywhere.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess this should just be The alternative is to stay with non-expression types. The downside to this is the extra time spend loading/storing matrices in simple operations..


# Prior art
[prior-art]: #prior-art

Eigen does it. We also do the same with kernel generator expressions.

# Unresolved questions
[unresolved-questions]: #unresolved-questions

I can't think of any right now.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add an example of a function that used to take an Eigen::Matrix<T, R, C> and return an Eigen::Matrix<T, R, C> but now it takes in an expression and possibly returns one too?

Is this a simple conversion?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we be returning auto? Or returning everything through one of those Ref things?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we be using those universal reference things @SteveBronder does? Or should we be doing Eigen::MatrixBase<Derived>?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we be returning auto? Or returning everything through one of those Ref things?

Right now we still need to return matrix from functions exposed to lang. Later we will be returning auto rfom most functions (or if you don't like auto you can specify exact type of the expression).

Should we be using those universal reference things @SteveBronder does?

He tends to use them everywhere, including where they ofer no benefit, just make code more complicated. In most places not. In general it has nothing to do with expressions. If you are interested in more details about that look up perfect forwarding.

Or should we be doing Eigen::MatrixBase?

We could use that. Just than we would have to call .derived() on every argument. It is simpler to use plain templates and restrict input types with requires.

Copy link
Contributor Author

@t4c1 t4c1 May 22, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added an example. If you are interested in more examples or in more complex functions, see what heas already been done under issue stan-dev/math#1470.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add Rok's example about stanc3 somewhere here. Like:

matrix[N, N] A = B * C;
matrix[N, N] D = A * C + A * B;

Still works as expected in stan as long as the compiler makes A a full type that causes the copy to happen.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What meta-programs do we need to support this?

If I have an expression type T, how do I get:

  1. The non-expression type
  2. The scalar type
  3. is_var, etc.

Or maybe all the regular programs for 2, 3, etc. work and we just need a new one for 1? (or I take it plain_type_t solves 1)?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(or I take it plain_type_t solves 1)?

Exactly.

For 2 and 3 we need nothing new. What we already have works with expressions.