Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Possible memory leak when differentiating call expressions in forward mode #745

Closed
vaithak opened this issue Jan 31, 2024 · 0 comments · Fixed by #848
Closed

Possible memory leak when differentiating call expressions in forward mode #745

vaithak opened this issue Jan 31, 2024 · 0 comments · Fixed by #848
Assignees
Milestone

Comments

@vaithak
Copy link
Collaborator

vaithak commented Jan 31, 2024

Minimal reproducible example:

#include "clad/Differentiator/Differentiator.h"

double g (double x) { return x; }

double f (double x) {
  const int __k = 9;
  return g(x);
}

int main () {
  auto f_dx = clad::differentiate(f);
  return 0;
}

error msg: error: reference to local variable '_d___k' declared in enclosing function 'f_darg0'

This was found during the debugging of #700, and is related to the issue in the mersenne_twister.h file.

@vgvassilev vgvassilev added this to the v1.5 milestone Mar 27, 2024
vaithak added a commit to vaithak/clad that referenced this issue Apr 30, 2024
Plan for dynamic graph
- The relations between different differentiation requests can be modelled as a graph.
For example, if `f_a` calls `f_b`, there will be two differentiation requests `df_a` and `df_b`,
the edge between them can be understood as `created_because_of`. This also means that the functions
called by the users to be explicitly differentiated (or `DiffRequests` created because of these) are the source nodes,
i.e. no incoming edges. In most cases, this graph aligns with the call graph, but in some cases,
the graph depends on the internal implementation, like the Hessian computation, which requires creating multiple
`fwd_mode` requests followed by a `rev_mode` request.

- We can use this graph to order the computation of differentiation requests. This was already being done implicitly
in the initial recursive implementation. Whenever we encountered a call expression, we started differentiation of the
called function; this was sort of like a depth-first search strategy.
  - This had problems, as `Clang` reported errors when it encountered a new function scope (of the derivative of the called function) in the middle of the old function scope (of the derivative of the callee function). It treated the nested one like a lambda expression. The issue regarding this: vgvassilev#745.

- To fix this, an initial strategy was to eliminate the recursive approach. Hence, a queue-based breadth-first approach
was implemented in this PR: vgvassilev#848.
  - Although it fixed the problem, the graph traversal was still implicit. We needed some way to compute/store the complete graph and possibly optimize it, such as converting edges to model the `requires_derivative_of` relation. Using this, we could proceed with differentiation in a topologically sorted ordering.
  - It also required one caveat: although we don't differentiate the called function completely in a recursive way, we still need to declare it so that we can have the call expression completed (i.e. `auto t0 = f_pushforward(...)`).

- To move towards the final stage of having the complete graph computed before starting the differentiation, we need the complete information on how the `DiffRequest` will be formed inside the visitors (including arguments or `DVI` info).
This whole approach will require activity analysis in the first pass.
  - As an incremental improvement, the first requirement was to implement infrastructure to support explicit modelling of the graph and use that to have a breadth-first traversal (and eventually topological ordering).

This is the initial PR for capturing the differentiation plan in a graphical format.

However, the traversal order is still breadth-first, as we don't have the complete graph in the first pass - mainly because of a lack of information about the args required for `pushforward` and `pullbacks`.

This can be improved with the help of activity analysis to capture the complete graph in the first pass, processing the plan in a topologically sorted manner and pruning the graph for user-defined functions. I started this with this approach, and the initial experimental commit is available here for future reference: 82c0b42.
vaithak added a commit to vaithak/clad that referenced this issue May 3, 2024
Plan for dynamic graph
- The relations between different differentiation requests can be modelled as a graph.
For example, if `f_a` calls `f_b`, there will be two differentiation requests `df_a` and `df_b`,
the edge between them can be understood as `created_because_of`. This also means that the functions
called by the users to be explicitly differentiated (or `DiffRequests` created because of these) are the source nodes,
i.e. no incoming edges. In most cases, this graph aligns with the call graph, but in some cases,
the graph depends on the internal implementation, like the Hessian computation, which requires creating multiple
`fwd_mode` requests followed by a `rev_mode` request.

- We can use this graph to order the computation of differentiation requests. This was already being done implicitly
in the initial recursive implementation. Whenever we encountered a call expression, we started differentiation of the
called function; this was sort of like a depth-first search strategy.
  - This had problems, as `Clang` reported errors when it encountered a new function scope (of the derivative of the called function) in the middle of the old function scope (of the derivative of the callee function). It treated the nested one like a lambda expression. The issue regarding this: vgvassilev#745.

- To fix this, an initial strategy was to eliminate the recursive approach. Hence, a queue-based breadth-first approach
was implemented in this PR: vgvassilev#848.
  - Although it fixed the problem, the graph traversal was still implicit. We needed some way to compute/store the complete graph and possibly optimize it, such as converting edges to model the `requires_derivative_of` relation. Using this, we could proceed with differentiation in a topologically sorted ordering.
  - It also required one caveat: although we don't differentiate the called function completely in a recursive way, we still need to declare it so that we can have the call expression completed (i.e. `auto t0 = f_pushforward(...)`).

- To move towards the final stage of having the complete graph computed before starting the differentiation, we need the complete information on how the `DiffRequest` will be formed inside the visitors (including arguments or `DVI` info).
This whole approach will require activity analysis in the first pass.
  - As an incremental improvement, the first requirement was to implement infrastructure to support explicit modelling of the graph and use that to have a breadth-first traversal (and eventually topological ordering).

This is the initial PR for capturing the differentiation plan in a graphical format.

However, the traversal order is still breadth-first, as we don't have the complete graph in the first pass - mainly because of a lack of information about the args required for `pushforward` and `pullbacks`.

This can be improved with the help of activity analysis to capture the complete graph in the first pass, processing the plan in a topologically sorted manner and pruning the graph for user-defined functions. I started this with this approach, and the initial experimental commit is available here for future reference: 82c0b42.
vgvassilev pushed a commit that referenced this issue May 3, 2024
Plan for dynamic graph
- The relations between different differentiation requests can be modelled as a graph.
For example, if `f_a` calls `f_b`, there will be two differentiation requests `df_a` and `df_b`,
the edge between them can be understood as `created_because_of`. This also means that the functions
called by the users to be explicitly differentiated (or `DiffRequests` created because of these) are the source nodes,
i.e. no incoming edges. In most cases, this graph aligns with the call graph, but in some cases,
the graph depends on the internal implementation, like the Hessian computation, which requires creating multiple
`fwd_mode` requests followed by a `rev_mode` request.

- We can use this graph to order the computation of differentiation requests. This was already being done implicitly
in the initial recursive implementation. Whenever we encountered a call expression, we started differentiation of the
called function; this was sort of like a depth-first search strategy.
  - This had problems, as `Clang` reported errors when it encountered a new function scope (of the derivative of the called function) in the middle of the old function scope (of the derivative of the callee function). It treated the nested one like a lambda expression. The issue regarding this: #745.

- To fix this, an initial strategy was to eliminate the recursive approach. Hence, a queue-based breadth-first approach
was implemented in this PR: #848.
  - Although it fixed the problem, the graph traversal was still implicit. We needed some way to compute/store the complete graph and possibly optimize it, such as converting edges to model the `requires_derivative_of` relation. Using this, we could proceed with differentiation in a topologically sorted ordering.
  - It also required one caveat: although we don't differentiate the called function completely in a recursive way, we still need to declare it so that we can have the call expression completed (i.e. `auto t0 = f_pushforward(...)`).

- To move towards the final stage of having the complete graph computed before starting the differentiation, we need the complete information on how the `DiffRequest` will be formed inside the visitors (including arguments or `DVI` info).
This whole approach will require activity analysis in the first pass.
  - As an incremental improvement, the first requirement was to implement infrastructure to support explicit modelling of the graph and use that to have a breadth-first traversal (and eventually topological ordering).

This is the initial PR for capturing the differentiation plan in a graphical format.

However, the traversal order is still breadth-first, as we don't have the complete graph in the first pass - mainly because of a lack of information about the args required for `pushforward` and `pullbacks`.

This can be improved with the help of activity analysis to capture the complete graph in the first pass, processing the plan in a topologically sorted manner and pruning the graph for user-defined functions. I started this with this approach, and the initial experimental commit is available here for future reference: vaithak@82c0b42.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants