-
Notifications
You must be signed in to change notification settings - Fork 122
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Restructure differentiation schedule into a breadth first traversal #848
Conversation
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## master #848 +/- ##
=======================================
Coverage 94.77% 94.77%
=======================================
Files 49 49
Lines 7499 7563 +64
=======================================
+ Hits 7107 7168 +61
- Misses 392 395 +3
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
clang-tidy made some suggestions
bde43e1
to
8788574
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
clang-tidy made some suggestions
7bfd4aa
to
6918417
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
clang-tidy made some suggestions
tools/ClangPlugin.h
Outdated
@@ -288,6 +292,11 @@ namespace clad { | |||
return P.ProcessDiffRequest(request); | |||
} | |||
|
|||
void AddRequestToSchedule(CladPlugin& P, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
warning: function 'AddRequestToSchedule' defined in a header file; function definitions in header files can lead to ODR violations [misc-definitions-in-headers]
void AddRequestToSchedule(CladPlugin& P,
^
Additional context
tools/ClangPlugin.h:294: make as 'inline'
void AddRequestToSchedule(CladPlugin& P,
^
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you! This is a major improvement. We have discussed that now we can process diff schedules separately from the differentiation process. That is, we first plan and then act. This PR does not get us there just yet, and this is why it relies on a forward declarations because it only appends to the diff plan instead of inserting.
That's an incremental change but we should also try to make the next step maybe in a separate PR.
if (arg->getName() == "p") | ||
m_Variables[arg] = m_Result; | ||
idx += 1; | ||
continue; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we come up with a test here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is for Jacobian computation and the particular case is for something related to ROOT as captured here: #478 (comment).
I will look into this when I work on the Jacobain array-related issue: #472
d0b651e
to
509ec63
Compare
a7236d0
to
bfb2c7c
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
clang-tidy made some suggestions
CXXScopeSpec SS; | ||
bool isArrow = Base->getType()->isPointerType(); | ||
auto ME = m_Sema | ||
.ActOnMemberAccessExpr(getCurrentScope(), Base, noLoc, | ||
.ActOnMemberAccessExpr(getCurrentScope(), Base, Loc, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
warning: 3rd argument 'Loc' (passed to 'OpLoc') looks like it might be swapped with the 6th, 'noLoc' (passed to 'TemplateKWLoc') [readability-suspicious-call-argument]
.ActOnMemberAccessExpr(getCurrentScope(), Base, Loc,
^
Additional context
llvm/include/clang/Sema/Sema.h:5307: in the call to 'ActOnMemberAccessExpr', declared here
ExprResult ActOnMemberAccessExpr(Scope *S, Expr *Base,
^
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This one seems irrelevant, it is using some heuristic to deduce that noLoc
is more similar to OpLoc
, maybe because of the Levenshtein distance as mentioned here: https://releases.llvm.org/13.0.0/tools/clang/tools/extra/docs/clang-tidy/checks/readability-suspicious-call-argument.html#levenshtein-distance-as-levenshtein.
4f5fabf
to
541d126
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks really good.
ba3b914
to
2b7334c
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like the clang-tidy reports are relevant. Can you address them?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM! Do you want the merge the 6 commits the way they are or you want to squash or something else?
I think the current order of commits is fine. |
Plan for dynamic graph - The relations between different differentiation requests can be modelled as a graph. For example, if `f_a` calls `f_b`, there will be two differentiation requests `df_a` and `df_b`, the edge between them can be understood as `created_because_of`. This also means that the functions called by the users to be explicitly differentiated (or `DiffRequests` created because of these) are the source nodes, i.e. no incoming edges. In most cases, this graph aligns with the call graph, but in some cases, the graph depends on the internal implementation, like the Hessian computation, which requires creating multiple `fwd_mode` requests followed by a `rev_mode` request. - We can use this graph to order the computation of differentiation requests. This was already being done implicitly in the initial recursive implementation. Whenever we encountered a call expression, we started differentiation of the called function; this was sort of like a depth-first search strategy. - This had problems, as `Clang` reported errors when it encountered a new function scope (of the derivative of the called function) in the middle of the old function scope (of the derivative of the callee function). It treated the nested one like a lambda expression. The issue regarding this: vgvassilev#745. - To fix this, an initial strategy was to eliminate the recursive approach. Hence, a queue-based breadth-first approach was implemented in this PR: vgvassilev#848. - Although it fixed the problem, the graph traversal was still implicit. We needed some way to compute/store the complete graph and possibly optimize it, such as converting edges to model the `requires_derivative_of` relation. Using this, we could proceed with differentiation in a topologically sorted ordering. - It also required one caveat: although we don't differentiate the called function completely in a recursive way, we still need to declare it so that we can have the call expression completed (i.e. `auto t0 = f_pushforward(...)`). - To move towards the final stage of having the complete graph computed before starting the differentiation, we need the complete information on how the `DiffRequest` will be formed inside the visitors (including arguments or `DVI` info). This whole approach will require activity analysis in the first pass. - As an incremental improvement, the first requirement was to implement infrastructure to support explicit modelling of the graph and use that to have a breadth-first traversal (and eventually topological ordering). This is the initial PR for capturing the differentiation plan in a graphical format. However, the traversal order is still breadth-first, as we don't have the complete graph in the first pass - mainly because of a lack of information about the args required for `pushforward` and `pullbacks`. This can be improved with the help of activity analysis to capture the complete graph in the first pass, processing the plan in a topologically sorted manner and pruning the graph for user-defined functions. I started this with this approach, and the initial experimental commit is available here for future reference: 82c0b42.
Plan for dynamic graph - The relations between different differentiation requests can be modelled as a graph. For example, if `f_a` calls `f_b`, there will be two differentiation requests `df_a` and `df_b`, the edge between them can be understood as `created_because_of`. This also means that the functions called by the users to be explicitly differentiated (or `DiffRequests` created because of these) are the source nodes, i.e. no incoming edges. In most cases, this graph aligns with the call graph, but in some cases, the graph depends on the internal implementation, like the Hessian computation, which requires creating multiple `fwd_mode` requests followed by a `rev_mode` request. - We can use this graph to order the computation of differentiation requests. This was already being done implicitly in the initial recursive implementation. Whenever we encountered a call expression, we started differentiation of the called function; this was sort of like a depth-first search strategy. - This had problems, as `Clang` reported errors when it encountered a new function scope (of the derivative of the called function) in the middle of the old function scope (of the derivative of the callee function). It treated the nested one like a lambda expression. The issue regarding this: vgvassilev#745. - To fix this, an initial strategy was to eliminate the recursive approach. Hence, a queue-based breadth-first approach was implemented in this PR: vgvassilev#848. - Although it fixed the problem, the graph traversal was still implicit. We needed some way to compute/store the complete graph and possibly optimize it, such as converting edges to model the `requires_derivative_of` relation. Using this, we could proceed with differentiation in a topologically sorted ordering. - It also required one caveat: although we don't differentiate the called function completely in a recursive way, we still need to declare it so that we can have the call expression completed (i.e. `auto t0 = f_pushforward(...)`). - To move towards the final stage of having the complete graph computed before starting the differentiation, we need the complete information on how the `DiffRequest` will be formed inside the visitors (including arguments or `DVI` info). This whole approach will require activity analysis in the first pass. - As an incremental improvement, the first requirement was to implement infrastructure to support explicit modelling of the graph and use that to have a breadth-first traversal (and eventually topological ordering). This is the initial PR for capturing the differentiation plan in a graphical format. However, the traversal order is still breadth-first, as we don't have the complete graph in the first pass - mainly because of a lack of information about the args required for `pushforward` and `pullbacks`. This can be improved with the help of activity analysis to capture the complete graph in the first pass, processing the plan in a topologically sorted manner and pruning the graph for user-defined functions. I started this with this approach, and the initial experimental commit is available here for future reference: 82c0b42.
Plan for dynamic graph - The relations between different differentiation requests can be modelled as a graph. For example, if `f_a` calls `f_b`, there will be two differentiation requests `df_a` and `df_b`, the edge between them can be understood as `created_because_of`. This also means that the functions called by the users to be explicitly differentiated (or `DiffRequests` created because of these) are the source nodes, i.e. no incoming edges. In most cases, this graph aligns with the call graph, but in some cases, the graph depends on the internal implementation, like the Hessian computation, which requires creating multiple `fwd_mode` requests followed by a `rev_mode` request. - We can use this graph to order the computation of differentiation requests. This was already being done implicitly in the initial recursive implementation. Whenever we encountered a call expression, we started differentiation of the called function; this was sort of like a depth-first search strategy. - This had problems, as `Clang` reported errors when it encountered a new function scope (of the derivative of the called function) in the middle of the old function scope (of the derivative of the callee function). It treated the nested one like a lambda expression. The issue regarding this: #745. - To fix this, an initial strategy was to eliminate the recursive approach. Hence, a queue-based breadth-first approach was implemented in this PR: #848. - Although it fixed the problem, the graph traversal was still implicit. We needed some way to compute/store the complete graph and possibly optimize it, such as converting edges to model the `requires_derivative_of` relation. Using this, we could proceed with differentiation in a topologically sorted ordering. - It also required one caveat: although we don't differentiate the called function completely in a recursive way, we still need to declare it so that we can have the call expression completed (i.e. `auto t0 = f_pushforward(...)`). - To move towards the final stage of having the complete graph computed before starting the differentiation, we need the complete information on how the `DiffRequest` will be formed inside the visitors (including arguments or `DVI` info). This whole approach will require activity analysis in the first pass. - As an incremental improvement, the first requirement was to implement infrastructure to support explicit modelling of the graph and use that to have a breadth-first traversal (and eventually topological ordering). This is the initial PR for capturing the differentiation plan in a graphical format. However, the traversal order is still breadth-first, as we don't have the complete graph in the first pass - mainly because of a lack of information about the args required for `pushforward` and `pullbacks`. This can be improved with the help of activity analysis to capture the complete graph in the first pass, processing the plan in a topologically sorted manner and pruning the graph for user-defined functions. I started this with this approach, and the initial experimental commit is available here for future reference: vaithak@82c0b42.
No description provided.