-
Notifications
You must be signed in to change notification settings - Fork 125
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add timings report if -ftime-report is enabled #779
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
clang-tidy made some suggestions
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We will need a test for this new feature.
tools/ClangPlugin.cpp
Outdated
@@ -159,7 +159,9 @@ namespace clad { | |||
S.PerformPendingInstantiations(); | |||
m_PendingInstantiationsInFlight = false; | |||
} | |||
|
|||
// Necessary to prevent separate timing reports due to expired timers |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should probably take inspiration from clang::CompilerInstance::createFrontendTimer()
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you elaborate on this? not sure what you mean
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Clang does not use dummy timers to trigger some effects. My suggestion was to look at the interface I pointed out, understand how was done there and take a similar approach.
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## master #779 +/- ##
==========================================
+ Coverage 94.71% 94.87% +0.16%
==========================================
Files 49 49
Lines 7342 7478 +136
==========================================
+ Hits 6954 7095 +141
+ Misses 388 383 -5
... and 6 files with indirect coverage changes
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
clang-tidy made some suggestions
Sorry for the delay, id appreciate feedback on this implementation |
Additionally, would a test for this new feature be simply checking if the timing report is printed when including the |
Yes. |
Can you update the output that this PR produces now? I also see that we have still the SimpleTimer. That needs to go in favor of the new implementation. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
clang-tidy made some suggestions
updated output, difference being that its printed at the end
|
I've implemented it as a stack of pointers to only the active |
ive preserved the |
tools/ClangPlugin.cpp
Outdated
} | ||
void clad::CladTimerGroup::StopTimer() { | ||
Timers.back()->stopTimer(); | ||
if (Timers.size() != 1) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why don't we call pop_back
when size is 1?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Its the llvm::TimerGroup
implementation, as given here
The description says it prints when the llvm::TimerGroup
is destroyed but the implementation here line 339 prints it even if the TimerGroup has no alive timers(before the TimerGroup is destroyed).
So by keeping atleast one Timer(running or not) alive, it prints only one report at the very end, rather than split reports
Far as i can tell, llvm keeps some of the Timers alive, in their usage of TimerGroup
as seen here Line 56 in the StringMap
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The alternative i believe would be to leave all the Timers alive, which seems like a waste of memory
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Or we could use separate TimerGroups for each function differentiation request, so the output would look something like this
===-------------------------------------------------------------------------===
Timing Report for fn(forward mode)
===-------------------------------------------------------------------------===
Total Execution Time: 0.0431 seconds (0.0431 wall clock)
---User Time--- --User+System-- ---Wall Time--- --- Name ---
0.0182 ( 42.1%) 0.0182 ( 42.1%) 0.0182 ( 42.1%) fn
0.0162 ( 37.5%) 0.0162 ( 37.5%) 0.0162 ( 37.5%) nested1_fn
0.0045 ( 10.5%) 0.0045 ( 10.5%) 0.0045 ( 10.5%) nested2_fn
0.0431 (100.0%) 0.0431 (100.0%) 0.0431 (100.0%) Total
===-------------------------------------------------------------------------===
Timing Report for fnr(reverse mode)
===-------------------------------------------------------------------------===
Total Execution Time: 0.0431 seconds (0.0431 wall clock)
---User Time--- --User+System-- ---Wall Time--- --- Name ---
0.0028 ( 6.4%) 0.0028 ( 6.4%) 0.0028 ( 6.4%) fnr
0.0015 ( 3.4%) 0.0015 ( 3.4%) 0.0015 ( 3.4%) nested_fnr
0.0431 (100.0%) 0.0431 (100.0%) 0.0431 (100.0%) Total
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Or we could use separate TimerGroups for each function differentiation request, so the output would look something like this
===-------------------------------------------------------------------------=== Timing Report for fn(forward mode) ===-------------------------------------------------------------------------=== Total Execution Time: 0.0431 seconds (0.0431 wall clock) ---User Time--- --User+System-- ---Wall Time--- --- Name --- 0.0182 ( 42.1%) 0.0182 ( 42.1%) 0.0182 ( 42.1%) fn 0.0162 ( 37.5%) 0.0162 ( 37.5%) 0.0162 ( 37.5%) nested1_fn 0.0045 ( 10.5%) 0.0045 ( 10.5%) 0.0045 ( 10.5%) nested2_fn 0.0431 (100.0%) 0.0431 (100.0%) 0.0431 (100.0%) Total ===-------------------------------------------------------------------------=== Timing Report for fnr(reverse mode) ===-------------------------------------------------------------------------=== Total Execution Time: 0.0431 seconds (0.0431 wall clock) ---User Time--- --User+System-- ---Wall Time--- --- Name --- 0.0028 ( 6.4%) 0.0028 ( 6.4%) 0.0028 ( 6.4%) fnr 0.0015 ( 3.4%) 0.0015 ( 3.4%) 0.0015 ( 3.4%) nested_fnr 0.0431 (100.0%) 0.0431 (100.0%) 0.0431 (100.0%) Total
Can you elaborate what the grouping would be? It would be for all calls to clad::differentiate
and all calls to clad::gradient
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Each differentiate or gradient call in the source file would receive its own report and this report would include all the differentiate calls that were made for that particular function in the source file.
So yes your understanding is correct
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe that's better, @parth-07 what do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
clang-tidy made some suggestions
The test is failing because the times taken by the funcs can vary and therefore the order can vary, ill remove that part, and just check if the header of the timings ("Timers for clad funcs") is printed. |
Most of the CI should pass now, but i couldnt figure out why certain tests seem to be failing from the same error, for eg this run, would appreciate some assistance figuring this error out, if it still persists after the most recent commits |
clang-tidy review says "All clean, LGTM! 👍" |
Any idea what could be causing some of the tests to fail? Outside of the clang-format one @vgvassilev @parth-07 |
"To ssh into the GitHub runner on which tests are failing, click on re-run actions and select the debug logging checkbox. If necessary increase the value of timeout-minutes key in .github/workflows/ci.yml to a suitable value for debugging – 30 - 60 minutes should generally be enough.". More you can find here. |
84fa0bf
to
9c5bf65
Compare
clang-tidy review says "All clean, LGTM! 👍" |
Found the issue, older clang versions(<11) never used the |
@DeadSpheroid, can you squash all commits and add a good commit message? |
91dec11
to
8d6d03b
Compare
clang-tidy review says "All clean, LGTM! 👍" |
Can you apply git-clang-format to your changes so that the bots are happy? |
…vassilev#769 Clad will now print a timings report for all clad function calls Adds CladTimerGroup class to time the clad functions Repurposes LIBCLAD_TIMING flag to print only the timings report for clad
8d6d03b
to
ef9bfec
Compare
clang-tidy review says "All clean, LGTM! 👍" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM!
Fixes #769
Makes use of
llvm::TimerGroup
andllvm::Timer
classes to record timings for function differentiationsSourceFile.cpp
Output with -ftime-report
The dummy timer here
https://github.com/DeadSpheroid/clad/blob/ec56e48eab5bbd167628c8e12f2515e898f4d099/tools/ClangPlugin.cpp#L169
is because the documentation of llvm::TimerGroup fails to specify that the TimerGroup prints when all Timers in it are destroyed, the dummy timer prevents this printing from happening ensuring a single report