-
Notifications
You must be signed in to change notification settings - Fork 0
Timers and Profiling
We use PETSc's Logging machinery to understand the performance of the dycores. In particular, we've provided some high-level wrappers around the PetscLogEvent object that make it very easy to add timers for functions and blocks of code.
The subsystem for timers and profiling is defined almost entirely in tdytimers.h]
.
When a dycore is initialized, a registry of timers is created when you call TDyInit()
. Once this is done, you can manipulate timers with functions/macros as described below. Make sure you include the tdytimers.h
file wherever you use these timers.
The easiest way to use a timer is to instrument a function with a timer using TDY_START_FUNCTION_TIMER()
and TDY_STOP_FUNCTION_TIMER()
. These macros automatically create/retrieve a timer for the function and start/stop it as you'd expect. You place the START
macro at the top of a function, and the STOP
one at the bottom. For example:
void do_some_expensive_things(TDy dy)
{
TDY_START_FUNCTION_TIMER()
... // Do the expensive things
TDY_STOP_FUNCTION_TIMER()
}
There's no need to understand PETSc's logging objects--everything is done for you. These function-level timers are named after the functions in which they appear.
Sometimes you want to time something that happens in the middle of a function. You can do this by calling the TDyGetTimer
function with the TDyStartTimer
and TDyStopTimer
macros:
void do_various_things(TDy dy)
{
... // Stuff happens here
// Now we want to time a block of code in the middle of the function.
PetscLogEvent timer = TDyGetTimer("important things");
TDyStartTimer(timer);
... // Important things happen here!
TDyStopTimer(timer);
... // Other stuff happens here
}
This is a bit more involved--you need to know that timers are PetscLogEvent
objects, for example, and you need to name your timers--but not too difficult. As you might have guessed, TDY_START_FUNCTION_TIMER
and TDY_STOP_FUNCTION_TIMER
are just wrappers around these constructs. TDyStartTimer
and TDyStopTimer
are themselves just macros that call PetscLogEventBegin
and PetscLogEventEnd
, so you can always use those if you want more control.
PETSc allows an arbitrary number of logging/profiling "stages" to be defined so that you can organize your profiling into sections. These stages can be named for convenience. You can enter and exit a named stage with calls to TDyEnterProfilingStage(stageName)
and TDyExitProfilingStage(stageName)
, where stageName
is a string containing the name of the stage.
TDycore provides these named stages, registering them in TDyInit
:
-
"TDycore Setup"
: for creating meshes, setting up initial conditions, calculating time-independent matrices and vectors, etc. -
"TDycore Stepping"
, for timestepping -
"TDycore I/O"
, for checkpointing, restarting, generating visualizations.
You can register your own named stages with TDyAddProfilingStage(stageName)
. All of this machinery is a thin wrapper around PETSc's PetscLogStage
mechanism, which you can use if you prefer. The functions and macros above just simplify the bookkeeping.
Adding timers to a code is only part of profiling. You also need to generate profiling reports for runs of interest! Fortunately, this is easy--just add the -tdy_timers
flag to your command line arguments to generate a performance log. This log is named tdycore_profile.csv
. It's a comma-separated variable file containing all performance data collected by PETSc. The timers you've created will show up in the profile just like those embedded in the PETSc library.
If you'd rather look at the traditional profiling/log data dumped by PETSc, you can use the -log_view
flag to have PETSc print that information to the standard output.
If you've generated a tdycore_profile.csv
file, you can use a tool called tdyprof
(located in the `tools/ subdirectory of the source tree). This Python script digests the contents of the CSV file you give it and generates nicely-formatted reports for desired information. Use it thus:
tdyprof <profile.csv> <command> [options]
or just type tdyprof
by itself to see its usage information. Currently, it has only one command, "topN", which shows you the top N most time-intensive measurements in your run. For example:
tdyprof richards_demo_profile.csv top10
tdyprof: showing top 10 hits:
Stage Name Event Name Time FLOP
Main Stage summary 12.6833 1.3097e+09
Main Stage summary 12.6704 1.30989e+09
Main Stage TDyTimeIntegratorRunToTime 11.9438 1.3039e+09
Main Stage TDyTimeIntegratorRunToTime 11.9438 1.30372e+09
Main Stage SNESSolve 11.924 1.3039e+09
Main Stage SNESSolve 11.924 1.30372e+09
Main Stage SNESJacobianEval 10.6063 1.36184e+08
Main Stage TDyMPFAOSNESJacobian_3DMesh 10.6062 1.36184e+08
Main Stage SNESJacobianEval 10.6058 1.36198e+08
Main Stage TDyMPFAOSNESJacobian_3DMesh 10.6057 1.36198e+08