KernelLogger

Vivek Kale edited this page Mar 10, 2023 · 6 revisions

Tool Description

Kernel Logger prints the start and end of each kernel, as well as entry to and exit from profiling regions, to the terminal at runtime. This helps with debugging, for example when it is unclear where an application crashes.

The tool is located at: https://github.com/kokkos/kokkos-tools/tree/master/debugging/kernel-logger
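
To illustrate how a tool of this kind can produce such a log, here is a simplified sketch. It is not the source of kp_kernel_logger. It assumes the Kokkos Tools callback names and signatures (kokkosp_push_profile_region, kokkosp_begin_parallel_for, and so on) as they are commonly documented; the namespace sketch, the helper describe_kernel, and the region stack are hypothetical names introduced only for this example.

```cpp
#include <cassert>
#include <cstdint>
#include <iostream>
#include <sstream>
#include <string>
#include <vector>

namespace sketch {  // hypothetical helper namespace, not part of Kokkos Tools

std::vector<std::string> regions;  // stack of currently open profiling regions
uint64_t next_id = 0;              // next unique execution identifier

// Build the message for a kernel launch: a header line, the enclosing
// regions (indented by nesting depth), and finally the kernel name.
std::string describe_kernel(const char* kind, const char* name,
                            uint32_t devID, uint64_t kID) {
  std::ostringstream msg;
  msg << "KokkosP: Executing " << kind << " kernel on device " << devID
      << " with unique execution identifier " << kID << '\n';
  std::string indent;
  for (const std::string& region : regions) {
    msg << "KokkosP: " << indent << region << '\n';
    indent += "  ";
  }
  msg << "KokkosP: " << indent << name << '\n';
  return msg.str();
}

// Assign a fresh identifier and print the launch message.
void begin_kernel(const char* kind, const char* name, uint32_t devID,
                  uint64_t* kID) {
  assert(kID != nullptr);
  *kID = next_id++;
  // Flush immediately so the message is visible even if the kernel crashes.
  std::cout << describe_kernel(kind, name, devID, *kID) << std::flush;
}

void end_kernel(uint64_t kID) {
  std::cout << "KokkosP: Execution of kernel " << kID << " is completed.\n"
            << std::flush;
}

}  // namespace sketch

// Callbacks that Kokkos looks up in the tool library (assumed signatures).
extern "C" {

void kokkosp_push_profile_region(const char* name) {
  std::cout << "KokkosP: Entering profiling region: " << name << '\n'
            << std::flush;
  sketch::regions.push_back(name);
}

void kokkosp_pop_profile_region() {
  if (sketch::regions.empty()) return;  // ignore unbalanced pops
  std::cout << "KokkosP: Exiting profiling region: " << sketch::regions.back()
            << '\n' << std::flush;
  sketch::regions.pop_back();
}

void kokkosp_begin_parallel_for(const char* name, const uint32_t devID,
                                uint64_t* kID) {
  sketch::begin_kernel("parallel-for", name, devID, kID);
}

void kokkosp_end_parallel_for(const uint64_t kID) { sketch::end_kernel(kID); }

void kokkosp_begin_parallel_reduce(const char* name, const uint32_t devID,
                                   uint64_t* kID) {
  sketch::begin_kernel("parallel-reduce", name, devID, kID);
}

void kokkosp_end_parallel_reduce(const uint64_t kID) { sketch::end_kernel(kID); }

}  // extern "C"
```

Flushing after every message is the key design choice in this sketch: if the application aborts inside a kernel, the last "Executing" line has already reached the terminal.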

Compilation

Type "make" inside the source directory. To build for a specific platform, edit the Makefile to select the correct compiler and compiler flags.

Usage

This is a standard tool that does not yet support tool chaining. In Bash, run:

export KOKKOS_TOOLS_LIBS={PATH_TO_TOOL_DIRECTORY}/kp_kernel_logger.so
./application COMMANDS

This tool does not require any additional resources.

Output

The tool outputs to the terminal at runtime.

Example Output

Consider the following code:

#include <Kokkos_Core.hpp>
#include <cstdio>  // printf

int main(int argc, char* argv[]) {
  Kokkos::initialize(argc,argv);
  {  // Scope the Views so they are deallocated before Kokkos::finalize()
    int N = 10000000;

    Kokkos::Profiling::pushRegion("Initialization");
    Kokkos::View<double*> a("A",N);
    Kokkos::View<double*> b("B",N);
    Kokkos::View<double*> c("C",N);

    // Unnamed kernel: the logger prints the compiler-generated type name for it
    Kokkos::parallel_for(N, KOKKOS_LAMBDA (const int& i) {
      a(i) = 1.0*i;
      b(i) = 1.5*i;
      c(i) = 0.0;
    });
    Kokkos::Profiling::popRegion();

    double result = 0.0;
    Kokkos::Profiling::pushRegion("MainLoop");
    for(int k = 0; k<5; k++) {
      Kokkos::Profiling::pushRegion("Iteration");
      // Named kernels appear in the log under their labels "AXPB" and "Dot"
      Kokkos::parallel_for("AXPB", N, KOKKOS_LAMBDA (const int& i) {
        c(i) = 1.0*k*a(i) + b(i);
      });

      double dot;
      Kokkos::parallel_reduce("Dot", N, KOKKOS_LAMBDA (const int& i, double& lsum) {
        lsum += c(i)*c(i);
      },dot);
      result += dot;
      Kokkos::Profiling::popRegion();
    }
    Kokkos::Profiling::popRegion();

    printf("Result: %lf\n",result);
  }
  Kokkos::finalize();
  return 0;
}

This prints:

KokkosP: Library Loaded: /home/crtrott/kokkos-tools/src/tools/kernel-logger/kp_kernel_logger.so
KokkosP: Library Initialized (sequence is 0, version: 20150628)
KokkosP: Entering profiling region: Initialization
KokkosP: Executing parallel-for kernel on device 0 with unique execution identifier 0
KokkosP: Initialization
KokkosP:   18__nv_hdl_wrapper_tI11__nv_dl_tagIPFiiPPcEXadL_Z4mainEELj1EEFvRKiEJN6Kokkos4ViewIPdJEEESC_SC_EE
KokkosP: Execution of kernel 0 is completed.
KokkosP: Exiting profiling region: Initialization
KokkosP: Entering profiling region: MainLoop
KokkosP: Entering profiling region: Iteration
KokkosP: Executing parallel-for kernel on device 0 with unique execution identifier 1
KokkosP: MainLoop
KokkosP:   Iteration
KokkosP:     AXPB
KokkosP: Execution of kernel 1 is completed.
KokkosP: Executing parallel-reduce kernel on device 0 with unique execution identifier 2
KokkosP: MainLoop
KokkosP:   Iteration
KokkosP:     Dot
KokkosP: Execution of kernel 2 is completed.
KokkosP: Exiting profiling region: Iteration
KokkosP: Entering profiling region: Iteration
KokkosP: Executing parallel-for kernel on device 0 with unique execution identifier 3
KokkosP: MainLoop
KokkosP:   Iteration
KokkosP:     AXPB
KokkosP: Execution of kernel 3 is completed.
KokkosP: Executing parallel-reduce kernel on device 0 with unique execution identifier 4
KokkosP: MainLoop
KokkosP:   Iteration
KokkosP:     Dot
KokkosP: Execution of kernel 4 is completed.
KokkosP: Exiting profiling region: Iteration
KokkosP: Entering profiling region: Iteration
KokkosP: Executing parallel-for kernel on device 0 with unique execution identifier 5
KokkosP: MainLoop
KokkosP:   Iteration
KokkosP:     AXPB
KokkosP: Execution of kernel 5 is completed.
KokkosP: Executing parallel-reduce kernel on device 0 with unique execution identifier 6
KokkosP: MainLoop
KokkosP:   Iteration
KokkosP:     Dot
KokkosP: Execution of kernel 6 is completed.
KokkosP: Exiting profiling region: Iteration
KokkosP: Entering profiling region: Iteration
KokkosP: Executing parallel-for kernel on device 0 with unique execution identifier 7
KokkosP: MainLoop
KokkosP:   Iteration
KokkosP:     AXPB
KokkosP: Execution of kernel 7 is completed.
KokkosP: Executing parallel-reduce kernel on device 0 with unique execution identifier 8
KokkosP: MainLoop
KokkosP:   Iteration
KokkosP:     Dot
KokkosP: Execution of kernel 8 is completed.
KokkosP: Exiting profiling region: Iteration
KokkosP: Entering profiling region: Iteration
KokkosP: Executing parallel-for kernel on device 0 with unique execution identifier 9
KokkosP: MainLoop
KokkosP:   Iteration
KokkosP:     AXPB
KokkosP: Execution of kernel 9 is completed.
KokkosP: Executing parallel-reduce kernel on device 0 with unique execution identifier 10
KokkosP: MainLoop
KokkosP:   Iteration
KokkosP:     Dot
KokkosP: Execution of kernel 10 is completed.
KokkosP: Exiting profiling region: Iteration
KokkosP: Exiting profiling region: MainLoop
Result: 23749996437500111880192.000000
KokkosP: Kokkos library finalization called.
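
When an application crashes, the kernel of interest is the one whose "Executing" line has no matching "is completed" line. As a rough sketch of that reasoning, the helper below scans a captured log and returns the identifiers of kernels that started but never finished. The function name unfinished_kernels is hypothetical, and the sketch assumes the log lines follow the format shown above.

```cpp
#include <cctype>
#include <cstdint>
#include <set>
#include <sstream>
#include <string>
#include <vector>

// Return the ids of kernels that were started but never reported as completed.
// Assumes the message format shown in the example output above.
std::vector<uint64_t> unfinished_kernels(const std::string& log) {
  static const std::string begin_tag = "with unique execution identifier ";
  static const std::string end_tag = "KokkosP: Execution of kernel ";

  // True if the text starts with a digit, so std::stoull cannot throw
  // on a line that was truncated by a crash.
  auto starts_with_digit = [](const std::string& text) {
    return !text.empty() && std::isdigit(static_cast<unsigned char>(text[0]));
  };

  std::set<uint64_t> open;  // kernels seen starting but not yet completed
  std::istringstream in(log);
  std::string line;
  while (std::getline(in, line)) {
    const std::string::size_type begin_pos = line.find(begin_tag);
    if (begin_pos != std::string::npos) {
      const std::string rest = line.substr(begin_pos + begin_tag.size());
      if (starts_with_digit(rest)) open.insert(std::stoull(rest));
    } else if (line.compare(0, end_tag.size(), end_tag) == 0) {
      const std::string rest = line.substr(end_tag.size());
      if (starts_with_digit(rest)) open.erase(std::stoull(rest));
    }
  }
  return std::vector<uint64_t>(open.begin(), open.end());
}
```

For the complete log above the result is empty. If the run had aborted inside the second "Dot" reduction, the result would contain only the identifier 4, and the region and kernel names printed right after that "Executing" line would locate the crash.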