I was working on clang-tidy checks during the hack session today and ran into something interesting that I think we should discuss. The `clang-analyzer-core.UndefinedBinaryOperatorResult` check flags binary operations whose operands might be uninitialized when they are used, so structures like this tend to trigger it:
    int o1, o2, o3;
    switch (dir) {
      case 0:
        o1 = grid_enum::momentum_x;
        o2 = grid_enum::momentum_y;
        o3 = grid_enum::momentum_z;
        break;
      case 1:
        o1 = grid_enum::momentum_y;
        o2 = grid_enum::momentum_z;
        o3 = grid_enum::momentum_x;
        break;
      case 2:
        o1 = grid_enum::momentum_z;
        o2 = grid_enum::momentum_x;
        o3 = grid_enum::momentum_y;
        break;
    }
    // the check flags the usage of o1 on this line
    velocity_x = dev_conserved[o1 * n_cells + id];
The check is technically correct to flag this: `o1` would be uninitialized if `dir` were outside 0-2, even though we know it never is. An easy fix is for every kernel with a structure like this to take `dir` as a template argument instead of a regular argument, which lets the compiler resolve the switch statement at compile time.
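To make the idea concrete, here is a minimal host-side sketch of what taking `dir` as a template argument could look like. It is not the actual kernel code: the `grid_enum` values, `MomentumOrder`, `momentum_order`, `load_first_momentum`, and `load_first_momentum_runtime` are all hypothetical names made up for illustration, and a real CUDA/HIP kernel would be a templated `__global__` function rather than a plain function.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical stand-in for the project's field indices; the real values may differ.
namespace grid_enum {
enum : int { density = 0, momentum_x = 1, momentum_y = 2, momentum_z = 3 };
}

// Hypothetical helper type holding the permuted momentum indices.
struct MomentumOrder {
  int o1, o2, o3;
};

// dir is a compile-time constant, so an out-of-range value is a compile error
// and every instantiation returns fully initialized indices.
template <int dir>
constexpr MomentumOrder momentum_order() {
  static_assert(dir >= 0 && dir <= 2, "dir must be 0, 1, or 2");
  return dir == 0 ? MomentumOrder{grid_enum::momentum_x, grid_enum::momentum_y, grid_enum::momentum_z}
       : dir == 1 ? MomentumOrder{grid_enum::momentum_y, grid_enum::momentum_z, grid_enum::momentum_x}
                  : MomentumOrder{grid_enum::momentum_z, grid_enum::momentum_x, grid_enum::momentum_y};
}

// Stand-in for the kernel body: reads the first momentum component for cell id.
template <int dir>
double load_first_momentum(const std::vector<double>& dev_conserved,
                           std::size_t n_cells, std::size_t id) {
  constexpr MomentumOrder order = momentum_order<dir>();
  return dev_conserved[static_cast<std::size_t>(order.o1) * n_cells + id];
}

// Callers that only know dir at run time need one dispatch switch at the call site.
inline double load_first_momentum_runtime(int dir,
                                          const std::vector<double>& dev_conserved,
                                          std::size_t n_cells, std::size_t id) {
  switch (dir) {
    case 0: return load_first_momentum<0>(dev_conserved, n_cells, id);
    case 1: return load_first_momentum<1>(dev_conserved, n_cells, id);
    case 2: return load_first_momentum<2>(dev_conserved, n_cells, id);
    default: throw std::invalid_argument("dir must be 0, 1, or 2");
  }
}
```

A call site that already knows the direction would use `load_first_momentum<0>(dev_conserved, n_cells, id)` directly. The trade-off in this sketch is that the range check moves out of the kernel and into a single dispatch point, at the cost of compiling three instantiations of each kernel.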
While I don't think it's worth converting the kernels just for this check, I tried it on a single kernel (the new PPMC_VL kernel), since templates can improve the performance of control statements by evaluating them at compile time. I didn't expect it to help, but it did: execution time dropped by ~4 ms per time step, a ~2% improvement. If these results hold when applied to other kernels, that could be an easy 10% performance improvement for very little effort. What do you all think?