Integration PR followups: make_workdiv, uniform_elements, concrete kernel dimensions #141

ariostas · 2024-12-17T19:46:51Z

I started to address some follow-up tasks in #75. In particular:

Switch to cms::alpakatools::makeworkdir instead of our custom createWorkDiv.
Switch to cms::alpakatools::uniform_elements for kernel loops.
Don't set a custom work division for CPU.
Started moving towards the removal of kVerticalModuleSlope.
Use concrete kernel dimensions instead of templated ones.

ariostas · 2024-12-17T19:49:14Z

I think the plots might look a bit different now that the work division is different, but hopefully they are still run-to-run reproducible (I'll check).

For now, I'm just making sure I didn't break something.

/run standalone
/run checks

github-actions · 2024-12-17T20:04:03Z

The PR was built and ran successfully in standalone mode. Here are some of the comparison plots.

The full set of validation and comparison plots can be found here.

Here is a timing comparison:

   Evt    Hits       MD       LS      T3       T5       pLS       pT5      pT3      TC       Reset    Event     Short             Rate
   avg     46.8    400.0    187.9    153.9    166.6    515.0    123.0    232.9    146.5      3.9    1976.4    1414.6+/- 393.7     525.2   explicit[s=4] (target branch)
   avg     46.5    386.1    190.4    157.7    165.2    694.5      8.8     11.8    160.6      3.4    1824.9    1084.0+/- 241.8     482.7   explicit[s=4] (this PR)

ariostas · 2024-12-17T20:05:30Z

Welp, I did break something. I'll look into it

ariostas · 2024-12-17T20:40:29Z

Hopefully it's fine now.

/run standalone

github-actions · 2024-12-17T20:54:59Z

The PR was built and ran successfully in standalone mode. Here are some of the comparison plots.

The full set of validation and comparison plots can be found here.

Here is a timing comparison:

   Evt    Hits       MD       LS      T3       T5       pLS       pT5      pT3      TC       Reset    Event     Short             Rate
   avg     45.8    394.5    194.2    150.0    164.7    514.4    121.9    236.0    145.2      3.7    1970.5    1410.2+/- 392.1     519.0   explicit[s=4] (target branch)
   avg     44.4    385.7    188.4    155.9    149.7    702.5    127.3    256.8    177.9      3.7    2192.4    1445.4+/- 400.5     577.8   explicit[s=4] (this PR)

ariostas · 2024-12-17T21:04:38Z

The plots match perfectly, which is nice. There was a significant increase in timing, especially for pLS, but it seems like this is only for CPU. This is how the timing compares in cgpu-1.

This PR (aa82696)
Total Timing Summary
Average time for map loading = 571.866 ms
Average time for input loading = 7550.76 ms
Average time for lst::Event creation = 0.00314486 ms
   Evt    Hits       MD       LS      T3       T5       pLS       pT5      pT3      TC       Reset    Event     Short             Rate
   avg     11.7      0.5      0.3      1.4      1.2      0.4      0.8      0.6      1.1      0.0      18.1       6.0+/-  1.2      20.1   explicit[s=1]
   avg      2.5      0.6      0.5      1.8      1.4      0.4      1.2      0.7      1.6      0.0      10.8       7.8+/-  1.6       6.5   explicit[s=2]
   avg      4.6      1.0      0.8      2.8      2.1      0.6      2.4      1.5      2.7      0.0      18.5      13.3+/-  2.8       5.3   explicit[s=4]
   avg      6.5      1.6      1.1      3.8      3.2      0.7      3.2      1.8      3.9      0.0      25.9      18.6+/-  4.1       4.8   explicit[s=6]
   avg      8.1      2.2      1.5      4.9      4.2      0.9      3.8      2.1      5.2      0.0      32.9      23.9+/-  4.6       4.6   explicit[s=8]

master (9628e8f)
Total Timing Summary
Average time for map loading = 566.258 ms
Average time for input loading = 7584.02 ms
Average time for lst::Event creation = 0.0034968 ms
   Evt    Hits       MD       LS      T3       T5       pLS       pT5      pT3      TC       Reset    Event     Short             Rate
   avg     13.1      0.5      0.3      1.5      1.2      0.4      1.3      1.0      1.3      0.0      20.7       7.2+/-  1.3      22.8   explicit[s=1]
   avg      2.5      0.6      0.5      1.8      1.4      0.5      1.1      0.7      1.8      0.0      10.8       7.9+/-  1.6       6.5   explicit[s=2]
   avg      3.4      1.0      0.8      2.8      2.2      0.6      2.0      1.1      3.0      0.0      16.9      13.0+/-  2.8       4.9   explicit[s=4]
   avg      5.7      1.6      1.2      4.0      3.3      0.7      3.0      1.6      4.4      0.0      25.4      18.9+/-  3.4       4.7   explicit[s=6]
   avg     10.0      1.9      1.3      4.8      4.7      0.9      4.3      2.4      5.5      0.0      35.8      24.9+/-  5.1       5.0   explicit[s=8]

RecoTracker/LSTCore/src/alpaka/LSTEvent.dev.cc

ariostas · 2024-12-18T21:51:41Z

Sorry for all the force-pushing.

I made the code compatible with both kVerticalModuleSlope and infinity, so that we can change the data files without breaking anything. Once that is done we can fully remove it.

I think the last low hanging fruit that I'll include here is to set concrete dimensions instead of templated types for kernels and alpaka functions.

RecoTracker/LSTCore/src/alpaka/LSTEvent.dev.cc

RecoTracker/LSTCore/src/alpaka/Segment.h

slava77 · 2024-12-20T23:06:36Z

/run all

github-actions · 2024-12-20T23:13:50Z

There was a problem while building and running in standalone mode. The logs can be found here.

github-actions · 2024-12-20T23:16:37Z

There was a problem while building and running with CMSSW. The logs can be found here.

slava77 · 2024-12-20T23:32:57Z

There was a problem while building and running in standalone mode. The logs can be found here.

I couldn't parse the error to the point of understanding where a fix is needed

RecoTracker/LSTCore/src/alpaka/Hit.h:87:63:   required from here
/cvmfs/cms.cern.ch/el8_amd64_gcc12/external/alpaka/1.1.0-aba90e6b0efd37975ff68417e953fa78/include/alpaka/workdiv/Traits.hpp:36:81: 
error: incomplete type 
'alpaka::trait::GetWorkDiv<alpaka::WorkDivUniformCudaHipBuiltIn<std::integral_constant<long unsigned int, 1>, unsigned int>, alpaka::origin::Thread, alpaka::unit::Elems, void>'
 used in nested name specifier
   36 |         return trait::GetWorkDiv<ImplementationBase, TOrigin, TUnit>::getWorkDiv(workDiv);
      |                ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~

ariostas · 2024-12-23T14:58:52Z

We had some issues with our #includes since GPU code was showing up in CPU-only parts (without all the Alpaka flags being set appropriately). I didn't notice it because it compiles fine for the CPU backend, which is what I was using for testing. Should be all good now.

/run all

github-actions · 2024-12-23T15:14:39Z

The PR was built and ran successfully in standalone mode. Here are some of the comparison plots.

The full set of validation and comparison plots can be found here.

Here is a timing comparison:

   Evt    Hits       MD       LS      T3       T5       pLS       pT5      pT3      TC       Reset    Event     Short             Rate
   avg     46.5    393.0    188.5    157.4    164.4    508.9    122.1    230.1    146.0      3.7    1960.7    1405.3+/- 388.9     519.0   explicit[s=4] (target branch)
   avg     45.1    388.7    189.8    156.2    153.5    701.3    126.9    248.2    176.3      3.4    2189.2    1442.9+/- 393.2     575.6   explicit[s=4] (this PR)

github-actions · 2024-12-23T16:29:58Z

The PR was built and ran successfully with CMSSW. Here are some plots.

OOTB All Tracks

The full set of validation and comparison plots can be found here.

slava77 · 2024-12-23T18:38:12Z

I updated the master branch after the merge of #140 .
There is a conflict now in RecoTracker/LSTCore/interface/alpaka/Common.h

ariostas · 2024-12-30T14:53:08Z

There is a conflict now in RecoTracker/LSTCore/interface/alpaka/Common.h

Thank you, Slava. It's fixed now.

/run all

github-actions · 2024-12-30T14:59:34Z

There was a problem while building and running in standalone mode. The logs can be found here.

github-actions · 2024-12-30T15:23:29Z

The PR was built and ran successfully in standalone mode. Here are some of the comparison plots.

The full set of validation and comparison plots can be found here.

Here is a timing comparison:

   Evt    Hits       MD       LS      T3       T5       pLS       pT5      pT3      TC       Reset    Event     Short             Rate
   avg     45.3    391.9    188.6    151.9    146.0    543.9    121.6    234.5    151.9      3.3    1979.0    1389.8+/- 385.8     523.7   explicit[s=4] (target branch)
   avg     47.1    388.3    194.2    156.3    152.6    701.7    127.6    245.6    177.0      4.1    2194.4    1445.6+/- 397.1     580.8   explicit[s=4] (this PR)

github-actions · 2024-12-30T17:12:54Z

The PR was built and ran successfully with CMSSW. Here are some plots.

OOTB All Tracks

The full set of validation and comparison plots can be found here.

slava77 · 2025-01-03T22:42:23Z

RecoTracker/LSTCore/standalone/code/core/write_lst_ntuple.cc


 #include "write_lst_ntuple.h"

 using namespace ALPAKA_ACCELERATOR_NAMESPACE::lst;

+// copied from Tripelet.h
+float computeRadiusFromThreeAnchorHits(float x1, float y1, float x2, float y2, float x3, float y3, float& g, float& f) {


I'd rather we do not proliferate copy-pasted code, especially complex cases. Was the reason to avoid include of kernels code?
Perhaps move this method to e.g. RecoTracker/LSTCore/interface/Circle.h and use in the Triplet.h as well

ariostas force-pushed the ariostas/integration_pr_followups branch from 17d4223 to aa82696 Compare December 17, 2024 20:39

slava77 reviewed Dec 17, 2024

View reviewed changes

RecoTracker/LSTCore/src/alpaka/LSTEvent.dev.cc Outdated Show resolved Hide resolved

slava77 reviewed Dec 17, 2024

View reviewed changes

RecoTracker/LSTCore/src/alpaka/LSTEvent.dev.cc Outdated Show resolved Hide resolved

ariostas force-pushed the ariostas/integration_pr_followups branch 3 times, most recently from 7278857 to 9444e18 Compare December 18, 2024 21:47

GNiendorf reviewed Dec 19, 2024

View reviewed changes

RecoTracker/LSTCore/src/alpaka/LSTEvent.dev.cc Show resolved Hide resolved

slava77 reviewed Dec 20, 2024

View reviewed changes

RecoTracker/LSTCore/src/alpaka/Segment.h Show resolved Hide resolved

ariostas force-pushed the ariostas/integration_pr_followups branch from 2a034e8 to 0da8474 Compare December 20, 2024 16:23

ariostas changed the title ~~Integration PR followups: make_workdiv, uniform_elements, ...~~ Integration PR followups: make_workdiv, uniform_elements, concrete kernel dimensions Dec 20, 2024

ariostas marked this pull request as ready for review December 20, 2024 16:27

ariostas force-pushed the ariostas/integration_pr_followups branch from f6a499c to 4f9d7f9 Compare December 30, 2024 14:44

Use make_workdiv and uniform_elements

0b3a7bb

ariostas added 4 commits December 30, 2024 07:06

Use int16_t for hitRanges counters

9da0db8

Started removal of kVerticalModuleSlope

d042c75

Added concrete dimensions to kernels

ab6a5d6

Fixed include issues

c8b78fd

ariostas force-pushed the ariostas/integration_pr_followups branch from 4f9d7f9 to c8b78fd Compare December 30, 2024 15:07

slava77 reviewed Jan 3, 2025

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Integration PR followups: make_workdiv, uniform_elements, concrete kernel dimensions #141

Integration PR followups: make_workdiv, uniform_elements, concrete kernel dimensions #141

ariostas commented Dec 17, 2024 •

edited

Loading

ariostas commented Dec 17, 2024 •

edited

Loading

github-actions bot commented Dec 17, 2024

ariostas commented Dec 17, 2024

ariostas commented Dec 17, 2024

github-actions bot commented Dec 17, 2024

ariostas commented Dec 17, 2024

ariostas commented Dec 18, 2024

slava77 commented Dec 20, 2024

github-actions bot commented Dec 20, 2024

github-actions bot commented Dec 20, 2024

slava77 commented Dec 20, 2024

ariostas commented Dec 23, 2024 •

edited

Loading

github-actions bot commented Dec 23, 2024

github-actions bot commented Dec 23, 2024

slava77 commented Dec 23, 2024

ariostas commented Dec 30, 2024

github-actions bot commented Dec 30, 2024

github-actions bot commented Dec 30, 2024

github-actions bot commented Dec 30, 2024

slava77 Jan 3, 2025

Integration PR followups: make_workdiv, uniform_elements, concrete kernel dimensions #141

Are you sure you want to change the base?

Integration PR followups: make_workdiv, uniform_elements, concrete kernel dimensions #141

Conversation

ariostas commented Dec 17, 2024 • edited Loading

ariostas commented Dec 17, 2024 • edited Loading

github-actions bot commented Dec 17, 2024

ariostas commented Dec 17, 2024

ariostas commented Dec 17, 2024

github-actions bot commented Dec 17, 2024

ariostas commented Dec 17, 2024

ariostas commented Dec 18, 2024

slava77 commented Dec 20, 2024

github-actions bot commented Dec 20, 2024

github-actions bot commented Dec 20, 2024

slava77 commented Dec 20, 2024

ariostas commented Dec 23, 2024 • edited Loading

github-actions bot commented Dec 23, 2024

github-actions bot commented Dec 23, 2024

slava77 commented Dec 23, 2024

ariostas commented Dec 30, 2024

github-actions bot commented Dec 30, 2024

github-actions bot commented Dec 30, 2024

github-actions bot commented Dec 30, 2024

slava77 Jan 3, 2025

Choose a reason for hiding this comment

ariostas commented Dec 17, 2024 •

edited

Loading

ariostas commented Dec 17, 2024 •

edited

Loading

ariostas commented Dec 23, 2024 •

edited

Loading