Add occupancy branches + standalone improvements (rebase PR45, PR36, and PR55) #127

Merged
28 changes: 14 additions & 14 deletions RecoTracker/LSTCore/src/LSTESData.cc
@@ -10,25 +10,25 @@

namespace {
std::string geometryDataDir() {
const char* path_lst_base = std::getenv("LST_BASE");
std::string path_str, path;
const char* path_tracklooperdir = std::getenv("TRACKLOOPERDIR");
std::string path_str;
if (path_lst_base != nullptr) {
path_str = path_lst_base;
} else if (path_tracklooperdir != nullptr) {
std::stringstream search_path;
search_path << std::getenv("CMSSW_SEARCH_PATH");
Comment on lines +15 to +16
Member Author

I changed the initialization here since the IB was complaining about a nullptr.


while (std::getline(search_path, path, ':')) {
if (std::filesystem::exists(path + "/RecoTracker/LSTCore/data")) {
path_str = path;
break;
}
}

if (path_str.empty()) {
path_str = path_tracklooperdir;
path_str += "/../";
path_str += "/..";
} else {
std::stringstream search_path(std::getenv("CMSSW_SEARCH_PATH"));
std::string path;
while (std::getline(search_path, path, ':')) {
if (std::filesystem::exists(path + "/RecoTracker/LSTCore/data")) {
path_str = path;
break;
}
}
path_str += "/RecoTracker/LSTCore";
}

return path_str;
}
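
The lookup order this hunk appears to implement (search `CMSSW_SEARCH_PATH` first, then fall back to `TRACKLOOPERDIR`) can be sketched in shell as a quick way to check which directory would be picked up. This is only an illustration read off the flattened diff, not the actual implementation: `find_lst_base` is a hypothetical helper name, and returning an error when neither variable is set is an assumption of this sketch rather than something the C++ code is shown to do.

```shell
#!/usr/bin/env bash
# Illustrative sketch only: mirrors the apparent lookup order of geometryDataDir().
# find_lst_base is a hypothetical name, not part of LST or CMSSW.
find_lst_base() {
  # Treat an unset CMSSW_SEARCH_PATH as empty instead of dereferencing nothing.
  local search_path="${CMSSW_SEARCH_PATH:-}"
  local -a entries
  local entry

  # Split the colon-separated search path into an array.
  IFS=':' read -r -a entries <<< "$search_path"

  for entry in "${entries[@]}"; do
    if [ -e "$entry/RecoTracker/LSTCore/data" ]; then
      printf '%s\n' "$entry/RecoTracker/LSTCore"
      return 0
    fi
  done

  # Fall back to the standalone layout when nothing was found in the search path.
  if [ -n "${TRACKLOOPERDIR:-}" ]; then
    printf '%s\n' "$TRACKLOOPERDIR/.."
    return 0
  fi

  # Assumption of this sketch: report failure when no location is known.
  return 1
}
```

Calling `find_lst_base` in a shell where `cmsenv` or the standalone `setup.sh` has been run prints the directory the sketch would resolve to.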

276 changes: 85 additions & 191 deletions RecoTracker/LSTCore/standalone/README.md
@@ -1,53 +1,93 @@
# TrackLooper
# How to set up standalone LST


## Quick Start


### Setting up LSTPerformanceWeb (only for lnx7188 and lnx4555)
## Setting up LSTPerformanceWeb (only for lnx7188 and lnx4555)

For lnx7188 and lnx4555 this needs to be done once

cd /cdat/tem/${USER}/
git clone git@github.com:SegmentLinking/LSTPerformanceWeb.git

### Setting up container (only for lnx7188)
## Setting up container (only for lnx7188)

For lnx7188 this needs to be done before compiling or running the code:

singularity shell --nv --bind /mnt/data1:/data --bind /data2/segmentlinking/ --bind /opt --bind /nfs --bind /mnt --bind /usr/local/cuda/bin/ --bind /cvmfs /cvmfs/unpacked.cern.ch/registry.hub.docker.com/cmssw/el8:x86_64

### Setting up the code
## Setting up LST

There are two ways to set up LST as a standalone: either by setting up a full CMSSW area, which provides a unified setup for standalone and CMSSW tests, or by using `sparse-checkout` to fetch only the relevant packages and using them independently of CMSSW. A CVMFS-less setup is also provided for the second option.

### Setting up LST within CMSSW (preferred option)

```bash
CMSSW_VERSION=CMSSW_14_2_0_pre4 # Change with latest/preferred CMSSW version
cmsrel ${CMSSW_VERSION}
cd ${CMSSW_VERSION}/src/
cmsenv
git cms-init
# If necessary, add the remote git@github.com:SegmentLinking/cmssw.git
# and checkout a development/feature branch
git cms-addpkg RecoTracker/LST RecoTracker/LSTCore
# If modifying some dependencies, run `git cms-checkdeps -a -A`
scram b -j 12
cd RecoTracker/LSTCore/standalone
```

The data files for LST will be fetched from CVMFS. However, if new data files are needed, they need to be manually placed under `$CMSSW_BASE/external/$SCRAM_ARCH/data/RecoTracker/LSTCore/data/`. This is done by running:

git clone git@github.com:SegmentLinking/TrackLooper.git
cd TrackLooper/
# Source one of the commands below, depending on the site
source setup.sh # if on UCSD or Cornell
source setup_hpg.sh # if on Florida
```bash
mkdir -p $CMSSW_BASE/external/$SCRAM_ARCH/data/RecoTracker/LSTCore/
cd $CMSSW_BASE/external/$SCRAM_ARCH/data/RecoTracker/LSTCore/
git clone git@github.com:cms-data/RecoTracker-LSTCore.git data
<modify the files or checkout a different branch>
cd -
```

### Running the code
### Setting up LST outside of CMSSW

sdl_make_tracklooper -mc
sdl_<backend> -i PU200 -o LSTNtuple.root
For this setup, dependencies are still provided from CMSSW through CVMFS, but no CMSSW area is set up. This is done by running the following commands.

``` bash
LST_BRANCH=master # Change to the development branch
git clone --filter=blob:none --no-checkout --depth 1 --sparse --branch ${LST_BRANCH} https://github.com/SegmentLinking/cmssw.git TrackLooper
cd TrackLooper
git sparse-checkout add RecoTracker/LSTCore
git checkout
cd RecoTracker/LSTCore/standalone/
```

As in the section above, the data files are fetched from CVMFS, but they can also be copied manually under `RecoTracker/LSTCore/data/`.


## Running the code

Each time the standalone version of LST is to be used, the following command should be run from the `RecoTracker/LSTCore/standalone` directory:
```bash
source setup.sh
```

For running the code:

lst_make_tracklooper -m
lst_<backend> -i PU200 -o LSTNtuple.root
createPerfNumDenHists -i LSTNtuple.root -o LSTNumDen.root
lst_plot_performance.py LSTNumDen.root -t "myTag"
# python3 efficiency/python/lst_plot_performance.py LSTNumDen.root -t "myTag" # if you are on cgpu-1 or Cornell
lst_plot_performance.py LSTNumDen.root -t "myTag" # or
python3 efficiency/python/lst_plot_performance.py LSTNumDen.root -t "myTag" # if you are on cgpu-1 or Cornell

The above can be simplified even further:

sdl_run -f -mc -s PU200 -n -1 -t myTag
lst_run -f -m -s PU200 -n -1 -t myTag

The `-f` flag can be omitted when the code has already been compiled. If multiple backends were compiled, then the `-b` flag can be used to specify a backend. For example

sdl_run -b cpu -s PU200 -n -1 -t myTag
lst_run -b cpu -s PU200 -n -1 -t myTag
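
When several backends have been compiled, the `-b` flag described above lends itself to a small loop. The following is a minimal sketch that only prints the commands, so they can be inspected before anything is run. `lst_run_commands` is a hypothetical helper, appending the backend name to the tag is a choice made for this example, and any backend name other than `cpu` is an assumption that should be checked against what was actually built.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print one lst_run command per requested backend.
# Usage: lst_run_commands <tag> <backend> [<backend> ...]
lst_run_commands() {
  local tag="$1"
  shift
  local backend
  for backend in "$@"; do
    # Same flags as in the example above; the tag gets a per-backend suffix
    # so that the outputs of different backends do not share one tag.
    printf 'lst_run -b %s -s PU200 -n -1 -t %s_%s\n' "$backend" "$tag" "$backend"
  done
}
```

For example, `lst_run_commands myTag cpu` prints `lst_run -b cpu -s PU200 -n -1 -t myTag_cpu`. Piping the output to `bash` would execute the printed commands in sequence.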

## Command explanations
### Command explanations

Compile the code with option flags. If none of `C,G,R,A` are used, then it defaults to compiling for CUDA and CPU.

sdl_make_tracklooper -mc
lst_make_tracklooper -m
-m: make clean binaries
-c: run with the cmssw caching allocator
-C: compile CPU backend
-G: compile CUDA backend
-R: compile ROCm backend
@@ -56,7 +96,7 @@ Compile the code with option flags. If none of `C,G,R,A` are used, then it defau

Run the code

sdl_<backend> -n <nevents> -v <verbose> -w <writeout> -s <streams> -i <dataset> -o <output>
lst_<backend> -n <nevents> -v <verbose> -w <writeout> -s <streams> -i <dataset> -o <output>

-i: PU200; muonGun, etc
-n: number of events; default: all
@@ -106,186 +146,40 @@ Comparing two different runs
-t "mywork" \
--compare

## CMSSW Integration
This is a complete set of instructions on how the TrackLooper code
can be linked as an external tool in CMSSW:

### Build TrackLooper
```bash
git clone git@github.com:SegmentLinking/TrackLooper.git
cd TrackLooper/
# Source one of the commands below, depending on the site
source setup.sh # if on UCSD or Cornell
source setup_hpg.sh # if on Florida
sdl_make_tracklooper -mc
cd ..
```

### Set up `TrackLooper` as an external
```bash
mkdir workingFolder # Create the folder you will be working in
cd workingFolder
cmsrel CMSSW_14_1_0_pre3
cd CMSSW_14_1_0_pre3/src
cmsenv
git cms-init
git remote add SegLink git@github.com:SegmentLinking/cmssw.git
git fetch SegLink CMSSW_14_1_0_pre3_LST_X
git cms-addpkg RecoTracker Configuration
git checkout CMSSW_14_1_0_pre3_LST_X
#To include both the CPU library and GPU library into CMSSW, create 3 xml files (headers file has no library).
#Before writing the following xml file, check that libsdl_cpu.so and libsdl_gpu.so can be found under the ../../../TrackLooper/SDL/ folder.
cat <<EOF >lst_headers.xml
<tool name="lst_headers" version="1.0">
<client>
<environment name="LSTBASE" default="$PWD/../../../TrackLooper"/>
<environment name="INCLUDE" default="\$LSTBASE"/>
</client>
<runtime name="LST_BASE" value="\$LSTBASE"/>
</tool>
EOF
cat <<EOF >lst_cpu.xml
<tool name="lst_cpu" version="1.0">
<client>
<environment name="LSTBASE" default="$PWD/../../../TrackLooper"/>
<environment name="LIBDIR" default="\$LSTBASE/SDL"/>
<environment name="INCLUDE" default="\$LSTBASE"/>
</client>
<runtime name="LST_BASE" value="\$LSTBASE"/>
<lib name="sdl_cpu"/>
</tool>
EOF
cat <<EOF >lst_cuda.xml
<tool name="lst_cuda" version="1.0">
<client>
<environment name="LSTBASE" default="$PWD/../../../TrackLooper"/>
<environment name="LIBDIR" default="\$LSTBASE/SDL"/>
<environment name="INCLUDE" default="\$LSTBASE"/>
</client>
<runtime name="LST_BASE" value="\$LSTBASE"/>
<lib name="sdl_cuda"/>
</tool>
EOF
scram setup lst_headers.xml
scram setup lst_cpu.xml
scram setup lst_cuda.xml
cmsenv
git cms-checkdeps -a -A
scram b -j 12
```

### Run the LST reconstruction in CMSSW
A simple test configuration of the LST reconstruction can be run with the command:
```bash
cmsRun RecoTracker/LST/test/LSTAlpakaTester.py
```
# How to set up CMSSW with LST

For a more complete workflow, one can run a modified version of the 21034.1 workflow.
To get the commands of this workflow, one can run:
```bash
runTheMatrix.py -w upgrade -n -e -l 21034.1
```
## Setting up the area
Collaborator

Do we even need this section? Isn't that already covered above ("Setting up LST" section)?

...I know, probably a question to myself back in the day, but maybe we can improve now...


For convenience, the workflow has been run for 100 events and the output is stored here:
```bash
/data2/segmentlinking/CMSSW_14_1_0_pre0/step2_21034.1_100Events.root
```
Follow the instructions in the ["Setting up LST within CMSSW" section](#setting-up-lst-within-cmssw-preferred-option).

For enabling the LST reconstruction in the CMSSW tracking workflow, a modified step3 needs to be run.
This is based on the step3 command of the 21034.1 workflow with the following changes:
- Remove the `--pileup_input` and `--pileup` flags.
- The number of threads and streams for the job can be optionally controlled by the `--nThreads` and `--nStreams` command line options respectively (`1` ends up being the actual default value for both, and more info can be found by running `cmsDriver.py --help`).
- Add at the end of the command: `--procModifiers gpu,trackingLST,trackingIters01 --no_exec`

Run the command and modify the output configuration file with the following:
- To run a CPU version, remove `gpu` from the line defining the `process` object:
```python
process = cms.Process('RECO',...,gpu,...)
```
- Add the following lines below the part where the import of the standard configurations happens:
```python
process.load('Configuration.StandardSequences.Accelerators_cff')
process.load("HeterogeneousCore.AlpakaCore.ProcessAcceleratorAlpaka_cfi")
```
- Modify the input and output file names accordingly, as well as the number of events.

Then, run the configuration file with `cmsRun`.

To get the DQM files, one would have to run step4 of the 21034.1 workflow with the following modifications:
- Add `--no_exec` to the end of command and then run it.
- Modify the output configuration file by changing the input file (the one containing `inDQM` from the previous step) and number of events accordingly.
## Run the LST reconstruction in CMSSW (read to the end, before running)

Running the configuration file with `cmsRun`, the output file will have a name starting with `DQM`. The name is the same every time this step runs,
so it is good practice to rename the file, e.g. to `tracking_Iters01LST.root`.
The MTV plots can be produced with the command:
```bash
makeTrackValidationPlots.py --extended tracking_Iters01LST.root
```
Comparison plots can be made by including multiple ROOT files as arguments.
Two complete workflows have been implemented within CMSSW to run a two-iteration, tracking-only reconstruction with LST:
- 24834.703 (CPU)
- 24834.704 (GPU)

**Note:** In case one wants to run step2 as well, similar modifications as in step4 (`--no_exec` flag and input file/number of events) need to be applied. Moreover, the PU files should be modified to point to local ones. This can be done by inserting a dummy file when running the command (set the argument of the `--pileup_input` flag to `file:file.root`), and then replacing the line defining the PU input files in the configuration with the following one:
```python
process.mix.input.fileNames = cms.untracked.vstring(['file:/data2/segmentlinking/PUSamplesForCMSSW1263/CMSSW_12_3_0_pre5/RelValMinBias_14TeV/GEN-SIM/123X_mcRun4_realistic_v4_2026D88noPU-v1/066fc95d-1cef-4469-9e08-3913973cd4ce.root', 'file:/data2/segmentlinking/PUSamplesForCMSSW1263/CMSSW_12_3_0_pre5/RelValMinBias_14TeV/GEN-SIM/123X_mcRun4_realistic_v4_2026D88noPU-v1/07928a25-231b-450d-9d17-e20e751323a1.root', 'file:/data2/segmentlinking/PUSamplesForCMSSW1263/CMSSW_12_3_0_pre5/RelValMinBias_14TeV/GEN-SIM/123X_mcRun4_realistic_v4_2026D88noPU-v1/26bd8fb0-575e-4201-b657-94cdcb633045.root', 'file:/data2/segmentlinking/PUSamplesForCMSSW1263/CMSSW_12_3_0_pre5/RelValMinBias_14TeV/GEN-SIM/123X_mcRun4_realistic_v4_2026D88noPU-v1/4206a9c5-44c2-45a5-aab2-1a8a6043a08a.root', 'file:/data2/segmentlinking/PUSamplesForCMSSW1263/CMSSW_12_3_0_pre5/RelValMinBias_14TeV/GEN-SIM/123X_mcRun4_realistic_v4_2026D88noPU-v1/55a372bf-a234-4111-8ce0-ead6157a1810.root', 'file:/data2/segmentlinking/PUSamplesForCMSSW1263/CMSSW_12_3_0_pre5/RelValMinBias_14TeV/GEN-SIM/123X_mcRun4_realistic_v4_2026D88noPU-v1/59ad346c-f405-4288-96d7-795f81c43fe8.root', 'file:/data2/segmentlinking/PUSamplesForCMSSW1263/CMSSW_12_3_0_pre5/RelValMinBias_14TeV/GEN-SIM/123X_mcRun4_realistic_v4_2026D88noPU-v1/7280f5ec-b71d-4579-a730-7ce2de0ff906.root', 'file:/data2/segmentlinking/PUSamplesForCMSSW1263/CMSSW_12_3_0_pre5/RelValMinBias_14TeV/GEN-SIM/123X_mcRun4_realistic_v4_2026D88noPU-v1/b93adc85-715f-477a-afc9-65f3241933ee.root', 'file:/data2/segmentlinking/PUSamplesForCMSSW1263/CMSSW_12_3_0_pre5/RelValMinBias_14TeV/GEN-SIM/123X_mcRun4_realistic_v4_2026D88noPU-v1/c7a0aa46-f55c-4b01-977f-34a397b71fba.root', 'file:/data2/segmentlinking/PUSamplesForCMSSW1263/CMSSW_12_3_0_pre5/RelValMinBias_14TeV/GEN-SIM/123X_mcRun4_realistic_v4_2026D88noPU-v1/e77fa467-97cb-4943-884f-6965b4eb0390.root'])
```
We will use the second one in the example below. To get the commands of this workflow, one can run:

### Inclusion of LST in other CMSSW packages
Including the line
```
<use name="lst"/>
```
in the relevant package `BuildFile.xml` allows for
including our headers in the code of that package.
runTheMatrix.py -w upgrade -n -e -l 24834.704

## Running LST in a CVMFS-less setup
For convenience, the workflow has been run for 100 events and the output is stored here:

The setup scripts included in this repository assume that the [CernVM File System (CVMFS)](https://cernvm.cern.ch/fs/) is installed. This provides a convenient way to fetch the required dependencies, but it is not necessary to run LST in standalone mode. Here, we briefly describe how to build and run it when CVMFS is not available.
/data2/segmentlinking/step2_29834.1_100Events.root

The necessary dependencies are CUDA, ROOT, the Boost libraries, Alpaka, and some CMSSW headers. CUDA, ROOT, and Boost are fairly standard libraries and are available from multiple package managers. For the remaining necessary headers you will need to clone the [Alpaka](https://github.com/alpaka-group/alpaka) and [CMSSW](https://github.com/cms-sw/cmssw) repositories. The Alpaka repository is reasonably sized, but the CMSSW one is extremely large, especially considering that we only need a tiny fraction of its files to build LST. We can get only the Alpaka interface headers from CMSSW by running the following commands.
The input files in each step may need to be properly adjusted to match the ones produced by the previous step/provided externally, hence it is better to run the commands with the `--no_exec` option included.

``` bash
git clone --filter=blob:none --no-checkout --depth 1 --sparse --branch CMSSW_14_1_X https://github.com/cms-sw/cmssw.git
cd cmssw
git sparse-checkout add HeterogeneousCore/AlpakaInterface
git checkout
```
Running the configuration file with `cmsRun`, the output file will have a name starting with `DQM`. The name is the same every time this step runs,
so it is good practice to rename the file, e.g. to `step4_24834.704.root`.
The MTV plots can be produced with the command:

Then all that is left to do is set some environment variables. We give an example of how to do this in lnx7188/cgpu-1.
makeTrackValidationPlots.py --extended step4_24834.704.root

```bash
# These two lines are only needed to set the right version of gcc and nvcc. They are not needed for standard installations.
export PATH=/cvmfs/cms.cern.ch/el8_amd64_gcc12/external/gcc/12.3.1-40d504be6370b5a30e3947a6e575ca28/bin:/cvmfs/cms.cern.ch/el8_amd64_gcc12/cms/cmssw/CMSSW_14_1_0_pre3/external/el8_amd64_gcc12/bin:$PATH
export LD_LIBRARY_PATH=/cvmfs/cms.cern.ch/el8_amd64_gcc12/cms/cmssw/CMSSW_14_1_0_pre3/biglib/el8_amd64_gcc12:/cvmfs/cms.cern.ch/el8_amd64_gcc12/cms/cmssw/CMSSW_14_1_0_pre3/lib/el8_amd64_gcc12:/cvmfs/cms.cern.ch/el8_amd64_gcc12/cms/cmssw/CMSSW_14_1_0_pre3/external/el8_amd64_gcc12/lib:/cvmfs/cms.cern.ch/el8_amd64_gcc12/external/gcc/12.3.1-40d504be6370b5a30e3947a6e575ca28/lib64:/cvmfs/cms.cern.ch/el8_amd64_gcc12/external/gcc/12.3.1-40d504be6370b5a30e3947a6e575ca28/lib:$LD_LIBRARY_PATH

# These are the lines that you need to manually change for a CVMFS-less setup.
# In this example we use cvmfs paths since that is where the dependencies are in lnx7188/cgpu1, but they can point to local directories.
export BOOST_ROOT=/cvmfs/cms.cern.ch/el8_amd64_gcc12/external/boost/1.80.0-60a217837b5db1cff00c7d88ec42f53a
export ALPAKA_ROOT=/cvmfs/cms.cern.ch/el8_amd64_gcc12/external/alpaka/1.1.0-7d0324257db47fde2d27987e7ff98fb4
export CUDA_HOME=/cvmfs/cms.cern.ch/el8_amd64_gcc12/external/cuda/12.4.1-06cde0cd9f95a73a1ea05c8535f60bde
export ROOT_ROOT=/cvmfs/cms.cern.ch/el8_amd64_gcc12/lcg/root/6.30.07-21947a33e64ceb827a089697ad72e468
export CMSSW_BASE=/cvmfs/cms.cern.ch/el8_amd64_gcc12/cms/cmssw/CMSSW_14_1_0_pre3

# These lines are needed to account for some extra environment variables that are exported in the setup script.
export LD_LIBRARY_PATH=$PWD/SDL/cuda:$PWD/SDL/cpu:$PWD:$LD_LIBRARY_PATH
export PATH=$PWD/bin:$PATH
export PATH=$PWD/efficiency/bin:$PATH
export PATH=$PWD/efficiency/python:$PATH
export TRACKLOOPERDIR=$PWD
export TRACKINGNTUPLEDIR=/data2/segmentlinking/CMSSW_12_2_0_pre2/
export LSTOUTPUTDIR=.
source $PWD/code/rooutil/thisrooutil.sh

# After this, you can compile and run LST as usual.
sdl_run -f -mc -s PU200 -n -1 -t myTag
```
Comparison plots can be made by including multiple ROOT files as arguments.
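
Since the DQM output name is the same on every run, the renaming step is easy to forget or to apply to the wrong file. A minimal sketch of a guard is shown here: `rename_dqm_output` is a hypothetical helper, and the `DQM*.root` pattern assumes the output carries a `.root` extension, which the text above does not state explicitly.

```shell
#!/usr/bin/env bash
# Hypothetical helper: rename the single DQM*.root file in the current directory.
# Usage: rename_dqm_output <new_name.root>
rename_dqm_output() {
  local target="$1"
  local f found="" count=0

  for f in DQM*.root; do
    [ -e "$f" ] || continue   # an unmatched glob stays literal, so skip it
    found="$f"
    count=$((count + 1))
  done

  # Refuse to guess when there is no candidate or more than one.
  if [ "$count" -ne 1 ]; then
    echo "expected exactly one DQM*.root file, found $count" >&2
    return 1
  fi

  mv -- "$found" "$target"
}
```

After `rename_dqm_output step4_24834.704.root`, a comparison could then be produced with `makeTrackValidationPlots.py --extended step4_24834.703.root step4_24834.704.root`, where the first file name is a hypothetical output of the CPU workflow renamed in the same way.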

## Code formatting and checking

The makefile in the `SDL` directory includes phony targets to run `clang-format` and `clang-tidy` on the code using the formatting and checks used in CMSSW. The following are the available commands.
When using the first setup option above, it is preferred to run the checks provided by CMSSW using the following commands.

- `make format`
Formats the code in the `SDL` directory using `clang-format` following the rules specified in `.clang-format`.
- `make check`
Runs `clang-tidy` on the code in the `SDL` directory to perform the checks specified in `.clang-tidy`.
- `make check-fix`
Same as `make check`, but fixes the issues that it knows how to fix.

```
scram b -j 12 code-checks >& c.log && scram b -j 12 code-format >& f.log
```