prod/pfnano: add the production script #14

Closed
katilp opened this issue Apr 4, 2023 · 6 comments

katilp commented Apr 4, 2023

Set up the PFNano production workflow

Use the container gitlab-registry.cern.ch/cms-cloud/cmssw-docker/cmssw_10_6_30-slc7_amd64_gcc700

Local testing:

docker pull gitlab-registry.cern.ch/cms-cloud/cmssw-docker/cmssw_10_6_30-slc7_amd64_gcc700
[...]
da834d3e9b02: Download complete
c94d22b651d9: Download complete
failed to register layer: Error processing tar file(exit status 1): write /cvmfs/cms.cern.ch/slc7_amd64_gcc700/cms/cmssw/CMSSW_10_6_30/src/Geometry/TrackerCommonData/data/tecring0.xml: no space left on device

This may be a local issue, although there seems to be enough disk space. Investigating...
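
A "no space left on device" error during a pull refers to the filesystem where Docker stores image layers, which is not necessarily the one holding the home directory. A minimal sketch of a pre-pull check follows. The helper name `has_free_space_gb`, the default directory `/var/lib/docker` and the 40 GB threshold are assumptions for illustration, not values from this issue.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the filesystem holding DIR has at least
# MIN_GB gigabytes available.
has_free_space_gb() {
    dir=$1
    min_gb=$2
    # df -Pk prints POSIX-format output in 1024-byte blocks;
    # column 4 of the second line is the available space.
    avail_kb=$(df -Pk "$dir" | awk 'NR==2 {print $4}')
    [ "$avail_kb" -ge $((min_gb * 1024 * 1024)) ]
}

# Assumed values: Docker's usual data directory on Linux and a guessed threshold.
DOCKER_DIR=${DOCKER_DIR:-/var/lib/docker}
MIN_GB=${MIN_GB:-40}

if [ -d "$DOCKER_DIR" ] && ! has_free_space_gb "$DOCKER_DIR" "$MIN_GB"; then
    echo "Less than ${MIN_GB} GB free under $DOCKER_DIR; the pull may fail." >&2
fi
```

On setups where Docker runs inside a virtual machine, the relevant limit is the size of the VM disk rather than the host filesystem, so this check would not apply there.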

If needed, to make progress, use the cvmfs mount action (check that it still works) with the smaller cc7-cms image (or cms-cvmfs-docker, as in the PFNano recipe).

@katilp katilp self-assigned this Apr 4, 2023
@katilp katilp changed the title prod/pfnano prod/pfnano: add the production script Apr 5, 2023

katilp commented Apr 5, 2023

For the record: local testing of cc7-cvmfs with an existing local /cvmfs mount.
Leaving oasis.opensciencegrid.org out of the mount to see whether it is needed:

$ docker pull gitlab-registry.cern.ch/cms-cloud/cmssw-docker/cc7-cvmfs:2023-04-04-131be08c
$ ls /cvmfs
cms-bril.cern.ch  cms-opendata-conddb.cern.ch  cms.cern.ch  oasis.opensciencegrid.org
$ mkdir pfnanotest
$ cd pfnanotest/
$ docker run -it -P --device /dev/fuse --cap-add SYS_ADMIN -e CVMFS_MOUNTS="cms.cern.ch" --security-opt apparmor:unconfined  -v $PWD:/data gitlab-registry.cern.ch/cms-cloud/cmssw-docker/cc7-cvmfs:2023-04-04-131be08c
chgrp: invalid group: 'fuse'
::: cvmfs-config...
Failed to get D-Bus connection: Operation not permitted
Failed to get D-Bus connection: Operation not permitted
::: mounting FUSE...
CernVM-FS: running with credentials 998:995
CernVM-FS: loading Fuse module... done
CernVM-FS: mounted cvmfs on /cvmfs/cms.cern.ch
CernVM-FS: running with credentials 998:995
CernVM-FS: loading Fuse module... done
CernVM-FS: mounted cvmfs on /cvmfs/cms-opendata-conddb.cern.ch
::: mounting FUSE... [done]
::: Mounting CVMFS... [done]
::: Setting up CMS environment...
::: Setting up CMS environment... [done]
[10:50:18] cmsusr@4ec8f7d8a098 /code $
cd /data
source /cvmfs/cms.cern.ch/cmsset_default.sh
cmsrel CMSSW_10_6_30
cd CMSSW_10_6_30/src/
cmsenv
git cms-init --upstream-only
git config user.email "[email protected]"
git config user.name "me"
git cms-merge-topic 39040
git clone -b opendata https://github.com/DAZSLE/PFNano.git PhysicsTools/PFNano
scram b -j 4

Copy the test MiniAOD file to CMSSW_10_6_30/src/ and build the config:

cmsDriver.py --python_filename doubleeg_cfg.py --eventcontent NANOAOD --datatier NANOAOD \
  --fileout file:doubleeg_nanoaod.root --conditions 106X_dataRun2_v36 --step NANO \
  --filein file:doubleeg_miniaod.root --era Run2_25ns,run2_nanoAOD_106X2015 --no_exec --data -n -1 \
  --customise PhysicsTools/PFNano/pfnano_cff.PFnano_customizeData_onlyPF

Edit the config to process 20 events, then run:

cmsRun doubleeg_cfg.py
Updating process to run DeepBoostedJet on datasets before 103X
Updating process to run ParticleNet before it's included in MiniAOD
Updating process to run DeepDoubleX on datasets before 104X
Updating process to run DeepDoubleXv2 on datasets before 11X
Will recalculate the following discriminators on AK8 jets: pfDeepBoostedJetTags:probTbcq, pfDeepBoostedJetTags:probTbqq, pfDeepBoostedJetTags:probTbc, pfDeepBoostedJetTags:probTbq, pfDeepBoostedJetTags:probWcq, pfDeepBoostedJetTags:probWqq, pfDeepBoostedJetTags:probZbb, pfDeepBoostedJetTags:probZcc, pfDeepBoostedJetTags:probZqq, pfDeepBoostedJetTags:probHbb, pfDeepBoostedJetTags:probHcc, pfDeepBoostedJetTags:probHqqqq, pfDeepBoostedJetTags:probQCDbb, pfDeepBoostedJetTags:probQCDcc, pfDeepBoostedJetTags:probQCDb, pfDeepBoostedJetTags:probQCDc, pfDeepBoostedJetTags:probQCDothers, pfDeepBoostedDiscriminatorsJetTags:TvsQCD, pfDeepBoostedDiscriminatorsJetTags:WvsQCD, pfDeepBoostedDiscriminatorsJetTags:ZvsQCD, pfDeepBoostedDiscriminatorsJetTags:ZbbvsQCD, pfDeepBoostedDiscriminatorsJetTags:HbbvsQCD, pfDeepBoostedDiscriminatorsJetTags:H4qvsQCD, pfMassDecorrelatedDeepBoostedJetTags:probTbcq, pfMassDecorrelatedDeepBoostedJetTags:probTbqq, pfMassDecorrelatedDeepBoostedJetTags:probTbc, pfMassDecorrelatedDeepBoostedJetTags:probTbq, pfMassDecorrelatedDeepBoostedJetTags:probWcq, pfMassDecorrelatedDeepBoostedJetTags:probWqq, pfMassDecorrelatedDeepBoostedJetTags:probZbb, pfMassDecorrelatedDeepBoostedJetTags:probZcc, pfMassDecorrelatedDeepBoostedJetTags:probZqq, pfMassDecorrelatedDeepBoostedJetTags:probHbb, pfMassDecorrelatedDeepBoostedJetTags:probHcc, pfMassDecorrelatedDeepBoostedJetTags:probHqqqq, pfMassDecorrelatedDeepBoostedJetTags:probQCDbb, pfMassDecorrelatedDeepBoostedJetTags:probQCDcc, pfMassDecorrelatedDeepBoostedJetTags:probQCDb, pfMassDecorrelatedDeepBoostedJetTags:probQCDc, pfMassDecorrelatedDeepBoostedJetTags:probQCDothers, pfMassDecorrelatedDeepBoostedDiscriminatorsJetTags:TvsQCD, pfMassDecorrelatedDeepBoostedDiscriminatorsJetTags:WvsQCD, pfMassDecorrelatedDeepBoostedDiscriminatorsJetTags:ZvsQCD, pfMassDecorrelatedDeepBoostedDiscriminatorsJetTags:ZHbbvsQCD, pfMassDecorrelatedDeepBoostedDiscriminatorsJetTags:ZbbvsQCD, 
pfMassDecorrelatedDeepBoostedDiscriminatorsJetTags:HbbvsQCD, pfMassDecorrelatedDeepBoostedDiscriminatorsJetTags:H4qvsQCD, pfMassDecorrelatedDeepBoostedDiscriminatorsJetTags:ZHccvsQCD, pfMassDecorrelatedDeepBoostedDiscriminatorsJetTags:bbvsLight, pfMassDecorrelatedDeepBoostedDiscriminatorsJetTags:ccvsLight, pfParticleNetJetTags:probTbcq, pfParticleNetJetTags:probTbqq, pfParticleNetJetTags:probTbc, pfParticleNetJetTags:probTbq, pfParticleNetJetTags:probTbel, pfParticleNetJetTags:probTbmu, pfParticleNetJetTags:probTbta, pfParticleNetJetTags:probWcq, pfParticleNetJetTags:probWqq, pfParticleNetJetTags:probZbb, pfParticleNetJetTags:probZcc, pfParticleNetJetTags:probZqq, pfParticleNetJetTags:probHbb, pfParticleNetJetTags:probHcc, pfParticleNetJetTags:probHqqqq, pfParticleNetJetTags:probQCDbb, pfParticleNetJetTags:probQCDcc, pfParticleNetJetTags:probQCDb, pfParticleNetJetTags:probQCDc, pfParticleNetJetTags:probQCDothers, pfParticleNetDiscriminatorsJetTags:TvsQCD, pfParticleNetDiscriminatorsJetTags:WvsQCD, pfParticleNetDiscriminatorsJetTags:ZvsQCD, pfParticleNetDiscriminatorsJetTags:ZbbvsQCD, pfParticleNetDiscriminatorsJetTags:HbbvsQCD, pfParticleNetDiscriminatorsJetTags:HccvsQCD, pfParticleNetDiscriminatorsJetTags:H4qvsQCD, pfMassDecorrelatedParticleNetJetTags:probXbb, pfMassDecorrelatedParticleNetJetTags:probXcc, pfMassDecorrelatedParticleNetJetTags:probXqq, pfMassDecorrelatedParticleNetJetTags:probQCDbb, pfMassDecorrelatedParticleNetJetTags:probQCDcc, pfMassDecorrelatedParticleNetJetTags:probQCDb, pfMassDecorrelatedParticleNetJetTags:probQCDc, pfMassDecorrelatedParticleNetJetTags:probQCDothers, pfMassDecorrelatedParticleNetDiscriminatorsJetTags:XbbvsQCD, pfMassDecorrelatedParticleNetDiscriminatorsJetTags:XccvsQCD, pfMassDecorrelatedParticleNetDiscriminatorsJetTags:XqqvsQCD, pfParticleNetMassRegressionJetTags:mass, pfDeepDoubleBvLJetTags:probHbb, pfDeepDoubleCvLJetTags:probHcc, pfDeepDoubleCvBJetTags:probHcc, pfMassIndependentDeepDoubleBvLJetTags:probHbb, 
pfMassIndependentDeepDoubleCvLJetTags:probHcc, pfMassIndependentDeepDoubleCvBJetTags:probHcc, pfMassIndependentDeepDoubleBvLV2JetTags:probHbb, pfMassIndependentDeepDoubleCvLV2JetTags:probHcc, pfMassIndependentDeepDoubleCvBV2JetTags:probHcc
add DeepMET Producers
05-Apr-2023 11:23:26 CEST  Initiating request to open file file:doubleeg_miniaod.root
05-Apr-2023 11:23:32 CEST  Successfully opened file file:doubleeg_miniaod.root
2023-04-05 11:23:52.332401: I tensorflow/core/platform/cpu_feature_guard.cc:140] Your CPU supports instructions that this TensorFlow binary was not compiled to use: SSE4.1 SSE4.2 AVX AVX2 FMA
                         : Booking "muonMVATTH" of type "BDT" from /cvmfs/cms.cern.ch/slc7_amd64_gcc700/cms/cmssw/CMSSW_10_6_30/external/slc7_amd64_gcc700/data/PhysicsTools/NanoAOD/data/mu_BDTG_2017.weights.xml.
                         : Reading weight file: /cvmfs/cms.cern.ch/slc7_amd64_gcc700/cms/cmssw/CMSSW_10_6_30/external/slc7_amd64_gcc700/data/PhysicsTools/NanoAOD/data/mu_BDTG_2017.weights.xml
<HEADER> DataSetInfo              : [Default] : Added class "Signal"
<HEADER> DataSetInfo              : [Default] : Added class "Background"
                         : Booked classifier "BDTG" of type: "BDT"
                         : Booking "muonMVALowPt" of type "BDT" from /cvmfs/cms.cern.ch/slc7_amd64_gcc700/cms/cmssw/CMSSW_10_6_30/external/slc7_amd64_gcc700/data/PhysicsTools/NanoAOD/data/mu_BDTG_lowpt.weights.xml.
                         : Reading weight file: /cvmfs/cms.cern.ch/slc7_amd64_gcc700/cms/cmssw/CMSSW_10_6_30/external/slc7_amd64_gcc700/data/PhysicsTools/NanoAOD/data/mu_BDTG_lowpt.weights.xml
<HEADER> DataSetInfo              : [Default] : Added class "Signal"
<HEADER> DataSetInfo              : [Default] : Added class "Background"
                         : Booked classifier "BDTG" of type: "BDT"
                         : Booking "electronMVATTH" of type "BDT" from /cvmfs/cms.cern.ch/slc7_amd64_gcc700/cms/cmssw/CMSSW_10_6_30/external/slc7_amd64_gcc700/data/PhysicsTools/NanoAOD/data/el_BDTG_2017.weights.xml.
                         : Reading weight file: /cvmfs/cms.cern.ch/slc7_amd64_gcc700/cms/cmssw/CMSSW_10_6_30/external/slc7_amd64_gcc700/data/PhysicsTools/NanoAOD/data/el_BDTG_2017.weights.xml
<HEADER> DataSetInfo              : [Default] : Added class "Signal"
<HEADER> DataSetInfo              : [Default] : Added class "Background"
                         : Booked classifier "BDTG" of type: "BDT"
Begin processing the 1st record. Run 258434, Event 269235992, LumiSection 165 on stream 0 at 05-Apr-2023 11:25:07.161 CEST
Begin processing the 2nd record. Run 258434, Event 269040066, LumiSection 165 on stream 0 at 05-Apr-2023 11:29:10.431 CEST
Begin processing the 3rd record. Run 258434, Event 269567329, LumiSection 165 on stream 0 at 05-Apr-2023 11:29:10.523 CEST
Begin processing the 4th record. Run 258434, Event 268674092, LumiSection 165 on stream 0 at 05-Apr-2023 11:29:10.701 CEST
Begin processing the 5th record. Run 258434, Event 269416541, LumiSection 165 on stream 0 at 05-Apr-2023 11:29:10.851 CEST
Begin processing the 6th record. Run 258434, Event 269251857, LumiSection 165 on stream 0 at 05-Apr-2023 11:29:10.901 CEST
Begin processing the 7th record. Run 258434, Event 268739237, LumiSection 165 on stream 0 at 05-Apr-2023 11:29:10.987 CEST
Begin processing the 8th record. Run 258434, Event 269456225, LumiSection 165 on stream 0 at 05-Apr-2023 11:29:11.502 CEST
Begin processing the 9th record. Run 258434, Event 269845067, LumiSection 165 on stream 0 at 05-Apr-2023 11:29:11.585 CEST
Begin processing the 10th record. Run 258434, Event 268437313, LumiSection 165 on stream 0 at 05-Apr-2023 11:29:11.655 CEST
Begin processing the 11th record. Run 258434, Event 269791499, LumiSection 165 on stream 0 at 05-Apr-2023 11:29:11.733 CEST
Begin processing the 12th record. Run 258434, Event 269105371, LumiSection 165 on stream 0 at 05-Apr-2023 11:29:11.836 CEST
Begin processing the 13th record. Run 258434, Event 269105367, LumiSection 165 on stream 0 at 05-Apr-2023 11:29:12.057 CEST
Begin processing the 14th record. Run 258434, Event 269053912, LumiSection 165 on stream 0 at 05-Apr-2023 11:29:12.145 CEST
Begin processing the 15th record. Run 258434, Event 269632798, LumiSection 165 on stream 0 at 05-Apr-2023 11:29:12.339 CEST
Begin processing the 16th record. Run 258434, Event 268630694, LumiSection 165 on stream 0 at 05-Apr-2023 11:29:12.474 CEST
Begin processing the 17th record. Run 258434, Event 269366939, LumiSection 165 on stream 0 at 05-Apr-2023 11:29:12.614 CEST
Begin processing the 18th record. Run 258434, Event 269880785, LumiSection 165 on stream 0 at 05-Apr-2023 11:29:12.690 CEST
Begin processing the 19th record. Run 258434, Event 269285599, LumiSection 165 on stream 0 at 05-Apr-2023 11:29:12.826 CEST
Begin processing the 20th record. Run 258434, Event 268996506, LumiSection 165 on stream 0 at 05-Apr-2023 11:29:12.930 CEST
05-Apr-2023 11:29:13 CEST  Closed file file:doubleeg_miniaod.root

=============================================

MessageLogger Summary

 type     category        sev    module        subroutine        count    total
 ---- -------------------- -- ---------------- ----------------  -----    -----
    1 fileAction           -s file_close                             1        1
    2 fileAction           -s file_open                              2        2

 type    category    Examples: run/evt        run/evt          run/evt
 ---- -------------------- ---------------- ---------------- ----------------
    1 fileAction           PostGlobalEndRun
    2 fileAction           pre-events       pre-events

Severity    # Occurrences   Total Occurrences
--------    -------------   -----------------
System                  3                   3

dropped waiting message count 0
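
The manual edit to 20 events made before the cmsRun above could also be scripted. This is a sketch only: the helper name `set_max_events` is hypothetical, and it assumes the generated config contains a line of the form `input = cms.untracked.int32(-1)`, which should be checked against the actual file.

```shell
#!/bin/sh
# Hypothetical helper: rewrite the maxEvents input value in a
# cmsDriver-generated config file.
set_max_events() {
    cfg=$1
    n=$2
    # Assumes a line of the form "input = cms.untracked.int32(-1)".
    # Writes to a temporary file first, then replaces the original.
    sed "s/input = cms\.untracked\.int32(-1)/input = cms.untracked.int32($n)/" "$cfg" > "$cfg.tmp" &&
        mv "$cfg.tmp" "$cfg"
}

# Only touch the config if it exists in the current directory.
if [ -f doubleeg_cfg.py ]; then
    set_max_events doubleeg_cfg.py 20
fi
```

Passing `-n 20` instead of `-n -1` to cmsDriver.py should give the same result without editing the file afterwards.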

Test if reading directly from root://eospublic.cern.ch//eos/opendata/cms/Run2015D/DoubleEG/MINIAOD/08Jun2016-v1/10000/00387F48-342F-E611-AB5D-0CC47A4D76AC.root works.

Yes: this is OK

cmsDriver.py --python_filename doubleeg_cfg.py --eventcontent NANOAOD --datatier NANOAOD \
  --fileout file:doubleeg_nanoaod.root --conditions 106X_dataRun2_v36 --step NANO \
  --filein root://eospublic.cern.ch//eos/opendata/cms/Run2015D/DoubleEG/MINIAOD/08Jun2016-v1/10000/00387F48-342F-E611-AB5D-0CC47A4D76AC.root --era Run2_25ns,run2_nanoAOD_106X2015 --no_exec --data -n -1 \
  --customise PhysicsTools/PFNano/pfnano_cff.PFnano_customizeData_onlyPF
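
The local and remote variants of the cmsDriver.py call differ only in `--filein`, so the input can be made a parameter. A minimal sketch, with the hypothetical function name `pfnano_driver_cmd`; it only prints the command so that it can be inspected before running, and all options are copied from the commands above.

```shell
#!/bin/sh
# Hypothetical wrapper: print the cmsDriver.py command for a given input file.
pfnano_driver_cmd() {
    filein=$1
    echo cmsDriver.py --python_filename doubleeg_cfg.py \
        --eventcontent NANOAOD --datatier NANOAOD \
        --fileout file:doubleeg_nanoaod.root \
        --conditions 106X_dataRun2_v36 --step NANO \
        --filein "$filein" \
        --era Run2_25ns,run2_nanoAOD_106X2015 --no_exec --data -n -1 \
        --customise PhysicsTools/PFNano/pfnano_cff.PFnano_customizeData_onlyPF
}

# Local file and direct read over xrootd, as tested in this comment.
pfnano_driver_cmd file:doubleeg_miniaod.root
pfnano_driver_cmd root://eospublic.cern.ch//eos/opendata/cms/Run2015D/DoubleEG/MINIAOD/08Jun2016-v1/10000/00387F48-342F-E611-AB5D-0CC47A4D76AC.root
```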


katilp commented Apr 5, 2023

Add a PFNano production test workflow with

  • cvmfs mount
  • PFNano production step


katilp commented Apr 5, 2023

Take note:

WARNING: In non-interactive mode release checks e.g. deprecated releases, production architectures are disabled.
ERROR: Project "CMSSW" version "CMSSW_10_6_30" is not available for arch slc7_amd64_gcc10.
       "CMSSW_10_6_30" is currently available for following archs.
       Please set SCRAM_ARCH properly and re-run the command.
    slc7_amd64_gcc820
    slc7_amd64_gcc700

The fix is to set export SCRAM_ARCH=slc7_amd64_gcc700
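
A minimal sketch of where the setting could go in a command script, assuming the release area is created with scram directly (cmsrel is normally an alias for `scramv1 project CMSSW`). The guard only makes the snippet harmless outside a CMSSW-enabled shell.

```shell
#!/bin/sh
# Set the architecture explicitly before creating the release area.
export SCRAM_ARCH=slc7_amd64_gcc700

# Guarded so that nothing is attempted when scram is not available.
if command -v scramv1 >/dev/null 2>&1; then
    # Equivalent of "cmsrel CMSSW_10_6_30" without relying on the alias.
    scramv1 project CMSSW CMSSW_10_6_30
else
    echo "scramv1 not found; source /cvmfs/cms.cern.ch/cmsset_default.sh first" >&2
fi
```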


katilp commented Apr 5, 2023

Conditions access does not work in a GitHub Action with the cvmfs image gitlab-registry.cern.ch/cms-cloud/cmssw-docker/cc7-cvmfs:2023-04-04-131be08c. Try with
gitlab-registry.cern.ch/cms-cloud/cmssw-docker/cc7-cms:2023-04-04-131be08c

That does not work either.

Note that there's an env variable SITECONFIG_PATH that usually points to

echo $SITECONFIG_PATH
/cvmfs/cms.cern.ch/SITECONF/local

Copy site-local-config.xml from the standalone CMSSW_7_6_7 open data container and set the env variable accordingly.

Setting SITECONFIG_PATH does not seem to have any effect. Verified it to be
SITECONFIG_PATH: /mnt/vol/production/pfnano
in several places: at the start, after source ... and after cmsenv. Still:

----- Begin Fatal Exception 05-Apr-2023 21:09:24 CEST-----------------------
An exception of category 'Incomplete configuration' occurred while
   [0] Constructing the EventProcessor
   [1] Constructing ESSource: class=PoolDBESSource label='GlobalTag'
Exception Message:
Valid site-local-config not found at /cvmfs/cms.cern.ch/SITECONF/local/JobConfig/site-local-config.xml
----- End Fatal Exception -------------------------------------------------

This can probably be made to work, but it is not necessarily worth the trouble, as the hope is to get the standalone container fixed.
It has been rebuilt now, but the test does not run: https://github.com/katilp/cmssw-container-workflow/blob/test-CMSSW-10-6-30/.github/workflows/main.yml (it gets interrupted for unknown reasons).


katilp commented Apr 6, 2023

The standalone image is now fixed, but it requires more space than is available on GitHub runners.

Add a step to free some space.
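
A rough sketch of what such a step could run. The helper name `remove_dirs`, the opt-in variable `FREE_RUNNER_SPACE` and the three directories are assumptions for illustration: the paths are locations where large preinstalled toolchains are often found on Ubuntu runners, and should be verified on the runner image in use.

```shell
#!/bin/sh
# Hypothetical helper: delete each directory given as an argument, if present.
# SUDO may be set to "sudo" when the directories are owned by root.
remove_dirs() {
    for d in "$@"; do
        if [ -d "$d" ]; then
            echo "Removing $d"
            $SUDO rm -rf "$d"
        fi
    done
}

SUDO=""

# Act only on a GitHub runner (GITHUB_ACTIONS is "true" there) and only
# with the explicit opt-in variable FREE_RUNNER_SPACE (hypothetical).
if [ "${GITHUB_ACTIONS:-}" = "true" ] && [ "${FREE_RUNNER_SPACE:-0}" = "1" ]; then
    SUDO=sudo
    df -h /
    # Assumed locations of large preinstalled software; not from this issue.
    remove_dirs /usr/share/dotnet /opt/ghc /usr/local/lib/android
    df -h /
fi
```

Printing `df -h /` before and after shows how much space the step actually recovers.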


katilp commented Apr 6, 2023

Things to remember:

  • recent containers need more space than there is available
    • remove some software on the runner
  • cmsenv is needed (not necessary in standalone containers, at least those for OD)
    • may be related to how the command script is passed
  • aliases are not defined so instead of cmsenv use eval `scramv1 runtime -sh`
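
The alias point in the list above can be sketched as follows. Shells running a script non-interactively typically do not expand aliases, which would explain why cmsenv is unavailable there. The function name `setup_cmssw_env` is hypothetical; the `scramv1 runtime -sh` call is the one quoted in the list.

```shell
#!/bin/sh
# Hypothetical helper for non-interactive command scripts, where the
# cmsenv alias is not defined.
setup_cmssw_env() {
    if ! command -v scramv1 >/dev/null 2>&1; then
        echo "scramv1 not found; is cmsset_default.sh sourced?" >&2
        return 1
    fi
    # Evaluate the runtime environment printed by scram, as cmsenv would.
    eval "$(scramv1 runtime -sh)"
}

# Run from inside the release area, e.g. CMSSW_10_6_30/src.
setup_cmssw_env || echo "CMSSW environment not set up" >&2
```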

Also, most of the trouble above came from using the lightweight images instead of the standalone image. Points to remember for the lightweight ones, although they are not needed here:

  • cvmfs action works well to mount cvmfs in the lightweight container
  • GT access for external locations does not work in some containers
    • it depends on /cvmfs/cms.cern.ch/SITECONF/local/JobConfig/site-local-config.xml, whose location is given by the environment variable SITECONFIG_PATH.
    • copying a working site-local-config.xml to a writable location and setting SITECONFIG_PATH to its path in the command script does not seem to have any effect.
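
A small diagnostic sketch for the site-local-config point above. It assumes the file is expected at `$SITECONFIG_PATH/JobConfig/site-local-config.xml`, which is inferred from the default value of the variable and the path in the error message. The function name `check_site_local_config` is hypothetical. It only checks the file layout and does not explain why the variable was ignored.

```shell
#!/bin/sh
# Hypothetical diagnostic: report whether site-local-config.xml is readable
# under the directory named by SITECONFIG_PATH (or the usual default).
check_site_local_config() {
    cfg="${SITECONFIG_PATH:-/cvmfs/cms.cern.ch/SITECONF/local}/JobConfig/site-local-config.xml"
    if [ -r "$cfg" ]; then
        echo "found: $cfg"
    else
        echo "missing: $cfg" >&2
        return 1
    fi
}

# Do not abort the calling script if the file is absent.
check_site_local_config || true
```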

@katilp katilp closed this as completed Apr 6, 2023