-
Notifications
You must be signed in to change notification settings - Fork 21
/
README
270 lines (204 loc) · 12.5 KB
/
README
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
Introduction:
-------------
This package provides a small facility to analyze one or a chain of root ntuples.
A script (./scripts/make_rootNtupleClass.sh) is used to generate automatically
(using the root command RootNtupleMaker->MakeClass) a class (include/rootNtupleClass.h
and src/rootNtupleClass.C) with the variable definitions of a given root ntuple
(to be provided by the user).
The class baseClass (include/baseClass.h and src/baseClass.C) inherits from the
automatically generated rootNtupleClass.
baseClass provides the methods that are common to all analysis, such as the method
to read a list of root files and form a chain. It will, asap, also provide a method
to read a list of selection cuts.
The class analysisClass (include/analysisClass.h and src/analysisClass.C) inherits
from baseClass.
The user's code should be placed in the method Loop() of analysisClass, which reimplements
the method Loop() of rootNtupleClass.
The main program (src/main.C) receives the configuration parameters (such as the input
chain of root files and a file to provide a cut list) and executes the analysisClass code.
Instructions:
-------------
1) Environment setup:
cd CMSSW_8_0_24/src/
cmsenv
(or the most recent release)
2) Download the code:
git clone https://github.com/CMSDIJET/DijetRootTreeAnalyzer.git CMSDIJET/DijetRootTreeAnalyzer
3) Go inside:
cd CMSDIJET/DijetRootTreeAnalyzer
=======
4) to generate the rootNtupleClass (you do not need to unless the ntuple format changes):
cd CMSDIJET/DijetRootTreeAnalyzer
./scripts/make_rootNtupleClass.sh
(you will be asked for input arguments)
i.e.
./scripts/make_rootNtupleClass.sh -f dijetTree_signal.root -t dijets/events
where the rootfile has been produced by the DijetRootTreeMaker
Note that there could be minor glitches and then you can fix by hand.
=======
5) Make a symbolic link analysisClass.C by:
ln -sf analysisClass_myCodeWide.C src/analysisClass.C
or
ln -sf analysisClass_myCodeAK8.C src/analysisClass.C
or
ln -sf analysisClass_myCodeCA8.C src/analysisClass.C
6) Compile to test that all is OK so far:
make clean
make
7) Run:
./main
(you will be asked for input arguments)
i.e.
./main config/inputListJets.txt config/cutFileExampleJet.txt dijets/events output/rootFile output/cutEfficiencyFile
8) All the produced files are in the directory output
Notes:
1) one can have several analyses in a directory, such as
src/analysisClass_myCode1.C
src/analysisClass_myCode2.C
src/analysisClass_myCode3.C
and move the symbolic link to the one to be used:
ln -sf analysisClass_myCode2.C src/analysisClass.C
and compile/run as above.
More details:
-------------
- Example code:
The src/analysisClass_template.C comes with simple example code. The example code in enclosed by
#ifdef USE_EXAMPLE
... code ...
#endif //end of USE_EXAMPLE
The code is NOT compiled by default. In order to compile it, uncomment the line
#FLAGS += -DUSE_EXAMPLE
in the Makefile.
- Providing cuts via file:
A list of cut variable names and cut limits can be provided through a file (see config/cutFileExample.txt).
The variable names in such a file have to be filled with a value calculated by the user analysisClass code,
a function "fillVariableWithValue" is provided - see example code.
Once all the cut variables have been filled, the cuts can be evaluated by calling "evaluateCuts" - see
example code. Do not forget to reset the cuts by calling "resetCuts" at each event before filling the
variables - see example code.
The function "evaluateCuts" determines whether the cuts are satisfied or not, stores the pass/failed result
of each cut, calculates cut efficiencies and fills histograms for each cut variable (binning provided by the
cut file, see config/cutFileExample.txt).
The user has access to the cut results via a set of functions (see include/baseClass.h)
bool baseClass::passedCut(const string& s);
bool baseClass::passedAllPreviousCuts(const string& s);
bool baseClass::passedAllOtherCuts(const string& s);
where the string to be passed is the cut variable name.
The cuts are evaluated following the order of their apperance in the cut file (config/cutFileExample.txt).
One can simply change the sequnce of line in the cut file to have the cuts applied in a different order
and do cut efficiency studies.
Also, the user can assign to each cut a level (0,1,2,3,4 ... n) and use a function
bool baseClass::passedAllOtherSameLevelCuts(const string& s);
to have the pass/failed info on all other cuts with the same level.
There is actually also cuts with level=-1. These cuts are not actually evaluated, the corresponding lines
in the cut file (config/cutFileExample.txt) are used to pass values to the user code (such as fiducial
region limits). The user can access these values (and also those of the cuts with level >= 0) by
double baseClass::getCutMinValue1(const string& s);
double baseClass::getCutMaxValue1(const string& s);
double baseClass::getCutMinValue2(const string& s);
double baseClass::getCutMaxValue2(const string& s);
- Automatic histograms for cuts
The following histograms are generated for each cut variable with level >= 0:
no cuts applied
passedAllPreviousCuts
passedAllOtherSameLevelCuts
passedAllOtherCuts
passedAllCut
and by default only the following subset
no cuts applied
passedAllPreviousCuts
passedAllOtherCuts
is saved to the output root file. All histograms can be saved to the output root file by
uncommenting the following line in the Makefile
#FLAGS += -DSAVE_ALL_HISTOGRAMS
- Automatic cut efficiency:
the absolute and relative efficiency is calculated for each cut and stored in an output file
(named output/cutEfficiencyFile.dat if the code is executed following the examples)
The user has the option to implement a good run list using a JSON file. This requires two edits to the cut
file and one edit to the analysisClass.C file.
A line must be inserted at the beginning of the cut file with the word "JSON" first, and then
the full AFS path of the desiredJSON file. For example:
JSON /afs/cern.ch/cms/CAF/CMSCOMM/COMM_DQM/certification/Collisions11/7TeV/Prompt/Cert_160404-163369_7TeV_PromptReco_Collisions11_JSON.txt
In addition, the user must define the JSON file selection in the cut file. This is done in the usual way:
#VariableName minValue1(<) maxValue1(>=) minValue2(<) maxValue2(>=) level histoNbinsMinMax
#------------ ------------ ------------- ------------ ------------- ----- ----------------
PassJSON 0 1 - - 0 2 -0.5 1.5
In the analysisClass.C file, the user must add the following line within the analysis loop:
fillVariableWithValue ( "PassJSON", passJSON (run, ls, isData));
Note that the use of a JSON file (good run list) is optional. If the user does not list a JSON file in the cut file,
no selection will be made.
This tool was validated on May 18, 2011. The results of the validation study are here:
https://twiki.cern.ch/twiki/pub/CMS/ExoticaLeptoquarkAnalyzerRootNtupleMakerV2/json_validation.pdf
Additional scripts for running on several datasets:
---------------------------------------------------
See ./doc/howToMakeAnalysisWithRootTuples.txt
Using the Optimizer (Jeff Temple):
----------------------------------
The input cut file can also specify variables to be used in optimization studies.
To do so, add a line in the file for each variable to optimize. The first field of a line
must be the name of the variable, second field must be "OPT", third field either ">" or "<".
(The ">" sign will pass values greater than the applied threshold, and "<" will pass
those less than the threshold.) 4th and 5th fields should be the minimum
and maximum thresholds you wish to apply when scanning for optimal cuts.
An example of the optimization syntax is:
#VariableName must be OPT > or < RangeMin RangeMax unused
#------------ ----------- ------ ------------ ------------- ------
muonPt OPT > 10 55 5
This optimizer will scan 10 different values, evenly distributed over
the inclusive range [RangeMin, RangeMax]. At the moment, the 6th value is not used and
does not need to be specified.
The optimization cuts are always run after all the other cuts in the file, and are only run
when all other cuts are passed.
The above line will make 10 different cuts on muonPt, at [10, 15, 20, 25, ..., 55].
('5' in the 6th field is meaningless here.)
The output of the optimization will be a 10-bin histogram, showing the number of
events passing each of the 10 thresholds.
Multiple optimization cuts may be applied in the same file. In the case where N optimization cuts
are applied, a histogram of 10^N bins will be produced, with each bin corresponding to a unique cut combination.
No more than 6 variables may be optimized at one time (limitation in the number of bins for a TH1F ~ 10^6).
Since such file can become quite large, the default is to not create
A file (optimizationCuts.txt in the working directory) that lists the cut values applied for
each bin can be produced by uncommenting the line
#FLAGS += -DCREATE_OPT_CUT_FILE
in the Makefile. Since this file can be quite large (10^N lines), by default it is not created.
Producing an ntuple skim (Dinko Ferencek):
------------------------------------------
The class baseClass provides the ability to produce a skimmed version of the input ntuples. In order to
produce a skim, the following preliminary cut line has to be added to the cut file
#VariableName value1 value2 value3 value4 level
#------------ ------------ ------------- ------------ ------------- -----
produceSkim 1 - - - -1
and call the fillSkimTree() method for those events that meet the skimming criteria. One possible example is
if( passedCut("all") ) fillSkimTree();
If the above preliminary cut line is not present in the cut file, is commented out or its value1 is set to 0,
the skim creation will be turned off and calling the fillSkimTree() method will have no effect.
JSON parser (Edmund Berry):
---------------------------
See https://hypernews.cern.ch/HyperNews/CMS/get/exotica-lq/266.html
PU reweight (Edmund Berry):
---------------------------
See https://twiki.cern.ch/twiki/pub/CMS/Exo2011LQ1AndLQ2Analyses/PileupReweightingCode.pdf
Producing a new ntuple with a subset of cutFile variables and a subset of events (Paolo, Francesco, Edmund):
------------------------------------------------------------------------------------------------------------
The class baseClass provides the ability to produce a new ntuple with a subset of the variables defined
in the cutFile, and with a subset of events.
In order to do so, the following preliminary cut line has to be added to the cut file
#VariableName value1 value2 value3 value4 level
#------------ ------------ ------------- ------------ ------------- -----
produceReducedSkim 1 - - - -1
then each variable that needs to be included in the new tree has to be flagged with SAVE in
the cutFile at the end of the line where the variabole is defined, as for pT1stEle and pT2ndEle
below:
#VariableName minValue1(<) maxValue1(>=) minValue2(<) maxValue2(>=) level histoNbinsMinMax OptionalFlag
#------------ ------------ ------------- ------------ ------------- ----- ---------------- ------------
nEleFinal 1 +inf - - 0 11 -0.5 10.5
pT1stEle 85 +inf - - 1 100 0 1000 SAVE
pT2ndEle 30 +inf - - 1 100 0 1000 SAVE
invMass_ee 0 80 100 +inf 1 120 0 1200
(do not put anything for those variables that do not need to be saved, such as for nEleFinaland invMass_ee)
finally, call fillReducedSkimTree() in the analysisClass for the subset of events that need to be saved, e.g.:
if( passedCut("nEleFinal") ) fillReducedSkimTree();
If the above preliminary cut line is not present in the cut file, is commented out or its value1 is set to 0,
the skim creation will be turned off and calling the fillReducedSkimTree() method will have no effect.
The new ntuple will be created in a file named as the std output root file with _reduced_skim appended
before the .root and the tree name will be as in the input root file.