\begin{appendix}
\section{Talk Notes}
% 4 questions (NNs pre-processing,)
% 20 listeners, 60 on the room.
% I was surprisingly relaxed; it went OK. I was able to discuss the parallel coordinates issue well.
% timing: only one min. extra. + 4 questions!
From b tagging we now turn to tau tagging.
For this channel QCD is the dominating background:
``The QCD cross-section is by several orders of
magnitude larger than the cross-section of other backgrounds''.
The cross sections are such that we need a rejection factor of $10^9$ against QCD if we want to see the signal.
This can be achieved
when we combine other methods ($10^4$), such as the b tagging discussed in the previous talk,
with tau tagging ($10^5$).
Other methods:
1) missing ET measurement,
2) mass reconstruction for associated top and W , and
3) cut on the reconstructed transverse mass of the charged Higgs boson.
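
The rejection factors quoted above add up to the required total if the two groups of methods are assumed to act independently, so that their rejection factors multiply (this independence is an assumption of this sketch, not a result of the analysis):
\[
  R_{\mathrm{total}} = R_{\mathrm{other}} \times R_{\tau}
                     = 10^{4} \times 10^{5} = 10^{9} .
\]
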
The analysis is based on the Physics TDR papers (CMS Physics TDR-I and II).
Test data:
signal 35k jets (an equal number of events), and background 2230k jets (1087k events).
Customization:
By default TMVA reports signal (jet) efficiencies at 1 \%, 10 \% and 30 \% background
(jet) efficiency levels.
We used jets for training, but want to know the efficiencies in events.
In practice signal data has one jet per event (the tau jet), and
background data has multiple jets per event.
So we extended TMVA with the following:
1) Signal (jet) efficiency at 1e-5 bkg (jet) efficiency
2) Evaluation of event efficiencies with TMVA::Reader, taking into account
the MC-level preselection efficiencies and TMVA preselection
(Signal (event) efficiency at bkg (event) efficiency levels 1e-6, 1e-5)
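
A rough cross-check of how the jet and event efficiencies relate (an illustrative estimate, not taken from the analysis code): the background test sample quoted above contains on average $2230\mathrm{k}/1087\mathrm{k} \approx 2.05$ jets per event. Assuming, purely for illustration, that the $n$ jets of a background event pass the classifier independently with the same jet efficiency $\varepsilon_{\mathrm{jet}}$, and that an event is accepted if at least one of its jets passes, the event efficiency is
\[
  \varepsilon_{\mathrm{evt}} = 1 - (1 - \varepsilon_{\mathrm{jet}})^{n}
  \approx n\,\varepsilon_{\mathrm{jet}}
  \qquad (\varepsilon_{\mathrm{jet}} \ll 1) .
\]
With $n \approx 2$ and $\varepsilon_{\mathrm{jet}} = 10^{-5}$ this gives $\varepsilon_{\mathrm{evt}} \approx 2 \times 10^{-5}$; the preselection efficiencies mentioned in item 2) are not included in this estimate.
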
Profiling with TStopwatch
BDT took 5~h.
Informally:
``taking the detector effects into account,
the signal efficiency at a $10^{-5}$ background level is expected to be a few percent;
thus the results of ideal simulation are in good agreement
with the results obtained with full simulation.''
or:
these results can be compared with the few percent achieved with the full CMS simulation
and a traditional analysis using hard cuts.
Future improvements:
1) The number of variables is quite small. Add more variables.
2) The preselection may be slightly too tight. Loosen the preselection cuts.
3) Understand the weak variables.
Strong correlation between neutral hadron rejection and the Rtau cut when Rtau$>$0.80.
\section{Proceedings Deadline}
To be published online by IOP: max.~10 pages, by May 15th at the latest.
Organizers ask us to propose two possible reviewers for our paper.
\lstset{ % General settings
language=c++, % choose the language of the code
basicstyle=\ttfamily \small, % the size of the fonts that are used for the code \footnotsize
numbers=left, % where to put the line-numbers
numberstyle=\small, % the size of the fonts that are used for the line-numbers
stepnumber=2, % the step between two line-numbers. If it's 1 each line will be numbered
numbersep=10pt, % how far the line-numbers are from the code
showspaces=false, % show spaces adding particular underscores
showstringspaces=false, % underline spaces within strings
showtabs=false, % show tabs within strings adding particular underscores
frame=, % adds a frame around the code (single)
tabsize=2, % sets default tabsize to 2 spaces
captionpos=t, % sets the caption-position: top (t), bottom (b)
breaklines=true, % sets automatic line breaking
%breakatwhitespace=false, % sets if automatic breaks should only happen at whitespace
escapeinside={\%*}{*)}, % if you want to add a comment within your code
caption=footnote,
label=listing:relRef
}
\section{Code}
The code and data will be distributed using git and made available
at \\ \url{http://www.helsinki.fi/~miheikki/system/refs/heikkinen/ah09bProceedings/code}
%\subsection{Makefile}
\lstset{
language=csh,
numbers=left,
stepnumber=2,
caption={\tt code/Makefile},
label=makefile
}
%\lstinputlisting{code/Makefile}
%\subsection{tmva-common.conf}
\lstset{
language=csh,
numbers=left,
stepnumber=2,
caption={\tt code/tmva-common.conf},
label=tmvacommonconf
}
%\lstinputlisting{code/tmva-common.conf}
%\subsection{tmva-example.conf}
\lstset{
language=csh,
numbers=left,
stepnumber=2,
caption={\tt code/tmva-example.conf},
label=tmvaexampleconf
}
%\lstinputlisting{code/tmva-example.conf}
%\subsection{ametisti.sh.job}
\lstset{
language=csh,
numbers=left,
stepnumber=2,
caption={\tt code/ametisti.sh.job},
label=ametistishjob
}
%\lstinputlisting{code/ametisti.sh.job}
\section{Data files}
A minimal example data file (recommended size: less than 1--2 MB) can
be included in the repository for testing and demonstration
purposes. Although Git can easily deal with large files, it is
recommended that the production data not be
included in the repository. Keep in mind that GitHub
(and shell accounts) offer relatively limited disk space (GitHub: 100
MB, CERN default: about 150 MB) and that ROOT files can probably not be
compressed further.

For production use it is recommended to store a URL
to the data in the repository. Then we can use the {\tt ROOT} or {\tt HTTP} protocols to
access it, or use a {\tt make} directive or shell script to copy the data to the
user's computer. One good example of this practice is the {\tt
HipProofAnalysis} repository: only the URLs are stored, not the
data. Additionally, the parameters, configuration options,
software versions, etc.\ used for the production of the data should
probably be stored in some way in the repository.
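
The {\tt make}-based approach could look roughly like the following sketch. It is only an illustration: the URL, the file names and the target name {\tt data} are hypothetical placeholders, {\tt curl} is assumed to be installed, and GNU {\tt make} is assumed (recipe lines must start with a tab character).
\begin{verbatim}
# Hypothetical sketch: DATA_URL and DATA_FILE are placeholders.
DATA_URL  = http://example.org/path/to/data.root
DATA_FILE = data/data.root

# Download the data only if it is not already present locally.
$(DATA_FILE):
	mkdir -p $(dir $(DATA_FILE))
	curl -f -L -o $@ $(DATA_URL)

.PHONY: data
data: $(DATA_FILE)
\end{verbatim}
With such a rule only the URL lives in the repository, and {\tt make data} fetches the file on demand.
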
\begin{comment}
\section{WORKING NOTES}
{\bf Suggested responsibility}:
\begin{itemize}
\item[aatos]
Aatos: editor, NN classifiers;
\item Pekka: release manager, git consulting, PROOF
\begin{itemize}
\item git consulting: OK (setting up workflow and repositories, user
training, documentation, software installation)
\item PROOF: I didn't see PROOF mentioned anywhere in the TMVA
documentation. Is it supported? If not, then I don't have resources
to do it (lesson learned in the past: PROOF-enabling an analysis
code can be a major undertaking...)
\end{itemize}
\item Sami: MC data,
\item Lauri 1-prog physics
\item Ritva:
\item Tomas: Ametisti
\item Tapio:
\item Matti:a mechanism to work with variables
\item Veikko:
\end{itemize}
\end{comment}
\subsection{Code repository}
\begin{itemize}
\item The source code for the paper and the TMVA script is available at
{\tt git://github.com/aatos/chep09tmva.git} (\url{http://github.com/aatos/chep09tmva})
\begin{itemize}
\item Pekka:
\begin{verbatim}
git remote add pekka git://github.com/kaitanie/chep09tmva.git
git fetch pekka
git merge pekka/master
make release
\end{verbatim}
After the release, inform Pekka where the {\tt tar.gz} is available.
\item Lauri (do not do this manually; PK, as the release manager, does this):
\begin{verbatim}
git remote add lauri http://cmsdoc.cern.ch/~wendland/chep09tmva.git
git fetch lauri
git merge lauri/master
\end{verbatim}
\item Matti (do not do this manually; PK, as the release manager, does this):
\begin{verbatim}
git remote add matti git://github.com/makortel/chep09tmva.git
git fetch matti
git merge matti/master
\end{verbatim}
\end{itemize}
\item Alternatively, the \LaTeX{} files can be downloaded from
\url{http://www.helsinki.fi/~miheikki/system/refs/heikkinen/ah09bProceedings.tar.gz}.
After this you can make your modifications and submit them as a
tarball. The tarball can be created with the command {\tt make
contribution}. The resulting file {\tt chep09tmva-contribution.tar.gz}
can be sent as an e-mail attachment to {\tt [email protected]}.
\item You can also mail your comments and updates directly to the editor (Aatos),
based on the PDF version
\url{http://www.helsinki.fi/~miheikki/system/refs/heikkinen/ah09bProceedings.pdf}.
\end{itemize}
A Git cheat sheet is available at \url{http://ktown.kde.org/~zrusin/git/git-cheat-sheet-medium.png}.
Some git documentation:
\begin{itemize}
\item Git tutorial: \url{http://www.kernel.org/pub/software/scm/git/docs/gittutorial.html}
\item Git with HipProofAnalysis (contains instructions on how to use
Git on lxplus):
\url{http://projects.hepforge.org/radical/trac/wiki/GitWithHipProofAnalysis}
\end{itemize}
\subsection{Building the document}
Building the document requires {\tt make} and \LaTeX{} tools. The
document can be built using the {\tt make} command. At the end of the
compilation this can optionally launch a PDF viewer (by default the Firefox browser
with the Acrobat Reader plugin). To
enable the PDF viewer feature, set the environment
variable {\tt USEVIEWER} to 1. You can change the PDF viewer program by
setting the environment variable {\tt PDFVIEWER} to point to your
favourite PDF viewer (e.g. the lightweight alternative {\tt xpdf}).
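
For example, assuming a Bourne-compatible shell and that {\tt xpdf} is installed, both variables can be set for a single build on the command line:
\begin{verbatim}
USEVIEWER=1 PDFVIEWER=xpdf make
\end{verbatim}
Setting the variables this way affects only that one invocation of {\tt make}; exporting them in the shell start-up file would make the choice permanent.
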
\subsection{Current status of TMVA}
For an introduction, browse the six talks from 2008 at \url{http://tmva.sourceforge.net/talks.shtml}.
\begin{itemize}
\item The current version is TMVA-v3.9.6 (December 2nd, 2008).
\item TMVA (\url{http://tmva.cvs.sourceforge.net}) is now included in ROOT releases:
\begin{itemize}
\item ROOT version 5.22 was released on December 18, 2008
(release notes: \url{http://root.cern.ch/root/v522/Version522.news.html});
it contains TMVA-v3.9.5.
\item ROOT versions from 5-19-02a to 5-21-01-alice contain TMVA 3.9.4.
\end{itemize}
\item In addition to many bug fixes, recent versions bring:
\begin{itemize}
\item Improved preprocessing.
\item Pre-selection cuts on arrays. The previously used {\em TEventlists}
(only event-wise pass/fail) were replaced by {\em TreeFormulas} (sensitive to array position).
\item Plugin capability: custom multivariate classifier can now be plugged into
the TMVA framework to benefit from TMVA's analysis and performance comparison
tools.
\item For details see release notes
\url{http://tmva.cvs.sourceforge.net/*checkout*/tmva/TMVA/development/RELNOTES}
\end{itemize}
\end{itemize}
\subsection{TMVA run configuration files}
The new example program ({\tt code/chep09tmva.cc}) uses a config file
({\tt code/tmva.conf}) for classifier configuration. There is one
possible problem in this setup. If everyone edits the same file time
and time again, merging everyone's work will become very painful. This
is a problem because we would like people to merge early and
often. There are a few proposals that should be investigated as
possible solutions to this problem:
\begin{enumerate}
\item Using config files is a good option. Hardcoding configs into
the program would probably make merges quite difficult as well.
\item Each user/classifier has a separate config file. The {\tt
chep09tmva} program should have a command line option that allows the
user to choose which configuration is used. An example invocation of
the {\tt chep09tmva} program is shown in listing \ref{configExample}.
\item Ability to have common config options in a separate file
(e.g. {\tt tmva-common.conf}) which could be included into
user/classifier specific configuration files with an {\tt include}
statement. An example of this is shown in listings \ref{commonConfig}
and \ref{userConfig}.
\end{enumerate}
The program has been modified as follows:
\begin{itemize}
\item Support for \texttt{include} as shown in listing
\ref{commonConfig}
\item There is now a common configuration file
\texttt{tmva-common.conf} (so far more a demonstration than
anything really useful), and an example user configuration
\texttt{tmva-example.conf}
\item By default it uses the \texttt{tmva-common.conf}, but the
configuration can be specified as shown in listing \ref{configExample}
\begin{itemize}
\item If the same directive (\texttt{Variables:}, \texttt{Cuts:},
\texttt{Trainer:}, \texttt{Classifiers:}) is given in both the
user configuration and the common configuration, the user
configuration is used (i.e.\ the two are not merged; e.g.\ a variable list in the user configuration replaces the common one).
\end{itemize}
\end{itemize}
\lstset{
language=csh,
numbers=left,
stepnumber=2,
caption=Example invocation of {\tt chep09tmva} with config file name as a parameter.,
label=configExample
}
\begin{lstlisting}
./chep09tmva pekka.conf
\end{lstlisting}
\lstset{
language=csh,
numbers=left,
stepnumber=2,
caption=Contents of the file {\tt tmva-common.conf} that contains config options shared by all analysis runs.,
label=commonConfig
}
\begin{lstlisting}
// String passed to TMVA::Factory::PrepareTrainingAndTestTree
Trainer:
NSigTrain=1000:NBkgTrain=20000:SplitMode=Random:NormMode=NumEvents:!V
\end{lstlisting}
\lstset{
language=csh,
numbers=left,
stepnumber=2,
caption=Contents of the user specific config file {\tt pekka.conf}.,
label=userConfig
}
\begin{lstlisting}
include tmva-common.conf
Cuts_D H:!V:FitMethod=MC:EffSel:SampleSize=20000:VarProp=FSmart:VarTransform=Decorrelate
\end{lstlisting}
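
As an illustration of the override rule described above, a user configuration that includes the common file and then repeats the {\tt Trainer:} directive would use its own trainer string in place of the common one. This is a hypothetical sketch: the file name {\tt tmva-override.conf} and the numerical values are made up, and the layout and comment syntax simply follow the listings above.
\begin{verbatim}
include tmva-common.conf
// Replaces the Trainer: string from tmva-common.conf entirely
Trainer:
NSigTrain=500:NBkgTrain=10000:SplitMode=Random:NormMode=NumEvents:!V
\end{verbatim}
Because directives are not merged, any option that should survive from the common {\tt Trainer:} string has to be repeated in the user file.
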
\lstset{ %
language=csh, % choose the language of the code
basicstyle=\footnotesize, % the size of the fonts that are used for the code
numbers=left, % where to put the line-numbers
numberstyle=\footnotesize, % the size of the fonts that are used for the line-numbers
stepnumber=2, % the step between two line-numbers.
%If it's 1 each line will be numbered
numbersep=5pt, % how far the line-numbers are from the code
showspaces=false, % show spaces adding particular underscores
showstringspaces=false, % underline spaces within strings
showtabs=false, % show tabs within strings adding particular underscores
frame=single, % adds a frame around the code
tabsize=2, % sets default tabsize to 2 spaces
captionpos=t, % sets the caption-position to top
breaklines=true, % sets automatic line breaking
%breakatwhitespace=false, % sets if automatic breaks should only happen at whitespace
escapeinside={\%*}{*)}, % if you want to add a comment within your code
%caption=Bash function to release a directory.,
label=listing:relRef
}
\section{PLANS}
To be done:
\begin{itemize}
\item 090400 Preparing the paper
\item 090326 Giving the talk 15+5 min? Post talk update.
\item Update variable transformation (log/image)
%\item Template code for analysis using latest ROOT, and TMVA inside it.
%\item Revise title, abtract and paper structure including appendix.
\end{itemize}
\section{HISTORY}
\begin{itemize}
\item 090505 Release.
\item 090504 Merge with PK. JPCS template added from \url{http://www.iop.org/EJ/journal/-page=extra.3/1742-6596} and
the related Makefile target {\em paper} added.
\item 090330 Merge with PK (no updates). Committing to git repository.
\item 090326 Preparing for the talk and giving it. Post talk updated.
\item 090325 Last tuning (comments from Lauri and Matti). Release.
\item 090324 Asking last comments latest on Wednesday (090325) morning.
\item 090323 Adding fixes provided by Ritva, Lauri, Matti. NN run.
\item 090321 Preparing backup version of the talk. First release.
\item 090320 Merge with PK (receiving material/comments from Lauri, Ritva, Matti, Pekka)
\item 090319 Adding talk material from Matti and Sami.
TMVA from refs/manual. Conclusion. Asking last material update.
\item 090318 Updating comments from LW.
\item 090317 Adding material to slides. Asking new material from collaborators
\item 090316 Merge with PK. Updates in {\tt evaluate.cc} and {\tt timer.h}.
\item 090310 Trying local data received on a USB key.
\item 090310 Merge with PK. New data is fixed for the analysis.
\item 090303 Merge with PK (updates in {\tt matti.tex})
\item 090226 Merge with PK (updates in chep09tmva.cc and output.cc).
Adding variable description to CHEP slides.
\item 090224 Merge with PK; updates in code (particularly libTmvaTasks.C). Adding material to CHEP slides.
Linking to latest data updated (tmva-largedata.conf).
\item 090217 Merge with PK (update in {\tt matti.tex}).
\item 090213 Adding template for slides {\tt ah09aTalk.tex} and related target 'talk'.
Clarified the contents of {\tt ah09bProceedings.tex}.
\item 090209 Merge with PK (additions to output.h/c, scripts/csc.C, and tmva-matti.conf)
\item 090206 Merge with PK (many bug fixes, and timing functionality updated by Matti and Pekka)
\item 090204 Merge with PK (conf file format updated)
\item 090203 Merge with PK (updates from Matti for calculating efficiency,
root system files added by AH)
\item 090126 Merge with Pekka (new plots from him).
From Matti: evaluating all classifiers for signal efficiency
at 1e-5 OVERALL bkg efficiency. Simplifying the paper.
\item 090119 Major updates from Matti.
\item 090116 Merging from Pekka. Releasing tarball with {\bf make release}.
\item 090115 Added targets for the example analysis: make ah1 (recommended way) and make ah0 (minimal).
\item 090113 Release management bug fix.
\item 090112 Testing new release tools developed by our release manager Pekka.
\item 081217 Fixed some lost files by merging from Pekka.
\item 081216 Merge from Matti and Pekka.
Files added for each author, corresponding to a specific analysis subsection.
First test runs for MLP done using {\tt chep09tmva.C}. Sample images added to {\tt ah09bProceedings.tex}.
\item 081215 This abstract was accepted as CHEP'09 talk.
\item 081202 Merging example data and related configuration file from Lauri.
\item 081125 Merging from Lauri, Matti, and Pekka.
Added subsections for code listing and table of contents.
\item 081111 Merging branch from Lauri and including comments from Sami.
Based on discussion at the HIP group weekly meeting, made some additional changes to the abstract.
\item 081028 Project released at \url{http://github.com/aatos/chep09tmva}. Moved the proceedings notes from Appendix A to a separate file {\tt notes.tex}.
\item 091029 PK: Commented some points in the proposed
responsibilities. Added a couple of links to the Git documentation.
\item 081021 Title and abstract focus improved after discussion in the group.
\item 081014 First draft done after the idea to have TMVA paper at next CHEP was accepted in the group.
\end{itemize}
\end{appendix}