DØ "How To" Guide to
How To Run RECO
Page updated July 13, 2002
This document describes how to a) run the default
version of the DØ reconstruction program (RECO), b) customize your own version of RECO to run a subset of the
standard algorithms and c) build and run a RECO that
includes your own version of some existing algorithm.
Introduction
The DØ Offline Reconstruction Program (RECO) is responsible for
reconstructing objects that are used to perform all DØ physics analyses.
It is a CPU intensive program that processes either collider events recorded
during online data taking or simulated events produced with the DØ Monte
Carlo (MC) program. The executable is run on the offline production farms and
the results are placed into the central data storage system for further
analysis. The program uses the DØ Event Data Model (EDM) to organize the
results within each event. EDM manages information within the event in the form
of chunks. The Raw Data Chunk (RDC), created either by the Level 3 trigger
system or the MC, contains the raw detector signals and is the primary input to
RECO. The output from RECO is many additional chunks associated with each type
of reconstructed object. RECO is designed to produce two output formats which
can be used for physics analyses, and which are optimized for size. The Data
Summary Tape (DST) contains all information necessary to perform any physics
analysis, and is designed to be 150 kB per event. The Thumbnail (TMB) contains
a summary of the DST, and is designed to be 15 kB per event. The TMB can be
used directly to perform many useful analyses. In addition, it allows the rapid
development of event selection criteria that will be subsequently applied to
the DST sample.
RECO is structured to reconstruct events in several hierarchical steps.
The first involves detector-specific processing. Detector unpackers process the
RDC by unpacking individual detector data blocks. They decode the raw
information, associate electronics channels with physical detector elements and
apply detector specific calibration constants. For many of the detectors, this
information is then used to reconstruct cluster (for example, from the
calorimeter and preshower detectors) or hit (from the tracking detectors)
objects. These objects use geometry constants to associate detector elements
with physical positions in space. The second step in RECO focuses on the output
of the tracking detectors. Hits in the silicon (SMT) and fiber tracker (CFT)
detectors are used to reconstruct global tracks. This is one of the most
CPU-intensive activities of RECO, and involves running several algorithms. The
results are stored in corresponding track chunks, which are used as input to
the third step of RECO, vertexing. First, primary vertex candidates are
searched for. These vertices indicate the locations of ppbar interactions and
are used in the calculation of various kinematical quantities (e.g. transverse
energy). Next, displaced secondary vertex candidates are identified. Such
vertices are associated with the decays of long-lived particles. The results of
the above algorithms are stored in vertex chunks, and are then available for
the final step of RECO - particle identification. This step produces the
objects most associated with physics analyses and is essential for successful
physics results. Using a wide variety of sophisticated algorithms, information
from each of the preceding reconstruction steps is combined and standard
physics object candidates are created. RECO first finds electron, photon, muon,
neutrino (missing ET) and jet candidates, which are based on detector, track
and vertex objects. Next, using all previous results, candidates for
heavy-quark and tau decays are identified. Additional physics object
identification is planned (e.g. Ks, Lambda, J/psi, W, Z, etc.) and will be
added as the reconstruction algorithms become available.
RECO is developed and maintained by the DØ
Algorithms group, which is composed of the
detector, tracking, vertexing and Object ID sub-groups. The program is
currently organized into 36 sub-systems, which reside in about 180 individual
software packages.
Running the default version
If you would like to process a sample of events through the default
version of RECO, you can do the following:
- Choose the version of RECO that you want to use. Certified versions
are documented on the RECO Status web page.
These versions have been tested on a large number of events and are expected to
work. (NB: Testing is performed mostly on a Linux platform using the maxopt
build.) If you have problems running one of these versions, please report
them. Other versions are under development, and may not be stable.
Typically, only production releases (those numbered pxx.xx.xx) are certified.
Development releases (numbered txx.xx.xx) will contain the latest version of
RECO algorithms, but may not be suitable for general use.
- Set up the DØ software environment:
- > setup n32 (d0mino only)
- > setup D0RunII pxx.xx.xx
- > setup d0tools
- Create a work area where you want to run your job(s). Usually you
will need to do this on a project (or scratch) disk (someplace where you will
not run out of disk space). Project disks are available through the Physics, Object
ID and Algorithms groups.
- > mkdir myreco
- > cd myreco
- Create a text file that contains a list of input files that you want
to process. For example, if you have two data files in the area
/prj_root/704/higgs called file1.raw and file2.raw
- > emacs mydata.dat
- Add the lines
/prj_root/704/higgs/file1.raw
/prj_root/704/higgs/file2.raw
- Run reco.
- > runreco -filelist=mydata.dat -format=data -batch -maxopt
- Normally you will submit your job to the batch system of the
computing system you are using. This example does that (-batch). You may need
to learn details about specific batch queues, like allowed CPU time, memory
limits, etc. "Ask your friends" for more details. You probably want to use the
optimized version of RECO, since it runs about two times faster (-maxopt).
"runreco" will create a directory in your working area. The name of the
directory will be "D0Reco_x-version-opsys-mydata", where "version" is the
version of RECO (e.g. "p11.09.00-maxopt") and "opsys" is the operating system
(e.g. Linux). You can override this default directory name by using the "-name"
option. All results from the job will be placed in this directory.
- You can get more information about the runreco command by doing
"runreco -h", or you can refer to the d0tools web page. d0tools also supports
running many of the other standard DØ executables; please see the web page for
details. Once you learn how to run one executable, you will know a lot about
running the others.
- Once your job is completed, the results will end up in the
"D0Reco_x-version-opsys-mydata" directory. The following output files are of
general interest:
- D0Reco_x.log - The log file from the job. Contains error
messages.
- D0Reco_x.out - General message file.
- events.read - A summary file describing the number of events that
have been processed.
- events.write - A summary file describing the number of events
that have been written out.
- outputfile* - The processed events (the "DST"). There may be more
than one file (e.g. outputfile_00).
- tmbfile - The "thumbnail" file.
- Batch log files:
- On d0mino, look at D0reco_x.berr and D0reco_x.bout
- On clued0, look at runME.e* and runME.o*
- There are a few things to check after the job has been completed, to
see if RECO ran properly:
- If you know how many events are in your input files, look at the
events.read file and see if that number has been processed.
- Look at D0Reco_x.log. There will be a lot of informative
"ERLOG-i" and warning "ERLOG-w" messages. Examine the last set of messages,
and look for "serious" messages (more on this will come later).
- Look at the batch log files (especially the .berr / .e* error
logs). Crashes involving segmentation faults or floating point exceptions
usually have messages in these files.
- The d0tools commands (e.g. runreco) also support processing events in
SAM, assuming the computing system you are using has SAM available (at this
time, clued0 does not). The easiest way to run a job to process events in SAM
is to define a dataset, and use "-defname=mydef" instead of
"-filelist=mydata.dat".
- Odds and ends:
- On d0mino, trapping of floating point exceptions is typically
turned on. If you encounter a crash when processing events, you may wish to
disable this feature. Simply unset the environment variable TRAP_FPE.
- On Linux machines, trapping of floating point exceptions is
currently disabled. If you would like to run with this feature turned on, use
the "-fpe" option.
- If you experience a RECO crash, you can determine where the crash
occurred by rerunning RECO in the debugger. First set up the debugger (setup
totalview) and then use the "-debug" option (without the "-batch" option, since
this will be done interactively).
- If you are experiencing problems processing your events, even
after dealing with floating point exception handling, you may consider either
skipping problematic events, or turning off problematic algorithms in RECO
(assuming that they are not important to your analysis). The next section
describes how to do this.
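Two of the manual steps above, writing the input file list and scanning the
job output, lend themselves to a small script. The following is a minimal
sketch, assuming a Bourne-compatible shell (this guide does not say which shell
the ">" prompt belongs to). The function names make_filelist, count_erlog and
crash_logs are hypothetical and are not part of d0tools. The sketch also
assumes that input files end in ".raw", that the literal tags "ERLOG-i" and
"ERLOG-w" appear in the log lines, and that crash messages contain the words
"segmentation fault" or "floating point"; adjust these to what your files
actually contain.

```shell
#!/bin/sh
# Hypothetical helpers; none of these functions are part of d0tools.

# Write one path per line for every *.raw file found in a directory.
# Usage: make_filelist /prj_root/704/higgs mydata.dat
make_filelist() {
    dir=$1
    out=$2
    : > "$out"                      # start with an empty list
    for f in "$dir"/*.raw; do
        [ -e "$f" ] || continue     # skip the literal pattern if nothing matched
        printf '%s\n' "$f" >> "$out"
    done
}

# Count log lines carrying a given ERLOG severity tag (assumed log format).
# Usage: count_erlog D0Reco_x.log w      (counts lines containing ERLOG-w)
count_erlog() {
    # grep -c prints 0 but exits non-zero when nothing matches; ignore that.
    grep -c -- "ERLOG-$2" "$1" || true
}

# Print the names of batch log files that mention a crash.
# The search strings are a guess at the wording of the messages.
# Usage: crash_logs *.berr      (or runME.e* on clued0)
crash_logs() {
    [ "$#" -gt 0 ] || return 0      # no files given, nothing to report
    grep -il -e 'segmentation fault' -e 'floating point' "$@" || true
}
```

For example, "make_filelist /prj_root/704/higgs mydata.dat" would write the
list used above, and "count_erlog D0Reco_x.log w" would print the number of
warning lines in the log of a finished job.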
Customize RECO by turning off a step
Many users want to run the standard RECO executable, turning off some
algorithms that they are not interested in. To do this, you will need to check
out the d0reco package from CVS and modify the appropriate reco framework rcp
file.
- Set up the DØ software environment (see above). You will also
need to set up CVS:
- Make a "local release" area and check out the reco CVS package
- > newrel -t pxx.xx.xx myreco
- > cd myreco
- > addpkg d0reco
- Build your own executable. In this example, you will make an
optimized build ("maxopt"), so that you get better performance. Also, by
running "gmake test", you will confirm that the default RECO passes its
integrated tests (just to be safe).
- > srt_setup SRT_QUAL=maxopt
- > d0setwa
- > gmake all
- > gmake test
- There are many high level rcp files in the d0reco package that
control how reco will be run. These files reside in the d0reco/rcp area, and
are called runD0reco*.rcp (documentation is available in
d0reco/doc/runscripts.txt). Two standard files are runD0reco_data.rcp (for
processing real data) and runD0reco_mc.rcp (for processing Monte Carlo).
- Within each of these files is a line like
string Packages = "init read cfgm unpack det gtr vtx pid tmb tag dump special rdcdistill write writetmb fini"
This line controls the highest level of processing. It includes the
following steps:
- init - "initialization"
- read - "read events"
- cfgm - "configure unpacking"
- unpack - "unpack raw data"
- det - "process detectors" (e.g. find clusters)
- gtr - "perform tracking"
- vtx - "perform vertexing"
- pid - "perform particle ID"
- tmb - "build thumbnail"
- tag - "tag special events"
- dump - "optionally print events"
- special - "tag special events"
- rdcdistill - "distill raw data chunk to hold only trigger info"
- write - "write out DST"
- writetmb - "write out thumbnail"
If you would like to turn off processing at this high level, remove
the appropriate item from the "Packages" line.
- There is a second level of control in reco, which handles what is
done in each of the above steps. As an example, for the pid step there is a
line in the runD0reco*.rcp file that looks something like
RCP pid = <d0reco reco_pid> // Particle ID reconstruction
The format of this line is
RCP framework-package = <cvs-package rcp>
where the "framework package" is controlled by the "rcp" that comes
from the corresponding "CVS package". This rcp will be found in
"cvs-package/rcp". Thus the "pid" line means that the file
d0reco/rcp/reco_pid.rcp controls which packages are run during the pid stage.
If you want to turn off one part of the pid step, remove the appropriate item
from the "Packages" line in reco_pid.rcp. The same applies to all other steps. If the
"cvs-package" is not d0reco, you will need to "addpkg cvs-package" in order to
get access to that rcp.
- Once you have modified the rcp's, you can run reco using the runreco
command as described above, adding the two switches "-localrcp -localbuild".
- > runreco -filelist=mydata.dat -format=data -batch -maxopt -localrcp -localbuild
- See the above description for how to understand the output from the
reco job.
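The two rcp edits described in this section can also be scripted. The sketch
below is an illustration only: drop_package and rcp_path are hypothetical
helper names, and the code assumes a Bourne-compatible shell, that the
"Packages" entry sits on a single line of the rcp file, and that an RCP
reference has the form shown above. drop_package prints the edited file to
standard output, so the original is left untouched until you choose to
replace it.

```shell
#!/bin/sh
# Hypothetical helpers for editing reco framework rcp files.

# Print an rcp file with one name removed from its "Packages" line.
# Usage: drop_package d0reco/rcp/runD0reco_data.rcp dump > edited.rcp
drop_package() {
    awk -v drop="$2" '
        /string[ \t]+Packages[ \t]*=/ {
            q1 = index($0, "\"")             # position of the opening quote
            head = substr($0, 1, q1)         # text up to and including it
            rest = substr($0, q1 + 1)
            q2 = index(rest, "\"")           # position of the closing quote
            if (q1 == 0 || q2 == 0) { print; next }   # unexpected layout: leave as is
            body = substr(rest, 1, q2 - 1)   # the package names
            tail = substr(rest, q2)          # closing quote and anything after
            n = split(body, names, /[ \t]+/)
            out = ""
            for (i = 1; i <= n; i++)
                if (names[i] != "" && names[i] != drop)
                    out = out (out == "" ? "" : " ") names[i]
            print head out tail
            next
        }
        { print }
    ' "$1"
}

# Turn an "RCP name = <cvs-package rcp>" line into the path of the rcp file
# it refers to, following the cvs-package/rcp convention described above.
# Usage: rcp_path "RCP pid = <d0reco reco_pid>"
rcp_path() {
    printf '%s\n' "$1" |
        sed -n 's|.*<[[:space:]]*\([^[:space:]>]*\)[[:space:]][[:space:]]*\([^[:space:]>]*\)[[:space:]]*>.*|\1/rcp/\2.rcp|p'
}
```

Names are matched as whole words, so dropping "write" leaves "writetmb" in
place. Compare the edited copy with the original before replacing it, then run
runreco with "-localrcp" as shown above.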
Build and run your own version of RECO
Some users may wish to test modifications to existing RECO algorithms.
The following steps show how to build a local version of RECO that includes
such modifications.
- Set up the DØ software environment (see above).
- Make a "local release" area and check out the default RECO system
- > newrel -t pxx.xx.xx myreco
- > cd myreco
- > addpkg d0reco
- To be careful, you should build and test the default version before
making any modifications.
- > d0setwa
- > gmake all
- > gmake test
- Make sure there weren't any errors. If you are using a certified
version of reco and there was a problem, report it to the RECO coordinator. If
you are using a non-certified version, you may be experiencing a known problem.
You may want to contact the RECO coordinator to understand the current
status.
- Check out the package that contains the algorithm that you want to
work on. For example, emreco performs the reconstruction of EM objects:
- > addpkg emreco
- Make your modifications.
- Build your local RECO and run its integrated tests:
- > gmake all
- > gmake test
- You can choose to either build your executable non-optimized (as
above), or optimized. Optimized builds typically run twice as fast as
non-optimized, but can be more difficult to debug. To make an optimized
executable, issue the command "> srt_setup SRT_QUAL=maxopt" before "gmake
all".
- Check that everything compiled and the integrated tests ran
properly.
- Run your version of RECO on your own sample of events:
- > runreco -filelist=mydata.dat -format=data -batch -maxopt -localrcp -localbuild
- See the above description for how to understand the output from the
reco job.
- If your job crashes, you may need to run the debugger:
- > setup totalview
- > runreco -filelist=mydata.dat -format=data -localrcp -localbuild -debug
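The build and test steps above are easy to chain so that a failure stops the
sequence before you submit a job. The following is a minimal sketch, assuming
a Bourne-compatible shell and that "gmake all" and "gmake test" return a
non-zero exit status on failure (the usual make behaviour, but not confirmed
here for this build system). build_and_test is a hypothetical name.

```shell
#!/bin/sh
# Hypothetical wrapper around the build and test commands used in this guide.

# Run the build, then the integrated tests, stopping at the first failure.
# Returns 1 if the build fails and 2 if the tests fail.
build_and_test() {
    gmake all  || { echo "build failed" >&2; return 1; }
    gmake test || { echo "integrated tests failed" >&2; return 2; }
    echo "build and tests passed"
}
```

Run it from the top of your local release area after "d0setwa", and after
"srt_setup SRT_QUAL=maxopt" if you want an optimized build.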
This page maintained by Harry Melanson