H. Schellman
Version 1.40 - September 9, 2001
Scripts for this tutorial can be found at:
http://www-d0.fnal.gov/computing/professors_guide/scripts
or in
d0mino: WWW/docs/computing/professors_guide/scripts
or in cvs in the professors_guide package.
In this note, variables you need to type in are surrounded by <>. Don't type <whateverIsaid>; make an appropriate substitution.
SUBSCRIBE D0RUG
http://www-d0.fnal.gov/computing/systems/d0_account.html
For instructions on setup:
http://www-d0.fnal.gov/d0unix/d0unix.html
setup n32      # ONLY ON IRIX - says to use new libraries
setup D0RunII <version>
setup d0cvs
setup cern
This sets up environmentals which point to D0 packages and the cvs code management system. The default D0RunII is the 'current' version. You almost always want to set up a different version.
For example, on September 10, 2001, the most current test release was t01.59.00, but it failed when I tested it, so I'm going to use the p10.01.00 production release for the examples.
setup D0RunII p10.01.00
The production releases are documented at:
http://www-d0.fnal.gov/computing/algorithms/status/index.html
You can find all of the D0 code in directories like
http://www-d0.fnal.gov/d0dist/dist/releases
either on the main D0 web site or on your local machine. On some machines /d0dist/dist may be lower in the directory tree.
The highest-numbered version is normally still being built. Back off by one and you should have the best stable version.
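The "back off by one" rule can be sketched in a few lines of Python. This is only an illustration: it assumes release names of the form letter prefix plus three dotted numbers (as in t01.59.00 or p10.01.00), and the function names are made up for this note.

```python
import re

def release_key(name):
    """Split a name such as 't01.59.00' into (prefix, (1, 59, 0)); None if it does not fit."""
    match = re.fullmatch(r"([a-z]+)(\d+)\.(\d+)\.(\d+)", name)
    if match is None:
        return None
    numbers = tuple(int(part) for part in match.groups()[1:])
    return match.group(1), numbers

def second_newest(names, prefix):
    """Return the second-highest release with the given prefix, or None if there are fewer than two."""
    keyed = []
    for name in names:
        key = release_key(name)
        if key is not None and key[0] == prefix:
            keyed.append((key[1], name))
    keyed.sort()
    if len(keyed) < 2:
        return None
    return keyed[-2][1]
```

You would feed it the directory names found under the releases area, for example the result of os.listdir on that directory, and pass "t" or "p" as the prefix.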
setenv SCRATCH /scratch/7/<your username>
The scratch areas vary from system to system. On d0mino they are:
/scratch/1/ /scratch/4/ /scratch/7/ /scratch/10/
The command df | grep scratch should help you find one on your system. If you want, you can put the SCRATCH definition in your .login, but you may want to wrap it in if statements in case you log in to another machine that shares your home area.
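The idea of choosing a scratch area that actually exists on the machine you are logged into can be sketched in Python. This is a minimal sketch, not part of d0tools; the function name is invented here, and the candidate list is whatever scratch areas your system has.

```python
import os

def pick_scratch(candidates, username):
    """Return <area>/<username> for the first candidate area that exists and is writable, else None."""
    for area in candidates:
        if os.path.isdir(area) and os.access(area, os.W_OK):
            return os.path.join(area, username)
    return None
```

On d0mino you might call pick_scratch(["/scratch/7/", "/scratch/1/"], "<your username>") and use the result as the value of SCRATCH.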
setup python
setup sam
setenv D0TOOLS_DIR $HOME/d0tools
setenv D0TOOLS_BIN $HOME/d0tools/bin
setenv D0TOOLS_DOC $HOME/d0tools/doc
set path=($path $HOME/d0tools/bin)
setenv PYTHONPATH ${PYTHONPATH}:$HOME/d0tools/python
You can get the environmental stuff from
http://www-d0.fnal.gov/computing/professors_guide/scripts/d0tools_setup.csh
http://www-d0.fnal.gov/computing/
http://www-d0.fnal.gov/run2_offline_software/framework/framework.html
http://www-d0.fnal.gov/computing/d0tools/doc
In principle you can do a complete analysis of D0 data by setting the right rcp parameters.
In addition, the algorithms group has a 'Users Corner' page with lots of hints and tips as well as current status of code.
http://www-d0.fnal.gov/computing/professors_guide/quick_guide//
Here is how to set up a test area. Instructions on checking out existing packages and making your own come later.
D0 uses a combination of cvs, make, SoftRel tools and a D0-specific package, CTBUILD, to do code management.
The code is stored in a cvs repository and the normal cvs commands can be quite useful.
The code structure follows SoftRel tools, with each component in its own package.
The ordinary user does not use make directly. Instead, the most usual make directives are generated automatically for you by the CTBUILD package, which has a standard syntax for saying:
I want this cpp file to be compiled, and this component test run, and then I want to link it with X, Y and Z to produce binary B.
The CTBUILD directives in your package will automatically create the correct GNUmakefile.
SoftRel tools has a handy feature which allows you to keep important stuff like modified code on your home area while keeping stuff like binaries and libraries on a scratch area.
What you want to do is make a file in your home area called .srtrc and put the following into it:
extra_dirs="$extradirs tmp>$SCRATCH/$release/tmp bin>$SCRATCH/$release/bin lib>$SCRATCH/$release/lib"
You can find this at
http://www-d0.fnal.gov/computing/professors_guide/scripts/dot_srtrc
newrel -t <version> <mytest>
This will make a test version based on version <version> of the code and place it in directory <mytest>, where <mytest> can be any name you want.
The version can be test, current, p10.01.00 or whatever.
The output of this command should look like:
<d0mino> newrel -t p10.01.00 p10.01
read user srtrc
Creating a test release "p10.01" in the directory
/home/schellma
Linking tmp to /scratch/7/schellma/p10.01/tmp
Linking bin to /scratch/7/schellma/p10.01/bin
Linking lib to /scratch/7/schellma/p10.01/lib
You will get a new subdirectory called p10.01 which contains an empty D0 code structure. Go into that directory and look around.
cd p10.01
ls -pc1
You will see the contents
/GNUmakefile /bin /doc /include /lib /man /results /tmp
This is the master directory for your code.
In principle, you can run D0 code without ever writing a line of C++. This is done by using the D0 framework and run control parameter (RCP) files. reco_analyze is an example of this. It consists solely of a main framework parameter file and a couple of rcp files which tell the program where to find input and put output.
To get the code
addpkg reco_analyze
addpkg has various arguments.
addpkg reco_analyze gets the released version for the code release you are using, in this case p10.01.00. Safest if you just want to use it.
addpkg -h reco_analyze gets the most recent version. You would do this if you were trying to write new code. Not guaranteed to work.
addpkg reco_analyze <branch> gets the most recent version in a different cvs branch. This is mainly used to fix production branches. Do not try this at home!
You should now have a new subdirectory called reco_analyze and if you look in include you will see a link to the reco_analyze subdirectory of reco_analyze.
include -> ../reco_analyze/reco_analyze
Look what's in the reco_analyze package:
ls -pR reco_analyze
reco_analyze:
CVS/ LIBDEPS VERSION binSAM/ rcp/ GNUmakefile SUBDIRS bin/ doc/ test/

reco_analyze/bin:
BINARIES GNUmakefile OBJECTS CVS/ LIBRARIES RecoAnalyze_x.cpp

reco_analyze/binSAM:
BINARIES GNUmakefile OBJECTS CVS/ LIBRARIES RecoAnalyzeSAM_x.cpp

reco_analyze/doc:
CVS/ README.TXT

reco_analyze/rcp:
CVS/ RecoAReadEvent.rcp runRecoAnalyze.rcp RecoADumpEvent.rcp RecoASAMManager.rcp runRecoAnalyzeSAM.rcp RecoANtupleMgr.rcp RecoAWriteEvent.rcp

reco_analyze/test:
CVS/ ITESTS OBJECTS itest.sh GNUmakefile LIBRARIES itest.cpp
GNUmakefile is your top level GNUmakefile. The D0 package CTBUILD makes it unlikely that you will need to change this.
LIBDEPS is a CTBUILD file which tells the D0 build system what libraries your package directly depends on. If you only depend on a library through another package, don't put it here.
SUBDIRS is a CTBUILD file which tells CTBUILD where to look for things you may wish to build. In this minimal case, the only one is bin, which contains instructions on making a binary.
VERSION I dunno what this does - it appears to never have been set.
bin/ This is the binary directory with instructions on how to build the reco_analyze binary.
bin/RecoAnalyze_x.cpp This is an empty file! This is because reco_analyze just links in the default framework and adds no additional functionality aside from the parameters set in the main rcp file.
If you do a gmake, the D0 code management system will build an executable called RecoAnalyze_x and put it in p10.01/bin/<ARCHSPEC> (where <ARCHSPEC> is IRIX6-KCC_3_4 or whatever.) This is what the BINARIES file told it to do but it will not add any functionality unless you modify other packages as part of your test or add to the LIBRARIES and OBJECTS linked with reco_analyze. The SAM section 9 is an example of adding new OBJECTS.
At this point you can either
Here is the real meat of this package: 4 rcp files.
You will see a new subdirectory called rcpdb/ appear. This is where any local changes you make to rcp's will be stored during the code build phase. d0setwa also sets up environmental variables telling the D0 code where to look for permanent rcps. If you do not do d0setwa, you will get strange error messages complaining that 'control.dat' is missing.
I've put some files in
/prj_root/846/cd_1/class_sep01
You want the ones which begin with reco_. The D0 framework cannot read root files yet.
RecoAnalyze_x -rcp reco_analyze/rcp/runRecoAnalyze.rcp \
    -out an.out -log an.err \
    -input_file <infile>
The -input_file argument is optional. Without it, the program looks for a file called 'inputfile' in the current directory.
This will run the program using rcp runRecoAnalyze.rcp for master control and put the output in an.out and an.err. You should get a file called RecoAnalyze.root which contains a physics summary in root format.
Warning- root does not allow overwriting of files. You must either rename or remove RecoAnalyze.root to run again.
If you don't like that name, you can edit the file reco_analyze/rcp/RecoANtupleMgr.rcp and change the variable
string hbk_file = "RecoAnalyze"
to another name.
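Since an existing RecoAnalyze.root has to be renamed or removed before the next run, a small helper can move it out of the way. This is a sketch only; the function name and the ".1", ".2" suffix scheme are choices made for this note, not anything d0tools does for you.

```python
import os

def move_aside(path):
    """Rename an existing file to the first unused 'path.N' name; return the new name, or None if nothing was there."""
    if not os.path.exists(path):
        return None
    counter = 1
    while os.path.exists(f"{path}.{counter}"):
        counter += 1
    backup = f"{path}.{counter}"
    os.rename(path, backup)
    return backup
```

Calling move_aside("RecoAnalyze.root") just before starting RecoAnalyze_x keeps the earlier output and frees the name for the new run.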
In the section above, you used a raw C++ executable and passed it parameters. Much of D0 code requires extra files or communication with the data handling system. This is where the d0tools scripts come in. There is a general d0exe script and specific ones for reco_analyze and reco. d0tools is also sam and batch enabled.
You can find documentation at
http://www-d0.fnal.gov/computing/d0tools/doc
For example, your reco_analyze example above would have looked like:
For example, you can see what's in my example by:
<d0mino> cat /RunII/home/schellma/class_sep01/mcrecofiles.dat
/prj_root/846/cd_1/class_sep01/reco_sim_recocert_p09.08.00maxopt_fnal_pythia_qcd-incl-PtGt20.0_mb-poisson-2.5_224152613_2001_p10.01.00_000
d0setwa
runrecoanalyze -filelist=mcrecofiles.dat -num=100 -name=mcrecorun
This will use the default control files. If you want to run out of a local release you need to go to your test directory and tell d0tools about it. Here, I'm really being fancy and not even telling it I'm running recoanalyze.
rund0exe -exe=RecoAnalyze_x -rcp=runRecoAnalyze.rcp \
    -filelist=mcrecofiles.dat -rcppkg=reco_analyze \
    -localrcp -localfwkrcp -name=test2 -num=10
The d0reco and d0scan packages use a lot of extra files. For those, use the rund0reco command instead of rund0exe.
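The file given to -filelist above is just a text file with one full path per line. A list like that can be produced with a short Python sketch such as the one below. It assumes you want every file in one directory whose name begins with reco_; the function name is made up for this note.

```python
import os

def write_filelist(data_dir, list_path, prefix="reco_"):
    """Write the full paths of files in data_dir that start with prefix, one per line; return how many were written."""
    names = sorted(
        name for name in os.listdir(data_dir)
        if name.startswith(prefix) and os.path.isfile(os.path.join(data_dir, name))
    )
    with open(list_path, "w") as out:
        for name in names:
            out.write(os.path.join(data_dir, name) + "\n")
    return len(names)
```

For instance, write_filelist("/prj_root/846/cd_1/class_sep01", "mcrecofiles.dat") would list the class files that begin with reco_.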
You can find out about it via:
http://www-d0.fnal.gov/d0unix/lsf_guide.html
d0tools supports batch - just add the -batch -q=<queue> options.
rund0exe -exe=RecoAnalyze_x -rcp=runRecoAnalyze.rcp \
    -filelist=myfiles.dat -rcppkg=reco_analyze \
    -localrcp -localfwkrcp -name=testbatch \
    -batch -q=large
If you wish to use lots of data, you should use the SAM data access system. It is documented at http://d0db.fnal.gov/sam
To use it, you have to link your code with the SAM libraries and modify your run.rcp file to include the sam packages.
Sam is basically:
Because SAM keeps track of processing steps, it needs to know a bit more than what file you want. It will store information on what program you ran on what files with a unique name and description that you give it. This information is passed via the SAMManager rcp file or command line arguments.
samify <framework package>
It will change the OBJECTS and LIBRARIES but not check out sam_manager.
gmake clean
gmake all
and
gmake test
to remake your code. The results will show up in the /bin and /lib
areas in your test directory. This will take some time.
You have just added the sam package to your framework executable. These instructions would allow you to add any package to the framework in a similar fashion.
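What samify does to a package amounts to adding one entry to bin/LIBRARIES and one to bin/OBJECTS. The Python sketch below mirrors that using the entry names from the samify script reproduced later in this note (sam_manager and RegSAMManager). Unlike the real script, it skips an entry that is already listed; that check and the function names are additions made here for illustration.

```python
import os

def append_once(path, entry):
    """Append entry as its own line unless the file already lists it; return True if it was added."""
    content = ""
    if os.path.exists(path):
        with open(path) as handle:
            content = handle.read()
    if entry in [line.strip() for line in content.splitlines()]:
        return False
    with open(path, "a") as handle:
        if content and not content.endswith("\n"):
            handle.write("\n")  # keep the new entry on a line of its own
        handle.write(entry + "\n")
    return True

def samify(package_dir):
    """Add the SAM library and registration object to <package_dir>/bin; return (library_added, object_added)."""
    bin_dir = os.path.join(package_dir, "bin")
    library_added = append_once(os.path.join(bin_dir, "LIBRARIES"), "sam_manager")
    object_added = append_once(os.path.join(bin_dir, "OBJECTS"), "RegSAMManager")
    return library_added, object_added
```

The same pattern, with different entry names, is how you would add any other package to a framework executable before rebuilding.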
Put
-input_file "SAMInput:"
on your command line when you run your executable without using d0tools. d0tools is smart enough to do this for you.
Change the ApplicationName to 'generic'
Selections are generally based on the filename (soon to be replaced by keywords) and the data tier (generated, digitized, reconstructed ..).
code             datatier
pythia/isajet    generated
d0gstar          simulated
d0sim            digitized
d0reco           reconstructed
reco_analyze     root-tuple
If you want to run on d0reco output, you need to select 'reconstructed' files.
In general, you should be able to read old files with new code but it is very unlikely that you can go the other way.
You must specify the version number when selecting files. Input files are reprocessed with several versions and any analysis which just uses all reconstructed files will have multiple copies of the same events.
If you are interested in real data, a good place to start is the runs database. With run numbers in hand, you can then look for interesting files for those run numbers. It takes several days for runs to be reconstructed, so don't expect to find root-tuples for the data from 30 minutes before.
The runs database is at:
http://d0db.fnal.gov/run/runQuery.html
I suggest run 130194 as it is a global run taken on a night shift when people were not running in and changing things.
You can see what 130194's conditions are by clicking on its number.
You can also find what files are in SAM by going to the sam data browsing page and looking at 'Data Files'.
Try some key words like
%130194%
for the file name and
t01.56.00
for the Application Version
To look for interesting projects look at: http://d0db.fnal.gov/sam_project_editor/DatasetEditor.html and either use the full search capability or things like keywords, user names, dates.
I defined one for this class called sep01class. Find this definition and see what it has in it.
Look at some other definitions.
Has someone set up a set of files that looks good to you? If so, save the name (or id number) of that project and you can then run a job which reads those files.
Good ones to look for are the recocert samples which are used for reco certification and maintained by Harry Melanson.
If you want to create your own project, you can use the command line interface to sam which is documented at: http://d0db.fnal.gov/sam_project_editor/
or the nifty web based system at:
http://d0db.fnal.gov/sam_project_editor/DatasetEditor.html
#check the definition
sam translate constraints --run 130194 \
--datatier="root-tuple" --applicationfamilyversion="t01.56.00"
#make the definition
sam create dataset definition --defname="hms-130194-example-t01.56-root" \
--group=dzero --datatier="root-tuple" \
--applicationfamilyversion="t01.56.00" \
--defdesc="root format data from run 130194 done with version t01.56"
#see what is in an existing definition, like the set of reco (not
#root) files I made for the class as well.
sam translate constraints --dim="__set__ sep01class-reco"
The last query uses a new interface "dim" which allows much more complicated structures in the query:
sam create dataset definition --defname="hms-130194-example-t01.56-root" \
--group=dzero \
--dim="(run_number 130194 and data_tier root-tuple \
and application_family_version t01.56.00)" \
--defdesc="root format data from run 130194 done with version t01.56"
The dim interface currently has keywords:
<d0mino> sam translate constraints --dim=help
Specify dimensions and constraints combined with and/or/minus operators
as in these examples:
--dim='file_name %ztautau% and data_tier digitized'
--rpn='file_name %ztautau% data_tier digitized and'
--dim='file_name %ztautau%,%ztigtig% or physical_datastream_name e+j'
--dim='(data_tier digitized and appl_name d0reco and version t01.46.00) \
minus run_number 40041'
Available dimensions (not case sensitive):
APPL_NAME APPL_NAME_ANALYZED
CREATE_DATE DATA_FILE_LOCATION_STATUS
DATA_FILE_NAME DATA_TIER
DELIVERED_STATUS EVENT_NUMBER
FAMILY FAMILY_ANALYZED
FILE_ANALYZED FILE_NAME
FILE_STATUS FULL_PATH
LOGICAL_DATASTREAM_NAME PATH
PHYSICAL_DATASTREAM_NAME PROJECT_NAME
RUN_ID RUN_NUMBER
RUN_TYPE RUN_TYPE_ID
TAPE_LABEL VERSION
VERSION_ANALYZED __SET__
The dimension __SET__ is a special dimension which lets you combine
prior dataset definitions into your new dataset definition, simply
use __SET__ as your dimension name and the name of the existing
dataset definition as the constraint value, e.g.
--dim='file_name %ztautau% minus __set__ my-files-already-analyzed'
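If you build many queries, it can help to assemble the --dim string from a list of dimension/value pairs rather than typing it by hand. The Python sketch below only does string formatting in the style of the help examples above; it does not talk to SAM, and the function names are invented for this note.

```python
def dim_expression(constraints):
    """Join (dimension, value) pairs with 'and' and wrap the result in parentheses."""
    if not constraints:
        raise ValueError("at least one constraint is needed")
    terms = [f"{dimension} {value}" for dimension, value in constraints]
    return "(" + " and ".join(terms) + ")"

def minus(expression, dimension, value):
    """Exclude files matching one more constraint from an existing expression."""
    return f"{expression} minus {dimension} {value}"
```

For example, dim_expression([("data_tier", "digitized"), ("appl_name", "d0reco"), ("version", "t01.46.00")]) followed by minus(..., "run_number", 40041) reproduces the last help example; pass the resulting string as the value of --dim.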
Running a project consists of 3 phases,
First
setup sam
To run a general executable you must have built that executable with sam manager and have sam turned on in your framework rcp.
Note that I have turned on the -localbuild flag to tell d0tools to use the version of RecoAnalyze_x I just rebuilt with sam. Actually the reco_analyze package already had one called RecoAnalyzeSAM_x but you needed the practice.
rund0exe -localbuild -exe=RecoAnalyze_x -rcp=runRecoAnalyzeSAM.rcp \
    -defname=sep01class-reco \
    -rcppkg=reco_analyze -localrcp -localfwkrcp -name=testsam -batch \
    -q=large -num=100
However, if you are using d0reco or reco_analyze, d0tools is smart enough to pick out the SAM enabled executable.
runrecoanalyze -defname=sep01class-reco \
    -rcppkg=reco_analyze -localrcp -localfwkrcp -name=testsam -batch \
    -q=large -maxopt
Here I could use the optimized version.
What d0tools is doing is putting a script around the sam submit command. If you find that d0tools is not doing what you want, write your own script and use sam submit to handle the data access part. Documentation can be found by doing:
sam submit --help
http://www-d0.fnal.gov/~schellma/root_example.html has an analysis done using a root-tuple similar to the one generated above.
Exercise for the user is to convert from
to
.
ls -c1p vertex_analyze
GNUmakefile        // as usual
HEADER.html        // makes a nicer web page for the directory
LIBDEPS            // already seen in reco_analyze
README.html        // makes a nicer web page
SUBDIRS            // tells CTBUILD about the subdirectories
VERSION            // contains the version number
rcp/               // as in reco_analyze
src/               // new - contains the source code
test/              // tests
vertex_analyze/    // header files
bin/               // instructions on creating binaries
doc/               // Documentation !!!
vertex_analyze does all sorts of things: it has rcp control, and it is registered with the framework for use in reco_analyze. It accesses chunks and uses heptuple.
You need to be an owner of a package to store it.
See the Professor's guide to releasing code.
All of this is in
schellma/tutorial on d0mino.
http://cdspecialproj.fnal.gov/d0/rcp
But here is a short summary:
// these are tracked in the RCP database
bool lie = true
int ten = 9
int threeD = (1,0,0) // comment
float eleven = 11.
double twelve = 12.
string list = "a b and c"
RCP anotherRCP = <mypackage anotherRCP> // you can access a whole new RCP here
// these are not stored in the db
untracked int counter = 1
untracked bool recent = NO
untracked string why = "Cause"
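#include <string>

// Assumes the framework header declaring edm::RCP and
// edm::XRCPNotFound has already been included.
void function(const edm::RCP& r){
  try{
    // tracked parameter: throws if "lie" is missing
    bool lie = r.getBool("lie");
    // untracked parameter: the second argument allows a default!
    std::string why = r.getUntrackedString("why","Idunno");
    // should get why == "Cause" with above rcp set
    // why == "Idunno" without above rcp set
  }
  catch (const edm::XRCPNotFound& x){
    // handle missing parameters here, only lie gets caught
  }
}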
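To make the tracked/untracked distinction and the default-value behaviour concrete, here is a toy reader for the simple entries in the summary above, written in Python. It is emphatically not the real RCP parser: it handles only bool, int, float, double and string lines, skips nested RCP entries, would be confused by "//" inside a quoted string, and every function name in it is invented for this note.

```python
def parse_value(kind, text):
    """Convert the text to the right of '=' according to the declared type."""
    if kind == "string":
        return text.strip('"')
    if kind == "bool":
        return text.lower() in ("true", "yes")
    convert = int if kind == "int" else float
    if text.startswith("("):
        # a parenthesised, comma-separated list such as (1,0,0)
        return [convert(item) for item in text.strip("()").split(",")]
    return convert(text)

def parse_rcp(source):
    """Return (tracked, untracked) dictionaries for the simple entries found in source."""
    tracked, untracked = {}, {}
    for raw in source.splitlines():
        line = raw.split("//", 1)[0].strip()  # drop comments
        if not line or "=" not in line:
            continue
        declaration, value = (part.strip() for part in line.split("=", 1))
        words = declaration.split()
        target = tracked
        if words[0] == "untracked":
            target, words = untracked, words[1:]
        if len(words) != 2 or words[0] not in ("bool", "int", "float", "double", "string"):
            continue  # nested 'RCP name = <package file>' entries are skipped in this sketch
        kind, name = words
        target[name] = parse_value(kind, value)
    return tracked, untracked

def get_untracked_string(untracked, name, default):
    """Mimic the default-value behaviour described for getUntrackedString."""
    return untracked.get(name, default)
```

With the summary text above as input, get_untracked_string returns "Cause" for why, and it falls back to the supplied default when the entry is absent.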
For the interfaces, see the framework documentation - in general you don't need to change these unless you are adding major functions.
The string Packages tells the framework which packages you want to run. In this case you get geometry, event reading, root-ntuple generation, a bunch of analysis packages, dumps of events (printout) and write out summary events.
Below the Packages list is an RCP mapping which tells the framework where to find the configuration for each package:
For example
RCP jet = <jetanalyze JetAnalyze>
tells the framework to get the rcp file JetAnalyze from the jetanalyze package. It is normally in the sub-directory jetanalyze/rcp.
Most of the RCP files come from the installed packages but reco_analyze has 3 changed ones for controlling i/o.
#!/usr/local/bin/tcsh -f
echo "sam_manager" >> $1/bin/LIBRARIES
echo "RegSAMManager" >> $1/bin/OBJECTS
exit 0
addpkg reco_analyze
Release t01.46.00 uses reco_analyze version v00-05-05, will check that out
Adding package "reco_analyze" to ".".
cvs checkout -P-r v00-05-05 reco_analyze
cvs checkout: CVSROOT is set but empty! Make sure that the
cvs checkout: specification of CVSROOT is legal, either via the
cvs checkout: `-d' option, the CVSROOT environment variable, or the
cvs [checkout aborted]: CVS/Root file (if any).
cvs checkout failed.
addpkg failed.
You don't have cvs set up.
setup d0cvs
then try the addpkg again.
RecoAnalyze_x -rcp runRecoAnalyze_x
%ERLOG-A init:
Error trying to create RCP database object (flavor = FileSystemDB
DB name = personal)
The database's control file name is: /home/schellma/personal/control.dat
In this file, we found the database ID : 2
The value we expected for this database is: 99
Framework 7-Oct-2000 18:43:33 - -
%ERLOG-A init: Framework constructor: RCP problem
Framework 7-Oct-2000 18:43:33 - -
Framework completely failed
Exception Information: Framework constructor: RCP problem
d0setwa
and RCP environmentals will now point to your local area.
The framework command line can be used to over-ride some rcp variables:
//
// RCP parameter        Command line option
//-------------------------------------
// InputFile            -input_file <file>
// SkipEvents           -skip_events <n>
// NumEvents            -num_events <n>
// MaxEventsPerFile     -per_input_file <n>
// NumFiles             -num_files <n>

// Command line overrides
//
// RCP parameter        Command line option
//-------------------------------------
// OutputFile           -output_file <file>
AnalysisProject -project project_name (see information below)
Station -station station_name (see information below)
ProcessDescription -desc description (see information below)
-app_family family_name (see information below)
ApplicationName -app_name application_name
ApplicationVersion -app_version float_version_number (see information below)
WorkingGroup -working_group group_name (see information below)
More on these variables:
-project: the analysis project name
-station: The analysis project and station names.
Both must be given when using the data handling system.
-desc: Optional description of the process. An
example would be "This is a test process"
-app_family:
-app_version: Information required for data handling
system when the consumer ID is not specified. All must be
present when the consumer ID is not specified. The first
two are short for application family and application version.
-working_group: If family/version are given, then this is optional?
Advanced:
-consumer_id: Integer identifier from the data handling
system that specifies the process application family
and version, and working group.
-name process_name (Name of this process)
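The way a command line option takes precedence over the value in an rcp file can be illustrated with a short Python sketch. It covers only the input and output options tabulated in this section, keeps every value as a string, and is a model of the idea rather than the framework's own option handling; the names OVERRIDES and apply_overrides are made up for this note.

```python
# Command line option -> RCP parameter, taken from the tables in this section.
OVERRIDES = {
    "-input_file": "InputFile",
    "-skip_events": "SkipEvents",
    "-num_events": "NumEvents",
    "-per_input_file": "MaxEventsPerFile",
    "-num_files": "NumFiles",
    "-output_file": "OutputFile",
}

def apply_overrides(parameters, argv):
    """Return a copy of parameters in which recognised options from argv replace the rcp values."""
    result = dict(parameters)
    index = 0
    while index < len(argv):
        option = argv[index]
        if option in OVERRIDES and index + 1 < len(argv):
            result[OVERRIDES[option]] = argv[index + 1]
            index += 2  # skip the option and its value
        else:
            index += 1  # not an override handled here
    return result
```

Options that are not in the table, such as -rcp, are simply ignored by this sketch.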