Page updated: July 14, 2003

Run II Reconstruction Program Status - Version p13

Latest version in production (data):
p13.06.01
Latest version in production (MC):
p13.07.00
Replaced by p14



Overview

The p13 version of the reconstruction program represents a major milestone for the DØ experiment. It was originally intended to be the version used to reconstruct data taken after the "October, 2002" accelerator shutdown. However, that shutdown was postponed, and instead efforts have been refocused towards reconstructing "50 pb^-1" for the Spring, 2003 conferences (Moriond, etc.).

One of the primary improvements targeted for p13 was to deploy the best currently available set of track reconstruction algorithms. A task force was formed to review the performance of existing algorithms, and recommended that RECO use as default the "GTR + HTF" algorithm. Some details concerning this review are available.

In addition, the various Algorithm and Object ID groups had the following goals for p13:

Based on requests from the WZ and Top physics groups, the following goal was also established:


Status

Now...

p13.06.01 is running on the data farms. p13.07.00 is running on the MC farms.

History...

According to the "Overall Goals of the Experiment", p13 RECO had two major milestones: cut the first version the week of September 30, and make it available for farm production the week of October 28. On Friday, October 4, we started our final build of t02.35.00, which was copied to make p13.00.00, achieving the first p13 milestone. On Friday, November 1, the final build of p13.02.00 completed. It was frozen on November 4 and began official production of data on November 5.

This successful achievement of the p13 RECO goals was announced in D0 News.

One of the highest profile goals for RECO p13 was the implementation of the recommendations of the TARC committee regarding the default tracking algorithm (GTR + HTF). This has been accomplished and preliminary studies indicate performance consistent with TARC p11 testing. As a cross check, p11 GTR-only results were also reproduced. Other general performance numbers (average number of muons and electrons (tight, medium, loose), mean and RMS MEX, MEY and MET, CPU time, memory, DST size, TMB+ size) look reasonable. More information on RECO performance is available.

The following major goals have been achieved:

In addition to RECO, there are developments in the Level 3 system. The status for p13 is available.

The following RECO test samples are available:

Other collections of interesting event samples:


Performance

The following numbers indicate some general performance characteristics of RECO p13.02.00. These numbers are representative; we observe fluctuations depending on exactly which events are processed. More recent versions of p13 have similar performance.

Sample               NEvt  CPU sec/evt (1 GHz)  RSS Memory (MB)  VSIZE Memory (MB)  DST size (KB)  TMB size (KB)
Run 155554           200   20.6                 306.2            425.7              157.1          15.2
Run 160588           200   26.9                 368.2            476.1              244.0          16.8
Run 165985           200   18.3                 307.6            442.5              138.3          15.2
Z(ee) data
Z(mm) data
Dimuon (tight) data  50    20.1                 261.9            350.4              150.9          16.6
Z(ee) + 0 mb p10 MC  200   11.0                 255.1            349.9              265.2          26.5
Z(mm) + 0 mb p10 MC  200   13.2                 279.0            379.4              264.7          24.7

Standard plots are also available (p13.00.00):

Other plots


Upcoming Features

Next RECO pass release: None scheduled.


Known Problems

red dot Major error, yellow dot Annoying feature, blue dot Functionality to be added
green ball Scheduled for next release, blue dot Problem fixed.
(common = occurs within 100 events, rare = occurs within 1000 events)

Status Description Fixed in version
red dot Bad PDT data taken in run range 168618 to 169290. Not a RECO problem per se, but documented here to help "spread the word".  
yellow dot Currently available RECO rcp's do not support reprocessing Monte Carlo trigsim output so that l1l2chunk is available in the thumbnail. See "User Information" below for a prescription to get around this problem.  
yellow dot Analyzing skimmed thumbnails can take a long time. Part of the problem is that the current code reloads the magnetic field each time a new run number is encountered. This will be fixed in a future release; however, users may make local executables with a patch to cure the problem. In addition, users can customize the unpacking of the thumbnail for their purposes and speed up processing considerably.
yellow dot p13.04.00 is taking up to 45 GHz-seconds per event to process certain recent data on the farms. This is about a factor of 2 more than benchmark studies and seems to occur when the instantaneous luminosity is at its highest, resulting in large tails in CFT . For these runs, the HTF tracking time grows very large. The data indicates ~ 1% of the events cause a factor of ~ 2 increase in CPU time required. This appears to be the first evidence of the expected rapid increase in tracking CPU time with multiple interactions. This problem will have to be dealt with in p14. N/A
yellow dot recocert certification statistics don't handle duplicate muon objects correctly.  
yellow dot Jet reconstruction complains about missing Monte Carlo chunks when processing data. Won't be solved in p13. N/A
yellow dot RECO crashes with floating point errors on d0mino. The same events do not have this problem on Linux. Unfortunately, totalview on d0mino crashes when trying to debug. Users are advised to "unset TRAP_FPE" to get around this problem. Won't be solved in p13. N/A
yellow dot Can't read thumbnails generated with p13.01.00 using p13.02.00 or beyond. This problem will not be fixed. N/A
yellow dot L1cal eta values for calorimeter towers are wrong. A prescription for how to deal with the problem is available. p13.07.00
blue dot Use SMT and CFT calibration databases (currently use flat files). Version p13.07.00 provides this functionality and has been tested using a small number of jobs. More extensive testing will be performed on the farms, and the plan is to move to this mode of operation in p14. p13.07.00
blue dot An incomplete CFT calibration file was installed on the farms for processing runs 169521 - 170008, resulting in missing stereo hits for one sector. This resulted in poor 3D reconstruction efficiency in that sector. More details are available. Data has been reprocessed.
blue dot When generating events with p13.05.00 (and previous versions), a small number of cells end up with anomalously large energies. A description of the source of the problem is available from Robert Zitoun. p13.06.01
blue dot RECO crashes on IRIX using maxopt build, due to uninitialized variable _pshwrE in EmTmbObj. p13.06.00
blue dot The calorimeter non-linear corrections were not applied in p13.04.00. This results in worse resolution, especially in the EC. The problem is limited to p13.04.00 (i.e. previous versions are ok). A study documented the effects on jets (plots). p13.05.00
blue dot When reading thumbnails produced with p13.02 using p13.04, the charge of the muon is always zero. p13.05.00
blue dot The definition of the muon thumbnail variable qptloc is different for p13.02/03 and p13.04. In p13.02/03 it is q*ptloc; in p13.04 it is q/ptloc, where pt is the local pt of the muon. p13.05.00
blue dot Memory leak in gtrbase::GTrack::get_num_measurements caused problems with default running of RECO and RecoCert_x. This problem does not show up on the farms, because they disable recocert processing. However, "normal users" are affected. p13.05.00
blue dot Crashes in p13.03.00 muonid due to overwriting fixed arrays (rare) p13.04.00
blue dot No FPS clusters in thumbnail. p13.04.00
blue dot Memory leaks have been identified in the code used to unpack the thumbnail. This affects users trying to analyze p13.02.00 thumbnails, limiting the number of events that can be processed per job. p13.04.00
blue dot If 2 local muons share hits they are considered as coming from the same physical muon; one needs a criterion to choose the best. Until now the criterion used (among other things) the pT of the local muon track. Detailed studies show that this introduces an important bias in pT. This will be changed to use the chi2 instead of the pT. (Christophe Clement) p13.04.00
blue dot For one local muon track, there are several matchings with different central tracks which are kept in MuonParticleChunk (can deal with that by checking the local track and central track indices). p13.04.00
blue dot No trigger information in thumbnail. p13.04.00
blue dot Missing quotation mark in MuoCalTrackReco.rcp causing "email danielw@fnal" error message. p13.04.00
blue dot The solenoid field is shifted by 1cm in Z in RECO with respect to ideal position. In addition, data indicates possible additional shift. Proposal to fix the known 1cm shift in MC and data in p13.04.00. p13.04.00
blue dot In p13.02.00, unphysical CFT clusters were split. This seems to result in a 50% increase in processing time, with little change in performance. Proposed to revert back to previous handling of clusters. p13.04.00
blue dot Thumbnail does not contain information about cells removed by NADA. Thumbnail now includes CalNadaChunk. p13.04.00
blue dot RECO_ANALYZE crashes in vertex_analyze because the quark list in BlockQuark.cpp exceeded the allocated size. p13.04.00
blue dot RECO crashes with an assert in MRCCtrlMCHModule (pmod->size() != 2). Caused by corrupt raw data. Change assert to ErrorLogger error message. p13.03.00
blue dot RECO crashes with a segmentation violation in muonid::MuonFind1 (rare). Caused by corrupt muon raw data. Protection to be installed to prevent crashing. p13.03.00
blue dot RECO crashes with a segmentation violation in cps_reco (rare). Caused by corrupt CPS raw data. Protection to be installed to prevent crashing. p13.03.00
blue dot RECO crashes in cps_evt/CPS3DCluster with the exception "SLC cache not set" (rare). Seems to be same cause as above problem. p13.03.00
blue dot RECO goes into an infinite loop in impTagTrackCategory::GetProbability (rare - occurs more often when processing p13 single particle Monte Carlo). p13.03.00
blue dot RECO crashes with a floating point exception in impTagAlg::calculate() (rare). Protection to be installed to protect against divide by zero. p13.03.00
blue dot The 8x8 hmatrix chisq in the CC appears to have a problem. The EC looks ok. Reported in w-wenu p13.02.00 recocert Monte Carlo sample. Caused by incorrect noise/zsup file in p13.02.00 d0sim. p13.03.00
blue dot Bug in calorimeter_geometry affecting extrapolations using CellLine p13.03.00
blue dot Calorimeter geometry in RECO shifted with respect to p12+ p13.02.00
blue dot Muon geometry in RECO shifted with respect to MC p12+ p13.02.00
blue dot RECO crashes with a divide by zero in ChPartReco (rare). p13.02.00
blue dot RECO crashes in gtr_htf::ForwardDetLayer::simHits when processing MC events generated with p13.01.00. Error messages include "wrong f-wedge width/length = 0.83647, 2.9605, 3.9635" and "cannot get det with layerId = 101, detId = 1". p13.02.00
blue dot RECO crashes with assert (short_hop || long_hop) in CftPropInteracting::err_dir_prop (rare). p13.02.00
blue dot RECO crashes in FPSUnp2Digi_data (rare). p13.02.00
blue dot Calorimeter geometry that aligns it to tracker (3 cm Z shift) causes floating point exceptions (currently disabled). p13.02.00
blue dot Bug in calculating SMT cluster errors - affects H disks, 1D ladders (dummy n side, error is 144 and not 12). p13.02.00
blue dot RECO crashes in muonid::MuonFind1 (rare) p13.02.00
blue dot Inconsistent access to EMQualityInfo methods between DST and TMB. p13.02.00
blue dot Offline zero suppression algorithm not identical to online ("<" instead of "<=") p13.01.00
blue dot FPSWedgeData issues an assert when encountering negative ADC values (rare). p13.01.00
blue dot No FPS clusters (incorrect RCP used by RECO). p13.01.00
blue dot RECO exits with no error message, due to bug in LinearAlgebra Matrix class (calls exit() if user tries to invert a singular matrix). Need to install protection in vkalman_util/KTrack. p13.01.00
blue dot RECO crashes with a divide by zero in muo_caltrackreco/MuoCalTrackFinder. p13.01.00
blue dot RECO crashes with a divide by zero in muon_centralmatch/PropagateMuon. p13.01.00
blue dot RECO crashes with segmentation fault in muonid::MuonFind1. New version of muonid being tested. p13.01.00
blue dot RECO crashes with a segmentation fault when accessing CalData. It is currently suspected that there is a memory overwrite someplace in the code. New version of muonid package is being tested. p13.01.00
blue dot RECO crashes at the end of the job after processing all events. New version of analysis_utilities is being tested. p13.01.00
blue dot RECO failed to build in p13.01.00 due to change in L3Chunk interface. Fix available in p13-br version of l3fchunk. p13.01.00
blue dot RECO in p13.01.00 can't connect to run config db server. Fix available in the p13-br versions of config_db_client, mag_field_config, d0omCORBA. p13.01.00
blue dot RECO crashes at the end after processing all events. Fix available in the p13-br version of recocert. p13.01.00
blue dot No CPS clusters created. p13.01.00
blue dot RECO crashes with a floating point error in muo_segmentscintonly::MuonScintLinFit::d0Fit (rare). Temporary user solution - skip offending event. p13.01.00
blue dot Refitting of tracks eliminates CFT axial-only tracks. p13.01.00
blue dot Including CFT axial-only tracks causes floating point exceptions. p13.01.00
blue dot TMB contains Muon::MuoTrackChunk instead of Muon::TrackChunk. This chunk will be obsolete once loose muons are available in the TMB (under development). p13.01.00
blue dot RECO crashes in em_util::CorrEemcalo::corretada_IC_S with a floating point exception (rare). p13.01.00
blue dot RECO crashes in d0om writing out MuonParticle (common). Temporary user solution - skip offending event. p13.01.00
blue dot Including CFT axial-only tracks causes crashes in KTrack due to singular error matrix inversion. p13.01.00
blue dot Crash in taureco due to axial only tracks. p13.01.00
blue dot Bug in CFT clusters - not using aligned geometry correctly for stereo clusters. p13.01.00
blue dot Muon geometry not consistent with p12+ Monte Carlo. p13.01.00
blue dot CPSClusterChunk takes 32KB in the DST (largest chunk). Will be dropped from the DST. p13.01.00
blue dot Loose muons in thumbnail. p13.01.00
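
The qptloc entry above states two definitions: q*ptloc in p13.02/03 and q/ptloc in p13.04. The following is a minimal Python sketch of how an analysis might recover the charge and local pt under either definition. The function name, the convention labels, and the assumption that the charge is +1 or -1 are illustrative only and are not part of RECO or the thumbnail code. The definition used from p13.05.00 onward is not stated here, so the sketch does not attempt to cover it.

```python
import math


def charge_and_local_pt(qptloc, convention):
    """Return (charge, local pt) from a qptloc value.

    convention is a hypothetical label chosen for this sketch:
      "q*pt" for the definition described for p13.02/03
      "q/pt" for the definition described for p13.04
    The charge is assumed to be +1 or -1.
    """
    if qptloc == 0 or math.isnan(qptloc):
        # A zero or NaN value carries no usable sign or magnitude.
        raise ValueError("qptloc is zero or NaN; charge and pt are undefined")

    charge = 1 if qptloc > 0 else -1
    magnitude = abs(qptloc)

    if convention == "q*pt":
        pt = magnitude
    elif convention == "q/pt":
        pt = 1.0 / magnitude
    else:
        raise ValueError(f"unknown convention: {convention!r}")
    return charge, pt
```

For example, `charge_and_local_pt(-25.0, "q*pt")` and `charge_and_local_pt(-0.04, "q/pt")` both describe a negatively charged muon with a local pt of about 25 (in whatever units the thumbnail uses).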


Compatibility Summary

The following attempts to describe which versions of p13 RECO are compatible with each other. This summary is very general, and may not cover all possible issues that someone doing an analysis may care about. More detailed information about the differences between versions can be found below. Detailed release notes are also available.

Entries with the same color (shown here as the same letter) indicate compatibility. Columns are included if some significant non-compatibility can be identified.

Version Alignment Calibration CFT Calorimeter Tracking Vertexing MUID Thumbnail Notes
p13.02.00 a a a a a a a a  
p13.03.00 a a a a a a a a  
p13.04.00 c c c x c c c c p13.04.00 contained many new positive features. Unfortunately, it also introduced a problem with the calorimeter which was fixed in the next release. This problem affected the calorimeter resolution, especially in the ECs. A study documented the effects on jets (plots).
p13.05.00 c c c c c c c c  
p13.06.00 c c c c c c c c  
p13.06.01 c c c c c c c c  
p13.07.00 c c c c c c c c  
p13.08.00 c c c c c c c c  
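
As a rough illustration of how the table above can be read, the Python sketch below lists the columns in which two versions carry different codes. It assumes that the single-letter codes stand in for the cell colors of the original page and that differing codes mean "not compatible" for that column. The names `COLUMNS`, `CODES`, and `incompatible_columns` are invented for this sketch.

```python
# Column order as given in the table header (the Notes column is omitted).
COLUMNS = [
    "Alignment", "Calibration", "CFT", "Calorimeter",
    "Tracking", "Vertexing", "MUID", "Thumbnail",
]

# One code letter per column, copied from the table rows.
CODES = {
    "p13.02.00": "aaaaaaaa",
    "p13.03.00": "aaaaaaaa",
    "p13.04.00": "cccxcccc",
    "p13.05.00": "cccccccc",
    "p13.06.00": "cccccccc",
    "p13.06.01": "cccccccc",
    "p13.07.00": "cccccccc",
    "p13.08.00": "cccccccc",
}


def incompatible_columns(version_a, version_b):
    """Return the columns where the two versions have different codes."""
    codes_a = CODES[version_a]
    codes_b = CODES[version_b]
    return [
        column
        for column, a, b in zip(COLUMNS, codes_a, codes_b)
        if a != b
    ]
```

With these codes, `incompatible_columns("p13.04.00", "p13.05.00")` returns only `["Calorimeter"]`, while comparing p13.02.00 with p13.05.00 returns every column. As the summary itself cautions, this is a very general picture and may not cover every issue relevant to a particular analysis.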


Version History


Release Notes


User Information


This page maintained by Harry Melanson