Page updated: 2004

Run II Reconstruction Program Status - Version p14

Latest version:
p14.06.01

Running on the farms




Overview

Now...

The current version of p14 is p14.06.01.

In general...

The p14 version of the reconstruction program has the following significant differences from the previous (p13) version:


Status

p14.06.00 is currently installed on the data processing farms (FNAL). p14.06.00 is the final version.


Performance

The following statistics were measured on CAB using run 174244. The CPU time has been converted to 1 GHz-seconds. RSS measures real memory used and VSIZE measures virtual memory (real memory is more important for farm production, and virtual memory is more important for users running in shared batch systems). The numbers represent averages of different files within a run (we have observed significant differences between files in a given run). These numbers should be used with care when extrapolating performance to other runs.

Version     CPU Time (sec/event)   RSS Memory (MB)   VSIZE Memory (MB)   DST size (KB)   TMB size (KB)   NEVT
p14.03.00   14.8                   413.4             666.4               208.2           22.3            1000
p14.02.00   20.7                   445.0             597.5               211.2           23.8            100
p14.01.00   25.7                   688.3             852.5               220.3           18.9            100
p13.06.01   17.2                   388.9             494.4               171.6           18.9            100
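
As an illustration of the normalization mentioned above, the following C++ sketch converts a measured CPU time to 1 GHz-seconds and averages the result over the files of a run. It assumes that CPU time scales linearly with clock speed and that all files are weighted equally; neither detail is stated on this page. The names FileTiming, toGHzSeconds and averageGHzSeconds are hypothetical and are not part of RECO.

```cpp
#include <cassert>
#include <vector>

// One timing measurement for a single file of a run (hypothetical structure).
struct FileTiming {
    double cpuSecPerEvent;  // measured CPU seconds per event on the test node
    double clockGHz;        // clock speed of that node, in GHz
};

// Assumption: CPU time scales linearly with clock speed, so t seconds on an
// f GHz processor is counted as t * f "1 GHz-seconds".
double toGHzSeconds(double cpuSeconds, double clockGHz) {
    assert(clockGHz > 0.0);
    return cpuSeconds * clockGHz;
}

// Assumption: unweighted mean of the normalized per-file timings.
double averageGHzSeconds(const std::vector<FileTiming>& files) {
    assert(!files.empty());
    double sum = 0.0;
    for (const FileTiming& f : files) {
        sum += toGHzSeconds(f.cpuSecPerEvent, f.clockGHz);
    }
    return sum / static_cast<double>(files.size());
}
```

Because files within a run can differ significantly, an average of this kind hides the spread; keeping the per-file values alongside it is probably wise when extrapolating to other runs.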

Run 179760 is a 26e30 run.

Version     CPU Time (sec/event)   RSS Memory (MB)   VSIZE Memory (MB)   DST size (KB)   TMB size (KB)   NEVT
p14.05.00   20.6                   273               523                 210             24.7            2500

Run 174491 is a higher luminosity run.

Version                         CPU Time (sec/event)   RSS Memory (MB)   VSIZE Memory (MB)   DST size (KB)   TMB size (KB)   NEVT
p14.03.00                       37.1                   453               883.1               283.6           27.4            1000
p14.02.00 (pre-build testing)   40.2                   434.8             698.2               270.1           28.6            100
p13.06.01                       31.8                   432.1             578.1               224.0           23.0            100

Tracking efficiency is measured by matching "tight muons" to tracks found in the central detector (a technique developed by Erich Varnes and described in a talk to the tracking group).

Version     Tracks/event   Hits/track   Track effic (phi=0)   Tracks/primary vertex
p14.01.00   47.7           18.1         0.89                  23.3
p13.06.01   35.3           16.0         0.71                  12.3
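
A minimal sketch of how an efficiency of this kind can be computed from counts: the fraction of tight muons that have a matched central track, with a simple binomial uncertainty. The binomial error formula is an assumption about how one might quote the uncertainty, not a description of the actual study, and the names Efficiency and trackingEfficiency are hypothetical.

```cpp
#include <cassert>
#include <cmath>

// Efficiency estimate with its statistical uncertainty.
struct Efficiency {
    double value;  // matched / total
    double error;  // binomial uncertainty (assumed error model)
};

// nTightMuons: number of tight muons used as the reference sample.
// nMatched:    number of those muons matched to a central track.
Efficiency trackingEfficiency(long nTightMuons, long nMatched) {
    assert(nTightMuons > 0 && nMatched >= 0 && nMatched <= nTightMuons);
    const double n = static_cast<double>(nTightMuons);
    const double eff = static_cast<double>(nMatched) / n;
    // Simple binomial error; it becomes unreliable when eff is near 0 or 1.
    return {eff, std::sqrt(eff * (1.0 - eff) / n)};
}
```

For example, trackingEfficiency(100, 89) gives a value of 0.89; these counts are made up for illustration and are not the ones behind the table above.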

DST composition (top 20 chunks), based on run 174244.

 

p13.06.01                                              p14.03.00
Chunk                  Number of chunks   % of total   Chunk                  Number of chunks   % of total
FPSClusterChunk        1                  20.41%       L3Chunk                1                  15.04%
L3Chunk                1                  17.30%       TrackCalExtraChunk     1                  14.32%
TrackCalExtraChunk     1                  9.12%        FPSClusterChunk        1                  10.74%
JetChunk               8                  7.66%        JetChunk               6                  8.02%
FPSDataChunk           1                  6.00%        Calt42Chunk            1                  6.63%
EMparticleChunk        2                  5.49%        ChargedParticleChunk   1                  5.08%
CalSCClusterChunk      1                  4.27%        GTrackChunk            1                  4.17%
RawDataChunk           1                  4.09%        L1L2Chunk              1                  3.96%
ChargedParticleChunk   1                  3.99%        CPSClusterChunk        1                  3.71%
GTrackChunk            1                  3.68%        RawDataChunk           1                  3.63%
CalDataChunk           1                  3.31%        FPSDataChunk           1                  3.22%
CalTClusterChunk       10                 2.49%        EMparticleChunk        2                  3.18%
CftClusterChunk        1                  2.10%        CalSCClusterChunk      1                  2.98%
SMTPosBCollectChunk    1                  1.98%        CalDataChunk           1                  2.63%
SMTPosDCollectChunk    1                  1.67%        CalTClusterChunk       8                  1.92%
VertexCollChunk        6                  0.86%        CftClusterChunk        1                  1.72%
MuoAlignChunk          1                  0.59%        SMTPosBCollectChunk    1                  1.45%
MuoCentralMatchChunk   1                  0.54%        SMTPosDCollectChunk    1                  1.23%
MuoSegmentChunk        1                  0.53%        CPSDigiChunk           1                  1.22%
TrackChunk             1                  1%           VertexCollChunk        5                  0.56%

A breakdown of where CPU time is spent, based on run 174244. Note that these numbers fluctuate by a few percent, based on which run is processed, and the instantaneous luminosity of that run. Also, not all times are accounted for (to the level of about 5%). These numbers should be considered to indicate general trends.

 

Stage                   p13.06.01 (% of CPU time)   p14.01.00 (% of CPU time)   p14.03.00 (% of CPU time)
Initialization          0.08%                       0.06%                       0.00%
SAM                     0.00%                       0.00%                       0.00%
Read event              0.08%                       0.06%                       0.11%
Unpacking               6.67%                       5.51%                       6.80%
Detector (RDC)          3.21%                       2.88%                       2.79%
Detector (DST)          1.69%                       1.38%                       2.01%
Tracking                56.96%                      63.97%                      62.65%
Vertexing               6.33%                       5.51%                       5.13%
Particle ID             14.94%                      11.65%                      15.16%
  cal                   1.10%                       0.94%                       0.78%
  chpart                6.67%                       5.58%                       7.36%
  em                    0.25%                       0.19%                       0.22%
  mu                    2.11%                       1.44%                       2.68%
  jet                   3.46%                       2.63%                       3.01%
  tau                   0.08%                       0.06%                       0.00%
  met                   0.25%                       0.25%                       0.22%
  links                 0.00%                       0.00%                       0.00%
  bc                    1.01%                       0.56%                       0.78%
  wz                    0.00%                       0.00%                       0.00%
Write event             0.17%                       0.13%                       0.22%
Finish event            7.26%                       5.70%                       1.00%
  RecoStat (recostat)   1.43%                       0.69%                       0.86%

RECO certification plots generated with recocert:

p14.02.00, Run 176571 (3 MB)

p14.02.00, higher statistics (from Data Monitoring group)

p14.01.00 vs p13.06.01, Run 176571 (5.2 MB)

The above set of plots compares run 176571 reconstructed on the farms with p14.01.00 and p13.06.01. The plots were generated with the recocert package. Results from p14.01.00 are in black (p13.06.01 in red). Where appropriate, the plots have been normalized by the number of events processed. The track efficiency plots were recently refined by Erich Varnes: a) calorimeter confirmation of the muon is now required, to reduce the muon fake rate, and b) the matching cuts were tuned to better handle geometric matching. These modifications have resulted in a higher observed tracking efficiency.

p13.06.01, Run 174244 (10 MB) - for comparison


Test Samples

p14.01.00 certification samples (Data and Monte Carlo) are available in SAM

Data

Monte Carlo


Upcoming Features


Known Problems

Data reconstructed with the following versions have known problems or annoyances:

 

History of Problems and resolutions:

red dot: Major error, yellow dot: Annoying feature, blue dot: Functionality to be added
green ball: Scheduled for next release, blue dot: Problem fixed.
(common = occurs within 100 events, rare = occurs within 1000 events)

Status: (none listed)
Description: Segmentation faults
Fixed in version: Not fixed

Status: (none listed)
Description: Infinite loop in AA tracking algorithm
Fixed in version: p14.06.00

Status: blue dot
Description: Energy information in the CPS data reconstructed with p14.03.0x (x>=1) is incorrect due to a bug in cps_util and cps_calibration. Position information is not great, but usable. This is all fixed in p14.04.00.
Fixed in version: p14.04.00

Status: green ball
Description: EM reco produces a peak at pT=5 GeV for low-pT electrons. Fortunately, the cell information is stored, so EM post-processing can redo the reconstruction correctly.
Fixed in version: p14.06.00

Status: green dot
Description: A serious bug was introduced in p14.03.01 as part of DSPACK. When L1L2 information is unpacked, the program crashes, though not for all p14.03.01 data.
Fixed in version: p14.03.02

Status: red dot
Description: A serious bug has been found in the calculation of the jet estimator n90 (from which f90 is built), for kt jets, and for cone jets which result from a merging of 2 or more jets. See this D0 News announcement for more details.
Fixed in version: p14.04.00

Status: yellow dot
Description: A bug in Jet::p() results in returning pz instead of p. Users are advised to avoid p() and instead use sqrt(px() * px() + py() * py() + pz() * pz()).
Fixed in version: No plan to fix in p14.

Status: red dot
Description: CPU time per event can be 20 to 40 times larger than typical for specific events, dominating the average time.
Fixed in version: p14.05.00

Status: blue dot
Description: The calorimeter "energy sharing" problem that occurred during the first few months of 2003 (see Nirmalya's ADM talk or Greg Landsberg's summary) resulted in approximately 40 pb-1 of data that was compromised for physics analyses. A solution (description of the solution, a before plot and an after plot) was implemented and can be used to reprocess RAW or DST data tiers.
Fixed in version: p14.03.00

Status: blue dot
Description: RECO startup time is extremely variable when processing raw data. This was due to limitations with caching within the calibration database servers. The python version of the cache could only store about 6 individual calibration constant sets, resulting in "cache thrashing" with the user db servers due to requests for more than this number of sets. Users attempting to (re)process events from a wide range of runs experienced even worse performance. Farm production was less affected. See below for more details. The caching code has been rewritten in C++, currently allowing 30 SMT data sets to be in memory at one time.
Fixed in version: New SMT C++ cache code deployed.

Status: blue dot
Description: CPU time per event grows linearly with the number of events processed. See Slava Kulik's ADM talk for a description of the solution.
Fixed in version: p14.03.00

Status: blue dot
Description: RECO requires more memory than p13.
Fixed in version: Improved in p14.02.00 and p14.03.00.

Status: blue dot
Description: Momentum resolution is significantly worse in data than in Monte Carlo. Effects show up in the Z mass width and the E/p width for electrons from W's (as well as other distributions). Additional evidence comes from analyzing recent magnet-off data, plotting <q/pt> vs. phi. Problem solved by realigning with magnet-off data.
Fixed in version: p14.02.00

Status: blue dot
Description: FPS cluster chunk dominates DST size.
Fixed in version: Improved in p14.02.00. More tuning required.

Status: blue dot
Description: Floating point errors in EMparticle
Fixed in version: p14.02.00

Status: blue dot
Description: FPS unpacking crashes.
Fixed in version: p14.02.00

Status: blue dot
Description: FPS reco crashes.
Fixed in version: p14.02.00

Status: blue dot
Description: RECO processing time per event is slower than p13 by a factor of 1.2 to 1.4, depending on luminosity.
Fixed in version: Improved with p14.02.00

Status: blue dot
Description: SMT status is not correctly set in the SMT calibration database. As a result, noisy chips can be considered during reconstruction, which may slow it down.
Fixed in version: p14.02.00

Status: blue dot
Description: GTRHTF tracking crashes when processing p13.08.00 Monte Carlo.
Fixed in version: p14.01.00

Status: blue dot
Description: Floating point error in trfzp (rare).
Fixed in version: p14.01.00

Status: blue dot
Description: mag_field returns incorrect polarity.
Fixed in version: p14.01.00

Status: blue dot
Description: CFT calibration db server hangs (only one example so far - under investigation).
Fixed in version: p14.01.00

Status: blue dot
Description: GtrHtfAAPkg has noisy printout in output file.
Fixed in version: p14.01.00

Status: blue dot
Description: sam_manager throws an exception when it times out, preventing valid metadata from being generated.
Fixed in version: p14.01.00

Status: blue dot
Description: A ROOT bug exists that makes files written by ROOT after reading a tree lose their TRefs. Requires a later version of ROOT.
Fixed in version: p14.01.00
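
The Jet::p() workaround listed above can be wrapped in a small helper so that analysis code does not repeat the expression. The following is a minimal sketch, not code from the release: momentumMagnitude and safeP are hypothetical names, and MockJet is a stand-in used only for illustration (it is not the real Jet class). It assumes only that the jet type exposes px(), py() and pz() accessors, as the workaround text implies.

```cpp
#include <cmath>

// Magnitude of the three-momentum from its Cartesian components.
inline double momentumMagnitude(double px, double py, double pz) {
    return std::sqrt(px * px + py * py + pz * pz);
}

// Works with any jet-like type that provides px(), py() and pz().
// It deliberately never calls p().
template <typename JetLike>
double safeP(const JetLike& jet) {
    return momentumMagnitude(jet.px(), jet.py(), jet.pz());
}

// Hypothetical stand-in for illustration only; NOT the real Jet class.
// Its p() mimics the reported bug by returning pz instead of |p|.
struct MockJet {
    double x, y, z;
    double px() const { return x; }
    double py() const { return y; }
    double pz() const { return z; }
    double p() const { return z; }  // buggy behaviour described above
};
```

With a MockJet holding components (3, 4, 12), p() returns 12 while safeP() returns 13, which is the kind of discrepancy the bug introduces for any jet with non-zero transverse momentum.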



Version Compatibility

The following attempts to describe which versions of p14 RECO are compatible with each other. This summary is very general, and may not cover all possible issues that someone doing an analysis may care about. More detailed information about the differences between versions can be found below. Detailed release notes are also available.

Entries with the same color indicate compatibility.

p14 is significantly different from p13. Analyses should treat such samples independently.

Version     Alignment   Calibration   CAL   SMT   CFT   FPS   Tracking   Vertexing   MUID   Thumbnail   Notes
p14.00.00   x           x             x     x     x           x          x           x      x           x - Non-production version
p14.01.00   x           x             x     xx    x     x     x          x           x      x           x - Non-production version; xx - SMT Status word not used properly
p14.02.00   a           a             a     a     a     x     a          a           a      a           First production version.
p14.03.00   a           a             a     a     a     x     a          a           a      a
p14.03.0x   a           a             a     a     a     a     a          a           a      a           FPS phi calculation fixed.
p14.04.00   a           a             a     a     a     a     a          a           a      a

Notes:

p14.03.00 is the first version that officially supports reprocessing of DSTs. Users should be aware that results from reprocessing DSTs will not be identical to those from processing RAW files, since clusters / hits that are created based on calibration constants are not re-created. For example, reprocessing p13 DSTs with p14.03.00 results in slightly different SMT clusters than if the raw data were processed. These differences are small, but users should treat them appropriately for their individual needs.


Version History


Release Notes


General Information
Users of this release are encouraged to report issues / fixes / "whatever" that might be useful to others using this release.


This page maintained by Suyong Choi