From suyong@fnal.gov Mon May 3 16:51:39 2004
Date: Mon, 03 May 2004 15:01:07 -0500
From: Suyong Choi
To: Herbert Greenlee
Subject: Re: summary of answers to Dfwg survey - Higgs group

Dear Herb,

Minor correction to the first sentence of answer 2.

> We don't make recommendations.

should be changed to

We make recommendations, but do not insist on them.

Regards,
Suyong

Suyong Choi wrote:

> Hi,
>
> Here is the summary from the Higgs group.
>
> Regards
> Suyong
>
> 1. What analysis data formats and analysis tools are members of your
> group currently using?
>
>> The Higgs group uses various formats: the Athena, higgs_skim,
>> higgs_multijet, tmb_tree, and top_tree tuple makers, all with
>> d0correct applied. Except for tmb_tree, these are non-object-format
>> root-tuples.
>
> 2. What analysis data formats or analysis tools does your group
> recommend to its members?
>
>> We don't make recommendations. Subgroup leaders may suggest a format
>> for which they already have analysis code ready. Analyzers are
>> encouraged to check their results against those obtained by others
>> using different formats.
>
> 3. Do you encourage or discourage people to use tmb_tree? Why or why
> not?
>
>> We neither encourage nor discourage tmb_trees. This is mostly a
>> matter of personal preference. Some people don't like to use objects
>> and/or find them cumbersome to use. Other formats are smaller,
>> faster, easier to modify, and easy to analyze both at the root
>> command line and in standalone programs.
>
> 4. How does your physics group support the efforts of analyzers?
> That is, does your group provide centrally managed data sets,
> tuples/trees, or analysis tools?
>
>> We use the Common Sample Group's skims. Each subgroup makes its own
>> tuples. Datasets and analysis tools are provided for the Athena and
>> higgs_skim formats.
>
> 5. Would your group benefit from the availability of common, possibly
> centrally produced root trees? What requirements would a common root
> format have to fulfill for your group to benefit?
>
>> We would certainly benefit from centrally produced tuples, which
>> would eliminate the need for us to support our own format and
>> generate our own samples. The requirements are:
>>
>> 1. It contains most of the tmb_tree content in a few kilobytes/event.
>>
>> 2. A standalone program to analyze the format can be linked within a
>>    few seconds or less. In other words, it shouldn't depend on a huge
>>    amount of code or on the d0 environment.
>>
>> 3. It can be read fast. Quantities that are computationally intensive
>>    should be calculated on demand rather than in streamers.
>>
>> 4. It should be easy to strip events, trim branches, and add
>>    user-specific branches geared toward a particular analysis without
>>    writing a new class.
>>
>> 5. It is probably a good idea to keep the common tuple in the SAM
>>    system, so that access to the tuple is consistent with other data
>>    sets and it is also accessible remotely.
>>
>> If it's too big or slow, or it takes forever to link, we'll want to
>> continue making the current root-tuples and the benefit will be lost.
>
> 6. If tmb_tree were chosen as the basis for a common format, what
> changes would be required to make it attractive to your group?
>
>> At a minimum, clear documentation of all the methods should be
>> available without too much navigating.
>>
>> Also, it should be a lot smaller. The tmb_tree takes about
>> 20 kB/event, much of which is redundant. The tmb_tree track object,
>> for example, uses 272 bytes/track while the tmb uses 44 bytes/track.
>> Other root-tuple formats fit essentially the same information into
>> 3.5 kB/event and could be made still smaller. The small format allows
>> large data and MC samples (including the complete 1EMloose, 1MUloose,
>> and QCD moriond skims) to be kept on a single workstation. This
>> speeds up the analysis cycle.
>
> 7. Does your group develop algorithms in root? Should algorithm
> development in root be encouraged? What is the best way to allow the
> entire collaboration to benefit from algorithms developed in root?
>
>> We currently do not develop algorithms in ROOT. That being said, the
>> major improvements to physics in the past couple of years came from
>> algorithms developed and optimized outside of the d0 framework
>> environment, e.g. tracking and b-tagging.
>>
>> Due to the slowness of working in the d0 environment (linking,
>> running, and debugging), algorithm development outside the framework
>> is unavoidable. However, algorithm development using ROOT should be
>> done carefully, especially the design of classes and packages, with
>> assistance from true software experts to make it simple and
>> portable. It can be written so that it is not tied to any specific
>> format.
>
> 8. Is there any other information that you would like to bring to the
> attention of the Data Format Working Group?
>
>> The root-tuple maker should directly use existing D0 packages/code
>> (e.g. d0correct, metreco, ...) rather than re-coding them in the
>> root-tuple maker itself, to avoid introducing more chances for
>> mistakes.