Analysis Tools Group Meeting 8/27/02

Thanks to all those who attended the meeting yesterday. We had very good discussions and I'm happy to see some serious thought going into streaming. Below is what I remember from the meeting. Send me corrections and/or additions.

--------------

I started by showing the streaming proposal I put out a few weeks ago (see http://listserv.fnal.gov/scripts/wa.exe?A2=ind02&L=d0-atg&F=&S=&P=3838 ). The main features of the proposal are:

O High Pt leptons are in two streams (high Pt ele+X and high Pt mu+X). This was done to minimize the number of streams the W/Z, top and Higgs high Pt lepton analyses would have to look at. Each stream is about a third of the global_CMT8.0 data.
O Next down the list are low Pt lepton triggers
O Then jets
O Then gap triggers
O Then tau (all tau triggers overlap with high Pt lepton triggers, so the tau stream is empty right now).

Note that the high Pt electron stream should really be called high Pt EM, since photons could end up there too.

Several people felt that this streaming scheme is too coarse, given that nearly everyone would have to look at 2/3 of the data. We talked about having smaller streams: so long as the streams aren't too small, we should break things up further. There are worries about bookkeeping, but if we can do the bookkeeping correctly for two streams, having a few more should not be a problem. If it is, then D0 as a whole should be worried.

Joe Kozminski from the top group showed their proposal for a stream scheme. One difference from the original is that some triggers would have the primal streams "hptmultiele" and "hptmultimu", for high Pt multi-EM and multi-muon triggers. These triggers would go to the highest-priority DILEP stream. Events with both the high Pt EM and high Pt muon primal streams would also go to DILEP. Then follow the high Pt electron and muon streams (now with just single leptons).
The other change is a new primal stream "multijet" for multijet triggers, which would be separated off into their own stream (so the jet stream would be for high Pt jets). The multijet stream would help the all-jets analyses. See Joe's talk for more information.

The DILEP stream accounts for 10.4% of the data. Multijet is 5.1%. The high Pt EM and muon streams are 27% and 25% respectively. Joe explained that the DILEP stream would be very useful for detector studies and would help dilepton and multilepton analyses, which would now only have to look at a ~10% stream instead of perhaps 70% of the data. This stream could also easily be reprocessed if necessary. Again, see Joe's talk for the details.

A problem in coming up with stream schemes is that it's not clear what triggers are used by what groups and for what physics. Since triggers drive the streaming and physics (we assume) drives the triggers, it's extremely important to know what the triggers are being used for. Indeed, while determining the streaming scheme is difficult, figuring out what primal streams we need and what triggers should belong to them seems more difficult still. We understand that the trigger board is working to clean up the trigger list, so it was suggested to have a "streaming fest" where all trigger and streaming representatives are present to figure out what the triggers are and where they should go. I think this would be more useful after we've decided more on the strategy of streaming (e.g. few big streams, many smaller streams, ...) and the trigger list has been pruned.

Someone also brought up the idea of starting with the 2^n-1 possible streams and then merging like streams to eliminate small ones (keeping streams with like physics together) as a way of coming up with a stream scheme. Greg Landsberg mentioned that another good way to figure out the useful physics triggers is to look at the triggers hanging off of unprescaled L1 bits.
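
For what it's worth, two of the ideas above can be sketched in a few lines of Python. This is only an illustration and not something shown at the meeting. It assumes each event carries the set of primal streams of the triggers it fired, that an event goes to the first stream in a priority list whose rule matches, and that n in 2^n-1 counts primal streams. Only "hptmultiele", "hptmultimu" and "multijet" are primal stream names from the proposals; every other tag, the stream names other than DILEP, and the relative order of the streams below the high Pt lepton ones are made up for the example.

```python
from collections import Counter
from itertools import combinations

# Priority-ordered rules: (stream name, test on the event's primal streams).
# ASSUMPTION: only "hptmultiele", "hptmultimu" and "multijet" are real primal
# stream names; "hptele", "hptmu", "lowptlep", "jet", "gap", "tau" and the
# stream names other than DILEP are hypothetical, as is the ordering of the
# streams that follow the high Pt lepton streams.
STREAM_RULES = [
    ("DILEP", lambda p: bool(p & {"hptmultiele", "hptmultimu"})
                        or {"hptele", "hptmu"} <= p),
    ("HPT_EM", lambda p: "hptele" in p),
    ("HPT_MU", lambda p: "hptmu" in p),
    ("LOWPT_LEP", lambda p: "lowptlep" in p),
    ("MULTIJET", lambda p: "multijet" in p),
    ("JET", lambda p: "jet" in p),
    ("GAP", lambda p: "gap" in p),
    ("TAU", lambda p: "tau" in p),
]


def assign_stream(primal):
    """Return the highest-priority stream whose rule matches, or None."""
    primal = frozenset(primal)
    for name, rule in STREAM_RULES:
        if rule(primal):
            return name
    return None


def all_combinations(tags):
    """Every non-empty subset of the primal tags: 2**n - 1 of them."""
    tags = sorted(tags)
    return [frozenset(c)
            for r in range(1, len(tags) + 1)
            for c in combinations(tags, r)]


def small_combinations(events, min_fraction):
    """Exclusive combinations holding less than min_fraction of the events.

    These are the candidates for merging into a neighbouring stream; which
    neighbour counts as "like physics" is a judgment the code does not make.
    """
    counts = Counter(frozenset(e) for e in events)
    total = sum(counts.values())
    return {combo: n for combo, n in counts.items() if n / total < min_fraction}
```

With these rules an event tagged with both "hptele" and "hptmu" lands in DILEP, while one tagged with "hptele" and "jet" lands in the high Pt EM stream. Deciding which small combinations to merge where is exactly the part that needs the trigger and physics input discussed above.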
We talked a bit about data and bookkeeping integrity. It's clear that the ATG should push for improved monitoring of level3 and the datalogger so that we would be aware of lost files and triggers. Heidi has started an e-mail conversation with Stu and Gerald Guglielmo about datalogger monitoring. I'll post it to the web under this meeting.

A note post-facto: Michael Begel and I talked a bit Tuesday afternoon. He mentioned another reason for keeping the W/Z events in one stream: he wants to determine luminosity from counting W events as a way to verify the bookkeeping (that is, a straight count without looking at bookkeeping information). The problem with W events going to more than one stream is that if a file is lost, the loss is biased (since a DILEP stream, for example, accepts events with a certain topology). So he would have to count W's in each stream and somehow put them together for a final count. That would be difficult. So the ease of doing physics conflicts with an important luminosity cross check. Sigh.