Improved Search for Single Top Quark Production at DØ in Run II

Publication and Plain English Summary
 

M. Agelou, B. Andrieu, P. Baringer, A. Bean, D. Bloch, E.E. Boos, V. Bunichev, T. Burnett, E. Busato, L. Christofek, B. Clément, L.V. Dudko, T. Gadfort, A. García-Bellido, D. Gelé, P. Gutierrez, A.P. Heinson, S. Jabeen, S. Jain, A. Juste, D. Kau, J. Mitrevski, J. Parsons, P.M. Perea, E. Perez, H.B. Prosper, V.I. Rud, R. Schwienhorst, M. Strauss, C. Tully, B. Vachon, G. Watts

Pictures of the authors

E-mail the conveners:   Arán García-Bellido, Ann Heinson, Reinhard Schwienhorst

 
Abstract

We present a search for electroweak production of single top quarks in the s-channel and t-channel modes. We have analyzed 230 pb⁻¹ of data collected with the DØ detector at the Fermilab Tevatron collider at a center-of-mass energy of 1.96 TeV. Three separate analysis methods are used: neural networks, decision trees, and a cut-based analysis. No evidence for a single top signal is found. We set 95% confidence level Bayesian upper limits on the production cross sections using binned likelihood fits to the neural network and decision tree output distributions and using the total numbers of events in the cut-based analysis. The limits from the neural networks (decision trees, cut-based) analysis are 6.4 pb (8.3 pb, 10.6 pb) in the s-channel and 5.0 pb (8.1 pb, 11.3 pb) in the t-channel.

 
Please scroll down the page for: analysis description, plots, tables for talks, more plots for talks, cross section limits, links to conference talks on these results, and a list of frequently asked questions

 
Analysis Description

Please read the 11-page conference note for a more detailed description of these Winter 2005 results.

Analysis Flow Diagrams



Background Measurement Methods



 
Plots

Clicking on a plot will give you the .eps version. Right click and "View Image" will get you the .gif version.

Distributions for the data and background model for the electron and muon channels combined. Data are shown as points, and the background fractions are shown as solid colors. The single top quark signals are overlaid as lines and have been multiplied by a factor of ten so they can be seen easily.











We use eight neural networks for signal-background separation: one for each of the four signal-background pairs, tb-Wbb, tb-ttlj, tqb-Wbb, and tqb-ttlj (where ttlj is ttbar->lepton+jets), in each lepton flavor (electron and muon). The electron and muon networks are trained separately; since the discriminating variables are not flavor dependent, each signal-background pair uses its own set of the most discriminating variables for both flavors. The following plots show the performance of each neural network: background peaks toward 0 and signal peaks toward 1 in the network output. The ttlj networks are very effective at separating the ttbar background from the signal, but separating the W+jets background from the signal is more difficult. The outputs extend slightly beyond 0 and 1 because the MLPFit package uses a linear sum of sigmoids for the output neuron (see FAQ 2 below).
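As a rough illustration of how such a set of networks can be organized, here is a sketch only: it uses scikit-learn in place of MLPFit (MLPRegressor's sigmoid hidden units and linear output neuron mirror the architecture described above), and load_sample is a hypothetical helper standing in for the actual data handling.

    import numpy as np
    from sklearn.neural_network import MLPRegressor

    pairs   = ["tb-Wbb", "tb-ttlj", "tqb-Wbb", "tqb-ttlj"]
    flavors = ["electron", "muon"]

    def train_network(x_sig, x_bkg):
        # Signal events are labeled 1 and background events 0. The regressor
        # uses sigmoid hidden units and a linear output neuron, so its output
        # approximates the signal probability but is not clamped to [0,1].
        x = np.vstack([x_sig, x_bkg])
        y = np.concatenate([np.ones(len(x_sig)), np.zeros(len(x_bkg))])
        net = MLPRegressor(hidden_layer_sizes=(20,), activation="logistic",
                           max_iter=2000, random_state=0)
        return net.fit(x, y)

    # One network per signal-background pair and lepton flavor (8 in total).
    # load_sample() is a hypothetical helper returning the array of
    # discriminating variables for one sample.
    networks = {(p, f): train_network(load_sample(p, f, "signal"),
                                      load_sample(p, f, "background"))
                for p in pairs for f in flavors}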


 
Neural Network Performance Plots
[Plot grid: rows tb and tqb signals; columns Wbb and ttlj network outputs, for the electron channel and the muon channel]

Neural network output distributions for the electron and muon channels combined. The left four plots each compare the background model and data for a single network (Wbb or ttlj) in a single channel (s or t). The right two plots combine the information in the left plots and show the distribution of the data (stars), the background model (colored squares), and the signal region (black contour lines).








 
Decision Tree Performance Plots
[Plot grid: rows tb and tqb signals; columns Wbb and ttlj tree outputs, for the electron channel and the muon channel]

Decision tree output distributions for the electron and muon channels combined. The plots each compare the background model and data for a single tree (Wbb or ttlj) in a single channel (s or t).






 
Tables for Talks

  • Cut-based analysis cuts (eps) (gif)
  • Neural network and decision tree input variables:
        All variables (eps) (gif)
        Object kinematics (eps) (gif)
        Event kinematics (eps) (gif)
        Angular variables (eps) (gif)
  • Systematic uncertainties (eps) (gif). Simplified version (eps) (gif)
  • Expected/observed cross section limits (eps) (gif). Observed limits only (eps) (gif)
 
More Plots for Talks

 
Cross Section Limits

Distributions of the Bayesian posterior probability density for the electron and muon channels combined: Cut-Based Analysis (left), Decision Trees (middle) and Neural Network Analysis (right).
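For orientation, here is a minimal sketch of how such a posterior density and its 95% CL upper limit can be computed for a simple counting experiment. The yields are illustrative only, the prior is flat, and systematics are ignored; this is not the DØ limit code.

    import numpy as np
    from scipy.stats import poisson

    n_obs, bkg = 20, 17.0   # hypothetical observed count and expected background
    lumi_acc   = 0.5        # hypothetical luminosity x acceptance [events/pb]

    sigma = np.linspace(0.0, 40.0, 4001)   # cross section grid [pb]
    # With a flat prior in sigma, the posterior density is proportional to the
    # Poisson likelihood of the observed count given signal + background.
    posterior = poisson.pmf(n_obs, bkg + lumi_acc * sigma)
    posterior /= posterior.sum() * (sigma[1] - sigma[0])   # normalize the density

    # 95% CL upper limit: smallest sigma whose cumulative posterior reaches 0.95.
    cdf = np.cumsum(posterior) * (sigma[1] - sigma[0])
    print("95%% CL upper limit: %.1f pb" % sigma[np.searchsorted(cdf, 0.95)])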





Exclusion contours at 68%, 90%, and 95% confidence level on the posterior density distribution as a function of both the s-channel and t-channel cross sections in the neural network analysis. The s-channel cross section is obtained from tb muon data only and the t-channel cross section from tqb electron data only, such that the two likelihoods are independent of each other. Several representative non-Standard Model single-top processes are also shown.



95% Confidence Level Expected/Measured Upper Limits [pb]
(after final selections, with systematics, using Bayesian statistics)

                        s-channel       t-channel
  Cut-Based
    Electron            11.4 / 10.8     15.1 / 17.5
    Muon                13.0 / 15.2     18.1 / 13.0
    Combined             9.8 / 10.6     12.4 / 11.3
  Decision Trees
    Electron             6.9 /  7.9      9.3 / 13.8
    Muon                 7.3 / 14.8     10.9 /  7.9
    Combined             4.5 /  8.3      6.4 /  8.1
  Neural Networks
    Electron             7.0 /  7.3      8.8 /  7.5
    Muon                 7.0 /  8.7      9.5 /  7.4
    Combined             4.5 /  6.4      5.8 /  5.0

 
Conference Talks

(Click on talk name for a pdf file of the talk)
 
Frequently Asked Questions

  1. How did you choose the discriminant variables?

    We drew up a list of sensitive variables based on an analysis of the signal and background Feynman diagrams (Ref. 1, Ref. 2) and on a study of single top quark production at next-to-leading order (Ref. 3).
    We then optimized, for each neural network (tb-Wbb, tb-ttlj, tqb-Wbb, and tqb-ttlj), which set of variables gave the minimum training error; the result is that each network uses around 11 variables. The full list of variables and the networks they are used in can be seen here. The decision tree analysis used the same sets of input variables. The cut-based analysis started from the same sets of input variables and then reoptimized both the variables used and the cut values. A sketch of this kind of greedy optimization is given below.
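    For illustration only, this kind of optimization could be sketched as a greedy forward selection. Here train_and_score is a hypothetical helper returning the training error of a network built on a given variable subset; this is not the exact DØ procedure.

        def select_variables(candidates, target_size=11):
            # Greedily add the variable that most reduces the training error
            # until the target set size is reached.
            chosen = []
            while len(chosen) < target_size and candidates:
                best = min(candidates,
                           key=lambda v: train_and_score(chosen + [v]))
                chosen.append(best)
                candidates = [v for v in candidates if v != best]
            return chosen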

  2. Why does the NN output extend beyond 0 and 1?

    The NN outputs extend beyond 0 and 1 because the MLPFit package uses a linear sum of sigmoids for the output neuron. The sigmoid function 1/(1+e^(-x)) is constrained to [0,1], but the MLP approximation for the output neuron is not. Still, the probability that the given inputs correspond to signal events is indeed bounded in [0,1]. A small numerical illustration is given below.
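    A tiny numerical illustration (hypothetical weights, not the trained MLPFit networks):

        import numpy as np

        def sigmoid(x):
            return 1.0 / (1.0 + np.exp(-x))

        # Hidden layer: each unit's activation is a sigmoid, so it lies in [0,1].
        hidden = sigmoid(np.array([1.2, -0.4, 2.0]))
        # Output neuron: a *linear* combination of those activations plus a
        # bias, which nothing constrains to [0,1].
        weights, bias = np.array([1.0, 0.5, 0.9]), 0.2
        print(weights @ hidden + bias)   # about 1.96 here, i.e. outside [0,1]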

  3. Why are the distributions of the decision tree outputs "discrete"?

    The discrete nature of the distributions follows from the finite number of leaf nodes in the trees. Decision trees are trained to partition the multidimensional variable space into regions whose purity, the fraction of signal events, is close to either one or zero. Applying the resulting decision tree function to a given event determines which region the event lies in and returns the associated purity from the training sample. Since the number of such regions is finite, the function can take only a discrete set of values. The toy example below makes this concrete.
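    A toy demonstration (scikit-learn on made-up data, not the DØ trees):

        import numpy as np
        from sklearn.tree import DecisionTreeClassifier

        rng = np.random.default_rng(0)
        x = np.vstack([rng.normal(0.0, 1.0, (500, 2)),    # "background"
                       rng.normal(1.5, 1.0, (500, 2))])   # "signal"
        y = np.array([0] * 500 + [1] * 500)

        tree = DecisionTreeClassifier(max_leaf_nodes=8).fit(x, y)
        purity = tree.predict_proba(x)[:, 1]  # signal purity of each event's leaf
        print(np.unique(purity))              # at most 8 distinct output values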

  4. Where does the difference between the simple-cuts and the NN/DT limits come from?

    In other words, why is the expected limit from cuts roughly twice as large as the expected limits from the NN or DT analyses? The difference comes from two sources:
    (a) the multivariate nature of the NN and DT, as opposed to sets of simple cuts: multivariate techniques separate the signal from the background better; and
    (b) the limit in the cut-based analysis is calculated by counting events after the selection cuts (the observed number of events in data and the expected number of events from backgrounds), whereas the NN and DT outputs are used in a binned likelihood to extract the limits. The binned likelihood takes advantage of the shape information in the output distributions and does a better job than simple counting, as the toy comparison below illustrates.
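    A toy comparison of the two likelihoods (illustrative yields only, no systematics, not the DØ limit code):

        import numpy as np
        from scipy.stats import poisson

        # Expected background and signal yields per discriminant-output bin
        # (made-up shapes: the signal peaks in the high-output bins).
        bkg   = np.array([40.0, 20.0, 8.0, 2.0])
        sig   = np.array([ 0.5,  1.0, 2.0, 4.0])
        n_obs = np.array([  41,   21,   9,   3])   # hypothetical observed counts

        def log_likelihood(mu, n, s, b):
            # Product of Poisson terms, one per bin (a single term when counting).
            return poisson.logpmf(n, b + mu * s).sum()

        mu = np.linspace(0.0, 5.0, 501)
        ll_binned   = np.array([log_likelihood(m, n_obs, sig, bkg)
                                for m in mu])
        ll_counting = np.array([log_likelihood(m, n_obs.sum(),
                                               sig.sum(), bkg.sum())
                                for m in mu])
        # The binned likelihood falls off faster with increasing mu because the
        # high-purity bins carry the shape information, so it gives a tighter limit.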

Last modified:  April 18, 2005
Send comments to Philip Perea