
8.3.1 Confidence Limits

One approach is to define a 'confidence limit' by analogy to the conventional $\chi^2$ test. A $\chi^2$ test is applicable to the case where one has a collection of $N$ normally distributed variables $x_i$ with means $\mu_i$ and widths $\sigma_i$. Call this hypothesis $H$. One defines the function

$$\chi^2 = \sum_{i=1}^{N} \frac{(x_i - \mu_i)^2}{\sigma_i^2}.$$

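Then, given some particular measurement (call it $D$), one can test how consistent it is with $H$ by asking how likely it would be, assuming $H$, to get a measurement with a $\chi^2$ larger than the one actually seen. That is, one evaluates the integral

$$P(\chi^2 > \chi^2_D) = \int_{\chi^2_D}^{\infty} f(\chi^2; N)\, d\chi^2,$$

where $\chi^2_D$ is the value of $\chi^2$ for $D$ and $f(\chi^2; N)$ is the $\chi^2$ distribution for $N$ degrees of freedom.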

This is in some sense the probability for the presumed model H to fluctuate to give the observed data D. (Note that it is not the probability that H is true; that is not well-defined unless one specifies the complete set of alternatives to H.)
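The tail probability described above can be evaluated in closed form for integer $N$. The following is a minimal sketch using only the Python standard library; the function names `chi2` and `chi2_tail_probability` are illustrative, and the code assumes $N$ degrees of freedom (no parameters fitted to the data).

```python
import math

def chi2(values, means, widths):
    """Sum of squared, width-normalized deviations of values from means."""
    return sum(((x - mu) / s) ** 2 for x, mu, s in zip(values, means, widths))

def chi2_tail_probability(chi2_obs, n):
    """P(chi^2 > chi2_obs) for n degrees of freedom, via the finite series
    that holds for integer n (even and odd n are handled separately)."""
    if chi2_obs <= 0.0:
        return 1.0
    gauss = math.exp(-0.5 * chi2_obs)
    if n % 2 == 0:
        # Even n: exp(-x/2) * sum_{j=0}^{n/2-1} (x/2)^j / j!
        term, total = 1.0, 1.0
        for j in range(1, n // 2):
            term *= (chi2_obs / 2.0) / j
            total += term
        return gauss * total
    # Odd n: erfc(chi/sqrt(2)) plus a finite series in odd powers of chi.
    chi = math.sqrt(chi2_obs)
    total = math.erfc(chi / math.sqrt(2.0))
    term = chi * math.sqrt(2.0 / math.pi) * gauss
    for r in range(1, (n - 1) // 2 + 1):
        total += term
        term *= chi2_obs / (2 * r + 1)
    return total
```

For example, `chi2_tail_probability(chi2(x, mu, sigma), len(x))` gives the probability, assuming $H$, of a fluctuation at least as large as the one observed.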

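In order to generalize this procedure for other types of distributions, note that the probability for observing a particular measurement $D$ assuming $H$ is just

$$P(D|H) \propto \exp\left(-\chi^2_D / 2\right),$$

where $\chi^2_D$ is the value of $\chi^2$ for $D$, so that a larger $\chi^2$ corresponds to a smaller probability.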

This suggests that one can obtain an analogous significance for a problem with an arbitrary likelihood $L(D|H)$ by computing the integral

$$P = \int_{L(D'|H) < L(D|H)} L(D'|H)\, dD'.$$

That is, by computing the total probability of all possible data samples which have a lower probability than the one actually observed.
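As a consistency check, this prescription can be seen to reduce to the $\chi^2$ test in the gaussian case. The short derivation below is only a sketch; the symbols $x'_i$, $\mu_i$, $\sigma_i$ for the sampled values, means, and widths, and $D'$ for a generic data sample, are notation assumed here.

```latex
\begin{align*}
L(D'|H) &= \prod_{i=1}^{N} \frac{1}{\sqrt{2\pi}\,\sigma_i}
           \exp\!\left[-\frac{(x'_i-\mu_i)^2}{2\sigma_i^2}\right]
         = C\, e^{-\chi^2(D')/2},
         \qquad C = \prod_{i=1}^{N}\frac{1}{\sqrt{2\pi}\,\sigma_i}, \\
L(D'|H) < L(D|H) &\iff \chi^2(D') > \chi^2(D), \\
\int_{L(D'|H) < L(D|H)} L(D'|H)\, dD'
        &= \int_{\chi^2(D') > \chi^2(D)} L(D'|H)\, dD'
         = P\bigl(\chi^2 > \chi^2(D)\bigr).
\end{align*}
```

Because the likelihood is a monotonically decreasing function of $\chi^2$, ordering data samples by likelihood is the same as ordering them by $\chi^2$.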

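For the mass fitting problem, the hypothesis to test is that the data are described entirely by the background model. The appropriate likelihood is thus obtained by setting the signal contribution to zero in equation (7.28), yielding

$$L(D|n_b) = \frac{e^{-n_b}\, n_b^{N}}{N!} \prod_{i=1}^{N} f_b(m_i),$$

where $N$ is the number of observed events, $m_i$ are their masses, $f_b(m)$ is the background mass distribution, and $n_b$ is the expected number of background events.

The remaining parameter $n_b$ is then integrated out:

$$L(D|H) = \int dn_b\, P(n_b)\, L(D|n_b).$$

The prior $P(n_b)$ is again taken to be a gaussian.

The integral over the data space can then be written

$$P = \sum_{N'=0}^{\infty} \int dm_1 \cdots dm_{N'}\; L(D'|H)\; \theta\bigl(L(D|H) - L(D'|H)\bigr),$$

where $D' = (m_1, \ldots, m_{N'})$ is a possible data sample of $N'$ events and $\theta$ is the unit step function.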

Note that if the background mass distribution $f_b(m)$ is uniform, this prescription yields the same result as was used for the counting experiment (equation (5.12)). Strictly speaking, this is true only if the number of events $N'$ in the data samples integrated over is restricted to be larger than $N$, the number observed; the restriction makes a difference only in cases where the expected background is not small in comparison to the number of observed events. This is because the prescription developed here tests the consistency of $D$ with $H$ regardless of the direction of any disagreement, while (5.12) counts only upward fluctuations in the number of events. For example, consider some hypothetical experiment where $H$ predicts that 100 events should be expected, but 1000 events are actually observed. Both methods would assign a small probability to this occurrence. However, if 100 events are expected, it is also quite unlikely to see zero events. The prescription developed here will also assign a small probability to this latter case; the counting experiment significance (5.12), however, would assign it a probability of 1.
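The 100-expected example can be checked numerically for a pure counting experiment. The sketch below is illustrative only: it assumes a Poisson distribution with a precisely known mean (no prior on the background), and it assumes that the upward-only significance has the form $P(n \ge N)$, which is a stand-in for equation (5.12) rather than a transcription of it. The function names and the cutoff `n_max` are hypothetical.

```python
import math

def log_poisson(n, mean):
    """Log of the Poisson probability of n events for the given mean."""
    return -mean + n * math.log(mean) - math.lgamma(n + 1)

def ordering_significance(n_obs, mean, n_max=5000):
    """Total probability of all counts n (up to n_max) that are less
    probable than n_obs, whichever side of the mean they fall on."""
    log_p_obs = log_poisson(n_obs, mean)
    total = 0.0
    for n in range(n_max + 1):
        log_p = log_poisson(n, mean)
        if log_p < log_p_obs:  # compare in log space to avoid underflow
            total += math.exp(log_p)
    return total

def upward_significance(n_obs, mean, n_max=5000):
    """Probability of seeing n_obs or more events (upward fluctuations only)."""
    return sum(math.exp(log_poisson(n, mean)) for n in range(n_obs, n_max + 1))

for n_obs in (0, 1000):
    print(n_obs, ordering_significance(n_obs, 100.0), upward_significance(n_obs, 100.0))
```

With a mean of 100, both functions return a vanishingly small value for 1000 observed events, while for zero observed events only the likelihood-ordered version is small; the upward-only sum is 1 to within rounding.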

The integral in equation (8.4) can be evaluated by Monte Carlo techniques. An outline of a procedure for doing so is as follows.

  1. Before starting, evaluate the likelihood for the data point being tested, $L(D|H)$.

  2. Loop over the number of events in the Monte Carlo ensemble, $N' = 0, 1, 2, \ldots$

  3. Evaluate the piece of the likelihood which depends only on the number of events:

     $$L_{\rm count}(N') = \int dn_b\, P(n_b)\, \frac{e^{-n_b}\, n_b^{N'}}{N'!},$$

     where $n_b$ is the expected number of background events and $P(n_b)$ is its prior. Define a probability threshold by $t = L(D|H) / L_{\rm count}(N')$.

  4. Generate a large number $N_{\rm gen}$ of $N'$-event experiments, picking each mass from the background probability distribution $f_b(m)$. This forms a set of samples $D'_j$. For each of these samples, compute the remaining likelihood factor $\prod_{i=1}^{N'} f_b(m_i)$, and count the number of times that this is less than the threshold $t$. Call this $N_{\rm below}$.

  5. The contribution to the significance is then $L_{\rm count}(N')\, N_{\rm below} / N_{\rm gen}$. Return to step 2, and continue looping until the terms being summed become insignificantly small.
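The steps above can be sketched in a few lines of Python. Everything specific in this sketch is an assumption made for illustration, not taken from the analysis: the background shape (a falling exponential above `M0` with slope `TAU`), the gaussian prior parameters `b0` and `sigma_b`, the midpoint-rule integration, the stopping tolerance, and all function names. Likelihoods are compared in log space to avoid underflow.

```python
import math
import random

# Hypothetical background shape: falling exponential above M0 with slope TAU.
M0, TAU = 100.0, 20.0

def log_f_b(m):
    """Log of the assumed background mass density, valid for m >= M0."""
    return -math.log(TAU) - (m - M0) / TAU

def sample_mass(rng):
    """Draw one mass from the assumed background density."""
    return M0 + rng.expovariate(1.0 / TAU)

def count_likelihood(n, b0, sigma_b, steps=400):
    """Poisson probability of n events averaged over a gaussian prior on the
    expected background (truncated to positive values, midpoint rule)."""
    lo, hi = max(0.0, b0 - 5.0 * sigma_b), b0 + 5.0 * sigma_b
    h = (hi - lo) / steps
    total = norm = 0.0
    for k in range(steps):
        b = lo + (k + 0.5) * h
        w = math.exp(-0.5 * ((b - b0) / sigma_b) ** 2)
        norm += w
        total += w * math.exp(-b + n * math.log(b) - math.lgamma(n + 1))
    return total / norm

def mc_significance(masses, b0, sigma_b, n_gen=2000, tol=1e-6, seed=1):
    rng = random.Random(seed)
    # Step 1: log-likelihood of the observed data under background only.
    log_l_data = (math.log(count_likelihood(len(masses), b0, sigma_b))
                  + sum(log_f_b(m) for m in masses))
    significance = 0.0
    n = 0
    while True:  # Step 2: loop over the number of events per experiment.
        l_count = count_likelihood(n, b0, sigma_b)  # Step 3: counting piece
        if l_count > 0.0:
            log_threshold = log_l_data - math.log(l_count)
            n_below = 0
            for _ in range(n_gen):  # Step 4: generate n-event experiments
                log_shape = sum(log_f_b(sample_mass(rng)) for _ in range(n))
                if log_shape < log_threshold:
                    n_below += 1
            significance += l_count * n_below / n_gen  # Step 5
        # Stop once past the prior mean and the counting piece is negligible.
        if n > b0 and l_count < tol:
            break
        n += 1
    return significance
```

A call such as `mc_significance([110.0, 110.0, 110.0], b0=3.0, sigma_b=0.5)` returns the estimated significance for three events at 110. The statistical precision is limited by `n_gen`, so very small significances need correspondingly larger ensembles.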

The results of this calculation are for the loose cuts, and for the standard cuts. If the calculation is repeated with the background mass distribution taken to be uniform (i.e., using only counting information), the results are for the loose cuts and for the standard cuts. (If the counting experiment prescription of equation (5.12) were used instead, the result would be unchanged for the standard cuts, but would go down to for the loose cuts.)

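It is also interesting to try to construct a significance which uses only the shapes of the distributions and which does not depend on the overall scale of the expected background. This can be done by fixing the number of events in each sampled data set to $N' = N$ and taking the likelihood to be simply the product of the background mass densities, $L = \prod_{i=1}^{N} f_b(m_i)$. The results from this are 0.06 for the loose cuts and 0.30 for the standard cuts.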






Scott Snyder Fri May 19 19:19:46 CDT 1995