Thoughts on DFEA2 Commissioning

Jamieson Olsen

Introduction

Two DFEA2 boards have been installed on the west platform and they are receiving real CTT data from the splitter cards. At this point the new hardware seems to be working quite well. The "parasitic" connection to the existing CTT system is working. The DFEA2 is happy with the data and control information that it's presented. Downstream boards appear to be receiving properly formed records from the DFEA2.

Before the DFEA hardware is removed and replaced with the DFEA2 hardware, more work needs to be done to ensure that any firmware bugs have been fixed and that, after the summer shutdown, the CTT will resume quickly and without major interruption.

Below are some of the test procedures that should be addressed between now and the installation. Some of these procedures are in progress now, while others have not been started, and there is currently no post-doc slated to work on them long term.


Verify the cables

Put the AFE boards into fake track mode. Capture and dump out the DFEA2 input buffers and compare these against the expected values. This is a very strong test since it checks the AFE personality code, the MIXER operation, and the LVDS cables between the AFE, MIXER, and DFEA2.
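The comparison step can be sketched in Python as follows. This is only an illustration, not the actual commissioning tool: the function name, the representation of a buffer as a list of integer words, and the sample values are all assumptions.

```python
def compare_buffers(captured, expected):
    """Return (index, captured, expected, differing_bits) for each mismatched word.

    Both arguments are lists of integer words. This layout is an assumption,
    not the real DFEA2 input buffer format.
    """
    if len(captured) != len(expected):
        raise ValueError(
            f"length mismatch: captured {len(captured)}, expected {len(expected)}"
        )
    mismatches = []
    for index, (got, want) in enumerate(zip(captured, expected)):
        if got != want:
            # XOR marks the bit positions that differ, which can hint at a
            # single bad pair in an LVDS cable.
            mismatches.append((index, got, want, got ^ want))
    return mismatches


if __name__ == "__main__":
    # Made-up sample words, for illustration only.
    expected = [0x0F0F, 0x1234, 0xFFFF]
    captured = [0x0F0F, 0x1230, 0xFFFF]
    for index, got, want, bits in compare_buffers(captured, expected):
        print(f"word {index}: got {got:#06x}, expected {want:#06x}, bits {bits:#06x}")
```

Reporting the differing bits rather than just pass/fail helps separate a stuck or swapped cable line (the same bit wrong in every word) from a fake-track pattern error (different words wrong on different inputs).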

Check the DFEA2 status registers

Once the Sequencers and AFEs are in a stable state, clear the history bits on the DFEA2. Then read back the DFEA2 status bits and verify that all links are present, that no link clock or sync problems exist, and that there is no disagreement between the embedded control bits and the SCL control bits delivered via the DFEC2 board. Also, the boards downstream of the DFEA2 will report any parity errors, etc. (Ultimately this process will execute automatically with EPICS, and alarms will be issued automatically if there are problems.)
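A status check of this kind might look like the Python sketch below. The bit assignments, flag names, and board labels are hypothetical, since the real DFEA2 register map is not given in this note. Only the decoding logic is meant to carry over.

```python
# Hypothetical bit assignments; the real DFEA2 status register map may differ.
FLAGS = {
    0x01: "link missing",
    0x02: "link clock error",
    0x04: "sync error",
    0x08: "embedded/SCL control bit mismatch",
}


def decode_status(word):
    """Return the names of all problem flags set in a status word."""
    return [name for mask, name in FLAGS.items() if word & mask]


def find_problems(status_by_board):
    """Map each board with at least one flag set to its list of problems."""
    problems = {}
    for board, word in status_by_board.items():
        flags = decode_status(word)
        if flags:
            problems[board] = flags
    return problems


if __name__ == "__main__":
    # Made-up readback values for two boards (labels are placeholders).
    readback = {"board A": 0x00, "board B": 0x0A}
    for board, flags in find_problems(readback).items():
        print(f"{board}: {', '.join(flags)}")
```

The readback should be taken only after the history bits have been cleared, so that any flag that appears reflects a problem in the current stable state rather than one left over from startup.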

Capture DFEA2 data

Configure the DFEA2 to capture its inputs and outputs at the next L1accept. Then read out these buffers and dump the data to a file, reformatting as necessary. Feed the input data files into TrigSim and it will produce the expected output files (L1, L2CFT, and L2CPS) for this event. Then compare the real output records against the expected records and flag any differences. Repeat for each captured event.

The TrigSim program must model exactly what the DFEA2 firmware is doing - unpacking the fibers, track finding, track sorting, cluster/track matching, etc. The original DFEA firmware has been rewritten for the DFEA2, to what extent I'm not sure, so Shouxiang will have to be involved with the TrigSim modifications. Additionally, TrigSim will need to know what unpacking map and track equations are being used. The most straightforward method is to have TrigSim read in and parse the actual VHDL files (but writing a VHDL parser from scratch does not sound like fun).

This is a very strong check of the DFEA2 firmware and it will be the most useful tool for tracking down firmware bugs. If there is a discrepancy, the captured input and output data can be sent back to Shouxiang and he can run it through the simulator to find where the discrepancy originates. Due to bandwidth limitations this capture process cannot keep up with high L1accept rates, but it should be able to grab events at a rate of perhaps a hundred hertz.
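The compare-and-flag step could be sketched in Python as below. The representation of an event as a dictionary from record type to a list of integer words is an assumption made for illustration. The actual dump and TrigSim file formats are not described here, so reading them is left out.

```python
RECORD_TYPES = ("L1", "L2CFT", "L2CPS")


def diff_event(real, expected):
    """Compare real and expected output records for one captured event.

    Each argument maps a record type to a list of integer words (an assumed
    layout). Returns a list of human-readable difference descriptions.
    """
    differences = []
    for record_type in RECORD_TYPES:
        got = real.get(record_type, [])
        want = expected.get(record_type, [])
        if len(got) != len(want):
            differences.append(
                f"{record_type}: length {len(got)}, expected {len(want)}"
            )
            continue
        for index, (g, w) in enumerate(zip(got, want)):
            if g != w:
                differences.append(
                    f"{record_type}[{index}]: got {g:#x}, expected {w:#x}"
                )
    return differences


if __name__ == "__main__":
    # Made-up words, for illustration only.
    real = {"L1": [0x10, 0x20], "L2CFT": [0x1], "L2CPS": []}
    expected = {"L1": [0x10, 0x21], "L2CFT": [0x1], "L2CPS": []}
    for line in diff_event(real, expected):
        print(line)
```

Any event that produces a non-empty difference list should be saved together with its captured input data, since the inputs are what make it possible to replay the event in the simulator.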

Comparing DFEA and DFEA2 records

The DFEA2 sends output records to an extra CTOC board installed on the west platform. With each L1accept this extra CTOC board sends the requested L1 records to our L3 readout crate (x13). The Examine tool compares the DFEA's and DFEA2's L1 records against the expected values calculated from the fiber information extracted directly from the AFEs. The advantage of this tool is that it compares the L1 records with each L1accept, so it's possible to get some good statistics this way. Unfortunately this test doesn't do much more than confirm that there's a difference in there somewhere. Without the input data it's impossible to analyze the event in the simulator and quickly track down the bug. Also, care must be taken to ensure that the DFEA and DFEA2 are using the same (or functionally identical) set of track equations.
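The kind of statistics this comparison yields can be illustrated with a short Python sketch. It is not the Examine tool itself: the record representation (any comparable value, here a tuple of words), the category names, and the sample events are assumptions.

```python
from collections import Counter


def classify(dfea_record, dfea2_record, expected_record):
    """Classify one L1accept by which board's L1 record matches the expectation."""
    old_ok = dfea_record == expected_record
    new_ok = dfea2_record == expected_record
    if old_ok and new_ok:
        return "both match"
    if old_ok:
        return "only DFEA2 differs"
    if new_ok:
        return "only DFEA differs"
    return "both differ"


def tally(events):
    """Count categories over an iterable of (dfea, dfea2, expected) triples."""
    return Counter(classify(*event) for event in events)


if __name__ == "__main__":
    # Made-up records, for illustration only.
    events = [
        ((1, 2), (1, 2), (1, 2)),
        ((1, 2), (1, 3), (1, 2)),
        ((1, 2), (1, 2), (1, 2)),
    ]
    for category, count in tally(events).items():
        print(f"{category}: {count}")
```

A large "both differ" count would point away from the DFEA2 firmware and toward the expectation itself, for example mismatched track equations, which is why the equation sets need to be checked first.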

Comparing Trigger Rates

To perform this test, one of the "real" CTOC boards must be unplugged from the CTTT and replaced by the 'extra' CTOC (the one downstream of the DFEA2). Assuming that the DFEA2 output data stream is aligned to within +/- one RF tick of the DFEA, the CTTT can then process the information and pass it along to the trigger framework. Given the difficulty in interpreting the results this test seems quite dubious -- with the DFEAs our XXX trigger term rate is 1674.56 Hz but with the DFEA2s it's 1721.53 Hz. But what does that really tell us about firmware/hardware problems?
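To make the interpretation problem concrete, the two example rates can be compared numerically. The Python sketch below computes the relative difference and a rough significance. It assumes that both rates were measured over the same counting time, that the counts are independent and Poisson distributed, and that beam conditions were identical for the two measurements. The counting time is a free parameter because none is quoted above.

```python
import math


def relative_difference(rate_old, rate_new):
    """Fractional change of the new rate relative to the old one."""
    return (rate_new - rate_old) / rate_old


def significance(rate_old, rate_new, seconds):
    """Rate difference in units of its statistical uncertainty.

    Assumes each rate comes from Poisson counts accumulated over `seconds`,
    so the uncertainty on each rate is sqrt(rate / seconds).
    """
    sigma = math.sqrt((rate_old + rate_new) / seconds)
    return abs(rate_new - rate_old) / sigma


if __name__ == "__main__":
    old, new = 1674.56, 1721.53  # example rates from the text, in Hz
    print(f"relative difference: {relative_difference(old, new):.2%}")
    for seconds in (1, 10, 100):  # assumed counting times
        print(f"{seconds:>4} s: {significance(old, new, seconds):.1f} sigma")
```

The relative difference here is about 2.8%. Even when such a difference is statistically significant, it says only that the two systems behave differently in aggregate. It does not identify which events differ or why, which is the limitation noted above.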


updated 5 January 2005