Thoughts on DFEA2 Commissioning
Jamieson Olsen
Introduction
Two DFEA2 boards have been installed on the west platform and they are
receiving real CTT data from the splitter cards. At this point the
new hardware seems to be working quite well. The "parasitic"
connection to the existing CTT system is working. The DFEA2 is happy
with the data and control information that it's presented.
Downstream boards appear to be receiving properly formed records from
the DFEA2.
Before the DFEA hardware is removed and replaced with the DFEA2 hardware,
more work needs to be done to ensure that any firmware bugs have been fixed
and that after the summer shutdown the CTT will resume operation quickly and
without major interruption.
Below are some of the test procedures that should be addressed
between now and the installation. Some of these procedures are in progress now,
while others have not been started, and there is currently no post-doc slated to
work on them long term.
Verify the cables
Put the AFE boards into fake track mode. Capture and dump out the
DFEA2 input buffers and compare these against the
expected values. This is a very strong test since it checks the AFE
personality code, the MIXER operation, and the LVDS cables between the AFE,
MIXER, and DFEA2.
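The comparison step can be sketched in a few lines of Python. This is only an illustration: the buffer representation (a dictionary mapping a link number to a list of integer words) and the function name compare_buffers are assumptions, not the format of any existing DFEA2 readout tool.

```python
from itertools import zip_longest

def compare_buffers(captured, expected):
    """Compare captured DFEA2 input buffers against the fake-track pattern.

    Both arguments are assumed (hypothetically) to be dicts mapping a link
    number to a list of integer words. Returns one tuple per mismatch:
    (link, word_index, captured_word, expected_word, flipped_bits).
    A word missing on either side is reported as None.
    """
    mismatches = []
    for link in sorted(expected):
        pairs = zip_longest(captured.get(link, []), expected[link])
        for index, (cap, exp) in enumerate(pairs):
            if cap == exp:
                continue
            # XOR marks which bits differ; None if one side has no word.
            flipped = None if cap is None or exp is None else cap ^ exp
            mismatches.append((link, index, cap, exp, flipped))
    return mismatches
```

An empty result means the captured buffers match the expected pattern. If the same bit appears in the flipped_bits column for every word on one link, that would suggest a problem with a single line on that link rather than with the AFE or MIXER logic.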
Check the DFEA2 status registers
Once the Sequencers and AFEs are in a stable state, clear the
history bits on the DFEA2. Then read back the DFEA2 status bits and
verify that all links are present, that no link clock or sync
problems exist, and that there is no disagreement between the
embedded control bits and the SCL control bits delivered via the
DFEC2 board. Also, the boards downstream of the DFEA2 will report any
parity errors, etc. (Ultimately this process will execute
automatically with EPICS, and alarms will be issued automatically if
there are problems.)
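As a rough sketch of what the automated check might look like, the Python below decodes one status word into a list of problems. The bit assignments are entirely hypothetical and are here only to make the example runnable; the real layout has to come from the DFEA2 firmware documentation.

```python
# HYPOTHETICAL bit assignments, for illustration only.
STATUS_FLAGS = [
    (0x1, "link missing"),
    (0x2, "link clock error"),
    (0x4, "link sync error"),
    (0x8, "embedded/SCL control bit mismatch"),
]

def decode_status(word):
    """Return the names of all problem flags set in a status word."""
    return [name for mask, name in STATUS_FLAGS if word & mask]

def board_is_healthy(status_words):
    """True if none of the status words read back has a problem flag set."""
    return all(not decode_status(word) for word in status_words)
```

The history bits should be cleared before the readback, as described above, so that old transients from the Sequencer and AFE startup do not show up as errors.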
Capture DFEA2 data
Configure the DFEA2 to capture its inputs and outputs at the next
L1accept. Then read out these buffers and dump the data to a file,
reformatting as necessary. Feed the input data files into TrigSim
and it will produce the expected output files (L1, L2CFT, and L2CPS)
for this event. Then compare the real output records against the
expected records and flag any differences. And repeat. The TrigSim
program must model exactly what the DFEA2 firmware is doing -
unpacking the fibers, track finding, track sorting, cluster/track
matching, etc. The original DFEA firmware has been rewritten for
the DFEA2 (to what extent I'm not sure), so Shouxiang will have
to be involved with the TrigSim modifications. Additionally, TrigSim
will need to know what unpacking map and track equations are being
used. The most straightforward method is to have TrigSim read in and
parse the actual VHDL files (but writing a VHDL parser from scratch
does not sound like fun). This is a very strong check of the DFEA2
firmware and it will be the most useful tool for tracking down
firmware bugs. If there is a discrepancy, the captured input and
output data can be sent back to Shouxiang and he can run it through
the simulator to find where the discrepancy originates. Due to
bandwidth limitations this capture process cannot keep up with high
L1accept rates, but it should be able to grab events at the rate of
perhaps a hundred hertz.
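The capture, simulate, compare, and repeat loop could be organized roughly as follows. This is a sketch under stated assumptions: each captured event is taken to be an (inputs, outputs) pair, the outputs are a dictionary keyed by stream name, and the simulate argument is a stand-in for however TrigSim actually ends up being invoked.

```python
STREAMS = ("L1", "L2CFT", "L2CPS")

def check_events(events, simulate):
    """Compare real DFEA2 output records against simulated ones.

    events:   iterable of (inputs, real_outputs) pairs, where real_outputs
              maps a stream name to a list of words (assumed format).
    simulate: callable standing in for TrigSim; takes the captured inputs
              and returns the expected outputs in the same format.
    Returns one entry per failing event, keeping the inputs and both sets
    of outputs so the event can be rerun in the firmware simulator.
    """
    failures = []
    for number, (inputs, real) in enumerate(events):
        expected = simulate(inputs)
        bad = [s for s in STREAMS if real.get(s) != expected.get(s)]
        if bad:
            failures.append({
                "event": number,
                "streams": bad,
                "inputs": inputs,
                "real": real,
                "expected": expected,
            })
    return failures
```

Keeping the captured inputs alongside each failure is the point of the design: it is what allows a discrepancy to be reproduced offline instead of merely counted.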
Comparing DFEA and DFEA2 records
The DFEA2 sends output records to an extra CTOC board installed on the
west platform. With each L1accept this extra CTOC board sends the
requested L1 records to our L3 readout crate (x13). The Examine tool
compares the DFEA's and DFEA2's L1 records against the expected value
calculated from the fiber information extracted directly from the
AFEs. The advantage of this tool is that it compares the L1 records
with each L1accept, so it's possible to get some good statistics this
way. Unfortunately this test doesn't do much more than confirm that
there's a difference in there somewhere. Without the input data
it's impossible to analyze the event in the simulator and quickly track
down the bug. Also, care must be taken to ensure that the DFEA and
DFEA2 are using the same (or functionally identical) set of track
equations.
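The kind of statistics this comparison yields can be illustrated with a short tally. The sketch assumes each event has already been reduced to three comparable L1 records (DFEA, DFEA2, and the value expected from the AFE fiber information); it does not reflect how the Examine tool is actually implemented.

```python
from collections import Counter

def classify(dfea, dfea2, expected):
    """Sort one event into a category by which board matches the expectation."""
    old_ok = dfea == expected
    new_ok = dfea2 == expected
    if old_ok and new_ok:
        return "both agree"
    if old_ok:
        return "DFEA2 differs"
    if new_ok:
        return "DFEA differs"
    return "both differ"

def tally(events):
    """Count events per category; events is an iterable of
    (dfea_record, dfea2_record, expected_record) tuples."""
    return Counter(classify(*event) for event in events)
```

A large "both differ" count would point toward the expectation itself (for example, mismatched track equations) rather than toward either board.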
Comparing Trigger Rates
To perform this test, one of the "real" CTOC boards must be
unplugged from the CTTT and replaced by the "extra" CTOC (the one
downstream of the DFEA2). Assuming that the DFEA2 output data stream
is aligned to within +/- one RF tick of the DFEA then the CTTT can
process the information and pass it along to the trigger framework.
Given the difficulty in interpreting the results this test seems
quite dubious -- with the DFEAs our XXX trigger term rate is 1674.56
Hz but with the DFEA2s it's 1721.53 Hz. But what does that really tell
us about firmware/hardware problems?
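To put the example numbers in perspective, the sketch below computes the relative rate difference and a naive Poisson significance. The counting time is an assumption (the rates above are only illustrative), and the calculation ignores luminosity changes between the two measurements, which would have to be taken at different times.

```python
import math

def relative_difference(rate_old, rate_new):
    """Fractional change of the new rate with respect to the old one."""
    return (rate_new - rate_old) / rate_old

def significance(rate_old, rate_new, seconds):
    """Rate difference in units of its Poisson uncertainty, assuming each
    rate was measured by counting for the same number of seconds."""
    sigma_old = math.sqrt(rate_old / seconds)
    sigma_new = math.sqrt(rate_new / seconds)
    return (rate_new - rate_old) / math.hypot(sigma_old, sigma_new)

rel = relative_difference(1674.56, 1721.53)   # about 0.028, i.e. 2.8%
z = significance(1674.56, 1721.53, 10.0)      # 10 s counting time is assumed
```

Even a statistically significant difference only says that the two systems disagree at the few-percent level; it does not say which events disagree or why.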
updated 5 January 2005