DØ System Problems



Trouble Shooting

Before contacting any of the people below, do some trouble shooting to narrow down the possibilities so that they are able to help you.
See VMS System Diagnostics and UNIX System Diagnostics for some simple things you can try to diagnose common problems before calling for help.

Customer Support

When trouble occurs with any DØ computer system, the first place to try to get help is the Fermilab Customer Support group (formerly the Help Desk). For non-urgent help, fill out the Help Request Form or send mail to helpdesk@fnal.gov. For more urgent problems, call x2345. Off-hours, this number reaches a voice mail system, which you can escape from to reach the Feynman Center Computer Operators (x2746). Customer Support will forward your problem to the proper people as quickly as possible. They know whom to call at any given time, depending on the type of problem.

Be as explicit as possible in describing your problem. Give as many symptoms as you can to help narrow it down. An "unresponsive" window can be a network problem, a disk failure, your window manager, your machine, or any number of other things. One of the more important pieces of information you can give is how extensive the problem is: just one window, all windows on your machine, all machines in your area, or all machines at DØ. The Help Request Form guides you through presenting some of the information needed. Be sure to mention which machine(s), cluster(s), printer(s), etc. are involved. For printers, mention the printer or queue name as well as the machine or cluster from which you are printing.

NOTE: Most boot nodes and servers are under 7-day, 24-hour (7x24) support. Satellite nodes are not; they are covered only 5 days, 8 hours (5x8). So if you can verify that the problem is a boot node or server problem, its priority will increase drastically.

Examples of 7x24 problems are: a boot node or server down; user disks (login roots) unreachable; the batch system down; all printer queues down; etc. The printers themselves are 5x8 (you can use a different printer), but if the problem is with the serving node, which takes out all of them, it is 7x24.
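The coverage rule above can be sketched as a small lookup. This is only an illustration: the category names are hypothetical labels made up from the examples in the text, and since only "most" boot nodes and servers are 7x24, Customer Support remains the authority on how a given problem is actually handled.

```python
from enum import Enum


class Coverage(Enum):
    ALWAYS = "7x24"          # 7 days, 24 hours
    BUSINESS_HOURS = "5x8"   # 5 days, 8 hours


# Hypothetical category names, taken from the examples in the text above.
ALWAYS_COVERED = {
    "boot_node_down",
    "server_down",
    "user_disks_unreachable",
    "batch_system_down",
    "all_printer_queues_down",
}


def coverage_for(problem: str) -> Coverage:
    """Return the support coverage assumed for a problem category."""
    if problem in ALWAYS_COVERED:
        return Coverage.ALWAYS
    # Satellite nodes and individual printers fall back to 5x8.
    return Coverage.BUSINESS_HOURS


if __name__ == "__main__":
    for problem in ("boot_node_down", "single_printer_down"):
        print(problem, "->", coverage_for(problem).value)
```

A single broken printer maps to 5x8 here, while the loss of every printer queue (a serving node problem) maps to 7x24, matching the distinction drawn above.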

System Managers

Mail sent directly to the system managers can also be useful. Use these addresses only when you are sure that the problem requires system manager action: incorrect system configurations (system commands, as opposed to DØ commands, that have ceased working), some sorts of hardware problems where it is not clear what hardware is at fault, etc. Note that mail to these addresses is not certain to be seen immediately. Mail or a phone call to Customer Support will be acted on immediately and, depending on the problem and the system(s) involved, may result in paging the responsible person(s) or even calls to their homes.

Hardware Maintenance

When the problem is definitely a hardware failure, and you can identify the hardware at fault, you need to make arrangements to have the machine repaired. The preferred method is to use the Computer Hardware Service Request Form. Next best is to send an explicit description of the problem via email to svscall@fnal.gov. You must include the computer name, model, owner and any identifying numbers on both the failed device and the host computer. The "S" number (blue tag) is preferred, but others, including the serial number, are sometimes usable. Include your name, email address, phone number, affiliation and, most important, the location of the device. The last option is to phone in your request at x4373. Be sure you have all the information mentioned above before you call.
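As a rough aid, the items listed above can be collected into a checklist before you fill out the form, send mail, or phone. The sketch below is not part of any Fermilab tool; the class and field names are hypothetical and simply mirror the information the paragraph asks for.

```python
from dataclasses import dataclass, fields


@dataclass
class ServiceRequest:
    """Hypothetical checklist of the details requested above."""
    computer_name: str = ""
    model: str = ""
    owner: str = ""
    device_id: str = ""       # "S" number (blue tag) preferred; serial number sometimes usable
    host_id: str = ""         # identifying number on the host computer
    requester_name: str = ""
    email: str = ""
    phone: str = ""
    affiliation: str = ""
    location: str = ""        # where the failed device physically is
    description: str = ""     # explicit description of the problem


def missing_fields(request: ServiceRequest) -> list[str]:
    """List the fields that are still empty or whitespace-only."""
    return [f.name for f in fields(request)
            if not getattr(request, f.name).strip()]


if __name__ == "__main__":
    # "examplenode" is a placeholder, not a real machine name.
    draft = ServiceRequest(computer_name="examplenode", description="disk will not spin up")
    print("Still needed:", ", ".join(missing_fields(draft)))
```

An empty result from `missing_fields` means every item mentioned above has been filled in, which is a reasonable point at which to submit the request.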

NOTE: If this is an urgent call, that is, one for a boot node or server, call Customer Support at x2345 rather than Hardware Maintenance directly. You'll get much faster service.



Alan Jonckheere
Last modified: Wed Mar 10 12:08:57 CST 1999