LSF ADMINISTRATOR'S GUIDE
=========================
The main administration tools are "lsadmin" and "badmin". "xlsadmin" is the
graphical interface to both commands. Either command can be run interactively:
start it with no arguments, then type "help" at its prompt for details. For example:
% lsadmin
lsadmin>help
Commands are :
reconfig ckconfig limrestart limshutdown limlock limunlock
resrestart resshutdown reslogon reslogoff help ?
quit
Try help command... to get details.
CHECKING LOGS
-------------
1. Error logs "/usr/local/lsf/log/*"
2. lsbatch accounting log "/usr/local/lsf/work/D0_BATCH/logdir/lsb.acct"
3. lsbatch event log "/usr/local/lsf/work/D0_BATCH/logdir/lsb.events"
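
To scan the error logs quickly, a small helper along these lines may be useful.
This is a sketch only: the function name "lsf_tail_logs" is made up for this
guide, and the default directory is simply the error log path listed above.

```shell
# Hypothetical helper (not part of LSF): print the last few lines of
# every regular file in the LSF error log directory.
#   $1 = log directory (defaults to the path used in this guide)
#   $2 = number of lines to show per file (defaults to 5)
lsf_tail_logs() {
    logdir="${1:-/usr/local/lsf/log}"
    lines="${2:-5}"
    for f in "$logdir"/*; do
        [ -f "$f" ] || continue          # skip directories and unmatched globs
        printf '==> %s <==\n' "$f"
        tail -n "$lines" "$f"
    done
}
```

For example, "lsf_tail_logs /usr/local/lsf/log 20" shows the last 20 lines of each log.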
CHECKING HOST STATUS
--------------------
% lsload --- report current status and load levels of hosts
% lsmon --- running display of host status and load levels
% xlsmon --- graphics display of host status and load levels
RECONFIGURING LSF CLUSTER
-------------------------
First, edit the relevant LSF configuration files, which reside in
/usr/local/lsf/conf
/usr/local/lsf/conf/lsbatch/D0_BATCH/configdir
Then tell the LSF daemons to read the new configuration by running
% lsadmin reconfig
% badmin reconfig
You can do this while the LSF system is in use. Running and pending jobs
are not affected.
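
It is usually safer to check the configuration before reconfiguring. The sketch
below chains "ckconfig" and "reconfig" for both tools. It assumes that "ckconfig"
exits with a non-zero status when it finds errors, which you should verify on
your installation. "lsf_reconfig", LSADMIN and BADMIN are names invented here;
the two variables only exist to allow a dry run. The "reconfig" commands may
still ask for confirmation.

```shell
# Hypothetical wrapper (not part of LSF): check, then reconfigure.
# LSADMIN and BADMIN are illustrative overrides, e.g. LSADMIN=echo for a dry run.
lsf_reconfig() {
    lsadmin_cmd="${LSADMIN:-lsadmin}"
    badmin_cmd="${BADMIN:-badmin}"
    # Stop at the first failing step so a bad configuration is never loaded.
    "$lsadmin_cmd" ckconfig || return 1
    "$badmin_cmd" ckconfig || return 1
    "$lsadmin_cmd" reconfig || return 1
    "$badmin_cmd" reconfig
}
```

A dry run such as "(LSADMIN=echo; BADMIN=echo; lsf_reconfig)" prints the four
subcommands in order without touching the cluster.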
RESTARTING LSF DAEMONS
----------------------
The LSF daemons can be restarted to clear persistent errors. Interactive
and batch jobs running on the host are not affected by restarting daemons.
% lsadmin
lsadmin>limrestart d0cha --- restart LIM daemon on d0cha
lsadmin>resrestart d0chb --- restart RES daemon on d0chb
lsadmin>quit
% badmin hrestart all --- restart sbatchd daemon on all hosts
BATCH SYSTEM STATUS
-------------------
% badmin hhist --- lsbatch server hosts history
% badmin qhist --- lsbatch queue history
% badmin mbdhist --- mbatchd daemon history
% badmin hist --- all lsbatch history information
CONTROLLING LSBATCH QUEUES
--------------------------
Each batch queue is either open or closed, and either active or inactive.
Users can submit jobs to open queues but not to closed queues. Active queues
start jobs; inactive queues hold all submitted jobs until they are activated.
% bqueues top_cha --- current status of queue top_cha
% bqueues -l --- long list of all queues
% badmin qclose top_cha --- close queue top_cha
% badmin qopen top_cha --- open queue top_cha
% badmin qinact top_cha --- inactivate queue top_cha
% badmin qact top_cha --- activate queue top_cha
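
The four "badmin" queue commands above are often used in pairs, for example
around a maintenance window. A minimal sketch, assuming you want to both close
and inactivate a queue and later reverse that; "lsf_queue_maint" and the BADMIN
override are hypothetical names used only for illustration:

```shell
# Hypothetical helper (not part of LSF).
#   lsf_queue_maint begin QUEUE  -> close the queue, then inactivate it
#   lsf_queue_maint end QUEUE    -> activate the queue, then open it
# BADMIN is an illustrative override, e.g. BADMIN=echo for a dry run.
lsf_queue_maint() {
    badmin_cmd="${BADMIN:-badmin}"
    case "$1" in
        begin)
            "$badmin_cmd" qclose "$2" && "$badmin_cmd" qinact "$2" ;;
        end)
            "$badmin_cmd" qact "$2" && "$badmin_cmd" qopen "$2" ;;
        *)
            echo "usage: lsf_queue_maint begin|end queue" >&2
            return 2 ;;
    esac
}
```

Check the result with "bqueues top_cha" after each call.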
CONTROLLING LSBATCH JOBS
------------------------
The LSF administrator can control batch jobs belonging to any user.
Other users may control only their own jobs. Jobs can be suspended,
resumed, killed, and moved within and between queues.
% bjobs --- get information about batch jobs
% btop jobID --- move a pending job to the top of its queue
% bbot jobID --- move a pending job to the bottom of its queue
% bswitch --- move pending and running jobs from one queue to another;
see "man bswitch" for details
% bstop jobID --- stop a job
% bresume jobID --- resume a stopped job
% bkill jobID --- kill a running or pending job
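
When several jobs need the same action, a loop saves typing. The sketch below
applies "bstop", "bresume" or "bkill" to each job ID in turn. "lsf_jobs" and
LSF_RUN are hypothetical names. The helper accepts plain numeric job IDs only,
which is an assumption made here for safety, not an LSF restriction.

```shell
# Hypothetical helper (not part of LSF): run bstop, bresume or bkill
# on a list of job IDs, one command per job.
# LSF_RUN is an illustrative prefix, e.g. LSF_RUN=echo for a dry run.
lsf_jobs() {
    action="$1"
    shift
    case "$action" in
        stop|resume|kill) ;;
        *) echo "usage: lsf_jobs stop|resume|kill jobID..." >&2; return 2 ;;
    esac
    for id in "$@"; do
        case "$id" in
            # Reject empty or non-numeric arguments instead of passing them on.
            ''|*[!0-9]*) echo "skipping invalid job ID: $id" >&2; continue ;;
        esac
        ${LSF_RUN:-} "b$action" "$id"
    done
}
```

For example, "lsf_jobs stop 101 102" runs "bstop 101" and then "bstop 102".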
ADDING USERS TO A BATCH QUEUE
-----------------------------
1. Log in to d0cha/d0chb as the LSF administrator
2. Edit file /usr/local/lsf/conf/lsbatch/D0_BATCH/configdir/lsb.users
to add user names to the specific group
3. Run "badmin ckconfig" to check for errors
4. Run "badmin reconfig" to make the changes take effect
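
Before editing a configuration file in step 2, it may be worth keeping a copy
to fall back on if "badmin ckconfig" reports errors. A minimal sketch; the
function name "lsf_backup_config" and the ".bak" naming scheme are assumptions
of this example, not LSF conventions:

```shell
# Hypothetical helper (not part of LSF): copy a configuration file to a
# timestamped backup next to it and print the backup's path.
lsf_backup_config() {
    src="$1"
    [ -f "$src" ] || { echo "no such file: $src" >&2; return 1; }
    dst="$src.$(date +%Y%m%d%H%M%S).bak"
    # -p keeps the original mode and modification time on the copy.
    cp -p "$src" "$dst" && printf '%s\n' "$dst"
}
```

For example, run
"lsf_backup_config /usr/local/lsf/conf/lsbatch/D0_BATCH/configdir/lsb.users"
before opening the file in an editor.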
ADDING A BATCH QUEUE
--------------------
1. Log in to d0chb/d0cha as the LSF administrator
2. Edit file /usr/local/lsf/conf/lsbatch/D0_BATCH/configdir/lsb.queues
Add the new queue definition. You can copy another queue definition
from this file as a starting point.
3. Run "badmin ckconfig" to check for errors
4. Run "badmin reconfig" to make the changes take effect
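
If there is no suitable definition to copy in step 2, the sketch below prints
a bare queue section for review before it is pasted into lsb.queues.
"lsf_queue_stanza" is a made-up helper, and the default priority and
description are placeholders. Compare the output with the existing definitions
in your lsb.queues, since a real queue will normally need more parameters
than the three shown.

```shell
# Hypothetical helper (not part of LSF): print a minimal queue section.
#   $1 = queue name
#   $2 = priority (placeholder default: 30)
#   $3 = description (placeholder default: "New queue")
lsf_queue_stanza() {
    printf 'Begin Queue\n'
    printf 'QUEUE_NAME  = %s\n' "$1"
    printf 'PRIORITY    = %s\n' "${2:-30}"
    printf 'DESCRIPTION = %s\n' "${3:-New queue}"
    printf 'End Queue\n'
}
```

Review the printed section, add it to lsb.queues by hand, then continue with
"badmin ckconfig" as in step 3.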
NOTE:
Most LSF commands accept the "-h" option, which prints help on their usage.