D0 MC Dial-A-Job Daemon


Overview

The DAJ daemon (DAJD) manages Monte Carlo (MC) production after the first job to process a request has been submitted. It is designed to be easy for a non-expert user to operate. DAJD is site independent: it can be deployed at any site and can manage MC production at many sites concurrently. DAJD handles recovery from common failure modes and integrates with the existing MC request priority protocol. DAJD creates a log file of its activities and can be configured to create an HTML status page.

DAJD depends on a jobs database (JDB) that stores persistent information about submitted jobs. Job submission records job data in the database. The DAJD job monitoring daemon uses this information to submit recovery and merge jobs until the request is finished. Access to the JDB uses file locking via a wrapper class for the shelve object. The JDB is local to the submitter.

Dial-A-Job (DAJ) provides job submission that records job information in the JDB. The DAJ Ignition window is the simplest way to start a production job. When DAJD is running, using the Ignition window (select site, activate Go) is the only action the user needs to take to process a request. The dajnox.py program provides a non-graphical front end for initial production job submission (no X11 is needed, so it can be run on a console) that is compatible with DAJD.

DAJD also provides a simple request broker that can be easily configured to ensure that an execution site is always processing at least one request without any manual intervention.


Operations

Auto MC Job Management Schematic

The diagram above illustrates the four components of MC job management.

  1. The production job submission front end.
  2. The global SAM catalog.
  3. The local jobs database.
  4. The jobs monitor daemon.

Job Submission

Job Submission Diagram

The production job submission front end creates a JDF from information supplied by the user and by the request in SAM. The user supplies execution site information (station_name and possibly grid-requirement-string) and the tarball versions of d0release and d0runjob to be used in making the jobfiles_dataset for official production. The tarball versions do not change often. The tarball version names are documented on the web. The tarball versions may be specified in the dajdrc text initialization file.

The submitter chooses an execution site and activates the submission process. The next request is obtained using the Queue.py request prioritizer. The JDF is created from the information in the request, the initialization file, and the submitter. The user is prompted for grid credentials if necessary, and the job is submitted. Information about the job is written to the jobs database.

Jobs Database

The jobs database (JDB) stores information about submitted jobs. This persistent job information is implemented as a Python shelve object, where the keys are request IDs and the values are a list of tuples. Each tuple corresponds to a job and has the structure (gjid, jobtype, station, d0release, jobfiles_dataset, subtime, endtime). Job submission records job data to the database. The DAJD job monitoring daemon uses the jobs database to submit recovery and merge jobs until the request is finished. Access to the JDB uses file locking via a wrapper class for the shelve object. The JDB is local to the submitter.
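The JDB layout described above can be illustrated with a minimal sketch. It is not the jobsdb.py wrapper itself: the class name LockedShelf, the ".lock" file suffix, and the helper functions are hypothetical, and the locking uses fcntl.flock, which is available on Unix only. Shelve keys must be strings, so request IDs are stored as strings here.

```python
import fcntl
import shelve


class LockedShelf:
    """Hold an exclusive file lock for as long as the shelf is open."""

    def __init__(self, path):
        self.path = path
        self.lock_path = path + ".lock"  # hypothetical lock-file name

    def __enter__(self):
        self._lock = open(self.lock_path, "w")
        fcntl.flock(self._lock, fcntl.LOCK_EX)  # blocks until the lock is free
        self._shelf = shelve.open(self.path)
        return self._shelf

    def __exit__(self, exc_type, exc, tb):
        self._shelf.close()
        fcntl.flock(self._lock, fcntl.LOCK_UN)
        self._lock.close()
        return False


def record_job(path, request_id, job):
    """Append one job tuple to the list stored under the request ID.

    job is (gjid, jobtype, station, d0release, jobfiles_dataset,
    subtime, endtime), as described in the text.
    """
    with LockedShelf(path) as db:
        jobs = db.get(str(request_id), [])
        jobs.append(job)
        db[str(request_id)] = jobs  # reassign so the shelf stores the change


def jobs_for(path, request_id):
    """Return the list of job tuples recorded for a request."""
    with LockedShelf(path) as db:
        return db.get(str(request_id), [])
```

Taking the lock before opening the shelf means a submitter and the daemon cannot modify the same database file at the same time.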

Job Monitor Daemon

The daemon awakens periodically to perform a cycle of operations. During the cycle the daemon checks the jobs database. It loops over the requests and determines whether the last job processing each request is still running. If it is, the daemon goes on to the next request. If it is not, the daemon determines whether the request is finished, where finished means the requested number of events are in thumbnails stored in SAM. If the request is finished, the daemon sets the request status to finished and archives the request by moving its entry from the jobs database to the archive database. Otherwise the next job depends on the state of the request: if production is not finished, the daemon submits a production recovery job; if production is finished, it submits a merge job; and if the merge is not finished, it submits a merge recovery job. When a job is submitted an entry is made in the jobs database. The information in the jobs database is sufficient to create the appropriate JDF.
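The per-request decision in the cycle can be sketched as a small function. The function name, argument names, and returned labels are hypothetical. The text does not say how the daemon tells a first merge job from a merge recovery job; this sketch assumes the difference is whether a merge job has already been submitted for the request.

```python
def next_action(last_job_running, request_finished,
                production_finished, merge_submitted):
    """Return the daemon's action for one request in one cycle."""
    if last_job_running:
        return "skip"                 # go on to the next request
    if request_finished:
        return "archive"              # mark finished, move to archive database
    if not production_finished:
        return "production_recovery"
    if not merge_submitted:           # assumption: no merge job submitted yet
        return "merge"
    return "merge_recovery"
```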

After checking the jobs database the last part of the daemon processing cycle is to invoke the Broker. Based on user configuration the Broker will automatically start processing new requests. See below for more information.

Job Monitor Daemon Diagram


Usage

DAJD requires these Python files: daj_daemon.py, daj_common.py, dajnox.py, sites.py, Dajrc.py, Jdf.py, jobsdb.py, Request.py, get_resource.py, check_jobs.py, request_audit.py, fix_remerge.py, and find_merge_dups.py. These files are distributed with the DAJ tarball. Place daj_daemon.py and these files in the same directory. The broker function of DAJD requires Queue.py, also included in the DAJ tarball. Queue.py reserves the next prioritized request in the D0 MC system. The location of Queue.py defaults to the current working directory. The location may be customized by an entry in dajdrc. See Initialization below. DAJD does not run in remote mode. It must be run on a machine with access to a sam client, a sam station, and jim_client.

Before starting daj_daemon.py, set up the sam and jim_client products. Obtain valid grid credentials for the sites to which jobs will be submitted.

DAJD is started from the command line with no arguments. It creates a run log and an error log into which the stdout and stderr streams are written. The names of the logs are displayed when the daemon starts. DAJD detaches itself from the terminal and starts its processing loop. When an iteration is complete the daemon sleeps for one hour before starting the next iteration. Serious error conditions are written to the run log and the error log, and are emailed to the addresses given as the value of the mail_errors_to keyword, if set in the dajdrc initialization file. The value of the keyword must be a list of email addresses separated by commas and without spaces (e.g. user1@fnal.gov,user@localhost). Emailing of error messages may be disabled by specifying an empty keyword value or the value 'off'. By default (i.e. no mail_errors_to keyword in the initialization file) the error messages are sent to $LOGNAME@`/bin/hostname`, which is the user who started the daemon at the node where the daemon is running. To include this user in the mail_errors_to list in the initialization file, use the address user@localhost.
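The mail_errors_to rules can be summarized in a short sketch. The function name and its arguments are hypothetical, and it assumes that the daemon replaces the address user@localhost with the default address of the user who started it; the text implies this but does not state how it is done.

```python
import getpass
import socket


def error_recipients(settings, user=None, host=None):
    """Return the list of addresses that receive error mail.

    settings is a dict of dajdrc key/value pairs.
    """
    user = user or getpass.getuser()
    host = host or socket.gethostname()
    default = "%s@%s" % (user, host)

    if "mail_errors_to" not in settings:
        return [default]              # keyword absent: mail the starting user
    value = settings["mail_errors_to"].strip()
    if value in ("", "off"):
        return []                     # mail disabled
    # Comma-separated, no spaces; user@localhost stands for the default user
    # (assumption about how the substitution is made).
    return [default if addr == "user@localhost" else addr
            for addr in value.split(",")]
```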

The operator must take care to renew their proxy at the appropriate time for continued job submission. Notifications are sent when remaining proxy time gets below a configurable threshold.

Run Control

DAJD runs as a detached daemon process. Commands may be sent to the daemon via signals. DAJD may be cleanly stopped by issuing a SIGHUP or SIGTERM signal to the DAJD process. The script dajd_stop included with the DAJ tarball will conveniently do this. A DAJD cycle may be initiated at any time by sending a SIGINT signal to the DAJD process. A SIGUSR1 signal dumps the DAJD configuration settings to the run log file. A SIGUSR2 signal reloads the site information. This allows the site information to be modified on the fly without restarting the daemon. A SIGALRM signal rereads the daj_daemon stats file. The script dajd_ctl included in the DAJ tarball will conveniently send these signals to DAJD.
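As an illustration of the signal interface, the sketch below maps each signal named above to an action label and queues the label for the main loop. The action labels, the pending list, and the function names are hypothetical; only the signal-to-behaviour mapping comes from the text. These signals exist on Unix systems only.

```python
import signal

# Signal-to-action mapping taken from the description above.
SIGNAL_ACTIONS = {
    signal.SIGHUP: "stop",
    signal.SIGTERM: "stop",
    signal.SIGINT: "run_cycle",
    signal.SIGUSR1: "dump_config",
    signal.SIGUSR2: "reload_sites",
    signal.SIGALRM: "reread_stats",
}

pending = []  # actions requested since the main loop last looked


def on_signal(signum, frame):
    """Record the requested action; the main loop acts on it later."""
    pending.append(SIGNAL_ACTIONS[signum])


def install_handlers():
    """Register on_signal for every signal the daemon responds to."""
    for signum in SIGNAL_ACTIONS:
        signal.signal(signum, on_signal)
```

A control script in the spirit of dajd_ctl could then trigger a site reload with os.kill(pid, signal.SIGUSR2), where pid is the process ID of the running daemon.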

Initialization

DAJD supports an optional initialization file, dajdrc, to specify job attribute defaults and user preferences. See the supplied dajdrc.template file as an example. The customizations available are listed below. Only some JDF attributes are reasonable to specify in the initialization file. The dajdrc initialization file is read upon daemon startup and at the start of each iteration, so settings can be changed on the fly without restarting the daemon. Relevant keys for DAJD are runjob_version, d0rel_tarball_version, notify_user, notification, queue_file, setups, samopts, jobsdb_dir, tmp_dir, mail_errors_to, and possibly others. The dajdrc file is searched for first at the location given by the DAJDRC environment variable, if defined, and then in the working directory. The initialization file ignores blank lines and lines beginning with '#'. The significant lines are key value pairs separated by an equals sign. The key for a JDF attribute is the attribute name. If a keyword is not specified the default is used. The preference keys with defaults in parentheses are:

The last four items in the list above are used by the Broker to submit an initial production job of a request. Note that normally there is no need to specify the d0rel_tarball_version and runjob_version keywords because these values are obtained from a central location by the daemon.
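The initialization file format and search order can be sketched as follows. This is not the Dajrc.py implementation: the function names are hypothetical, and the sketch assumes that DAJDRC holds the path of the file itself rather than of a directory.

```python
import os


def find_dajdrc(environ=None, cwd=None):
    """Return the path of the first dajdrc found, or None."""
    environ = os.environ if environ is None else environ
    cwd = os.getcwd() if cwd is None else cwd
    candidates = []
    if environ.get("DAJDRC"):
        candidates.append(environ["DAJDRC"])   # assumption: a file path
    candidates.append(os.path.join(cwd, "dajdrc"))
    for path in candidates:
        if os.path.isfile(path):
            return path
    return None


def parse_dajdrc(text):
    """Parse key=value lines, skipping blank lines and '#' comments."""
    settings = {}
    for line in text.splitlines():
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue
        key, sep, value = stripped.partition("=")
        if not sep:
            continue  # not a key=value line; ignored in this sketch
        settings[key.strip()] = value.strip()
    return settings
```

Because the daemon rereads the file at the start of each iteration, a parser of this kind would simply be called again on every cycle.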

Broker

The DAJD Broker implements a simple system that allows completely automatic request processing. The operator sets a simple configuration and periodically monitors the DAJD logs for problems. The Broker maintains a minimum number of requests running simultaneously at a site, a minimum number of events left to be processed in running requests, or both; a shortfall in either triggers a new request. The Broker can also forward jobs to an external broker for disposition through the use of resource pools. The only external broker service presently implemented is the ReSS for OSG.

These requirements are configured in the file dajd_quotas in the DAJD working directory. The file dajd_quotas.template is included in the DAJ tarball and contains explanations and examples of the configuration. The dajd_quotas file contains lines with at least two and up to seven tokens separated by whitespace. Blank lines and lines beginning with a hash (#) are ignored.

The tokens, in order, are:

  1. The site resource name. The site is specified as the station name for non-OSG and non-LCG sites. For LCG or OSG sites the site is the appropriate station name joined with the resource_string with a semicolon between them, as in osg-ouhep;atlas.iu.edu:2119/jobmanager-pbs.
  2. The minimum number of requests to be simultaneously run at the site.
  3. The minimum number of total events left to process in running requests at the site.
  4. The minimum number of events in a request obtained using Queue.py for the site.
  5. The maximum number of events in a request obtained using Queue.py for the site.
  6. The maximum number of events submitted per production non-phasedataset grid job.
  7. The priority of the site when new requests are assigned. A higher value means greater priority.

If the number of running grid jobs falls below the minimum specified in the dajd_quotas file for a site or the total number of events left to process in requests running falls below the minimum specified in the quotas file for a site, and the number of merge jobs at the site is less than five, a new request is obtained using Queue.py and sent to the site.

The site resource parameter may also be a pool resource object. The pool is defined dynamically in the dajd_quotas file. For a resource pool the identifier name in the file is of the form name@pooltype;ce1,ce2,...,cen, where each cei has the form gate.keeper.address:port/jobmanager-type, e.g. ress1@resspool;grid1.oscer.ou.edu:2119/jobmanager-lsf,osg-gw-2.t2.ucsd.edu:2119/jobmanager-condor 2. Each cei must be a known resource defined in sites.py and be appropriate to the type of pool. Only the resource pool type resspool, for OSG sites, is implemented. The pool name is used for monitoring and may be used in the dajd_ignore file. For pool definitions, lines are continued if the last character on a line is a backslash (\), e.g.
ress1@resspool;grid1.oscer.ou.edu:2119/jobmanager-lsf,\
osg-gw-2.t2.ucsd.edu:2119/jobmanager-condor 2

Only the first two tokens are required. Unspecified tokens default to zero. All tokens except the first must be non-negative integers. A value of zero for token five means no upper limit on events in requests obtained from Queue.py. A value of zero for token six means no upper limit on events submitted per grid job. To prevent new requests from being started automatically at a site, remove or comment out the site line in the quotas file, or set tokens 2 and 3 to zero for the site. The dajd_quotas file is read each iteration so there is no need to restart DAJD for changes to take effect.
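The quota rules can be sketched in a few lines. This is an illustration rather than the DAJD parser: the field names in FIELDS and both function names are hypothetical, and the sketch simply ignores any tokens beyond the seventh.

```python
# Hypothetical names for tokens 2 to 7 of a dajd_quotas line.
FIELDS = ("min_requests", "min_events_left", "min_request_events",
          "max_request_events", "max_events_per_job", "priority")


def parse_quotas(text):
    """Map each site or pool name to a dict of its quota values."""
    joined = text.replace("\\\n", "")   # join backslash-continued lines
    quotas = {}
    for line in joined.splitlines():
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue
        tokens = stripped.split()
        if len(tokens) < 2:
            raise ValueError("at least two tokens required: %r" % stripped)
        values = [int(tok) for tok in tokens[1:1 + len(FIELDS)]]
        if any(v < 0 for v in values):
            raise ValueError("negative value in: %r" % stripped)
        values += [0] * (len(FIELDS) - len(values))  # unspecified tokens are 0
        quotas[tokens[0]] = dict(zip(FIELDS, values))
    return quotas


def needs_new_request(quota, running, events_left, merge_jobs):
    """Apply the Broker rule: a quota shortfall and fewer than five merge jobs."""
    below = (running < quota["min_requests"]
             or events_left < quota["min_events_left"])
    return below and merge_jobs < 5
```

With tokens 2 and 3 both zero neither count can fall below its minimum, so no new request is started, which matches the rule stated above.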

Optional Dynamic Configurations

Ignore Requests or Grid Jobs

It may be useful to suspend further processing on a request or a specific grid job in the jobs database. This may be accomplished by creating a file named dajd_ignore in the DAJD working directory. This file contains request IDs and/or Global Job IDs (GJIDs) separated by whitespace or newlines. Blank lines and lines beginning with a hash (#) are ignored. No further processing on the listed requests is done by DAJD and the listed GJIDs are ignored. For the listed requests, jobs that are presently running finish, but no new jobs are started. This file is read at the beginning of each iteration so there is no need to restart DAJD for changes to take effect. If permanent removal of requests or GJIDs from the jobs database is desired, it may be accomplished using the jobsdbedit.py program included in the DAJ tarball.
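A minimal sketch of reading the dajd_ignore format follows. The function name is hypothetical, and the sketch does not try to tell request IDs from GJIDs; it returns every listed identifier as a string.

```python
def parse_ignore(text):
    """Return the set of request IDs and GJIDs listed in dajd_ignore."""
    ignored = set()
    for line in text.splitlines():
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue
        ignored.update(stripped.split())  # several IDs may share one line
    return ignored
```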

Events per Job for Site or Request

It may be useful to submit jobs using a batch job size other than the default size of 250 events per job or created file for specific requests or sites. It may also be useful to specify the number of events of a grid job for a specific request. This may be accomplished by creating a file named dajd_epf in the DAJD working directory. This file contains request ID's and/or site specifications with associated values for the events_per_file and/or runjob_numevts attributes in the JDF used for job submission.

In the file blank lines and lines beginning with a hash (#) are ignored. A line with data has 2 or 3 whitespace separated strings. The first string is the request ID or site for which the events_per_file and/or runjob_numevts values will be applied. The second string is the events_per_file value. A value of "0" means use the site default value (usually 250). The optional third string is the runjob_numevts value. A value of "0" means use the default value. e.g.

 luhep 100 10000
 osg-ouhep;ouhep0.nhn.ou.edu:2119/jobmanager-condor 100 
 99999 50 5000
This file is read at the beginning of each iteration so there is no need to restart DAJD for changes to take effect.
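The dajd_epf format can be read with a sketch like the one below. The function name is hypothetical, and the sketch represents "use the default" as None instead of looking up the actual site default.

```python
def parse_epf(text):
    """Map a request ID or site to (events_per_file, runjob_numevts).

    A value of None means the default applies.
    """
    overrides = {}
    for line in text.splitlines():
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue
        parts = stripped.split()
        if len(parts) not in (2, 3):
            raise ValueError("expected 2 or 3 strings: %r" % stripped)
        events_per_file = int(parts[1]) or None      # "0" means default
        numevts = None
        if len(parts) == 3:
            numevts = int(parts[2]) or None          # "0" means default
        overrides[parts[0]] = (events_per_file, numevts)
    return overrides
```

Applied to the three example lines above, this gives (100, 10000) for luhep, (100, None) for the osg-ouhep resource, and (50, 5000) for request 99999.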

Monitoring

DAJD will produce an HTML status page at the end of every cycle if so configured. The default is to produce a status file named dajd_status.html in the DAJD directory. The default file may be overridden by specifying the status_file keyword and the full file path to be used as the value in the dajdrc initialization file. To turn off the status page generation specify an empty string ("") as the value of status_file keyword in the initialization file.

The status page contains a table with a timestamp that is refreshed at the end of each cycle of the daemon. The Active table lists sites and the status of the requests assigned to the sites. Also listed for each request are the number of grid jobs that have run to process the request (counting from zero) and the number of events of the latest job if it is a production job, or else an indication that it is a merge job. "% Complete" is the percentage of the requested number of events in unmerged files declared to SAM for a production job, or in merged files stored in SAM for a merge job. "% Delta" is the change in "% Complete" since the last update, and "No Change Hours" is the number of hours since "% Delta" was non-zero. The global job ID of the currently active job is listed for each request with links to the Samgrid batch system monitoring and detailed job monitoring pages for the job.

The status monitoring page contains a table of Inactive requests and jobs as specified in the dajd_ignore file. The definitions of any resource pools are also on the monitoring page, as is a list of any requests in the jobs database that have a status of hold.

If DAJD is configured to produce a status page then (a) a CSV file of the site data is produced with the same path as the status page file but with a ".csv" extension; (b) a file with the current inactive job information is produced named dajd_ignore.dat; and (c) a file is produced with the percentage completed and a time stamp for the last time the completion percentage changed for each request being processed. This statistics file is called dajd_stats.dat.

More complete status information is contained in the log files. When started DAJD creates a run log and an error log into which the stdout and stderr streams are written. The names of the logs are displayed when the daemon starts.

User Hook

As the last action of a daemon iteration, user defined routines can be executed. The daemon starts, in a separate thread, a routine that looks for a file named dajd_ucmds. If the file is found, the thread makes an execfile() call for each nonempty line, in sequence, that does not have "#" as its first non-whitespace character, with the whitespace-stripped line as the argument to the function. Each line of dajd_ucmds is blank, or a comment beginning with "#", or the path to an executable Python file. The list of user routines in dajd_ucmds can be changed at any time.
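The user hook can be sketched for current Python versions as follows. execfile() exists only in Python 2, so this sketch swaps in runpy.run_path, which likewise executes a Python file given its path. The function names are hypothetical.

```python
import runpy
import threading


def user_commands(text):
    """Return the script paths listed in dajd_ucmds, in order."""
    lines = (line.strip() for line in text.splitlines())
    return [line for line in lines if line and not line.startswith("#")]


def run_user_hook(cmds_path="dajd_ucmds"):
    """Run each listed script in sequence in a separate thread."""

    def worker():
        try:
            with open(cmds_path) as handle:
                paths = user_commands(handle.read())
        except FileNotFoundError:
            return                      # no hook file: nothing to do
        for path in paths:
            runpy.run_path(path)        # stand-in for Python 2 execfile(path)

    thread = threading.Thread(target=worker)
    thread.start()
    return thread
```

Because the file is reread on every call, changes to dajd_ucmds take effect at the next iteration. In this sketch an exception raised by one script ends the thread, so later scripts in the list are not run.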


$Revision: 1.13 $
Joel Snow
Created June 28, 2006
Revised September 28, 2009