SAM-Grid User and Administrator Manual

Contents

1        How to read this manual

2        Introduction

2.1         Overview of the SAM-Grid architecture

3        Installation of the SAM-Grid

3.1         System requirements

3.1.1     Hardware

3.1.2     Software

3.1.3     System Configuration

3.1.4     Summary of the activities as root

3.1.4.1       Setup Group Accounts

3.1.4.2       Open Ports for Incoming TCP connections

3.1.4.3       Enable Automatic Restart of SAMGrid servers at Boot Time

3.1.4.4       Setup the /etc/grid-security and xinetd daemon (Execution Site Only)

3.1.5     Packages and Samgrid Production Release Cuts

3.2         Middleware Installation

3.2.1     Installing and Configuring Condor and Globus

3.2.2     Installing and Configuring the Grid Security Infrastructure

3.2.3     Updating the Grid Security Infrastructure

3.2.4     Get a Service Certificate

3.2.5     Installing XMLDB

3.2.6     Store the SAM Grid Global Constants to the XML database

3.3         Client Site Installation

3.4         Submission Site Installation

3.4.1     General configuration

3.4.2     Installation of the JIM Broker client

3.4.3     Installing Output retrieval via web

3.5         Execution Site Installation

3.5.1     General configuration

3.5.2     Install sam

3.5.3     Setting up durable location (Optional)

3.5.4     Get a host certificate

3.5.5     Get the list of users authorized to use the resources (gridmap-file)

3.5.6     Install SAM-Grid Globus job-managers and sandboxing mechanisms

3.5.7     Creating the Resource Description

3.5.8     Installing the resource advertisement software

3.6         Monitoring Site Installation

3.6.1     Create site Configuration

3.6.2     Configure/Update MDS

4        Starting the Servers

5        Modifying the Product Configuration

6        Automating the Maintenance Tasks

6.1         Regular Cleanup and Maintenance Tools

6.1.1     Cleaning up old Globus files and jim sandboxes

6.1.2     Cleaning up CondorG queue for OSG jobs

6.1.3     Cleaning up CondorG queue for Samgrid jobs

6.1.4     Rotate log files daily and archive them Monthly

6.1.5     Relocate condor job spool directories for jim_broker_client

6.2         Automate Gridmapfile generation from VOMS

6.2.1     Generate gridmapfile for jim_broker_client from the DZero member list in voms

7        Quick-Start

7.1         Job Submission

7.1.1     A typical SAM Analysis Job submission

7.1.2     Job Description File

7.1.2.1       Attributes

8        FAQ

9        Appendix A: The SAMGrid JDL

9.1.1     Common JDL Specifications

9.1.1.1       Required attributes

9.1.1.2       Optional attributes

9.1.2     SAM Analysis JDL Specifications

9.1.2.1       Required attributes

9.1.2.2       Optional attributes

9.1.3     CAF JDL Specifications

9.1.3.1       Required Attributes

9.1.3.2       Optional attributes

9.1.4     Monte Carlo JDL Specifications

9.1.4.1       Required Attributes

9.1.4.2       Optional attributes

9.1.5     Merge Job JDL Specifications

9.1.5.1       Required Attributes

9.1.5.2       Mutually Exclusive attributes

9.1.5.3       Optional attributes

9.1.6     Structured Job JDL Specifications

9.1.6.1       Required Attributes

10     Suggestions

1            How to read this manual

Read this manual sequentially. Along the way, pointers such as "Skip to Submission Site Installation" guide you to the site-specific installation instructions. If a pointer matches the installation you want, follow it, and then continue reading sequentially until the manual marks the end of that site-specific installation.

2            Introduction

SAM-Grid is a virtual project whose core is the D0-PPDG group at Fermilab and which includes off-site D0 collaborators under the aegis of various Grid projects. Its mission is to enable fully distributed computing for D0 and CDF by:

·          Enhancing SAM as the distributed data handling system of the experiments.

·          Incorporating standard Grid tools and protocols.

·          Developing new solutions for Grid computing together with Computer Scientists.

Under this mission, the project strives to unite the D0 efforts from the multifarious Grid activities (PPDG, EU DataGrid, GridPP and more), off-site analysis work and other aspirations distributed throughout the D0 collaboration. The two main areas of work are Job Handling (including specification, brokering, scheduling etc.) and Monitoring and Information Services.

2.1           Overview of the SAM-Grid architecture

The SAM-Grid is a software suite that addresses the globally distributed computing needs of the Run II experiments at Fermilab. The Job and Information Management (JIM) components complement the Data Handling system of the experiments (SAM), providing the user with transparent remote job submission, data processing and status monitoring.

 

The logical entities of the SAM-Grid consist of

1.      Multiple Execution Sites

2.      A central Resource Selector[1]

3.      Multiple Job Submission Sites

4.      Multiple Clients (User Interface) to the Job Submission Sites.

 

Servers at the Job Submission Sites and at the Execution Sites register with the Resource Selector. Users describe and submit jobs to the Submission Sites via a User Interface, which can ultimately be installed on a laptop. The Submission Sites maintain a spool of jobs that are periodically matched with the available resources. Matches are currently ranked by the Resource Selector according to the number of files of interest to the job that are already present at the Execution Site. Submission Sites are then responsible for reliably dispatching the job to the Execution Site. Typically, Submission Sites will also spool job outputs.

 

Typical resources at the execution site consist of

1.      A Local Resource Management system

2.      A SAM Station

3.      An Information Manager

 

The Local Resource Management system generally has experiment-specific interfaces[2] and is based on a Batch System; it is responsible for receiving and processing jobs from the Submission Site. The SAM Station is a collection of resources managed by a set of services to satisfy Data Handling requests from individual jobs or other entities, like the Information System or the Resource Selector. It generally manages a pool of disk caches and may be interfaced to a local Mass Storage System. SAM Stations rely on a set of supporting services, some of which are distributed and some of which are central. The Information Manager provides service configuration support and monitoring of status information. Each Site advertises resource availability to the Resource Selector.

3            Installation of the SAM-Grid

A site can join the SAM-Grid in four ways: as a Client Site, a Submission Site, an Execution Site, or a Monitoring Site.

 

NOTE: Make sure to follow the instructions printed out at installation time.

DISCLAIMER: installing any of the JIM packages will drive you through the installation of Globus: the installation will be MUCH easier if the product area is NOT NFS shared. However, below you will find instructions on how to install Globus in this scenario as well.

 

Since the current focus of the SAM-Grid development is enabling distributed SAM analysis jobs, the discussion below assumes the site runs a SAM station. Please refer to http://d0db.fnal.gov/sam/doc/install/ for instructions tailored to the DZero environment and to http://cdfdb.fnal.gov/sam/doc/cdf/install/install.html for CDF.

3.1           System requirements

3.1.1                Hardware

The requirements will vary depending on configuration and custom installation choices.

 

Memory

128 MB of RAM (256 MB recommended)

Hard Disk 

1 GB  (recommended)

Processor

Intel x86 processor (Pentium II or above recommended)

3.1.2                Software

 

Linux 

>= 2.4 kernel (RedHat or SUSE recommended)

UPS/UPD

>= 4.7

The packaging tool used for the SAM Grid is ups/upd.

The installation of Globus will not work if you use an earlier version.

If you need to install ups/upd, please go to http://www.fnal.gov/docs/products/ups/ .

If ups/upd is installed on your system already, generally you have to source a setup file: /usr/local/etc/setups.(c)sh for typical installation and DZero, ~cdfsoft/cdf2.(c)shrc for CDF.

3.1.3                System Configuration

·        Create a local ups product area, where all the SAM-Grid products will be installed. We strongly recommend that this area is owned by user sam: see ftp://ftp.fnal.gov/products/bootstrap/current/index.html#unix_user to create such a product area.

·        Create a local user called sam. Optionally, create a user called samgrid to enable generic authorized grid users to run jobs (this is optional, since users can be mapped to individual accounts, but highly recommended).

·        Create a directory writable by user sam, named e.g. "jim", and set the environment variable SAMGRID_LOCAL_DIRECTORY to point to it. This is optional but will make installation easier. SAMGrid products use this area at run time for their activities, including sandboxing.
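The last step above can be sketched in shell as follows. This is only an illustration, not part of the SAM-Grid distribution: the helper name init_samgrid_dir and the example path /data/jim are assumptions, and only the variable name SAMGRID_LOCAL_DIRECTORY is taken from this manual. Run it as user sam so that the writability check is meaningful.

```shell
#!/bin/sh
# Sketch only: init_samgrid_dir is a hypothetical helper, not a SAM-Grid command.
# It creates the runtime directory if it is missing, checks that the current
# user (expected to be "sam") can write to it, and exports
# SAMGRID_LOCAL_DIRECTORY so that later "ups tailor" steps can pick it up.
init_samgrid_dir() {
    dir="$1"
    mkdir -p "$dir" || return 1
    if [ ! -w "$dir" ]; then
        echo "ERROR: $dir is not writable by $(id -un)" >&2
        return 1
    fi
    SAMGRID_LOCAL_DIRECTORY="$dir"
    export SAMGRID_LOCAL_DIRECTORY
    echo "SAMGRID_LOCAL_DIRECTORY=$SAMGRID_LOCAL_DIRECTORY"
}

# Example call (the path is an assumption, choose one that suits your site):
#   init_samgrid_dir /data/jim
```

To keep the setting across logins, the corresponding export line could be added to the login script of user sam.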

3.1.4                Summary of the activities as root

 

In order to install the whole JIM software suite, root access is needed for the following actions:

 

3.1.4.1          Setup Group Accounts

 

SAMGrid’s servers typically run under, and use files belonging to, the “sam” UNIX account. Thus, the absolute minimum requirement is to have the “sam” account set up. While it is possible to run the SAMGrid servers under another account, doing so will greatly complicate our support.

 

In the past, the SAM team also recommended the “products” account for use by the UPS/UPD system. This account exists on nearly all the FNAL systems. For our purposes, we realize that, outside of FNAL, UPS/UPD is installed solely for computing with SAM and therefore a separate user for merely owning the products files is hardly necessary. Moreover, the distinction between SAM and products creates numerous problems with permissions as our servers (especially third-party software) often write files at run-time that belong to “products” unless specifically changed.  We therefore strongly recommend installing and maintaining products as user “sam”.

 

For an execution site, depending on your local policies, you need to authorize off-site users (relative to your site, not FNAL) to execute jobs (please note that, by definition, this is required for your site to be part of the Grid). You may choose to map external authorized users to the local “sam” account (which might interfere with the SAMGrid server operation) or to another group account such as “samgrid”.

3.1.4.2          Open Ports for Incoming TCP connections

 

Open the following ports in the firewall for the head node (NB: SAMGrid does NOT require direct connectivity between worker nodes and the Internet):

 

grid gatekeeper:   (execution site only) 2119. Open to all Submission Sites. See the Section on the Architecture for the definition and http://samgrid.fnal.gov:8080/ for the list of the currently known submission sites. Opening the port “to the world” would enable us to add new submission sites without changing the configuration of all the execution sites.

job-managers:       (execution site only) Any contiguous range of N ports, also open to the Submission Sites, where N is the number of concurrently running Grid jobs (a Grid job is “running” if it has been submitted to your local batch system). We recommend a number on the order of 100. The same consideration as above applies for “open to the world”. For the gatekeeper to use this port range, it needs to be started (e.g. via xinetd) with the environment variable GLOBUS_TCP_PORT_RANGE=50001,50100 (example).

condor_schedd:     (submission site only) Any contiguous range of M ports, where M is the maximum number of Grid jobs concurrently submitted through your site. Open to all Client machines authorized to use your submission site. (If all the authorized client machines are behind the same firewall, you do not need to open any of these ports.) Add the macros LOWPORT=port1 and HIGHPORT=port200 to the $CONDOR_CONFIG file of jim_broker_client.

grid MDS:             (monitoring site only) 2135. Open to samgrid.fnal.gov or, better, to all of FNAL to enable possible failover mechanisms.

tomcat:                  (all site types except client) 7080. Open to samgrid.fnal.gov; enables configuration management via the XML Database and retrieval of job output by the users.
(submission site only) 7081, GSI-secured door (optional). Open to samgrid.fnal.gov and all the client machines; provides secure job cancellation by the users.
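As a worked example of sizing the job-manager port range, the sketch below computes the last port of a contiguous block of N ports and prints it in the LOW,HIGH form used by GLOBUS_TCP_PORT_RANGE. The helper name port_range is hypothetical, and the starting port 50001 is just the example value used above.

```shell
#!/bin/sh
# Sketch only: port_range is a hypothetical helper.
# Prints "LOW,HIGH" for a contiguous block of COUNT ports starting at LOW.
port_range() {
    low="$1"
    count="$2"
    high=$((low + count - 1))
    if [ "$low" -lt 1 ] || [ "$count" -lt 1 ] || [ "$high" -gt 65535 ]; then
        echo "ERROR: invalid port range" >&2
        return 1
    fi
    echo "$low,$high"
}

# 100 job-manager ports starting at 50001
echo "GLOBUS_TCP_PORT_RANGE=$(port_range 50001 100)"
```

With N = 100 and a first port of 50001, this prints GLOBUS_TCP_PORT_RANGE=50001,50100.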

 

If the site runs a SAM station, the following ports need to be opened:

 

sam:                       4550-4555. Open to FNAL. This is required for CORBA callbacks by SAM servers. At an absolute minimum, the list should include d0mino.fnal.gov (or any other D0 FNAL data router station) and d0db[-dev].fnal.gov for D0, and cdfdb.fnal.gov for CDF. Use the option --OAport=portNum to define the port on which a given SAM server listens.

sam_dcache_cp:   (CDF only) 25126 and 2811 Mainly to cdfdca.fnal.gov (for access to the CDF DCache system). D0 dcache systems to come soon. See also sam_gridftp client.

sam_gridftp server: 4567 (control) + any contiguous range of K ports (data), open to all sites that will be allowed to pull data out of your site. NB: These should include the head nodes of all the SAM stations if you want to be considered part of the Grid!

sam_gridftp client: Any contiguous range of K ports for data, where K is the number of simultaneous transfer streams initiated by your site. It must be open to all sites from or to which your site will pull or push data (at a minimum, d0mino.fnal.gov for D0). This number must also match the number of parallel transfers set in the external SAM stager.

sam_bbftp server: (deprecated by grid_ftp). Open 14021 as described under the sam_gridftp server.

sam_bbftp client:  (deprecated by grid_ftp) All ports must be open to d0mino.fnal.gov (D0) and other sites where your site will push data.

 

 

More information on the requirements posed on firewalls by the Globus Toolkit is available at http://www.Globus.org/security/v2.0/firewalls.html

 

3.1.4.3          Enable Automatic Restart of SAMGrid servers at Boot Time

 

The exact means of doing this vary and depend on the local administrator’s preferences. A typical way is to modify /etc/rc.local so that it includes a line similar to this:

 

su sam -c /home/sam/samgrid_start.sh

 

Also see the Section on server start-up.
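The content of samgrid_start.sh is site specific and is not prescribed by this manual. A minimal sketch, assuming a submission site that also hosts the XML database, could look like the following. The helper start_bg and the DRY_RUN switch are illustrative assumptions; the ups commands and the location of the setups file are the ones used elsewhere in this manual. The script only prints the commands unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Sketch only: a hypothetical samgrid_start.sh, to be run as user sam.
# The list of servers is an assumption (a submission site that also hosts
# the XML database); adjust it to the products installed at your site.
# Dry-run by default: prints the commands. Set DRY_RUN=0 to really start them.
DRY_RUN="${DRY_RUN:-1}"

# Start a server in the background, or just print the command in dry-run mode.
start_bg() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $* &"
    else
        "$@" &
    fi
}

# Make the ups commands available (typical and DZero location of the setup file).
if [ "$DRY_RUN" != "1" ]; then
    . /usr/local/etc/setups.sh
fi

start_bg ups run xmldb_server
start_bg ups run jim_broker_client
```

The Section on server start-up remains the reference for which servers to start and in what order.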

 

3.1.4.4          Setup the /etc/grid-security and xinetd daemon (Execution Site Only)

 

See Sections on configuring GSI and installing Globus gatekeepers (a.k.a. resource manager bundle).

3.1.5                Packages and Samgrid Production Release Cuts

The requirements of other packages are driven by the type of configuration you choose and are listed in their respective sections. For each type of installation, the list of packages is laid out below.

You can find the latest Samgrid production cut at http://www-d0.fnal.gov/computing/grid/releases/

3.2           Middleware Installation

This section covers the general installation procedures required by all site installations, unless otherwise specified.

3.2.1                Installing and Configuring Condor and Globus

The SAM-Grid uses the Condor and Globus middleware distributed by the Virtual Data Toolkit. The VDT product in ups is a wrapper around pacman: the software comes from the official VDT web site.

It is important that there is no variable in the environment that points to other instances of Globus while installing this new instance. You can check e.g. if GLOBUS_LOCATION or GPT_LOCATION are already defined or that PATH includes paths to other installations of Globus. In that case, check e.g. ~/.shrc and /etc/profile (or similar environment bootstrapping files) to eliminate such definitions during the installation phase.
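The check described above can be automated along these lines. This is a rough sketch: check_globus_env is a hypothetical helper, and the PATH test is only a heuristic that looks for the string "globus" in each entry.

```shell
#!/bin/sh
# Sketch only: check_globus_env is a hypothetical helper.
# Returns 0 if the environment looks clean, 1 if it seems to reference
# another Globus installation.
check_globus_env() {
    status=0
    if [ -n "${GLOBUS_LOCATION:-}" ]; then
        echo "GLOBUS_LOCATION is set: $GLOBUS_LOCATION"
        status=1
    fi
    if [ -n "${GPT_LOCATION:-}" ]; then
        echo "GPT_LOCATION is set: $GPT_LOCATION"
        status=1
    fi
    # Heuristic: flag any PATH entry whose name mentions "globus".
    old_ifs="$IFS"
    IFS=:
    for entry in $PATH; do
        case "$entry" in
            *[Gg][Ll][Oo][Bb][Uu][Ss]*)
                echo "PATH contains: $entry"
                status=1 ;;
        esac
    done
    IFS="$old_ifs"
    return $status
}

# Example:
#   check_globus_env || echo "clean up the environment before installing VDT"
```

A non-zero return status means that at least one leftover definition was found and should be removed from the environment before running the installation.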

 

Product

VDT

Install as

sam

Install operation

upd install VDT -G-c

Tailor as

sam or root (see below)

Tailor Operation

Before tailoring, make sure that your system has

1.      the “patch” command

2.      “gcc” (the appropriate version for your Linux distribution)

3.      a ‘recent’ version of tar: v1.13.12 or newer.

More info at http://www.cs.wisc.edu/VDT/

 

as user sam:

$ ups tailor VDT

 

as user root:

$ ups InstallAsRoot VDT

 

Notes:

·          Because tailoring is CPU and I/O intensive, beware that

1.      On some systems this command can take 30 min.

2.      Installations on NFS mounted disk can give I/O related problems

·          The script executed as root changes the xinetd config files and restarts the xinetd daemon.

·          At the end of the installation, the location of the installation log will be printed out. Look at it for potential problems.
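The prerequisites listed under Tailor Operation can be checked before tailoring with a sketch like the one below. Both helpers are hypothetical, and the parsing of the tar version assumes GNU tar, whose first line of --version output ends with the version number.

```shell
#!/bin/sh
# Sketch only: hypothetical helpers, not part of VDT or ups.

# version_ge A B: succeeds if dotted version A is greater than or equal to B.
version_ge() {
    awk -v a="$1" -v b="$2" 'BEGIN {
        n = split(a, x, "."); m = split(b, y, ".")
        len = (n > m) ? n : m
        for (i = 1; i <= len; i++) {
            xi = (i <= n) ? x[i] + 0 : 0
            yi = (i <= m) ? y[i] + 0 : 0
            if (xi > yi) exit 0
            if (xi < yi) exit 1
        }
        exit 0
    }'
}

# Reports missing tools and a tar older than 1.13.12; returns 1 on any problem.
check_vdt_prereqs() {
    status=0
    for tool in patch gcc tar; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "missing: $tool"
            status=1
        fi
    done
    # Assumes GNU tar, e.g. "tar (GNU tar) 1.13.25": take the last word.
    tar_version=$(tar --version 2>/dev/null | head -n 1 | sed 's/.* //')
    if ! version_ge "${tar_version:-0}" 1.13.12; then
        echo "tar is too old or not GNU tar: ${tar_version:-unknown}"
        status=1
    fi
    return $status
}

# Example:
#   check_vdt_prereqs && ups tailor VDT
```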

 

Notes for experts:

·          To change the default location of the gatekeeper GASS cache, add this line to the xinetd configuration file:
env = GLOBUS_GASS_CACHE_DEFAULT=/path/to/new/location

·          To let the gatekeeper know which ports are open in your firewall for the job-managers, add a line like this to the xinetd configuration file:

env = GLOBUS_TCP_PORT_RANGE=50001,50100

 

 

3.2.2                Installing and Configuring the Grid Security Infrastructure

This product configures the Globus Security Infrastructure of your system.

 

Product

sam_gsi_config

Install as

sam

Install operation

upd install sam_gsi_config -q VDT -G-c

Tailor as

sam or root (see below)

Tailor Operation

$ ups tailor sam_gsi_config -q VDT

The tailoring procedure configures GSI for various SAM-Grid products. You will be asked for which products you want to configure GSI. If you do not know, configure it for all of them.

The script will print out which user(s) need to execute the command below. Typically, you need to execute it as user sam and as root (for an execution site installation):

$ ups install_ca sam_gsi_config -q VDT

 

 

If you are installing either a Client site or a Monitoring site, please skip to the site-specific installation. Otherwise (Submission site and Execution site installers), read on.

Skip to Client Site Installation

Skip to Monitoring Site Installation

3.2.3                Updating the Grid Security Infrastructure

This paragraph describes what to do when a CA certificate has expired and needs to be replaced. It assumes a working sam_gsi_config installation. Also, you must know the fingerprint string of the expired CA.

 

Product

sam_gsi_config

Update as

products and/or sam and/or root (see below)

Update Operation

If your sam_gsi_config installation is older than v2_0_8, first do

$ ups update_config sam_gsi_config -q VDT

Update a CA certificate as:

$ setup sam_gsi_config -q VDT

$ sam_gsi_install_ca --fingerprint=<fingerprint_hash>

where fingerprint_hash is a string of the form e1fce4e9.

 

Instructions on which other users should execute this command will be printed on the screen. To force the installation as a user different from the one recommended by sam_gsi_config, add the option --force-user.
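A fingerprint value can be sanity-checked before it is passed to sam_gsi_install_ca. The sketch below assumes, based only on the example e1fce4e9, that the fingerprint is always eight hexadecimal digits; is_ca_hash is a hypothetical helper.

```shell
#!/bin/sh
# Sketch only: is_ca_hash is a hypothetical helper.
# Succeeds if the argument is exactly 8 hexadecimal digits, e.g. e1fce4e9.
is_ca_hash() {
    case "$1" in
        [0-9a-fA-F][0-9a-fA-F][0-9a-fA-F][0-9a-fA-F][0-9a-fA-F][0-9a-fA-F][0-9a-fA-F][0-9a-fA-F])
            return 0 ;;
        *)
            return 1 ;;
    esac
}

# Assumption: the fingerprint is the OpenSSL subject hash of the CA
# certificate. If so, it can be printed from a certificate file with
# (ca.pem is a placeholder file name):
#   openssl x509 -noout -hash -in ca.pem
# OpenSSL 1.0 and later compute this hash differently; the older form is
# printed with -subject_hash_old instead of -hash.

is_ca_hash e1fce4e9 && echo "e1fce4e9 has the expected form"
```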

 

3.2.4                Get a Service Certificate

Request a SAM service certificate from the DOEGrids CA. If you want to use a CA other than DOEGrids, this may be fine: please send email to cdfsam-admin@fnal.gov or d0sam-admin@fnal.gov.

If you are installing an execution site, you will also need a host certificate; you may want to get it now. Follow the instructions in the Section “Get a host certificate”.

 

As user

sam

Operations

$ setup sam_gsi_config -q VDT

 

$ sam_cert_request

Follow instructions on the screen.

Notes:

The command above will drive you through the request of a SAM service certificate (typically 1 day response). When you receive your signed certificate by email, save it as is in the location printed on the screen and make it owned by user “sam”.

 

More detailed instructions for the installation of a SAM service certificate for sam_gridftp are available at

http://d0db.fnal.gov/sam/doc/install/fileTransfer.shtml#sam_gridftp

3.2.5                Installing XMLDB

This is an xml database server. It is currently implemented using the Xindice database and is used within the SAM-Grid as the interface that the Grid and the Fabric use to exchange information. Its main function is to store product and resource configurations.

Install the following packages (tomcat and xmldb_server) on a single machine at your site. It can be a submission machine, an execution machine, or an independent machine. However, we recommend installing it on the submission site if you need output retrieval via the web.

The installation of Tomcat is optional if you have another Servlet runner. Tomcat is used as a servlet engine within SAM-Grid to run xmldb_server servlet.

Product

Tomcat

Install as

sam

Install operation

upd install tomcat -G-c

Tailor as

sam

Tailor Operation

ups tailor tomcat

Notes:

Defaults are fine.

The product area where tomcat is installed must be owned by user “sam”. If you have installed this server as products for special reasons, change the ownership from “products” to “sam” (e.g. as root, if you have root access), or execute “ups chown tomcat”.

Start as

sam

Start operation

ups start tomcat

 

 

Product

xmldb_server

Install as

sam

Install operation

upd install xmldb_server -G-c

Tailor as

sam

Tailor Operation

ups tailor xmldb_server

Configuration example:
<?xml version="1.0"?>
<xmldb_server
    db_name="db"
    webapps_directory="/local/ups/db/tomcat/webapps"
    db_location="/data/jim/xmldb_server/db"
    run_command="ups run tomcat"
    stop_command="ups stop tomcat">
  <interview_schema version="1_0" />
</xmldb_server>

 

Configuration Parameters:

webapps_directory: Enter the directory used by your servlet engine to store the servlets.

db_location: Enter the directory used by the database to store the documents.

db_name: Enter the name of the xml database; this name is used when querying the database. Use the default 'db'

run_command: Enter the command that starts up your servlet engine.

stop_command: Enter the command that stops your servlet engine.

Notes:
Refer to Section System Configuration to get sensible defaults while tailoring. You have to decide where to store the xml documents of the db. This area must be writable by sam.

We have observed corruption in xmldb whenever the disk storing the DB files fills up. The only way to recover is to clean up the database files and start from scratch, so take this into account when choosing db_location. The disk space required by xmldb grows as information is added with every local job run at the site, and the growth is nonlinear, so there is no good metric for estimating the disk space needed. It is your responsibility to make sure that the machine does not run out of disk space.

Start as

sam

Start Operation

ups run xmldb_server &

Notes:

YOU NEED TO RUN THE COMMAND NOW if you plan to use this database for the configuration of other products (recommended). Refer to the Section on starting the servers for instructions on running all servers.
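Because a full disk corrupts the database, it can be useful to check the free space under db_location regularly, for example from cron. The following is a minimal sketch: free_kb and check_db_space are hypothetical helpers, and the threshold in the usage comment is an arbitrary assumption, since this manual gives no sizing rule.

```shell
#!/bin/sh
# Sketch only: free_kb and check_db_space are hypothetical helpers.

# Print the free space, in kilobytes, of the file system holding directory $1.
free_kb() {
    df -Pk "$1" | awk 'NR == 2 { print $4 }'
}

# Succeed if directory $1 has at least $2 kilobytes free; warn otherwise.
check_db_space() {
    avail=$(free_kb "$1")
    [ -n "$avail" ] || return 1
    if [ "$avail" -lt "$2" ]; then
        echo "WARNING: only $avail kB free under $1 (want at least $2 kB)" >&2
        return 1
    fi
    return 0
}

# Example for the db_location of the configuration example above; the
# threshold of 1048576 kB (1 GB) is an arbitrary assumption:
#   check_db_space /data/jim/xmldb_server/db 1048576
```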

 

Install the following software on both submission and execution sites.

 

Product

xmldb_client

Install as

sam

Install operation

upd install xmldb_client -G-c

Tailor as

sam

Tailor Operation

ups tailor xmldb_client

 

Configuration example:
<?xml version="1.0" encoding="UTF-8"?>

<xmldb_client>

  <interview_schema_version version="1_0"/>

  <xmldb_server url="http://samgfarm4.fnal.gov:7080/Xindice"/>

</xmldb_client>

 

Configuration Parameters:

url: Enter the xml db server for your site. If this is the machine that runs the xml db server, accept the default, otherwise enter the correct address.

Enter the default xml db server URL ( typical form http://my.db.host:7080/Xindice ):

What is the url of the xmldb_server ? [http://samham.fnal.gov:7080/Xindice]:

    The attribute url is set to the 'http://samham.fnal.gov:7080/Xindice'

3.2.6                Store the SAM Grid Global Constants to the XML database

Product

jim_config

Configure as

sam

Configure operation

$ ups store_constants jim_config

Notes

This will store global constants, such as the SAM IOR, the broker location, and the DB server name, in the database.

Skip to Submission Site Installation

Skip to Execution Site Installation

3.3           Client Site Installation

The Client Site is the site from which you submit your jobs to the Grid. It is a very lightweight component: installing jim_client is sufficient.

 

Product

jim_client

Install as

products

Install operation

upd install jim_client -G-c

Tailor as

products

Tailor Operation

ups tailor jim_client

 

Configuration example:
<?xml version="1.0" encoding="UTF-8"?>

<jim_client_configuration>

  <interview_schema version="1_3"/>

  <condor_config_parameters>

    <uid_domain domain="fnal.gov"/>

    <schedd_host hostname="samgrid.fnal.gov"/>

    <condor_host hostname="samgrid.fnal.gov"/>

    <network_interface>

      <public_interface ip="131.225.167.1" />

    </network_interface>

    <structured_jobs structured_jobs="no"/>

  </condor_config_parameters>

  <MyProxy_Server hostname="fermigrid4.fnal.gov"/>

</jim_client_configuration>

 

Configuration Parameters:

uid_domain: Enter your domain

schedd_host: Enter the hostname of the submission site

condor_host: Enter the hostname of the jim_broker. Use default.

public_interface, network_interface: Enter the IP address of your system that you want to use. If you have multiple network interfaces, use the IP address of the interface that is accessible from outside your local network. To list the interfaces on your system, run /sbin/ifconfig in another window.

structured_jobs: Enter whether you want to run structured jobs. Answer 'no' here.

MyProxy_Server: Enter the address of the MyProxy_Server. Use the default

Notes:

You can ignore warnings about xmldb_client: by default, the JIM configuration manager will try to store this configuration into an xml database; this is not required for jim_client and the automatic FS storage is sufficient.
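To pick the value for public_interface, the IPv4 addresses reported by /sbin/ifconfig can be extracted with a small filter such as the one below. list_ipv4 is a hypothetical helper; it assumes the two common ifconfig output styles ("inet addr:A.B.C.D" and "inet A.B.C.D") and drops loopback addresses.

```shell
#!/bin/sh
# Sketch only: list_ipv4 is a hypothetical helper.
# Reads ifconfig output on standard input and prints one IPv4 address per
# line, skipping loopback (127.x.x.x) addresses. Lines starting with
# "inet6" are ignored because the pattern requires "inet" plus a space.
list_ipv4() {
    sed -n 's/^[[:space:]]*inet \(addr:\)\{0,1\}\([0-9][0-9.]*\).*/\2/p' |
        grep -v '^127\.'
}

# Example:
#   /sbin/ifconfig | list_ipv4
```

If more than one address is printed, choose the one that is reachable from outside your local network.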

Congratulations. You may start submitting your job if your submission site is configured.

End of Client site Installation!

3.4           Submission Site Installation

3.4.1                General configuration

Make sure you have followed the middleware installation instructions in Section 3.2; in particular, you need to install Condor and Globus, configure GSI, request a service certificate, install the XML database and store the global SAMGrid constants into it.

3.4.2                Installation of the JIM Broker client

Product

jim_broker_client

Install as

sam

Install Operation

upd install jim_broker_client -G-c

Tailor as

sam

Tailor Operation

$ ups tailor jim_broker_client

 

Configuration example:
<?xml version="1.0"?>

<jim_broker_client_configuration>

  <interview_schema version="1_6" />

  <condor_config_parameters>

    <uid_domain domain="fnal.gov" />

    <local_dir dir="/data/jim" />

    <spool_dir dir="/data1/jim" />

    <condor_host hostname="samgrid.fnal.gov" />

    <condor_admin_email email="parag@fnal.gov" />

    <network_interface ip="131.225.110.153" />

    <broker_identity subject="/DC=org/DC=doegrids/OU=Services/CN=sam/samgrid.fnal.gov" />

    <condor_lowport_highport lowport_highport="49152,65535" />

    <site_name site_name="samgrid.fnal.gov" />

  </condor_config_parameters>

</jim_broker_client_configuration>

 

Configuration Parameters:

uid_domain: Enter your domain

local_dir: Enter the full path where you want to store the log files for the JIM suite. It must be a local path. A directory called jim_broker_client will be created automatically inside this local_dir when you first start the scheduler. Logs and the gridmapfile pertaining to the JIM broker client installation will be stored here. User ‘sam’ should have write access to this directory.

spool_dir: Enter the full path where you want to store the spool files for the JIM suite. It must be a local path. A directory called jim_broker_client will be created automatically inside this spool_dir when you first start the scheduler. The spool area stores the input and output sandboxes for the JIM broker client. User ‘sam’ should have write access to this directory.

condor_host: Enter the hostname of the broker.

condor_admin_email: Enter the administrator's email address for this installation.

network_interface: Enter the IP address of your system that you want to use. If you have multiple network interfaces, use the IP address of the interface that is accessible from outside your local network. To list the interfaces on your system, run /sbin/ifconfig in another window.

broker_identity: Enter the certificate subject of the Broker. Use the default.

condor_lowport_highport: Enter the range of port numbers on which you want the Condor processes to run (e.g. 50101,50120). Please note that this is important if the schedd node is behind a firewall.

site_name: Enter Site Name. This is the name which will appear in the class Ad of the schedd and will be displayed on the web.
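A value for condor_lowport_highport can be checked before it is entered in the interview. The sketch below assumes the LOW,HIGH form shown in the configuration example; valid_port_pair is a hypothetical helper, not a Condor or JIM command.

```shell
#!/bin/sh
# Sketch only: valid_port_pair is a hypothetical helper.
# Succeeds if $1 has the form "LOW,HIGH" with 1 <= LOW <= HIGH <= 65535.
valid_port_pair() {
    pair=$(printf '%s' "$1" | tr -d ' ')   # tolerate "50101, 50120"
    low=${pair%%,*}
    high=${pair##*,}
    # Exactly one comma: the value must be exactly LOW,HIGH.
    [ "$pair" = "$low,$high" ] || return 1
    for port in "$low" "$high"; do
        case "$port" in
            ''|*[!0-9]*) return 1 ;;       # empty or not a plain number
        esac
    done
    [ "$low" -ge 1 ] && [ "$high" -le 65535 ] && [ "$low" -le "$high" ]
}

# The value from the configuration example above
valid_port_pair "49152,65535" && echo "range looks valid"
```

The same range must also be opened in the firewall if the schedd node sits behind one.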

 

Notes:

Define the variable SAMGRID_LOCAL_DIRECTORY, as explained in the Section System Configuration, to get sensible defaults.

 

You will be asked several questions. Choose a directory for the job spooling area: user “sam” will write the input sandbox and other files to this area on behalf of the users' jobs. A good location is the samgrid local area (where the local ups dir generally is as well), in a directory called "jim". AFTER tailoring you will need to chown -R this area to user sam. Do not change the defaults of the other questions unless you are absolutely sure of what you are doing.

 

Other Useful Tasks:

·        Generate gridmapfile from voms

Refer to the Section “Automating the Maintenance Tasks”

 

The following tasks are no longer supported.

·        To add a new user to use your Submission site, execute

 

$ ups AddUser jim_broker_client

 

NOTE: you can add users ONLY AFTER you successfully started jim_broker_client once.

 

·        Optionally, if you are an advanced user and want to add multiple users at the same time, first create an input_file with a list of Grid subjects and execute

 

             $<jim_broker_client_prod_dir>ups/gridmap_gen.py  <input_file  >>condor_schedd_gridmap_file

Run as

sam

Run Operation

ups run jim_broker_client &

Notes:

Refer to the Section on starting the servers for instructions on running all servers.

DO NOT DO THIS STEP UNTIL YOU HAVE INSTALLED THE SAM SERVICE CERTIFICATE.

3.4.3                Installing Output retrieval via web

This is an optional package for users who prefer to retrieve their output from a web page after the job is completed.

·        Make sure your servlet runner is installed and configured properly in your submission site. You may optionally install our distribution of tomcat.

·        Install jim_www_sandbox servlet

Product

jim_www_sandbox

Install as

sam

Install operation

upd install jim_www_sandbox -G-c

Tailor as

sam

Tailor Operation

ups tailor jim_www_sandbox

 

Configuration example:
<?xml version="1.0" encoding="UTF-8"?>

<jim_www_sandbox_configuration>

  <interview_schema version="1_1"/>

  <nonsecure_services url="http://samgrid.fnal.gov:7080" directory="/data/products/ups/db/tomcat/webapps"/>

  <secure_services url="https://samgrid.fnal.gov:7081" directory="/data/products/ups/db/tomcat/secureapps"/>

  <jim_out_sandbox servlet_secure="no"/>

</jim_www_sandbox_configuration>

 

Configuration Parameters:

nonsecure_services, directory: Enter the directory used by your servlet engine to run non-secure servlets.
nonsecure_services, url: Enter the URL at which non-secure servlets are hosted.

secure_services, directory: Enter the directory used by your servlet engine to run secure servlets.
secure_services, url: Enter the URL at which secure servlets are hosted.
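Before tailoring, it can be useful to confirm that the two servlet directories actually exist on the machine. The following is a minimal sketch under stated assumptions: the helper names service_dir and check_service_dirs are hypothetical, the location of the configuration file is not specified here and must be supplied by you, and the sed expression is a simple line-oriented match rather than a real XML parser. It assumes each element and its attributes sit on one line, as in the configuration example above.

```shell
#!/bin/sh
# Hypothetical helper: print the directory="..." attribute of one element
# (for example nonsecure_services) from a jim_www_sandbox configuration file.
# Assumes the element and its attributes are written on a single line.
service_dir() {
    sd_element=$1
    sd_file=$2
    sed -n "s/.*<${sd_element}[[:space:]][^>]*directory=\"\([^\"]*\)\".*/\1/p" "$sd_file"
}

# Report whether the directory configured for each service exists locally.
# Returns 0 only if both directories are present.
check_service_dirs() {
    file=$1
    status=0
    for element in nonsecure_services secure_services; do
        d=$(service_dir "$element" "$file")
        if [ -n "$d" ] && [ -d "$d" ]; then
            echo "$element: $d exists"
        else
            echo "$element: directory '$d' not found"
            status=1
        fi
    done
    return $status
}

# Example (the file name is an assumption):
#   check_service_dirs /path/to/jim_www_sandbox_config.xml
```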

Notes:

You will be prompted to enter the location where you have installed the servlet on your machine. After tailoring, you may need to restart the servlet runner.

·        If you are using another distribution of Tomcat or a different servlet runner, you may need to perform the following additional step.

o   Verify that the Broker client is configured properly and that its environment is accessible to the servlet runner during startup; that is, “setup jim_broker_client” should be run in the same shell before the servlet runner is started, so that it gets access to the Broker client’s environment.
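One way to satisfy this requirement is to start the servlet runner through a small wrapper that performs the setup first. The following is a minimal sketch, not a script distributed with SAM-Grid: the function name start_with_broker_env is hypothetical, and both the path of the UPS setups script and the command that starts your servlet runner are site-specific assumptions that you must replace.

```shell
#!/bin/sh
# Hypothetical wrapper: make the Broker client environment available, then
# start the servlet runner from the same shell so it inherits that environment.
#   first argument       UPS setups script for your products area (site-specific)
#   remaining arguments  the command that starts your servlet runner
start_with_broker_env() {
    setups_script=$1
    shift
    [ -r "$setups_script" ] || { echo "cannot read $setups_script" >&2; return 1; }
    # Sourcing the UPS setups script is assumed to define the "setup" command.
    . "$setups_script" || return 1
    setup jim_broker_client || return 1
    # Start the servlet runner; it inherits the environment set up above.
    "$@"
}

# Example (both paths are assumptions for illustration):
#   start_with_broker_env /path/to/ups/etc/setups.sh \
#       "$CATALINA_HOME/bin/catalina.sh" start
```

Running the setup and the start command in one shell function keeps the two steps together, so a restart of the servlet runner cannot accidentally skip the Broker client setup.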

End of Submission Site Installation.

Skip to Section 4, “Starting the Servers”.

3.5           Execution Site Installation

 

IMPORTANT:  Before you proceed with the usual “upd install / ups tailor” routine, be sure to read, understand, and execute the instructions in the document describing the grid-to-fabric job submission interface. Local job submission is the core of the execution site installation (even though it accounts for only 1 or 2 of the roughly 20 packages installed at an execution site) and has historically caused the most questions and problems. Please do not install the rest of the execution site if local job submission is not working!

 

3.5.1                General configuration

Make sure you have followed the middleware installation instructions in Section 3.2; in particular, you need to install Condor and Globus, configure GSI, request a service certificate, install the XML database, and store the global SAMGrid constants in it.

3.5.2                Install sam

See http://d0db.fnal.gov/sam/ for DZero and CDF instructions.

Install the latest version of SAM and declare it current. Refer to the Samgrid release cuts (http://www-d0.fnal.gov/computing/grid/releases/). Also make sure that "setup SAM -q d0_prd" (or -q cdf_prd) sets up the latest version (check, for example, $SAM_DIR). If this is not the case, declare the previous versions "old".
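The $SAM_DIR check mentioned above can be scripted. The following is a minimal sketch: the function name sam_dir_has_version and the version string in the example are hypothetical, and the check assumes that the version appears as one component of the $SAM_DIR path, which may not hold for every products area layout.

```shell
#!/bin/sh
# Hypothetical check: after "setup SAM -q d0_prd" (or -q cdf_prd), confirm
# that $SAM_DIR points at the version you expect to be current.
# Assumes the version string is one path component of $SAM_DIR.
sam_dir_has_version() {
    expected=$1
    case "/${SAM_DIR:-}/" in
        */"$expected"/*)
            return 0 ;;
        *)
            echo "SAM_DIR='${SAM_DIR:-}' does not contain $expected" >&2
            return 1 ;;
    esac
}

# Example (the version string is an assumption; take it from the release cut):
#   setup SAM -q d0_prd
#   sam_dir_has_version v0_0_0 || echo "an older SAM version is still current"
```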

We recommend installing the JIM products in a separate ups database owned by user sam: this is the only set of products needed by the JIM software. The SAM software, on the other hand, is generally installed in a product area owned by user products or cdfsoft. The JIM execution site software needs access to a few SAM products (listed below); we found it convenient simply to install and configure them again in the JIM ups database:

·          sam client: the code and the configuration. By default, the JIM software will execute “setup SAM -q d0_prd” (or cdf_prd) to get the SAM client environment.

·          sam_cp_config: needs to be configured for intra-cluster transfers. Typically jim_gridftp or fcp is used. You can add the line '.' : [ 'jim_gridftp', ], to your domain capability map.

3.5.3                Setting up durable location (Optional)

You may use a durable location set up at a different or central site, or you may set up a durable location on site. Samgrid jobs use the durable location to store production files before they are merged and finally stored to tape.

To set up a durable location, refer to Samgrid’s latest release cut at http://www-d0.fnal.gov/computing/grid/releases/ and install the packages listed under “Middleware packages”, “Sam client packages”, and “Sam Station packages”, as well as jim_gridftp. If the durable location is on a machine that acts as a Samgrid head node or station node, most of these packages should already be installed. If not, refer to the installation and configuration instructions for the individual packages.

Once the installation is complete, register the location with SAM by sending an email to the SAM shifters with the name of the machine, the path to the storage, and the disk size of the storage. Then configure local_storage in the site configuration to use the durable location. Refer to Site configuration for more details. If you need to configure multiple durable locations, refer to the documentation on configuring a complex site with application-specific queues and storages at http://www-d0.fnal.gov/computing/grid/doc/Application-ResourceTuning-01Aug05-cut.pdf
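The three items requested in the registration email (machine name, storage path, disk size) can be collected with standard tools. The following is a minimal sketch: the function name durable_location_summary and the example path are hypothetical, and the reported size is that of the whole file system holding the path, which may be larger than the space you intend to dedicate to SAM.

```shell
#!/bin/sh
# Hypothetical helper: print the three items the SAM shifters are asked for
# when registering a durable location: machine name, storage path, disk size.
durable_location_summary() {
    path=$1
    [ -d "$path" ] || { echo "not a directory: $path" >&2; return 1; }
    # df -P gives POSIX-format output; -k reports sizes in 1024-byte blocks.
    # Field 2 of the second line is the total size of the file system
    # (this assumes the file system name contains no spaces).
    size_kb=$(df -Pk "$path" | awk 'NR == 2 { print $2 }')
    echo "Machine:   $(hostname -f 2>/dev/null || hostname)"
    echo "Path:      $path"
    echo "Disk size: ${size_kb} KB (file system holding the path)"
}

# Example (the path is an assumption):
#   durable_location_summary /path/to/durable/storage
```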

3.5.4                Get a host certificate

 

As user: root

Operations:

You need to request a host certificate for your gateway node from a Certificate Authority (CA); the response typically takes 1 day. SAM-Grid works mostly with the DOEGrids CA, but other CAs may be trusted as well. Contact d0sam-admin@fnal.gov or cdfsam-admin@fnal.gov for more information.

 

The GSI security binaries can be made available to your shell via

 

$ setup VDT

 

The command to request the certificate from the DOEGrids CA is

 

$ GRID_SECURITY_DIR=/etc/grid-security grid-cert-request -host `hostname -f` -ca 1c3f2ca8