SETTING UP AN MCFARM JOB SERVER NODE

 

 

            This document describes how to set up a Linux node as a generic farm node, that is, as a job server, a file server, and a production node all on the same machine, once that machine has been set up as directed by the document on Fermi RH 7.1 Installation. It assumes that the machine is dedicated to farm work.

 

 

SETTING UP THE DØ PACKAGES:

 

            Before setting up the DØ packages, create the directories /home/products and /home/products/fnal, and create a soft link /fnal pointing to the latter. Then proceed with the following steps.
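The directory and link creation described above could be scripted roughly as follows. This is a minimal sketch, not part of the McFarm distribution: the function name make_product_dirs and its root-prefix argument are hypothetical, introduced so that the layout can be tried under a scratch directory before being applied to the real filesystem as root.

```shell
#!/bin/sh
# Hypothetical helper: create the products directories and the /fnal link.
# $1 is a root prefix (an assumption, for safe testing); pass "" to act on
# the real filesystem, which requires root privileges.
make_product_dirs() {
    root=$1
    mkdir -p "$root/home/products/fnal"
    # Create the soft link only if nothing already occupies the name.
    if [ ! -e "$root/fnal" ] && [ ! -L "$root/fnal" ]; then
        ln -s "$root/home/products/fnal" "$root/fnal"
    fi
}

# Example (as root): make_product_dirs ""
```

Calling the function a second time leaves an existing /fnal untouched, so it can be re-run safely.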

 

·        Setting up UPS/UPD: Follow instructions on the d0race web page:
http://www-hep.uta.edu/~d0race/linux_install.html .

·        Installing the DØ binaries:

o       Create the directories /home/products/d0dist and /home/products/d0usr, and create links /d0dist and /d0usr pointing to them. Also create the directories /d0dist/dist and /d0usr/products.

o       Download the file UPSd0dist.tar.gz into the /d0dist/dist directory, and the file UPSd0uprod.tar.gz into the /d0usr/products directory. Then un-tar each file in its download directory (tar zxvf UPS*.tar.gz), and run the .fix-* files in their respective directories. (These scripts prompt for user input; answer ‘y’ at each prompt.)

o       Edit the file /fnal/ups/etc/upsdb_list, and add the lines:
/d0usr/products/upsdb
/d0dist/dist/upsdb

o       Go to the directory /fnal/ups/db/.updfiles, and rename the updconfig file that resides there (e.g. mv updconfig updconfig_old). Then download the file Updconfig into this directory. This file has four sections, each with the heading “COMMON”, and each containing a variable “UPS_THIS_DB”. Make sure that the first two occurrences of this variable are set to /d0dist/dist/upsdb, the third to /d0usr/products/upsdb, and the fourth to /fnal/ups/db.
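The un-tar, upsdb_list and updconfig steps in the list above could be sketched as shell functions like the ones below. None of this comes from the DØ or UPS/UPD distributions: the function names are hypothetical, the .fix-* files are assumed to be Bourne-shell scripts, and the updconfig check assumes each setting is written on one line in the form UPS_THIS_DB = value.

```shell
#!/bin/sh
# Un-tar a downloaded tarball in its directory and run any .fix-* files.
# Assumption: the .fix-* files are Bourne-shell scripts (they will prompt).
unpack_and_fix() {
    (
        cd "$1" || exit 1
        tar zxvf "$2"
        for fix in .fix-*; do
            [ -f "$fix" ] || continue   # pattern matched nothing
            sh "./$fix"
        done
    )
}

# Append the two DØ database paths to upsdb_list unless already listed.
add_upsdb_entries() {
    for db in /d0usr/products/upsdb /d0dist/dist/upsdb; do
        grep -qxF "$db" "$1" || printf '%s\n' "$db" >> "$1"
    done
}

# Check that the four UPS_THIS_DB settings appear in the required order.
# Assumption: each setting is a line of the form "UPS_THIS_DB = value".
check_updconfig() {
    awk '
        BEGIN {
            want[1] = "/d0dist/dist/upsdb"
            want[2] = "/d0dist/dist/upsdb"
            want[3] = "/d0usr/products/upsdb"
            want[4] = "/fnal/ups/db"
        }
        /^[ \t]*UPS_THIS_DB[ \t]*=/ {
            n++
            if (n > 4 || index($0, want[n]) == 0) {
                printf "occurrence %d unexpected: %s\n", n, $0
                bad = 1
            }
        }
        END { if (n != 4) bad = 1; exit bad }
    ' "$1"
}

# Example:
#   unpack_and_fix /d0dist/dist UPSd0dist.tar.gz
#   add_upsdb_entries /fnal/ups/etc/upsdb_list
#   check_updconfig /fnal/ups/db/.updfiles/updconfig
```

check_updconfig only reports mismatches and never edits the file, so the corrections themselves are still made by hand.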
We are now ready to install the DØ minitars.

o       Create a link /mcc-dist that points to /home/products/d0dist/dist

o       Download the 8 tar files listed on the MCP 10 page from d0mino.fnal.gov. These minitars reside in the directory /d0dist/dist/minitar/tarfiles. Download them into the ‘/’ directory, then un-tar each of them from there, except for the mc_runjob minitar, whose installation is described later.
This completes the setup of the DØ minitars for farm operation.
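Un-tarring every minitar except mc_runjob could be done with a small loop such as the one below. This is only a sketch: the function name untar_minitars is hypothetical, and it assumes that the downloaded files end in .tar or .tar.gz and that the mc_runjob minitar has "mc_runjob" in its file name. Check the actual file names before relying on it.

```shell
#!/bin/sh
# Un-tar all minitars found in directory $1, skipping the mc_runjob one.
# Assumptions: files end in .tar or .tar.gz; the mc_runjob minitar has
# "mc_runjob" in its file name.
untar_minitars() {
    (
        cd "$1" || exit 1
        for t in *.tar *.tar.gz; do
            [ -f "$t" ] || continue     # pattern matched nothing
            case $t in
                *mc_runjob*) echo "skipping $t (installed later)"
                             continue ;;
                *.tar.gz)    tar zxf "$t" ;;
                *)           tar xf "$t" ;;
            esac
        done
    )
}

# Example (as root): untar_minitars /
```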

 

 

SETTING UP THE SAM STATION:

 

            The job server is also typically one of the SAM gather servers. For this, the job server must be set up as a SAM station. To do so, follow the detailed instructions on the webpage for SAM INSTALLATION.

 

 

 

SETTING UP THE MCFARM DIRECTORY STRUCTURE:

 

Before setting up the directory structure, create the “mcfarm” account. It is advisable to set up the group for mcfarm and then add the user account mcfarm by executing the following commands:

groupadd -g 500 mcfarm

useradd -G 500 -g 500 mcfarm

We recommend that you set up the mcfarm account with the lowest possible group and user IDs, as in the two commands above, but you may give the group and user higher IDs if you prefer.
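Before running the two commands, it may be worth checking that the chosen ID is not already taken. The helper below is a hypothetical sketch (the name id_in_use is not a system command). It looks only at the third colon-separated field of a local file such as /etc/group or /etc/passwd, so it will not see accounts served by NIS or other network directories.

```shell
#!/bin/sh
# Succeed if numeric ID $2 appears in field 3 of the colon-separated
# file $1 (for example /etc/group or /etc/passwd).
id_in_use() {
    awk -F: -v id="$2" '$3 == id { found = 1 } END { exit !found }' "$1"
}

# Example:
#   for f in /etc/group /etc/passwd; do
#       id_in_use "$f" 500 && echo "ID 500 is already used in $f"
#   done
```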

 

ln -s /home/scratch /scratch

 

 

 

SETTING UP THE MCFARM SOFTWARE:

 

This section describes the setup of the mcfarm software and its configuration. All the following steps must be performed as user mcfarm unless otherwise specified. (NOTE: In the following instructions, ‘~’ refers to the home directory of the mcfarm user, i.e. ~/bin is the same as /home/mcfarm/bin, and so on.)

 

 

 

 

RUNNING A TEST JOB:

 

            You now have a farm of one node, which acts as a job server, file server and production node all rolled into one. You can now run a test minbi production job to check that everything was set up correctly.

 

  1. Make sure that the ~/distribute.conf file has max=1 for this node so that jobs are distributed to it.
  2. Modify the ~/conf-files/minbi-cdf.script.template file to make sure that it has the proper values for D-release, Cardfile version and UseMaxOpt. Rename it as before, without the .template at the end.
  3. Run the ~/conf-files/samples/make-minbi script to run a 1000-event pythia job. You can monitor its progress with the command jobstat -av. The job will be distributed, run, and then finally gathered to the only cache that you have now. You can check this by doing ls /cacheJJJ_A to see the output there.
  4. Change the ~/distribute.conf file back to max=0 for the job server (unless you plan to allow production jobs to run there also).

 

 

NOTE: For running production-type jobs, see the note on “How to Submit jobs to McFarm Control system” after you have completed building your farm.