MCFARM PRODUCTION NODE PREPARATION WITH AUTOMATIC INSTALL SCRIPT

 

 

            This document describes the preparation of an MC production node.

 

Prerequisites for setting up a production node (referred to as node number NNN):

 

 

 

Steps to be taken before configuring NNN:

 

1.         On job server JJJ, make sure that the /etc/exports file contains a line exporting the /home directory to NNN, like this:
            /home  FARM_NAME_NNN(rw,no_root_squash)

2.         On EACH file server FFF, the /etc/exports file should contain a line for each cache disk that needs to be exported to node NNN:
            /scratch FARM_NAME_NNN(rw,no_root_squash)

 

NOTE: It is VERY important that the above steps are completed correctly before proceeding to the configuration of NNN itself!
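
The export entries above can be checked before moving on. The following is a minimal sketch, not part of the mcfarm distribution: the function name has_export is hypothetical, the host name hepfm012 is only an example value standing in for FARM_NAME_NNN, and the check assumes each entry sits on a single line in the "directory host(options)" form shown above.

```shell
#!/bin/sh
# Hypothetical helper: succeed if FILE exports DIR to HOST.
# Assumes one entry per line, in the form "DIR HOST(options)".
has_export() {
    file=$1; dir=$2; host=$3
    awk -v d="$dir" -v h="$host" '
        $1 == d {
            for (i = 2; i <= NF; i++) {
                # strip the "(options)" suffix before comparing host names
                name = $i
                sub(/\(.*$/, "", name)
                if (name == h) found = 1
            }
        }
        END { exit(found ? 0 : 1) }
    ' "$file"
}

# Demonstration on a temporary file rather than the real /etc/exports.
tmp_exports=$(mktemp)
printf '%s\n' '/home  hepfm012(rw,no_root_squash)' > "$tmp_exports"

if has_export "$tmp_exports" /home hepfm012; then
    echo "/home is exported to hepfm012"
else
    echo "missing export of /home to hepfm012"
fi
rm -f "$tmp_exports"
```

On a real server the first argument would be /etc/exports, and the check would be repeated for each directory that must be exported to NNN.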

 

Configuring Node NNN as an mcfarm Production Node:

 

  1. Obtain the tar file “ProdNodeSetup.tar” and untar it as the root user (preferably in the /root directory, although the location does not matter). This yields two scripts: a Python script, “ProdNodeSetup”, which does the configuring, and a shell script, “SetupEnv”, which you must edit so that it exports the environment variables the configuration script needs.
  2. The shell script consists of a number of open-ended export statements that you must complete to define the environment variables listed below. (The values do not need quotation marks, and each value should follow immediately after the ‘=’ sign.)
    1. FARM_NAME:
      This should be the name of the entire farm cluster, that is, the name that you have given to your cluster. For example, one of our farms, run by the High Energy Physics group, consists of machines named like “hepfmXXX.uta.edu”, so on that farm the value of this variable would be “hepfm”.
    2. FARM_DOMAIN:
      This is the domain name of the machines on the cluster. Using the same example as above, the “hepfm” farm would have a domain name of “uta.edu”.
    3. FARM_JOB_SERVER_NUMBER:
      This is the node number JJJ of the job server node. For example, the value of this variable on the hepfm farm would be “000”.
    4. FARM_WORK_AREA:
      The value of this variable is the name of the directory on this node that you want to be used for farm work. For example, if you installed Linux on NNN following our document, then the value of this variable would be “/scratch”.
    5. FARM_ROOT_DAEMON:
      The value of this variable can be “yes” or “no”, depending on whether or not you want the root daemon. This tells the script whether or not to configure this node for root daemon access by mcfarm.
    6. FARM_FILE_SERVER_LIST:
      This is a colon-separated list that tells the script about the file servers and their partitions. For example, if your job server is node 000, and you have a file server, node 001, with two cache disks that you want to be used for file serving, then the value of this variable would be: “000_A:001_A:001_B”. Note that you need the “_A” for node 000 even though it has only one cache disk that you want to be used in file serving. If you have not yet installed a file server apart from the job server JJJ, then the value for this variable will just be “JJJ_A”.
      (Please take time to understand how to fill in this environment variable, since it is crucial to creating the correct links and directories and mounting those links/dirs from the file servers. If you are unsure about anything, please contact the UTA group for assistance.)

Then source this script in your current shell as “. SetupEnv”, so that the exported variables remain set when you run the configuration script.
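
For orientation, a completed SetupEnv might look like the sketch below. It simply fills in the hepfm example values quoted in the descriptions above; the order of the statements in the real script may differ, the choice of “yes” for FARM_ROOT_DAEMON is only an example, and the check loop at the end is an illustrative addition that is not part of SetupEnv.

```shell
# Sketch of a completed SetupEnv, using the hepfm example values from the text.
# Values follow "=" directly, with no quotation marks and no spaces.
export FARM_NAME=hepfm
export FARM_DOMAIN=uta.edu
export FARM_JOB_SERVER_NUMBER=000
export FARM_WORK_AREA=/scratch
export FARM_ROOT_DAEMON=yes        # "yes" is only an example choice
export FARM_FILE_SERVER_LIST=000_A:001_A:001_B

# Optional sanity check (not part of SetupEnv): report any variable left empty.
for var in FARM_NAME FARM_DOMAIN FARM_JOB_SERVER_NUMBER FARM_WORK_AREA \
           FARM_ROOT_DAEMON FARM_FILE_SERVER_LIST; do
    eval "value=\${$var}"
    if [ -z "$value" ]; then
        echo "$var is not set"
    fi
done

# The job server host name implied by these values:
echo "${FARM_NAME}${FARM_JOB_SERVER_NUMBER}.${FARM_DOMAIN}"
```

Because the file is sourced with “.” rather than executed, the variables stay defined in the shell from which the configuration script is started.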

 

  3. Then run the Python configuration script as “python ProdNodeSetup”. This should configure NNN as a production node. If there are any error messages, fix the errors and rerun the script.
    This script does the following things:
    1. Configures NNN so that it mounts /home from the job server JJJ. You can test this by logging on to NNN as any user that exists on JJJ; you should be able to see all the folders in the /home directory on JJJ.
    2. Configures NNN for access to all the cache disks of all the file servers.
      You can test this by listing cacheFFF_A; you should see all the files residing on that file server’s cache disk.
    3. Creates directories and links on this node for use by mcfarm. These are created in $FARM_WORK_AREA.
    4. Configures NNN for root daemon access, if you chose that option.
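
The cache-disk test described above can be repeated for every entry in FARM_FILE_SERVER_LIST. The following is a minimal sketch under stated assumptions: the helper cache_link_names and the variable LINK_ROOT are hypothetical, and the sketch assumes that each entry FFF_X corresponds to a link named cacheFFF_X directly under /, which the text does not state explicitly for the production node.

```shell
# Hypothetical helper: print one cache link name per entry of a
# colon-separated list such as 000_A:001_A:001_B.
cache_link_names() {
    printf '%s\n' "$1" | tr ':' '\n' | sed 's/^/cache/'
}

# LINK_ROOT is an assumption: the directory that holds the cacheFFF_X links.
LINK_ROOT=${LINK_ROOT:-/}

# Fall back to the example list from the text if the variable is not set.
for name in $(cache_link_names "${FARM_FILE_SERVER_LIST:-000_A:001_A:001_B}"); do
    if ls "${LINK_ROOT%/}/$name" >/dev/null 2>&1; then
        echo "$name: listing succeeded"
    else
        echo "$name: listing failed"
    fi
done
```

A failed listing points at either a missing link on NNN or a missing export on the corresponding file server.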

If there are any problems during installation, the script exits and, in general, cleans up after itself so that the system is back in the state it was in before you ran the script. If you want to clean up after running the script successfully, use the command “python ProdNodeSetup cleanup”, which restores the system to the state it was in before you ran the install script.

 

4.   Modify the file /home/mcfarm/bin/attach_nodes_to_js to include this node also.

5.   On the job-server JJJ and on each gather-server GGG, make the following mount directories and links:
            mkdir   /mnt/hepfmNNN
            mkdir   /mnt/hepfmNNN/scratch
            mkdir   /mnt/hepfmNNN/cache_A
            mkdir   /mnt/hepfmNNN/gath_queue
            ln -s   /mnt/hepfmNNN/scratch      /scrNNN
            ln -s   /mnt/hepfmNNN/cache_A      /cacheNNN_A
            ln -s   /mnt/hepfmNNN/gath_queue   /gatherNNN
            chown   mcfarm.mcfarm   /mnt/hepfmNNN/*
            chown   mcfarm.mcfarm   /scrNNN
            chown   mcfarm.mcfarm   /cacheNNN_A
            chown   mcfarm.mcfarm   /gatherNNN
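
Since the same commands are repeated for every new node, it can help to generate them from the node number. The sketch below is an illustration only: the function name print_server_links is hypothetical, the farm name falls back to the hepfm example when FARM_NAME is not set, and the function only prints the commands so that they can be reviewed before being run as root.

```shell
# Hypothetical helper: print the step 5 commands for one node number.
# Nothing is executed; review the output, then run it as root if correct.
print_server_links() {
    nnn=$1
    farm=${FARM_NAME:-hepfm}
    mnt=/mnt/${farm}${nnn}
    cat <<EOF
mkdir $mnt
mkdir $mnt/scratch
mkdir $mnt/cache_A
mkdir $mnt/gath_queue
ln -s $mnt/scratch /scr$nnn
ln -s $mnt/cache_A /cache${nnn}_A
ln -s $mnt/gath_queue /gather$nnn
chown mcfarm.mcfarm $mnt/*
chown mcfarm.mcfarm /scr$nnn
chown mcfarm.mcfarm /cache${nnn}_A
chown mcfarm.mcfarm /gather$nnn
EOF
}

# Example: show the commands for node 012 (an example node number).
print_server_links 012
```

Printing rather than executing keeps the sketch safe to try on any machine; the output can be saved to a file and run on the job server and on each gather server.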

 

Then, make sure that NNN exports its /scratch directory to JOB_SERVER_JJJ, by adding the following line to the /etc/exports file as root user:

/scratch           JOB_SERVER_JJJ(rw,no_root_squash)
Then issue the command exportfs -ar to put the new exports file into effect.


Then if you did NOT implement the root daemon, you must issue these manual mounts (now, and each time this server is booted):

 

mount -t nfs -o rw,rsize=16384,wsize=16384,actimeo=0,intr \
      hepfmNNN.uta.edu:/scratch \
      /mnt/hepfmNNN/scratch

mount -t nfs -o rw,rsize=16384,wsize=16384,actimeo=5,intr \
      hepfmNNN.uta.edu:/cacheNNN_A \
      /mnt/hepfmNNN/cache_A

mount -t nfs -o rw,rsize=16384,wsize=16384,actimeo=5,intr \
      hepfmNNN.uta.edu:/scratch/gath_queue \
      /mnt/hepfmNNN/gath_queue

 

If you DID implement the root daemon, then these mounts are performed using this command as mcfarm on the job server:

 

root_command    $FARM_SERVER_NODENAME   --script=$FARM_BIN/attach_nodes_to_js

 

and for each gather-server GGG

 

root_command    hepfmGGG   --script=$FARM_BIN/attach_nodes_to_gs_GGG

 

which will mount this new node (and all old nodes, harmlessly) on the job and gather servers.

 

Either way, from the server you should now be able to do   ls   /scrNNN   and see all the contents of the new node’s /scratch directory. The same applies to   ls   /cacheNNN_A   and   ls   /gatherNNN   (test these mounts by placing a file into the target directories).

 

6.   Modify /home/mcfarm/distribute.conf to include a line as follows for the node NNN:
            node=NNN,max=0,partition=hda7,nodename=hepfmNNN.uta.edu

Make sure that NNN is the node number and that the partition is correct. (The partition can be ascertained by doing df /scratch on NNN).
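
The partition name can be pulled out of the df output and dropped into the distribute.conf line. The sketch below is illustrative: the function name partition_from_df is hypothetical, the sample df output and node number 012 are made-up example values, and the -P option mentioned in the comment is an addition to the plain df command above (it selects the POSIX output format, which keeps each filesystem on one line).

```shell
# Hypothetical helper: read "df" output on stdin and print the partition
# name (the device in the first column of the second line, without /dev/).
partition_from_df() {
    awk 'NR == 2 { sub(/^\/dev\//, "", $1); print $1 }'
}

# On the node itself one would run:  df -P /scratch | partition_from_df
# Here a made-up sample stands in for real df output.
sample='Filesystem 1024-blocks Used Available Capacity Mounted on
/dev/hda7 10000000 2000000 8000000 20% /scratch'

part=$(printf '%s\n' "$sample" | partition_from_df)

# Build the distribute.conf line for example node 012 on the hepfm farm.
echo "node=012,max=0,partition=$part,nodename=hepfm012.uta.edu"
```

The generated line still starts with max=0, so no jobs are sent to the node until it has been verified.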

 

7.  When you are ready to allow farm tasks to be sent to this new node (you have verified the SSH functions from the job server and all gather servers, and you have verified NFS links from the job server, gather servers, file servers, and the node itself), then do these two steps to tell the farm distribute daemon to send jobs to the new node:

 

Modify the distribute.conf file to set max=M, where M is the number of CPUs on the new node that are to receive farm work.  Do not exceed the actual number.

If the farm itself is already running, then from the job server, issue
          start_execute NNN    

If the farm itself has not been started yet, issue start_farm as mcfarm.
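
The change of max for a single node can be sketched as follows. This is an illustration, not an mcfarm tool: the function name set_node_max is hypothetical, and it assumes that every node line in distribute.conf begins with "node=NNN,max=" exactly as in the example line in step 6. The function prints the modified file instead of editing it in place.

```shell
# Hypothetical helper: print CONF with max set to M on the line for node NNN.
# Assumes each node line starts with "node=NNN,max=" as in the step 6 example.
set_node_max() {
    conf=$1; nnn=$2; m=$3
    sed "s/^\(node=${nnn},max=\)[0-9][0-9]*/\1${m}/" "$conf"
}

# Demonstration on a temporary file rather than the real distribute.conf.
tmp_conf=$(mktemp)
cat > "$tmp_conf" <<'EOF'
node=011,max=2,partition=hda7,nodename=hepfm011.uta.edu
node=012,max=0,partition=hda7,nodename=hepfm012.uta.edu
EOF

# Allow two CPUs on example node 012 to receive farm work.
set_node_max "$tmp_conf" 012 2
rm -f "$tmp_conf"
```

To apply the change, the output would be written to a new file, reviewed, and then moved over /home/mcfarm/distribute.conf.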

 

 

NOTE:
If you are about to turn this node into a file server, you probably do not want to allow production jobs to be sent to it, because they will compete for time with the serving of files.   In such a case, leave max=0 in the distribute.conf file.  If you do send jobs there, watch the subsequent performance of the other nodes when they run d0sim, which uses the file servers for minbi data.