MCFARM Gather-Server Node Preparation Outline

4/21/02

 

Gather servers may be added to the farm if the load on the job server is too high to handle all gathering (e.g., SAM stores).  This brings an additional thread and NIC to bear on the task of sending files to SAM.  The following steps will transform a regular production node into a gather-server node GGG (for SAM, in this example).

 

·         Prepare the node as you would for regular production.  Gathering and regular production should be able to go together without excessive interference, but if necessary you can set    max=0   in the distribute.conf file to shut off production on this node.

 

·         Every production node (and the job server) must export its /scratch area to this new gather server GGG.  The job server is already doing so, via its export of /home.  On each production node, as root, add this line to the /etc/exports file:

 

        /scratch  hepfmGGG(rw,no_root_squash)

 

(repeat for additional scratch areas).

 

Then, as root, issue the command    /usr/sbin/exportfs   -ar      to put the new exports file into effect.  This is a good candidate for a root command.
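The exports step above can be sketched as a small helper that adds the entry only if it is not already there, so it is safe to run more than once.  This is a minimal sketch, not part of MCFARM: the function name add_scratch_export is hypothetical, and the node number 012 in the example is made up.

```shell
#!/bin/sh
# Append the /scratch export entry for gather server GGG to an exports file,
# unless an identical line is already present.
# Usage: add_scratch_export EXPORTS_FILE GGG
add_scratch_export() {
    exports_file=$1
    gs=$2
    entry="/scratch  hepfm${gs}(rw,no_root_squash)"
    # -F: fixed string, -x: match the whole line, -q: no output
    if grep -qFx "$entry" "$exports_file" 2>/dev/null; then
        return 0
    fi
    printf '%s\n' "$entry" >> "$exports_file"
}

# Example (as root on a production node; 012 is a made-up gather-server number):
#   add_scratch_export /etc/exports 012 && /usr/sbin/exportfs -ar
```

Repeat the call with a different path in the entry if the node has additional scratch areas.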

 

·         On the new gather server GGG, as root, make these directories and links for every other production node NNN (including the job server):

 

mkdir     /mnt/hepfmNNN

mkdir     /mnt/hepfmNNN/scratch

mkdir     /mnt/hepfmNNN/cache_A

mkdir     /mnt/hepfmNNN/gath_queue

ln    -s    /mnt/hepfmNNN/scratch           /scrNNN

ln    -s    /mnt/hepfmNNN/cache_A        /cacheNNN_A

ln    -s    /mnt/hepfmNNN/gath_queue   /gatherNNN

chown   mcfarm.mcfarm    /mnt/hepfmNNN/*

chown   mcfarm.mcfarm    /scrNNN

chown   mcfarm.mcfarm    /cacheNNN_A

chown   mcfarm.mcfarm    /gatherNNN
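When there are many production nodes, the commands above can be generated rather than typed.  The following is a minimal sketch that only prints the commands for each node number it is given, so they can be reviewed before being run as root.  The function name gs_node_setup_cmds is hypothetical, and mkdir -p is used here as a shorthand for the separate mkdir commands.

```shell
#!/bin/sh
# Print (do not run) the directory, link, and ownership commands
# for each node number NNN passed as an argument.
gs_node_setup_cmds() {
    for n in "$@"; do
        base="/mnt/hepfm${n}"
        echo "mkdir -p ${base}/scratch ${base}/cache_A ${base}/gath_queue"
        echo "ln -s ${base}/scratch /scr${n}"
        echo "ln -s ${base}/cache_A /cache${n}_A"
        echo "ln -s ${base}/gath_queue /gather${n}"
        # The * stays literal here; it is expanded only when the line is run.
        echo "chown mcfarm.mcfarm ${base}/* /scr${n} /cache${n}_A /gather${n}"
    done
}

# Example (node numbers are made up):
#   gs_node_setup_cmds 001 002 009          # review the output
#   gs_node_setup_cmds 001 002 009 | sh     # then run it as root
```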

 

IF there are any other nodes that have a segment of the archive queue, then you must also make these directories and links to each of them so this gather-server can see them for archiving.  You do not have to do this for a segment that is present in the /home/mcfarm directory itself.

 

mkdir     /mnt/hepfmFFF/archive_A

ln    -s    /mnt/hepfmFFF/archive_A   /archiveFFF_A

chown   mcfarm.mcfarm   /mnt/hepfmFFF/archive_A

chown   mcfarm.mcfarm   /archiveFFF_A

 

(repeat if there are segments on B, C, etc).

 

Then if you did NOT implement the root daemon, you must issue these manual mounts (now, and each time this server is booted):

 

mount   -t   nfs   -o   rw,rsize=16384,wsize=16384,actimeo=0,intr   \
        hepfmNNN.uta.edu:/scratch   \
        /mnt/hepfmNNN/scratch

mount   -t   nfs   -o   rw,rsize=16384,wsize=16384,actimeo=5,intr   \
        hepfmNNN.uta.edu:/cacheNNN_A   \
        /mnt/hepfmNNN/cache_A

mount   -t   nfs   -o   rw,rsize=16384,wsize=16384,actimeo=5,intr   \
        hepfmNNN.uta.edu:/scratch/gath_queue   \
        /mnt/hepfmNNN/gath_queue
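Because the manual mounts must be repeated for every node and after every boot, it may help to generate them.  The sketch below prints the three mount commands for each node number given.  It assumes that the cache and gather-queue areas are mounted from the same node NNN as the scratch area, matching the /mnt/hepfmNNN directories created earlier; the function name gs_mount_cmds is hypothetical.

```shell
#!/bin/sh
# Print (do not run) the three NFS mount commands for each node number NNN.
# Assumption: all three areas are served by hepfmNNN.uta.edu itself.
gs_mount_cmds() {
    for n in "$@"; do
        host="hepfm${n}.uta.edu"
        mnt="/mnt/hepfm${n}"
        # Scratch uses actimeo=0; cache and gather queue use actimeo=5.
        echo "mount -t nfs -o rw,rsize=16384,wsize=16384,actimeo=0,intr ${host}:/scratch ${mnt}/scratch"
        echo "mount -t nfs -o rw,rsize=16384,wsize=16384,actimeo=5,intr ${host}:/cache${n}_A ${mnt}/cache_A"
        echo "mount -t nfs -o rw,rsize=16384,wsize=16384,actimeo=5,intr ${host}:/scratch/gath_queue ${mnt}/gath_queue"
    done
}

# Example (node numbers are made up):
#   gs_mount_cmds 001 002 | sh     # as root on the gather server
```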

 

If you DID implement the root daemon, then these mounts are performed by issuing this command as mcfarm on this gather-server GGG:

 

root_command    hepfmGGG   --script=$FARM_BIN/attach_nodes_to_gs_GGG

 

which will mount this new node (and all old nodes, harmlessly) on gather servers.

 

Either way, from the gather server you should now be able to do    ls   /scrNNN   and see all the contents of that node’s /scratch directory.  The same holds for   ls  /cacheNNN_A     and   ls   /gatherNNN    and     ls  /archiveFFF_A   (test these mounts by placing a file into the target directories).

 

·         The ~/bin/setup_farm script must have these lines added so that the farm knows that this is a gather-server and that it has access to the other nodes’ disks:

 

export FARM_SCRATCH_ACCESS_008=  

export FARM_CACHE_ACCESS_008=

 

Note that these variables are not set to anything in particular – they simply exist.
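Since these variables are exported with empty values, a test for a non-empty value would miss them; only their existence matters.  How the farm scripts actually test for them is not shown here, so the following is only a minimal sketch of one way to check existence in plain sh.  The function name farm_flag_exists is hypothetical.

```shell
#!/bin/sh
# Lines as they would appear in ~/bin/setup_farm: exported, with empty values.
export FARM_SCRATCH_ACCESS_008=
export FARM_CACHE_ACCESS_008=

# Succeed if the named variable exists, even when its value is empty.
# ${VAR+x} expands to "x" when VAR is set and to nothing when it is unset.
# The argument must be a plain variable name, since it is passed to eval.
farm_flag_exists() {
    eval "[ -n \"\${$1+x}\" ]"
}

if farm_flag_exists FARM_SCRATCH_ACCESS_008; then
    echo "scratch access flag present for 008"
fi
```

By contrast,   [ -n "$FARM_SCRATCH_ACCESS_008" ]   is false for these lines, because the value is empty.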

 

·         In the farm, the gather-server performs the mounts of the production nodes in the start_farm command.  IF you have implemented the root daemon, modify the ~/bin/attach_node program to explicitly identify the new gather server and set the “attach_gs” and “on_gs” switches correctly.  Make the same changes to ~/bin/detach_node.   Then propagate these two scripts to every node’s  /scratch/localbin directory (use a root command).

 

Next, modify the ~/bin/attach_nodes_to_gs_GGG script to mention every production node and the job server.  Next, modify the ~/bin/start_farm to issue this command on start up:

 

root_command    hepfmGGG   --script=$FARM_BIN/attach_nodes_to_gs_GGG

 

Next, modify the ~/bin/start_farm script to actually start a gather daemon on the new gather-server:

 

ssh hepfmGGG start_gather --m=SAM

 

Next, modify the ~/bin/stop_farm script to stop the gather daemon on the new gather server:

 

ssh hepfmGGG stop_gather --wait &