MCFARM FILE-SERVER MODE PREPARATION
This document describes the preparation of an mcfarm file server node by converting an existing production node.
Prerequisites:
(You can also leave this node available for production and later disallow distribution to it, based on the output of the mcfarm prodsumm command, which details production activity. See the mcfarm users manual for information on the various mcfarm commands.)
Configuration of node FFF:
1. The file server must export its scratch directory(s) to every other node. As root on node FFF, modify the /etc/exports file to include a line for every other node, such as
/scratch hepfmNNN(rw,no_root_squash)
/scratch2 hepfmNNN(rw,no_root_squash)
Then, as root, issue the command /usr/sbin/exportfs -ar to put the new exports file into effect.
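The per-node export lines can be generated instead of typed by hand. The following is a minimal sketch, assuming the other nodes follow the hepfmNNN naming shown above; the function name gen_exports and the example node numbers are hypothetical. It only prints lines for review and does not touch /etc/exports.

```shell
#!/bin/sh
# gen_exports "DIR1 DIR2 ..." NODE...
# Prints one /etc/exports line per (node, directory) pair.
gen_exports() {
    dirs=$1; shift
    for node in "$@"; do
        for dir in $dirs; do
            printf '%s hepfm%s(rw,no_root_squash)\n' "$dir" "$node"
        done
    done
}

# Example: two scratch directories, three hypothetical client nodes.
gen_exports "/scratch /scratch2" 001 002 003
```

After checking the output, it can be appended to /etc/exports as root, followed by the exportfs command above.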
2. As root on the file server, make these directories and links (may already exist):
mkdir /scratch/cache_A
chown mcfarm.mcfarm /scratch/cache_A
ln -s /scratch/cache_A /cacheFFF_A
chown mcfarm.mcfarm /cacheFFF_A
(repeat for all other cache partitions, using _B and /scratch2, etc.).
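When there are several cache partitions, the same four commands can be produced in a loop. This is a sketch only: the function name cache_cmds and the "LETTER:DIRECTORY" argument format are hypothetical, and it uses mkdir -p (rather than plain mkdir) so that an already existing directory is not an error. It prints the commands as a dry run instead of executing them.

```shell
#!/bin/sh
# cache_cmds NODE LETTER:SCRATCH_DIR...
# Prints the step-2 commands for each cache partition (dry run).
cache_cmds() {
    node=$1; shift
    for pair in "$@"; do
        letter=${pair%%:*}     # text before the first colon, e.g. A
        scratch=${pair#*:}     # text after the first colon, e.g. /scratch
        echo "mkdir -p $scratch/cache_$letter"
        echo "chown mcfarm.mcfarm $scratch/cache_$letter"
        echo "ln -s $scratch/cache_$letter /cache${node}_$letter"
        echo "chown mcfarm.mcfarm /cache${node}_$letter"
    done
}

# Example: partition A on /scratch and partition B on /scratch2.
cache_cmds FFF A:/scratch B:/scratch2
```

Replace FFF with the real node number, review the printed commands, and then run them as root on the file server.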
3. Every other production node (not the job server) must auto-mount this new file-server partition. On every such node now on the farm, as root, add these directories and links:
mkdir /mnt/hepfmFFF
mkdir /mnt/hepfmFFF/cache_A
ln -s /mnt/hepfmFFF/cache_A /cacheFFF_A
chown mcfarm.mcfarm /mnt/hepfmFFF
chown mcfarm.mcfarm /mnt/hepfmFFF/cache_A
chown mcfarm.mcfarm /cacheFFF_A
(Repeat the above for any additional partitions B, C, etc.).
Then add this line to the /etc/fstab file of every production node:
hepfmFFF.uta.edu:/cacheFFF_A /mnt/hepfmFFF/cache_A nfs rw,intr,rsize=16384,wsize=16384,actimeo=5
hepfmFFF.uta.edu:/cacheFFF_B /mnt/hepfmFFF/cache_B nfs rw,intr,rsize=16384,wsize=16384,actimeo=5
(the second line is for the 2nd partition)
and then, as root on each of those nodes, issue a mount -a command. As mcfarm on each production node, you should now be able to see the /cacheFFF_A contents (and B, C, etc.). Test this by copying test files in and out of the directory.
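The copy-in, copy-out test can be scripted so that it is easy to repeat on every node. The sketch below is an assumption about what such a test might look like, not part of mcfarm; the function name roundtrip_test is hypothetical, and it relies on the mktemp utility being installed.

```shell
#!/bin/sh
# roundtrip_test DIR
# Copies a small file into DIR and back out, compares the two copies,
# and removes the test files. Returns 0 on success, 1 on any failure.
roundtrip_test() {
    dir=$1
    src=$(mktemp) || return 1
    back=$(mktemp) || return 1
    name="roundtrip_test.$$"
    echo "mcfarm cache test $(date)" > "$src"
    if cp "$src" "$dir/$name" && cp "$dir/$name" "$back" && cmp -s "$src" "$back"; then
        status=0
    else
        status=1
    fi
    rm -f "$src" "$back" "$dir/$name"
    return $status
}

# Example (run as mcfarm on a production node; FFF is the file server number):
# roundtrip_test /cacheFFF_A && echo "cache A OK"
```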
4. Since this node is offering file-serving to the farm, place these two lines in the ~/bin/setup_farm script so that the farm software knows to use it (FFF is this node's number):
export FARM_FILESERVER_CACHE_FFF_A=$FARM_CACHE'FFF_A'
export FARM_FILESERVER_CACHE_FFF_A_DEV=hda7
Repeat those two lines for B, C, etc., specifying the correct partition for each. You will have to stop all gather daemons, issue the new setup_farm command, and restart them before this will take effect.
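For a file server with several partitions, the pairs of setup_farm lines can be generated in one pass. This is a sketch under stated assumptions: the function name fileserver_exports and the "LETTER:DEVICE" argument format are hypothetical, and hdb1 is only a placeholder device name. The device that actually holds a directory is shown in the first column of the output of df for that directory.

```shell
#!/bin/sh
# fileserver_exports NODE LETTER:DEVICE...
# Prints the two setup_farm lines for each cache partition.
fileserver_exports() {
    node=$1; shift
    for pair in "$@"; do
        letter=${pair%%:*}   # partition letter, e.g. A
        dev=${pair#*:}       # device name, e.g. hda7
        printf "export FARM_FILESERVER_CACHE_%s_%s=\$FARM_CACHE'%s_%s'\n" \
            "$node" "$letter" "$node" "$letter"
        printf 'export FARM_FILESERVER_CACHE_%s_%s_DEV=%s\n' \
            "$node" "$letter" "$dev"
    done
}

# Example: partition A on hda7 and partition B on a placeholder device.
fileserver_exports FFF A:hda7 B:hdb1
```

The printed lines are meant to be pasted into ~/bin/setup_farm after review, with FFF replaced by the real node number.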
5. IF this node is ALSO going to be used to hold a portion of the archive queue (which it probably should, because the archives can get large and thus need to be segmented over multiple nodes), then as root on the new file server FFF create these directories and links:
mkdir /scratch/cache_A/archive
mkdir /scratch/cache_A/archive/jobs
ln -s /scratch/cache_A /archiveFFF_A
chown mcfarm.mcfarm /scratch/cache_A/archive
chown mcfarm.mcfarm /scratch/cache_A/archive/jobs
chown mcfarm.mcfarm /archiveFFF_A
Then modify the ~/bin/setup_farm script to identify this new archive segment, and specify how much room in MB to keep free on the disk (i.e., don’t fill up the disk with archive info):
export FARM_ARCHIVE_JOBS_QUEUE_NN=$FARM_ARCHIVE'FFF_A/archive/jobs'
export FARM_ARCHIVE_MINIMUM_NN=2048 # leave 2GB
where NN is a number from 00 to 99 that specifies the order in which this segment is to be filled during archiving, and FFF is the new node. You will have to stop all gather daemons, issue the new setup_farm command, and restart them before this will take effect.
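To pick a sensible value for the minimum, or to check by hand whether a segment is already near its reserve, the free space can be read from df. How mcfarm itself enforces the minimum is not described here; the sketch below is only a manual check, and the function names free_mb and has_room are hypothetical.

```shell
#!/bin/sh
# free_mb DIR
# Prints the free space, in whole MB, on the file system holding DIR.
free_mb() {
    # df -Pk reports 1024-byte blocks; column 4 is the available space.
    df -Pk "$1" | awk 'NR == 2 { print int($4 / 1024) }'
}

# has_room DIR MIN_MB
# Succeeds if at least MIN_MB megabytes are free on DIR's file system.
has_room() {
    [ "$(free_mb "$1")" -ge "$2" ]
}

# Example (FFF is the file server number):
# has_room /archiveFFF_A/archive/jobs 2048 || echo "below the 2 GB reserve"
```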
Then on the job server, and on each other gather-server, as root, make these directories and links so they can see this segment of the archive queue:
mkdir /mnt/hepfmFFF/archive_A
ln -s /mnt/hepfmFFF/archive_A /archiveFFF_A
chown mcfarm.mcfarm /mnt/hepfmFFF/archive_A
chown mcfarm.mcfarm /archiveFFF_A
6. Modify ~/bin/attach_node to specifically mention this node as a file-server (in two places). Make the same change to ~/bin/detatch_node. Propagate both scripts to the /scratch/localbin directory of all nodes (you can use the root daemon for this).
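If the root daemon is not used, one alternative is to copy the two scripts with scp. This is a swapped-in technique, not the mcfarm mechanism, and it assumes ssh access to each node under the hepfmNNN.uta.edu naming used above. The function name propagate_cmds and the example node numbers are hypothetical. The sketch prints the commands as a dry run rather than running them.

```shell
#!/bin/sh
# propagate_cmds NODE...
# Prints one scp command per node that would copy both scripts
# into that node's /scratch/localbin directory (dry run).
propagate_cmds() {
    for node in "$@"; do
        echo "scp -p $HOME/bin/attach_node $HOME/bin/detatch_node hepfm${node}.uta.edu:/scratch/localbin/"
    done
}

# Example with two hypothetical node numbers.
propagate_cmds 001 002
```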
7. The node FFF should now be able to accept new files in its cache directory(s), and they should be visible to every other node. Since the file server was built out of a production node, it should auto-mount the job server. File-serving per se does not require NFS access to any other node but the job server.