Installing a SAM Station at SFU in July 2003 (last update July 25, 2003)

This page describes the installation of a SAM station at Simon Fraser University in July 2003. It follows the instructions for UPS/UPD and for D0RunII from SFU. The contact person is Dugan O'Neil. This is written like a diary: mistakes are left in, with notes beside them (look for ###) or nearby when the mistake is known. Otherwise, things may be hacked later in the instructions to work around some procedural flaw whose cause is unknown (at least to me). In the end it all works.

Installing SAM

  1. Register your SAM Station. Send an email to sam-admin requesting that they register a new station. Specify what machine(s) the station will control and who should be the station administrator.
  2. Create a SAM user (requires root privilege)
    /usr/sbin/adduser sam -u 7816
    
  3. Install the sam ups product
    upd install -G -c sam
    
  4. The instructions on the web then say to run "ups tailor sam", but this doesn't appear to do anything (it returns immediately). Since "ups tailor sam_config" does do something, I'll assume the docs are out of date and just continue. Now configure sam
    ups tailor sam_config
    What would you like to do?
       [a: add configuration]
       [d: done (default)]
    a
     choose    [9: D0 Production User (__d0_user_prd__)]
    You have chosen "D0 Production User (__d0_user_prd__)" as your base configuration.
    Enter new configuration qualifier:
    prd
    Enter new configuration description:
    test station at SFU
    Adding configuration prd based on configuration __d0_user_prd__...
    Added configuration prd.
    Declaring sam_config with qualifier prd into the UPS database...
    Executing: ups declare -r /D0/ups/prd/sam_config/NULL/v4_2_23 -M /D0/ups/db/sam_config/v4_2_23 -m sam_config_prd.table sam_config v4_2_23 -q prd -f NULL -c
    What would you like to do?
       [a: add configuration]
       [e: edit configuration]
       [r: remove configuration]
       [s: set default configuration]
       [d: done (default)]
    d
    
  5. Now install sam_bootstrap
    upd install -G -c sam_bootstrap
    
  6. Configure sam_bootstrap
    ups tailor sam_bootstrap
    Enter SAM_BOOTSTRAP_ENV [default: /D0/ups/db/sam_bootstrap/sam_bootstrap.env]?
    Enter SAM_SERVER_HOME [default: /home/sam/private]?
    Enter SAM_MAIL_RECIPIENT [default: sam-auto@fnal.gov]? dugan_oneil@sfu.ca
    Enter SAM_SERVER_LIST [default: /home/sam/private/p8460_server_list.txt]?
    .
    keep hitting enter
    .
    ---------------
    Configure a station? [default: false]: y
    Enter the environment? station_prd       #### mistake: should have answered prd (see step 9.1)
    Enter the sam_station product version:? v4_2_1_43 (current version returned by upd list -a sam_station)
     Station name? p8460.phys.sfu.ca
    Options to the server [default: None]? --constrain-delivery=d0mino.fnal.gov --min-delivery=1k
    Configure another station? [default: false]:
    ----------------
    Configure an fss? [default: false]: y
    Enter the environment [default: station_prd]?   #### mistake: prd
    Enter the sam_station product version:? v4_2_1_43
    Station name [default: p8460.phys.sfu.ca]?
    Options to the server [default: None]?
    Configure another fss? [default: false]:
    ----------------
    Configure a stager? [default: false]: y
    Enter the environment [default: station_prd]?     #### mistake: prd
    Enter the sam_station product version:? v4_2_1_43
    Station name [default: p8460.phys.sfu.ca]?
    Options to the server [default: None]? --with-fss --without-sm --max-transfers=N     #### mistake: N should be a number (see step 9.1)
    Configure another stager? [default: false]:
    ----------------
    Configure a bbftp? [default: false]:
    ----------------
    Configure a gridftp? [default: false]: y
    Enter the environment [default: station_prd]?     #### mistake: prd
    Enter the sam_gridftp product version: [default: v1_8d]?
    Options to the server [default: None]?
    Configure another gridftp? [default: false]:
    ----------------
    
    The configuration then ends with
    Due to a ups bug, currently sam_gridftp must be installed by hand
    Please read the instructions in /D0/ups/prd/sam_bootstrap/NULL/v4_2_26/www/sam_bootstrap.html
    
    You need to initialize the server_list file.
    Please execute the following command as user sam:
    
         cp /tmp/oneil_p8460_server_list.txt \
             /home/sam/private/p8460_server_list.txt
    
    So, do the copy (need mkdir first) and read the file....
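    A minimal sketch of that copy, assuming the directory that had to be created was the destination directory and using the paths printed by the tailor step above. The function name init_server_list is made up for illustration, and for the real paths it would be run as user sam:

    ```shell
    # Hypothetical helper: create the destination directory if it is
    # missing, then copy the generated server list into place.
    init_server_list() {
        src=$1
        dest=$2
        mkdir -p "$(dirname "$dest")" || return 1
        cp "$src" "$dest"
    }

    # Paths as printed by "ups tailor sam_bootstrap" on this machine.
    # Uncomment to run for real:
    # init_server_list /tmp/oneil_p8460_server_list.txt /home/sam/private/p8460_server_list.txt
    ```

    The actual call is left commented out so that the block only defines the helper.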
  7. Install the security infrastructure for sam_gridftp (reading from file:///D0/ups/prd/sam_bootstrap/NULL/v4_2_26/www/sam_bootstrap.html)
    upd install globus_dh_server -G-c
    Installation of the globus security infrastructure
    - as root:
    # setup globus_dh_server
    # ${GLOBUS_LOCATION}/setup/globus/setup-gsi
    Installation of the globus data handling client tools
    - as products:
      upd install globus_dh_client -G-c
    Installation of the globus security infrastructure
    configuration tools
    - as products:
    > upd install sam_gsi_config -G-c
    -------------------
    Configuration of the globus security infrastructure
    and request of the sam server certificate: ONLY THE
    FERMILAB Kerberos CA IS TRUSTED.
    -as root:
    # ups tailor sam_gsi_config
    What VO should be configured (options:  d0 cdf jimcaf jimsam)? d0
    Do you want the DOE Grid your default CA (this is convenient when requesting new DOE Grid certificate) (yes/no)? yes
    # setup sam_gsi_config
    # sam_cert_request
    You will need to repeat the last 2 steps on each machine requiring 
    gridftp server or client access.
    
    This last step will generate a long list of instructions which boil down to:
    
    Send the file /etc/grid-security/samserver.request in an email to nightwatch@fnal.gov in order to get a certificate. Apparently they will respond within two working days with your certificate, and the instructions add:
    
    IMPORTANT: when the CA sends your certificate,
               you can copy it on /etc/grid-security/samserver.cert
               as user sam. In any case, you should make sure that
               user sam has read access to it.
    
    Once the certificate is received you can
    
    send email to sam_admin@fnal.gov and ask them to add your sam server certificate to the central sam_gridftp grid-mapfile. Include in the email your certificate subject as reported by sam_cert_request (it is a string of the form
    "/C=US/ST=Illinois/L=Batavia/O=Fermilab/CN=sam/p8460.phys.sfu.ca").
    
    Once the sam-admin team gets back to you, you can run the test they recommend in their instructions. This test and its results are shown below:
    [sam@p8460 sam]$ setup globus_dh_client
    [sam@p8460 sam]$ export X509_USER_CERT=/etc/grid-security/samserver.cert
    [sam@p8460 sam]$ export X509_USER_KEY=/etc/grid-security/samserver.key
    [sam@p8460 sam]$ grid-proxy-init
    Your identity: /C=US/ST=Illinois/L=Batavia/O=Fermilab/CN=sam/p8460.phys.sfu.ca
    Creating proxy .................................... Done
    Your proxy is valid until Sat Jul 12 04:04:03 2003
    [sam@p8460 sam]$ grid-proxy-destroy
    
    Now fill your /home/sam/.gridmap file with definitions of all your friends
    ups get_gridmap sam_gsi_config
    
    You should then make a cronjob to automatically update this so you keep track of any new friends....but I haven't done that yet.
  8. Install sam_gridftp
    upd install sam_gridftp
    setup globus_location
    ups tailor sam_gridftp
    (defaults OK)
    upd install sam_cp -G-c
    edit $SAM_CP_DIR/ups/sam_cp.table and uncomment 
     setupOptional(sam_gridftp)
     This ugly workaround is again needed because of the upd v4_6 bug,
     ticket 27906
    
    I then edited
    /D0/ups/db/sam_cp_config/Config/sam_cp_config.py
    
    and replaced the line that said YOUR_NODE_HERE with
            'p8460.phys.sfu.ca' : ['sam_gridftp', ],
    
    and added the lines
          'd0mino.fnal.gov' : [ 'sam_kerberos_rcp', \
                                'sam_bbftp',\
                                'rcp',\
                                'sam_gridftp', \
                                'enstore', ],
    
    finally edit ${GLOBUS_LOCATION}/etc/ftpaccess adding
    noretrieve /*
    allow-retrieve /sam
    upload /home/sam * no
    
    
  9. Now configure your station (http://d0db.fnal.gov/sam/doc/install/stationConfig.shtml). This is where we handle all the details, like allocating disk.
    1. First start the station. In order to make it start I had to hand edit a file. I guess I was meant to call the "environment" for the stagers "prd" rather than "station_prd" above. So, I edited /home/sam/private/p8460_server_list.txt replacing station_prd with prd everywhere. Then
      setup sam -q prd
      ups start sam_bootstrap
      
      You should now be able to query the station with commands like
      sam dump station --station=p8460.phys.sfu.ca --disks
      
      and can now change things in the station configuration, like adding disks.

      UGLY: my screen is periodically filling up with messages like

      cat: /D0/ups/db/sam_bbftp/sam_bbftp_cookie.config: No such file or directory
      cat: /D0/ups/db/sam_bbftp/sam_bbftp_retry.config: No such file or directory
      cat: /D0/ups/db/sam_bbftp/sam_bbftp_parallel_xfer.config: No such file or directory
      cat: /D0/ups/db/sam_dcache_cp/sam_dcache_cp_host.config: No such file or directory
      cat: /D0/ups/db/sam_dcache_cp/sam_dcache_cp_port.config: No such file or directory
      
      even though I have no interest in either bbftp or dcache, and neither is listed in my server list file. I made the messages go away with
       touch /D0/ups/db/sam_bbftp/sam_bbftp_cookie.config
       touch /D0/ups/db/sam_bbftp/sam_bbftp_retry.config
       touch /D0/ups/db/sam_bbftp/sam_bbftp_parallel_xfer.config
       touch /D0/ups/db/sam_dcache_cp/sam_dcache_cp_host.config
       touch /D0/ups/db/sam_dcache_cp/sam_dcache_cp_port.config
       touch /D0/ups/db/sam_dcache_cp/sam_dcache_cp_mounting_point.config
      
      Also, the stager kept crashing, and the email said "Specify the max number of parallel transfers via the --max-transfers= flag", which puzzled me. I then looked in /home/sam/private/p8460_server_list.txt and saw that in the stager line --max-transfers was set to the literal "N", exactly as I had typed it during the tailor step above. Maybe I did something wrong in the setup? Anyway, I changed it from N to 3.
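      The two hand edits above (station_prd to prd, and the literal N in --max-transfers) can be sketched as one small function. This is only an illustration: the function name fix_server_list is hypothetical, and the exact layout of the lines in the server list file is an assumption modelled on the tailor answers in step 6. It keeps a .bak copy of the original file.

      ```shell
      # Hypothetical helper: rewrite a server list file so that every
      # "station_prd" environment becomes "prd" and a literal
      # "--max-transfers=N" placeholder becomes a real number.
      fix_server_list() {
          file=$1
          max=$2
          # Keep the original next to the edited file.
          cp "$file" "$file.bak" || return 1
          sed -e 's/station_prd/prd/g' \
              -e "s/--max-transfers=N/--max-transfers=$max/" \
              "$file.bak" > "$file"
      }

      # Uncomment to apply to the real file (as user sam):
      # fix_server_list /home/sam/private/p8460_server_list.txt 3
      ```

      The station still has to be restarted afterwards for the changes to take effect.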
    2. Now add disks. I started by adding a 40 GB disk called /sam. As a station administrator (oneil) I do
      sam add disk --station=p8460.phys.sfu.ca \
                    --mount=p8460.phys.sfu.ca:/sam \
                    --size=40000k
      
      
      Now when I look at the station disks I see
      
      [oneil@p8460 oneil]$ sam dump station --station=p8460.phys.sfu.ca --disks
      *** BEGIN DUMP STATION p8460.phys.sfu.ca version v4_2_1_43 running at p8460 7 days 5 hours 55 minutes 53 seconds, admins: oneil
      No replica selection criteria
      There are 0 authorized transfer groups
      Minimum delivery is 1KB; external deliveries are constrained to d0mino.fnal.gov
      Excess consumer satisfaction: 1
      STATION DISKS:
      disk 1270 p8460.phys.sfu.ca:/sam, 40000000KB/40000000KB = 100% free INACTIVE
      station disk total: 40000000KB/40000000KB = 100% free
      
      *** END OF STATION DUMP ***
      
      Notice that my disk is inactive. This is apparently because /sam is not owned by user sam. So, I changed the ownership
      chown sam:users /sam
      
      and restarted the station
      ups restart sam_bootstrap
      
      and now my disk is active.
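      Since the INACTIVE state here apparently came down to directory ownership, a quick check before restarting the station might look like the sketch below. The helper name owner_of is made up, and reading the owner from "ls -ld" output is just one portable way to do it. Whether wrong ownership is the only possible cause of an inactive disk is not established here.

      ```shell
      # Hypothetical helper: print the owning user of a file or directory
      # (third field of the long listing).
      owner_of() {
          ls -ld "$1" | awk '{print $3}'
      }

      # Example check for the cache directory used above:
      # [ "$(owner_of /sam)" = "sam" ] || echo "/sam is not owned by user sam"
      ```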
    3. Now add groups. Here we add the dzero group.
      sam add group --group=dzero --fair-share=1 \
                                --station=p8460.phys.sfu.ca \
                                --admin=oneil
                                
      
      Please note that the first time I did this I only put in the required parameters, as shown above. When I tried to use the station to deliver some files with this configuration, SAM complained that the group had 0 space assigned to it, so it couldn't deliver any files. Clearly the default values are not good enough. I then had a tough time changing the group's parameters, such as giving it a real disk quota. It turns out a (temporary) trick is needed: you must put --admin= on the configure command line
      sam configure group --group=dzero --maxDisk=40G --admin=oneil --maxProjects=100 --station=p8460.phys.sfu.ca
      
    4. Add routing information. I made the station line in /home/sam/private/p8460_server_list.txt look like
      station prd v4_2_1_43 p8460.phys.sfu.ca --min-delivery=1k --routing-station=\.\*::central-router --routing-user=p8460.phys.sfu.ca --routing-group=dzero
      
      Please note that you should NOT put a --constrain-delivery option on this line any more; the routing options take care of things. I did this at first and saw messages like
      07/25/03 08:32:45 p8460.phys.sfu.ca.SM.CacheFitter_constrained 753: Initialized with 1 disks and 0 candidates
      07/25/03 08:32:45 p8460.phys.sfu.ca.SM.CacheFitter_constrained 753: Delivery of 2428073 is constrained to disks (none), i.e., impossible
      07/25/03 08:32:45 p8460.phys.sfu.ca.SM.CacheMan dzero 753: Could not fit files on disk, possibly due to fragmentation
      07/25/03 08:32:45 p8460.phys.sfu.ca.SM.Repler 753: No more deliveries possible
      
      
      in /home/sam/private/station__p8460__prd__p8460.phys.sfu.ca/trace whenever I tried to get some files. Once I removed the --constrain-delivery option and restarted the station things worked fine.
      (as user sam)
      setup ups
      ups restart sam_bootstrap
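      To catch the conflict described above before restarting, one could scan the server list for a station line that combines routing options with --constrain-delivery. A rough sketch, assuming each server occupies a single line beginning with its type (as in the station line above); the function name has_delivery_conflict is hypothetical:

      ```shell
      # Hypothetical check: exit 0 if any "station" line in the given
      # server list has both a routing option and --constrain-delivery,
      # exit 1 otherwise.
      has_delivery_conflict() {
          awk '/^station / && /--routing-station=/ && /--constrain-delivery=/ { found = 1 }
               END { exit !found }' "$1"
      }

      # Example:
      # has_delivery_conflict /home/sam/private/p8460_server_list.txt && echo "remove --constrain-delivery"
      ```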
      
  10. Test some file delivery. Try to get your favourite files delivered!

Dugan O'Neil
Last modified: Fri Jul 25 18:49:32 CDT 2003