Reprocessing: Meeting of 6-Dec-2004: 9:30-10:30 ESNet video conference

Our meeting number on the ESNet is 823073776 (82d0repro).
Instructions to dial into a video conference via phone.

Agenda

  1. News
  2. Status of Implementation of Reprocessing in JIM (update)
  3. Status of JIM Deployment to Remote Site
  4. AOB.

Minutes

Participants

Joe Steele, Amber Boehnlein, Jörg Meyer, Aaron Dominguez, Tibor Kurca, Thomas Nunnemann, Vlastislav Hynek, Milos Lokajicek, Jan Svec, Yann Coadou, Gabriele Garzoglio, Mike Diesburg, Daniel Wicke (FNAL), Jae Yu (SAR).

Topics

  1. News
    The p17.01.00 tarball for JIM is still missing. Ian has not talked to Mike since last week.
    The 20 pb^-1 sample is 95-97% done. One infinite loop was found (a maxopt-only problem at a rate of 1/run, identified by Gordon Watts).
    Some events use more than 2 GB of virtual memory. The TMB is 30-40% larger than before.
    Timing is not faster than p14. Speed improvements are seen only for high-luminosity runs.
  2. Status of Implementation of Reprocessing in JIM (update)
  3. Status of JIM Deployment to Remote Site

    New cut of JIM versions appropriate for reprocessing:
    jim_job_managers v2_2_25
    jim_sandbox v2_4_1
    jim_client v2_0_25
    sam_client v1_0_8
    xmldb_client v2_0_6
    xmldb_server v1_0_3
    vdt v1_1_14_13
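
A site could check its installation against this cut with a small script. The following is only a sketch: the function names and the `installed` dictionary are hypothetical, and how a site actually obtains its installed versions (for example from its ups database) is not covered here.

```python
# Versions of the JIM cut listed above (taken from these minutes).
REQUIRED_CUT = {
    "jim_job_managers": "v2_2_25",
    "jim_sandbox": "v2_4_1",
    "jim_client": "v2_0_25",
    "sam_client": "v1_0_8",
    "xmldb_client": "v2_0_6",
    "xmldb_server": "v1_0_3",
    "vdt": "v1_1_14_13",
}


def parse_version(tag):
    """Turn a tag such as 'v2_2_25' into a tuple of ints, here (2, 2, 25)."""
    if not tag.startswith("v"):
        raise ValueError("unexpected version tag: %r" % tag)
    return tuple(int(part) for part in tag[1:].split("_"))


def find_mismatches(installed, required=REQUIRED_CUT):
    """Return {product: (installed_tag_or_None, required_tag)} for every
    product that is missing or whose version differs from the cut.

    `installed` is a hypothetical mapping of product name to version tag
    that the site has to fill in itself.
    """
    mismatches = {}
    for product, wanted in required.items():
        have = installed.get(product)
        if have is None or parse_version(have) != parse_version(wanted):
            mismatches[product] = (have, wanted)
    return mismatches
```

An empty result from `find_mismatches` means every product matches the cut; any entry names a product to upgrade or install.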

    jobfiles_dataset: reco_bin7
    includes mc_runjob_v06_03_04-jim-02.tar.gz

    Set the following variables in ups tailor sam_config to the names in your local DB proxy (defaults are given in parentheses):
    D0DBSERVER_NAME (D0DbServer.user_prd)
    SMTDBSERVER_NAME (SmtDbServer.user_prd)
    CFTDBSERVER_NAME (CftDbServer.user_prd)
    CALDBSERVER_NAME (CalDbServer.user_prd)
    CPSDBSERVER_NAME (CpsDbServer.user_prd)
    MUONMDTDBSERVER_NAME (MuonMDTDbServer.user_prd)
    MUONPDTDBSERVER_NAME (MuonPDTDbServer.user_prd)
    MUONMSCDBSERVER_NAME (MuonMSCDbServer.user_prd)

    CONFIGDBSERVER_NAME (ConfigDbServer.user_prd)
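
The variable/default pairs above can be written down as a lookup table, which makes it easy to see which names a site has overridden. This is a sketch only: it assumes the variables are visible as ordinary environment variables, which is an assumption and not a description of how ups tailor stores them. The function name and the example override value are hypothetical.

```python
import os

# Variable names and defaults exactly as listed in these minutes.
DB_SERVER_DEFAULTS = {
    "D0DBSERVER_NAME": "D0DbServer.user_prd",
    "SMTDBSERVER_NAME": "SmtDbServer.user_prd",
    "CFTDBSERVER_NAME": "CftDbServer.user_prd",
    "CALDBSERVER_NAME": "CalDbServer.user_prd",
    "CPSDBSERVER_NAME": "CpsDbServer.user_prd",
    "MUONMDTDBSERVER_NAME": "MuonMDTDbServer.user_prd",
    "MUONPDTDBSERVER_NAME": "MuonPDTDbServer.user_prd",
    "MUONMSCDBSERVER_NAME": "MuonMSCDbServer.user_prd",
    "CONFIGDBSERVER_NAME": "ConfigDbServer.user_prd",
}


def resolve_db_server_names(env=None):
    """Return the effective server name for each variable.

    A value found in `env` (default: the process environment) wins;
    an unset or empty variable falls back to the default above.
    """
    if env is None:
        env = os.environ
    return {
        var: env.get(var) or default
        for var, default in DB_SERVER_DEFAULTS.items()
    }


if __name__ == "__main__":
    for var, name in sorted(resolve_db_server_names().items()):
        print("%s = %s" % (var, name))
```

Passing a plain dictionary instead of the real environment, for example `resolve_db_server_names({"D0DBSERVER_NAME": "D0DbServer.local_prd"})` with a made-up local name, shows the effect of a single override without touching the site configuration.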

  4. DØFarm: The storing problem reported last time is believed to be caused by a misconfiguration that prevents the head node from writing to enstore.
  5. GridKa: Operational again. Upgraded to the most recent versions; 10-job test successful. TN says the worker nodes shouldn't be able to connect to the DB. Log files need to be checked.
  6. Lyon: Up and running. 100-job test running. Needs to be configured to use the local DB server.
  7. SAR:
    Upgraded to last week's cut. Stuck in intra-station transport. In contact with Gabriele.
  8. WestGrid:
    Configuration for the old SAM station is progressing. The next step is to migrate to the most recent JIM cut, and thereafter to the new SAM station.
    Dugan requested 1009 CPUs*3GHz for reprocessing.
  9. Wisconsin:
    Jobs are sent to both clusters; on one cluster jobs may get preempted. Reprocessing needs to be constrained to one of the two clusters.
  10. CMS Farm:
    Joe Kayser is working with Joe Snow to get JIM up on one of the 3 gateway nodes. Once it's up, the other two gateway nodes will be set up.
  11. Prague:
  12. AOB.

Next Meeting

13-Dec-2004
Mike Diesburg, Daniel Wicke, 29-Nov-2004. Last Change 10-Dec-2004.
Diesburg@fnal.gov, Wicke@fnal.gov