Status of production
- WestGrid: Yann crashed his machine with around 400 batch jobs. Yann believes the crash has more to do with the number of jobs run beforehand.
Do we have a memory leak? GG: will check for known bugs in Globus.
Yann says 500 to 600 production jobs and 4-5 merge jobs were sufficient to crash the system.
- Lyon: machine crashes occur at around 1000 batch jobs in the queue.
No messages in /var/log/messages.
To reduce the number of batch jobs at execution sites, Gabriele will look into adding some logic to the broker.
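One possible shape for that broker logic is a per-site cap on queued jobs. This is only a sketch, not the broker's actual code: the function name, the cap values and the default are all assumptions chosen for illustration, with the caps set below the crash levels reported above. It limits queue depth only, so it would not help if the WestGrid crash really depends on the number of jobs run beforehand.

```python
# Hypothetical per-site caps on queued batch jobs (illustrative values only,
# picked to sit below the ~400 and ~1000 job levels reported in these notes).
SITE_CAPS = {"WestGrid": 300, "Lyon": 800}

# Assumed fallback for sites where no limit has been observed yet.
DEFAULT_CAP = 200


def jobs_to_submit(site, queued, pending):
    """Return how many of the `pending` jobs may be sent to `site`.

    `queued` is the number of batch jobs already in the site's queue.
    The result never pushes the queue above the site's cap.
    """
    cap = SITE_CAPS.get(site, DEFAULT_CAP)
    headroom = max(0, cap - queued)  # no headroom once the cap is reached
    return min(pending, headroom)
```

With these assumed caps, a call such as jobs_to_submit("Lyon", 750, 100) would release 50 jobs and hold the rest back until the queue drains.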