Running OK until Wednesday, provided merging and production are done separately. Merging puts a very high load on the head node because the head node acts as a NAT box. At that time fcp was not yet in use. The head node is a dual 2.4 GHz machine with 3 GB RAM. The SamGrid team will help to optimise the configuration.
From Wednesday, a new version of qstat caused problems (an inconsistent version of Python) that could not be resolved until Saturday. The grid job was held. Batch jobs that are not in the system at that moment will not find boxNNNNNN.MMMMM/sandbox. SamGrid should not require this for reprocessing. The SamGrid team will investigate.
There seems to be a problem, caused by kswapd, with the 2.4.(20?) kernel under heavy network load.
Yann and Dugan are hoping to get a dedicated NAT box.