INTRODUCTION
The whole production is run from a d0mino0x node,
where the so-called User Interface (UI), jim_client, and the d0repro tools are installed.

Log in to d0mino01 (or d0mino02, d0mino03).
In your home directory create a soft link to the working directory of your team:
ln -s /prj_root/2631/d0repro-work/TeamX d0repro-work
where X stands for your team letter A, B or C, i.e. TeamA, TeamB or TeamC (see the composition of teams).
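The link step can be wrapped so that a mistyped team letter is caught before anything is created. This is only a sketch: link_team is a hypothetical helper name, and its optional second and third arguments (base directory and link location) are assumptions added so the helper can be tried outside the real /prj_root area.

```shell
# Sketch: create the team symlink after validating the team letter.
# link_team is a hypothetical helper, not part of the d0repro tools.
link_team() (
    team="$1"
    base="${2:-/prj_root/2631/d0repro-work}"   # base path as given above
    link="${3:-$HOME/d0repro-work}"            # link lives in the home directory

    case "$team" in
        A|B|C) ;;                              # only the three teams exist
        *) echo "team must be A, B or C" >&2; exit 1 ;;
    esac

    ln -s "$base/Team$team" "$link"
)
```

For example, a member of TeamB would run: link_team B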

For the production we are going to use the so-called "Autopilot" mode of running. This means that Autopilot.daemon runs constantly in the background and takes care of job submission and status checking.
You have tools at your disposal to steer Autopilot.daemon; see the description in OPERATIONS STEERING.

AUTOPILOT
You have to start the Autopilot.daemon and keep it running at all times.
The daemon runs until your certificate expires or until your team starts a new daemon elsewhere.
Note that the setup described below only works in bash.

To get your environment variables set correctly, source the script setups-p20.sh in a bash shell (start bash first if it is not already your default shell):
bash
source d0repro-work/../setups-p20.sh
Now start your Autopilot daemon in the background (&) with the -nohup option, redirecting its output to the file Autopilot.out:
Autopilot.daemon >> Autopilot.out 2>&1 -nohup &
Now you can leave the bash shell:
exit
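The start-up steps above could be collected into one function that refuses to continue outside bash or when the setup script is missing. This is a sketch under assumptions: start_autopilot is a hypothetical name, and its two optional arguments (setup script path and daemon command) exist only so the steps can be tried without the real tools. The defaults are the commands given above and are meant to be run from your home directory.

```shell
# Sketch: the start-up steps in one place (hypothetical helper).
# Not a subshell function: the setup script must modify the current shell.
start_autopilot() {
    ap_setup="${1:-d0repro-work/../setups-p20.sh}"   # path as given above
    ap_daemon="${2:-Autopilot.daemon}"

    # The setup only works in bash, so stop early in any other shell.
    if [ -z "${BASH_VERSION:-}" ]; then
        echo "please run this from bash" >&2
        return 1
    fi
    if [ ! -r "$ap_setup" ]; then
        echo "cannot read setup script: $ap_setup" >&2
        return 1
    fi

    . "$ap_setup" || return 1

    # Same command line as above: append output to Autopilot.out, run in background.
    "$ap_daemon" >> Autopilot.out 2>&1 -nohup &
}
```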

To follow what Autopilot has done, look into the file Autopilot.out; if you want "online" information, run:
tail -f Autopilot.out
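Because the daemon stops when your certificate expires, it can be handy to check whether Autopilot.out is still being written. The sketch below assumes that a running daemon appends to the log at least every few minutes, which is not stated above; log_is_fresh and its default of 30 minutes are hypothetical. It relies on the -mmin test of find, which GNU and BSD find provide.

```shell
# Sketch: succeed if the log file was modified within the last N minutes.
# log_is_fresh and the 30-minute default are assumptions for illustration.
log_is_fresh() (
    log="${1:-Autopilot.out}"
    minutes="${2:-30}"

    [ -f "$log" ] || exit 1
    # find prints the path only if it was modified less than $minutes minutes ago
    [ -n "$(find "$log" -mmin "-$minutes")" ]
)
```

A possible use: log_is_fresh Autopilot.out 30 || echo "Autopilot.out looks stale, check the daemon"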

OPERATIONS STEERING
Important! For all our certification and test jobs, use the --test option consistently!

d0release version for tests and certification is p20.07.00
For example:
set_status.py production approved daysetname p20.07.00 --test
sub_production.py daysetname p20.07.00 --test

Important! For production, do NOT use the --test option!

d0release version for production is p20.07.01
For example:
set_status.py production approved daysetname p20.07.01
sub_production.py daysetname p20.07.01
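Since the release version and the --test flag must always change together, one way to avoid mixing them up is to derive both from a single mode word. The helper below is a sketch: repro_args is a hypothetical name, and it only prints the trailing arguments; it does not call any d0repro tool itself.

```shell
# Sketch: print the trailing arguments for test or production mode.
# repro_args is a hypothetical helper; releases are the ones quoted above.
repro_args() (
    mode="$1"
    dataset="$2"

    [ -n "$dataset" ] || { echo "usage: repro_args test|production daysetname" >&2; exit 1; }

    case "$mode" in
        test)       echo "$dataset p20.07.00 --test" ;;   # tests and certification
        production) echo "$dataset p20.07.01" ;;          # production: no --test
        *) echo "mode must be 'test' or 'production'" >&2; exit 1 ;;
    esac
)
```

It could then be used as: set_status.py production approved $(repro_args test daysetname)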

To steer Autopilot.daemon, use the set_status.py and list_status.py commands.
First run the following two setup commands in a shell of your preference:
umask 22
setup d0repro

Available commands:
list_status.py
- reports on all datasets that have been started so far

list_status.py -all
- also reports on datasets with status "new"

set_status.py production new daysetname d0release
- to "create" a request for all daysets that you want to run

set_status.py production approved daysetname d0release
- to start the production for a given dataset, i.e. the dataset will be picked up by Autopilot

set_status.py production approved daysetname d0release
- to start recovery on partially failed datasets (yes, the same command as to start production)

set_status.py merge approved daysetname d0release
- to start recovery of merge jobs; use the production or the merge form depending on which kind of job needs recovery (Autopilot may suggest that you have them resubmitted)

set_status.py production finished daysetname d0release
- to mark a dataset that has been finished
- also use this command to mark a production as finished (though incomplete) if all files missing from a production job have failed twice on the same event (beyond the 1st one) with the same exit code
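The set_status.py forms listed above can be summarised as a small dry-run helper that prints the command line and rejects any stage/status pair not shown in the list. This is a sketch: status_cmd is a hypothetical name, and other combinations may well be valid for set_status.py; the helper simply mirrors what is documented here.

```shell
# Sketch: print a set_status.py command line for the combinations listed above.
# status_cmd is a hypothetical helper; it prints only and runs nothing.
status_cmd() (
    stage="$1"; status="$2"; dataset="$3"; release="$4"

    [ -n "$dataset" ] && [ -n "$release" ] || {
        echo "usage: status_cmd stage status daysetname d0release" >&2
        exit 1
    }

    case "$stage $status" in
        "production new"|"production approved"|"production finished"|"merge approved") ;;
        *) echo "combination not listed above: $stage $status" >&2; exit 1 ;;
    esac

    echo "set_status.py $stage $status $dataset $release"
)
```

For a dataset going through the usual sequence, status_cmd would be called with "production new", then "production approved", and finally "production finished".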

For jobs for which Autopilot suggests investigation, use
"set_status.py merge/production approved daysetname d0release"
to have them resubmitted by Autopilot.
Alternatively you can also use the sub_production.py and
sub_merge.py commands, but that has subtle operational disadvantages.
The operator should watch the number of grid jobs in the system; if it drops too low, manually approve one or two more datasets to fill it up.

Description, installation and commands to operate in manual mode:
d0repro tools to run grid jobs