Difference between revisions of "Testsuite Machines"
Revision as of 08:31, 29 April 2013
This page contains notes and instructions for people running the ET testsuites on various machines. If you have experience running testsuites on a machine which is not listed here, please consider adding some information which might help others (or yourself!) in the future.
General
FIXME: this is out of date as it refers to simfactory version 1, which is obsolete!
To check out the ET:
 mkdir etrelease
 cd etrelease
 curl -O -L https://github.com/gridaphobe/CRL/raw/master/GetComponents
 chmod a+x GetComponents
 ./GetComponents --root=. -a https://svn.einsteintoolkit.org/manifest/trunk/einsteintoolkit.th
 cp simfactory/etc/defs.local.ini.simple simfactory/etc/defs.local.ini
 nano simfactory/etc/defs.local.ini
Edit defs.local.ini and replace:
 YOUR_LOGIN with your username
 YOUR@EMAIL.ADDRESS with your usual email address
 YOUR_ALLOCATION with your project allocation
 YOUR_THORNLIST with manifest/einsteintoolkit.th
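The substitutions listed above can also be scripted instead of done by hand in an editor. The following is a minimal sketch, not part of SimFactory: the function name fill_defs is hypothetical, and it assumes none of the replacement values contain the characters "|" or "&", which sed would interpret specially.

```shell
# Hypothetical helper (not part of SimFactory): replace the four
# placeholders in a defs.local.ini file.
# Usage: fill_defs FILE LOGIN EMAIL ALLOCATION THORNLIST
fill_defs() {
    file=$1 login=$2 email=$3 allocation=$4 thornlist=$5
    tmp=$(mktemp) || return 1
    # "|" is used as the sed delimiter because the thornlist path contains "/".
    # The "." in the email placeholder is escaped so it matches literally.
    sed -e "s|YOUR_LOGIN|$login|g" \
        -e "s|YOUR@EMAIL\.ADDRESS|$email|g" \
        -e "s|YOUR_ALLOCATION|$allocation|g" \
        -e "s|YOUR_THORNLIST|$thornlist|g" \
        "$file" > "$tmp" && cat "$tmp" > "$file"
    status=$?
    rm -f "$tmp"
    return $status
}
```

For example, fill_defs simfactory/etc/defs.local.ini alice alice@example.org ABC123 manifest/einsteintoolkit.th would fill in all four values, where the login, email address and allocation shown are made up.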
See the machine-specific notes below for any additional steps for each machine.
 sim sync <machine>
 sim login <machine>
 sim build
 sim create-submit ettests_1proc --testsuite --procs 16 --num-threads 16 --walltime 3:00:00
 sim create-submit ettests_2proc --testsuite --procs 16 --num-threads 8 --walltime 3:00:00
Replace 16 and 8 with the number of cores per node on the machine you are using and half that number, respectively. Remember that "procs" here means the total number of cores, not the number of MPI processes, and "num-threads" means the number of threads per process, so the number of MPI processes is procs divided by num-threads. The idea is to use one full node, i.e. all of its cores, with either one or two MPI processes.
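The arithmetic described above can be sketched as a small shell function. This is only an illustration: the function name testsuite_commands is hypothetical, the core count has to be supplied by hand (the login node may not have the same number of cores as a compute node), and the function only prints the two commands rather than running them.

```shell
# Hypothetical helper: print the two submission commands for a node
# with the given number of cores. Nothing is submitted; the commands
# are only echoed so they can be reviewed first.
testsuite_commands() {
    cores=$1
    if [ $((cores % 2)) -ne 0 ]; then
        echo "core count must be even to split a node between two processes" >&2
        return 1
    fi
    half=$((cores / 2))
    # One MPI process using all cores as threads.
    echo "sim create-submit ettests_1proc --testsuite --procs $cores --num-threads $cores --walltime 3:00:00"
    # Two MPI processes, each using half of the cores as threads.
    echo "sim create-submit ettests_2proc --testsuite --procs $cores --num-threads $half --walltime 3:00:00"
}
```

Calling testsuite_commands 16 prints the two commands shown above; for a 12-core node, testsuite_commands 12 would give 12/12 and 12/6.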
When the jobs have finished, you should have the summary.log files in:
 <simulations>/ettests_1proc/output-0000/TEST/sim/summary.log
 <simulations>/ettests_2proc/output-0000/TEST/sim/summary.log
Update the testsuite status page by adding the log files to the release-info repository:
 svn checkout https://svn.einsteintoolkit.org/www/release-info
 cd release-info
 scp <machine>:<simulations>/ettests_1proc/output-0000/TEST/sim/summary.log <machine>__1_16.log
 scp <machine>:<simulations>/ettests_2proc/output-0000/TEST/sim/summary.log <machine>__2_8.log
 svn commit <machine>*.log
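The log file names in the commands above appear to follow the pattern <machine>__<MPI processes>_<threads per process>.log. Assuming that reading is right, the name can be derived from the --procs and --num-threads values used at submission. The function names below are hypothetical and only illustrate the naming.

```shell
# Build a status-page log name from the machine name, the number of
# MPI processes and the number of threads per process.
log_name() {
    printf '%s__%s_%s.log\n' "$1" "$2" "$3"
}

# Derive the name from the --procs (total cores) and --num-threads
# values passed to "sim create-submit".
log_name_from_flags() {
    machine=$1 procs=$2 num_threads=$3
    log_name "$machine" $((procs / num_threads)) "$num_threads"
}
```

With --procs 16 --num-threads 8 on a machine called datura, log_name_from_flags datura 16 8 gives the name used for the two-process run.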
To re-run the tests with an updated checkout, run the GetComponents command above with the --update flag, rebuild, delete the "ettests_*" simulations, and resubmit the simulations.
Machines
Kraken
Edit simfactory/mdb.pm
 - myproxy-logon -p 7514 -s myproxy.teragrid.org -T -l @USER@ -o @SOURCEDIR@/.globus/proxy-teragrid
 + myproxy-logon -p 7514 -s myproxy.teragrid.org -T -l <username> -o @SOURCEDIR@/.globus/proxy-teragrid
where <username> is your user name on Kraken. This is related to Ticket #381.
Change the sourcebasedir in simfactory/udb.pm to be under /lustre/scratch. This is necessary because the Cactus directory must be visible from the compute node when running the tests (to see the parameter files, testsuite reference output and test.ccl files), and the home directory on Kraken is not visible from the compute nodes.
set_option 'kraken', 'sourcebasedir', '/lustre/scratch/USERNAME';
Copy kraken.sh as kraken-testsuite.sh in simfactory/scriptfiles. Replace the aprun command with the following:
 cd @SOURCEDIR@
 export CCTK_TESTSUITE_RUN_COMMAND="aprun -n \$nprocs -d 1 \$exe \$parfile"
 test_dir=TEST_$(date +%Y-%m-%d-%H%M%S)
 mkdir $test_dir
 CONFIGNAME=$(ls configs|tail -n 1) # SimFactory should provide the configuration name
 for i in 1 2 3; do
   case $i in
     1 )
       export CCTK_TESTSUITE_RUN_PROCESSORS=1
       export OMP_NUM_THREADS=1;;
     2 )
       export CCTK_TESTSUITE_RUN_PROCESSORS=2
       export OMP_NUM_THREADS=1;;
     3 )
       export CCTK_TESTSUITE_RUN_PROCESSORS=2
       export OMP_NUM_THREADS=2;;
   esac
   make $CONFIGNAME-testsuite PROMPT=no
   cp TEST/$CONFIGNAME/summary.log kraken__${CCTK_TESTSUITE_RUN_PROCESSORS}_${OMP_NUM_THREADS}.log
   mv TEST $test_dir/TEST.$i
 done
Datura
Copy datura.sh as datura-testsuite.sh in simfactory/scriptfiles. Replace the mpirun command with the following:
 cd @SOURCEDIR@
 test_dir=TEST_$(date +%Y-%m-%d-%H%M%S)
 mkdir $test_dir
 CONFIGNAME=$(ls configs|tail -n 1) # SimFactory should provide the configuration name
 export CACTUS_STARTTIME=$(date +%s)
 for i in 1 2 3; do
   case $i in
     1 )
       export CCTK_TESTSUITE_RUN_PROCESSORS=1
       export OMP_NUM_THREADS=1;;
     2 )
       export CCTK_TESTSUITE_RUN_PROCESSORS=2
       export OMP_NUM_THREADS=1;;
     3 )
       export CCTK_TESTSUITE_RUN_PROCESSORS=2
       export OMP_NUM_THREADS=2;;
   esac
   export CCTK_TESTSUITE_RUN_COMMAND="${MPIDIR}/bin/mpirun -v --mca btl openib,self --mca mpi_leave_pinned 0 -np \$nprocs -npernode $CCTK_TESTSUITE_RUN_PROCESSORS \$exe -L 3 \$parfile"
   make $CONFIGNAME-testsuite PROMPT=no
   cp TEST/$CONFIGNAME/summary.log datura__${CCTK_TESTSUITE_RUN_PROCESSORS}_${OMP_NUM_THREADS}.log
   mv TEST $test_dir/TEST.$i
 done
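The loop in the script above runs three combinations of processes and threads (1/1, 2/1 and 2/2) and writes one log file for each. As a quick check before submitting, the expected log names can be listed with a dry run like the one below. The function names are hypothetical, and the sketch does not run any tests; it only reproduces the naming used in the cp line of the script.

```shell
# Print one "processes threads" pair per line, matching the three
# cases of the testsuite loop: 1 1, 2 1 and 2 2.
testsuite_combos() {
    printf '%s\n' "1 1" "2 1" "2 2"
}

# Dry run: list the log file each combination would produce
# for the given machine name.
list_logs() {
    machine=$1
    testsuite_combos | while read -r procs threads; do
        echo "${machine}__${procs}_${threads}.log"
    done
}
```

Running list_logs datura prints the three file names that should exist in the source directory once the job has finished.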
BlueDrop
The following script is a slightly modified version of run_tests adapted to run on the Power7 machine BlueDrop:
 set -e
 cd Cactus
 #Blue drop (Note that we need to specify the absolute path for the bluedrop_run*.ll files):
 for i in 1 2 3; do
   case $i in
     1 )
       hostname | grep ^bd && export CCTK_TESTSUITE_RUN_COMMAND="poe \$exe \$parfile -retry -1 -llfile /home/bcmundim/bluedrop_run_1.ll"
       export CCTK_TESTSUITE_RUN_PROCESSORS=1
       export OMP_NUM_THREADS=1;;
     2 )
       hostname | grep ^bd && export CCTK_TESTSUITE_RUN_COMMAND="poe \$exe \$parfile -retry -1 -llfile /home/bcmundim/bluedrop_run_2.ll"
       export CCTK_TESTSUITE_RUN_PROCESSORS=2
       export OMP_NUM_THREADS=1;;
     3 )
       hostname | grep ^bd && export CCTK_TESTSUITE_RUN_COMMAND="poe \$exe \$parfile -retry -1 -llfile /home/bcmundim/bluedrop_run_2.ll"
       export CCTK_TESTSUITE_RUN_PROCESSORS=2
       export OMP_NUM_THREADS=2;;
   esac
   make sim-testsuite PROMPT=no
   mv TEST/sim/summary.log ../${HOSTNAME}__${CCTK_TESTSUITE_RUN_PROCESSORS}_${OMP_NUM_THREADS}.log
 done
Remember to change the absolute path (/home/bcmundim in the script above) to the location of your own LoadLeveler batch job script files, bluedrop_run_*.ll. The following batch job scripts, bluedrop_run_1.ll and bluedrop_run_2.ll respectively, were used to run the test suites:
 ## Comment the class line to run on short queue:
 #@ class = debug
 #@ job_type = parallel
 ##@ node_usage = not_shared
 #@ node_usage = shared
 #@ environment = COPY_ALL
 #@ tasks_per_node = 1
 #@ node = 1
 #@ wall_clock_limit = 0:30:00
 ### uncomment below for a normal batch job
 # #@ output = $(host).$(jobid).$(stepid).out
 # #@ error = $(host).$(jobid).$(stepid).err
 #@ queue
 ## uncomment for a normal batch job
 # $HOME/a.out

 ## Comment the class line to run on short queue:
 #@ class = debug
 #@ job_type = parallel
 ##@ node_usage = not_shared
 #@ node_usage = shared
 #@ environment = COPY_ALL
 #@ tasks_per_node = 2
 #@ node = 1
 #@ wall_clock_limit = 0:30:00
 ### uncomment below for a normal batch job
 # #@ output = $(host).$(jobid).$(stepid).out
 # #@ error = $(host).$(jobid).$(stepid).err
 #@ queue
 ## uncomment for a normal batch job
 # $HOME/a.out
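The two batch job scripts above differ only in the tasks_per_node line, so the second can be generated from the first rather than maintained by hand. A minimal sketch, assuming the file has an active line of the form "#@ tasks_per_node = N" (the function name with_tasks_per_node is hypothetical):

```shell
# Print a copy of a LoadLeveler job script with a different
# tasks_per_node value. Only active "#@" lines are changed;
# commented-out "##@" lines are left alone.
# Usage: with_tasks_per_node SOURCE.ll TASKS > TARGET.ll
with_tasks_per_node() {
    sed "s/^#@ tasks_per_node = .*/#@ tasks_per_node = $2/" "$1"
}
```

For example, with_tasks_per_node bluedrop_run_1.ll 2 > bluedrop_run_2.ll would produce the second script from the first.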