Configuring a new machine
If your machine is not supported by SimFactory already, you will need to write your own option list, run script and (for a cluster) submit script.
Machine definition
When used on a cluster, SimFactory needs detailed information about that cluster, provided in a "machine definition file" in simfactory/mdb/machines. For example, it needs to know the number of cores on each node. Copy one of the provided files and adapt it to your machine. Getting this right is nontrivial.
When using SimFactory on a laptop or workstation (i.e. a machine for which SimFactory has no machine definition file), the "sim setup" command will write a suitable machine definition file automatically.
The following is based on stampede.ini:
[stampede]
gives the name of the machine that can be used with simfactory's --machine option. It must be unique among all machine definition files.
 nickname        = stampede
 name            = Stampede
 location        = TACC
 description     = A very large Linux cluster at TACC
 webpage         = http://www.tacc.utexas.edu/user-services/user-guides/stampede-user-guide
 status          = production
These entries describe the machine. simfactory uses them when reporting on the machine, but they are otherwise arbitrary.
 hostname        = stampede.tacc.utexas.edu
 rsynccmd        = /home1/00507/eschnett/rsync-3.0.9/bin/rsync
 envsetup        = <<EOT
     module load intel/15.0.2
     module load mvapich2/2.1
     module -q load hdf5
     module load fftw3
     module load gsl
     module load boost
     module load papi
 EOT
 aliaspattern    = ^login[1234](\.stampede\.tacc\.utexas\.edu)?$
hostname is the name of the login node used by simfactory's login and remote commands. envsetup is executed before each simfactory command (in particular during the build and when running the simulation) to ensure that a consistent set of libraries is loaded. Finally, aliaspattern is a regular expression used by simfactory to identify the machine; it must match all cluster login nodes.
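Before committing a new aliaspattern it can help to check it against the host names of the login nodes. The following is a minimal sketch using Python's re module with the pattern from the excerpt above. The helper name matches_machine and the sample host names are hypothetical, and the way simfactory itself applies the pattern is assumed rather than taken from its source.

```python
import re

# aliaspattern copied from the stampede.ini excerpt above
ALIAS_PATTERN = r"^login[1234](\.stampede\.tacc\.utexas\.edu)?$"


def matches_machine(hostname, pattern=ALIAS_PATTERN):
    """Return True if the host name is recognised by the alias pattern."""
    return re.search(pattern, hostname) is not None


# Hypothetical host names, used only to exercise the pattern
for host in ["login1", "login4.stampede.tacc.utexas.edu", "login5", "c401-101"]:
    print(host, matches_machine(host))
```

Both the short and the fully qualified login node names should match, while compute nodes and unrelated hosts should not.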
 sourcebasedir   = /work/00507/@USER@
 disabled-thorns = <<EOT
     ExternalLibraries/BLAS
     ExternalLibraries/CGNS
     ExternalLibraries/curl
         LSUThorns/Twitter
     ExternalLibraries/flickcurl
         LSUThorns/Flickr
     ExternalLibraries/LAPACK
     ExternalLibraries/libxml2
     ExternalLibraries/Nirvana
         CarpetDev/CarpetIONirvana
         CarpetExtra/Nirvana
     ExternalLibraries/OpenSSL
 EOT
 enabled-thorns = <<EOT
     ExternalLibraries/OpenCL
         CactusExamples/HelloWorldOpenCL
         CactusExamples/WaveToyOpenCL
         CactusUtils/OpenCLRunTime
         CactusUtils/Accelerator
         McLachlan/ML_BSSN_CL
         McLachlan/ML_BSSN_CL_Helper
         McLachlan/ML_WaveToy_CL
     ExternalLibraries/OpenBLAS
     ExternalLibraries/pciutils
     ExternalLibraries/PETSc
         CactusElliptic/EllPETSc
         CactusElliptic/TATelliptic
         CactusElliptic/TATPETSc
 EOT
sourcebasedir is the root directory underneath which all Cactus
trees are located. It should be large enough to hold multiple compiled Cactus
checkouts. Some clusters do not provide all the libraries needed to run all
thorns in the Einstein Toolkit, or they require alternative libraries (e.g.
OpenBLAS instead of LAPACK). disabled-thorns and enabled-thorns
let you choose which thorns to enable or disable in thornlists:
enabled-thorns removes the #DISABLED marker from the matching lines in the
thornlist, while disabled-thorns adds it.
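The effect of these two settings on a thornlist can be pictured with a small sketch. It is only an illustration of the behaviour described above, not simfactory's implementation: the function name apply_thorn_settings, the exact "#DISABLED " prefix format and the sample thornlist are assumptions.

```python
def apply_thorn_settings(thornlist_lines, enabled, disabled):
    """Toggle the '#DISABLED ' prefix on thornlist lines (illustrative only)."""
    marker = "#DISABLED "
    result = []
    for line in thornlist_lines:
        stripped = line.strip()
        is_disabled = stripped.startswith(marker)
        # Thorn name without the marker, e.g. "ExternalLibraries/LAPACK"
        thorn = stripped[len(marker):].strip() if is_disabled else stripped
        if thorn in disabled and not is_disabled:
            result.append(marker + thorn)   # disabled-thorns adds the marker
        elif thorn in enabled and is_disabled:
            result.append(thorn)            # enabled-thorns removes the marker
        else:
            result.append(line)             # everything else is left untouched
    return result


# Hypothetical thornlist fragment
thornlist = [
    "ExternalLibraries/LAPACK",
    "#DISABLED ExternalLibraries/OpenBLAS",
    "ExternalLibraries/HDF5",
]
new_list = apply_thorn_settings(
    thornlist,
    enabled={"ExternalLibraries/OpenBLAS"},
    disabled={"ExternalLibraries/LAPACK"},
)
print("\n".join(new_list))
```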
 optionlist      = stampede-mvapich2.cfg
 submitscript    = stampede.sub
 runscript       = stampede-mvapich2.run
 make            = make -j8
optionlist, submitscript and runscript are
used when compiling and submitting simulations and are described in detail
below. You can find examples in the subdirectories of simfactory/mdb.
make is the command used to compile the code. It can contain
extra arguments, for example to enable parallel compilation.
The final set of options deals with the queuing system and characteristics of the machine.
 basedir         = /scratch/00507/@USER@/simulations
 cpu             = Intel(R) Xeon(R) CPU E5-2680 0 @ 2.70GHz
 cpufreq         = 2.7
 flop/cycle      = 8
 ppn             = 16
 spn             = 2
 mpn             = 2
 max-num-threads = 16
 num-threads     = 8
 memory          = 32768
 nodes           = 6400
 min-ppn         = 16
 allocation      = NO_ALLOCATION
 queue           = normal        # [normal, large, development]
 maxwalltime     = 48:00:00      # development has 4:0:0
 maxqueueslots   = 25            # there are 50, but jobs can take 2 slots
basedir is the root directory under which all simulations are
created. It should live on a fast, parallel file system. cpu,
cpufreq, flop/cycle, spn and mpn
are currently unused by simfactory. spn is the number of CPU
sockets per node, and mpn is the number of NUMA domains per node
(memory sockets). ppn is the number of cores per
node (historically called processors, hence the p), which
simfactory passes to the queuing system to request a certain number of
cores per node. max-num-threads is the maximum number of threads
that can be used, typically the same as ppn, and
num-threads is the default number of threads used, often the number
of cores in a single NUMA domain. min-ppn is the
minimum number of cores that need to be requested. It is often identical to
ppn if the queuing system does not handle under-subscribing a node.
memory is currently only used by simfactory's distribute
utility script, and nodes is
only used to abort if more nodes are requested than the cluster has.
allocation gives the allocation to which to charge runs on clusters
where computer time is accounted for (which is almost all clusters at
computing centres). queue is the default queue to submit to, often
named "default", "batch", "production" or similar. maxwalltime is
the maximum allowed run time for a single job. If a longer-running simulation
is requested, simfactory automatically splits it up into chunks no
longer than maxwalltime. maxqueueslots is the maximum
number of jobs that can be queued at the same time, a limit imposed on some
clusters to reduce the load on the queue scheduler.
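The splitting of a long simulation into chunks bounded by maxwalltime amounts to simple arithmetic. The sketch below assumes a greedy split in which every chunk except possibly the last uses the full maxwalltime; how simfactory actually distributes the time is not specified here, and the function names are hypothetical.

```python
def parse_walltime(text):
    """Convert a walltime string 'H:M:S' into seconds."""
    hours, minutes, seconds = (int(part) for part in text.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def split_walltime(requested, maxwalltime):
    """Split the requested walltime into chunks no longer than maxwalltime.

    Assumption: a greedy split, full-length chunks first, remainder last.
    """
    total = parse_walltime(requested)
    limit = parse_walltime(maxwalltime)
    chunks = []
    while total > 0:
        chunk = min(total, limit)
        chunks.append(chunk)
        total -= chunk
    return chunks


# A 120 hour request with maxwalltime = 48:00:00 from the excerpt above
chunks = split_walltime("120:00:00", "48:00:00")
print([c // 3600 for c in chunks])  # chunk lengths in hours: [48, 48, 24]
```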
Please see SimFactory's online documentation for the exact definition of the terms that SimFactory uses to refer to cores, nodes, CPUs, processing units etc.
 submit          = sbatch @SCRIPTFILE@; sleep 5 # sleep 60
 getstatus       = squeue -j @JOB_ID@
 stop            = scancel @JOB_ID@
 submitpattern   = Submitted batch job ([0-9]+)
 statuspattern   = '@JOB_ID@ '
 queuedpattern   = ' PD '
 runningpattern  = ' (CF|CG|R|TO) '
 holdingpattern  = ' S '
 #exechost       = head -n 1 SIMFACTORY/NODES
 #exechostpattern = ^(\S+)
 stdout          = cat @SIMULATION_NAME@.out
 stderr          = cat @SIMULATION_NAME@.err
 stdout-follow   = tail -n 100 -f @SIMULATION_NAME@.out @SIMULATION_NAME@.err
submit, getstatus and stop are the commands that
simfactory uses to submit a new job to the queuing system, query the
status of a running job and cancel a running job. You can use
@JOB_ID@ to refer to the job's identifier in them. The respective
pattern variables are regular expressions that simfactory matches
against the output of the commands. submitpattern must capture
(enclose in parentheses) the actual job id so that it can be referred to as
the first captured group (\1 in sed, $1 in Perl). statuspattern
is used to select the line in getstatus's output that contains
the actual job state information. The queuedpattern,
runningpattern and holdingpattern patterns are used
to identify job states; whichever matches first (in the order listed above)
determines the job state. stdout, stderr and
stdout-follow are commands that simfactory executes during
sim show-output to obtain the simulation's log output;
stdout-follow is used when the --follow option is
specified to show the output of a running segment. On some clusters the log
output for running segments is not directly accessible from the head nodes, and
simfactory has to first log into one of the cluster nodes to gain access to
the output. exechost and exechostpattern give the
command and regular expression used to obtain the host to log into. If they are
specified, stdout-follow is executed on that host, otherwise on
the head node.
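The interplay of these patterns can be tried out on sample scheduler output before submitting real jobs. The sketch below mirrors the matching order described above using Python's re module. It is an illustration under assumptions, not simfactory's code: the function names, the "unknown" fallback and the sample sbatch and squeue output are hypothetical, and the quotes in the pattern values are assumed to only delimit the surrounding spaces.

```python
import re

# Patterns copied from the excerpt above (quotes removed, spaces kept)
SUBMIT_PATTERN = r"Submitted batch job ([0-9]+)"
STATUS_PATTERN = "@JOB_ID@ "
STATE_PATTERNS = [
    ("queued", r" PD "),
    ("running", r" (CF|CG|R|TO) "),
    ("holding", r" S "),
]


def extract_job_id(submit_output):
    """Return the job id captured by the first group of submitpattern."""
    match = re.search(SUBMIT_PATTERN, submit_output)
    return match.group(1) if match else None


def job_state(job_id, status_output):
    """Select the status line for the job, then test the state patterns in order."""
    status_re = STATUS_PATTERN.replace("@JOB_ID@", re.escape(job_id))
    for line in status_output.splitlines():
        if re.search(status_re, line):
            for state, pattern in STATE_PATTERNS:
                if re.search(pattern, line):
                    return state
    return "unknown"  # hypothetical fallback when nothing matches


# Hypothetical command output
submit_output = "Submitted batch job 1234567"
status_output = (
    "  JOBID PARTITION     NAME     USER ST       TIME  NODES NODELIST(REASON)\n"
    "1234567    normal   bbh-01    alice  R    1:02:03      4 c401-[101-104]\n"
)

job_id = extract_job_id(submit_output)
print(job_id, job_state(job_id, status_output))
```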
Option list
The options provided by Cactus are described in the Cactus documentation. This page provides additional information and recommendations.
The following is based on the ubuntu.cfg option list, which can be found in simfactory/mdb/optionlists. It is usually best to start from the option list of a machine similar to the one you would like to set up, e.g. one that uses the same MPI stack and compilers.
VERSION = 2012-09-28
Cactus will reconfigure when the VERSION string changes.
Compilers
 CPP = cpp
 FPP = cpp
 CC  = gcc
 CXX = g++
 F77 = gfortran
 F90 = gfortran
The C and Fortran preprocessors, and the C, C++, Fortran 77 and Fortran 90 compilers, are specified by these options. You can specify a full path if the compiler you want to use is not available on your default path. Note that it is strongly recommended to use compilers from the same family; e.g. don't mix the Intel C Compiler with the GNU Fortran Compiler.
Compilation and linking flags
 CPPFLAGS = -DMPICH_IGNORE_CXX_SEEK
 FPPFLAGS = -traditional
 CFLAGS   = -g3 -march=native -std=gnu99
 CXXFLAGS = -g3 -march=native -std=gnu++11
 F77FLAGS = -g3 -march=native -fcray-pointer -m128bit-long-double -ffixed-line-length-none
 F90FLAGS = -g3 -march=native -fcray-pointer -m128bit-long-double -ffixed-line-length-none
 LDFLAGS  = -rdynamic
Cactus thorns can be written in C or C++. Cactus supports the C99 and C++11 standards respectively. Additionally, the Einstein Toolkit requires some of the GNU extensions provided by the options -std=gnu99 / -std=gnu++11. Without these options, each required feature needs to be enabled manually, which is quite error prone. Both GNU and Intel compilers support these options.
-g3 ensures that debugging symbols are included in the object files. It is not necessary to set DEBUG = yes to get debugging symbols.
The -rdynamic linker flag ensures that additional information is available in the executable for producing backtraces at runtime in the event of an internal error.
 LIBDIRS =
 C_LINE_DIRECTIVES = yes
 F_LINE_DIRECTIVES = yes
Debugging
 DEBUG           = no
 CPP_DEBUG_FLAGS = -DCARPET_DEBUG
 FPP_DEBUG_FLAGS = -DCARPET_DEBUG
 C_DEBUG_FLAGS   = -O0
 CXX_DEBUG_FLAGS = -O0
 F77_DEBUG_FLAGS = -O0
 F90_DEBUG_FLAGS = -O0
When DEBUG = yes is set (e.g. on the make command line or with SimFactory's --debug option), these debug flags are used. The intention here is to disable optimisation and enable additional code which may slow down execution but makes the code easier to debug.
Optimisation
 OPTIMISE           = yes
 CPP_OPTIMISE_FLAGS = # -DCARPET_OPTIMISE -DNDEBUG
 FPP_OPTIMISE_FLAGS = # -DCARPET_OPTIMISE -DNDEBUG
 C_OPTIMISE_FLAGS   = -O2 -ffast-math
 CXX_OPTIMISE_FLAGS = -O2 -ffast-math
 F77_OPTIMISE_FLAGS = -O2 -ffast-math
 F90_OPTIMISE_FLAGS = -O2 -ffast-math
Profiling
 PROFILE           = no
 CPP_PROFILE_FLAGS =
 FPP_PROFILE_FLAGS =
 C_PROFILE_FLAGS   = -pg
 CXX_PROFILE_FLAGS = -pg
 F77_PROFILE_FLAGS = -pg
 F90_PROFILE_FLAGS = -pg
OpenMP
 OPENMP           = yes
 CPP_OPENMP_FLAGS = -fopenmp
 FPP_OPENMP_FLAGS = -fopenmp
 C_OPENMP_FLAGS   = -fopenmp
 CXX_OPENMP_FLAGS = -fopenmp
 F77_OPENMP_FLAGS = -fopenmp
 F90_OPENMP_FLAGS = -fopenmp
Warnings
 WARN           = yes
 CPP_WARN_FLAGS = -Wall
 FPP_WARN_FLAGS = -Wall
 C_WARN_FLAGS   = -Wall
 CXX_WARN_FLAGS = -Wall
 F77_WARN_FLAGS = -Wall
 F90_WARN_FLAGS = -Wall
ExternalLibraries
The Einstein Toolkit thorns use a variety of third-party libraries like MPI or HDF5. These are usually provided by helper thorns in the ExternalLibraries arrangement. As a general rule, to enable a capability FOO add
ExternalLibraries/FOO
to your ThornList and set FOO_DIR to the directory where the include and lib directories are found.
HDF5
If no HDF5 options are given, then HDF5 will be used if it can be automatically detected in standard locations, and will be built from the source package in the HDF5 thorn if not. Alternatively, you can set HDF5_DIR to point to an HDF5 installation, for example:
HDF5_DIR = /usr/local/hdf5-1.9.1
The following options disable support for Fortran and C++ when building HDF5, as it is not required by the Einstein Toolkit.
 HDF5_ENABLE_FORTRAN = no
 HDF5_ENABLE_CXX     = no
MPI
 MPI_DIR      = /usr
 MPI_INC_DIRS = /usr/include/mpich2
 MPI_LIB_DIRS = /usr/lib
 MPI_LIBS     = mpich fmpich mpl
The correct values to use can be difficult to find out since MPI comes with its own compiler wrappers mpicc etc. that are expected to be used. 
Note that the thorn ExternalLibraries/MPI will often be able to determine the correct options if you set MPI_DIR = NO_BUILD, ensure that mpicc (or similar) is found in $PATH and leave the other variables unset.
If auto-detection fails (not unlikely on clusters), the MPI compiler wrapper often comes with options to query the values to use here.
| MPI stack | Options | Comments |
|---|---|---|
| OpenMPI | mpicc -showme:compile and mpicc -showme:link | |
| mpich (and mvapich) | mpicc -compile_info and mpicc -link_info | |
| Intel | mpiicc -compile_info and mpiicc -link_info | impi is a derivative of mpich. Note that the wrapper is named mpiicc (with an extra "i"). |
| Cray | | Please use the cc, CC, ftn wrappers and load the correct modules. |
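The flags printed by such a wrapper query can be sorted into the MPI_INC_DIRS, MPI_LIB_DIRS and MPI_LIBS values by hand, or with a few lines of script. The following is a minimal sketch that only handles the attached forms -Idir, -Ldir and -lname and ignores all other tokens. The function name and the sample wrapper output are hypothetical; real wrapper output varies between MPI stacks and should be inspected before use.

```python
import shlex


def parse_wrapper_flags(flag_line):
    """Sort -I, -L and -l flags from a wrapper query into option list values."""
    inc_dirs, lib_dirs, libs = [], [], []
    for token in shlex.split(flag_line):
        if token.startswith("-I"):
            inc_dirs.append(token[2:])
        elif token.startswith("-L"):
            lib_dirs.append(token[2:])
        elif token.startswith("-l"):
            libs.append(token[2:])
        # Anything else (compiler name, -Wl,... options) is ignored here
    return {
        "MPI_INC_DIRS": " ".join(inc_dirs),
        "MPI_LIB_DIRS": " ".join(lib_dirs),
        "MPI_LIBS": " ".join(libs),
    }


# Hypothetical output of a wrapper query such as "mpicc -link_info"
sample = "gcc -I/usr/include/mpich2 -L/usr/lib -lmpich -lmpl"
options = parse_wrapper_flags(sample)
for key, value in options.items():
    print(key, "=", value)
```

The printed lines can then be compared with, or pasted into, the MPI section of the option list.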
Others
PTHREADS_DIR = NO_BUILD
Submission script
The submission script is used to submit a job to the queueing system. See the examples in simfactory/mdb/submitscripts, and create a new one for your cluster that uses the same queueing system.
Run script
The most important part of the run script is usually the set of modules that need to be loaded, and the mpirun command to use on the machine. See the examples in simfactory/mdb/runscripts, and create a new one for your cluster that is similar to one that already exists.
