Difference between revisions of "Simulation Factory Advanced Tutorial"
(→Configuration) |
|||
(113 intermediate revisions by 46 users not shown) | |||
Line 1: | Line 1: | ||
− | The Simulation Factory | + | The Simulation Factory simplifies many aspects of running Cactus-based |
− | facility for managing | + | simulations. It provides a central facility for managing authoritative |
− | + | source tree versions, providing convenient access to remote HPC | |
− | way from | + | systems, building Cactus source configurations, and managing |
+ | simulations all the way from submission to archiving their output. | ||
+ | |||
+ | |||
== Getting Started == | == Getting Started == | ||
− | |||
− | |||
− | + | To begin using The Simulation Factory, it needs to be checked out from | |
+ | '''git'''. The Simulation Factory is typically placed into a | ||
+ | '''simfactory''' folder inside a Cactus source tree. This can be | ||
+ | accomplished with the following '''git''' command: | ||
+ | |||
+ | git clone -b master https://bitbucket.org/simfactory/simfactory2.git | ||
− | The Simulation Factory | + | The Simulation Factory could also be placed in an independent location |
+ | to be used with multiple Cactus source trees. This approach will be | ||
+ | described later. | ||
== Initial Setup == | == Initial Setup == | ||
− | Once | + | Once the Simulation Factory has been checked out from git, the next |
− | + | step is to configure it, telling it e.g. about your user name. The | |
+ | Simulation Factory comes with a convenient method to generate | ||
+ | some simple defaults to get started. To begin, issue the command | ||
− | + | simfactory/bin/sim setup | |
− | + | ||
+ | Setup will prompt for a username, an email address, and an allocation. It will allow you | ||
+ | to enter additional configuration from other machines, but this is advanced, and can be | ||
+ | safely ignored. Setup will also generate a machine database entry for your local machine | ||
+ | based upon the generic machine database entry. | ||
+ | |||
+ | === Additional Configurations === | ||
+ | |||
+ | The Simulation Factory contains a database known as the Machine | ||
+ | Database. This collection of information describes all the aspects | ||
+ | that are unique about each individual HPC system, so that the | ||
+ | Simulation Factory can provide a common interface for all systems that | ||
+ | hides these differences. | ||
+ | |||
+ | The Machine Database consists of different sections, one for each | ||
+ | machine. The section name is given in square brackets, e.g. | ||
+ | '''[comet]'''. There is a special section '''[default]''' that | ||
+ | provides default values for those properties that are not explicitly | ||
+ | set in the machine-specific entries. | ||
− | + | The Machine Database is an authoritative collection of information, | |
− | + | and is generally not meant to contain modifications that are only |
− | + | relevant for individual people. These local modifications are instead | |
− | + | maintained in the file '''simfactory/etc/defs.local.ini''' described | |
− | + | above, where one can add, change, or overwrite properties of Machine | |
− | + | Database entries. For instance, if an alternative username, | |
+ | allocation, and sourcebasedir are needed for the machine | ||
+ | '''comet''', you would add the following section there: | ||
− | == | + | [comet] |
+ | user = COMET_USERNAME | ||
+ | allocation = COMET_ALLOCATION | ||
− | + | There are several macros that help simplify configuration entries. |
+ | The most useful is probably '''@USER@''', which expands to the | ||
+ | '''user''' property of the Machine Database entry. | ||
− | + | For example, if you are using the same user name on many systems, but | |
− | + | have a different user name on some systems, then you would set the | |
− | + | common user name in the '''[default]''' section, and override this for | |
− | + | those machines where your user name differs. The example | |
+ | '''simfactory/etc/defs.local.ini.complex''' has examples for this. | ||
− | + | Most of the macros available in the Simulation Factory are described | |
+ | in the section [[#Macros]] below. | ||
− | + | The command | |
− | simfactory/sim list-machines | + | simfactory/bin/sim list-machines |
+ | |||
+ | outputs a list of all preconfigured machines that the Simulation | ||
+ | Factory knows about. | ||
− | === Local Workstation Configuration === | + | === Extended Local Workstation Configuration === |
− | + | After determining the hostname of your machine with the command: |
− | machine | ||
− | |||
hostname | hostname | ||
− | + | It is then possible to edit the machine file simfactory/mdb/machines/<hostname>.ini, which was created via the command: |
+ | |||
+ | sim setup | ||
+ | |||
+ | Then edit '''simfactory/mdb/machines/<hostname>.ini'''. From here, all values can be customized. | ||
+ | |||
+ | == Accessing Remote Systems == | ||
+ | |||
+ | The Simulation Factory simplifies access to remote systems, both for | ||
+ | transferring files and logging in. You can synchronise (replicate) an | ||
+ | authoritative version of your Cactus source tree to remote systems, | ||
+ | obtain an interactive shell, or execute commands. | ||
+ | |||
+ | === Information Commands === | ||
+ | |||
+ | The following commands can be used to discover information about a | ||
+ | machine, or list all known machines. | ||
+ | |||
+ | List all known machines: | ||
+ | |||
+ | simfactory/bin/sim list-machines | ||
+ | |||
+ | Print the complete Machine Database to the screen: | ||
+ | |||
+ | simfactory/bin/sim print-mdb | ||
+ | |||
+ | Print the Machine Database entry for a single machine: | ||
+ | |||
+ | simfactory/bin/sim print-mdb <machine> | ||
+ | |||
+ | Print the name of the machine on which the Simulation Factory is | ||
+ | currently being executed: | ||
+ | |||
+ | simfactory/bin/sim whoami | ||
+ | |||
+ | === Syncing === | ||
+ | |||
+ | Historically, Cactus and the Einstein Toolkit have not been installed | ||
+ | into a central location on each machine, but are instead built | ||
+ | on-demand by every user for a certain thorn list. (One of the | ||
+ | advantages is that people can thus easily add their own thorns.) To | ||
+ | help with this approach, the Simulation Factory provides a facility to | ||
+ | synchronize a Cactus user's local, authoritative source tree to remote | ||
+ | HPC systems, where it can then be compiled and run. | ||
+ | |||
+ | Remote access is implemented on top of ssh and other ssh-like | ||
+ | mechanisms such as gsi-ssh. Currently, you must still manage all ssh | ||
+ | keys and passwords manually. (We highly recommend using ssh keychain | ||
+ | and ssh agents to avoid having to enter passwords multiple times.) | ||
+ | |||
+ | ==== Configuration ==== | ||
+ | |||
+ | Before syncing a source tree to a remote system, a small amount of | ||
+ | configuration must be performed. It is necessary to either verify that | ||
+ | the defaults are correct, or to define the correct values for the | ||
+ | following keys for the remote system in the Machine Database: | ||
+ | |||
+ | <ul> | ||
+ | <li> '''sourcebasedir''' | ||
+ | <ul> | ||
+ | <li>The root directory under which the Cactus source tree will reside | ||
+ | </ul> | ||
+ | <li> '''basedir''' | ||
+ | <ul> | ||
+ | <li>The root directory in which all simulation output will reside | ||
+ | </ul> | ||
+ | <li> '''user''' | ||
+ | <ul> | ||
+ | <li>The user name on the remote system | ||
+ | </ul> | ||
+ | </ul> | ||
+ | |||
+ | You can output the currently configured values by issuing the command | ||
+ | |||
+ | simfactory/bin/sim print-mdb <machine> | ||
+ | |||
+ | If you need to change these values, then edit (on the local system) | ||
+ | the file '''simfactory/etc/defs.local.ini''' and add a section for the | ||
+ | remote machine. This entry will augment the existing Machine Database | ||
+ | entry, updating or replacing the corresponding values. An example for | ||
+ | the machine '''comet''' can be seen in the [[#Additional | ||
+ | Configurations]] section. | ||
+ | |||
+ | To see the list of files and directories that are synchronized, look | ||
+ | at '''simfactory/etc/defs.ini''' and find the following two keys | ||
+ | |||
+ | <ul> | ||
+ | <li> '''sync-sources''' | ||
+ | <ul> | ||
+ | <li>The list of files and directories that will be transferred when | ||
+ | the option '''--sync-sourcetree''' is enabled (on by default) | ||
+ | </ul> | ||
+ | <li> '''sync-parfiles''' | ||
+ | <ul> | ||
+ | <li>The list of files and directories that will be copied when the | ||
+ | option '''--sync-parfiles''' is enabled (also on by default). This | ||
+ | list of files typically includes just parameter files. | ||
+ | </ul> | ||
+ | </ul> | ||
+ | |||
+ | You can control which files are included or excluded using '''simfactory/etc/filter.local.rules''' which uses '''rsync''' filter rules syntax. | ||
− | + | ==== Performing a Sync ==== | |
− | + | A sync command takes two options, both of which default to '''true'''. | |
<ul> | <ul> | ||
− | <li> ''' | + | <li> '''sync-sourcetree''' |
<ul> | <ul> | ||
− | <li> | + | <li>Synchronise the complete source tree, as specified in the |
+ | aforementioned '''sync-sources''' configuration entry. This takes a | ||
+ | few seconds or minutes, depending on the connection. | ||
</ul> | </ul> | ||
− | <li> ''' | + | <li> '''sync-parfiles''' |
− | < | + | <ul> |
− | <li> | + | <li>Synchronise parameter files, as specified by the aforementioned |
− | + | '''sync-parfiles''' configuration entry. This is typically faster |
+ | than synchronising the source tree. | ||
</ul> | </ul> | ||
+ | </ul> | ||
+ | |||
+ | Usually, you would issue the command: | ||
+ | |||
+ | simfactory/bin/sim sync <machine> | ||
+ | |||
+ | To synchronise only parfiles, you can negate the | ||
+ | '''--sync-sourcetree''' argument with the following command | ||
+ | |||
+ | simfactory/bin/sim sync <machine> --nosync-sourcetree | ||
− | + | If you want to synchronise not from the local machine, but from | |
+ | another remote machine, then use | ||
− | + | simfactory/bin/sim --remote <frommachine> sync <tomachine> | |
+ | |||
+ | This executes the synchronisation command on the machine | ||
+ | '''<frommachine>'''. | ||
− | |||
− | |||
=== Remote Login === | === Remote Login === | ||
− | |||
− | == | + | The Simulation Factory provides the ability to log in to a remote |
+ | system. This is initiated with the command | ||
+ | |||
+ | simfactory/bin/sim login <machine> | ||
+ | |||
+ | This will automatically cd into the Cactus directory on the remote | ||
+ | system. | ||
+ | |||
+ | === Local/Remote Command Execution === | ||
+ | |||
+ | To execute a command (locally) via the Simulation Factory, use the command | ||
+ | |||
+ | simfactory/bin/sim execute <command> | ||
+ | |||
+ | The command will be executed in the Cactus directory on the local | ||
+ | system. | ||
+ | |||
+ | If the command is complex, and requires arguments, the command must be | ||
+ | quoted. For example | ||
+ | |||
+ | simfactory/bin/sim execute 'ls -al' | ||
+ | |||
+ | To execute a remote command, use the command | ||
+ | |||
+ | simfactory/bin/sim --remote <machine> execute <command> | ||
− | == | + | An example of a complex command being executed remotely is |
+ | |||
+ | simfactory/bin/sim --remote queenbee execute 'find . -name "*.py" -exec sed -i.bk s/foo/bar/g {} \;' | ||
+ | |||
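The quoting requirement is ordinary shell behaviour rather than anything specific to the Simulation Factory: without quotes, the local shell splits the command into separate words (and may expand wildcards) before '''sim''' ever sees it. The following sketch illustrates the word splitting with a stand-in function, count_args, which is hypothetical and only reports how many arguments it received.

```shell
#!/bin/sh
# count_args is a hypothetical stand-in for 'sim execute'; it only
# prints the number of arguments it was called with.
count_args() {
    echo "$#"
}

unquoted=$(count_args ls -al)    # the shell passes two separate words
quoted=$(count_args 'ls -al')    # the shell passes one word containing a space

echo "unquoted: $unquoted argument(s)"
echo "quoted:   $quoted argument(s)"
```

Single quotes also keep patterns such as *.py from being expanded on the local machine, so that they can be interpreted where the command actually runs.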
+ | == Build Cactus Configurations == | ||
+ | |||
+ | The Simulation Factory provides a central facility for configuring and | ||
+ | building Cactus source trees. When a Cactus source tree is compiled, | ||
+ | the Simulation Factory creates a '''configuration''' for the compiled | ||
+ | executable, storing with it related information such as the Cactus | ||
+ | options list, and the scripts necessary to submit and run jobs in a | ||
+ | queuing system. This configuration is thus a self-contained entity | ||
+ | containing everything that is necessary to perform Cactus simulations. | ||
=== Information Commands === | === Information Commands === | ||
+ | |||
+ | To list all existing Cactus configurations, use the following command | ||
+ | |||
+ | simfactory/bin/sim list-configurations | ||
+ | |||
+ | |||
+ | or, for a remote machine, | ||
+ | |||
+ | simfactory/bin/sim --remote <machine> list-configurations | ||
+ | |||
=== Building a Configuration === | === Building a Configuration === | ||
− | === | + | |
+ | To build a configuration, four pieces of information are required: | ||
+ | |||
+ | <ul> | ||
+ | <li>Thorn list | ||
+ | <ul> | ||
+ | <li>This defines which thorns are to be included into the configuration. | ||
+ | <li>Default: '''thornlist''' parameter of the Machine Database entry | ||
+ | <li>Override: '''--thornlist=<thornlist>''' | ||
+ | <li>The default thorn list is probably not useful in many cases. | ||
+ | </ul> | ||
+ | <li>Option List | ||
+ | <ul> | ||
+ | <li>This specifies the compiler and build options that need to be used | ||
+ | to build Cactus on a particular system. | ||
+ | <li>Default: '''optionlist''' parameter of the Machine Database entry | ||
+ | <li>Override: '''--optionlist=<optionlist>''' | ||
+ | <li>The Simulation Factory is supposed to contain good, working | ||
+ | default option lists for all supported systems. In fact, this is one | ||
+ | of the main strengths of the Simulation Factory. You should normally | ||
+ | not need to override the default. | ||
+ | </ul> | ||
+ | <li>Submission Script | ||
+ | <ul> | ||
+ | <li>This specifies how to submit a job to the queueing system on a | ||
+ | particular system. | ||
+ | <li>Default: '''submitscript''' parameter of the Machine Database entry | ||
+ | <li>Override: '''--submitscript=<submitscript>''' | ||
+ | <li>Similar to the option list, the Simulation Factory is supposed to | ||
+ | contain good, working default submission scripts for all supported | ||
+ | systems. | ||
+ | </ul> | ||
+ | <li>Run Script | ||
+ | <ul> | ||
+ | <li>This specifies how to execute an MPI process on a particular | ||
+ | system; it is closely connected to the submission script. | ||
+ | <li>Default: '''runscript''' parameter of the Machine Database entry | ||
+ | <li>Override: '''--runscript=<runscript>''' | ||
+ | <li>Same as with the submission script, the Simulation Factory is | ||
+ | supposed to contain good, working default run scripts for all | ||
+ | supported systems. | ||
+ | </ul> | ||
+ | </ul> | ||
+ | |||
+ | To build a configuration with a specific thornlist, issue the | ||
+ | following command: | ||
+ | |||
+ | simfactory/bin/sim build [<configurationname>] --thornlist=<thornlist> | ||
+ | |||
+ | or, for a remote machine, | ||
+ | |||
+ | simfactory/bin/sim --remote <machine> build [<configurationname>] --thornlist=<thornlist> | ||
+ | |||
+ | If you choose to omit the configuration name, it will default to | ||
+ | 'sim'. (We recommend this.) You can in addition specify any of the | ||
+ | options | ||
+ | <ul> | ||
+ | <li>--debug | ||
+ | <li>--optimise (default) | ||
+ | <li>--profile | ||
+ | </ul> | ||
+ | These options create configurations with debugging, optimisation (this | ||
+ | is the default), or profiling enabled. | ||
+ | If any of these options is specified, then the configuration name will | ||
+ | be modified correspondingly, e.g. to 'sim-debug'. | ||
+ | |||
+ | In order to modify any option not listed above, it is necessary to edit and modify the corresponding option file stored in Simfactory. | ||
+ | |||
+ | ==== Additional Options ==== | ||
+ | |||
+ | <ul> | ||
+ | |||
+ | <li>'''--reconfig''' | ||
+ | <ul> | ||
+ | <li>Reconfigure before building, i.e. re-examine the configuration | ||
+ | options and re-run the CST stage. This happens automatically when | ||
+ | the option list changes. | ||
+ | </ul> | ||
+ | |||
+ | <li>'''--clean''' | ||
+ | <ul> | ||
+ | <li>Clean the configuration (remove all object files etc.) before | ||
+ | building. | ||
+ | </ul> | ||
+ | |||
+ | </ul> | ||
+ | |||
+ | === Script Locations === | ||
+ | |||
+ | The Simulation Factory provides default scripts for all its | ||
+ | preconfigured machines. These scripts can be found in the following | ||
+ | locations | ||
+ | |||
+ | <ul> | ||
+ | |||
+ | <li>'''Option Lists''' | ||
+ | <ul> | ||
+ | <li>MDB Key: optionlist | ||
+ | <li>Location: simfactory/mdb/optionlists | ||
+ | </ul> | ||
+ | |||
+ | <li>'''Submit Scripts''' | ||
+ | <ul> | ||
+ | <li>MDB Key: submitscript | ||
+ | <li>Location: simfactory/mdb/submitscripts | ||
+ | </ul> | ||
+ | |||
+ | <li>'''Run Scripts''' | ||
+ | <ul> | ||
+ | <li>MDB Key: runscript | ||
+ | <li>Location: simfactory/mdb/runscripts | ||
+ | </ul> | ||
+ | |||
+ | </ul> | ||
+ | |||
+ | To determine, for instance, which option list Queen Bee uses by | ||
+ | default, issue the command | ||
+ | |||
+ | simfactory/bin/sim print-mdb queenbee | grep optionlist | ||
== Managing Simulations == | == Managing Simulations == | ||
+ | |||
+ | The Simulation Factory provides a convenient, consistent facility for | ||
+ | submitting, running, and managing simulations. This is accomplished | ||
+ | through two main commands '''submit''' and '''run'''. | ||
=== Information Commands === | === Information Commands === | ||
+ | |||
+ | The status of all simulations on a particular machine can be seen with | ||
+ | the following command | ||
+ | |||
+ | simfactory/bin/sim list-simulations | ||
+ | |||
+ | If a more detailed look at each simulation is required, the verbose | ||
+ | option can be specified | ||
+ | |||
+ | simfactory/bin/sim list-simulations --verbose | ||
+ | |||
=== Submitting a Simulation === | === Submitting a Simulation === | ||
+ | |||
+ | Four primary pieces of information are necessary when submitting a | ||
+ | simulation to the host queuing system. They are | ||
+ | |||
+ | <ul> | ||
+ | |||
+ | <li>'''Configuration''' | ||
+ | <ul> | ||
+ | <li>The Cactus configuration to run | ||
+ | <li>'''option''': --configuration | ||
+ | <li>'''default''': "sim" | ||
+ | </ul> | ||
+ | |||
+ | <li>'''Parfile''' | ||
+ | <ul> | ||
+ | <li>The Cactus parameter file to use | ||
+ | <li>'''option''': --parfile | ||
+ | </ul> | ||
+ | |||
+ | <li>'''Walltime''' | ||
+ | <ul> | ||
+ | <li>The total amount of wall time required | ||
+ | <li>'''option''': --walltime | ||
+ | <li>'''default''': MDB Key '''maxwalltime''' | ||
+ | </ul> | ||
+ | |||
+ | <li>'''Processors''' | ||
+ | <ul> | ||
+ | <li>The total number of processors to use | ||
+ | <li>'''option''': --procs | ||
+ | <li>'''default''': 1 | ||
+ | </ul> | ||
+ | |||
+ | </ul> | ||
+ | |||
+ | The option '''--configuration''' only needs to be specified the first | ||
+ | time you submit a simulation. Subsequent re-submissions of the same | ||
+ | simulation (for restarting from checkpoints) will always use the same | ||
+ | configuration that was specified the first time. Here is an example of | ||
+ | submitting a simulation named "static_tov" using the configuration "sim-debug" and the aforementioned | ||
+ | options: | ||
+ | |||
+ | simfactory/bin/sim submit static_tov --configuration sim-debug --parfile=par/static_tov.par --walltime=4:00:00 --procs=8 | ||
+ | |||
+ | or, for a remote machine, | ||
+ | |||
+ | simfactory/bin/sim --remote <machine> submit static_tov --configuration sim-debug --parfile=par/static_tov.par --walltime=4:00:00 --procs=8 | ||
+ | |||
+ | It is possible to submit a simulation using shorthand notation where | ||
+ | you do not need to specify the option names, but have to specify the | ||
+ | options in a certain order. If you don't specify a simulation name | ||
+ | using the shorthand syntax, a simulation name will be derived from | ||
+ | the parameter file name. | ||
+ | |||
+ | simfactory/bin/sim submit [<simulationname>] <parfile> <walltime> <procs> --configuration <configuration> | ||
+ | |||
+ | An example is | ||
+ | |||
+ | simfactory/bin/sim submit par/static_tov.par 4:00:00 8 --configuration sim-debug | ||
+ | |||
+ | If --configuration <configuration> is omitted, the default 'sim' configuration is used. | ||
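The commands introduced so far can be chained into a single remote workflow. The sketch below is a dry run: by default it only prints the commands so they can be checked first. The thorn list path thornlists/example.th is a hypothetical placeholder, the machine name is simply the one used as an example earlier in this tutorial, and the SIM variable is a convenience introduced here, not part of the Simulation Factory.

```shell
#!/bin/sh
# Dry-run sketch: print the commands instead of running them.
# Set SIM=simfactory/bin/sim in the environment to execute them for real.
SIM=${SIM:-echo simfactory/bin/sim}

machine=comet                      # example machine name from this tutorial
thornlist=thornlists/example.th    # hypothetical thorn list path
parfile=par/static_tov.par

workflow() {
    # 1. Replicate the local source tree to the remote machine.
    $SIM sync "$machine"
    # 2. Build the default 'sim' configuration on the remote machine.
    $SIM --remote "$machine" build --thornlist="$thornlist"
    # 3. Submit the simulation: 8 processors, 4 hours of wall time.
    $SIM --remote "$machine" submit static_tov --parfile="$parfile" --walltime=4:00:00 --procs=8
}

plan=$(workflow)
echo "$plan"
```

Because no --configuration option is given in the submit step, the default 'sim' configuration built in step 2 is the one that is used.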
+ | |||
+ | ==== Additional Options: Submission ==== | ||
+ | |||
+ | <ul> | ||
+ | |||
+ | <li>Number of OpenMP Threads | ||
+ | <ul> | ||
+ | <li>The number of OpenMP threads per MPI process. (You specify the | ||
+ | total number of processors (cores), and the number of OpenMP | ||
+ | threads; the number of MPI processes is then calculated | ||
+ | automatically.) | ||
+ | <li>option: --num-threads | ||
+ | <li>default: 1 (as if OpenMP was not used) | ||
+ | </ul> | ||
+ | |||
+ | <li>Allocation | ||
+ | <ul> | ||
+ | <li>The allocation for the simulation, overriding the corresponding | ||
+ | MDB entry | ||
+ | <li>option: --allocation | ||
+ | <li>default: taken from the MDB | ||
+ | </ul> | ||
+ | |||
+ | <li>Queue | ||
+ | <ul> | ||
+ | <li>The queue for the simulation, overriding the corresponding MDB | ||
+ | entry | ||
+ | <li>option: --queue | ||
+ | <li>default: taken from the MDB | ||
+ | </ul> | ||
+ | |||
+ | <li>Processors per node | ||
+ | <ul> | ||
+ | <li>The number of processors per node requested from the queueing system | ||
+ | <li>option: --ppn | ||
+ | <li>default: all processors on a node | ||
+ | </ul> | ||
+ | |||
+ | <li>Used processors per node | ||
+ | <ul> | ||
+ | <li>The number of processors per node that should actually be used, | ||
+ | allowing under-using nodes even if the queueing system does not | ||
+ | allow it. (The remaining processors will idle and will remain | ||
+ | unused.) | ||
+ | <li>option: --ppn-used | ||
+ | <li>default: all processors on a node | ||
+ | </ul> | ||
+ | |||
+ | </ul> | ||
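As a worked example of how --procs and --num-threads relate, the following sketch computes the number of MPI processes by hand. It assumes the simple relation "MPI processes = total processors / threads per process" and that the processor count is a multiple of the thread count; how the Simulation Factory handles other cases is not covered here.

```shell
#!/bin/sh
procs=8          # value passed to --procs (total processors/cores)
num_threads=2    # value passed to --num-threads (OpenMP threads per MPI process)

# Assumed relation: MPI processes = total processors / threads per process.
if [ $((procs % num_threads)) -ne 0 ]; then
    echo "procs should be a multiple of num-threads" >&2
    exit 1
fi
mpi_procs=$((procs / num_threads))

echo "$procs processors with $num_threads threads each -> $mpi_procs MPI processes"
```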
+ | |||
+ | For other changes in the submit script options it is necessary to edit and modify the scripts in simfactory/mdb/submitscripts. In order for the changes to have effect, the configuration must then be rebuilt. | ||
+ | |||
=== Running a Simulation === | === Running a Simulation === | ||
+ | |||
+ | The Simulation Factory can execute a simulation directly, bypassing | ||
+ | the queuing system. Running a simulation directly uses the same | ||
+ | options, but ignores wall time limit etc. You use the '''run''' | ||
+ | command for this: | ||
+ | |||
+ | simfactory/bin/sim run static_tov --configuration sim-debug --parfile=par/static_tov.par --procs=8 | ||
+ | |||
+ | ==== Additional Options: Running ==== | ||
+ | |||
+ | See [[#Additional Options: Submission]] | ||
+ | |||
=== Other Simulation Commands === | === Other Simulation Commands === | ||
+ | |||
+ | To launch an interactive session on a compute node, use the command | ||
+ | |||
+ | simfactory/bin/sim interactive --procs=8 --walltime=4:00:00 | ||
+ | |||
+ | This leads to a login shell on the compute node, but is otherwise | ||
+ | similar to the submit command. | ||
+ | |||
+ | To stop a simulation: | ||
+ | |||
+ | simfactory/bin/sim stop <simulationname> | ||
+ | |||
+ | To purge (put in the basedir/TRASH folder) an existing simulation: | ||
+ | |||
+ | simfactory/bin/sim purge <simulationname> [--restart-id=<restartid>] | ||
+ | |||
+ | To show the current output (stdout and stderr) for a given simulation: | ||
+ | |||
+ | simfactory/bin/sim show-output <simulationname> [--restart-id=<restartid>] | ||
+ | |||
=== What's Produced === | === What's Produced === | ||
+ | |||
+ | When a simulation is submitted for the first time, all necessary | ||
+ | information from the Cactus build configuration is brought into a | ||
+ | specific simulation folder created underneath the '''basedir''' | ||
+ | directory. Contained inside this folder, which has the same name as | ||
+ | the specified simulation, are the executable, run script, submit | ||
+ | script, a SIMFACTORY folder, a log file, and the output directories | ||
+ | for each individual restart. | ||
+ | |||
+ | Simulations are self-contained, and once created do not rely on | ||
+ | outside information. For example, recompiling the executable or | ||
+ | changing the parameter file that were used to submit a simulation will | ||
+ | not influence the simulation, since the simulation contains copies of | ||
+ | both. This ensures that simulations can continue to run unperturbed | ||
+ | even weeks after they have been created. | ||
+ | |||
+ | Here are the contents of the simulation folder "static_tov" with | ||
+ | several restarts in it: | ||
+ | |||
+ | [mwt@eric2 simulations]$ ls -l static_tov | ||
+ | total 32 | ||
+ | -rw-r--r-- 1 mwt lsuusers 0 Sep 30 13:30 LOG | ||
+ | drwxr-xr-x 3 mwt lsuusers 4096 Aug 20 10:19 output-0000 | ||
+ | drwxr-xr-x 4 mwt lsuusers 4096 Aug 20 10:19 output-0001 | ||
+ | drwxr-xr-x 4 mwt lsuusers 4096 Aug 20 10:24 output-0002 | ||
+ | drwxr-xr-x 3 mwt lsuusers 4096 Aug 20 23:57 output-0003 | ||
+ | drwxr-xr-x 4 mwt lsuusers 4096 Sep 17 09:02 output-0004 | ||
+ | drwxr-xr-x 7 mwt lsuusers 4096 Aug 20 10:18 SIMFACTORY | ||
+ | |||
+ | The SIMFACTORY folder contains the executable, the necessary script | ||
+ | files for submission and execution, and a properties.ini file that is | ||
+ | used by the Simulation Factory to store information about the | ||
+ | simulation. | ||
+ | |||
+ | Each time a simulation is either run or submitted, a restart directory | ||
+ | is created underneath the simulation directory. This restart folder | ||
+ | has a name of the format "output-####", starting with "output-0000". | ||
+ | Contained inside the restart folder are several internal files, the | ||
+ | output written to stdout and stderr from the simulation, and the | ||
+ | simulation output itself. The simulation output is typically stored | ||
+ | inside a directory named after the basename of the parameter file. An | ||
+ | example output directory is: | ||
+ | |||
+ | [mwt@eric2 output-0001]$ ls -l | ||
+ | total 172 | ||
+ | -rw-r--r-- 1 mwt lsuusers 0 Sep 17 09:06 LOG | ||
+ | -rw-r--r-- 1 mwt lsuusers 9 Sep 17 09:06 mpd_nodefile | ||
+ | -rw-r--r-- 1 mwt lsuusers 32 Sep 17 09:06 mpi_nodefile | ||
+ | -rw-r--r-- 1 mwt lsuusers 33 Sep 17 09:06 NODELIST | ||
+ | drwxr-xr-x 3 mwt lsuusers 20480 Sep 17 16:12 qc0-mclachlan | ||
+ | -rw------- 1 mwt lsuusers 2520 Sep 17 21:06 qc0-mclachlan.err | ||
+ | -rw------- 1 mwt lsuusers 108210 Sep 17 21:06 qc0-mclachlan.out | ||
+ | -rw-r--r-- 1 mwt lsuusers 13621 Sep 17 09:06 qc0-mclachlan.par | ||
+ | lrwxrwxrwx 1 mwt lsuusers 23 Sep 17 09:06 scratch -> /var/scratch/mwt/250072 | ||
+ | drwxr-xr-x 2 mwt lsuusers 4096 Sep 17 09:06 SIMFACTORY | ||
+ | |||
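Because restart directories use zero-padded names of the form "output-####", they sort in restart order, which makes it easy to locate the most recent one from a script. The sketch below demonstrates this on a throwaway directory created with mktemp; the directory layout mimics the listing above, but nothing here touches a real simulation.

```shell
#!/bin/sh
# Build a throwaway simulation folder containing three restarts.
simdir=$(mktemp -d)/static_tov
for id in 0 1 2; do
    mkdir -p "$simdir/$(printf 'output-%04d' "$id")"
done

# Glob results are sorted, and zero-padded names sort in restart
# order, so the last match is the most recent restart.
latest=""
for dir in "$simdir"/output-????; do
    latest=$(basename "$dir")
done

echo "latest restart: $latest"
```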
+ | === Script Locations === | ||
+ | |||
+ | When a simulation is created, it copies the submit script and the run | ||
+ | script from the build configuration into the folder | ||
+ | "basedir/<simulation>/SIMFACTORY". The executable goes in the "exe/" | ||
+ | folder, the run and submit scripts into the "run/" folder, the Cactus | ||
+ | options list into the "cfg/" folder, and the parfile into the "par/" | ||
+ | folder. An example SIMFACTORY directory is shown below: | ||
+ | |||
+ | [mwt@eric2 SIMFACTORY]$ ls -lR | ||
+ | .: | ||
+ | total 32 | ||
+ | drwxr-xr-x 2 mwt lsuusers 4096 Sep 17 09:06 cfg | ||
+ | drwxr-xr-x 2 mwt lsuusers 4096 Sep 17 09:05 data | ||
+ | drwxr-xr-x 2 mwt lsuusers 4096 Sep 17 09:05 exe | ||
+ | drwxr-xr-x 2 mwt lsuusers 4096 Sep 17 09:06 par | ||
+ | -rw-r--r-- 1 mwt lsuusers 740 Sep 17 09:06 properties.ini | ||
+ | drwxr-xr-x 2 mwt lsuusers 4096 Sep 17 09:06 run | ||
+ | |||
+ | ./cfg: | ||
+ | total 12 | ||
+ | -rw-r--r-- 1 mwt lsuusers 4041 Sep 17 09:06 OptionList | ||
+ | |||
+ | ./exe: | ||
+ | total 121408 | ||
+ | -rwxr-xr-x 1 mwt lsuusers 124306159 Sep 17 09:06 cactus_sim | ||
+ | |||
+ | ./par: | ||
+ | total 24 | ||
+ | -rw-r--r-- 1 mwt lsuusers 13621 Sep 17 09:06 qc0-mclachlan.par | ||
+ | |||
+ | ./run: | ||
+ | total 16 | ||
+ | -rw-r--r-- 1 mwt lsuusers 1162 Sep 17 09:06 RunScript | ||
+ | -rw-r--r-- 1 mwt lsuusers 410 Sep 17 09:06 SubmitScript | ||
+ | |||
+ | |||
== Other Advanced Features == | == Other Advanced Features == | ||
− | === | + | === Test suites === |
− | + | ||
+ | Cactus thorns often contain regression tests consisting of test parameter files and the resulting output, and Cactus has a mechanism for verifying that the output of the parameter files with the current version of the code matches the output stored in the thorn. SimFactory can run these tests, using a queuing system if necessary. To run the tests, you use the usual SimFactory commands for creating, submitting or running a simulation, but you do not need to specify a parameter file. Instead, you include the --testsuite option, and if you want to run specific tests, the --select-tests option. | ||
+ | |||
+ | To run all tests immediately on two processors (cores): | ||
+ | |||
+ | sim create-run mytests --testsuite --procs 2 | ||
+ | |||
+ | where "mytests" is the name of the simulation that will be created. | ||
+ | |||
+ | To run the tests using a queuing system: | ||
+ | |||
+ | sim create-submit mytests --testsuite --procs 2 | ||
+ | |||
+ | You can use all the usual SimFactory commands and options for creating, running and submitting simulations. | ||
+ | |||
+ | By default, the entire test suite is run. If you want to run only specific tests, you can additionally use the --select-tests option. You can give this option a test name (ending in .par), an arrangement name or a thorn specification in the form <arrangement>/<thorn>. | ||
+ | |||
+ | sim create-run mytests --testsuite --procs 2 --select-tests McLachlan | ||
+ | |||
+ | sim create-run mytests --testsuite --procs 2 --select-tests McLachlan/ML_BSSN | ||
+ | |||
+ | sim create-run mytests --testsuite --procs 2 --select-tests ML_BSSN_sgw3d.par | ||
+ | |||
+ | Whether run using create-run or create-submit, a summary.log file will be created in mytests/TEST/<config>/summary.log. | ||
+ | |||
+ | * Note that for some machines, you may need to use --ppn-used to run on the correct number of processors. Note that many of the tests will only run on 1 or 2 processes. | ||
+ | |||
+ | * Since it is necessary to have the test data and Cactus flesh scripts available when the job starts, the required data is copied into the simulation restart directory on job submission (or interactive running). This ensures that the test data and scripts are available when the home directory is not mounted on the compute nodes, and that the test data is not modified between job submission and job running. | ||
+ | |||
+ | * When you use the --testsuite option, it is not necessary to specify a parameter file. The positional arguments syntax (parfile, cores, walltime) is not supported for running the test suite. |
Latest revision as of 15:30, 27 March 2023
The Simulation Factory simplifies many aspects of running Cactus-based simulations. It provides a central facility for managing authoritative source tree versions, providing convenient access to remote HPC systems, building Cactus source configurations, and managing simulations all the way from submission to archiving their output.
Getting Started
To begin using The Simulation Factory, it needs to be checked out from git. The Simulation Factory is typically placed into a simfactory folder inside a Cactus source tree. This can be accomplished with the following git command:
git clone -b master https://bitbucket.org/simfactory/simfactory2.git
The Simulation Factory could also be placed in an independent location to be used with multiple Cactus source trees. This approach will be described later.
Initial Setup
Once the Simulation Factory has been checked out from git, the next step is to configure it, telling it e.g. about your user name. The Simulation Factory comes with a convenient method to generate some simple defaults to get started. To begin, issue the command
simfactory/bin/sim setup
Setup will prompt for a username, an email address, and an allocation. It will allow you to enter additional configuration from other machines, but this is advanced, and can be safely ignored. Setup will also generate a machine database entry for your local machine based upon the generic machine database entry.
Additional Configurations
The Simulation Factory contains a database known as the Machine Database. This collection of information describes all the aspects that are unique about each individual HPC system, so that the Simulation Factory can provide a common interface for all systems that hides these differences.
The Machine Database consists of different sections, one for each machine. The section name is given in square brackets, e.g. [comet]. There is a special section [default] that provides default values for those properties that are not explicitly set in the machine-specific entries.
The Machine Database is an authoritative collection of information, and is generally not meant to contain modifications that are only relevant for individual people. These local modifications are instead maintained in the file simfactory/etc/defs.local.ini, where one can add, change, or overwrite properties of Machine Database entries. For instance, if an alternative username and allocation are needed for the machine comet, you would add the following section there:
[comet]
user = COMET_USERNAME
allocation = COMET_ALLOCATION
There are several macros that help simplify configuration entries. The most useful is probably @USER@, which expands to the user property of the Machine Database entry.
For example, if you are using the same user name on many systems, but have a different user name on some systems, then you would set the common user name in the [default] section, and override this for those machines where your user name differs. The example simfactory/etc/defs.local.ini.complex has examples for this.
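The override behaviour described above can be illustrated with a small Python sketch. It is not the Simulation Factory's own implementation; it only uses Python's standard configparser module to mimic how a machine section in simfactory/etc/defs.local.ini could take precedence over the Machine Database entry, with [default] filling in anything left unset. The values alice, SHARED_ALLOCATION, and /home/@USER@/Cactus are made-up placeholders, and the plain string replacement for @USER@ is an assumption about how macro expansion behaves.

```python
import configparser

# Hypothetical Machine Database content (all values are placeholders).
MDB_TEXT = """
[default]
user = alice
sourcebasedir = /home/@USER@/Cactus

[comet]
allocation = SHARED_ALLOCATION
"""

# Local overrides, as they might appear in simfactory/etc/defs.local.ini.
LOCAL_TEXT = """
[comet]
user = COMET_USERNAME
allocation = COMET_ALLOCATION
"""


def load_entry(machine, *texts):
    """Merge INI texts in order (later ones win) and return one machine entry."""
    parser = configparser.ConfigParser(default_section="default",
                                       interpolation=None)
    for text in texts:
        parser.read_string(text)
    entry = dict(parser[machine])  # includes values inherited from [default]
    # Assumed macro semantics: @USER@ expands to the entry's user property.
    user = entry.get("user", "")
    return {key: value.replace("@USER@", user) for key, value in entry.items()}


entry = load_entry("comet", MDB_TEXT, LOCAL_TEXT)
for key in sorted(entry):
    print(f"{key} = {entry[key]}")
```

With these placeholder inputs, the merged comet entry takes user and allocation from the local section and inherits sourcebasedir from [default], with @USER@ expanded to the overriding user name.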
Most of the macros available in the Simulation Factory are described in the section #Macros below.
The command
simfactory/bin/sim list-machines
outputs a list of all preconfigured machines that the Simulation Factory knows about.
Extended Local Workstation Configuration
To customize the entry for your local workstation, first determine the hostname of your machine with the command:
hostname
The machine entry simfactory/mdb/machines/<hostname>.ini was created by the command:
simfactory/bin/sim setup
Edit this file to customize any of its values.
Accessing Remote Systems
The Simulation Factory simplifies access to remote systems, both for transferring files and logging in. You can synchronise (replicate) an authoritative version of your Cactus source tree to remote systems, obtain an interactive shell, or execute commands.
Information Commands
The following commands can be used to discover information about a machine, or list all known machines.
List all known machines:
simfactory/bin/sim list-machines
Print the complete Machine Database to the screen:
simfactory/bin/sim print-mdb
Print the Machine Database entry for a single machine:
simfactory/bin/sim print-mdb <machine>
Print the name of the machine on which the Simulation Factory is currently being executed:
simfactory/bin/sim whoami
Syncing
Historically, Cactus and the Einstein Toolkit have not been installed into a central location on each machine, but are instead built on-demand by every user for a certain thorn list. (One of the advantages is that people can thus easily add their own thorns.) To help with this approach, the Simulation Factory provides a facility to synchronize a Cactus user's local, authoritative source tree to remote HPC systems, where it can then be compiled and run.
Remote access is implemented on top of ssh and other ssh-like mechanisms such as gsi-ssh. Currently, you must still manage all ssh keys and passwords manually. (We highly recommend using an ssh keychain and ssh agents to avoid having to enter passwords multiple times.)
Configuration
Before syncing a source tree to a remote system, a small amount of configuration must be performed. It is necessary to either verify that the defaults are correct, or to define the correct values for the following keys for the remote system in the Machine Database:
- sourcebasedir
- The root directory under which the Cactus source tree will reside
- basedir
- The root directory in which all simulation output will reside
- user
- The user name on the remote system
You can output the currently configured values by issuing the command
simfactory/bin/sim print-mdb <machine>
If you need to change these values, then edit (on the local system) the file simfactory/etc/defs.local.ini and add a section for the remote machine. This entry will augment the existing Machine Database entry, updating or replacing the corresponding values. An example for the machine comet can be seen in the section Additional Configurations.
To see the list of files and directories that are synchronized, look at simfactory/etc/defs.ini and find the following two keys
- sync-sources
- The list of files and directories that will be transferred when the option --sync-sourcetree is enabled (on by default)
- sync-parfiles
- The list of files and directories that will be copied when the option --sync-parfiles is enabled (also on by default). This list of files typically includes just parameter files.
You can control which files are included or excluded using simfactory/etc/filter.local.rules which uses rsync filter rules syntax.
Performing a Sync
A sync command takes two options, both of which default to true.
- sync-sourcetree
- Synchronise the complete source tree, as specified in the aforementioned sync-sources configuration entry. This takes a few seconds or minutes, depending on the connection.
- sync-parfiles
- Synchronise parameter files, as specified by the aforementioned sync-parfiles configuration entry. This is typically faster than synchronising the source tree.
Usually, you would issue the command:
simfactory/bin/sim sync <machine>
To synchronise only parfiles, you can negate the --sync-sourcetree argument with the following command
simfactory/bin/sim sync <machine> --nosync-sourcetree
If you want to synchronise not from the local machine, but from another remote machine, then use
simfactory/bin/sim --remote <frommachine> sync <tomachine>
This executes the synchronisation command on the machine <frommachine>.
Remote Login
The Simulation Factory provides the ability to log in to a remote system. This is initiated with the command
simfactory/bin/sim login <machine>
This will automatically cd into the Cactus directory on the remote system.
Local/Remote Command Execution
To execute a command (locally) via the Simulation Factory, use the command
simfactory/bin/sim execute <command>
The command will be executed in the Cactus directory on the local system.
If the command is complex, and requires arguments, the command must be quoted. For example
simfactory/bin/sim execute 'ls -al'
To execute a remote command, use the command
simfactory/bin/sim --remote <machine> execute <command>
An example of a complex command being executed remotely is
simfactory/bin/sim --remote queenbee execute 'find . -name "*.py" -exec sed -i.bk s/foo/bar/g {} \;'
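When such calls are scripted, quoting mistakes can be avoided by building the argument list programmatically. The following Python sketch is only an illustration: the helper name sim_execute_argv is hypothetical, and it uses nothing beyond the execute command and --remote option shown above. It passes the whole command as a single argument and uses the standard shlex module to show the equivalent shell line.

```python
import shlex


def sim_execute_argv(command, machine=None, sim="simfactory/bin/sim"):
    """Build the argument list for 'sim execute', optionally on a remote machine."""
    argv = [sim]
    if machine is not None:
        argv += ["--remote", machine]
    # The whole command is passed as ONE argument, so no manual quoting is needed.
    argv += ["execute", command]
    return argv


argv = sim_execute_argv("ls -al", machine="queenbee")
# shlex.join shows how the same call would be typed in a shell.
print(shlex.join(argv))
```

The list could be handed to subprocess.run(argv) to actually run the command; that step is left out here because it requires a Simulation Factory checkout.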
Build Cactus Configurations
The Simulation Factory provides a central facility for configuring and building Cactus source trees. When a Cactus source tree is compiled, the Simulation Factory creates a configuration for the compiled executable, storing with it related information such as the Cactus options list, and the scripts necessary to submit and run jobs in a queuing system. This configuration is thus a self-contained entity containing everything that is necessary to perform Cactus simulations.
Information Commands
To list all existing Cactus configurations, use the following command
simfactory/bin/sim list-configurations
or, for a remote machine,
simfactory/bin/sim --remote <machine> list-configurations
Building a Configuration
To build a configuration, four pieces of information are required:
- Thorn list
- This defines which thorns are to be included into the configuration.
- Default: thornlist parameter of the Machine Database entry
- Override: --thornlist=<thornlist>
- The default thorn list is probably not useful in many cases.
- Option List
- This specifies the compiler and build options that need to be used to build Cactus on a particular system.
- Default: optionlist parameter of the Machine Database entry
- Override: --optionlist=<optionlist>
- The Simulation Factory is supposed to contain good, working default option lists for all supported systems. In fact, this is one of the main strengths of the Simulation Factory. You should normally not need to override the default.
- Submission Script
- This specifies how to submit a job to the queueing system on a particular system.
- Default: submitscript parameter of the Machine Database entry
- Override: --submitscript=<submitscript>
- Similar to the option list, the Simulation Factory is supposed to contain good, working default submission scripts for all supported systems.
- Run Script
- This specifies how to execute an MPI process on a particular system; it is closely connected to the submission script.
- Default: runscript parameter of the Machine Database entry
- Override: --runscript=<runscript>
- As with the submission script, the Simulation Factory is supposed to contain good, working default run scripts for all supported systems.
To build a configuration with a specific thornlist, issue the following command:
simfactory/bin/sim build [<configurationname>] --thornlist=<thornlist>
or, for a remote machine,
simfactory/bin/sim --remote <machine> build [<configurationname>] --thornlist=<thornlist>
If you choose to omit the configuration name, it will default to 'sim'. (We recommend this.) You can in addition specify any of the options
- --debug
- --optimise (default)
- --profile
These options create configurations with debugging, optimisation (the default), or profiling enabled. If any of these options is specified, then the configuration name will be modified correspondingly, e.g. to 'sim-debug'.
In order to modify any option not listed above, it is necessary to edit and modify the corresponding option file stored in Simfactory.
Additional Options
- --reconfig
- Reconfigure before building, i.e. re-examine the configuration options and re-run the CST stage. This happens automatically when the option list changes.
- --clean
- Clean the configuration (remove all object files etc.) before building.
Script Locations
The Simulation Factory provides default scripts for all its preconfigured machines. These scripts can be found in the following locations
- Option Lists
- MDB Key: optionlist
- Location: simfactory/mdb/optionlists
- Submit Scripts
- MDB Key: submitscript
- Location: simfactory/mdb/submitscripts
- Run Scripts
- MDB Key: runscript
- Location: simfactory/mdb/runscripts
To determine, for instance, which option list Queen Bee uses by default, issue the command
simfactory/bin/sim print-mdb queenbee | grep optionlist
Managing Simulations
The Simulation Factory provides a convenient, consistent facility for submitting, running, and managing simulations. This is accomplished through two main commands, submit and run.
Information Commands
The status of all simulations on a particular machine can be seen with the following command
simfactory/bin/sim list-simulations
If a more detailed look at each simulation is required, the verbose option can be specified
simfactory/bin/sim list-simulations --verbose
Submitting a Simulation
Four primary pieces of information are necessary when submitting a simulation to the host queuing system. They are
- Configuration
- The Cactus configuration to run
- option: --configuration
- default: "sim"
- Parfile
- The Cactus parameter file to use
- option: --parfile
- Walltime
- The total amount of wall time required
- option: --walltime
- default: MDB Key maxwalltime
- Processors
- The total number of processors to use
- option: --procs
- default: 1
The option --configuration only needs to be specified the first time you submit a simulation. Subsequent re-submissions of the same simulation (for restarting from checkpoints) will always use the same configuration that was specified the first time. Here is an example of submitting a simulation named "static_tov" using the configuration "sim-debug" and the aforementioned options:
simfactory/bin/sim submit static_tov --configuration sim-debug --parfile=par/static_tov.par --walltime=4:00:00 --procs=8
or, for a remote machine,
simfactory/bin/sim --remote <machine> submit static_tov --configuration sim-debug --parfile=par/static_tov.par --walltime=4:00:00 --procs=8
It is possible to submit a simulation using shorthand notation where you do not need to specify the option names, but have to specify the options in a certain order. If you don't specify a simulation name using the shorthand syntax, a simulation name will be derived from the parameter file name.
simfactory/bin/sim submit [<simulationname>] <parfile> <walltime> <procs> --configuration <configuration>
An example is
simfactory/bin/sim submit par/static_tov.par 4:00:00 8 --configuration sim-debug
If --configuration <configuration> is omitted, the default 'sim' configuration is used.
Additional Options: Submission
- Number of OpenMP Threads
- The number of OpenMP threads per MPI process. (You specify the total number of processors (cores), and the number of OpenMP threads; the number of MPI processes is then calculated automatically.)
- option: --num-threads
- default: 1 (as if OpenMP was not used)
- Allocation
- The allocation for the simulation, overriding the corresponding MDB entry
- option: --allocation
- default: taken from the MDB
- Queue
- The queue for the simulation, overriding the corresponding MDB entry
- option: --queue
- default: taken from the MDB
- Processors per node
- The number of processors per node requested from the queueing system
- option: --ppn
- default: all processors on a node
- Used processors per node
- The number of processors per node that should actually be used, allowing under-using nodes even if the queueing system does not allow it. (The remaining processors will idle and will remain unused.)
- option: --ppn-used
- default: all processors on a node
For other changes in the submit script options it is necessary to edit and modify the scripts in simfactory/mdb/submitscripts. In order for the changes to have effect, the configuration must then be rebuilt.
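The relationship between --procs, --num-threads, and --ppn-used can be made concrete with a short Python sketch. It rests on assumptions: the text above only states that the number of MPI processes is calculated automatically, so the division rule, the rounding up of the node count, and the rejection of non-divisible values are illustrative guesses rather than the Simulation Factory's documented behaviour. The function name job_layout is hypothetical.

```python
import math


def job_layout(procs, num_threads=1, ppn_used=None):
    """Return (mpi_processes, nodes) for a job.

    procs       -- total number of processors (cores), as given to --procs
    num_threads -- OpenMP threads per MPI process, as given to --num-threads
    ppn_used    -- cores actually used per node (--ppn-used); None skips the
                   node count
    """
    if procs % num_threads != 0:
        # Assumption: an uneven split is treated as an error in this sketch.
        raise ValueError("procs must be a multiple of num_threads")
    mpi_processes = procs // num_threads
    nodes = None if ppn_used is None else math.ceil(procs / ppn_used)
    return mpi_processes, nodes


print(job_layout(8, num_threads=2, ppn_used=4))
```

Under these assumptions, 8 cores with 2 threads each give 4 MPI processes, spread over 2 nodes when 4 cores per node are used.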
Running a Simulation
The Simulation Factory can execute a simulation directly, bypassing the queuing system. Running a simulation directly uses the same options, but ignores wall time limit etc. You use the run command for this:
simfactory/bin/sim run static_tov --configuration sim-debug --parfile=par/static_tov.par --procs=8
Additional Options: Running
See #Additional Options: Submission
Other Simulation Commands
To launch an interactive session on a compute node, use the command
simfactory/bin/sim interactive --procs=8 --walltime=4:00:00
This leads to a login shell on the compute node, but is otherwise similar to the submit command.
To stop a simulation:
simfactory/bin/sim stop <simulationname>
To purge (put in the basedir/TRASH folder) an existing simulation:
simfactory/bin/sim purge <simulationname> [--restart-id=<restartid>]
To show the current output (stdout and stderr) for a given simulation:
simfactory/bin/sim show-output <simulationname> [--restart-id=<restartid>]
What's Produced
When a simulation is submitted for the first time, all necessary information from the Cactus build configuration is brought into a specific simulation folder created underneath the basedir directory. Contained inside this folder, which has the same name as the specified simulation, are a SIMFACTORY folder (holding the executable, run script, and submit script), a log file, and the output directories for each individual restart.
Simulations are self-contained, and once created do not rely on outside information. For example, recompiling the executable or changing the parameter file that was used to submit a simulation will not influence the simulation, since the simulation contains copies of both. This ensures that simulations can continue to run unperturbed even weeks after they have been created.
Here are the contents of the simulation folder "static_tov" with several restarts in it:
[mwt@eric2 simulations]$ ls -l static_tov
total 32
-rw-r--r-- 1 mwt lsuusers    0 Sep 30 13:30 LOG
drwxr-xr-x 3 mwt lsuusers 4096 Aug 20 10:19 output-0000
drwxr-xr-x 4 mwt lsuusers 4096 Aug 20 10:19 output-0001
drwxr-xr-x 4 mwt lsuusers 4096 Aug 20 10:24 output-0002
drwxr-xr-x 3 mwt lsuusers 4096 Aug 20 23:57 output-0003
drwxr-xr-x 4 mwt lsuusers 4096 Sep 17 09:02 output-0004
drwxr-xr-x 7 mwt lsuusers 4096 Aug 20 10:18 SIMFACTORY
The SIMFACTORY folder contains the executable, the necessary script files for submission and execution, and a properties.ini file that is used by the Simulation Factory to store information about the simulation.
Each time a simulation is either run or submitted, a restart directory is created underneath the simulation directory. This restart folder has a name of the format "output-####", starting with "output-0000". Contained inside the restart folder are several internal files, the output written to stdout and stderr from the simulation, and the simulation output itself. The simulation output is typically stored inside a directory named after the basename of the parameter file. An example output directory is:
[mwt@eric2 output-0001]$ ls -l
total 172
-rw-r--r-- 1 mwt lsuusers      0 Sep 17 09:06 LOG
-rw-r--r-- 1 mwt lsuusers      9 Sep 17 09:06 mpd_nodefile
-rw-r--r-- 1 mwt lsuusers     32 Sep 17 09:06 mpi_nodefile
-rw-r--r-- 1 mwt lsuusers     33 Sep 17 09:06 NODELIST
drwxr-xr-x 3 mwt lsuusers  20480 Sep 17 16:12 qc0-mclachlan
-rw------- 1 mwt lsuusers   2520 Sep 17 21:06 qc0-mclachlan.err
-rw------- 1 mwt lsuusers 108210 Sep 17 21:06 qc0-mclachlan.out
-rw-r--r-- 1 mwt lsuusers  13621 Sep 17 09:06 qc0-mclachlan.par
lrwxrwxrwx 1 mwt lsuusers     23 Sep 17 09:06 scratch -> /var/scratch/mwt/250072
drwxr-xr-x 2 mwt lsuusers   4096 Sep 17 09:06 SIMFACTORY
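The "output-####" naming scheme for restart folders can be sketched in Python as follows. The four-digit, zero-padded format comes from the description above; the rule for choosing the next restart id (highest existing id plus one) is an assumption made for illustration, and the function names are hypothetical.

```python
import re

# Matches restart folder names such as "output-0003".
RESTART_PATTERN = re.compile(r"^output-(\d{4})$")


def restart_dir_name(restart_id):
    """Format a restart id as a folder name, e.g. 1 -> 'output-0001'."""
    return f"output-{restart_id:04d}"


def next_restart_id(entries):
    """Given the names in a simulation folder, return the next unused restart id."""
    ids = [int(m.group(1)) for m in map(RESTART_PATTERN.match, entries) if m]
    return max(ids) + 1 if ids else 0


entries = ["LOG", "output-0000", "output-0001", "output-0002",
           "output-0003", "output-0004", "SIMFACTORY"]
print(restart_dir_name(next_restart_id(entries)))
```

For the static_tov folder listed earlier, which holds restarts 0000 through 0004, this sketch would name the next restart folder output-0005.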
Script Locations
When a simulation is created, it copies the submit script and the run script from the build configuration into the folder "basedir/<simulation>/SIMFACTORY". The executable goes in the "exe/" folder, the run and submit scripts into the "run/" folder, the Cactus options list into the "cfg/" folder, and the parfile into the "par/" folder. Below is an example SIMFACTORY directory:
[mwt@eric2 SIMFACTORY]$ ls -lR
.:
total 32
drwxr-xr-x 2 mwt lsuusers 4096 Sep 17 09:06 cfg
drwxr-xr-x 2 mwt lsuusers 4096 Sep 17 09:05 data
drwxr-xr-x 2 mwt lsuusers 4096 Sep 17 09:05 exe
drwxr-xr-x 2 mwt lsuusers 4096 Sep 17 09:06 par
-rw-r--r-- 1 mwt lsuusers  740 Sep 17 09:06 properties.ini
drwxr-xr-x 2 mwt lsuusers 4096 Sep 17 09:06 run

./cfg:
total 12
-rw-r--r-- 1 mwt lsuusers 4041 Sep 17 09:06 OptionList

./exe:
total 121408
-rwxr-xr-x 1 mwt lsuusers 124306159 Sep 17 09:06 cactus_sim

./par:
total 24
-rw-r--r-- 1 mwt lsuusers 13621 Sep 17 09:06 qc0-mclachlan.par

./run:
total 16
-rw-r--r-- 1 mwt lsuusers 1162 Sep 17 09:06 RunScript
-rw-r--r-- 1 mwt lsuusers  410 Sep 17 09:06 SubmitScript
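As a quick sanity check, the layout of a SIMFACTORY folder can be compared against the entries named above. The Python sketch below is a hypothetical helper, not part of the Simulation Factory; it only checks for the cfg, exe, par, and run folders and the properties.ini file, and demonstrates itself on a throwaway temporary directory.

```python
import tempfile
from pathlib import Path

# Sub-folders and files named in the description of the SIMFACTORY folder.
EXPECTED_DIRS = ("cfg", "exe", "par", "run")
EXPECTED_FILES = ("properties.ini",)


def missing_entries(simfactory_dir):
    """Return the expected folders and files that are absent from simfactory_dir."""
    root = Path(simfactory_dir)
    missing = [d for d in EXPECTED_DIRS if not (root / d).is_dir()]
    missing += [f for f in EXPECTED_FILES if not (root / f).is_file()]
    return missing


# Demonstration on a temporary directory that lacks "run" and properties.ini.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("cfg", "exe", "par"):
        (Path(tmp) / name).mkdir()
    demo_missing = missing_entries(tmp)
print(demo_missing)
```

Pointing missing_entries at a real "basedir/<simulation>/SIMFACTORY" path should return an empty list if the folder matches the layout described here.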
Other Advanced Features
Test suites
Cactus thorns often contain regression tests consisting of test parameter files and the resulting output, and Cactus has a mechanism for verifying that the output of the parameter files with the current version of the code matches the output stored in the thorn. SimFactory can run these tests, using a queuing system if necessary. To run the tests, you use the usual SimFactory commands for creating, submitting or running a simulation, but you do not need to specify a parameter file. Instead, you include the --testsuite option, and if you want to run specific tests, the --select-tests option.
To run all tests immediately on two processors (cores):
sim create-run mytests --testsuite --procs 2
where "mytests" is the name of the simulation that will be created.
To run the tests using a queuing system:
sim create-submit mytests --testsuite --procs 2
You can use all the usual SimFactory commands and options for creating, running and submitting simulations.
By default, the entire test suite is run. If you want to run only specific tests, you can additionally use the --select-tests option. You can give this option a test name (ending in .par), an arrangement name or a thorn specification in the form <arrangement>/<thorn>.
sim create-run mytests --testsuite --procs 2 --select-tests McLachlan
sim create-run mytests --testsuite --procs 2 --select-tests McLachlan/ML_BSSN
sim create-run mytests --testsuite --procs 2 --select-tests ML_BSSN_sgw3d.par
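The three accepted forms of a --select-tests value can be told apart by their shape alone. The Python sketch below illustrates this; it is an assumption about how such a value might be interpreted, not SimFactory's actual parsing code, and the function name classify_selection is hypothetical.

```python
def classify_selection(value):
    """Classify a --select-tests argument by its form."""
    if value.endswith(".par"):
        return "test"  # a single test, named by its parameter file
    if "/" in value:
        arrangement, thorn = value.split("/", 1)
        if arrangement and thorn:
            return "thorn"  # <arrangement>/<thorn>
        raise ValueError(f"malformed thorn specification: {value!r}")
    return "arrangement"


for value in ("McLachlan", "McLachlan/ML_BSSN", "ML_BSSN_sgw3d.par"):
    print(value, "->", classify_selection(value))
```

Applied to the three examples above, the sketch labels them as an arrangement, a thorn, and a test, in that order.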
Whether run using create-run or create-submit, a summary.log file will be created in mytests/TEST/<config>/summary.log.
- Note that for some machines, you may need to use --ppn-used to run on the correct number of processors. Many of the tests will only run on 1 or 2 processes.
- Since it is necessary to have the test data and Cactus flesh scripts available when the job starts, the required data is copied into the simulation restart directory on job submission (or interactive running). This ensures that the test data and scripts are available when the home directory is not mounted on the compute nodes, and that the test data is not modified between job submission and job running.
- When you use the --testsuite option, it is not necessary to specify a parameter file. The positional arguments syntax (parfile, cores, walltime) is not supported for running the test suite.