Adding requirements to the Cactus scheduler


Problem Outline

One of the most complex aspects of programming with Cactus is currently the writing of schedule.ccl files for new routines, in particular when mesh refinement is used. The basic problem is that it is very difficult to ensure that routines are executed in the correct order, i.e. that all grid variables which a routine requires have actually been calculated beforehand. It is also difficult to ensure that boundary conditions (and synchronisation and symmetry boundaries) are applied when needed, in particular after regridding.

The Cactus schedule consists of several independent "parts": there are schedule bins defined by the flesh, there are schedule groups defined by infrastructure thorns (e.g. MoL or HydroBase), and there is the recursive Berger-Oliger algorithm, implemented in Carpet, which traverses these bins. It is difficult for the end user to see which groups are executed when, on which refinement level, and in which order.

The Cactus schedule offers "before" and "after" clauses to ensure a partial ordering between routines. Unfortunately, this ordering applies only to routines within the same schedule group and the same schedule bin and refinement level. It is not possible to ensure a particular order between routines in different schedule groups or schedule bins, and it is very complex to ensure that a routine is executed e.g. after another routine has been executed on all refinement levels.

One example setup illustrates this problem. When setting up initial conditions for a hydrodynamics evolution, one may want to first set up a neutron star, then calculate its maximum density, and then set the atmosphere to a value that depends on this maximum density. Making this possible in Cactus required introducing a new schedule bin "postpostinitial" in the flesh, and it still requires a careful arrangement of the schedule groups defined by ADMBase and HydroBase. Even now that this is possible, there is probably no way to verify at run time that these actions occur in the correct order.

Suggested Solution

To resolve this issue, and to generally simplify the way in which schedule.ccl files are designed and written, the following was suggested:

  • Each scheduled routine declares which grid variables it reads and which grid variables it writes
  • Since most routines write only parts of grid variables, the routine would also specify which part it reads/writes, e.g. the interior, outer boundary, symmetry boundary, etc.
  • In a first step, this allows the Cactus scheduler to validate the schedule and detect cases where a required variable has not been defined, or where a variable is calculated or synchronised multiple times
  • In a second step, this will also allow the Cactus scheduler to derive the schedule completely from these declarations. This may even make it possible to execute routines in parallel if they are independent. Even SYNC statements can be derived automatically, and schedule groups would no longer be necessary.
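As a minimal illustration of how such declarations could drive the scheduler, two schedule blocks might look as follows (the syntax, routine names and variable names here are hypothetical):

schedule Wave_CalcRHS in MoL_CalcRHS
{
	LANG: C
	READS:  WaveToy::phi                   # needed everywhere, including ghost zones
	WRITES: WaveToy::phi_rhs{Interior}     # finite differencing fills only the interior
} "Compute the right-hand side in the interior"

schedule Wave_CalcRHS_Boundary in MoL_CalcRHS
{
	LANG: C
	READS:  WaveToy::phi
	WRITES: WaveToy::phi_rhs{PhysicalBoundary}
} "Compute the right-hand side on the physical boundary"

If, for example, an earlier routine writes only the interior of phi, the scheduler could deduce from the READS lines that a SYNC (and symmetry-boundary application) on phi is required before these routines run, and that both routines must have run before anything that reads phi_rhs.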

One particular issue arises with routines which modify a variable in place, e.g. imposing the constraint that <math>\tilde A^i_i=0</math>. Such routines read and write the same variable, and it is thus not immediately clear from the data dependencies alone whether they need to be executed at all, or in which order.

One possibility to resolve this would be to add a tag to variables, declaring that such a routine "reads Aij:original" and "writes Aij:constraints-enforced". Every other routine accessing this variable would then also need to declare whether it reads or writes the original Aij or the Aij with constraints enforced. This is also the drawback of this mechanism: it would create unwanted dependencies between thorns.

Another possibility would be to use the existing BEFORE/AFTER mechanism for those cases where it is generally not possible to define proper data dependencies through variables alone. This would make it easy to insert a function which modifies e.g. Aij at any point in the schedule, without making other thorns depend on the presence of this inserted routine.
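For instance, two routines that both modify Aij in place could declare identical READS and WRITES lists and express their relative order with AFTER. The routine names below are hypothetical; the variable groups are those used in the examples further down:

schedule MyThorn_RemoveTraceA in MoL_PostStep
{
	LANG: C
	READS:  ML_metric ML_curv
	WRITES: ML_curv
} "Enforce tr(A) = 0 in place"

schedule MyThorn_FilterA in MoL_PostStep AFTER MyThorn_RemoveTraceA
{
	LANG: C
	READS:  ML_curv
	WRITES: ML_curv
} "Another in-place modification of Aij"

The AFTER clause carries exactly the ordering information that the identical READS/WRITES declarations cannot express, and no other thorn needs to know that these routines exist.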

Another issue arises with loops in the schedule. These are currently used mostly by MoL for its sub-timesteps. There is as yet no good idea for handling them. Logically, such a loop can be seen as a nested schedule tree, so it should be possible to treat it in the same way as the complete tree.

Current State

There is a branch of the flesh that allows adding REQUIRES and PROVIDES clauses to the schedule block of every routine. They are stored in the schedule database of the flesh and are ignored by default. It has been suggested to rename these clauses to READS and WRITES; this has not yet been done.

The component list of the project, which includes the branch of the flesh, examples, and other project files, can be found at the following URL:

https://svn.cactuscode.org/projects/NewSchedule/NewSchedule/NewSchedule.th

Carpet has a file Requirements.cc that detects the presence of these clauses and performs rudimentary checks. These checks are probably useless in their current form.

Next Steps

To take this project further, we need to define what the "reads" and "writes" clauses should look like. As mentioned above, it is insufficient to list only grid variables there, since most routines access only parts of grid variables. Ian Hinder volunteered to come up with an initial plan for what kinds of "parts" there should be (e.g. interior, outer boundary, symmetry boundary, ghost zone, etc.). These parts are driver-dependent, which means we have to come up with a way to tell the flesh about the parts and their relations (what is part of what).

A simple example would look like the following (syntax arbitrary):

INTERIOR ⊂ DOMAIN
BOUNDARY ⊂ DOMAIN
INTERIOR ∩ BOUNDARY = ∅
INTERIOR ∪ BOUNDARY = DOMAIN

Defining parts of grid functions

Application thorns typically write to either the interior of the grid (for example, those points which can be updated using finite differencing) or to the physical outer boundary (for applying user-supplied boundary conditions). Other types of points are those on symmetry boundaries, interprocessor boundaries and mesh refinement boundaries, which an application thorn should never need to write to. Symmetry thorns would write to symmetry boundaries, and the driver would write to interprocessor and mesh refinement boundaries.

Consider a single local grid component. It is a cuboidal set of points. According to Cactus, each of the 6 faces of the component is either an interprocessor boundary (which includes refinement boundaries), a symmetry boundary, or a physical boundary. Each face can be only one of these. Each face has a boundary width. Points on edges and corners are associated with multiple faces, and are considered physical boundary points if they are not part of a symmetry or interprocessor boundary. Hence, physical boundary points are only those which absolutely have to be updated, as they are not updated by any other mechanism.

A typical application thorn only needs to be concerned with interior and physical boundary points. We can divide the points in a component into the categories:

  • Interior;
  • PhysicalBoundary;
  • SymmetryBoundary;
  • InterprocessorBoundary;
  • RefinementBoundary.
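In the arbitrary notation used above, the relations between these categories might be declared roughly as follows (a sketch only; the precise definitions would have to come from the driver):

Interior ⊂ DOMAIN
PhysicalBoundary ⊂ DOMAIN
SymmetryBoundary ⊂ DOMAIN
InterprocessorBoundary ⊂ DOMAIN
RefinementBoundary ⊂ DOMAIN

Interior ∩ PhysicalBoundary = ∅    (and likewise for every other pair of categories)
Interior ∪ PhysicalBoundary ∪ SymmetryBoundary ∪ InterprocessorBoundary ∪ RefinementBoundary = DOMAIN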

Most scheduled application functions need to read their variables from everywhere on the grid, and some write variables to everywhere on the grid. We can use READS and WRITES lines in a schedule block to specify the variables and locations that each scheduled function reads from and writes to. Each line would be a space-separated list of variables or groups, qualified with an implementation name if outside the current implementation (we should think of a mechanism to allow newlines). To specify which part of the grid is being read or written, we could have "part" keywords in curly brackets after the grid function or group name. If omitted, the default would be Everywhere. (FrankL: Shouldn't we make the default for reading everywhere, but for writing only the interior? This is what most thorns do. IanH: I agree that most thorns do this, but we have to weigh that against the confusion of having two different defaults.)

Examples

For example,


SCHEDULE TwoPunctures AT Initial 
{
	LANG: C
	WRITES: ADMBase::metric ADMBase::curv ADMBase::lapse
} "Create puncture black hole initial data"

schedule ML_BSSN_convertFromADMBase AT Initial
{
	LANG: C
	READS: ADMBase::metric ADMBase::curv ADMBase::lapse ADMBase::shift
	WRITES: ML_log_confac ML_metric ML_trace_curv ML_curv ML_shift 
} "ML_BSSN_convertFromADMBase"

schedule ML_BSSN_convertFromADMBaseGamma AT Initial
{
	LANG: C
	READS: ML_log_confac ML_metric
	WRITES: ML_Gamma{Interior}
} "ML_BSSN_convertFromADMBaseGamma"


schedule ML_BSSN_RHS1 in MoL_CalcRHS
{
	LANG: C
	READS: ML_log_confac ML_metric ML_trace_curv ML_curv ML_Gamma ADMBase::lapse ML_shift
	WRITES: ML_log_confac_rhs{Interior} ML_metric_rhs{Interior} ML_trace_curv_rhs{Interior} ML_curv_rhs{Interior} ML_Gamma_rhs{Interior} ADMBase::dtlapse{Interior} ML_shift_rhs{Interior}
} "ML_BSSN_RHS1"

schedule ML_BSSN_RadiativeRHSBoundary in MoL_CalcRHS
{
	LANG: C
	READS: ML_log_confac ML_metric ML_trace_curv ML_curv ML_Gamma ADMBase::lapse ML_shift
	WRITES: ML_log_confac_rhs{PhysicalBoundary} ML_metric_rhs{PhysicalBoundary} ML_trace_curv_rhs{PhysicalBoundary} ML_curv_rhs{PhysicalBoundary} ML_Gamma_rhs{PhysicalBoundary} ADMBase::dtlapse{PhysicalBoundary} ML_shift_rhs{PhysicalBoundary}
} "ML_BSSN_RHS1"

schedule ML_BSSN_enforce in MoL_PostStep
{
	LANG: C
	READS: ML_metric ML_curv
	WRITES: ML_curv
} "ML_BSSN_enforce"

schedule psis_calc_4th AT Analysis
{
	LANG: C
	READS: ADMBase::metric ADMBase::curv
	WRITES: Psi4r{Interior} Psi4i{Interior}
} "psis_calc_4th"

schedule Multipole_Calc AT Analysis
{
	LANG: C
	READS: Psi4r Psi4i
} "psis_calc_4th"

It might be useful to extend the syntax so that one can state once that all listed variables are read from, or written to, the same part of the grid, as that will be the usual case.
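For example, with a purely hypothetical REGION keyword applying to all variables in the WRITES list, the ML_BSSN_RHS1 block above could be condensed to:

schedule ML_BSSN_RHS1 in MoL_CalcRHS
{
	LANG: C
	READS:  ML_log_confac ML_metric ML_trace_curv ML_curv ML_Gamma ADMBase::lapse ML_shift
	WRITES: ML_log_confac_rhs ML_metric_rhs ML_trace_curv_rhs ML_curv_rhs ML_Gamma_rhs ADMBase::dtlapse ML_shift_rhs
	REGION: Interior	# hypothetical keyword: all WRITES in this block refer to the interior
} "ML_BSSN_RHS1"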

Interaction with MoL

MoL is the time integrator that takes grid functions on the previous time level as input and produces new values for the grid functions on the current time level as output. It requires routines that calculate the RHS and/or apply boundary conditions to the evolved grid functions.

Integrating MoL with the mechanism provided above faces several difficulties:

  • The set of evolved grid functions is not defined in the schedule.ccl; it is instead defined via function calls at run time. One approach would be to define call-back functions that MoL has to provide, so that the scheduler can access this information.
  • It is a priori not clear whether MoL evolves only the interior or also the boundary of grid functions. This can even be different for different grid functions. We can probably safely assume that MoL does not evolve ghost zones or symmetry zones (although this is technically also not defined).
  • MoL integrates in time in a WHILE loop implemented in the scheduler. The WHILE condition depends on the particular time integrator that is chosen.

To simplify things, I suggest that we leave MoL unmodified and treat it as a black box. MoL needs to specify (e.g. via call-back functions) which variables are integrated in time, and which region of these variables is integrated. The input to MoL is then the past time level of these variables, and the output of MoL is the current time level of these variables.

Further, there is one special bin (or group), very similar to the existing MoL_CalcRHS group (referred to as MoL_RHS below). At the beginning of this bin, the current time level of the evolved variables is defined (MoL ensures this). At the end of this bin, the RHS grid functions need to be defined (MoL requires this). This is equivalent to a WRITES and a READS statement, respectively.

Since it is then known which regions of which variables MoL accesses (reads/writes), the scheduler can take care of the rest and schedule all other required routines, such as boundary conditions. For example, if MoL provides ("writes") the interior of the state vector at the beginning of the RHS bin, and there is a routine which reads the whole domain of the state vector and writes the interior of the RHS, then the scheduler can easily deduce that the corresponding boundary condition routine must be called before the RHS routine.

Example

MoL provides a call-back function that specifies the READS and WRITES declarations for MoL as a whole and for MoL_RHS:

  • MoL READS ML_BSSN{Interior, previous-timelevel}
  • MoL WRITES ML_BSSN{Interior, current-timelevel}
  • MoL_RHS WRITES ML_BSSN{Interior}
  • MoL_RHS READS ML_BSSN_RHS{Interior}

The declarations for MoL_RHS are understood as describing what is present at the beginning and what is required at the end of this bin.

Of course, the programmer could also decide that certain evolved variables are integrated all over the domain, not just the interior.

The application would then provide (at least) the following routines:

  • RHS: READS ML_BSSN{All}, WRITES ML_BSSN_RHS{Interior}
  • BC: READS ML_BSSN{Interior}, WRITES ML_BSSN{Boundary}
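Written as schedule blocks, these might look as follows (the routine names and the choice of MoL_PostStep for the boundary routine are assumptions of this sketch; region keywords as above):

schedule MyThorn_BSSN_RHS in MoL_CalcRHS
{
	LANG: C
	READS:  ML_BSSN
	WRITES: ML_BSSN_RHS{Interior}
} "Calculate the right-hand side of the evolved variables"

schedule MyThorn_BSSN_Boundary in MoL_PostStep
{
	LANG: C
	READS:  ML_BSSN{Interior}
	WRITES: ML_BSSN{PhysicalBoundary}
} "Apply the outer boundary condition to the evolved variables"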

We can easily extend this example to include the conversion to ADMBase variables if e.g. another RHS routine requires them.

Synchronisation and symmetry boundaries would also be applied automatically. (There is a slight complication regarding whether "Boundary" includes ghost zones or not – grid points on the edge or in the corner of a grid function can be both outer boundary points and ghost points, and one needs to be clear about whether these are included. However, this is a detail that can be solved later.)

Simple Test Case

Since current schedules, even for WaveToy, are already very complex, we have a test code with a very simple schedule. This is implemented in the WaveToySimple thorn (https://svn.cactuscode.org/projects/NewSchedule/WaveToySimple/trunk/). To get the test code working, check out Cactus using this thornlist: https://svn.cactuscode.org/projects/NewSchedule/NewSchedule/NewSchedule.th. Simple parameter files are provided in arrangements/NewSchedule/WaveToySimple/par.

The requirements part of the schedule looks as follows:

  • WaveToy_InitialData
      PROVIDES: scalarevolve scalarevolve_p
  • WaveToy_Evolution
      REQUIRES: scalarevolve_p scalarevolve_p_p[Interior]
      PROVIDES: scalarevolve[Interior]
  • WaveToy_Boundaries
      PROVIDES: scalarevolve[PhysicalBoundary]
  • WaveToy_Analysis
      REQUIRES: scalarevolve
      PROVIDES: scalaranalysis
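As an actual schedule block, the evolution routine might then look roughly like this (the bin and the description string are made up for this sketch; square brackets and the _p suffix follow the workarounds noted below):

schedule WaveToy_Evolution AT evol
{
	LANG: C
	REQUIRES: scalarevolve_p scalarevolve_p_p[Interior]
	PROVIDES: scalarevolve[Interior]
} "Evolve the scalar field"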

Some issues were encountered with this schedule:

  • Curly brackets do not work for specifying parts of the grid as they confuse the parser. Square brackets were used instead.
  • It's not clear how to refer to past time levels. The _p syntax was used, but that isn't accepted by the schedule checker.
  • WaveToy_Analysis requires scalarevolve, but the schedule checker does not recognize it as being provided (because the provides are in a separate schedule bin?).
  • READS and WRITES seem more appropriate than REQUIRES and PROVIDES.

Notes/Issues

  • How do we deal with thorns whose dependencies are not known at compile time, such as Dissipation? This thorn reads and writes variables named by a parameter. We could add a special case in the schedule.ccl to say "READS: <some syntax to mean all variables listed in this parameter>".
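One entirely hypothetical form such a special case could take (the routine name and the parameter name Dissipation::vars are assumptions, and the angle brackets are placeholders rather than proposed syntax):

schedule Dissipation_Add in MoL_CalcRHS
{
	LANG: C
	READS:  <all variables listed in Dissipation::vars>
	WRITES: <the corresponding RHS variables>{Interior}
} "Add Kreiss-Oliger dissipation to the right-hand sides"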