Adding requirements to the Cactus scheduler
Problem Outline
One of the most complex aspects of programming with Cactus at present is writing schedule.ccl files for new routines, in particular if mesh refinement is used. The basic problem is that it is very difficult to ensure that routines are executed in the correct order, i.e. that all grid variables which a routine requires have actually been calculated beforehand. It is also difficult to ensure that boundary conditions (and synchronisation and symmetry boundaries) are applied when needed, in particular after regridding.
The Cactus schedule consists of several independent "parts": there are schedule bins defined by the flesh, there are schedule groups defined by infrastructure thorns (e.g. MoL or HydroBase), and there is the recursive Berger-Oliger algorithm, implemented in Carpet, which traverses the bins. It is difficult for the end user to see which groups are executed when, on which refinement level, and in which order.
The Cactus schedule offers "before" and "after" clauses to ensure a partial ordering between routines. Unfortunately, this ordering applies only to routines within the same schedule group and the same schedule bin and refinement level. It is not possible to ensure a particular order between routines in different schedule groups or schedule bins, and it is very complex to ensure that a routine is executed e.g. after another routine has been executed on all refinement levels.
There is one example setup that illustrates this problem. When setting up initial conditions for a hydrodynamics evolution, one may e.g. want to first set up a neutron star, then calculate its maximum density, and then set up the atmosphere to a value depending on this maximum density. Making this possible in Cactus required introducing a new schedule bin "postpostinitial" to the flesh, and requires careful arrangement of schedule groups defined by ADMBase and HydroBase. Even now that this is possible, it is probably not possible to ensure at run time that these actions occur in a correct order.
Suggested Solution
To resolve this issue, and to generally simplify the way in which schedule.ccl files are designed and written, the following was suggested:
- Each scheduled routine declares which grid variables it reads and which grid variables it writes
- Since most routines write only parts of grid variables, the routine would also specify which part it reads/writes, e.g. the interior, outer boundary, symmetry boundary, etc.
- This allows the Cactus scheduler in a first step to validate the schedule and detect cases where a required variable has not been defined, or where a variable is calculated multiple times or synchronized multiple times
- In a second step this will also allow the Cactus scheduler to completely derive the schedule from these declarations. This may even make it possible to execute routines in parallel if they are independent. Even SYNC statements can be automatically derived, and schedule groups would not be necessary any more.
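The two steps above can be sketched in a few lines of Python. This is only an illustration of the idea, not the flesh or Carpet implementation: the routine and variable names are hypothetical (loosely following the neutron star example from the problem outline), parts of grid variables are ignored, and loops in the schedule are not handled.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical declarations: routine name -> (variables read, variables written).
DECLARATIONS = {
    "SetupAtmosphere": ({"rho", "rho_max"}, {"rho_atmo"}),
    "FindMaxDensity": ({"rho"}, {"rho_max"}),
    "SetupStar": (set(), {"rho"}),
}

def undefined_reads(declarations):
    """Step 1a: list (routine, variable) pairs that are read but never written."""
    written = set()
    for _reads, writes in declarations.values():
        written |= writes
    return sorted(
        (name, var)
        for name, (reads, _writes) in declarations.items()
        for var in reads - written
    )

def multiply_written(declarations):
    """Step 1b: list variables that more than one routine writes."""
    writers = {}
    for name, (_reads, writes) in declarations.items():
        for var in writes:
            writers.setdefault(var, set()).add(name)
    return sorted(var for var, names in writers.items() if len(names) > 1)

def derive_order(declarations):
    """Step 2: order routines so that every writer runs before its readers."""
    writers = {}
    for name, (_reads, writes) in declarations.items():
        for var in writes:
            writers.setdefault(var, set()).add(name)
    # Map each routine to the set of routines that must run before it.
    predecessors = {name: set() for name in declarations}
    for name, (reads, _writes) in declarations.items():
        for var in reads:
            predecessors[name] |= writers.get(var, set()) - {name}
    return list(TopologicalSorter(predecessors).static_order())

print(derive_order(DECLARATIONS))
```

Routines that have no dependency path between them in this graph are the ones that could, in principle, be executed in parallel.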
One particular issue arises with routines which modify a variable, e.g. by imposing the constraint that <math>\tilde A^i_i=0</math>. These routines read and write the same variable, and it is thus not immediately clear why they should be executed or in which order they should be executed. To resolve this, I suggest adding a tag to variables, so that such a routine declares that it reads "Aij:original" and writes "Aij:constraints-enforced". Every other routine accessing this variable would then also need to declare whether it reads or writes the original Aij or the Aij with constraints enforced.
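As a minimal sketch of why tags remove the ambiguity, the following Python fragment treats each "variable:tag" string as a separate item and repeatedly runs whichever routines have all their inputs available. The routine names are hypothetical and the scheduling loop is a deliberately naive stand-in for whatever the scheduler would really do.

```python
# Hypothetical routines: (name, tagged variables read, tagged variables written).
ROUTINES = [
    ("ComputeRHS", ["Aij:constraints-enforced"], ["Aij_rhs"]),
    ("EnforceTraceFree", ["Aij:original"], ["Aij:constraints-enforced"]),
    ("SetInitialData", [], ["Aij:original"]),
]

def order(routines):
    """Run routines as soon as every tagged variable they read is available."""
    available = set()   # tagged variables written so far
    done = []           # routine names in execution order
    pending = list(routines)
    while pending:
        ready = [r for r in pending if set(r[1]) <= available]
        if not ready:
            missing = sorted(set().union(*(r[1] for r in pending)) - available)
            raise ValueError(f"no routine provides: {missing}")
        for name, _reads, writes in ready:
            done.append(name)
            available |= set(writes)
        pending = [r for r in pending if r not in ready]
    return done

print(order(ROUTINES))
```

Without the tags, EnforceTraceFree would both read and write plain "Aij", and nothing in the declarations would say whether ComputeRHS has to run before or after it.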
Another issue arises with loops in the schedule. At present these are mostly used by MoL for the sub-timesteps. I currently have no good idea for handling this, and I suggest punting and implementing a special case for it.
Current State
There is a patch <https://wiki.einsteintoolkit.org/et-docs/images/d/de/requirements.diff> to the flesh that allows adding REQUIRES and PROVIDES clauses to the schedule block for every routine. These clauses can be arbitrary strings (there is no syntax checking done), and they are stored in the schedule database of the flesh and are ignored by default. There is a suggestion to rename these clauses to READS and WRITES; this has not yet been done.
Carpet has a file Requirements.cc that detects the presence of these clauses and performs rudimentary checks. These checks are probably useless in their current form.
Next Steps
To take this project further, we need to define what the "reads" and "writes" clauses should look like. As mentioned above, it is insufficient to list only grid variables there, since most routines access only parts of grid variables. Ian Hinder volunteered to come up with an initial plan for what kinds of "parts" there should be (e.g. interior, outer boundary, symmetry boundary, ghost zone, etc.).
Since current schedules, even for WaveToy, are already very complex, it would be beneficial to have an example code with a very simple schedule, i.e. a schedule that does not use coordinates, boundary conditions, symmetries, or MoL. (Since these thorns are probably required by Carpet, they would still be active, but the example code itself would be independent of these.) We are looking for a volunteer for this.
Defining parts of grid functions
Application thorns typically write to either the interior of the grid (for example, those points which can be updated using finite differencing) or to the physical outer boundary (for applying user-supplied boundary conditions). Other types of points are those on symmetry boundaries, interprocessor boundaries and mesh refinement boundaries, which an application thorn should never need to write to. Symmetry thorns would write to symmetry boundaries, and the driver would write to interprocessor and mesh refinement boundaries.
Consider a single local grid component. It is a cuboidal set of points. According to Cactus, each of the 6 faces of the component is exactly one of the following: an interprocessor boundary (including refinement boundaries), a symmetry boundary, or a physical boundary. Each face has a boundary width. Points on edges and corners are associated with multiple faces, and are considered physical boundary points only if they are not part of a symmetry or interprocessor boundary. Hence, physical boundary points are only those which absolutely have to be updated, as they are not updated by any other mechanism.
A typical application thorn only needs to be concerned with interior and physical boundary points. We can divide the points in a component into the categories:
- Interior;
- PhysicalBoundary;
- SymmetryBoundary;
- InterprocessorBoundary;
- RefinementBoundary.
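The classification rule described above can be sketched as follows. This is an illustration under stated assumptions, not code from Cactus or Carpet: a single boundary width is used for all faces (the text allows one per face), faces may be labelled with any of the four boundary categories, and the precedence among interprocessor, refinement, and symmetry boundaries at edges and corners is a guess. Only the rule that physical boundary has the lowest precedence comes from the text.

```python
INTERIOR = "Interior"
PHYSICAL = "PhysicalBoundary"
SYMMETRY = "SymmetryBoundary"
INTERPROCESSOR = "InterprocessorBoundary"
REFINEMENT = "RefinementBoundary"

def classify(point, shape, faces, width):
    """Return the category of one grid point in a cuboidal component.

    point  -- index tuple, one entry per dimension
    shape  -- number of points per dimension
    faces  -- faces[d] = (type of lower face, type of upper face) in dimension d
    width  -- boundary width in points (assumed equal for all faces)
    """
    touched = []
    for d, i in enumerate(point):
        if i < width:
            touched.append(faces[d][0])
        if i >= shape[d] - width:
            touched.append(faces[d][1])
    if not touched:
        return INTERIOR
    # Assumed precedence among the non-physical kinds; physical comes last,
    # so edge and corner points are physical only if no other kind claims them.
    for kind in (INTERPROCESSOR, REFINEMENT, SYMMETRY):
        if kind in touched:
            return kind
    return PHYSICAL

# Hypothetical 2D component: symmetry on lower x, interprocessor on upper y.
FACES = [(SYMMETRY, PHYSICAL), (PHYSICAL, INTERPROCESSOR)]
print(classify((0, 0), (8, 8), FACES, 2))
```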
Most scheduled application functions need to read their variables from everywhere on the grid, and some write variables to everywhere on the grid. We can use READS and WRITES lines in a schedule block to specify the variables and locations that each scheduled function reads from and writes to. Each line would be a space-separated list of variables or groups (qualified with an implementation name if outside the current implementation). To specify which part of the grid is being read or written, we could have "part" keywords in curly brackets after the grid function or group name. If the part is omitted, the default would be Everywhere.
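To make the proposed clause format concrete, here is a small Python sketch that parses one READS or WRITES line into (implementation, name, part) triples. The regular expression and the function name are assumptions made for illustration; the flesh currently stores these clauses as unparsed strings, and "ML_BSSN" is used below only as a hypothetical current implementation name.

```python
import re

# One entry: optional "Impl::" prefix, a variable or group name,
# and an optional "{Part}" suffix.
ENTRY = re.compile(r"^(?:(?P<impl>\w+)::)?(?P<name>\w+)(?:\{(?P<part>\w+)\})?$")

def parse_clause(text, current_impl):
    """Split a space-separated clause into (implementation, name, part) triples."""
    entries = []
    for token in text.split():
        match = ENTRY.match(token)
        if match is None:
            raise ValueError(f"malformed entry: {token!r}")
        entries.append((
            match["impl"] or current_impl,   # unqualified names belong to us
            match["name"],
            match["part"] or "Everywhere",   # default when no part is given
        ))
    return entries

print(parse_clause("ML_Gamma_rhs{Interior} ADMBase::dtlapse{Interior} ML_shift", "ML_BSSN"))
```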
Examples
For example,
SCHEDULE TwoPunctures AT Initial
{
  LANG: C
  WRITES: ADMBase::metric ADMBase::curv ADMBase::lapse
} "Create puncture black hole initial data"

schedule ML_BSSN_convertFromADMBase AT Initial
{
  LANG: C
  READS: ADMBase::metric ADMBase::curv ADMBase::lapse ADMBase::shift
  WRITES: ML_log_confac ML_metric ML_trace_curv ML_curv ML_shift
} "ML_BSSN_convertFromADMBase"

schedule ML_BSSN_convertFromADMBaseGamma AT Initial
{
  LANG: C
  READS: ML_log_confac ML_metric
  WRITES: ML_Gamma{Interior}
} "ML_BSSN_convertFromADMBaseGamma"

schedule ML_BSSN_RHS1 in MoL_CalcRHS
{
  LANG: C
  READS: ML_log_confac ML_metric ML_trace_curv ML_curv ML_Gamma ADMBase::lapse ML_shift
  WRITES: ML_log_confac_rhs{Interior} ML_metric_rhs{Interior} ML_trace_curv_rhs{Interior} ML_curv_rhs{Interior} ML_Gamma_rhs{Interior} ADMBase::dtlapse{Interior} ML_shift_rhs{Interior}
} "ML_BSSN_RHS1"

schedule ML_BSSN_RadiativeRHSBoundary in MoL_CalcRHS
{
  LANG: C
  READS: ML_log_confac ML_metric ML_trace_curv ML_curv ML_Gamma ADMBase::lapse ML_shift
  WRITES: ML_log_confac_rhs{PhysicalBoundary} ML_metric_rhs{PhysicalBoundary} ML_trace_curv_rhs{PhysicalBoundary} ML_curv_rhs{PhysicalBoundary} ML_Gamma_rhs{PhysicalBoundary} ADMBase::dtlapse{PhysicalBoundary} ML_shift_rhs{PhysicalBoundary}
} "ML_BSSN_RHS1"

schedule ML_BSSN_enforce in MoL_PostStep
{
  LANG: C
  READS: ML_metric ML_curv
  WRITES: ML_curv
} "ML_BSSN_enforce"

schedule psis_calc_4th AT Analysis
{
  LANG: C
  READS: ADMBase::metric ADMBase::curv
  WRITES: Psi4r{Interior} Psi4i{Interior}
} "psis_calc_4th"

schedule Multipole_Calc AT Analysis
{
  LANG: C
  READS: Psi4r Psi4i
} "psis_calc_4th"
It might be useful to extend the syntax so that a schedule block can state once that all of its variables are read from, or written to, the same part of the grid, as that will be the usual case.