Detailed Release Announcement
Not yet released: Lovelace (ET_2012_05)
Accelerator Support
This release of the Einstein Toolkit adds support for GPUs and other accelerators. This support comprises three levels of abstraction, ranging from merely building and running both CUDA and OpenCL code, to automated code generation targeting GPUs instead of CPUs. As with any other programming paradigm (such as MPI or OpenMP), the performance benefits depend on the particular algorithms used and optimisations that are applied. In addition, the Simulation Factory greatly aids portability to a wide range of computing systems.
At the lowest level, Cactus now supports compiling, building, and running with either CUDA or OpenCL. CUDA is supported as a new language in addition to C, C++, and Fortran; OpenCL is supported as an external library, and builds and executes compute kernels via run-time calls. Details are described in the user's guide (for CUDA) and in thorn ExternalLibraries/OpenCL (for OpenCL).
Many accelerator platforms today distinguish between host memory and device memory, and require explicit copy or map operations to transfer data. An intermediate level of abstraction aids transferring grid variables between host and device, using schedule declarations to keep track of which data are needed where, and minimising expensive data transfers. For OpenCL, there is a compact API to build and execute compute kernels at run time. Details are described in thorns CactusUtils/Accelerator and CactusUtils/OpenCLRunTime (with example parameter file).
Finally, the code generation system Kranc has been extended to be able to produce either C++ or OpenCL code, based on the infrastructure described above. This allows writing GPU code in a very high-level manner. However, the efficiency of the generated code depends on many variables, including the finite differencing stencil radius and the number of operations in the generated compute kernels. Non-trivial kernels typically require system-dependent tuning to execute efficiently, as the performance of GPUs and other accelerators is generally unforgiving. The thorns McLachlan/ML_WaveToy and McLachlan/ML_WaveToy_CL are examples, generated from the same Kranc script, showing the generated C++ and OpenCL code.
New features since last release
- Einstein Toolkit: All test cases pass on almost all of the tested twenty production and development machines.
- SimFactory
- Machine database and optionlists updated due to system changes on HPC resources
- Simfactory can now be used to run the testsuites on many systems.
- IOUtil: checkpoint_dir is now steerable
- SphericalSurface: added functionality to name spherical surfaces
- Formaline: Support a "local repository" that collects all machine-local repositories
- TimerReport: Allow different timers on different processes
- WeylScal4: Enable use of LoopControl, and hence OpenMP
- EOS_Omni: use C interface for HDF5 to avoid needing Fortran HDF5 bindings
- GRHydro:
- use atmosphere integer mask instead of bitmask
- remove (now) unused old Tmunu interface
- Implemented enhanced PPM scheme by Colella & Sekora 2008, McCorquodale & Colella 2011. Can be activated by setting use_enhanced_ppm = yes
- External Libraries: several updates and configuration improvements
- Cactus
- implement per-variable tolerances for Cactus testsuites; for the detailed discussion, see ET ticket #114
- Allow arithmetic expression in ParameterSet: parameter files can now contain a limited set of expressions
- Handle requirements recursively
- A lot of smaller bug fixes
- McLachlan: Implement CCZ4 formulation
- CarpetMask: Keep track of the volume that is masked out
- CarpetLib: Define MPI reduction operators for complex numbers
- CarpetIOASCII: Add new "compact" output format
- Carpet: Support accelerator data transfer
- CarpetRegrid2: Add periodic boundary conditions
- Simfactory
- Use OpenMP by default
- Make running testsuites using simfactory possible
- Updated a lot of configurations
How to upgrade from ET_2011_10 (Maxwell)
To upgrade from the previous release, use GetComponents with the new thornlist to check out the new version. GetComponents can also be used to update an existing checkout, but since some components might have changed location or been removed from the toolkit, a fresh checkout is preferable whenever possible.
See the Download page on the Einstein Toolkit website for download instructions.
Remaining issues with this release
- Certain machines need to be configured specially in Simfactory because the remote directories cannot be determined automatically just from the username. See the Machine notes below.
- CarpetIOHDF5: When the new parameter CarpetIOHDF5::output_symmetry_points is set to "yes", then symmetry points are not actually output if CarpetIOHDF5::output_buffer_points has been set to "no".
- Recovering with Carpet: The maximum number of timelevels that can be recovered is Carpet::prolongation_order_time+1. This is usually the case, but it is possible to write parameter files e.g. with prolongation_order_time=1 that use 3 timelevels. This bug manifests in an assert() failure when recovering from checkpoints. The work-around is to either increase Carpet::prolongation_order_time or decrease the number of timelevels for the grid function in question accordingly.
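The timelevel limit described in the last item can be checked before a run with a small script. The following is a minimal sketch in Python, not part of Carpet; the group names and timelevel counts are hypothetical and only illustrate the rule that at most prolongation_order_time+1 timelevels can be recovered.

```python
def max_recoverable_timelevels(prolongation_order_time):
    """Limit stated in the release notes: prolongation_order_time + 1."""
    return prolongation_order_time + 1


def groups_exceeding_limit(prolongation_order_time, timelevels):
    """Return the names of groups whose timelevel count exceeds the limit."""
    limit = max_recoverable_timelevels(prolongation_order_time)
    return sorted(name for name, count in timelevels.items() if count > limit)


# Hypothetical groups and timelevel counts, for illustration only.
example = {"ADMBase::metric": 3, "MyThorn::scalar": 2}

# With prolongation_order_time = 1 the limit is 2, so the 3-level group is flagged.
print(groups_exceeding_limit(1, example))
# With prolongation_order_time = 2 the limit is 3, so nothing is flagged.
print(groups_exceeding_limit(2, example))
```

Any group reported by such a check would need either a larger Carpet::prolongation_order_time or fewer timelevels, as described in the work-around above.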
Machine notes
Kraken
defs.local.ini needs to have sourcebasedir = $HOME configured for this machine. You need to determine $HOME by logging in to the machine.
LoneStar and Ranger
defs.local.ini needs to have sourcebasedir = $WORK and basedir = $SCRATCH/simulations configured for this machine. You need to determine $WORK and $SCRATCH by logging in to the machine.
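As an illustration of the entries described above, the following Python sketch renders a fragment in INI syntax. It assumes that defs.local.ini uses one section per machine; the section name "lonestar" and both paths are placeholders, not values from this announcement, and the real values of $WORK and $SCRATCH have to be looked up on the machine itself.

```python
import configparser
import io


def render_machine_section(machine, settings):
    """Render one INI section holding the given key/value settings."""
    config = configparser.ConfigParser(interpolation=None)
    config.optionxform = str  # keep option names exactly as written
    config[machine] = settings
    buffer = io.StringIO()
    config.write(buffer)
    return buffer.getvalue()


# Placeholder paths; replace them with the expanded $WORK and $SCRATCH values.
fragment = render_machine_section(
    "lonestar",
    {
        "sourcebasedir": "/path/to/work",
        "basedir": "/path/to/scratch/simulations",
    },
)
print(fragment)
```

The printed fragment can be compared with an existing defs.local.ini before anything is copied into it.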
Pandora
At the time of writing, the LSU system Pandora has severe problems compiling Cactus (compilation can crash the filesystem), and, partly because of this, some testsuite failures are still outstanding. Please contact Peter Diener at CCT (LSU) before trying to use this machine.
Maxwell (ET_2011_10)
New features since last release
- Einstein Toolkit: All test cases pass on almost all of the tested twenty production and development machines.
- Carpet
- Significant internal development
- Grid structure is handled in a more efficient manner, leading to improved parallel scalability
- Grid structure output now supports multipatch
- Improvements to OpenMP parallelism in Carpet
- Support for cell-centering
- Timers are now hierarchical - use parameter output_timer_tree_every to output the timer tree to standard output. This makes it much easier to see where the time is spent in a simulation
- A backtrace file is now written to the output directory when the simulation code crashes. Note that you probably need to add the -rdynamic option to CFLAGS and CXXFLAGS for the backtrace symbols to be interpreted correctly.
- CarpetIOHDF5: There are now parameters which select whether symmetry, boundary and buffer points are output for sliced output.
- CarpetRegrid2: Now supports "true" AMR based on a regridding criterion
- SimFactory
- Completely new rewrite, new repository.
- Machine database and optionlists updated due to system changes on HPC resources
- Can now run the Cactus test suites as part of a job in a queuing system
- TODO: List of new machines supported by SimFactory?
- Optionlists now enable instruction vectorisation by default - this affects those thorns that explicitly use this vectorisation, including McLachlan and Carpet
- Now supports parameter file scripts <name>.rpar - these should be scripts which write a parameter file to <name>.par. This is useful for performing simple calculations on parameters in python or perl
- Now uses the Intel compiler by default on Kraken and Hopper
- Cactus
- CUDA support added for GPU computing
- Parameters can now be used in STORAGE specifications in schedule.ccl files
- Multi-line parameter values can now contain comments - this makes it easier to comment out entries
- Mac OS 10.7 (Lion) is now supported
- CCTK_GFINDEX3D now checks index against array bounds when CCTK_DEBUG is defined
- Standard output of Cactus build process is now much more compact
- McLachlan
- Performance improvements
- BSSN has instruction vectorisation enabled by default for improved speed
- GRHydro
- Support for MHD was added, but is by default disabled.
- WeylScal4: OpenMP support enabled in WeylScal4
- TimerReport: "top timers" now given as min/max/mean across all processes instead of just from the root process
- ADMBase
- Variables now have flat boundary condition applied
- Default value of ADMBase::initial_shift is now zero rather than none
- TwoPunctures: Now outputs a BBH metadata file, as used by NINJA / NRAR projects
- Vectors: New thorn which supports instruction vectorization to improve performance of codes that use it
- Cauchy Characteristic Extraction and the PITT Null Code are now included.
- The Pitt code implements a robust fully nonlinear characteristic evolution scheme for the Einstein equations for asymptotically flat spacetimes.
- Included in the code is the gauge invariant calculation of the Bondi News function at future null infinity.
- Included in the code are thorns that implement Cauchy Characteristic extraction, where Cauchy evolutions (McLachlan) provide boundary data for a characteristic evolution. This allows for the unambiguous calculation of the gravitational waveform from merging BBH spacetimes.
- FFTW3 library has been added to the ET
- Kranc
- thorns can now be generated including a Jacobian transformation of all derivatives - this means they can be used with multi-patch
- improvements to instruction vectorization
- can now perform finite differences using either function calls or macros, controlled by VECTORISE_INLINE = yes/no in the optionlist. Using functions can make the code fit in the instruction cache where it didn't before, resulting in large speed increases. Using macros can cause compilers to run out of memory for complicated codes.
- Generated thorns now check that there are sufficient ghost and boundary points for the finite differencing stencil used
- error detection has been improved
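The parameter file scripts (<name>.rpar) mentioned in the SimFactory items above can be sketched in Python as follows. This is a minimal illustration, not a complete simulation setup: the half-width and the number of points are made-up inputs, and the three CoordBase parameters stand in for whatever parameters a real script would compute.

```python
#!/usr/bin/env python3
import os
import sys


def par_path(script_path):
    """Map <name>.rpar to <name>.par, the file the script is expected to write."""
    root, _ext = os.path.splitext(script_path)
    return root + ".par"


def render(half_width, points):
    """Derive the grid spacing from two inputs and return parameter file text."""
    dx = 2.0 * half_width / points
    lines = [
        "# generated parameter file fragment (illustrative, not a complete setup)",
        f"CoordBase::xmin = {-half_width}",
        f"CoordBase::xmax = {half_width}",
        f"CoordBase::dx = {dx}",
    ]
    return "\n".join(lines) + "\n"


# Only write the file when the script is actually run under a name ending in .rpar.
if __name__ == "__main__" and sys.argv[0].endswith(".rpar"):
    with open(par_path(sys.argv[0]), "w") as par_file:
        par_file.write(render(half_width=10.0, points=40))
```

Deriving the spacing inside the script keeps the domain size and resolution consistent, which is the kind of simple calculation the .rpar mechanism is meant for.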
How to upgrade from ET_2011_05 (Curie)
To upgrade from the previous release, it is necessary to download a separate copy of the new release using GetComponents and manually copy any local changes from the old version to the new version. Unfortunately, due to the significant changes to SimFactory and Carpet in this release, it is not possible to perform an automatic upgrade.
See the Download page on the Einstein Toolkit website for download instructions.
There is also a SimFactory Transition Guide which explains the important differences in the new version. (TODO)
Remaining issues with this release
- Certain machines need to be configured specially in Simfactory because the remote directories cannot be determined automatically just from the username. See the Machine notes below.
- CarpetIOHDF5: When the new parameter CarpetIOHDF5::output_symmetry_points is set to "yes", then symmetry points are not actually output if CarpetIOHDF5::output_buffer_points has been set to "no".
- Recovering with Carpet: The maximum number of timelevels that can be recovered is Carpet::prolongation_order_time+1. This is usually the case, but it is possible to write parameter files e.g. with prolongation_order_time=1 that use 3 timelevels. This bug manifests in an assert() failure when recovering from checkpoints. The work-around is to increase Carpet::prolongation_order_time accordingly.
- CarpetMask: CarpetMask excises the interior of the black hole from the reduction operations, thus effectively reducing the volume of the simulation domain. However, CarpetReduce is not aware of this, and thus emits the following warning when CarpetMask is used:
WARNING level 1 in thorn CarpetReduce processor 0 host kop70.datura.admin (line 120 of Cactus/arrangements/Carpet/CarpetReduce/src/mask_test.c): -> Simulation domain volume and reduction weight sum differ INFO (CarpetReduce): Simulation domain volume: 432000 INFO (CarpetReduce): Reduction weight sum: 431999.996160507
In this case, this warning is harmless, and it can be avoided by increasing the warning level e.g. to 4 in the CCTK_VWarn call on line 120 of Cactus/arrangements/Carpet/CarpetReduce/src/mask_test.c.
Compatibility notes
- SimFactory: Completely rewritten to make it more maintainable. The user interface has changed; see the Tutorial for new users
- Carpet: Can no longer use symmetries specified using CartGrid3D::domain. Use the symmetry thorns in CactusNumerical instead.
- Cactus: Build process parameter SILENT=no has been replaced with VERBOSE=yes
- Applying boundary conditions consistently: Boundary conditions were not applied consistently, in particular after recovering. There is now a new schedule group MoL_PseudoEvolutionBoundaries. Boundary conditions (including synchronisation, symmetry conditions etc.) for functions scheduled in MoL_PseudoEvolution need to be scheduled in MoL_PseudoEvolutionBoundaries.
- Enforcing constraints consistently: Certain constraints should be enforced only during time evolution. Since the group MoL_PostStep is scheduled at many other occasions, such constraints need to be scheduled in the new group MoL_PostStepModify instead.
- McLachlan: Use of ML_BSSN_O2, ML_BSSN_O8 and ML_BSSN_MP_O8 is now deprecated and these thorns will be removed in the next release. ML_BSSN can be used with the new fdOrder parameter (set to 2, 4, 6 or 8) to control finite differencing order. Multipatch can be enabled in ML_BSSN in the parameter file (see Kranc documentation).
- WeylScal4: Parameter fd_order = 2nd/4th is now deprecated in favour of fdOrder = 2/4/6/8. fd_order will be removed in the next release.
- LocalReduce and LocalInterp have moved from CactusBase to CactusNumerical
- Carpet: Refinement levels can now have different numbers of ghost zones. This means that cctk_nghostzones is NOT defined in global mode any more, and will contain poison.
- EOS_Omni: This is now the only supported EOS interface within the Einstein Toolkit. Documentation for this interface is bundled with the EOS_Omni thorn.
- Cactus: SILENT=no is deprecated within Cactus, and is superseded by VERBOSE=yes.
Machine notes
Kraken
defs.local.ini needs to have sourcebasedir = $HOME configured for this machine. You need to determine $HOME by logging in to the machine.
LoneStar and Ranger
defs.local.ini needs to have sourcebasedir = $WORK and basedir = $SCRATCH/simulations configured for this machine. You need to determine $WORK and $SCRATCH by logging in to the machine.
Curie (ET_2011_05)
[text from release announcement]
This release comprises the following tools, arrangements, and thorns. Each tool/arrangement/thorn may have its own licencing conditions, but all are available as open source. Green components are new in this release.
Cactus Flesh
CactusBase Standard Cactus thorns
CactusConnect
CactusElliptic
CactusIO
CactusNumerical
CactusPUGH
CactusPUGHIO
CactusTest
CactusUtils new: NoMPI
ExternalLibraries Interfaces to external libraries, new: zlib
Carpet Adaptive mesh refinement
EinsteinAnalysis Einstein Toolkit
EinsteinBase
EinsteinEOS new: EOS_Omni, others will be removed next release
EinsteinEvolve LegoExcision will be removed next release
EinsteinInitialData
EinsteinUtils
McLachlan BSSN implementation
TAT/TATelliptic Various thorns
AEIThorns Thorns hosted at AEI new: PunctureTracker, SystemStatistics
LSUThorns Thorns hosted at LSU new: Vectors
Kranc Automated code generation
GetComponents Downloading tools and thorns new repository
SimFactory Building code and running simulations
The Simulation Factory contains ready-to-use configuration details for more than 60 systems, including most HPC systems at DOE, LONI, TeraGrid, and RZG.
The Einstein Toolkit thorns contain over 130 regression test cases. On a large portion of the tested machines, all of these testsuites pass, using both MPI and OpenMP.
The changes between this and the previous release include:
- A new equation of state (EOS) interface was introduced, replacing both EOS_Base and EOSG_Base. It was designed with efficiency in mind, and combines all EOSs into one single thorn. All previously supported EOSs are now provided by EOS_Omni. The other EOS thorns are still maintained, but their support will be dropped at the next Einstein Toolkit release.
- The location of the GetComponents script changed (now hosted at github).
- The MHD implementation within GRHydro saw several updates, but is still disabled by default.
- Since spacetime-excision is not actively used anymore and not supported by an evolution thorn within the ET, this will be the last time LegoExcision will be part of an Einstein Toolkit release. Please consider other options if you rely on it, or let us know so that we can reconsider this decision.
- This release still ships with the Perl-version of Simfactory, but includes updated machine configurations and some bug fixes.
- Some external libraries now check for the parallel usage of the old library interface (e.g. HDF5=yes) and abort in this case. The new way (e.g. HDF5_DIR=...) is not compatible with the old way to specify libraries. If you get errors because of this you have to remove one of the two specifications from your optionlist.
Chandrasekhar (ET_2010_11)
[text from release announcement]
This release comprises the following tools, arrangements, and thorns. Each tool/arrangement/thorn may have its own licencing conditions, but all are available as open source. Green components are new in this release; components shown in red are no longer part of the Einstein Toolkit:
Cactus Flesh
CactusBase Standard Cactus thorns
CactusConnect
CactusElliptic
CactusExternal Not part of the Einstein Toolkit anymore (use ExternalLibraries/libjpeg instead of jpeg6b)
CactusIO
CactusNumerical new: InterpToArray
CactusPUGH
CactusPUGHIO
CactusTest Various Cactus testsuite thorns
CactusUtils
CactusWave Wavetoy example thorns
ExternalLibraries Interfaces to external libraries (new: OpenSSL, libjpeg, several updates in other thorns)
Carpet Adaptive mesh refinement
EinsteinAnalysis Einstein Toolkit
EinsteinBase
EinsteinEOS
EinsteinEvolve
EinsteinInitialData
EinsteinUtils
McLachlan BSSN implementation
TAT/TATelliptic Various thorns
AEIThorns/AEILocalInterp
LSUThorns/QuasiLocalMeasures
LSUThorns/SummationByParts
Kranc Automated code generation
GetComponents Downloading tools and thorns
SimFactory Building code and running simulations
All repositories participating in this release carry a branch "ET_2010_11" marking this release. These release branches will be updated if severe errors are found.
This release has been tested on the following systems and architectures:
- Workstations (Intel, Linux)
- MacBook Pro notebook (Intel, Mac OS X)
- Blue Drop, NCSA (Power 7, Linux)
- Damiana, AEI (Intel Woodcrest cluster, Linux)
- Kraken, NICS (Cray XT5, Linux)
- Philip, LSU (Intel cluster, Linux)
- Queen Bee, LONI (Intel cluster, Linux)
- Ranger, TACC (AMD cluster, Linux)
The Simulation Factory contains ready-to-use configuration details for more than 20 additional systems, including most HPC systems at DOE, LONI, TeraGrid, and RZG.
The Einstein Toolkit thorns contain 132 regression test cases. While all test cases pass on some systems, there are unfortunately also some systems where certain test cases fail. We verified that this is because of accumulation of floating-point round-off error in most cases, and we will discuss this issue in a broader context in the near future.
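Why round-off error can make a regression test fail on one system and pass on another is easiest to see in a tiny example. The Python sketch below is only an illustration and does not reproduce how the Cactus testsuite compares data: it adds the same three numbers in two different orders, as might happen when a compiler or a parallel decomposition reorders a sum, and then compares the results exactly and with a relative tolerance. The tolerance value is an arbitrary choice for the example.

```python
import math

# The same three numbers, added in two different orders.
left_to_right = (0.1 + 0.2) + 0.3
right_to_left = 0.1 + (0.2 + 0.3)

# An exact comparison reports a difference, although both results are
# valid roundings of the same mathematical sum.
exactly_equal = left_to_right == right_to_left

# A comparison with a relative tolerance (value chosen arbitrarily here)
# treats the two results as equal.
equal_within_tolerance = math.isclose(left_to_right, right_to_left, rel_tol=1e-12)

print(exactly_equal, equal_within_tolerance)
```

Over many time steps such last-digit differences can grow, which is why the choice of tolerance matters when test output from different systems is compared.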
The Einstein Toolkit web site contains online documentation for its thorns, and pointers for using it to build your own code. There is also a tutorial that explains how to download, build, and run the code for a simple binary black hole evolution. We invite you to join our mailing list <users@einsteintoolkit.org>.
The changes between this and the previous release include (not complete):
- Several Libraries can now be built in parallel-make mode, considerably increasing compilation speed on some machines.
- Several Libraries now clean up intermediate files, often using considerably less disk space per configuration.
- GRHydro includes (disabled, not yet finished) support for MHD. Don't try to use it yet, and don't get confused about that code.
- Several Libraries have been updated (ExternalLibraries)
- Simfactory received several updates, and this will likely be the last release with the Perl version.
- The links in the Reference manual now work (again).
- A lot of other bugs and testsuites were corrected.
On behalf of the Einstein Toolkit Consortium: the "Chandrasekhar" Release Team
Gabrielle Allen, Eloisa Bentivegna, Tanja Bode, Peter Diener, Roland Haas, Ian Hinder, Frank Löffler, Bruno Mundim, Christian D. Ott, Erik Schnetter, Eric Seidel, Michael Thomas
November 23, 2010
Bohr (ET_2010_06)
[text from release announcement]
This release comprises the following tools, arrangements, and thorns. Each tool/arrangement/thorn may have its own licencing conditions, but all are available as open source:
Cactus Flesh
CactusBase Standard Cactus thorns
CactusConnect
CactusElliptic
CactusExternal
CactusIO
CactusNumerical
CactusPUGH
CactusPUGHIO
CactusUtils
ExternalLibraries Interfaces to external libraries
Carpet Adaptive mesh refinement
EinsteinAnalysis Einstein Toolkit
EinsteinBase
EinsteinEOS
EinsteinEvolve
EinsteinInitialData
EinsteinUtils
McLachlan BSSN implementation
TAT/TATelliptic Various thorns
AEIThorns/AEILocalInterp
LSUThorns/QuasiLocalMeasures
LSUThorns/SummationByParts
Kranc Automated code generation
GetComponents Downloading tools and thorns
SimFactory Building code and running simulations
All repositories participating in this release carry a branch "ET_2010_06" marking this release. These release branches will be updated if severe errors are found.
This release has been tested on the following systems and architectures:
- Workstations (Intel, Linux)
- MacBook Pro notebook (Intel, Mac OS X)
- Blue Drop, NCSA (Power 7, Linux)
- Damiana, AEI (AMD cluster, Linux)
- Kraken, NICS (Cray XT5, Linux)
- Philip, LSU (Intel cluster, Linux)
- Queen Bee, LONI (Intel cluster, Linux)
- Ranger, TACC (AMD cluster, Linux)
The Simulation Factory contains ready-to-use configuration details for more than 20 additional systems, including most HPC systems at DOE, LONI, TeraGrid, and RZG.
The Einstein Toolkit thorns contain 89 regression test cases. While all test cases pass on important systems, there are unfortunately also some systems where certain test cases fail. We verified that this is because of accumulation of floating-point round-off error in most cases, and we will discuss this issue in a broader context in the near future.
The Einstein Toolkit web site contains online documentation for its thorns, and pointers for using it to build your own code. There is also a tutorial that explains how to download, build, and run the code for a simple binary black hole evolution. We invite you to join our mailing list <users@einsteintoolkit.org>.
On behalf of the Einstein Toolkit Consortium: the "Bohr" Release Team
Gabrielle Allen, Eloisa Bentivegna, Tanja Bode, Peter Diener, Roland Haas, Ian Hinder, Frank Loeffler, Bruno Mundim, Erik Schnetter, Eric Seidel
June 17, 2010