Improving the treatment of external libraries

Background

Sometimes Cactus code needs to call code in libraries external to Cactus. In order to ensure that these libraries are available to Cactus, each is represented by a Cactus thorn. These thorns perform the following functions:

  1. Provide configuration variables which the user can set in their option list to specify the location of the library on the machine (see the example option-list snippet after this list). Sometimes this is as simple as <lib>_DIR = /path/to/lib, but sometimes additional options need to be set, for example if the lib and include directories are in nonstandard locations relative to the base library path. The external library thorn translates the user's settings for these variables into compiler and linker options, which Cactus then uses when compiling and linking.
  2. In the case that the user doesn't specify the location of the library, the thorn attempts to locate the library in some standard locations.
  3. As a last resort, or if requested by the user, the thorn will build a version of the library and install it in the configuration directory. For this reason, the thorn usually contains a distribution tarball of the library which is extracted and compiled during the Cactus configuration (CST) stage. Sometimes there is also a patch which is applied to the source of the library. Having the libraries built as part of the configuration ensures that all the libraries needed are always available even if they have not been installed on the machine, but adds a significant overhead to compilation time.
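
For illustration, the option-list settings for a library such as HDF5 might look like this (the paths here are hypothetical):

 HDF5_DIR      = /opt/local
 HDF5_INC_DIRS = /opt/local/include
 HDF5_LIB_DIRS = /opt/local/lib

Setting only HDF5_DIR suffices when the installation has the standard layout; the _INC_DIRS and _LIB_DIRS variables are needed when the include and lib directories are in nonstandard locations.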

Problems with the existing mechanism

At present, the logic for performing all of the above is implemented in configure.sh scripts in each library thorn. This logic can be quite subtle and involved, and it is desirable for this to be implemented in a central location in a careful and correct way, rather than being duplicated for each library thorn.

Requirements

Collecting together the behaviour required by library thorns, we find that such a mechanism needs to make it straightforward to do the following:

  • Provide a variable, <lib>_DIR, which either:
    • points to the library installation, in which case the thorn checks that the library in this location is usable and, if so, uses it; otherwise compilation is aborted;
    • is BUILD (caseless), in which case the thorn builds the library and uses that version;
    • is NO_BUILD (caseless), in which case the thorn searches for a version on the system and uses it; if none can be found, compilation is aborted;
    • is unset, or is the empty string, in which case the library is first searched for, and if not found, is built (we could have a synonym AUTO (caseless) for this, to make it explicit)
  • Provide variables which point to the include and library directories, defaulting to <lib>_DIR/include and <lib>_DIR/lib.
  • Search some standard locations for the library.
  • Test that a given setting of the configuration variables works correctly, i.e. that a working version of the library is actually found at the given location.

Current configuration scripts search through some "standard" directories such as /usr and /usr/local and then look in standard subdirectories such as "lib" and "lib64" for libraries with standard extensions such as ".a", ".so" and ".dylib" (for Mac OS). This search logic is very low-level, and should probably be replaced by something higher level. For example, we shouldn't make assumptions about the possible library extensions (.so etc), but instead should leave that to the linker. Similarly, many standard installations of libraries will be in locations which the linker searches automatically, so in those cases, if the linker can find the library without assistance, we should not try to second-guess it.

Proposal

Building

The thorn can provide a build.sh script which is responsible for building and installing the library in the configuration directory. Factoring this out into its own script makes the system more modular, and separates the decision of whether to build from the code that performs the build.
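
As a rough sketch (the variable names SRCDIR, BUILD_DIR and INSTALL_DIR are hypothetical; the actual interface between Cactus and build.sh would have to be defined), a build.sh for a library like zlib might look like this:

 #!/bin/bash
 # Hypothetical build.sh sketch: unpack, build and install the library
 # into the configuration directory.  SRCDIR, BUILD_DIR and INSTALL_DIR
 # are assumed to be exported by the calling configuration machinery.
 set -e
 cd "${BUILD_DIR}"
 tar xzf "${SRCDIR}/dist/zlib-1.2.7.tar.gz"
 cd zlib-1.2.7
 ./configure --prefix="${INSTALL_DIR}"
 make
 make install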

Testing

The thorn can provide a source file conftest.cc which, if successfully compiled, linked and run, indicates that the options used to compile and link it are sufficient to find the library. When the user sets <lib>_DIR = /path/to/lib, sets <lib>_INC_DIRS and/or <lib>_LIB_DIRS, or sets <lib>_DIR = NO_BUILD, these settings are used to compile and link this program in order to test that this is possible. This testing ensures that a missing library is detected early, during the configuration stage, rather than when the first source file tries to include the missing header file.
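
The test itself could then be a few lines of shell; for instance (a sketch, assuming hypothetical variables INC_FLAGS and LIB_FLAGS derived from the user's <lib>_INC_DIRS and <lib>_LIB_DIRS settings):

 # Hypothetical test sketch: compile, link and run conftest.cc with the
 # candidate options; a non-zero exit status means the library is not
 # usable with these settings.  (For libraries like MPI, where running a
 # program is not always possible, the ./conftest step would be skipped
 # and compiling plus linking would have to suffice.)
 if ${CXX} ${INC_FLAGS} conftest.cc ${LIB_FLAGS} -o conftest &&
     ./conftest; then
     echo "library is usable"
 else
     echo "error: library not usable with the given options" >&2
     exit 1
 fi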

Searching

When the library location has not been specified, the system needs to search for the library. It will first attempt to compile, link and run the test program with no additional options; this may succeed if the library is installed in a location which is already on the compiler's include path and the linker's search path. If this fails, a sequence of include and library directories can be tried, and the first combination which works is then used (see the sketch below). It may turn out that this is largely unnecessary, because any "standard" paths that the thorn might know about would probably be searched by the compiler and linker anyway; a possible exception is a MacPorts installation in /opt/local, which the compiler does not search by default. Note that this is related to the following problem: if we add, e.g., /opt/local/lib to the linker path for, say, HDF5, then something else, such as MPICH, might also be picked up from there, even if the user has set MPI_DIR to point somewhere else. We do not have a solution to this problem.
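
A possible shape for this search, as a sketch (try_library is a hypothetical helper which performs the compile/link/run test described above with the given include and library directories):

 # Hypothetical search sketch: try the compiler and linker defaults
 # first (empty prefix), then a short list of known nonstandard
 # prefixes such as a MacPorts installation in /opt/local.
 FOUND=no
 for prefix in "" /opt/local; do
     if try_library "${prefix:+${prefix}/include}" "${prefix:+${prefix}/lib}"; then
         INC_DIRS="${prefix:+${prefix}/include}"
         LIB_DIRS="${prefix:+${prefix}/lib}"
         FOUND=yes
         break
     fi
 done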

Factoring out common code

Each library thorn's configure.sh script can call a bash function (or maybe a script) which we provide in the flesh: configure_library:

 configure_library <lib> <dist> <patch>...
  • <lib>: A capitalised name of the library, e.g. HDF5, HWLOC or MPI, suitable for use in environment variable names.
  • <dist>: A path to the source distribution tarball relative to the thorn, e.g. dist/zlib-1.2.7.tar.gz
  • <patch>: A path to any patch that needs to be applied
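
With this, a library thorn's configure.sh could reduce to a single call; using the example distribution above (the patch file name is hypothetical):

 configure_library ZLIB dist/zlib-1.2.7.tar.gz dist/zlib.patch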

configure_library does the following:

  • Enable verbose output if requested
  • Look at the variables set by the user, choose which of the following to do, and do it (see the sketch after this list):
    • Build the library
    • Use the library from a specified location, and test that this works, otherwise abort
    • Search for the library, and use it if found, otherwise abort
    • Search for the library, use if found, otherwise build
  • Set the Cactus include and link variables so that the library is found appropriately by Cactus, and define the HAVE_<lib> macro.
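
The second step might be sketched as a case statement (DIR and NAME, as well as the helpers build_library, find_library, use_library and abort, are hypothetical, standing in for the building, searching and testing steps described above):

 # Hypothetical dispatch sketch on the user's setting of <lib>_DIR,
 # lower-cased so that BUILD, build, etc. are treated the same.
 case "$(echo "${DIR}" | tr '[:upper:]' '[:lower:]')" in
     build)
         build_library ;;
     no_build)
         find_library || abort "${NAME} not found on the system" ;;
     ""|auto)
         find_library || build_library ;;
     *)
         use_library "${DIR}" || abort "no usable ${NAME} in ${DIR}" ;;
 esac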

Questions

  • How should the user get the behaviour of <lib>_DIR pointing to a directory, i.e. explicitly using a specific installation, when there is no single such directory and the include and lib dirs need to be specified separately? Maybe setting any of these three variables (<lib>_DIR, <lib>_INC_DIRS and <lib>_LIB_DIRS) should trigger using exactly that version of the library.

Additional Issues

(To be worked into the text above.)

  • "Standard" paths such as those with prefix /usr or /usr/local must not be added to INC_DIRS and LIB_DIRS (BCM: or at least it should be possible to optionally add them at the end)
  • Instead of a capitalised library name, we should pass the "natural" name, and capitalise within the script. This allows nicer error messages.
  • Assuming that configure_library unpacks the library, it will also be necessary to pass in the directory in which the (unpacked) configure script is found.
  • <patch>: Note that multiple patches may be required.
  • Running a test program: Sometimes it is not possible to run a program (e.g. for MPI). In this case, building and linking is still a good idea for testing.
  • We should think about cross-compiling. It needs to be supported at least to the extent that the configuration options can specify an existing library, which is then accepted by Cactus.
  • Maybe there should be a generic mechanism to skip all testing?
  • Having a _DIR variable containing "BUILD" and "NO_BUILD" doesn't really make sense. Maybe we should set e.g. "HDF5=BUILD", or "HDF5=NO_BUILD", and use HDF5_DIR only for providing default values for HDF5_LIB_DIRS? In this case, "HDF5=NO_TEST" would also be possible.
  • Certain environment variables (e.g. LIBS) should automatically be unset by configure_library, not by the thorn's build script. These variables have a special, non-standard meaning in Cactus, and basically all configure scripts get confused by them.

Possible improvements

(These are not really related to the topic discussed here.)

  • In most cases, a library that has been built once can be re-used by all other builds on the same machine. (Exceptions exist.) Since reducing build time is mentioned as a goal above, we could install libraries into a specific directory (e.g. Cactus/exe/lib), and thus share external library builds between configurations. These would then not be deleted by a make realclean, but would be rebuilt when necessary. It is up to the option list to ensure that incompatible option lists specify different install directories. Maybe the option list name itself could be used as prefix.
  • We should separate the distributed tarballs from the thorns containing the logic. This would save download time for those who don't want / need the tarballs.
  • We could allow the tarballs to be downloaded directly and only when needed, without storing them in svn.

Some high-level thoughts

(These should probably be at the top.)

  • Do we really want to continue to develop this mechanism? Or can we adopt rpm, dpkg or pkg-config instead? In the long run, it may be easier to write a pkg-config definition for LORENE than to maintain whole sets of configuration scripts ourselves.
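
For comparison, a pkg-config definition is a short text file; a hypothetical lorene.pc (the paths, version and library name here are made up for illustration) could look like:

 prefix=/usr/local
 libdir=${prefix}/lib
 includedir=${prefix}/include
 Name: LORENE
 Description: Numerical relativity library
 Version: 1.0
 Cflags: -I${includedir}
 Libs: -L${libdir} -llorene

Compiler and linker options would then be obtained with "pkg-config --cflags lorene" and "pkg-config --libs lorene".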

Building from source

(move this to a separate page)

The current system of storing the source tarball in the thorn SVN repository is not ideal. Included here is an email written by Ian in 2013 about this, with some ideas for improving the system.

  • The tarball would not be included in the repository. RH: I would prefer to store tarballs (or their content) in a repository we control; this does not have to be the same repository that holds the control files. The reason is that some of the ExternalLibraries may change their hosting URL or may not live as long as the ET. For example, the oldest version of curl still available on the webserver at http://curl.haxx.se/ is from 2000, and anything older than 2010 is in an "archeology" subdirectory.
  • The configuration.ccl file of the thorn would have a field for the URL of a tarball to download
  • At configuration time, the thorn's configure script (or something factored into a Cactus script) would download the tarball using this URL if it is necessary to build the library on that machine (see the sketch after this list).
  • The tarball would be cached in Cactus/librarycache or similar
  • A script or makefile target would be provided to "pre-cache" the tarball of one or more thorns. RH: GetComponents can download arbitrary single files via http so we already have that possibility
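
The download-and-cache step could be as simple as the following sketch (URL is assumed to come from the thorn's configuration.ccl; the cache location follows the proposal above):

 # Hypothetical caching sketch: download the tarball only if it is not
 # already present in the shared cache.
 CACHE="Cactus/librarycache"
 mkdir -p "${CACHE}"
 tarball="${CACHE}/$(basename "${URL}")"
 if [ ! -e "${tarball}" ]; then
     curl -L -o "${tarball}" "${URL}"
 fi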

Under this scheme, a checkout of the ET would not download any external library tarballs. If, on a given machine, a library is not installed and needs to be built, its tarball would be downloaded when needed. If you want to make sure you have everything needed to build, for example because you are about to catch a plane, you could run something like "make get-libraries", and all library tarballs would be downloaded into Cactus/librarycache. [It would be possible to do this using simfactory, which might be able to determine via its machine database which machines need which tarballs downloaded.] You could then choose to sync this cache to the remote machine, or not. On a remote machine, you could run get-libraries on the head node, which presumably has an internet connection, so that the library tarballs are available to the build process, which might, as Frank points out, not have internet access.

The vast majority of users would notice little difference; the checkout would be faster, and less disk space would be used by their Cactus trees. Edge-case users who have to compile on machines without internet connections would have a simple command to run on the head node of the machine which would restore the same functionality that we currently have.