Version control

From Einstein Toolkit Documentation
Revision as of 09:47, 11 August 2013 by Hinder (talk | contribs)
Version Control for the Einstein Toolkit

The Current System

Most thorns are stored in SVN, and some are in Git. No other VCS is used for any component of the ET.

Reasons to Change

We won’t argue for Git vs SVN here; search the internet for why Git is preferred by many people. When the proposal to switch to Git was raised on the ET telecon, nobody objected, and many people voiced support. Further, there was no objection on the mailing list. We conclude that of the people who care about the issue, all prefer Git and support the transition. Certainly among many of the most active developers, Git is strongly preferred.

Considerations

For the majority of the toolkit, it has always been possible to check out individual thorns; with CVS and SVN this is because the VCS allows subdirectories in a repository to be treated as individual repositories.

The Cactus framework is designed to be modular, with the unit of modularity being the thorn. Thorns can be assembled and combined in a thornlist from which an executable is built. Thorns are grouped in arrangements for convenience, but Cactus does not use the concept of an arrangement for any purpose besides organising the directory structure.

Some thorns can reasonably be treated as independent of the other thorns in their arrangement, but others are intimately linked. One example is Carpet and CarpetLib. Currently, one is not usable without the other. Note, however, that the decision to split them into separate thorns (a high-level Cactus driver and a low-level mesh-refinement library) implies that CarpetLib could be used as a library independent of Carpet by some other code. This aside, developers often treat sets of thorns in an arrangement as being part of a more cohesive whole, and want to think of specific versions of the arrangement rather than of individual thorns.

Combining multiple thorns into a single repository makes it very much harder to mix different versions of each thorn. Thorns should be in separate repositories if it is possible that people will want to check them out individually, or if they are sufficiently logically separated from other thorns that you may want to use a different version of one thorn.

Options

Assuming that we wish to transition the majority of the toolkit to Git, we have the following options:

1. Require that all components of the toolkit are hosted in Git repositories
2. Allow a mixture of Git and SVN repositories
3. For all thorns hosted by the ET, use Git. For any thorns from external sources not hosted by the ET, provide ET-managed Git-SVN mirrors.

Managing the ET becomes much easier if everything is in Git (Git provides mechanisms to treat collections of repositories together), so (2) adds a lot of work. (3) is probably the best option, and hopefully the number of SVN thorns will be very small.

Additionally, we can:

1. Require each thorn to be an independent repository
2. Require each arrangement to be an independent repository
3. Decide whether a given arrangement should be split into subrepositories on a case-by-case basis

The third option is the most pragmatic.

It is strongly desirable to be able to manage the entire toolkit (or more generally, an entire Cactus tree) as a single entity. One can then talk about a specific version of the whole code, globally revert to a previous version, see the history of all thorns together, update the tree, etc.

GetComponents and Heterogeneous Repositories

There is no (?) established method for treating multiple subcomponents as a single repository if those subcomponents use different VCSs. We have developed GetComponents for the purpose of checking out such a heterogeneous collection, and support has been added to it for performing some basic operations on the whole repository, but this is a “home-grown” solution, and we would need to develop and support it ourselves.

Suggested arrangement/thorn repository split

The following split is based on logical relationships between the thorns, and ignores the issue of the potential convenience of having a number of thorns in a single repository in the absence of infrastructure for working with sets of repositories.


Arrangement repositories

- CactusBase: Nearly all of these thorns are used nearly all of the time.
- CactusPUGH: These thorns are probably quite tightly coupled.
- CactusTest: Mostly provides test cases for the flesh. Use a single repository. [IMHO could also be merged into a single thorn CactusBase/CactusTest]
- PITTNullCode: A logical grouping which makes up a larger code; these thorns are probably always used together, and might well be updated together. The spherical harmonic thorns might be usable without the rest; move out?
- EinsteinExact: Logically independent thorns, but all automatically generated from sources in a single repository. Probably makes sense to keep this as a single repository.
- KrancNumericalTools: Contains a single thorn (which may go away in future, to be replaced by code copied into the generated thorns). Part of a Git repository already.
- McLachlan: A set of thorns, logically understood as a single code, or code family, many of which share generation scripts. Treat as a single repository.
- Carpet: Thorns are components of a larger code. Arguably some of these thorns (LoopControl, CycleClock) should not be in the Carpet arrangement, but somewhere else. Probably makes sense to keep the current repository.
- CarpetExtra: (see above)


One repository per thorn

- CactusNumerical: The arrangement does not form a “larger code” for which it would make sense to treat the thorns as a unit. The thorns do not really depend on each other, and there is no intrinsic value to treating CactusNumerical as a single repository.
- CactusPUGHIO: We only actually use 2 of them in the ET thornlist and they seem to be independent of each other.
- CactusUtils: All essentially independent of each other.
- CactusWave: Independent thorns; usually you only want one.
- EinsteinAnalysis: Independent thorns. Note that these are not even all stored at the same host currently.
- EinsteinBase: The situation here is more fuzzy; you often want a lot of these, and there are dependencies, but you may well want a vacuum-only tree with no EOS, Tmunu or Hydro thorns.
- EinsteinEOS: These seem to be alternative options, not linked to each other.
- EinsteinEvolve: These are not part of a larger code, and you might want any one of them without having the rest.
- EinsteinInitialData: Individual independent thorns.
- EinsteinUtils
- ExternalLibraries: Very much individual thorns (but the discussion of what to do with these libraries is beyond the scope of this document).
- TAT, AEIThorns, LSUThorns: Groupings of thorns hosted by institutions; no real logical relationship between them, and we probably don’t include all the thorns in these repository groupings in the ET anyway. The thorns in each arrangement are not related to each other.

Miscellaneous repositories

- Cactus flesh
- SimFactory

Undecided:

- CactusConnect
- CactusElliptic
- CactusIO
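To make the case-by-case split concrete, here is a minimal sketch that maps thornlist entries of the form Arrangement/Thorn to the repository that would hold them. It is an illustration only, not part of GetComponents or any existing ET tool. The naming scheme for per-thorn repositories (`Arrangement-Thorn`), the handling of `#` comments, and the choice to treat the undecided arrangements as one repository per thorn are all assumptions made for this example.

```python
# Arrangements kept as a single repository in the suggested split.
ARRANGEMENT_REPOS = {
    "CactusBase", "CactusPUGH", "CactusTest", "PITTNullCode", "EinsteinExact",
    "KrancNumericalTools", "McLachlan", "Carpet", "CarpetExtra",
}


def repository_for(thorn_path):
    """Return the repository name for an 'Arrangement/Thorn' entry.

    Repository names are hypothetical; only the split itself follows the text.
    """
    arrangement, thorn = thorn_path.strip().split("/")
    if arrangement in ARRANGEMENT_REPOS:
        # The whole arrangement lives in one repository.
        return arrangement
    # Everything else (including undecided arrangements, by assumption)
    # gets one repository per thorn.
    return f"{arrangement}-{thorn}"


def repositories_for(thornlist_lines):
    """Return the distinct repositories needed for a thornlist, in order."""
    repos = []
    for line in thornlist_lines:
        entry = line.split("#", 1)[0].strip()  # assumed '#' comment syntax
        if not entry:
            continue
        repo = repository_for(entry)
        if repo not in repos:
            repos.append(repo)
    return repos
```

With this split, a thornlist containing CactusBase/Boundary, CactusBase/CartGrid3D and CactusNumerical/MoL would need two repositories: one for the CactusBase arrangement and one for the MoL thorn.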

Cactus flesh, CactusBase, CactusUtils, CactusPUGH etc could conceivably all be in a single Cactus repository, with commit rights being controlled by subdirectory (if using Gitolite <http://gitolite.com/gitolite/vref.html>). Are these all under the same licence?

Arguments for having lots of thorns in a single repository

- If your tools can only “see” a single repository at a time, it is hard to see all the changes that exist in your current working tree, so you don’t know what might need to be committed or discarded. Having many thorns per repository means a smaller number of repositories to check manually.
  Counter-argument: having to check more than a single repository is something which should be handled by a tool anyway, whether this is based on a native Git solution (submodules or subtrees, both well supported by SourceTree) or functionality added to GetComponents.
- Checkout time is reduced because fewer connections to the server are required.
  Counter-argument: With parallel checkouts, this might not be an important consideration.
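The parallel-checkout counter-argument can be sketched as follows. This is a hedged illustration using Python's standard thread pool and the `git clone` command; it is not how GetComponents is implemented. The function names, the `max_workers` default, and the `clone_fn` parameter (which lets the clone step be replaced, for example in a dry run) are assumptions of this example.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def clone(url, dest):
    """Clone one repository with git and return git's exit code."""
    return subprocess.run(["git", "clone", "--quiet", url, dest]).returncode


def clone_all(repos, max_workers=8, clone_fn=clone):
    """Clone every (url, dest) pair concurrently.

    Returns a dict mapping each destination to the exit code of its clone.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Submit all clones first so they run in parallel, then collect.
        futures = {dest: pool.submit(clone_fn, url, dest) for url, dest in repos}
        return {dest: future.result() for dest, future in futures.items()}
```

Because the clones mostly wait on the network, threads are sufficient here; the total checkout time is then bounded more by the slowest repository than by the number of repositories.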

The main objection raised on the ET call to having one thorn per repository in general was that this leads to a large number of repositories, and this overwhelms the user interface. SourceTree shows submodules (if we go that route) in a hierarchical structure, and lists separately those with uncommitted changes. “git status” lists the submodules with uncommitted changes.
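For a tree made of independent repositories rather than submodules, the same overview can be produced by a small tool. The sketch below runs `git status --porcelain` in each working tree and reports only those with uncommitted changes. It is an assumption-laden illustration, not existing GetComponents functionality: the function names and the `status_fn` parameter are invented for this example.

```python
import subprocess


def porcelain_status(repo_dir):
    """Return the output of 'git status --porcelain' for one working tree."""
    result = subprocess.run(
        ["git", "-C", repo_dir, "status", "--porcelain"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


def repos_with_changes(repo_dirs, status_fn=porcelain_status):
    """Return {repo_dir: [status lines]} for repositories with changes.

    Repositories whose status output is empty (a clean tree) are omitted.
    """
    changed = {}
    for repo_dir in repo_dirs:
        lines = [line for line in status_fn(repo_dir).splitlines() if line.strip()]
        if lines:
            changed[repo_dir] = lines
    return changed
```

A user with many repositories would then see a short list of the ones needing attention, which addresses the concern that a large number of repositories overwhelms the interface.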