Difference between revisions of "Version control"
Revision as of 18:28, 11 August 2013
Version Control for the Einstein Toolkit
The Current System
Most thorns are stored in SVN, some in Git. No other VCS is used for any component of the ET.
Reasons to Change
We won’t argue for Git vs SVN here; search the internet for why one or the other is preferred. In the end it's about taste and what the tool is used for (nothing is the best for everything). When the proposal to switch to Git for some components was raised on the ET telecon, nobody objected, and many people voiced support. Further, there was no objection on the mailing list. We conclude that, of the active developers who care about the issue, most prefer Git and support a transition.
Considerations
- For the majority of the toolkit, it has always been possible to check out individual thorns; this is because for much of the toolkit, there is a 1:1 correspondence between thorns and repositories. This decision was made to allow people to check out just what they needed.
- The Cactus framework is designed to be modular, with the unit of modularity being the thorn. Thorns can be assembled and combined in a thornlist from which an executable is built.
- Thorns are grouped in arrangements for convenience, but Cactus actually does not use the concept of arrangement for any purpose besides organising the directory structure.
- Some thorns can reasonably be treated as independent of the other thorns in their arrangement, but others are intimately linked. One example is Carpet and CarpetLib. Currently, one is not usable without the other. Note, however, that the decision to split them into separate thorns (a high-level Cactus driver and a low-level mesh-refinement library) implies that CarpetLib could be used as a library independent of Carpet by some other code. This aside, developers often treat sets of thorns in an arrangement as being part of a more cohesive whole, and want to think of specific versions of the arrangement rather than of individual thorns. Combining multiple thorns into a single repository makes it much harder to mix different versions of each thorn. Thorns should be in separate repositories if it is possible that people will want to check them out individually, or if they are logically separate enough from other thorns that you may want to use a different version of one thorn.
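To make the thorn/arrangement relationship concrete, here is a minimal sketch that groups the entries of a thornlist by arrangement. It assumes a plain thornlist with one `Arrangement/Thorn` entry per line and `#` comments; the function name `parse_thornlist` and the excerpt are hypothetical, and real thornlists may contain additional directives that this sketch does not handle.

```python
from collections import defaultdict


def parse_thornlist(text):
    """Group thorn names by arrangement from 'Arrangement/Thorn' lines."""
    arrangements = defaultdict(list)
    for raw_line in text.splitlines():
        # Drop trailing comments and surrounding whitespace
        line = raw_line.split("#", 1)[0].strip()
        if not line:
            continue
        arrangement, thorn = line.split("/", 1)
        arrangements[arrangement].append(thorn)
    return dict(arrangements)


# Hypothetical excerpt of a thornlist, for illustration only
example = """
Carpet/Carpet
Carpet/CarpetLib
CactusBase/Boundary
"""
print(parse_thornlist(example))
```

The arrangement appears here only as a grouping key, which mirrors its role as a directory-level convenience rather than a unit that Cactus itself acts on.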
Options
Assuming that we wish to transition parts of the toolkit to Git, we have the following options:
- Require that all components of the toolkit are hosted in Git repositories
- Allow a mixture of Git and SVN repositories
- For all thorns hosted by the ET, use Git. For any thorns from external sources not hosted by the ET, provide ET-managed Git-SVN mirrors.
Managing the ET becomes much easier if everything is in one repository system (Git provides mechanisms to treat collections of repositories together, but then Subversion has that as well), so the second option (a mixture of Git and SVN repositories) adds some work, both on the management side and on the user side. On the other hand, we cannot and should not enforce Git for all repositories, so we will always have a mix.
Repository rearrangements
- Require each thorn to be an independent repository
- Require each arrangement to be an independent repository
- Decide whether a given arrangement should be split into subrepositories on a case-by-case basis
The third option (deciding case by case) is the most pragmatic.
It is strongly desirable to be able to manage the entire toolkit (or more generally, an entire Cactus tree) as a single entity. One can then talk about a specific version of the whole code, globally revert to a previous version, see the history of all thorns together, update the tree, etc. Note that one of the deficiencies of Git shows here: it is not possible to check out only a sub-tree of a Git repository, whereas this is not a problem for Subversion. In either case, however, with thorns spread across the globe and controlled by different groups, managing everything as a single entity is hard or impossible to achieve.
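One way to talk about "a specific version of the whole code" when the tree spans several repositories is to record the revision of every component in a manifest. The following is only a sketch of that idea: the manifest format, the function names and the component revisions are all hypothetical, and nothing like this is assumed to exist in the toolkit.

```python
import hashlib


def format_manifest(revisions):
    """Return one 'component revision' line per component, sorted by name."""
    return "\n".join(f"{name} {rev}" for name, rev in sorted(revisions.items()))


def snapshot_id(revisions):
    """Derive a single identifier for the whole tree from its manifest."""
    manifest = format_manifest(revisions)
    return hashlib.sha1(manifest.encode("utf-8")).hexdigest()


# Hypothetical components and revisions, for illustration only
tree = {
    "Carpet": "git:3f2a9c1",
    "CactusNumerical/MoL": "svn:r187",
}
print(format_manifest(tree))
print(snapshot_id(tree))
```

Because the manifest is sorted before hashing, the identifier depends only on which revision each component is at, not on the order in which the components were listed.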
GetComponents and Heterogeneous Repositories
There is no (?) established method for treating multiple subcomponents as a single repository if those subcomponents use different VCSs. We have developed GetComponents for the purpose of checking out such a heterogeneous collection, and support has been added to it for performing some basic operations on the whole repository, but this is a “home-grown” solution, and we would need to develop and support it ourselves.
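To illustrate what operating on a heterogeneous collection involves, the sketch below chooses the checkout and update commands according to each component's VCS. This is not how GetComponents itself is implemented; the `Component` type, the function names and the URLs are hypothetical. Only the standard `git clone`, `git pull`, `svn checkout` and `svn update` commands are assumed.

```python
from dataclasses import dataclass


@dataclass
class Component:
    name: str    # e.g. an arrangement or a thorn
    vcs: str     # "git" or "svn"
    url: str     # repository location (hypothetical in this example)
    target: str  # local directory for the working copy


def checkout_command(component):
    """Return the command line that creates a fresh working copy."""
    if component.vcs == "git":
        return ["git", "clone", component.url, component.target]
    if component.vcs == "svn":
        return ["svn", "checkout", component.url, component.target]
    raise ValueError(f"unsupported VCS: {component.vcs}")


def update_command(component):
    """Return the command line that updates an existing working copy."""
    if component.vcs == "git":
        return ["git", "-C", component.target, "pull", "--ff-only"]
    if component.vcs == "svn":
        return ["svn", "update", component.target]
    raise ValueError(f"unsupported VCS: {component.vcs}")


components = [
    Component("Carpet", "git", "https://example.org/carpet.git", "arrangements/Carpet"),
    Component("MoL", "svn", "https://example.org/svn/MoL/trunk", "arrangements/CactusNumerical/MoL"),
]
for component in components:
    print(" ".join(checkout_command(component)))
```

Each returned list can be passed to `subprocess.run(cmd, check=True)` to actually perform the operation; the sketch only prints the commands so that it can be run without network access.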
Possible arrangement/thorn repository split
The following split is based on logical relationships between the thorns, and ignores the issue of the potential convenience of having a number of thorns in a single repository in the absence of infrastructure for working with sets of repositories.
Arrangement repositories
- CactusBase:
  - Nearly all of these thorns are used nearly all of the time
- CactusPUGH:
  - These thorns are probably quite tightly-coupled
- CactusTest:
  - Mostly provides test cases for the flesh. Use a single repository. [IMHO could also be merged into a single thorn CactusBase/CactusTest]
- PITTNullCode:
  - A logical grouping which makes up a larger code; these thorns are probably always used together, and might well be updated together. The spherical harmonic thorns might be usable without the rest; move out?
- EinsteinExact:
  - Logically independent thorns, but all automatically generated from sources in a single repository. Probably makes sense to keep this as a single repository.
- KrancNumericalTools:
  - Contains a single thorn (which may go away in future, to be replaced by code copied into the generated thorns). Part of a Git repository already.
- McLachlan:
  - A set of thorns, logically understood as a single code, or code family, many of which share generation-scripts. Treat as a single repository.
- Carpet:
  - Thorns are components of a larger code. Arguably some of these thorns (LoopControl, CycleClock) should not be in the Carpet arrangement, but somewhere else. Probably makes sense to keep the current repository.
- CarpetExtra: (see above)
One repository per thorn
- CactusNumerical:
  - The arrangement does not form a “larger code” for which it would make sense to treat the thorns as a unit. The thorns do not really depend on each other, and there is no intrinsic value to treating CactusNumerical as a single repository.
- CactusPUGHIO:
  - We only actually use 2 of them in the ET thornlist and they seem to be independent of each other.
- CactusUtils:
  - All essentially independent from each other
- CactusWave:
  - Independent thorns; usually you only want one
- EinsteinAnalysis:
  - Independent thorns. Note that these are not even all stored at the same host currently.
- EinsteinBase:
  - The situation here is more fuzzy; you often want a lot of these, and there are dependencies, but you may well want a vacuum-only tree with no EOS, Tmunu or Hydro thorns.
- EinsteinEOS:
  - These seem to be alternative options, not linked to each other
- EinsteinEvolve:
  - These are not part of a larger code, and you might want any one of them without having the rest.
- EinsteinInitialData:
  - Individual independent thorns.
- EinsteinUtils
- ExternalLibraries:
  - Very much individual thorns (but the discussion of what to do with these libraries is beyond the scope of this document)
- TAT, AEIThorns, LSUThorns:
  - Groupings of thorns hosted by institutions; no real logical relationship between them, and we probably don’t include all the thorns in these repository groupings in the ET anyway. The thorns in each arrangement are not related to each other.
Miscellaneous repositories
- Cactus flesh
- SimFactory
Undecided:
- CactusConnect
- CactusElliptic
- CactusIO
The Cactus flesh, CactusBase, CactusUtils, CactusPUGH etc. could conceivably all be in a single Cactus repository, with commit rights being controlled by subdirectory (if using Gitolite <http://gitolite.com/gitolite/vref.html>). Are these all under the same licence?
Arguments for having lots of thorns in a single repository
- If your tools can only “see” a single repository at a time, it is hard to see all the changes that exist in your current working tree, so you don’t know what might need to be committed or discarded. Having many thorns per repository means a smaller number of repositories to check manually. Counter-argument: having to check more than a single repository is something which should be handled by a tool anyway, whether this is based on a native SVN/Git solution (submodules or subtrees) or functionality added to GetComponents.
- Checkout time is reduced because fewer connections to the server are required. Counter-argument: with parallel checkouts, this might not be an important consideration.
The main objection raised on the ET call to having one thorn per repository in general was that this leads to a large number of repositories, which overwhelms the user interface. SourceTree shows submodules (if we go that route) in a hierarchical structure, and lists separately those with uncommitted changes, but SourceTree is Mac (and Windows) only, so it is not a solution for everyone. “git status” lists the submodules with uncommitted changes, much as “svn status” does. If people don't want to use submodules or subtrees, we would need to develop tools to manage multiple repositories together. Such tools are needed anyway unless you have a single repository for the whole toolkit, which isn't going to happen.
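As a rough idea of what such a tool might do, the following sketch walks a source tree, finds the Git and SVN working copies in it, and asks each one whether it has local changes. It is an illustration only: the function names are hypothetical, it assumes the Subversion 1.7 (or later) layout with a single `.svn` directory at the root of each working copy, and it treats any output from `git status --porcelain` or `svn status` as "has changes".

```python
import os
import subprocess

# Metadata entry that marks the root of a working copy, per VCS
VCS_MARKERS = {".git": "git", ".svn": "svn"}


def find_repositories(root):
    """Return a sorted list of (path, vcs) for working copies under root."""
    found = []
    for dirpath, dirnames, filenames in os.walk(root):
        for marker, vcs in VCS_MARKERS.items():
            # .git is a directory in a normal clone, but a file in a submodule
            if marker in dirnames or marker in filenames:
                found.append((dirpath, vcs))
        # Keep descending (repositories may be nested), but skip the metadata
        dirnames[:] = [d for d in dirnames if d not in VCS_MARKERS]
    return sorted(found)


def has_local_changes(path, vcs):
    """Return True if the status command prints anything for this working copy."""
    cmd = ["git", "status", "--porcelain"] if vcs == "git" else ["svn", "status"]
    result = subprocess.run(cmd, cwd=path, capture_output=True, text=True, check=True)
    return bool(result.stdout.strip())


def report(root):
    """Print every working copy under root that has uncommitted changes."""
    for path, vcs in find_repositories(root):
        if has_local_changes(path, vcs):
            print(f"{vcs}: {path}")
```

Calling `report(".")` from the top of a Cactus tree would list the repositories that need attention; this requires the `git` and `svn` command-line clients to be installed for the working copies that are found.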