Difference between revisions of "GitSuperRepo"

From Einstein Toolkit Documentation
 
Line 1: Line 1:
 
(draft)
 
  
* Software built from many different components living in their own repositories
+
Background:
 +
 
 +
* Einstein toolkit built from many different components living in their own repositories
  
 
* End user must check out each component and compile them together into an executable which is then run to produce output
 
Line 13: Line 15:
 
Problems:
 
  
* Upstream projects use different version control systems (SVN, Git, Mercurial, ...) leading to a nonuniform experience for the end user/developer.  Multiple tools must be learned for merging/branching etc.
+
* Upstream projects use different version control systems (SVN, Git, Mercurial, ...) leading to a nonuniform experience for the end user/developer.  Multiple tools must be learned for merging/branching/committing etc.
 +
 
 +
* It is not easy to see at a glance exactly what version of the code is in use.  One could iterate over all the different repositories, of different types, and print the revision information, and any local differences.  This could be added to GetComponents, but this has not been done yet and we argue that this is not the best solution to the problem.
 +
 
 +
* Knowing what version of the code has been used to produce a given scientific result is essential for the scientific process, where results must be repeatable.  The current best solution to this problem is the Formaline thorn which stores a complete copy of the source code of all thorns in the simulation output directory.  We argue that this is only a partial solution to the problem.  While all the source code is present, the version control metadata has been entirely stripped.  When comparing different simulations, at best one obtains a large diff of all the source changes between them, without information about why they were made or who made them.  There is also no method for conveniently using the formaline output for a new simulation.
  
* It is not easy to see at a glance exactly what version of the code is in use.  One could iterate over all the different repositories, of different types, and print the revision information, and any local differences.  This could be added to GetComponents, but we argue that this is the wrong solution to the problem.
+
* Updating a Cactus source tree is currently an irreversible and dangerous process.  There is no guarantee that the "current" trunk branch of all the components will function correctly, and there is no way, short of a manual backup beforehand, of reverting to the previous state if they don't.

Revision as of 11:18, 23 June 2011

(draft)

Background:

  • The Einstein Toolkit is built from many different components, each living in its own repository
  • The end user must check out each component and compile them together into an executable, which is then run to produce output
  • The end user is often also a developer of some of the components (public or private)
  • GetComponents (URL) is a tool to simplify this process by collecting component repository information into a single "CRL" file (CRL = Component Retrieval Language).
  • GetComponents allows you to check out the latest versions from a CRL file, or to update an existing set of checkouts to the latest versions

Problems:

  • Upstream projects use different version control systems (SVN, Git, Mercurial, ...), leading to a nonuniform experience for the end user/developer. Multiple tools must be learned for merging, branching, committing, etc.
  • It is not easy to see at a glance exactly which version of the code is in use. One could iterate over all the different repositories, of different types, and print the revision information and any local differences. This could be added to GetComponents, but this has not been done yet, and we argue that it is not the best solution to the problem.
  • Knowing which version of the code was used to produce a given scientific result is essential for the scientific process, where results must be repeatable. The current best solution to this problem is the Formaline thorn, which stores a complete copy of the source code of all thorns in the simulation output directory. We argue that this is only a partial solution. While all the source code is present, the version control metadata has been entirely stripped. When comparing different simulations, at best one obtains a large diff of all the source changes between them, without information about why they were made or who made them. There is also no convenient method for using the Formaline output for a new simulation.
  • Updating a Cactus source tree is currently an irreversible and dangerous process. There is no guarantee that the "current" trunk branch of all the components will function correctly, and there is no way, short of a manual backup beforehand, of reverting to the previous state if they don't.
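
To make the "iterate over all the different repositories" approach from the second problem concrete, here is a minimal Python sketch of what such a script might look like. It is not part of GetComponents; the function names (find_checkouts, run, report) are hypothetical, and it assumes each checkout can be recognised by a .git, .svn or .hg directory and that the git, svnversion, svn and hg command-line tools are installed. Older Subversion working copies keep a .svn directory in every subdirectory, so the sketch reports only the topmost one.

```python
import os
import subprocess

# Marker directory -> (revision command, local-changes command).
# These are standard CLI invocations; the table itself is illustrative.
VCS_COMMANDS = {
    ".git": (["git", "rev-parse", "HEAD"], ["git", "status", "--porcelain"]),
    ".svn": (["svnversion"], ["svn", "status"]),
    ".hg": (["hg", "id", "-i"], ["hg", "status"]),
}


def find_checkouts(root):
    """Return sorted (path, marker) pairs for each checkout found under root."""
    root = os.path.normpath(root)
    found = []
    svn_dirs = set()  # every directory seen that contains a .svn directory
    for dirpath, dirnames, _ in os.walk(root):
        for marker in VCS_COMMANDS:
            if marker not in dirnames:
                continue
            if marker == ".svn":
                svn_dirs.add(dirpath)
                # Skip subdirectories of an already-seen svn working copy.
                if os.path.dirname(dirpath) in svn_dirs:
                    continue
            found.append((dirpath, marker))
        # Do not walk into the metadata directories themselves.
        dirnames[:] = [d for d in dirnames if d not in VCS_COMMANDS]
    return sorted(found)


def run(cmd, cwd):
    """Run cmd in cwd and return its stripped stdout, or 'unknown' on failure."""
    try:
        result = subprocess.run(cmd, cwd=cwd, capture_output=True, text=True)
    except OSError:  # e.g. the tool is not installed
        return "unknown"
    return result.stdout.strip() if result.returncode == 0 else "unknown"


def report(root):
    """Print the revision and a local-changes flag for every checkout."""
    for path, marker in find_checkouts(root):
        rev_cmd, status_cmd = VCS_COMMANDS[marker]
        revision = run(rev_cmd, path)
        status = run(status_cmd, path)
        if status == "unknown":
            state = "status unknown"
        elif status:
            state = "modified"
        else:
            state = "clean"
        print(f"{path}: {marker[1:]} {revision} ({state})")
```

Calling report("Cactus") on a source tree (the directory name is only an example) would print one line per component. Even so, the result is a list of unrelated revision identifiers in three different formats rather than a single version, which illustrates why this is argued above not to be the best solution.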