Distribution is a relatively new feature in configuration management (CM) systems. In fact, many of the most widely known commercial and research systems, such as ADC, CCC/Harvest, NSE, EPOS, and ShapeTools, do not yet provide any real support for distribution. Those that do (e.g., Adele, ClearCase, Continuus/CM, Distributed RCS, and Distributed CVS) appear to suffer from one or more of the following significant problems:
A significant segment of today's software industry is moving toward a model of project organization that involves the use of multiple engineers at multiple sites working on a single software system or set of highly interdependent software systems. In the extreme case, multiple companies in multiple countries form temporary alliances, sometimes called virtual corporations [1], for the purpose of producing a specific product. And while these companies might be collaborators on one product, simultaneously they may be competitors on another.
In such a setting, configuration management becomes a serious challenge, and the limitations presented above suggest that current state-of-the-art CM systems lack the capabilities necessary to support such a scenario. The challenge exhibits itself at several levels. At the lowest levels, there is the issue of distributing large amounts of data in a timely fashion over great distances. At the highest levels, there is the issue of integrating the asynchronous efforts of engineers who may be adhering to different CM procedures and practices. These converge in the middle levels, where the issue is to provide distributed data management that is specialized to the needs of configuration management, in a context that can assume no more than a decentralized federation of cautiously cooperating parties.
NUCM (Network-Unified Configuration Management) is a system we are developing to help explore the middle levels of the distribution problem [3]. It embodies an architecture that separates CM repositories, which are the stores for versions of software artifacts and information about those artifacts, from CM policies, which are the specific procedures for creating, evolving, and assembling versions of artifacts maintained in the repositories. To do this, NUCM defines both a generic model of a distributed CM repository and a programmatic interface for implementing, on top of the repository, specific CM policies, such as check-in/check-out and change sets [2]. Structured this way, NUCM allows experimentation, not only with the model and the interface, but also with new CM policies and distribution mechanisms.
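The separation of repository from policy can be illustrated with a minimal sketch. It is not NUCM's actual interface: the class names `Repository` and `CheckInCheckOutPolicy` and all of their methods are hypothetical, chosen only to show a policy-free version store with a locking check-in/check-out policy layered on top of it.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Repository:
    """Hypothetical generic store: keeps every version of each artifact
    and knows nothing about any CM policy."""
    _versions: Dict[str, List[bytes]] = field(default_factory=dict)

    def add_version(self, name: str, content: bytes) -> int:
        """Append a new version and return its 1-based version number."""
        history = self._versions.setdefault(name, [])
        history.append(content)
        return len(history)

    def get_version(self, name: str, number: Optional[int] = None) -> bytes:
        """Return the requested version, or the latest when number is None."""
        history = self._versions[name]
        return history[-1] if number is None else history[number - 1]


class CheckInCheckOutPolicy:
    """Hypothetical policy layer: adds exclusive locking on top of the store."""

    def __init__(self, repo: Repository) -> None:
        self._repo = repo
        self._locks: Dict[str, str] = {}  # artifact name -> user holding the lock

    def check_out(self, name: str, user: str) -> bytes:
        """Lock the artifact for user and return its latest version."""
        holder = self._locks.get(name)
        if holder is not None and holder != user:
            raise PermissionError(f"{name} is checked out by {holder}")
        self._locks[name] = user
        return self._repo.get_version(name)

    def check_in(self, name: str, user: str, content: bytes) -> int:
        """Store a new version and release the lock held by user."""
        if self._locks.get(name) != user:
            raise PermissionError(f"{user} has not checked out {name}")
        del self._locks[name]
        return self._repo.add_version(name, content)
```

In this sketch the locking rules live entirely in the policy class, so a different policy (for example, one based on change sets) could be written against the same `Repository` without modifying it.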
The figure below illustrates our philosophy. A group of physical repositories, containing versioned artifacts, is tied together into one large, logical repository by a group of NUCM access servers. Using the programmatic interface of NUCM, CM client systems access and manipulate the artifacts that are stored in the logical repository. The CM client systems do not need to be aware of where the artifacts are stored; the NUCM access servers take care of locating the artifacts and transporting them to and from the client system. Thus, distribution is transparent, which greatly simplifies the implementation of a distributed CM system.
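The idea of location transparency can be sketched as follows. This is an illustration under stated assumptions, not NUCM's implementation: `PhysicalRepository`, `AccessServer`, and their methods are hypothetical names, and plain in-process method calls stand in for the network transport between sites.

```python
class PhysicalRepository:
    """Hypothetical store at a single site: artifact name -> list of versions."""

    def __init__(self, site):
        self.site = site
        self._artifacts = {}

    def has(self, name):
        return name in self._artifacts

    def read(self, name):
        """Return the latest version of the named artifact."""
        return self._artifacts[name][-1]

    def write(self, name, content):
        """Append a new version of the named artifact."""
        self._artifacts.setdefault(name, []).append(content)


class AccessServer:
    """Hypothetical access server: fronts one physical repository and
    knows the access servers of the other sites."""

    def __init__(self, local):
        self.local = local
        self.peers = []  # other AccessServer instances

    def locate(self, name):
        """Find the repository holding the artifact: local first, then peers."""
        for repo in [self.local] + [peer.local for peer in self.peers]:
            if repo.has(name):
                return repo
        raise KeyError(name)

    def fetch(self, name):
        """Return the latest version, wherever the artifact is stored."""
        return self.locate(name).read(name)

    def store(self, name, content):
        """Write a new version where the artifact already lives,
        or in the local repository if the artifact is new."""
        try:
            target = self.locate(name)
        except KeyError:
            target = self.local
        target.write(name, content)
```

A client connected to any one `AccessServer` calls only `fetch` and `store` with an artifact name; it never names a site, which is the sense in which distribution is transparent in this sketch.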
Using NUCM as the underlying repository, we have been able (with relative ease) to build both a check-in/check-out CM client and a CM client based on the change-set approach. In addition, a distributed software release manager has been implemented on top of NUCM.
Our future plans for NUCM are to strengthen its implementation for eventual outside release and to continue experimenting with NUCM in building a variety of different CM systems. This will entail the following activities.