Thesis Proposal:
Software Deployment in a Wide-Area Setting Using A Distributed Agent-based Architecture
Richard S. Hall
Thesis Advisors: Alexander L. Wolf and Dennis Heimbigner
Department of Computer Science
University of Colorado
Boulder, Colorado 80309 USA
 
 
 
 
ABSTRACT

Until recently there has been little research effort directed towards software deployment. Software deployment includes activities such as releasing, configuring, installing, updating, adapting, de-installing, and even de-releasing a software system. Modern software systems are increasing the complexity of these tasks as more sophisticated architectural models, such as system of systems and coordinated distributed systems, become commonplace. Our research direction focuses on two main areas with respect to the support of software deployment. The first of these is the definition and implementation of an architecture that supports software deployment activities in a global, uniform manner. The second area is to use this architecture to create standard, generic schemas and processes to perform software deployment activities. This standardization will allow software producers to deploy their software more fully and with less effort than the current ad-hoc methods. The goal of this research is to make software deployment an integral part of software development.
 
 
 
 

Table of Contents

1 Introduction
2 Terminology and Background
2.1 Software Deployment Processes
3 Motivating Factors
3.1 Software Complexity
3.2 Producer and Consumer Connectivity
3.3 Functional Capabilities
3.3.1 Content delivery
3.3.2 System install and update
3.3.3 Standardization
3.3.4 Network Management
4 Deployment Solution Characterization and Capabilities
4.1 Abstractions
4.1.1 Consumer Abstraction
4.1.2 Software System Abstraction
4.1.3 Process Abstraction
4.2 Process Coverage
4.3 Coordination
4.4 Required Capabilities
5 Current Systems and Approaches
5.1 Content Delivery
5.2 System Install and Update
5.3 Standardization
5.4 Network Management
6 Initial Research Results
6.1 The Software Dock Architecture
6.1.1 Federated Deployment Registry
6.1.1.1 Registry Events
6.1.1.2 Schema Definitions
6.1.2 Field Dock
6.1.3 Release Dock
6.1.4 Agents
6.1.5 Wide-Area Messaging/Event System
6.2 Software Dock Baseline
7 Research Plan
7.1 Open Research Issues
7.1.1 Complete Implementations of Architectural Components
7.1.2 Complete definitions of Registry Schemas
7.1.3 Standardized Agent Definition and Creation
7.1.4 Software Dock Solutions to Real-World Deployment Scenarios
7.2 Evaluation
7.2.1 Evaluation as a Software Deployment Solution
7.2.2 Evaluation with Respect to Other Solutions
8 Conclusion
 

1  Introduction

Software deployment is the complex process that covers all of the activities performed after a software system has been developed. Software deployment differs from software maintenance because it covers all post-development activities such as configuring, releasing, installing, updating, adapting, reconfiguring, and even de-installing a software system. The architecture of modern software systems, specifically systems built of component systems (i.e., system of systems) and coordinated distributed systems, has significantly complicated the software deployment process. The deployment of modern software systems requires the careful coordination and interaction of multiple software producers and multiple software consumers. In this scenario producers and consumers are generally geographically dispersed and autonomous.

Traditional configuration management systems have focused on development activities, such as source code control, and until recently few mechanisms were in place to support software deployment activities. Solutions to the most common deployment activities, configuring and installing a software system, have seen the most development effort, but these efforts have failed to produce a sufficiently general-purpose solution. A contributing factor to this failure is that the information required to configure and install various software systems on a particular site is generally not accessible, complete, or accurate. Even when the information is available, nonstandard methods to access the information make it difficult to automatically configure the system being deployed. Tools such as Autoconf [12] and Ship [6] attempt to obtain the information on a per-installation, ad hoc basis by using scripts and heuristics. The Microsoft Registry [9] stores some amount of configuration information at a site, but that information is only partially standardized and is specific to Microsoft Windows. Compounding these configuration and installation problems is the fact that many development organizations do not make their software system's resource dependencies an explicit part of the system's definition. As a result, deployment of the software system fails or partially fails because there is no way to ensure that the resource dependencies are met. This type of failure can be characterized as the "missing component" problem.

The introduction of new software systems and the continual re-engineering of old software systems to take advantage of or to adhere to component technologies, such as CORBA [18] or JavaBeans [24], will result in the "missing component" problem becoming more common. The personal computing notion that software systems are self-contained, such that copies of all components needed for an installation are included on a CD-ROM, is overly simplistic. For example, the plug-ins and helper applications used with Web browsers are not themselves components of the browsers, but are independently developed and maintained systems themselves. Even if one could construct a monolithic installation, this approach still fails when components are shared among systems where the versions of those shared components may not be consistent.

Support for deployment activities other than configuration and installation is essentially nonexistent. The installed system typically becomes a static entity detached from its producers and poorly understood by its consumers. For example, a consumer might modify the configuration of the consumer site in such a way that an existing software system at the site can no longer operate properly. The consumer has no means of determining how such changes may affect the software systems at the site. This difficulty results from the fact that constraints and dependencies are not explicit, nor is the software able to adapt automatically. Instead, as the environment changes, it becomes the consumer's burden to ensure that the system continues to function properly. As enhancements and bug fixes are released, there is no standard way for the consumer to become aware of these artifacts, to automatically upgrade the installed system, or even to locate the installed system.

Clearly, there is a need to bring all of the deployment activities under the umbrella of software development and configuration management. This can be accomplished by developing a powerful new generation of configuration management technologies that account for post-development activities. To be effective, these new technologies must:

In general, then, the new configuration management technologies should automate — as an integral part of software systems themselves — the activities required for continuous support of deployed systems.

As a contribution to the new generation of configuration management technologies, we are proposing the Software Dock, an architecture for supporting the software deployment process. By analogy to the hardware docking stations used with portable notebook computers, the Software Dock provides a context in which to situate a software system at a site. As with its hardware counterpart, "docking" a software system involves protocols for interrogating the local environment for its properties and adapting the software to that environment. But a significant difference from hardware docking is the fact that both the environment and the software system are malleable. For example, if a required component is not found at a site, then that component can be added dynamically to the site to satisfy the needs of the software being docked. Alternatively, a more appropriate version of the software system itself can be obtained dynamically and installed. This allows installation to become a process of negotiation between a producer and a consumer. Moreover, the mutual adaptation process can continue beyond the initial docking to provide a perpetually evolving combination of system and environment.

The Software Dock is a system of loosely-coupled, cooperating, distributed components that are bound together by a wide-area messaging and event system. The components include field docks for maintaining site-specific configuration information by consumers, release docks for managing the configuration and release of software systems by producers, and a variety of agents for automating the deployment process. Both the information about releases and the information about field sites are represented as hierarchies of data that, when combined, form a federated software deployment registry with a conceptually global name space. Events generated by operations on the hierarchies propagate throughout the federated registry and are received by interested agents. The agent technology enables concomitant actions to be automatically performed in response to those events.
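The interplay between registry hierarchies, events, and agents can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the class name `Registry`, the methods `subscribe`, `set`, and `get`, the event names, and the path layout are hypothetical and are not the Software Dock's actual interface. The sketch also runs in a single process, whereas the architecture described above is distributed.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical handler signature: (event name, registry path, new value).
Handler = Callable[[str, str, object], None]


class Registry:
    """A toy hierarchical registry that emits an event on every change."""

    def __init__(self) -> None:
        self._data: Dict[str, object] = {}
        self._subscribers: List[Tuple[str, Handler]] = []

    def subscribe(self, prefix: str, handler: Handler) -> None:
        """Register an agent callback for every path at or below `prefix`."""
        self._subscribers.append((prefix, handler))

    def set(self, path: str, value: object) -> None:
        """Store a value, then notify agents subscribed to an enclosing prefix."""
        event = "changed" if path in self._data else "added"
        self._data[path] = value
        for prefix, handler in self._subscribers:
            if path == prefix or path.startswith(prefix.rstrip("/") + "/"):
                handler(event, path, value)

    def get(self, path: str) -> object:
        """Return the value stored at `path`."""
        return self._data[path]


received = []
registry = Registry()
# An "agent" interested in everything a (hypothetical) producer releases.
registry.subscribe("/release/acme", lambda e, p, v: received.append((e, p, v)))

registry.set("/release/acme/editor/version", "1.0")
registry.set("/release/acme/editor/version", "1.1")
registry.set("/field/site42/os", "unix")  # outside the subscribed subtree
```

In this sketch the agent sees the two release events but not the field-site change, which mirrors the idea that only interested agents receive events from the registry.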

This paper has the following organization. Section 2 introduces the basic terminology and background of the research area. Section 3 presents factors that are motivating research in the area of software deployment. Section 4 summarizes the characteristics and capabilities that are required in a software deployment solution, while Section 5 summarizes related work. Section 6 presents the initial results of our research, including a discussion of the proposed Software Dock architecture and a prototype of the architecture. Section 7 presents the research plan for further exploring the Software Dock as a means for supporting software deployment and a plan for evaluating the Software Dock. The conclusions are then presented in Section 8.

2  Terminology and Background

Software deployment is the assembly and maintenance of the resources necessary to use a version of a system at a particular site.

This definition is intentionally vague in order to fully capture the scope of software deployment. The assembly and maintenance can be thought of as the specific deployment process being performed, such as installation or activation. The collection of all software deployment processes or activities forms the basic deployment life-cycle of a software system. [Figure 1] These processes are, in fact, instantiated from a specific deployment policy. A policy can be thought of as a parameterized or generic process. In general, a process is concerned with what actually needs to be done, whereas a policy is concerned with how it is done.

A resource is anything that is needed to enable the use of the system. Examples include shared libraries, disk space, and component systems. A system generically refers to an artifact or collection of artifacts to be made available at a site. Some basic artifacts are binary executables, data files, and documentation. A version of a system refers to both time-order versions of an evolving system and to platform-specific and functional variants.

Once a system has been deployed it is available for use at a particular site. The term "use" is dependent upon the type of system that was deployed. An executable will be executed, a collection of Web pages will be viewed with a browser, or a complex distributed system may have servers that need to be started. The site that is the target of the deployment is generally referred to as the consumer or field site. The site where the system originates is generally referred to as the producer or release site. It is generally assumed that deployment will involve some form of transfer or copying of resources from a producer site to a consumer site. The resources in this instance may be the actual system or just the knowledge of how to access the system. By and large, site is used to refer to a single node connected to a network, but it is not limited by this usage and may indeed refer to some sort of collection of nodes working in a coordinated fashion.

2.1  Software Deployment Processes

The general software deployment process is actually composed of a variety of sub-processes or activities. Figure 1 lists those activities and organizes them into an overall deployment life-cycle. The diagram should be interpreted as showing the sequence of activities that might be applied to a given product from a given producer, with respect to a given target site. Each of the software deployment life-cycle activities is discussed in more detail below.

Release: The release process is the interface between the development process and the deployment process. It encompasses all the activities needed to prepare and advertise a system so that it can be assembled correctly at some consumer site. The notion of advertising includes the dissemination of sufficient information to interested parties and providing access in some form so that they can perform the follow-on installation activity.

The release process must, in some form, package all of the knowledge about the software system to be deployed, processes to perform deployment tasks, and the actual system components. The deployment processes may be specific to the software system that is being deployed or they may be some generic processing engine, such as a scripting engine. The information in this package should include a description of the system including its dependencies and constraints in order to manage the deployed software on the consumer site.

Installation: The installation activity covers the initial deployment of a software system onto the consumer site. It is usually the most complex of the deployment activities because it must find and assemble all the resources necessary to use a system. The installation process uses the package created in the release process above. Given a package the installation process interprets the encoded knowledge in the package and then examines the target site in order to determine how to properly configure the software system to the specific target site. Once installation is completed the deployed software system is ready to be activated.

Activation: Activation refers to the activity of starting up those components of a system that must execute in order for the system to be usable. For a simple tool, activation involves establishing some form of command (or clickable graphical icon) for executing the binary component of the tool. For a complex system, there may be components that must run continuously in order for the system to be usable. Examples of the latter might be various servers and database systems needed by other parts of the system. Note that the installation process may actually use other systems and may therefore need to activate various tools in order to complete the installation of its system. For example, if a system has been packaged as an archive file, the installer must be able to activate the unarchiver tool to extract the pieces of the system to be installed. If an unarchiver is not available, then a recursive installation may be required to obtain and install the unarchiver tool.
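The recursive installation described above can be sketched as follows. This is a minimal illustration under stated assumptions: the catalog contents, the function name `install`, and the idea that each system simply lists the tools it needs are all hypothetical, and real installation involves far more than recording a name.

```python
from typing import Dict, List, Optional, Set

# Hypothetical catalog: each system maps to the tools that must be
# available before that system can be installed.
CATALOG: Dict[str, List[str]] = {
    "editor": ["unarchiver"],
    "unarchiver": ["decompressor"],
    "decompressor": [],
}


def install(name: str,
            catalog: Dict[str, List[str]],
            installed: Set[str],
            order: List[str],
            _in_progress: Optional[Set[str]] = None) -> None:
    """Install `name`, first recursively installing any missing tools."""
    if name in installed:
        return  # already present at the site, nothing to do
    in_progress = _in_progress if _in_progress is not None else set()
    if name in in_progress:
        raise ValueError(f"circular dependency involving {name!r}")
    if name not in catalog:
        raise KeyError(f"no release known for {name!r}")
    in_progress.add(name)
    for tool in catalog[name]:
        install(tool, catalog, installed, order, in_progress)
    in_progress.discard(name)
    installed.add(name)
    order.append(name)  # record the order in which installs completed


# The site already has the decompressor but not the unarchiver.
installed = {"decompressor"}
order: List[str] = []
install("editor", CATALOG, installed, order)
```

Here the unarchiver is installed before the editor because the editor's installation depends on it, while the decompressor is skipped because it is already present.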

De-activation: De-activation is the inverse of activation and refers to the activity of shutting down any executing components of an installed system. De-activation may be required in order to perform other deployment activities; for example, before an update can be performed, the software system may need to be de-activated. As a result, any servers that were started or additional tasks that were performed during activation must be shut down or undone so that the system is returned to a mutable state.

Update: The update process involves modifying a software system that has been previously installed on a consumer site. The update may be the result of the release of a new version of the software to fix a bug or add new functionality. From an abstract perspective, installation is a special case of the update process where there are no existing components on the consumer site and, thus, everything must be updated. Update may normally be less complex than installation because it can often rely on the fact that many of the needed resources have already been obtained during the installation process. Typically the deployment life-cycle includes a repeated sequence in which a system is de-activated, an updated version of the system is installed, and then the system is reactivated. For some systems, de-activation may not be necessary and update can be performed while a previous version is still active.
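The observation that installation is a special case of update can be made concrete with a small sketch. It assumes, purely for illustration, that a site and a release can each be summarized as a mapping from component name to version; the function name `plan_update` and the component names are hypothetical.

```python
from typing import Dict


def plan_update(installed: Dict[str, str],
                release: Dict[str, str]) -> Dict[str, str]:
    """Return the components (name -> version) that must be obtained.

    A component is needed when it is absent from the site or present
    at a different version than the one in the release.
    """
    return {name: version
            for name, version in release.items()
            if installed.get(name) != version}


release = {"core": "2.0", "help": "1.3", "plugins": "1.0"}

# Update: the site already holds an older core and a current help component.
update_plan = plan_update({"core": "1.9", "help": "1.3"}, release)

# Installation: the same function applied to a site with nothing installed.
install_plan = plan_update({}, release)
```

With an empty site the plan contains every component in the release, which is exactly an installation; with a populated site only the changed or missing components are obtained.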

Adapt: The adapt process involves modifying a software system that has been previously installed on a consumer site. Adapt differs from update in that updates are instigated by remote events whereas adaptations are instigated by local events. For example, if the configuration at the consumer site changes in a way that affects the deployed software system it may be necessary for the deployed software system to take some sort of corrective action. In such a situation the software system is an active participant in its own management, adapting to its environment as it changes.

De-installation: At some point, a system as a whole is no longer required at a given consumer site and the system will be de-installed. De-installation is not necessarily a trivial process. Special attention has to be paid to shared resources such as data files and libraries in order to prevent dangling references to a required resource. De-installation is therefore not the process of undoing everything that was done in installation; rather, it examines the current state of the system and its dependencies and constraints and then removes the specific software package in such a way that these dependencies and constraints are not violated.
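A minimal sketch of de-installation that respects shared resources follows. It assumes, for illustration only, that the site records which resources each installed system depends on; the function name `deinstall` and the file names are hypothetical.

```python
from typing import Dict, Set


def deinstall(system: str, site: Dict[str, Set[str]]) -> Set[str]:
    """Remove `system` from the site and return the resources safe to delete.

    `site` maps each installed system to the resources it depends on.
    A resource is safe to delete only if no remaining system needs it.
    """
    owned = site.pop(system)
    still_needed = set().union(*site.values())
    return owned - still_needed


site = {
    "editor": {"libgui.so", "editor.bin"},
    "viewer": {"libgui.so", "viewer.bin"},
}
removable = deinstall("editor", site)
```

The shared library is retained because the viewer still depends on it, so removing the editor leaves no dangling reference.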

De-release: Ultimately, a system is marked as obsolete and support by the producer is withdrawn. De-release is distinct from de-installation in the sense that it makes the software system no longer available for installation at consumer sites, but it does not remove it from consumer sites that are using the software. Consumers of the software may continue to use the software without knowing that it has been marked as obsolete, but at the very least the de-release process should attempt to notify current users that support for the software has been withdrawn.

3  Motivating Factors

This section discusses issues, capabilities, and functionality that are motivating effort and research into software deployment.

3.1  Software Complexity

The growing complexity of software systems has exacerbated the need for more mature software deployment support. The push towards componentware technology is creating a situation where more and more software systems are being created as systems of systems. As a result, software producers are forced into a situation where they are no longer in control of all of the pieces of their system. Required subsystems are more commonly being developed and released independently by autonomous organizations. This relationship creates a loose collaboration between organizations that have little opportunity or support for cooperation with each other.

As networking, including local area, intranet, and internet, becomes the norm, software system complexity continues to grow. Distributed technologies are being combined with component technologies, such as CORBA and JavaBeans, creating potentially unmanageable relationships and dependencies among subsystem components. The ability to locate or even ensure that all necessary system dependencies are in place is a complex task that is not fully supported. Additionally, if a deployed system is a cooperating, distributed system, there is little if any support for deploying such a system where coordination of many possible servers over a collection of nodes is required.

3.2  Producer and Consumer Connectivity

The Internet is a very important connectivity tool that cuts across many of the other capabilities and activities that are motivating factors in software deployment. It deserves to be mentioned by itself, first to emphasize its importance and second to discuss its impact on deployment issues that were not being fully addressed or were not fully feasible before the explosive growth in use of and interest in the Internet.

The widespread popularity of the Internet has demanded that software producers rethink their deployment activities and new issues have arisen as a result. Most of these issues are concerned with the support of electronic commerce over the Internet. Providing secure distribution, licensing, and billing of software and services is a growing concern because the Internet has created a virtual marketplace. This virtual marketplace is demanding a software solution to what used to be largely a physical world scenario. Users want to purchase, install, maintain, and support their software systems via the Internet. As a result it is becoming increasingly difficult for software producers to develop their software systems without taking these issues into consideration.

A byproduct of the connectivity offered by the Internet is a new set of interaction possibilities between software producers and consumers. Through this connectivity producers can receive feedback during a software system's life-cycle and fashion responses to this feedback. This requires that software deployment systems support an aspect of deployment that was not directly available in the past. Given this level of connectivity, consumers' expectations are increasing and demand a new level of quality of service from software producers. For this reason it is imperative that deployment activities become an integral part of the software development process and, as such, have a full set of tools and infrastructure to support them.

3.3  Functional Capabilities

Software deployment research is motivated by specific functionality requirements of consumers, producers, and the growing commercialization of the Internet. This section describes the most important functional capabilities that motivate software deployment research.

3.3.1  Content delivery

Content delivery is meant to be a generic term referring to the movement of any deployment specific information from one point to another over a network. Given such a broad definition it is clear that content delivery is required by or affects many of the software deployment capabilities and activities. The growing effort to provide content delivery mechanisms is bolstered by the current popularity of the Internet.

Systems such as rsync [25] provide content mirroring techniques to deploy a set of files on multiple machines with a specific purpose of keeping them synchronized with the release site. Other systems like Castanet [13] provide various content delivery mechanisms based on the multi-cast distribution through publish-and-subscribe paradigms. In other cases, deployment systems, such as installation systems, have been extended to integrate with Web browsers and use the browser's built-in content delivery mechanisms [1].

3.3.2  System install and update

System install and update are two distinct processes. Past software deployment technologies have concentrated on the installation process. Despite the fact that software producers have been installing software systems for as long as they have been developing them, there is still no complete and unified method for performing software installation. There are many approaches such as installation script generators [10], system inspectors [12], and high-level platform abstractions [13]. These approaches, though, show their limitations when applied to the pure, generic environment of the Internet.

As alluded to above, system installation and update are two distinct processes where update has not been given the same focus as installation. The popularity of the Internet has changed this focus though. Since many installation systems have been extended to include installing software over networks it became apparent that installation was just a special case of update; an install is just an update where none of the software system's components are present on the target site. This realization has led certain systems like Castanet and OpenWEB netDeploy [20] to specifically address both the install and update processes in their solutions.

3.3.3  Standardization

Any effort to create complete, unified software deployment technologies must also put some effort into standardizing what is being deployed and how it is represented and manipulated. This observation was not lost on the software producing community and there have been many efforts to standardize various aspects of software and software deployment technologies. Efforts to devise software component standards [24], hardware and software description standards [3][9], and software packaging and releasing standards [28] have been driving many research organizations.

3.3.4  Network Management

Network management systems mostly address the management of the hardware elements of a network. Network management has grown to include many of the activities of software deployment, such as installing, activating, de-activating, updating, and de-installing, as a result of software producers largely ending their involvement in the software life-cycle after producing a software system. Producers of network management systems have concentrated most of their effort in creating centralized deployment systems that contain all of the knowledge about integrating and maintaining deployed software systems [26][8]. This approach resulted from assumptions that one organization was in control of both sides of the deployment equation, i.e., the producer side and the consumer side.

As the efforts in other areas of software deployment capabilities and activities continue to grow and become successful, network management will likely reduce its concentration on software deployment activities and focus more clearly on issues that fall outside of the software deployment life-cycle. These areas may include various system run-time management requirements, software monitoring, load balancing, and error recovery.

4  Deployment Solution Characterization and Capabilities

In order to evaluate a specific software deployment solution it is necessary to be able to characterize what capabilities a good software deployment solution should possess. These characteristics are not meant to imply a specific architecture or approach, rather they are meant to illustrate capabilities that are required in some form to create a complete, unified software deployment solution. Further, in order to define these characteristics and capabilities, it is necessary to understand the overall vision from which these characteristics and capabilities are being derived.

The vision of this proposal is to describe and provide a complete, unified software deployment framework. A complete, unified software deployment framework is a missing link in the current state of affairs in the software producing and consuming communities. The goal is to redefine the current, implicit limits of software development to include the notion of the software deployment life-cycle as an integral part of the software development process. By combining the software development life-cycle with the software deployment life-cycle a complete software life-cycle is created.

It is a natural progression to include software deployment activities as an extension to the responsibilities of the software producer; it is the software producer who has the required knowledge of the software system in the first place. Much like any other manufacturer is responsible for the ongoing proper functioning and repair of the items they produce, it is also the responsibility of the software producer to ensure the proper functioning and repair of the software systems they produce. In the past this burden was left to the consumer, but the consumer is starting to demand a higher level of quality of service. Recent advances in connectivity and enabling technologies have made it possible for the software producers to consider such an increase in quality of service.

It is our belief that it is necessary to create an environment where both the producer and the consumer concerns are brought together. This environment should facilitate communication and open negotiation between producers and consumers to perform the common goal of making software deployable. As such this environment represents a direction statement for creating deployable software systems for the future.

4.1  Abstractions

One approach to characterizing a software deployment solution is in terms of the abstractions that it provides to unify the problem space. The worst case for a software deployment solution is where L x M x N different deployment activity specifications are required, where L is the number of possible target sites, M is the number of product variants, and N is the number of different activities covering a complete deployment life-cycle. Informally, this case would be similar to having a deployment system consisting of a separate "Makefile" for doing some activity, such as installation, for every product into every known site. Though this is not a common situation it illustrates how the capabilities of a software deployment system can be evaluated by how much they reduce the cardinality of L x M x N.

Given such an analysis, it is clear that there are three main components that can be abstracted by a software deployment system: consumer, software system, and process. By using these different levels of abstraction, a means for evaluating the level of support provided by a particular deployment system can be defined.
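A short worked example shows the scale of the reduction. The figures are illustrative assumptions, as is the idealized claim that full abstraction along each dimension needs only one description per site, one per product variant, and one generic process per activity (L + M + N in total); a real system would likely fall between the two extremes.

```python
def worst_case_specs(sites: int, variants: int, activities: int) -> int:
    """One hand-written specification per (site, variant, activity) triple."""
    return sites * variants * activities


def abstracted_specs(sites: int, variants: int, activities: int) -> int:
    """Idealized case: one description per site, per variant, per activity."""
    return sites + variants + activities


# Hypothetical figures: 50 target sites, 4 product variants, and the
# 8 life-cycle activities discussed in Section 2.1.
L, M, N = 50, 4, 8
worst = worst_case_specs(L, M, N)
ideal = abstracted_specs(L, M, N)
```

Under these assumptions the worst case requires 1600 specifications, while the idealized abstracted case requires 62 descriptions.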

4.1.1  Consumer Abstraction

The consumer abstraction creates a single, common interface to interact with the spectrum of possible consumer sites. If a given software system can be deployed on a variety of target sites, the consumer abstraction removes the need for the producer to know how to communicate with each specific target. Instead, the producer will interact with a Windows-based system in the same manner as a UNIX-based system using this common abstraction.

Mechanisms such as GNU Autoconf [12] and the Microsoft Registry [9] are two examples of consumer abstraction. Autoconf is used to produce a single program, "configure", which dynamically computes a consumer abstraction. The Registry, in contrast, is a passive repository containing the consumer abstraction. In either case, the deployment process is simplified since a producer can construct installation scripts that are parameterized by common information available from the abstraction. It is important to note, though, that these two examples specifically target different operating system platforms and, as such, do not provide a single, common interface.

The consumer abstraction largely provides querying or discovery mechanisms for a site. These particular mechanisms provide access to information about the configuration and resources at a site. The consumer abstraction is not limited to only obtaining descriptive information and may indeed abstract processes that have common functionality across all targets, such as file system access.
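A consumer abstraction of this kind might be sketched as a single interface with interchangeable implementations. The names `ConsumerSite`, `LocalSite`, `DescribedSite`, `query`, `has_tool`, and `choose_archive`, as well as the property keys and archive names, are hypothetical and chosen only for illustration. One implementation computes its answers dynamically, loosely in the spirit of a "configure" program, and the other reads them from a passive description, loosely in the spirit of a registry.

```python
import platform
import shutil
from abc import ABC, abstractmethod
from typing import Dict, Set


class ConsumerSite(ABC):
    """Common interface a producer uses to interrogate any target site."""

    @abstractmethod
    def query(self, key: str) -> str:
        """Return a named property of the site, such as 'os'."""

    @abstractmethod
    def has_tool(self, name: str) -> bool:
        """Report whether an executable tool is available at the site."""


class LocalSite(ConsumerSite):
    """Computes the abstraction dynamically from the running machine."""

    def query(self, key: str) -> str:
        if key == "os":
            return platform.system()
        if key == "arch":
            return platform.machine()
        raise KeyError(key)

    def has_tool(self, name: str) -> bool:
        return shutil.which(name) is not None


class DescribedSite(ConsumerSite):
    """Answers from a passive, previously recorded site description."""

    def __init__(self, properties: Dict[str, str], tools: Set[str]) -> None:
        self._properties = properties
        self._tools = tools

    def query(self, key: str) -> str:
        return self._properties[key]

    def has_tool(self, name: str) -> bool:
        return name in self._tools


def choose_archive(site: ConsumerSite) -> str:
    """Producer-side logic written once against the abstraction."""
    return "package.zip" if site.query("os") == "Windows" else "package.tar.gz"


windows_site = DescribedSite({"os": "Windows"}, set())
unix_site = DescribedSite({"os": "Linux"}, {"tar"})
```

The producer-side function never inspects which kind of site it was handed, which is the point of the abstraction.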

4.1.2  Software System Abstraction

The software system abstraction creates a single, common means to interact with and reason about software systems. While the consumer abstraction is used to obtain and describe the site where a software system is to be deployed, the software system abstraction is used to obtain and describe a complementary set of information about the software to be deployed at a particular site.

In the simplest case, where a software system is a single executable and possibly some data files, describing it completely is a simple process. The software system description is nothing more than an inventory of files, documentation pointers, contact pointers, and platform requirements.

Complex software systems pose the biggest challenge and have the greatest need for the software system abstraction. A complex system may be composed of multiple, distributed components where subsystem dependencies between components are explicitly required. This type of dependency and constraint information can have a direct impact on how some deployment processes, such as activation, are performed. It may also be possible that a software system has variant configurations that are dependent upon the resources available at the target site, all of which need to be captured in the software system abstraction.
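To illustrate how dependency and variant information might influence a process such as activation, the following Python sketch describes components with explicit dependencies and an optional platform requirement, selects the variant that fits a site, and derives an activation order. The Component fields and function names are assumptions made for this example; the ordering relies on the standard library's graphlib module (Python 3.9 or later).

```python
from dataclasses import dataclass, field
from graphlib import TopologicalSorter  # standard library, Python 3.9+
from typing import List, Optional


@dataclass
class Component:
    """One entry in a hypothetical software system description."""
    name: str
    files: List[str]
    depends_on: List[str] = field(default_factory=list)
    requires_os: Optional[str] = None  # None means "any platform"


def select_variant(components: List[Component], site_os: str) -> List[Component]:
    """Keep only the components whose platform requirement fits the target site."""
    return [c for c in components if c.requires_os in (None, site_os)]


def activation_order(components: List[Component]) -> List[str]:
    """Order components so that each one is activated after its dependencies."""
    graph = {c.name: set(c.depends_on) for c in components}
    return list(TopologicalSorter(graph).static_order())
```

A description held only as a flat inventory of files could not answer either question, which is why dependencies and variants belong in the abstraction itself.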

The software system abstraction is not limited to executable software systems. Software systems solely based on data must also be covered by the abstraction. A good example of this type of system is a collection of Web pages or any other document-based system.

4.1.3  Process Abstraction

The process abstraction creates a single, common means to interact with and reason about deployment processes and activities. This layer of abstraction deals with the distinction between a process and a policy. A deployment process, such as installation or update, is the set of steps that is followed to perform a specific deployment activity. A process, then, can be characterized as what needs to be done, whereas a policy is characterized as how it should be done.

For example, the update process has a distinct set of steps that it must take to actually perform the update. These steps include examining the target site's configuration, retrieving the necessary updated artifacts, and properly modifying the target site. Every update process will perform something similar to these basic steps. A policy could be used to determine how these steps are carried out. A policy might indicate that updates should only occur during non-business hours or that someone in the system administration department must first approve updates. Given this distinction, processes can be thought of as being parameterized by policy decisions.
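The idea of a process parameterized by policy can be sketched in a few lines of Python. The update steps below are fixed, while the decision of whether the update may run is supplied by the caller. The function names, the dictionary representation of a site, and the specific business hours are all assumptions made for illustration.

```python
from datetime import datetime
from typing import Callable, Dict, List

# A policy answers "may the update run now?"; the process never hard-codes it.
Policy = Callable[[datetime], bool]


def outside_business_hours(now: datetime) -> bool:
    """Example policy: allow updates only before 08:00 or from 18:00 onward."""
    return now.hour < 8 or now.hour >= 18


def run_update(site: Dict[str, str], available: Dict[str, str],
               policy: Policy, now: datetime) -> List[str]:
    """The update process: examine the site, retrieve artifacts, modify the site."""
    if not policy(now):
        return []  # the policy deferred the update
    # Step 1: examine the target site's configuration (name -> installed version).
    stale = [name for name, version in site.items()
             if name in available and available[name] != version]
    # Steps 2 and 3: "retrieve" each updated artifact and modify the site.
    for name in stale:
        site[name] = available[name]
    return stale
```

Replacing outside_business_hours with, say, a function that consults an administrator's approval list changes how the update is carried out without touching the steps themselves.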

At this time there are no known examples of a system that performs this level of abstraction in the general case.

4.2  Process Coverage

Another characteristic that is important for evaluating the level of support provided by a particular software deployment system is process coverage. As defined in Section 2.1, software deployment involves many different sub-processes, and the coverage of these sub-processes by a particular software deployment system can be orthogonal to the abstractions provided by that system. For example, Castanet [13] provides consumer and producer abstractions, but they are overly simplistic and result in process coverage limited to the direct support of simple installation and the differential update of content. Support for the other software deployment activities is left to whatever access is provided to the underlying computational engine (i.e., the computer). This "Turing machine" approach is very common, but clearly it does not provide any substantial process coverage.

For evaluation purposes, the broader the process coverage provided by a software deployment system, the better the solution. It should be noted, though, that process coverage is not completely orthogonal to the abstractions described above. Generally speaking, a very well defined set of abstractions is necessary to provide generic, broad process coverage.

4.3  Coordination

Support for the coordination of distributed, cooperating software systems is another area for evaluation that is important and separate from the previously mentioned characteristics. Distributed software systems contribute to the growing complexity of software deployment because their architectures have inherently complex and unreliable relationships and dependencies. Such architectures require special support in order to deploy successfully because coordination among servers, peers, and clients may be necessary. These issues of coordination complicate the software deployment activities of Section 2.1.

4.4  Required Capabilities

This section addresses capability-oriented requirements that fall outside of the general, abstract characteristics described above. It is believed that these capabilities are necessary to create an effective software deployment system. This discussion is not meant to be exhaustive, but it discusses the most important capability requirements.

Internet-scalability: The explosive popularity of the Internet has created a new, lucrative environment for software deployment. It is imperative that any proposed software deployment solution explicitly support large numbers of producers and consumers, distributed over large geographical distances. The scale of the Internet is many orders of magnitude larger than that of local-area networks and organizational intranets and therefore requires that special attention be paid to its requirements.

Raise Abstraction Level for Software Deployment: Many current software deployment systems perform only a specific process, or a subset of the processes, of the software deployment life-cycle. Unfortunately, these systems are usually limited to performing only the specific task or set of tasks originally intended by the deployment system developer. Support for deployment tasks other than those considered by the developer usually takes the form of access to the Turing machine (i.e., the underlying computer). This is not a valid level of support. A software deployment solution must provide a framework that raises the level of abstraction for performing deployment related activities. Raising the level of abstraction will enable efficient and timely solutions to other deployment activities without requiring the reinvention of tedious, error-prone infrastructure.

Provide Unified Access to Procedural Resources: Most software deployment activities require more than just declarative information to be accomplished. Nearly all deployment activities require some sort of processing to be performed on the consumer site. Therefore a software deployment system cannot be considered complete unless it makes some attempt to provide controlled access to procedural resources as well as declarative information. Not only is consumer-side processing of this form necessary for completing most deployment tasks, but a unified approach to information and resource access can provide a great deal of security by limiting access to consumer site resources.

Explicit Bi-directional, Semi-continuous Communication: The importance of bi-directional, semi-continuous communication between producers and consumers, via the Internet, must be exploited. The connectivity afforded by the Internet enables producers and consumers to participate in a symbiotic relationship where information flows between the two participants. This level of cooperation has not been possible in the past. Incorporating bi-directional, semi-continuous communication enables support for the full software deployment life-cycle by allowing producers to monitor changes to a consumer site that may affect the producer's software and, thus, to take corrective actions. In turn, the consumer is able to receive direct notification of announcements, such as bug fixes, that are generated from the producer's site and to give feedback to the producer.

Autonomy: Since organizational boundaries and cultures are very distinct, it is of paramount importance that a software deployment system create an environment in which those differences can coexist. In particular, these differences exist from consumer to consumer and from producer to producer. Consumers should be able to control how their site is accessed and how deployment processes are performed. Producers should be able to define their deployment processes without regard for how dependent component organizations perform their deployment processes.

Platform Independence: Many systems address particular aspects of software deployment, and some even come close to addressing the entire software deployment life-cycle, but no system attempts to do so in a platform independent manner. Platform independence is a necessary prerequisite largely inspired by the global marketplace created by the Internet. In an effort to take advantage of such a market, it is necessary not to make limiting assumptions about who or what the consumer may be. A consumer side abstraction is directly related to platform independence, though the two are not equivalent.

5  Current Systems and Approaches

A wide variety of systems already exist to support various parts of the deployment process with varying degrees of sophistication and system architectures. In the following subsections many of these systems are discussed and characterized with respect to the issues presented in Section 4. The subsections are grouped according to related functionality.

Additional consideration will be paid to those systems that actually raise the level of abstraction for software deployment. Some systems claim to provide support and coverage for all of the software deployment life-cycle, but in reality they merely provide access to the underlying machine. The claim that such a system actually supports software deployment is suspect since this is a variant of the "Turing tar-pit" argument, where one can claim to do anything if one can execute an arbitrary program (i.e., on a Turing machine).

Another consideration is that many software deployment systems have created a standard consumer abstraction by forcing their number of target sites to be one. That is, many systems target only Windows 95 systems, for example, and assume all such systems are effectively identical, at least with respect to deployment activities. As an aside, the fact that these systems are often not identical leads to many software deployment problems. Given such a limitation, a software deployment system employing such an approach cannot be characterized as having a generic consumer side abstraction.

5.1  Content Delivery

Castanet [13], PointCast [21], ZIP Delivery [17], rdist [16], and rsync [25] implement content delivery systems. In this class of systems and technologies, the information being deployed is simply transferred from one or more information centers to a number of receiving nodes.

PointCast and ZIP Delivery provide news multi-casting services; they are in some sense an evolution of the Internet News system [11]. Unlike Internet News, they do not support bi-directional communication, but they provide news and advertisements through a programmable, active receiver application that can be configured to poll the news server for new information. The receiver application also has a library of local display capabilities that enable a graphical presentation of the information that is received.

A consumer can determine which data he wants to receive by subscribing to a number of "channels", possibly from different producers. The subscription or some configuration on the consumer site determines how often or in response to which event the channel has to be updated.

This same publish/subscribe protocol is adopted by Castanet. Castanet is another content deployment system that has some additional features to deal with applications rather than news. A Castanet channel is in essence a set of files. On a regular basis, depending on the configuration of the channel on the consumer site, the consumer pulls an update for that channel. In addition, Castanet enables the producer to customize the channel with a channel plug-in; a channel plug-in is an application that manages the communication with the consumer and interacts with the tuner at the consumer site.
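Since a channel is in essence a set of files, a differential update of a channel can be illustrated by comparing two manifests that map file paths to content digests. The following Python sketch is only an illustration of that general idea under stated assumptions; it is not a description of the protocol actually used by Castanet, and the names manifest_of and diff_channel are hypothetical.

```python
import hashlib
from typing import Dict, Set, Tuple

Manifest = Dict[str, str]  # file path -> content digest


def manifest_of(files: Dict[str, bytes]) -> Manifest:
    """Summarize a channel's file set as a mapping from path to SHA-256 digest."""
    return {path: hashlib.sha256(data).hexdigest() for path, data in files.items()}


def diff_channel(local: Manifest, remote: Manifest) -> Tuple[Set[str], Set[str]]:
    """Return (paths to fetch, paths to delete) that bring local in line with remote."""
    fetch = {path for path, digest in remote.items() if local.get(path) != digest}
    delete = set(local) - set(remote)
    return fetch, delete
```

A consumer that pulls on a regular basis would exchange manifests first and then transfer only the paths in the fetch set, which matters most on low-bandwidth or costly networks.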

A fat web page [4] is an HTML document that contains embedded information intended to be installed in a database running on a local machine. The embedded data can be program logic, text, outlines, user interface elements, or arbitrary binary information. The goal of fat pages is to simplify the task of distributing and installing small artifacts. Unfortunately, fat pages are only good for small artifacts and provide no real support for performing any of the deployment processes.

Rsync is a file synchronizer that operates on a set of files. The rsync operation involves two machines: the source machine, which holds the files or their new versions, and the target machine, where the set of files must be deployed. Rsync can be invoked by either the source or the target.

These systems evaluate at a low level with respect to the abstraction level they provide for either the consumer or product side. For example, if a software system has any complex dependencies on consumer side parameters or other software systems, none of these approaches directly supports these dependencies; they deliver the same set of files under all circumstances. What configuration does exist on the consumer side deals more with the communication mechanisms and does not specifically provide a consumer site model.

Content delivery may be treated as the simplest possible form of installation in which no target or product specific computation is carried out. It is also worth noting that these content delivery systems have adopted quite different technologies for carrying out the data transfer function with an eye toward making them scalable and efficient, especially in a low-bandwidth or costly network environment.

5.2  System Install and Update

NET-Install [1] is a deployment system that supports the release, installation, and de-installation activities of software deployment. A software producer creates a NET-Install script which includes a description of the deployment package as well as minimal dependencies and constraints (e.g., existence of particular files, operating system version, required disk space, etc.). This script is then interpreted at the consumer site by a plug-in in current browser technology.
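The kind of minimal dependency and constraint checking described here can be sketched as follows. This Python fragment is a generic illustration and does not reproduce the NET-Install script language; the classes PackageConstraints and Site, their fields, and the function unmet_constraints are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Set, Tuple


@dataclass
class PackageConstraints:
    """Constraints a package description might state (assumed set)."""
    required_files: List[str] = field(default_factory=list)
    min_os_version: Tuple[int, ...] = (0,)
    required_disk_bytes: int = 0


@dataclass
class Site:
    """The target site information the interpreter can obtain (assumed set)."""
    files: Set[str]
    os_version: Tuple[int, ...]
    free_disk_bytes: int


def unmet_constraints(pkg: PackageConstraints, site: Site) -> List[str]:
    """List every constraint the site fails; an empty list means "safe to install"."""
    problems = [f"missing file: {path}"
                for path in pkg.required_files if path not in site.files]
    if site.os_version < pkg.min_os_version:  # tuples compare element by element
        problems.append("operating system version too old")
    if site.free_disk_bytes < pkg.required_disk_bytes:
        problems.append("insufficient disk space")
    return problems
```

Checks of this form cover simple installations well, but they say nothing about relationships among distributed components.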

As such, NET-Install provides a simple consumer side abstraction by providing a standard mechanism for obtaining some target site information. The target sites, though, are limited to those running Microsoft Windows. Additionally, a limited deployment system abstraction is provided by the files, dependencies, and constraints listed in the package definition. The usefulness of these abstractions is severely limited by their simplicity. In other words, these abstractions cover most simple installations, but coverage would degrade as the complexity of the installation increased, such as when installing a distributed system.

OpenWEB netDeploy [20] is a deployment system that supports the release, installation, update, and de-installation activities of software deployment. OpenWEB netDeploy creates a deployment package which is merely a list of the files (either embedded or URLs) that comprise the system to be deployed. Update is also supported whenever the deployed system is executed by retrieving the latest version of the software system if it has changed. OpenWEB netDeploy is enabled through browser helper application/viewer technology. A Launcher utility on the consumer site is used for updating and executing the deployed system. Conditional configurations are possible based on limited consumer site configuration and file existence querying.

OpenWEB netDeploy provides a consumer site abstraction in the form of its Launcher/browser helper application combination. The completeness of the consumer site abstraction is very limited, but it does support multiple operating systems. The provided deployment system abstraction is much more constricting than the consumer site abstraction; all deployed systems are viewed as independent, finished file sets. Recent extensions have added the ability to specify dependencies between file sets.

InstallShield [10] is a deployment system that supports the installation and de-installation activities of software deployment, though it does not necessarily support Internet-based deployment. Generally speaking, InstallShield is a tool for building scripts to install Microsoft Windows-based software systems. InstallShield also provides the capability to create a single executable installation package that could be distributed over the Internet. Consumer site abstraction is provided through various mechanisms to query and interact with the target site, though still limited to Microsoft Windows platforms. The deployment system package describes the system to be deployed, allowing for conditional components, constraints, and dependencies. InstallShield only supports a limited set of deployment activities (i.e., installation and de-installation).

Oil Change [19] is a system for providing software updates to a consumer's computer via the Internet. Oil Change examines a consumer's site to determine all of the installed software and its versions. Using this list, Oil Change examines a "master list" of software and available updates; this master list is maintained by CyberMedia, Oil Change's producer. The automatic installation of updates is supported as well as notification of new updates. Oil Change does not support deployment processes other than update, and its centralized architecture is a clear scalability issue.

The FreeBSD porting system [7] supports the FreeBSD user community by organizing freely available software into a carefully constructed hierarchy known as the "ports collection." The FreeBSD porting system uses various forms of heuristics to determine a site's state and employs the results in building and installing a software package. The primary flaw in the system is that it embeds dependencies and other knowledge into makefiles, which makes it difficult to locate and manage information about software systems. The deployment process support is also limited to installation and de-installation.

5.3  Standardization

GNU Autoconf [12] does not directly support a software deployment activity. In general it supports a consumer site abstraction by providing a program to determine the consumer site configuration. Its techniques include inspecting the consumer site using heuristics and macros or asking a user. The Autoconf tools generate a makefile from a deployment system description, which can then be used in conjunction with the UNIX make command to build and install a software system.

There are also a number of formalisms for describing systems and sites for deployment purposes. The Desktop Management Task Force (DMTF) is the major organizational force here and is pushing a standard called the Management Information Format [3] (MIF) for specifying various properties about both hardware and software. Tivoli's Application Management Specification [27] (AMS) is derived from the MIF. It specifically targets the description of application software systems. The Simple Network Management Protocol [2] (SNMP) defines a standard for defining schema information about network components, primarily hardware components. In terms of abstraction, these systems are used to specify both the site abstraction and the software system abstraction. None of these systems specifically cover a particular deployment process.

5.4  Network Management

Microsoft's Zero Administration Initiative for Windows [14] refers to a set of core technologies that give control and manageability over Windows-based environments by automating such tasks as operating system updates and application installation, and providing tools for central administration and desktop system lock down. The Zero Administration Initiative combines Microsoft's network management system, Systems Management Server (SMS) [15], with new capabilities to lock down and centrally control user access to the computing resources.

SMS from Microsoft, TME-10 [26] from Tivoli, Netview from IBM, and OpenView [8] from Hewlett-Packard are representative of a number of complex network management systems. Their original purpose was to support the management of corporate local-area networks. They had specific capabilities such as detecting hardware failures and network disruptions and reporting problems for examination.

Recently, these systems have ventured beyond hardware and have begun addressing the problems of software management, including some parts of the deployment process. As a rule, these systems assume a homogeneous set of target sites within the corporate local-area network. Additionally, there is usually a logically centralized "producer", which is some designated central administration site (possibly multi-machine) for all officially approved system releases.

With respect to deployment life-cycle support, these systems support essentially all of the life-cycle activities. This is tempered by noting that they are oriented to the deployment of more-or-less standalone tools with few inter-system dependencies and with no complex activation requirements. These systems do not provide much support in the way of producer-side abstractions, again because the products are mostly standalone, and the consumer-side is of medium complexity because of the imposed homogeneity.

A specific capability of note is inventory, which may be considered a subpart of installation. These systems are capable of scanning a target site and determining the set of installed systems, and sometimes even the installed version of systems. This information is then brought back to a repository at a central site. Another capability of note is their ability to deploy software to a large number of targets. This is important for organizations that have networks of thousands of machines. Of course, this is partly made possible by the homogeneity of the targets.

6  Initial Research Results

In light of the current state of the art, we have pursued research to investigate the development of a software deployment framework to support the entire software deployment life-cycle. The initial results of this research have delivered two artifacts: the definition of a distributed, agent-based architecture to support software deployment and an initial prototype of this defined architecture. The defined architecture, called the Software Dock, and its prototype are described in more detail in the following two subsections.

6.1  The Software Dock Architecture

The Software Dock architecture consists of five primary components, as depicted in Figure 2 and described here:

There can be many field and release docks representing the interests of the many possible participants in the deployment process. Tying them together is WAM/E, which provides bi-directional communication pathways. Agent technology is used to provide a means of dynamically distributing functionality and enabling consumer-side processing of events on the behalf of producers. The following subsections describe each of these components in more detail.

It is important to point out that this proposed system, the Software Dock, is not intended as a legacy solution to software deployment problems. While current, real-world examples and systems will be used to illustrate the software deployment dilemma, applying the Software Dock to legacy systems is not necessarily where the computing community (i.e., producers and consumers) will see the greatest benefit. Rather it is when new software systems are designed to be Software Dock aware that the greatest benefits will be seen. Therefore the Software Dock proposes a direction for creating deployable software systems and facilitates the inclusion of software deployment as an integral part of software development.

 

6.1.1 Federated Deployment Registry

The federated deployment registry is central to the Software Dock architecture and is formed by conjoining the registries at all release and field sites [Figure 3]. The federated deployment registry is conceptual in the sense that it provides only a logically global name space, not a physical global implementation. The global name space enables a standard method to query the state of consumer and release sites, to subscribe for events, and to discover properties of the Software Dock environment in general.

The registry is organized as an n-ary tree. The tree model was chosen mainly for its simplicity, but also because it subsumes relevant existing models, such as the DMTF MIF format [3], the Microsoft Registry [9], the X resources model [23], and most file systems.

The schema of the registry is kept consistent across sites to ease the development of agents that access the information. Each tree node is a collection of name-attribute pairs. The name of an attribute is held by its parent node and not by the attribute itself. An attribute can be a primitive scalar type, such as a character string or an integer, or it can be a collection of attributes. Custom collection types can be created to facilitate schema definitions. The type of a collection is exploited to specify the structure of the sub-tree under that node. For example, a collection of type "Application" adheres to a specific sub-tree schema definition that is used to fully describe a software application.

Custom collection types are described as minimal schema definitions because they can be arbitrarily extended with additional attributes at run-time for proprietary reasons without affecting the type of the collection. This allows for schema augmentation without disrupting the behavior of agents.
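The tree organization and the notion of a minimal schema definition can be sketched in Python as follows. This is an illustrative model under stated assumptions and not the prototype's implementation: the class Collection, the slash-separated path syntax, and the attribute names required of an "Application" are all hypothetical.

```python
from typing import Dict, Set, Union

Scalar = Union[str, int]


class Collection:
    """A registry node: named attributes that are scalars or nested collections."""

    def __init__(self, type_name: str = "Collection") -> None:
        self.type_name = type_name
        # Names live in the parent's mapping, not in the attribute itself.
        self.attributes: Dict[str, Union[Scalar, "Collection"]] = {}

    def put(self, path: str, value: Union[Scalar, "Collection"]) -> None:
        """Store a value at a slash-separated path, creating collections as needed."""
        *parents, leaf = path.strip("/").split("/")
        node = self
        for name in parents:
            child = node.attributes.setdefault(name, Collection())
            if not isinstance(child, Collection):
                raise TypeError(f"{name!r} is a scalar, not a collection")
            node = child
        node.attributes[leaf] = value

    def get(self, path: str) -> Union[Scalar, "Collection"]:
        """Look up the attribute at a slash-separated path; KeyError if absent."""
        node: Union[Scalar, "Collection"] = self
        for name in path.strip("/").split("/"):
            if not isinstance(node, Collection):
                raise KeyError(path)
            node = node.attributes[name]
        return node


# Hypothetical minimal schema: attribute names every "Application" must carry.
MINIMAL_SCHEMAS: Dict[str, Set[str]] = {"Application": {"name", "version"}}


def conforms(node: Collection) -> bool:
    """A typed collection conforms if it has at least the attributes of its schema."""
    required = MINIMAL_SCHEMAS.get(node.type_name, set())
    return required <= set(node.attributes)
```

Because conformance only requires a subset relation, a producer can add proprietary attributes at run-time and agents written against the "Application" type continue to work.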

6.1.1.1  Registry Events

All events occur as a result of operations being performed on a registry. An event is defined as having a type, a name, and an attribute list. The event type directly corresponds to the registry operation that generated the event; for example, adding an attribute to the registry generates an event of type "add". The event name is derived from a path-name-like construction based on the event-generating attribute's path name in the registry. Finally, the attribute list associated with an event contains event-specific data. The event attributes depend on the type of the event, but can be augmented by user-supplied attributes. For example, when a registry attribute is changed, the event attributes include, by definition, the old and new values of the respective registry attribute if it is a scalar type. The initiator of the event could also have included additional event attributes at the time of performing the registry operation.
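The event structure described above can be made concrete with a small Python sketch. For brevity the registry is modeled as a flat mapping from path names to scalar values rather than as a tree, and the choice to include the new value in an "add" event is an assumption; only the old and new values of a "change" event are stated by definition. The class and method names are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional


@dataclass
class Event:
    type: str   # the registry operation that generated the event, e.g. "add"
    name: str   # derived from the path name of the affected attribute
    attributes: Dict[str, Any] = field(default_factory=dict)


class Registry:
    """A flat stand-in for the registry that records one event per operation."""

    def __init__(self) -> None:
        self.values: Dict[str, Any] = {}
        self.log: List[Event] = []

    def write(self, path: str, value: Any,
              extra: Optional[Dict[str, Any]] = None) -> Event:
        attrs = dict(extra or {})        # attributes supplied by the initiator
        if path in self.values:
            kind = "change"
            attrs.update(old=self.values[path], new=value)
        else:
            kind = "add"
            attrs.update(new=value)      # assumption: "add" carries the value
        self.values[path] = value
        event = Event(kind, path, attrs)
        self.log.append(event)
        return event
```

Writing the same path twice therefore yields an "add" event followed by a "change" event whose attributes carry both the old and the new value, plus anything the initiator chose to attach.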

6.1.1.2  Schema Definitions

The field and release dock registries are merely containers for structured data wrapped in an event-based interface. By themselves the registries do not provide much in the way of semantics; rather, they provide a means to support the definition and management of structured information, or schema. Standardizing on a schema for the field and release dock registries provides the semantic framework necessary to support the Software Dock environment.

Schema definition is critical to the Software Dock. Schema definition provides the backbone for platform independence. In addition, standardized schemas make it possible to create standard process definitions to perform generic deployment tasks. While initial schema development effort will build off of work done in DMTF's MIF [3] and AMS [27], the end result must define a direction for deployable software systems to pursue. Thus the goal of the registry schemas is to define how software systems should be defined for software deployment, not how to create schemas to deploy current software systems.

The registry typing system, as described in the introduction of this section, facilitates standard schema definitions. The registry typing system, based on the sub-tree structure definition of a registry collection, can be leveraged as part of the solution to the software deployment dilemma. The Software Dock system will include classes of standard registry types or schema definitions. In particular the field dock registry will include schemas for describing a target consumer site including its configuration, resources, and constraints. The release dock registry will include schemas for describing software systems including their components, the semantics of their components, constraints of the system, and dependencies of the system.

6.1.2  Field Dock

Typically, one field dock exists per consumer site, and it performs several roles on behalf of the consumer.

Field Dock Registry. The registry maintained by a field dock forms the basic description of the consumer site. The information contained in the registry can be thought of as a snapshot of the state of the field site. The field dock itself does not place any semantics on the content or the schema of the registry. All interpretation of semantics is performed by external entities, namely agents. When the site's state changes, an agent responsible for monitoring or effecting those changes uses the field dock's registry interface to reflect those changes in the registry. When an agent initiates an operation on the registry, the operation generates an event that is propagated to other, interested agents.

Event Propagation. The process of local event delivery and propagation is the responsibility of the field dock. When an event is generated the dock sends the event to any local agent that has subscribed to that event. In order to subscribe to an event, an agent uses the dock's event interface to tell the dock the type and the name of the event in which it is interested.
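Local event delivery of this kind can be sketched as a table of subscriptions keyed by event type and event name. The following Python fragment is a minimal illustration; the class name, the handler signature, and the use of exact matching on the event name are assumptions, since the matching rules are not specified here.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, Dict, List, Tuple

# A handler receives the event type, the event name, and the event attributes.
Handler = Callable[[str, str, Dict[str, str]], None]


class FieldDock:
    """Delivers each generated event to the local agents subscribed to it."""

    def __init__(self) -> None:
        self._subscribers: DefaultDict[Tuple[str, str], List[Handler]] = defaultdict(list)

    def subscribe(self, event_type: str, event_name: str, handler: Handler) -> None:
        """An agent tells the dock the type and name of the event it wants."""
        self._subscribers[(event_type, event_name)].append(handler)

    def publish(self, event_type: str, event_name: str,
                attributes: Dict[str, str]) -> int:
        """Send the event to every matching subscriber; return how many received it."""
        handlers = self._subscribers.get((event_type, event_name), [])
        for handler in handlers:
            handler(event_type, event_name, attributes)
        return len(handlers)
```

An agent that wishes to react to the addition of an application would subscribe to the "add" event for the corresponding registry path and perform its task inside the handler.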

Controlled Site Access and Abstraction. The final function performed by the field dock is controlled access to the underlying site. All operations that can be performed on a site are directly exported in the field dock's interface or they are indirectly exported through specific agents performing a defined task in response to the occurrence of a specific event or event pattern. Examples of the former include the registry interfaces and the event interfaces of the field dock described above. An example of the latter is an agent that resides at a site and adds an icon to the desktop whenever an application is added to the site. This agent provides an indirect interface to the user's desktop. The indirect interfaces created by the field dock's registry and event system can be quite sophisticated. For example, an agent could create an indirect interface to the site's file system by registering for specific events that semantically denote file operations, and then map these events into the file system itself.

As a result of these controlled interfaces the consumer site is afforded some level of security. Agents that come from external, untrusted sources will only be allowed access to these controlled interfaces. The consumer has full control over which interfaces are made available and therefore can control what operations agents can or cannot perform.
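The consumer's control over which interfaces are available to untrusted agents can be sketched as a table of exported operations, each marked as open or closed to external agents. This Python fragment is an illustration under stated assumptions; the class ControlledSite, its methods, and the interface names are hypothetical.

```python
from typing import Any, Callable, Dict, Set


class ControlledSite:
    """Exports named operations and restricts which ones external agents may call."""

    def __init__(self) -> None:
        self._interfaces: Dict[str, Callable[..., Any]] = {}
        self._open_to_external: Set[str] = set()

    def export(self, name: str, operation: Callable[..., Any],
               allow_external: bool = False) -> None:
        """The consumer decides, per interface, whether external agents may use it."""
        self._interfaces[name] = operation
        if allow_external:
            self._open_to_external.add(name)

    def invoke(self, name: str, *args: Any, external: bool = True) -> Any:
        """Call an exported operation; callers are treated as external by default."""
        if name not in self._interfaces:
            raise KeyError(name)
        if external and name not in self._open_to_external:
            raise PermissionError(f"interface {name!r} is not open to external agents")
        return self._interfaces[name](*args)
```

Treating every caller as external unless stated otherwise keeps the default conservative, so an interface must be opened deliberately before an agent from a remote site can reach it.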

6.1.3  Release Dock

A release dock works in support of producers and resides on a designated site within a software producing organization. The release dock shares much of its architecture and functionality with the field dock. The release dock maintains a registry of the producer's software releases and provides registry and event interfaces.

A software release in the release dock's registry is a collection of artifacts, such as executables, libraries, documentation, dependencies, and constraints. It also includes the agents responsible for all of the deployment activities, including configuring, installing, maintaining, and de-installing the software.

Like the field dock, the release dock generates events when operations are performed on its registry. These events are used to indicate changes in the state of releases, such as the release of a new software system, a new version of an existing system, or a patch to an existing system.

Organizations use a release dock much like current FTP sites are used for distributing software, though the release dock is more sophisticated. The release dock may provide a user interface, perhaps through a Web page, to allow consumers to browse the available releases. The release dock, however, does not distribute software releases directly. Instead, when the consumer initiates a download, the release dock sends an agent to the consumer's site. This installation agent is responsible for installing the requested software release by interacting with the consumer site's field dock to obtain the appropriate configuration information. Once the configuration information is obtained, the agent retrieves the properly configured components from its release dock and installs them at the field site.
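The download sequence just described (dispatch an agent, obtain the configuration, retrieve the matching components, install them) might look as follows. This is a minimal sketch under stated assumptions: all class names, method names, the "platform" configuration key, and the "editor" release are hypothetical and are not taken from the prototype.

```python
class FieldDock:
    """Hypothetical consumer-side dock holding site configuration."""

    def __init__(self, config):
        self.config = config
        self.installed = {}

    def get_configuration(self):
        return dict(self.config)

    def install(self, name, components):
        self.installed[name] = components


class ReleaseDock:
    """Hypothetical producer-side dock holding releases per platform."""

    def __init__(self, releases):
        # releases maps a release name to {platform: [component, ...]}.
        self.releases = releases

    def components_for(self, name, config):
        return self.releases[name][config["platform"]]

    def request_download(self, name, field_dock):
        # The dock does not ship the release itself; it dispatches an agent.
        agent = InstallationAgent(self, name)
        agent.run(field_dock)
        return agent


class InstallationAgent:
    def __init__(self, release_dock, name):
        self.release_dock = release_dock
        self.name = name

    def run(self, field_dock):
        # 1. Ask the consumer's field dock for configuration information.
        config = field_dock.get_configuration()
        # 2. Retrieve the properly configured components from the release dock.
        components = self.release_dock.components_for(self.name, config)
        # 3. Install them at the field site.
        field_dock.install(self.name, components)


release_dock = ReleaseDock(
    {"editor": {"win95": ["editor.exe"], "solaris": ["editor.bin"]}}
)
field_dock = FieldDock({"platform": "solaris"})
release_dock.request_download("editor", field_dock)
```

The point of the indirection is that the choice of components is made at the consumer site, using information the release dock does not hold in advance.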

6.1.4  Agents

Since semantic knowledge is not part of field or release docks, most of their functionality is embodied in the set of agents that they host at any given time. Agents register for specific events and then perform specific tasks once these events have been received. The actions performed by agents can cause other events to be generated, thus stimulating the system with additional event-action responses. Through these techniques, agents provide a large portion of the functionality in the Software Dock environment.
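The event-action chaining described above can be illustrated with a small sketch. The Dock class, the event type names, and the two agent functions are hypothetical; the only point is that an agent's action may itself generate an event to which another agent responds.

```python
from collections import defaultdict


class Dock:
    """Hypothetical dock with a minimal event registration interface."""

    def __init__(self):
        self.subscribers = defaultdict(list)
        self.log = []

    def register(self, event_type, handler):
        self.subscribers[event_type].append(handler)

    def publish(self, event_type, name, attributes=None):
        self.log.append((event_type, name))
        for handler in list(self.subscribers[event_type]):
            handler(self, name, attributes or {})


def install_agent(dock, name, attributes):
    # Reacting to a new release changes the site, which generates another event.
    dock.publish("application-added", name)


def desktop_agent(dock, name, attributes):
    # Reacts to the event produced by the install agent.
    dock.log.append(("icon-created", name))


dock = Dock()
dock.register("release-available", install_agent)
dock.register("application-added", desktop_agent)
dock.publish("release-available", "editor")
```

After the single publish call, the log holds three entries in order: the original event, the event generated by the install agent, and the action taken by the desktop agent.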

From the perspective of a particular site, there are two classes of agents: internal and external. Internal agents extend the functionality of a local dock and, to some extent, are trusted at that site. External agents are obtained from remote sites to perform some particular function on behalf of a remote organization. Therefore, external agents are not trusted to the same extent as internal agents.

An external agent that comes from a release dock is generically referred to as a deployment agent. The most common deployment agent is an agent responsible for installing a software system. This installation agent is typically downloaded from a release dock directly by a consumer or indirectly by another agent. The downloaded installation agent then proceeds to install the software on behalf of the consumer, using the mechanisms provided by the Software Dock. When an installation agent installs a software system, it may additionally install other deployment agents at the site to perform tasks such as updating the software when updates become available.

In contrast to external agents, which mainly perform deployment activities, internal agents provide three major capabilities: viewing, abstraction, and isolation. A viewing agent provides a user interface for accessing, browsing, or manipulating a site's registry. Examples include an agent that provides a graphical interface for accessing applications installed at the site, or one that adds an entry to the Windows 95 start menu whenever an application is added to the site. To perform these tasks, an agent needs only to register with the field dock for the specific application events.

Other internal agents define abstract interfaces for operations whose implementation requires site-specific knowledge. They accomplish this by subscribing to selected events and, in response, by performing site-specific actions. In effect, they provide an extended, controlled interface to the underlying site for use by other agents, typically deployment agents. A good example of such an agent is one that creates an indirect interface to the local site's file system. This type of agent registers with the field dock for events that semantically indicate file operations, such as events occurring under a registry node that represents an application's root directory. Events that occur under this registry node are semantically equivalent to the creation and updating of sub-directories and files in the site's file system.

In effect, an internal agent that provides file system access allows external agents to write to the local site's file system without having direct access. Thus, internal agents support the isolation of potentially untrusted external agents from the local site's resources. This adds an important degree of security and access control to the whole deployment process.
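A minimal sketch of such a file system agent follows. It is not the prototype's implementation: the class name, the event names "directory-created" and "file-updated", and the policy of confining all writes to a single root directory are assumptions chosen for illustration.

```python
import tempfile
from pathlib import Path


class FileSystemAgent:
    """Hypothetical internal agent mapping registry events to file operations."""

    def __init__(self, root):
        self.root = Path(root).resolve()

    def on_event(self, event_name, rel_path, data=""):
        target = (self.root / rel_path).resolve()
        # Refuse anything that would escape the application's root directory.
        if target != self.root and self.root not in target.parents:
            return False
        if event_name == "directory-created":
            target.mkdir(parents=True, exist_ok=True)
        elif event_name == "file-updated":
            target.parent.mkdir(parents=True, exist_ok=True)
            target.write_text(data)
        else:
            # Unknown events are ignored rather than acted upon.
            return False
        return True


with tempfile.TemporaryDirectory() as site_root:
    agent = FileSystemAgent(site_root)
    ok = agent.on_event("file-updated", "app/readme.txt", "hello")
    escaped = agent.on_event("file-updated", "../outside.txt", "x")
    content = (Path(site_root) / "app" / "readme.txt").read_text()
```

The external agent only ever produces events; every path check and every write happens inside the trusted internal agent, which is where the isolation comes from.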

Field and release dock implementations are accompanied by a variety of predefined agents for performing specific tasks. Generic agent classes are also provided for performing simple software system installations and updates by interpreting the standard schema in both the field and release dock registries. It is also possible to create agents from scratch using the interfaces provided by the field and release dock servers.

6.1.5  Wide-Area Messaging/Event System

The main facilitator of interaction among components in the Software Dock architecture is event notification. As previously described, an event has a type, a name, and a list of attributes. Until this point, the discussion of events has largely been restricted to propagation within a site, only alluding to non-local propagation among sites.

WAM/E is the component responsible for propagating events across a wide-area network to sites interested in global events. Its general operation is similar to that of the local event propagation mechanism. The difference is that the subscriber to locally propagated events is an agent, while the subscribers to globally propagated events are the docks themselves, which in turn propagate those events to their local agents. Thus, WAM/E provides an interface by which a dock can register interest in events. It also provides an interface by which a dock can inject an event consisting of a type, a name, and an attribute list.
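The two-level propagation can be sketched as follows. Since no WAM/E-like component exists in the prototype, everything here is hypothetical: the class WideAreaEventService and the method names register_interest, inject, and deliver merely mirror the two interfaces named in the text.

```python
from collections import defaultdict


class WideAreaEventService:
    """Stand-in for WAM/E: its subscribers are docks, not agents."""

    def __init__(self):
        self.interested = defaultdict(list)

    def register_interest(self, event_type, dock):
        self.interested[event_type].append(dock)

    def inject(self, event_type, name, attributes):
        for dock in self.interested[event_type]:
            dock.deliver(event_type, name, attributes)


class Dock:
    """Hypothetical dock that forwards global events to its local agents."""

    def __init__(self, site):
        self.site = site
        self.agents = defaultdict(list)

    def register_agent(self, event_type, agent):
        self.agents[event_type].append(agent)

    def deliver(self, event_type, name, attributes):
        for agent in self.agents[event_type]:
            agent(self.site, name, attributes)


received = []
wame = WideAreaEventService()
field_a, field_b = Dock("site-a"), Dock("site-b")
field_a.register_agent(
    "update",
    lambda site, name, attrs: received.append((site, name, attrs["version"])),
)
wame.register_interest("update", field_a)
wame.register_interest("update", field_b)  # no local agents at site-b
wame.inject("update", "editor", {"version": "1.1"})
```

The wide-area service never sees individual agents, so the number of its subscribers grows with the number of sites rather than with the number of agents.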

It is also likely that WAM/E will need to participate in artifact propagation. Certain events, such as update notifications, have an associated artifact (i.e., the update patch) that may need to be delivered along with the event. Various options, from physically including the artifact with the event to caching the artifact within the WAM/E infrastructure, need to be explored. WAM/E also needs to address the issues of event visibility and scoping. It is necessary to limit the visibility of events for performance as well as privacy reasons. WAM/E is the subject of further research [22].

6.2  Software Dock Baseline

A prototype of the Software Dock currently exists as a proof of concept. The prototype is implemented in C++ and Java using CORBA technology. Only the most basic features of the system described herein have been implemented. Specifically, an initial version of the field dock has been implemented, and a modified version of SRM [28] is being used as an initial release dock. A handful of agents have been created to support a demonstration scenario that deploys an actual software system. No effort has been expended to create standard agent definitions or a standard means to create agents for specific deployment tasks.

The registries have been sparsely implemented in order to support only the most basic requirements to prove the concepts described in this proposal. Simple support for registry manipulation and event generation is included. The schemas of the registries have not been rigorously defined and only include basic declarative information that pertains solely to the demonstration scenario that was used as a motivating example. There is no implementation of a WAM/E-like component in the current prototype.

This prototype has been successfully used to demonstrate the deployment of an actual software system called the Online Learning Academy (OLLA). OLLA is an HTML content-based system occupying 45 megabytes in over 1700 files. OLLA is intended to be installed locally at a consumer site and is functionally dependent upon two other software systems. In order to be fully installed, OLLA requires various client-side configuration and processing steps. In addition to the installation of OLLA, the Software Dock prototype was used to demonstrate a system update cycle. These initial results indicate that the Software Dock framework is a viable approach and that further exploration is needed to determine the full extent of its utility.

7  Research Plan

Determining the full utility and feasibility of the Software Dock framework requires that specific research areas be explored further. These areas represent issues that either have not yet been addressed in our work or in related research, or have been addressed only in a cursory fashion. In addition to these research issues, an evaluation plan is required to validate proposed or developed solutions. This section describes these topics in detail.

7.1  Open Research Issues

In order to reach a point where evaluation and validation of the Software Dock is possible, there are four specific areas of research to pursue and resolve; the descriptions of these areas follow.

7.1.1  Complete Implementations of Architectural Components

Currently, only very initial versions of the field and release docks exist. These implementations must be improved to fully incorporate the proposed registry model, which will include a registry node typing system, a flexible query mechanism to support both positional and type-based queries, and a flexible event propagation mechanism to support both positional and type-based event subscriptions. Incorporating support for agent-based technology in the field and release docks also requires investigation, as we search for the best way to integrate agents while providing good performance and a reasonable level of security.
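To make the distinction between positional and type-based queries concrete, the following sketch shows one possible reading. The registry model has not yet been defined, so the path-based layout, the node type names, and the two query methods are all assumptions made for illustration.

```python
class Registry:
    """Hypothetical registry whose nodes carry both a position and a type."""

    def __init__(self):
        # Maps a hierarchical path to a (node_type, value) pair.
        self.nodes = {}

    def add(self, path, node_type, value=None):
        self.nodes[path] = (node_type, value)

    def query_by_position(self, prefix):
        # Positional query: every node located below the given node.
        prefix = prefix.rstrip("/") + "/"
        return sorted(p for p in self.nodes if p.startswith(prefix))

    def query_by_type(self, node_type):
        # Type-based query: every node of a given type, wherever it is located.
        return sorted(p for p, (t, _) in self.nodes.items() if t == node_type)


registry = Registry()
registry.add("/apps/editor", "application")
registry.add("/apps/editor/bin/editor.exe", "executable")
registry.add("/apps/viewer", "application")
registry.add("/apps/viewer/doc/manual.html", "documentation")

below_editor = registry.query_by_position("/apps/editor")
applications = registry.query_by_type("application")
```

Event subscriptions could use the same two selectors, so that an agent subscribes either to everything under a node or to every node of a type.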

Additionally, an initial version of WAM/E must be introduced to connect the field and release docks at the Internet level. This initial version will be implemented from scratch, adapted from an existing event system, or taken from a related research effort [22]. The implementation of a WAM/E-like component is a research issue in its own right, but its inclusion in the Software Dock architecture is critical to support Internet-scale, bi-directional, semi-continuous communication between producers and consumers.

7.1.2  Complete Definitions of Registry Schemas

The schema definitions found in the field and release dock registries are at the heart of the proposed Software Dock system. In order for these schemas to fulfill their role they must be fully defined to the greatest extent possible. Effort to combine knowledge from related work, such as DMTF and Microsoft's Registry, with gained experimental knowledge is crucial to ensure that the standardized registry schemas provide reasonable deployment process coverage. Once rigorous definitions of both the field and release dock registries have been created it will be possible to pursue the creation of standard agent classes to perform a variety of generic deployment processes.

7.1.3  Standardized Agent Definition and Creation

The proposed Software Dock architecture gives agents a pivotal role. Since many agents need to be created in order for a producer to participate in this environment, it is imperative that agent definition and creation be simple processes. To alleviate the agent creation burden, it is necessary to create a collection of standard agent definitions that developers can use "as is" if no customization is needed, or as a base from which to derive customized agents. These standard agents will interpret the standard schemas that have been created in order to perform their tasks. In addition, tools to help create agents, such as code generators or component technologies, must be explored.
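One way to picture a standard agent that can be used unchanged or specialized is a base class with an overridable hook. This is a sketch of the idea only: the class names, the after_install hook, and the dictionary standing in for a schema-conformant release description are hypothetical.

```python
class StandardInstallAgent:
    """Hypothetical generic agent driven entirely by a release description."""

    def install(self, release, site):
        # Interpret the (assumed) schema: copy every listed artifact to the site.
        for artifact in release["artifacts"]:
            site.append(artifact)
        self.after_install(release, site)

    def after_install(self, release, site):
        # Hook for customization; the standard agent does nothing here.
        pass


class LicensedInstallAgent(StandardInstallAgent):
    """Customized agent derived from the standard one."""

    def after_install(self, release, site):
        site.append("license.key")


release = {"name": "editor", "artifacts": ["editor.exe", "manual.html"]}

plain_site = []
StandardInstallAgent().install(release, plain_site)

custom_site = []
LicensedInstallAgent().install(release, custom_site)
```

A producer whose release is fully described by the schema uses the standard agent as is; a producer with one extra step overrides only the hook.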

7.1.4  Software Dock Solutions to Real-World Deployment Scenarios

Real-world deployment scenarios must be used to demonstrate that the Software Dock architecture provides utility. Complex and interesting software systems that require sophisticated software deployment solutions will be sought. By examining these real-world scenarios, a better understanding of the requirements of the Software Dock will be gained, and this understanding can be fed back into the Software Dock implementation. Additionally, effort to create fully working deployment solutions for such scenarios will demonstrate the feasibility of the Software Dock approach.

7.2  Evaluation

The evaluation of the Software Dock is divided into two parts: verifying that it provides a reasonable solution to the various software deployment activities, and verifying that it provides a better solution than other software deployment systems.

7.2.1  Evaluation as a Software Deployment Solution

As described in Section 4 there are many ways to characterize a good software deployment system. In addition, the aforementioned section also described many capabilities that are required in order for a software deployment system to meet the needs of those who will be using it, namely producers and consumers. Verifying that the Software Dock exhibits these characteristics will be one part of evaluating it as a software deployment solution. In general this verification will result from a mapping between Software Dock architectural constructs and the required characteristics or capabilities described in Section 4.

Further validation of the Software Dock as a software deployment solution will be provided through the creation of Software Dock solutions to real-world software deployment scenarios. These solutions may then be used as a means to gauge various issues of scalability and performance of the Software Dock by varying certain parameters in the problem space. The recorded results will show whether the Software Dock solutions perform within an acceptable range of what is considered reasonable for such a deployment task as well as show whether the solutions scale at some reasonable, non-exponential rate.

To gather such metrics, each activity in the software deployment life-cycle must be treated independently, since the parameters of the problem space that affect any given software deployment activity are independent of the other activities. As an example, the size and the number of artifacts to deploy directly affect the install and update activities. On the other hand, the amount of event traffic directly affects the update and adapt activities, but not the install activity. Therefore, by varying specific parameters for specific deployment activities, it can be determined whether the overhead associated with the Software Dock solution scales at a pace no greater than the increasing scale of the parameters themselves.
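The scaling criterion stated above can be expressed as a simple check: between any two consecutive measurements, the overhead should grow by a factor no larger than the factor by which the parameter grew. The function name and the sample figures below are invented for illustration; they are not measurements of the Software Dock.

```python
def scales_acceptably(measurements):
    """Check that overhead grows no faster than the varied parameter.

    measurements is a list of (parameter_value, overhead) pairs sorted by
    parameter value; all values are assumed to be positive.
    """
    for (p0, o0), (p1, o1) in zip(measurements, measurements[1:]):
        if o1 / o0 > p1 / p0:
            return False
    return True


# Made-up data: number of artifacts versus install overhead in seconds.
roughly_linear = [(100, 2.0), (200, 3.9), (400, 7.5)]
quadratic = [(100, 2.0), (200, 8.0), (400, 32.0)]

linear_ok = scales_acceptably(roughly_linear)
quadratic_ok = scales_acceptably(quadratic)
```

Each deployment activity would be checked separately against the parameters that affect it, for example artifact count for install and event traffic for update.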

7.2.2  Evaluation with Respect to Other Solutions

In order to evaluate the Software Dock with respect to other deployment solutions, it may not be possible to provide a strictly quantitative set of metrics. For example, the number of lines of Perl script required to install a particular software system has no direct counterpart in the Software Dock solution, since it is hoped that the Software Dock solution will be created by the software producer describing their system using the schema definitions of the release dock registry. Where possible, though, direct quantitative metrics will be used. By creating Software Dock solutions to real-world deployment scenarios, it will be possible to compare each solution's performance to that of the existing solution, if one is available.

A more complete evaluation of the Software Dock with respect to other deployment solutions will be provided through critical evaluation of the Software Dock and other deployment solutions with respect to the characterizations and required capabilities described in Section 4. The issues described in Section 4 are critical in providing a software deployment solution that will meet the needs of producers and consumers into the future.

These issues will be used as a sort of litmus test to eliminate certain software deployment solutions as viable alternatives. For example, systems that provide limited software deployment process coverage, offer a very simple abstraction for a deployable artifact, or assume homogeneity of platforms shall be eliminated as viable approaches since they do not meet the requirements set forth in Section 4. Any systems that have not been eliminated through the critical evaluation phase will be further evaluated to determine if there are any other limitations or shortcomings that have not been exposed by the issues of Section 4. The systems that remain after this evaluation phase will be considered viable software deployment solutions.

8  Conclusion

The Software Dock represents a radical departure from the current view of software development. The growing complexity in deploying software systems has made it essential that software producers start to consider software deployment as an integral part of software development. The management of deployed software systems has been an undue burden on software consumers for too long. It is a natural progression to place this burden back on the software producer since the producer has all the necessary knowledge for the task.

The Software Dock is an example of a framework to support software deployment. The Software Dock creates servers, the release and field docks, that provide abstractions for the two participants in software deployment, the producers and the consumers respectively. The connectivity of these producer and consumer abstractions is leveraged and the introduction of agent-based technology serves as a mediator between the two participants to perform software deployment activities. Software deployment will then become a mutually evolving process of negotiation and cooperation between the producer and the consumer.
 
 

Bibliography

    1. NET-Install, 1997. http://www.twenty.com.
    2. J. Case, K. McCloghrie, M. Rose, and S. Waldbusser. Structure of Management Information for Version 2 of the Simple Network Management Protocol (SNMPv2). RFC 1902, January 1996.
    3. Desktop Management Task Force. Desktop Management Interface Specification, Version 2.0, 27 March 1996.
    4. Fat Pages, 1997. http://www.scripting.com/fatPages/.
    5. Stuart I. Feldman. Make -- a program for maintaining computer programs. Software -- Practice and Experience, 9:255--265, 1979.
    6. G. Fowler, D. Korn, H. Rao, J. Snyder, and K.-P. Vo. Configuration Management. In B. Krishnamurthy, editor, Practical Reusable UNIX Software, chapter 3. Wiley, New York, 1995.
    7. The FreeBSD Documentation Project. FreeBSD Handbook. FreeBSD Documentation Project, 15 May 1996. http://ftp.freebsd.org/pub/FreeBSD/docs/handbook.tex.
    8. Hewlett-Packard Company, 1997. http://hpcc997.external.hp.com/openview/index.html.
    9. Jerry Honeycutt. Using the Windows 95 Registry. Que Publishing, Indianapolis, IN, 1996.
    10. InstallShield, 1997. http://www.installshield.com.
    11. Brian Kantor and Phil Lapsley. Network news transfer protocol, a proposed standard for the stream-based transmission of news. RFC 977, February 1986.
    12. D. Mackenzie, R. McGrath, and N. Friedman. Autoconf: Generating Automatic Configuration Scripts. Free Software Foundation, Inc., April 1994.
    13. Marimba, Inc. Castanet White Paper, 1996. http://www.marimba.com/developer/castanetwhitepaper.html.
    14. Microsoft, Inc., 1997. http://www.microsoft.com/windows/innovation/.
    15. Microsoft, Inc., 1997. http://www.microsoft.com/smsmgmt/.
    16. Daniel Nachbar. When network file systems aren't enough: Automatic software distribution revisited. In Proceedings of the USENIX 1986 Summer Technical Conference, pages 159--171, Atlanta, GA, June 1986. USENIX Association.
    17. NETDelivery Corporation, 1997. http://www.netdelivery.com/.
    18. Object Management Group. The Common Object Request Broker: Architecture and Specification, Revision 2.0, July 1995.
    19. Oil Change, 1997. http://www.cybermedia.com.
    20. OpenWEB netDeploy, 1997. http://www.osa.com.
    21. PointCast, 1997. http://www.pointcast.com.
    22. D.S. Rosenblum and A.L. Wolf. A Design Framework for Internet-Scale Event Observation and Notification. Technical Report 97-06, Department of Information and Computer Science, University of California, Irvine, California, February 1997.
    23. Robert W. Scheifler and James Gettys. X Window System. Digital Press, 3rd edition, 1992.
    24. Sun Microsystems Inc. JavaBeans Specification 1.0, 1996. http://www.javasoft.com/beans/.
    25. Andrew Tridgell and Paul Mackerras. The rsync algorithm. Technical Report TR-CS-96-05, http://cs.anu.edu.au/techreports/1996/index.html, June 1996.
    26. TME/10 Software Distribution. http://www.tivoli.com/products/Courier.
    27. Tivoli Systems Inc. Applications Management Specification, 1995.
    28. A. van der Hoek, R.S. Hall, D.M. Heimbigner, and A.L. Wolf. Software Release Management. In Proceedings of the 6th European Software Engineering Conference, Zürich, Switzerland, 1997.