SERL Software Engineering Research Laboratory
University of Colorado

Tolerating Intrusions Through Secure System Reconfiguration

Overview

Our goal is to design, prototype, and evaluate a framework for tolerating intrusions in large-scale, heterogeneous, networked computing enterprises. The framework, which we refer to as Willow ("bend don't break"), derives from a synergistic blending of leading-edge results from the disciplines of fault tolerance, configuration management, and security. In our view, intrusion tolerance can best be achieved through the assured reconfiguration of systems in a secure, timely, and automated fashion. Our innovative approach to reconfiguration-based intrusion tolerance provides a uniform, powerful mechanism for both proactive and reactive reconfiguration. Proactive reconfiguration adds, removes, and replaces components and interconnections to cause a system to assume postures that achieve enterprise-wide intrusion tolerance goals, such as increased resilience to specific kinds of attacks, increased preparedness for recovery from specific kinds of failures, or relaxed tolerance procedures once a threat has passed. Reactive reconfiguration adds, removes, and replaces components and interconnections to restore the integrity of compromised systems in bounded time after a failure, either by restoring the system to some previously consistent state, adapting the system to some alternative non-compromised configuration, or gracefully shedding non-trustworthy data and functionality.

Willow Framework

Figure 1: Willow Framework.
The critical computing enterprises that we are targeting are formed from large numbers of components assembled into complex structures, and managing them at the scale of a physically distributed enterprise is beyond the capacity of manual procedures. Their components can also originate from multiple sources, some trusted and others not. Examples of such systems abound, ranging from military command and control to vital national security assets such as the Federal Reserve banking system. In fact, as part of the effort proposed here we intend to leverage our ongoing collaborations with the Information Processing Division of the Federal Reserve Bank to gain continual, real-world feedback on the efficacy of our results.

The Willow framework will consist of two interrelated elements: an application architecture for designing large-scale, dynamically reconfigurable systems and a common infrastructure for building and securely operating those systems (see Figure 1). The architecture embodies essential principles for effectively integrating reconfiguration control mechanisms with COTS components, thus permitting application developers to concentrate on the specialized aspects of their systems. A coordinated set of models, a standard component reconfiguration interface, and an agent-based policy mechanism together form the innovative core of the architecture. The infrastructure element of Willow will provide several common components for inclusion into implementations adhering to the architecture, as well as several development tools for generating, optimizing, and verifying distributed reconfiguration control code. Key to the utility of our approach is the use of a novel high-level language for specifying intrusion tolerance policies that are then translated into the control code. Also used in realizing control is an advanced software configuration and deployment system that is designed to operate in a wide-area, large-scale, and heterogeneous enterprise environment. The infrastructure is secured using a new technique we refer to as view-based delegation of trust, and interfaces to sophisticated posture selection, intrusion detection, and intrusion notification facilities developed outside the scope of this program (e.g., CIDF and CC2). The combination of Willow and those facilities provides a seamless and secure foundation for protection, detection, response, and recovery in the face of coordinated, dispersed attacks against critical computing enterprises.

Development Process

Figure 2: Development Process.
The intrusion tolerance approach provided by the Willow infrastructure depends on the existence of a number of models and specifications, configurable components, and reconfigurable systems. These artifacts are defined or implemented as part of the development process shown in Figure 2. The left side of that figure shows a simplified process for developing a reconfigurable system. We assume that various COTS components have been developed and made available to system developers, who in turn combine them with glue code and Willow infrastructure components to produce a complete application system capable of supporting reconfiguration.

The middle of Figure 2 shows the process for defining an instance of the reconfigurable systems model (RSD), which is the integrated model for describing the possible legal configurations for a software system. At least conceptually, it is constructed by system developers in parallel with their construction of the actual software system. We envision the definition process as involving two submodels. The deployment model represents the possible configuration information for the system as it will be deployed (e.g., installed). This is analogous to our existing Software Dock Deployable Software Description (DSD) specification. The run-time model represents the possible configuration information for the system as it will be executed. The analog here is our Ménage configurable architecture specification. The two submodels are merged to produce an initial RSD, which represents the possible configurations under normal deployment and activation scenarios.
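
The merge of the two submodels can be illustrated with a small sketch. It is only a toy under stated assumptions: each submodel is reduced to a mapping from a configuration name to a set of component names, and a combined configuration is treated as legal when every component that runs is also installed. The function name `merge_submodels`, the data layout, and the legality rule are hypothetical and are not taken from the DSD or Ménage specifications.

```python
from itertools import product


def merge_submodels(deployment, runtime):
    """Combine a deployment model and a run-time model into an initial RSD.

    Assumption (not from the Willow documents): a pairing is legal only
    if every active component is also part of the deployed set.
    """
    rsd = set()
    for (d_name, installed), (r_name, active) in product(
        deployment.items(), runtime.items()
    ):
        if active <= installed:  # subset test: nothing runs uninstalled
            rsd.add((d_name, r_name))
    return rsd


# Hypothetical deployment configurations: name -> installed components.
deployment = {
    "full": {"server", "cache", "logger"},
    "minimal": {"server"},
}

# Hypothetical run-time configurations: name -> active components.
runtime = {
    "normal": {"server", "cache"},
    "degraded": {"server"},
}

initial_rsd = merge_submodels(deployment, runtime)
```

In this example the pairing of the `minimal` deployment with the `normal` run-time configuration is excluded, because `cache` would have to run without being installed.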

The right side of Figure 2 shows the process for developing the agents that automate tolerance strategies. We assume that some external authority provides the specific set of intrusions for which tolerance is desired, as well as the tolerance postures that the system should be able to assume. The reasoning here is closely related to that of fault tolerance; it is impossible to respond to all conceivable faults, so some authority must define the set of faults of interest. Given the set of intrusions and postures of interest and the initial RSD, an intrusion tolerance analyst constructs the tolerance specification. This specification maps intrusions and postures to corresponding reconfigurations defined in the RSD. Note that new configurations may be necessary to tolerate specific intrusions, so we assume that the tolerance analyst may modify the RSD to produce the final RSD.
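
As a rough sketch of this step, the tolerance specification can be pictured as a mapping from triggers to target configurations, with the final RSD obtained by adding any targets that the initial RSD lacks. The `Trigger` class, the configuration names, and the helper `missing_configurations` are hypothetical illustrations, not part of the Willow specification language.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Trigger:
    kind: str  # "intrusion" or "posture"
    name: str


# Hypothetical initial RSD, reduced to a set of configuration names.
initial_rsd = {"normal", "read_only"}

# Hypothetical tolerance specification: trigger -> target configuration.
tolerance_spec = {
    Trigger("posture", "elevated_threat"): "read_only",
    Trigger("intrusion", "compromised_cache"): "no_cache",
}


def missing_configurations(rsd, spec):
    """Return target configurations that the RSD does not yet define."""
    return set(spec.values()) - set(rsd)


# The analyst extends the RSD with the configurations needed for tolerance.
to_add = missing_configurations(initial_rsd, tolerance_spec)
final_rsd = initial_rsd | to_add
```

Here the `no_cache` configuration is absent from the initial RSD, so it is the one addition the analyst would need to define before the specification and the RSD agree.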

The resulting tolerance specification and the final RSD are both provided as inputs to a translator. This tool translates the specifications into code in the form of tolerance agents, which are responsible for the rapid execution of reconfigurations in the face of intrusion and posture triggers. In addition, certain kinds of analysis are performed to verify, for example, that the RSD and the tolerance specification are consistent with respect to the set of legal configurations. These analyses provide confidence in the assurability of automated reconfigurations.
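
The following sketch shows one way a translator of this kind might behave: it first checks that every configuration named in the tolerance specification is legal according to the RSD, and only then emits one agent per trigger. Everything here is an assumption made for illustration, including the function name `translate`, the use of plain strings for triggers and configurations, and the representation of an agent as a Python callable.

```python
def translate(rsd, spec):
    """Check spec/RSD consistency, then return one agent per trigger.

    rsd:  set of legal configuration names (hypothetical representation)
    spec: mapping from trigger name to target configuration name
    """
    unknown = {target for target in spec.values() if target not in rsd}
    if unknown:
        raise ValueError(
            "tolerance specification targets configurations "
            f"outside the RSD: {sorted(unknown)}"
        )

    def make_agent(target):
        def agent(system_state):
            # A real agent would drive deployment and activation steps;
            # this toy version only records the new configuration.
            system_state["configuration"] = target
            return system_state

        return agent

    return {trigger: make_agent(target) for trigger, target in spec.items()}


agents = translate(
    rsd={"normal", "no_cache"},
    spec={"compromised_cache": "no_cache"},
)
state = agents["compromised_cache"]({"configuration": "normal"})
```

Rejecting an inconsistent specification at translation time, rather than when a trigger fires, reflects the point made above that analysis should happen before agents are deployed.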

The results of the development process, namely configurable systems, RSDs, and tolerance agents, are deposited into depots, where they wait, ready for use in case a tolerance trigger is fired.

Reconfiguration Process

Figure 3: Reconfiguration Process.
The ultimate purpose of the Willow framework is to support tolerance of intrusions by providing sophisticated end-to-end reconfiguration. Figure 3 shows a simplified view of the process that carries out reconfigurations.

As mentioned above, this process is driven by events representing posture and intrusion triggers. Posture triggers indicate that the controlled system should be reconfigured in anticipation of selected future intrusions. Intrusion triggers indicate that the controlled system has already been compromised and that it should initiate reconfiguration to recover or to provide some degraded functionality, depending on the nature and extent of the intrusion. In either triggering situation, notifications describing the events are delivered by the Willow event service to specific FRCs spread across the computing enterprise network. This mapping was determined in the development process as part of the intrusion tolerance analysis. The FRCs then dispatch the appropriate agents to carry out reconfiguration activities, parameterized by instances of the three kinds of models. In general, the activities involve taking activated components and deactivating them, requesting new and/or replacement configurable components from depots and configuring them for the site where they will reside, and taking configured components cached at the site and activating them. The results of these activities are recorded by modifying the relevant models.
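
The three kinds of activity described above (deactivate, fetch and configure from a depot, activate) can be sketched as a single function applied at one site. This is a minimal illustration under assumed data structures: the `site` dictionary with its `active` and `cached` sets, the `depot` set, and the function name `reconfigure` are all hypothetical, and event delivery and model updates are reduced to a returned log.

```python
def reconfigure(site, target, depot):
    """Move a site to the target set of active components.

    site:   dict with "active" and "cached" sets of component names
    target: set of components that should be active afterwards
    depot:  set of components available from a depot
    Returns the ordered list of activities performed.
    """
    log = []

    # 1. Deactivate components that the target configuration excludes.
    for comp in sorted(site["active"] - target):
        site["active"].remove(comp)
        log.append(("deactivate", comp))

    # 2. Request components not yet cached at the site from the depot
    #    and configure them for this site.
    for comp in sorted(target - site["cached"]):
        if comp not in depot:
            raise KeyError(f"component not available in depot: {comp}")
        site["cached"].add(comp)
        log.append(("fetch_and_configure", comp))

    # 3. Activate configured components cached at the site.
    for comp in sorted(target - site["active"]):
        site["active"].add(comp)
        log.append(("activate", comp))

    return log


site = {"active": {"server", "cache"}, "cached": {"server", "cache"}}
log = reconfigure(site, target={"server", "logger"}, depot={"logger"})
```

In this example the compromised `cache` component is deactivated, `logger` is fetched from the depot and configured, and `logger` is then activated, leaving `server` untouched throughout.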

Documents

People

PI: Alexander L. Wolf - University of Colorado Boulder.

Co-PIs: Dennis Heimbigner - University of Colorado Boulder; John C. Knight - University of Virginia; Premkumar T. Devanbu, Michael Gertz, and Karl N. Levitt - University of California at Davis.

Acknowledgements: This project is sponsored by the Air Force Materiel Command, Rome Laboratory, SPAWAR, and DARPA under Contract Numbers F30602-00-2-0608 and N66001-00-8945. The content of the information does not necessarily reflect the position or the policy of the Government and no official endorsement should be inferred.

Copyright © 2000 Software Engineering Research Laboratory - University of Colorado