The robustness of resource allocation in parallel and distributed computing systems

Date

2004

Authors

Siegel, Howard Jay, author
Ali, Shoukat, author
Maciejewski, Anthony A., author
IEEE Computer Society, publisher

Abstract

This paper gives an overview of the material to be discussed in the invited keynote presentation by H. J. Siegel; it summarizes our research in [1]. Performing computing and communication tasks on parallel and distributed systems involves the coordinated use of different types of machines, networks, interfaces, and other resources. Decisions about how best to allocate resources are often based on estimated values of task and system parameters, due to uncertainties in the system environment. An important research problem is the development of resource management strategies that can guarantee a particular system performance given such uncertainties. We have designed a methodology for deriving the degree of robustness of a resource allocation, defined as the maximum amount of collective uncertainty in system parameters within which a user-specified level of system performance (quality of service, QoS) can be guaranteed. Our four-step procedure for deriving a robustness metric for an arbitrary system will be presented. We will illustrate this procedure and its usefulness by deriving robustness metrics for some example distributed systems.
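
As an illustration of the definition of robustness given in the abstract, the following minimal sketch computes a robustness value for one simple, assumed setting. It is not the four-step procedure itself, which the abstract does not spell out. The assumptions are: independent tasks are assigned to machines, each machine's finishing time is the sum of the estimated execution times of its tasks, the QoS requirement is that no machine finishes later than a tolerance factor times the estimated makespan, and collective uncertainty is measured as the Euclidean norm of the errors in the estimates. The names `robustness_radius`, `robustness_metric`, `allocation`, `est_times`, and `tolerance` are hypothetical.

```python
import math


def robustness_radius(finish_time, n_tasks, limit):
    """Smallest Euclidean-norm change in one machine's task-time estimates
    that raises its finishing time to `limit`.

    The finishing time is a sum of n_tasks estimates, so the boundary
    sum(times) == limit is a hyperplane with normal (1, ..., 1), and the
    distance from the estimated point to it is (limit - finish) / sqrt(n).
    """
    return (limit - finish_time) / math.sqrt(n_tasks)


def robustness_metric(allocation, est_times, tolerance):
    """Assumed setting: allocation maps machine -> list of task ids,
    est_times maps task id -> estimated execution time, and tolerance >= 1
    scales the estimated makespan to give the allowed finishing time."""
    finish = {m: sum(est_times[t] for t in tasks)
              for m, tasks in allocation.items()}
    limit = tolerance * max(finish.values())
    # The allocation is only as robust as its most fragile machine.
    return min(robustness_radius(finish[m], len(tasks), limit)
               for m, tasks in allocation.items() if tasks)


# Hypothetical example: two machines, three tasks, 20% makespan tolerance.
allocation = {"m1": ["a", "b"], "m2": ["c"]}
est_times = {"a": 3.0, "b": 5.0, "c": 6.0}
print(robustness_metric(allocation, est_times, tolerance=1.2))
```

In this example machine `m1` is the limiting one: its estimates can be off by a combined Euclidean norm of about 1.13 time units before the assumed makespan bound is violated, whereas `m2` could absorb errors of up to 3.6.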
