On the role of topology in autonomously coping with failures in content dissemination systems
Date
2014
Authors
Stern, Ryan, author
Pallickara, Shrideep, advisor
Strout, Michelle, committee member
Turk, Daniel, committee member
Abstract
Content dissemination systems provide a substrate that allows large numbers of entities to communicate with each other. These entities can be processes, sensors, or networked instruments that produce and consume data streams. To scale, the content dissemination substrate comprises a large number of distributed nodes. As the number of participating nodes increases, so does the likelihood of failures. Failures can occur for many reasons, including faulty hardware, programmer or user error, power failure, and network outages. Node failures can result in partitions, with the original set of connected nodes disintegrating into smaller, disjoint subsets. Brewer's CAP theorem limits the choices for a partitioned system: it can preserve availability or consistency, but not both. It is therefore desirable to make partitions less likely. This thesis explores how the nodes comprising a content dissemination system can be organized into topologies with the objective of improving partition tolerance. The topologies we consider are based on random, regular, power-law, and Watts-Strogatz small-world graphs. Connections within these topologies can account for network proximity and are suitable for real-time communications. We explore specific attributes of a topology that contribute to its partition resiliency, such as clustering coefficients, distribution of random links, and preferential attachment. The metrics we use to profile the suitability of different topologies include communication path lengths, migration of workloads, and the impact on system throughput. This research will allow designers to choose topologies or configure metrics to achieve performance objectives and a desired degree of partition tolerance.
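To make the notion of partitioning under node failure concrete, the following is a minimal sketch and not the thesis's implementation. It builds a Watts-Strogatz style graph using the textbook construction (a ring lattice whose edges are rewired with probability p), removes a set of failed nodes, and counts how many disjoint groups of surviving nodes remain. The function names `watts_strogatz` and `count_partitions`, the parameter values, and the choice of random node failures are all assumptions made for illustration.

```python
import random
from collections import deque


def watts_strogatz(n, k, p, rng):
    """Hypothetical helper: ring lattice of n nodes, each linked to its
    k nearest neighbours (k even), with each lattice edge rewired with
    probability p. Returns an adjacency map {node: set(neighbours)}."""
    adj = {v: set() for v in range(n)}
    # Step 1: regular ring lattice.
    for v in range(n):
        for step in range(1, k // 2 + 1):
            w = (v + step) % n
            adj[v].add(w)
            adj[w].add(v)
    # Step 2: rewire each lattice edge (v, w) with probability p,
    # avoiding self-loops and duplicate edges.
    for v in range(n):
        for step in range(1, k // 2 + 1):
            w = (v + step) % n
            if rng.random() < p and w in adj[v]:
                candidates = [u for u in range(n)
                              if u != v and u not in adj[v]]
                if candidates:
                    new = rng.choice(candidates)
                    adj[v].discard(w)
                    adj[w].discard(v)
                    adj[v].add(new)
                    adj[new].add(v)
    return adj


def count_partitions(adj, failed):
    """Number of connected components among the surviving nodes,
    found with a breadth-first search from each unvisited survivor."""
    alive = set(adj) - set(failed)
    seen = set()
    components = 0
    for start in alive:
        if start in seen:
            continue
        components += 1
        seen.add(start)
        queue = deque([start])
        while queue:
            v = queue.popleft()
            for w in adj[v]:
                if w in alive and w not in seen:
                    seen.add(w)
                    queue.append(w)
    return components


if __name__ == "__main__":
    rng = random.Random(42)  # fixed seed, chosen arbitrarily
    for p in (0.0, 0.1, 1.0):  # regular, small-world, near-random
        graph = watts_strogatz(200, 4, p, rng)
        failed = rng.sample(range(200), 40)  # 20% random node failures
        print(p, count_partitions(graph, failed))
```

A count of 1 means the surviving nodes still form a single connected system; larger counts indicate a partition. Varying p moves the topology from regular (p = 0) toward random (p = 1), which is one simple way to compare the graph families named in the abstract.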
Subject
autonomous failure resilience
topologies
small-world systems
power-laws
content dissemination systems