
Determining disease outbreak influence from voluminous epidemiology data on enhanced distributed graph-parallel system

Date

2017

Authors

Shah, Naman, author
Pallickara, Sangmi Lee, advisor
Pallickara, Shrideep, committee member
Turk, Daniel E., committee member

Abstract

Large-scale epidemiological outbreaks in livestock populations have historically been catastrophic. Preparing for these disasters is critical, and preparation primarily involves the efficient use of limited resources. Determining the relative influence of the entities involved in a large-scale outbreak is therefore essential. Planning for outbreaks often involves executing compute-intensive disease spread simulations. To capture the probabilities of various outcomes, these simulations are executed several times over a collection of representative input scenarios, producing voluminous data. The resulting datasets contain valuable insights, including the sequences of events that lead to extreme outbreaks, but discovering and leveraging this information is itself computationally expensive. This thesis proposes a distributed approach for aggregating and analyzing voluminous epidemiology data to measure the influence of each entity in a disease outbreak using the PageRank algorithm. By observing the relative influence of entities in the Disease Transmission Network (DTN) established in this research, planners and analysts can allocate limited resources, such as vaccinations and field personnel, more effectively. To improve the performance of the analysis execution pipeline, the thesis also proposes an extension to the Apache Spark GraphX distributed graph-parallel system.
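
As a rough illustration of ranking entities in a transmission network with PageRank, the sketch below runs a plain power iteration in Python on a single machine. It is not the thesis's GraphX-based pipeline. The farm names, the edge list, and the choice to reverse each transmission edge (so that rank flows from an infected unit back to the unit that infected it) are assumptions made for this example, as are the damping factor and iteration count.

```python
def pagerank(edges, damping=0.85, iterations=50):
    """Power-iteration PageRank over a list of (src, dst) edges."""
    nodes = sorted({node for edge in edges for node in edge})
    out_links = {node: [] for node in nodes}
    for src, dst in edges:
        out_links[src].append(dst)

    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Rank held by nodes without outgoing edges is spread evenly.
        dangling = sum(rank[node] for node in nodes if not out_links[node])
        base = (1.0 - damping) / n + damping * dangling / n
        new_rank = {node: base for node in nodes}
        for src in nodes:
            for dst in out_links[src]:
                new_rank[dst] += damping * rank[src] / len(out_links[src])
        rank = new_rank
    return rank


# Hypothetical transmissions, written as (source, infected):
# farm_a infected farm_b and farm_c; farm_b infected farm_d.
transmissions = [
    ("farm_a", "farm_b"),
    ("farm_a", "farm_c"),
    ("farm_b", "farm_d"),
]

# Assumption: reverse each edge so rank accumulates on the spreader.
reversed_edges = [(infected, source) for source, infected in transmissions]

ranks = pagerank(reversed_edges)
most_influential = max(ranks, key=ranks.get)
```

In this toy network, farm_a receives rank from every unit it infected directly or indirectly, so it ends up with the highest score. A planner could read such a ranking as one signal for where to direct vaccinations or field personnel first.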

Subject

distributed analytics
epidemiological PageRank
NAADSM influential analysis
enhanced distributed graph-parallel system
disease propagation network
extended Apache Spark GraphX
