In The Name of GOD
Improving Survivability of HA-Clusters
The above subject is my M.Sc. thesis in Computer Engineering at IUST, carried out under the supervision of Dr. Sharifi. It was started in October 2006 and defended in October 2008.

Abstract: Nowadays there is a growing demand for systems that can continue working and fulfill their missions in the presence of typical faults. The ability to provide service under adverse conditions is broadly defined as survivability. Typical faults considered by vendors and service providers include hardware failures, software bugs, malicious attacks, human operation and maintenance errors, and natural disasters, all of which threaten computing systems. These threats are inevitable in large-scale, network-based information systems that provide services to industry, and the ever-increasing impact of service unavailability on public safety and commercial trustworthiness cannot be ignored, so traditional clusters with their expensive, redundancy-based solutions have so far been the only way to improve system availability. In this thesis, using concepts from distributed systems, high-availability clusters, and virtualization, I present JOOYAN, an HA-clustering architecture that, compared with other HA clusters, offers improvements such as a lower level of hardware redundancy, disaster recovery, WAN-based clustering, the use of commodity computers, and a degree of survivability in the presence of catastrophic damage and disasters. Experiments on a JOOYAN cluster with 3 computation nodes, in which failure scenarios were injected on coordinators, the network, hardware, and virtual machines, showed that JOOYAN can detect a real failure within a short time interval (on the order of milliseconds) and then, by booting the VM on the leader node (in case of disaster), provide disaster recovery and high availability at the same time, which improves the survivability of HA clustering.
Sadegh Houshmand Jooyani
Iran University of Science and Technology
Department of Computer Science & Engineering
System Software Group
Distributed Systems Research Laboratory