Recent Projects

  • NDSEQ (NonDeterministic SEQuential Specifications) is a novel lightweight method for specifying the correctness of a program's parallelism. An NDSEQ specification expresses the correctness of a program's parallelism using a sequential version of the program with controlled nondeterminism. Such a nondeterministic sequential specification allows (1) the correctness of parallel interference to be verified independently of the program's functional correctness, and (2) the functional correctness of the program to be understood and verified on a sequential version of the program, one with controlled nondeterminism but no interleaving of parallel threads. For details, please see our PLDI 2010 paper.
  • KUDA is a framework for performing runtime monitoring using GPUs as accelerators. The idea is to offload the burden of a runtime analysis, such as data-race checking, from the application threads and to use GPU threads to perform the actual checking of the execution in a highly parallel manner. See our technical report titled "Rethinking Runtime Verification on Hundreds of Cores: Challenges and Opportunities" for more information about our vision and approach. The KUDA tool is open source and available for download at
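The NDSEQ idea above can be illustrated with a toy example. The sketch below is hypothetical and not from the PLDI 2010 paper: a parallel summation and its nondeterministic sequential specification, where the only nondeterminism left is the order in which items are folded in, with no thread interleavings to reason about.

```python
import random
import threading

def parallel_sum(items):
    """Parallel version: two worker threads fold halves of the
    list into a shared total under a lock."""
    total = 0
    lock = threading.Lock()

    def worker(chunk):
        nonlocal total
        for x in chunk:
            with lock:
                total += x

    mid = len(items) // 2
    threads = [threading.Thread(target=worker, args=(c,))
               for c in (items[:mid], items[mid:])]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return total

def ndseq_sum(items):
    """Nondeterministic sequential specification: the same loop body,
    but items are processed one at a time in an arbitrary
    (nondeterministically chosen) order -- controlled nondeterminism,
    no parallel interleaving."""
    order = list(items)
    random.shuffle(order)  # the controlled nondeterminism
    total = 0
    for x in order:
        total += x
    return total
```

Functional correctness (the sum is always the same) can then be argued on `ndseq_sum` alone, while the parallelism argument reduces to showing `parallel_sum` only produces behaviors `ndseq_sum` allows.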

During my graduate studies, I led the technical and practical development of three main projects: Vyrd, Goldilocks, and QED. Among them, Vyrd and Goldilocks provide runtime techniques that monitor the program's execution and collect data in order to identify bugs in the program. QED, on the other hand, provides a static/compile-time method that analyzes the program's source code to show that the program is bug-free and always behaves as intended. For each project, papers and slides describing the work can be found on its own web page or by searching for the project name among my publications and talks.


The QED project is the main subject of my Ph.D. dissertation. QED consists of a proof system and a supporting tool for the static verification of concurrent programs. The QED tool runs on top of the Boogie system and uses the Z3 theorem prover. The tool is open source and available for download at

For more information go to the QED project web page.


Our research on race detection has been driven by our new lockset-based algorithm, called Goldilocks, for precisely computing the happens-before relation. This algorithm allowed us to detect data races at runtime precisely and efficiently. Using the Goldilocks algorithm, we developed a race-aware Java runtime that throws a runtime exception called DataRaceException when a race is about to occur.
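The flavor of the lockset-update rules can be sketched as follows. This is a drastically simplified, hypothetical sketch in the spirit of Goldilocks, not the full algorithm (which also handles volatiles, thread creation/join, and lazy transitive evaluation): each variable carries a set of threads and locks that are "synchronized with" its last access, and an access by a thread outside that set signals a race.

```python
class GoldilocksSketch:
    """Toy per-variable lockset bookkeeping in the spirit of the
    Goldilocks algorithm (simplified; not the published rules in full)."""

    def __init__(self):
        self.ls = {}  # variable -> set of threads and locks

    def access(self, thread, var):
        """Returns True if this access races with the previous one
        (in the real runtime this would raise DataRaceException)."""
        owners = self.ls.get(var)
        race = owners is not None and thread not in owners
        self.ls[var] = {thread}  # a new access resets the lockset
        return race

    def acquire(self, thread, lock):
        # acquiring a lock transfers its happens-before edge to the thread
        for owners in self.ls.values():
            if lock in owners:
                owners.add(thread)

    def release(self, thread, lock):
        # releasing a lock records it as a synchronization intermediary
        for owners in self.ls.values():
            if thread in owners:
                owners.add(lock)
```

For example, if T1 writes x and releases lock m, then T2 acquires m and reads x, T2's access is race-free; an access by an unsynchronized T3 is flagged.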

In January 2009, the Goldilocks project appeared in the SIGPLAN CACM Research Highlights Nominations.

For more information go to the Goldilocks project web page.


The VYRD project was the main subject of my M.S. thesis. In this project, our goal was to develop runtime verification tools for global correctness criteria for concurrent software, such as atomicity and refinement. Our research group developed a runtime verification tool prototype, Vyrd, for complex, highly concurrent software components. We used Vyrd to verify several industrial-scale software designs, including storage infrastructure software developed at Microsoft Research, a file system, and Java class libraries. Vyrd was able to find subtle errors in these systems that their developers were not aware of. We extended our prototype into a fully automated, easy-to-use verification framework that also uses a model checker to verify the entire state space of the program.

For more information go to the VYRD project web page.

Earlier Projects

Design and Implementation of A Distributed File Sharing System with Content-Addressable Network

In this work, we discuss design and implementation issues of a file sharing system over a content-addressable overlay network. CAN is a scalable, efficient, and robust distributed hash table that lets distributed applications store and retrieve information in a simple and efficient manner. We discuss several improvements to CAN-based systems and their applicability to a file sharing system. In addition, a hierarchical text classification approach is proposed both as an alternative hash function and as a way to decompose dimensionality into lower-dimensional realities.
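The core CAN mechanism can be sketched in a few lines. This is a generic illustration, not our system: keys are hashed into a d-dimensional Cartesian coordinate space, and a request is routed greedily toward the node whose zone contains the key's point (zones are abstracted to node center points here for brevity).

```python
import hashlib

def key_to_point(key, dim=2):
    """Hash a key deterministically to a point in the unit hypercube,
    the CAN coordinate space. A simple stand-in hash function."""
    h = hashlib.sha256(key.encode()).digest()
    return tuple(int.from_bytes(h[4 * i:4 * i + 4], 'big') / 2**32
                 for i in range(dim))

def greedy_route(nodes, point):
    """CAN-style greedy routing, abstracted: each node is represented
    by its zone's center, and the request ends at the node closest to
    the key's point."""
    def dist2(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(nodes, key=lambda n: dist2(n, point))
```

A hierarchical text classifier, as proposed above, would replace `key_to_point` so that semantically related documents map to nearby coordinates.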

MDC Video Retrieval and Path Diversity Mechanisms on Scalable Content-Addressable Networks

Multiple state video coding techniques, which code a video into multiple independently decodable streams, have been widely studied as a way to improve state recovery for a corrupted stream at the receiver side. We used scalable CANs to investigate how a virtual Cartesian-space abstraction of the underlying topology can facilitate retrieval and transmission mechanisms for multi-description video. This work proposes solutions to issues such as searching, transmitting video packets through diverse paths, and synchronizing the different descriptions of an MDC-coded video over a P2P overlay network. Our system allows communicating with more than one sender for each description and activating/deactivating selected senders according to path bandwidth estimation. We synchronized streams from different senders, merging them into one stream on time by buffering at the receiver side. We evaluated our algorithms for two descriptions in the ns-2 simulator, using a physical mesh topology with dynamically changing conditions and peers capable of participating in our mechanisms.
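The receiver-side merging step can be sketched as follows. This is a hypothetical simplification of the buffering described above: frames of two descriptions arrive tagged with sequence numbers, possibly out of order and from different senders, and a reorder buffer merges them into a single playable stream.

```python
import heapq

def merge_descriptions(desc_a, desc_b):
    """Merge two MDC descriptions into one stream. Each description is
    a list of (sequence_number, frame) pairs; the buffer (a min-heap)
    reorders frames by sequence number before playout."""
    buffer = []
    for seq, frame in desc_a + desc_b:
        heapq.heappush(buffer, (seq, frame))
    return [heapq.heappop(buffer)[1] for _ in range(len(buffer))]
```

In the real system the buffer is bounded and playout proceeds on a deadline, so a late frame from a slow sender is skipped rather than waited for; the sketch omits that timing logic.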

Weighted Classification of Unlabeled Documents with EM Algorithm

In this work, the use of the EM text classification algorithm in an information retrieval (IR) system for classifying unlabeled documents in a topic hierarchy is introduced. IR systems can leverage text classification to capture similarity information and to search their data efficiently, taking into account the hierarchical relationships between documents. Two uses of the classification results in CAN-based IR systems are proposed: as an additional hash function and as a dimension-reduction approach using realities. The performance of the algorithm in classifying large amounts of unlabeled documents is demonstrated, and the benefits and drawbacks of classification in an IR system are discussed.
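The EM loop for unlabeled documents can be sketched with a flat (non-hierarchical) naive Bayes mixture; this toy version is only illustrative of the E-step/M-step structure, not the algorithm used in the work. The E-step soft-assigns each document to classes under the current parameters, and the M-step re-estimates class priors and word probabilities from those soft assignments.

```python
import random
from collections import defaultdict

def em_naive_bayes(docs, n_classes=2, iters=25, seed=0):
    """Toy unsupervised EM for text: docs are lists of words.
    Returns a hard class label per document after `iters` rounds."""
    rng = random.Random(seed)
    vocab = sorted({w for d in docs for w in d})
    # start from random soft class memberships (responsibilities)
    resp = []
    for _ in docs:
        row = [rng.random() + 1e-3 for _ in range(n_classes)]
        s = sum(row)
        resp.append([r / s for r in row])
    for _ in range(iters):
        # M-step: re-estimate priors and smoothed word probabilities
        prior = [sum(r[c] for r in resp) / len(docs) for c in range(n_classes)]
        word_p = []
        for c in range(n_classes):
            counts, total = defaultdict(float), 0.0
            for d, r in zip(docs, resp):
                for w in d:
                    counts[w] += r[c]
                    total += r[c]
            word_p.append({w: (counts[w] + 1) / (total + len(vocab))
                           for w in vocab})
        # E-step: recompute responsibilities from current parameters
        new_resp = []
        for d in docs:
            scores = [prior[c] * _prod(word_p[c][w] for w in d)
                      for c in range(n_classes)]
            z = sum(scores)
            new_resp.append([s / z for s in scores])
        resp = new_resp
    return [max(range(n_classes), key=lambda c: r[c]) for r in resp]

def _prod(xs):
    p = 1.0
    for x in xs:
        p *= x
    return p
```

The hierarchical variant additionally constrains each document's responsibilities along a path in the topic tree; long documents would also need log-space arithmetic to avoid underflow.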

Parallel Implementation of Hierarchical EM Algorithm for Unsupervised Text Classification

With the increasing rate of information storage on the Internet, the number of documents handled by information retrieval (IR) systems is growing exponentially. Indexing a huge number of documents and searching among them requires strong machine learning techniques to compute document categorization information in an automated manner. Hierarchical EM is a machine learning-based text categorization algorithm that has been shown to classify documents efficiently and accurately within a topic hierarchy. In this work, two parallelization techniques for the hierarchical EM algorithm, called horizontal and vertical decomposition, were proposed and evaluated on a number of unlabeled data sets. In both cases load balance was preserved, and the speedup and efficiency characteristics of the algorithms for large data sets were evaluated. The work gives intuition on how a tree-based algorithm can be decomposed into tasks for parallel implementation.
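The horizontal decomposition can be sketched generically. In this hypothetical illustration (word counts stand in for the E-step's expected sufficient statistics), the document set is partitioned across workers, each worker computes partial statistics over its chunk, and the partials are summed; vertical decomposition would instead assign different nodes of the topic tree to different workers.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor

def horizontal_estep(docs, n_workers=2):
    """Partition documents across workers, compute partial statistics
    per chunk in parallel, and reduce them into one total. The round-robin
    split keeps chunk sizes balanced, preserving load balance."""
    chunks = [docs[i::n_workers] for i in range(n_workers)]

    def partial(chunk):
        c = Counter()
        for d in chunk:
            c.update(d)
        return c

    totals = Counter()
    with ThreadPoolExecutor(max_workers=n_workers) as ex:
        for c in ex.map(partial, chunks):
            totals.update(c)
    return totals
```

Because EM's M-step only needs these summed statistics, the reduction step is the only communication point per iteration, which is what makes both decompositions scale.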

Modeling Workload Breakdown Properties of The MPEG-2 Decoder Application with Hidden Markov Processes

Dynamic power management (DPM) is a design methodology for dynamically configuring systems to meet service constraints with a minimum load on components, and thus minimum power consumption. Stochastic techniques, in the context of DPM, treat dynamic power management as an optimization problem under uncertainty, using random processes to model the probabilistic behavior of the system. In this work, we derived Hidden Markov Models (HMMs) for the workload characterization of specific tasks in the MPEG-2 decoder application. Our work shows how a state-based stochastic model can exploit the probabilistic execution times of the tasks in an application, which affect the load on the processor and thus the power consumption over a period. We analyzed the resulting HMMs by evaluating the state sequences predicted by the Viterbi algorithm, given real executions of the decoder on different MPEG-2 video files, and assessing how close the workload behaviors of the predicted states were to the observed real values.
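The Viterbi step mentioned above is the standard dynamic-programming recovery of the most likely hidden state sequence; the implementation below is the textbook algorithm, and the example parameters in the usage note are illustrative, not the MPEG-2 workload models from the work.

```python
def viterbi(obs, states, start_p, trans_p, emit_p):
    """Given an observation sequence and HMM parameters (initial,
    transition, and emission probabilities), return the most likely
    hidden state sequence."""
    # V[t][s] = probability of the best path ending in state s at time t
    V = [{s: start_p[s] * emit_p[s][obs[0]] for s in states}]
    back = [{}]
    for t in range(1, len(obs)):
        V.append({})
        back.append({})
        for s in states:
            prob, prev = max(
                (V[t - 1][p] * trans_p[p][s] * emit_p[s][obs[t]], p)
                for p in states
            )
            V[t][s] = prob
            back[t][s] = prev
    # backtrack from the best final state
    last = max(states, key=lambda s: V[-1][s])
    path = [last]
    for t in range(len(obs) - 1, 0, -1):
        path.append(back[t][path[-1]])
    return list(reversed(path))
```

For instance, with hypothetical "low"/"high" workload states emitting "short"/"long" decode times, a run of short frames followed by a long one is decoded to a late switch into the high-load state.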