Research

PhD Research statement

Behind every interconnected system, be it physical, social, biological, or man-made, there is a graph encoding the interactions among its components. Learning from graph data has the potential to unleash one's ability to reason about the behavior of such systems, to understand their innate structure, and ultimately to predict their dynamic evolution. In the midst of the "data deluge," realizing this potential has never been closer, even though formidable challenges remain. Contemporary graphs are massive in scale, with up to billions of nodes, and unceasingly generate "big data" (60B messages, 3B likes, and 350M pictures daily on Facebook). Interconnections change over time, which gives rise to dynamic graphs. This torrent of data necessitates on-the-fly (online) and scalable analytics. Graph edges or node attributes may be only partially available due to application-specific constraints, which calls for learning approaches that impute the missing information. Nodal features may obey nonlinear relations, which require judicious and expressive modeling. Nodes are often associated with large amounts of meta-information, which requires methods tailored to multi-way (tensor) data. Nodes may also be connected via multiple types or layers of relations, such as the multiple social ties among individuals in family, friend, or coworker circles, which gives rise to multi-layer graphs. Last but not least, approaches to learning over graph data must also be robust to adversarial behavior. These challenges have so far been confronted only partly and separately, under different formulations and application domains.

Intellectual merits. My research is centered on analytical and algorithmic foundations that aspire to address the aforementioned challenges facing robust and nonlinear learning tasks over large-scale dynamic graphs. The overarching vision is to leverage and adapt state-of-the-art learning, optimization, and networking tools for inference tasks based on limited dynamic graph data. Target applications include identifying anomalies and communities, as well as providing graph-driven recommendations. The ultimate goal is to demonstrate, both analytically and numerically, how valuable insights from mining graph data can lead to markedly improved learning tools. To this end, the research thrusts are as follows.

Broader impact. Data analytics and machine learning have already permeated a major segment of modern society. The toolbox developed under this research will advance the state of the art in statistical learning, network science, graph mining, and big-data analytics. It will thus impact, and effect technology transfer to, a broad range of emerging fields, from computational biology and neuroscience to socio-economic networks. As for its impact on engineering, this research will offer a flexible suite of robust algorithms for scalable learning from dynamic and multi-way graph data.