My research focuses on developing efficient and accurate algorithms for data analysis in biomedical research, a field with a growing need for innovation. I seek to understand both the foundations of data analysis and the idiosyncrasies of biomedical data in order to develop scientifically interpretable algorithms. I aim to make theoretical contributions to data analysis and practical contributions to the biomedical field.

Biomedical research is a multidisciplinary field whose primary focus is improving human health. Future breakthroughs in the biomedical domain are expected to come at the intersection of the relevant basic sciences, including cell biology, biochemistry, microbiology, and immunology, and emerging computational technologies such as genomics, spatial proteomics, and transcriptomics. Recent advances in data-acquisition hardware and computing facilities have fueled rapid growth in data-driven approaches to biomedical research. In its NIH-Wide Strategic Plan, the National Institutes of Health (NIH) recognizes the present as a unique moment of opportunity in biomedical research and identifies data analysis as an integral contributor.

Biomedical data often have rich structure (e.g., spatiotemporal structure). There is an opportunity to exploit this structure to develop computationally efficient and accurate algorithms for biomedical applications. These applications are commonly formulated as optimization problems, which typically consist of a data-fidelity term, measuring how well a candidate solution explains the noisy data under some system of equations, and a regularization term, introduced to overcome ill-posedness. There are three prevailing approaches to overcoming ill-posedness. The first approach is variational: additional constraints are incorporated into the variational formulation so that certain solutions are preferred. The second approach is non-variational: additional constraints are introduced in the form of differential equations (e.g., the diffusion equation). The third approach is data-driven: it requires a large volume of population data, from which an intrinsic representation of the data is learned via linear (e.g., PCA) or non-linear (e.g., deep neural networks) methods and then used for reparameterization. Each approach has its own strengths and weaknesses in terms of correctness, computational efficiency, and clinical viability. In my research, I am interested in advancing these computational approaches and developing novel, clinically viable methods. Next, I will discuss my past research and my future research agenda.
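
To make the variational formulation concrete, the following is a minimal sketch and not a method taken from my projects. It assumes the simplest possible setting: a 1D signal, an identity forward model, and a quadratic smoothness regularizer, so the objective is 0.5 * ||x - y||^2 + 0.5 * lam * ||Dx||^2, where D takes differences between neighboring samples. The function names (`objective`, `gradient`, `denoise`) and the weight `lam` are hypothetical and chosen only for illustration.

```python
def objective(x, y, lam):
    """Data-fidelity term plus quadratic smoothness regularizer."""
    fidelity = 0.5 * sum((xi - yi) ** 2 for xi, yi in zip(x, y))
    smoothness = 0.5 * lam * sum(
        (x[i + 1] - x[i]) ** 2 for i in range(len(x) - 1)
    )
    return fidelity + smoothness


def gradient(x, y, lam):
    """Gradient of the objective with respect to x."""
    g = [xi - yi for xi, yi in zip(x, y)]  # gradient of the fidelity term
    for i in range(len(x) - 1):
        d = x[i + 1] - x[i]
        g[i] -= lam * d       # derivative of the i-th difference w.r.t. x[i]
        g[i + 1] += lam * d   # derivative of the i-th difference w.r.t. x[i+1]
    return g


def denoise(y, lam=2.0, iters=500):
    """Plain gradient descent on the regularized objective (illustrative)."""
    # The gradient's Lipschitz constant is at most 1 + 4 * lam, so this
    # fixed step size gives a monotone decrease of the objective.
    step = 1.0 / (1.0 + 4.0 * lam)
    x = list(y)  # start from the noisy data
    for _ in range(iters):
        g = gradient(x, y, lam)
        x = [xi - step * gi for xi, gi in zip(x, g)]
    return x


if __name__ == "__main__":
    noisy = [0.0, 1.0, 0.0, 1.0, 0.0, 1.0]
    smooth = denoise(noisy)
    print([round(v, 3) for v in smooth])
```

The weight `lam` controls the trade-off: larger values favor smoother solutions at the expense of fidelity to the observed data. In a realistic problem the identity forward model would be replaced by the system of equations that describes the measurement process.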

An overview of my current and past research projects is here.