Research

My research focuses mainly on the development of Bayesian statistical methods and computational tools motivated by problems in biomedical data analysis. The methods I have developed have been applied to the characterization of tumor heterogeneity, clinical trial design, and inference with missing data. Beyond this primary research area, I have also worked on a range of other topics, including posterior contraction, spatial statistics, and model combination.

The following figure summarizes some of my work.

1. Tumor Heterogeneity

During tumor growth, tumor cells acquire and accumulate somatic mutations that lead to genetically different cell subpopulations. This phenomenon is known as intra-tumor heterogeneity. Each cell subpopulation, referred to as a subclone, consists of cells that share the same genetic architecture, including point mutations and copy number aberrations. I have developed a series of methods for inference on intra-tumor heterogeneity, which can shed light on tumor progression and help suggest personalized treatment strategies.

(Working) Papers:

  • Sengupta, S., Zhou, T., Müller, P. and Ji, Y. (2016), A Bayesian Nonparametric Model for Reconstructing Tumor Subclones Based on Mutation Pairs. Proceedings of The Pacific Symposium on Biocomputing (PSB), 21, 393-404. [Link]
  • Zhou, T., Müller, P., Sengupta, S. and Ji, Y. (2019), PairClone: A Bayesian Subclone Caller Based on Mutation Pairs. Journal of the Royal Statistical Society: Series C (Applied Statistics), 68(3), 705-725. [Link]
  • Zhou, T., Sengupta, S., Müller, P. and Ji, Y. (2019), TreeClone: Reconstruction of Tumor Subclone Phylogeny Based on Mutation Pairs using Next Generation Sequencing Data. Annals of Applied Statistics, 13(2), 874-899. [Link]
  • Zhou, T., Sengupta, S., Müller, P. and Ji, Y. (2020+), RNDClone: Tumor Subclone Reconstruction Based on Integrating DNA and RNA Sequence Data. Annals of Applied Statistics, forthcoming.
  • With Sengupta, S., Bi, D. and Ji, Y. (2020+), MutStat: An Ultra-fast Computational Method to Determine Clonal Status of Somatic Mutations.

Our work was featured in the University of Chicago Beagle newsletter (December 2017).


2. Clinical Trial Design

Clinical trials play a key role in drug development. Innovative trial designs can improve the efficiency of clinical trials through, for example, shorter duration, fewer participants, and greater power to detect a treatment effect when one exists. I have developed a series of innovative designs for various types and phases of clinical trials, some of which are under consideration by pharmaceutical companies for implementation in real-world trials.

(Working) Papers:

  • Zhou, T. and Ji, Y. (2019), Discussion of “A Hybrid Phase I-II/III Clinical Trial Design Allowing Dose Re-Optimization in Phase III” by A. G. Chapple and P. F. Thall. Biometrics, 75(2), 385-388. [Link]
  • Zhou, T., Guo, W. and Ji, Y. (2019+), PoD-TPI: Probability-of-Decision Toxicity Probability Interval Design to Accelerate Phase I Trials. Statistics in Biosciences, forthcoming. [Link]
  • Zhou, T. and Ji, Y. (2020+), A Robust Master Protocol Trial Design Based on Bayesian Hypothesis Testing. Biostatistics, forthcoming. [Link]
  • Zhou, T. and Ji, Y. (2020+), A Unified Framework for Time-to-Event Dose-Finding Designs. Submitted to Statistical Science.
  • Zhou, T. and Ji, Y. (2020+), Clinical Trial Design with Multi-center Data and Historical Control Data — A Nonparametric Bayesian Approach.
  • Zhou, T. and Ji, Y. (2020+), A Rule-based Design for Drug Combination Trials.
  • With Lyu, J., Guo, W. and Ji, Y. (2020+), MUCE: A Bayesian Hierarchical Model for the Design and Analysis of Phase 1b Multiple Expansion Cohort Trials.


3. Missing Data

In longitudinal clinical studies, the research objective is often to make inference on a subject's full data response, for example, to estimate the treatment effect of a test drug at the end of a study. However, the vector of responses for a research subject is often incomplete due to dropout, and the dropout is typically non-ignorable. To make inference on full data estimands in the presence of missing data, I developed a flexible semiparametric Bayesian approach based on a joint model for the full data response, missingness, and baseline covariates.

Paper:

  • Zhou, T., Daniels, M. J. and Müller, P. (2020), A Semiparametric Bayesian Approach to Dropout in Longitudinal Studies with Auxiliary Covariates. Journal of Computational and Graphical Statistics, 29(1), 1-12. [Link]


4. Model Combination

Paper:

  • Zhou, T. (2018), Discussion of “Using Stacking to Average Bayesian Predictive Distributions” by Y. Yao, A. Vehtari, D. Simpson and A. Gelman. Bayesian Analysis, 13(3), 976-977. [Link]


5. Posterior Contraction of Latent Feature Models

(Working) Paper:

  • Li, T., Zhou, T., Tsui, K.-W., Wei, L. and Ji, Y. (2019+), Posterior Contraction Rate of Sparse Latent Feature Models with Application to Proteomics. Submitted to Bayesian Analysis. [Link]


6. Spatial Statistics

(Working) Paper:

  • With Travina, A. and Müller, P. (2019+), Khipu, Provenance and Control: A Bayesian Approach to Understanding the Inca Cultural Imperialism. Media report by The Texas Scientist (or this link). Presented at 35th NECAAAE.


Other research interests

I am also interested in a broader range of topics, including variable selection, causal inference, dynamic models, machine learning, data mining, big data, and scalable algorithms.