Papers by Topic

INFERENCE FOR LARGE RANDOM MATRICES & SPECTRAL METHODS

(* alphabetic order)

[1] Ma, R., Cai, T. T., and Li, H. (2021) Optimal Permutation Recovery in Permuted Monotone Matrix Model.  Journal of the American Statistical Association, 116(535), 1358-1372  [arxiv] [paper] [R codes]

[2] Ma, R., and Barnett, I. (2021) The Asymptotic Distribution of Modularity in Weighted Signed Networks. Biometrika, 108(1): 1-16 [arxiv] [paper]

[3] Cai, T. T., Li, H., and Ma, R.* (2021) Optimal Structured Principal Subspace Estimation: Metric Entropy and Minimax Rates. Journal of Machine Learning Research, 22(46): 1-45 [arxiv] [paper]

[4] Ma, R., Cai, T. T., and Li, H. (2021) Optimal Estimation of Bacterial Growth Rates Based on Permuted Monotone Matrix. Biometrika, 108(3): 693-708 [arxiv] [paper] [software]

[5] Ma, R., and Li, H. (2022) Interaction Network in Microbiome Studies. In Piegorsch, W. W., Levine, R. A., Zhang, H. H., and Lee, T. C. M. (eds.). Computational Statistics in Data Science, Chapter 13 [link]

[6] Ma, R., Sun, E., and Zou, J. (2023) A Spectral Method for Assessing and Combining Multiple Data Visualizations. Nature Communications [paper] [software] [arxiv]

[7] Ding, X., and Ma, R.* (2023) Learning Low-Dimensional Nonlinear Structures from High-Dimensional Noisy Data: An Integral Operator Approach. Annals of Statistics, 51(4), 1744-1769 [arxiv] [paper] [R codes]

[8] Cai, T. T., and Ma, R.* (2024) Matrix Reordering for Noisy Disordered Matrices: Optimality and Computationally Efficient Algorithms. IEEE Transactions on Information Theory, 70(1), 509-531 [arxiv] [paper]

[9] Ma, R., Sun, E., Donoho, D., and Zou, J. (2024) Principled and Interpretable Alignability Testing and Integration of Single-Cell Data. Proceedings of the National Academy of Sciences (direct submission)  [arxiv] [software] [paper]

[10] Ding, X., and Ma, R.*, Kernel Spectral Joint Embeddings for High-Dimensional Noisy Datasets Using Duo-Landmark Integral Operators. Submitted [arxiv]

[11] Landa, B., Kluger, Y., and Ma, R., Entropic Optimal Transport Eigenmaps for Integration and Joint Embedding of Datasets. Submitted [arxiv]

MANIFOLD LEARNING & NONLINEAR EMBEDDING

(* alphabetic order)

[1] Cai, T. T., and Ma, R.* (2022) Theoretical Foundations of t-SNE for Visualizing High-Dimensional Clustered Data.  Journal of Machine Learning Research, 23(301): 1-54 [arxiv] [paper]

[2] Sun, E., Ma, R., and Zou, J. (2023) Dynamic Visualization of High-Dimensional Data. Nature Computational Science [paper] [bioRxiv] [software]

[3] Ma, R., Sun, E., and Zou, J. (2023) A Spectral Method for Assessing and Combining Multiple Data Visualizations. Nature Communications [paper] [software] [arxiv]

[4] Ding, X., and Ma, R.* (2023) Learning Low-Dimensional Nonlinear Structures from High-Dimensional Noisy Data: An Integral Operator Approach. Annals of Statistics, 51(4), 1744-1769 [arxiv] [paper] [R codes]

[5] Cai, T. T., and Ma, R.* (2024) Matrix Reordering for Noisy Disordered Matrices: Optimality and Computationally Efficient Algorithms. IEEE Transactions on Information Theory, 70(1), 509-531 [arxiv] [paper]

[6] Ma, R., Sun, E., Donoho, D., and Zou, J. (2024) Principled and Interpretable Alignability Testing and Integration of Single-Cell Data. Proceedings of the National Academy of Sciences (direct submission)  [arxiv] [software] [paper]

[7] Ding, X., and Ma, R.*, Kernel Spectral Joint Embeddings for High-Dimensional Noisy Datasets Using Duo-Landmark Integral Operators. Submitted [arxiv]

[8] Fischer, J., and Ma, R., Sailing in High-Dimensional Spaces: Low-Dimensional Embeddings through Angle Preservation. Submitted

[9] Landa, B., Kluger, Y., and Ma, R., Entropic Optimal Transport Eigenmaps for Integration and Joint Embedding of Datasets. Submitted [arxiv]

INFERENCE FOR HIGH-DIMENSIONAL REGRESSIONS

(* alphabetic order)

[1] Ma, R., Cai, T. T., and Li, H. (2021) Global and Simultaneous Hypothesis Testing for High-Dimensional Logistic Regression Models.  Journal of the American Statistical Association, 116(534), 984-998 [arxiv] [paper] [R codes] 

[2] Ma, R., Cai, T. T., and Li, H. (2022) Optimal Estimation of Simultaneous Signals Using Absolute Inner Product and Applications to Integrative Genomics.  Statistica Sinica, 32, 1027-1048 [arxiv] [paper] [R codes]

[3] Cai, T. T., Guo, Z., and Ma, R.* (2022) Statistical Inference for High-Dimensional Generalized Linear Models with Binary Outcomes. Journal of the American Statistical Association [paper] [R package]

[4] Ma, R., Guo, Z., Cai, T. T., and Li, H. (2024+) Statistical Inference of Genetic Relatedness based on High-Dimensional Logistic Regression. Statistica Sinica [arxiv] [R codes]

[5] Zhang, L., Ma, R., Cai, T. T., and Li, H. (2024+)  Estimation, Confidence Intervals, and Large-Scale Hypotheses Testing for High-Dimensional Mixed Linear Regression. Submitted

[6] Fei, X., Ma, R., and Li, H. (2024+) Statistical Inference for High-Dimensional Linear Regression with Blockwise Missing Data. Statistica Sinica, in press [paper]

BIOMEDICAL DATA SCIENCE & INTERPRETABLE MACHINE LEARNING

(* alphabetic order)

[1] Ma, R., Cai, T. T., and Li, H. (2021) Optimal Estimation of Bacterial Growth Rates Based on Permuted Monotone Matrix. Biometrika, 108(3): 693-708 [arxiv] [paper] [software]  

[2] Ma, R., Hansen, M., Ranciaro, A., Thompson, S., Beggs, W., Mpoloka, S. W., Mokone, G. G., Meskel, D. W., Belay, G., Nyambo, T., Michailidis, G., Li, H., Burant, C., and Tishkoff, S. (2021) Impact of Subsistence and Genetics on Lipid Profiles in Ethnically Diverse Africans. Diabetes, 70 (Supplement_1): 191-LB [link]

[3] Ma, R., and Li, H. (2022) Interaction Network in Microbiome Studies. In Piegorsch, W. W., Levine, R. A., Zhang, H. H., and Lee, T. C. M. (eds.). Computational Statistics in Data Science, Chapter 13 [link]

[4] Ma, R., Guo, Z., Cai, T. T., and Li, H. (2024+) Statistical Inference of Genetic Relatedness based on High-Dimensional Logistic Regression. Statistica Sinica [arxiv] [R codes]

[5] Sun, E., Ma, R., and Zou, J. (2023) Dynamic Visualization of High-Dimensional Data. Nature Computational Science [paper] [bioRxiv] [software]

[6] Ma, R., Sun, E., and Zou, J. (2023) A Spectral Method for Assessing and Combining Multiple Data Visualizations. Nature Communications [paper] [software] [arxiv]

[7] Kelly, D., Ramdas, S., Ma, R., Rawlings-Goss, R., Grant, G., Ranciaro, A., Hirbo, J., Beggs, W., Yeager, M., Chanock, S., Nyambo, T., Omar, S., Meskel, D., Belay, G., Li, H., Brown, C., and Tishkoff, S. (2023) The Genetic and Evolutionary Basis of Gene Expression Variation in East Africans. Genome Biology, 24(35) [bioRxiv] [paper]

[8] Einav, T., and Ma, R.* (2023) Using Interpretable Machine Learning to Extend Heterogeneous Antibody-Virus Datasets. Cell Reports Methods, 3(100540) [paper] [software] [bioRxiv]

[9] Sun, E., Ma, R., Negredo, P., Brunet, A., and Zou, J. (2024) TISSUE: Uncertainty-Calibrated Prediction of Single-Cell Spatial Transcriptomics Improves Downstream Analyses.  Nature Methods [bioRxiv] [software] [paper]

[10] Ma, R., Sun, E., Donoho, D., and Zou, J. (2024) Principled and Interpretable Alignability Testing and Integration of Single-Cell Data. Proceedings of the National Academy of Sciences (direct submission)  [arxiv] [software] [paper]

[11] Sun, E., Ma, R., and Zou, J. (2024) SPRITE: Improving Spatial Gene Expression Imputation with Gene and Cell Networks. Bioinformatics, accepted [bioRxiv]

[12] Li, S., Alexander, J., Kendall, J., Andrews, P., Rose, E., Orjuela, H., Park, S., Podszus, C., Shanley, L., Ma, R., Rishi, A., Donoho, D., Goldberg, G., Levy, D., Wigler, M., Genomic and Transcriptomic Analyses of the Same Single Nuclei. Submitted