Junhui Wang - Publications

I. Journal Papers

Zhang, Y., Sun, Y., He, X., Zhang, J. and Wang, J. (2025+). On optimal tracking of structural changes in time-varying networks. Journal of Computational and Graphical Statistics, in press.
Zhen, Y., Xu, S. and Wang, J. (2025+). Consistent community detection in multi-layer networks with heterogeneous differential privacy. Statistica Sinica, in press.
Zhou, W., Kang, X., Zhong, W. and Wang, J. (2025+). Efficient learning of DAG structures in heavy-tailed data. Statistica Sinica, in press.
Zhang, H. and Wang, J. (2025). Efficient estimation for longitudinal network via adaptive merging. Journal of the American Statistical Association, 120, 1683-1694.
Zhang, H. and Wang, J. (2025). Efficient estimation and inference for the signed beta-model in directed signed networks. Statistica Sinica, 35, 1671-1687.
Zhang, J., Wang, J. and Wang, X. (2024). Consistent community detection in inter-layer dependent multi-layer networks. Journal of the American Statistical Association, 119, 3141-3151.
Zhao, R., Zhang, H. and Wang, J. (2024). Identifiability and consistent estimation for Gaussian chain graph models. Journal of the American Statistical Association, 119, 3101-3112.
Xu, Q., Yuan, Y., Wang, J. and Qu, A. (2024). Crowdsourcing utilizing subgroup structure of latent factor modeling. Journal of the American Statistical Association, 119, 1192-1204.
Ren, M., Zhen, Y. and Wang, J. (2024). Transfer learning for tensor Gaussian graphical models. Journal of Machine Learning Research, 25(396), 1-40.
Zhen, Y. and Wang, J. (2024). Non-negative tensor completion for dynamic counterfactual prediction on COVID-19 pandemic. Annals of Applied Statistics, 18, 224-245.
Zhang, Y., Zhang, J., Sun, Y. and Wang, J. (2024). Change point detection in dynamic network via regularized tensor decomposition. Journal of Computational and Graphical Statistics, 33, 515-524.
Cui, L., Hong, Y., Li, Y. and Wang, J. (2024). A regularized high-dimensional positive definite covariance estimator with high-frequency data. Management Science, 70, 6483-7343.
Liu, C., Jiao, Y., Wang, J. and Huang, J. (2024). Non-asymptotic bounds for adversarial excess risks under misspecified models. SIAM Journal on Mathematics of Data Science, 6, 847-868.
Yuan, H., Lu, K., Li, G. and Wang, J. (2024). High-frequency-based volatility model with network structure. Journal of Time Series Analysis, 45, 533-557.
Zhen, Y. and Wang, J. (2023). Community detection in general hypergraph via graph embedding. Journal of the American Statistical Association, 118, 1620-1629.
Zhang, J., Li, C. and Wang, J. (2023). A stochastic block Ising model for multi-layer networks with inter-layer dependence. Biometrics, 79, 3564-3573.
Ren, M., Zhang, S. and Wang, J. (2023). Consistent estimation of the number of communities via regularized network embedding. Biometrics, 79, 2404-2416.
Xu, S., Zhen, Y. and Wang, J. (2023). Covariate-assisted community detection in multi-layer networks. Journal of Business and Economic Statistics, 41, 915-926.
Lv, S., He, X. and Wang, J. (2023). Kernel-based estimation for partially functional linear model: minimax rates and randomized sketches. Journal of Machine Learning Research, 24(55), 1-38.
Zhang, J., He, X. and Wang, J. (2022). Directed community detection with network embedding. Journal of the American Statistical Association, 117, 1809-1819.
Zhao, R., He, X. and Wang, J. (2022). Learning linear non-Gaussian directed acyclic graph with diverging number of nodes. Journal of Machine Learning Research, 23(269), 1-34.
Feng, L. and Wang, J. (2022). Projected robust PCA with application to smooth image recovery. Journal of Machine Learning Research, 23(249), 1-41.
Dai, B., Shen, X. and Wang, J. (2022). Embedding learning. Journal of the American Statistical Association, 117, 307-319.
Zhou, W., He, X., Zhong, W. and Wang, J. (2022). Efficient learning of quadratic variance function directed acyclic graphs via topological layers. Journal of Computational and Graphical Statistics, 31, 1269-1279.
Zhang, J. and Wang, J. (2022). Identifiability and parameter estimation of the overlapped stochastic co-block model. Statistics and Computing, 32:57.
Dai, B., Shen, X., Wang, J. and Qu, A. (2021). Scalable collaborative ranking for personalized prediction. Journal of the American Statistical Association, 116, 1215-1223.
He, X., Wang, J. and Lv, S. (2021). Efficient kernel-based variable selection with sparsistency. Statistica Sinica, 31, 2123-2151.
Xu, S., Dai, B. and Wang, J. (2021). Sentiment analysis with covariate-assisted word embeddings. Electronic Journal of Statistics, 15, 3015-3039.
Chen, F., He, X. and Wang, J. (2021). Learning sparse conditional distribution: an efficient kernel-based approach. Electronic Journal of Statistics, 15, 1610-1635.
Zhou, Y., Yang, B., Wang, J., Zhu, J. and Tian, G. (2021). A scaling-free minimum enclosing ball method to detect differentially expressed genes for RNA-seq data. BMC Genomics, 22:479.
He, X., Lv, S. and Wang, J. (2020). Variable selection for classification with derivative-induced regularization. Statistica Sinica, 30, 2075-2103.
He, X. and Wang, J. (2020). Discovering model structure for partially linear models. Annals of the Institute of Statistical Mathematics, 72, 45-63.
Dai, B., Wang, J., Shen, X. and Qu, A. (2019). Smooth neighborhood recommender systems. Journal of Machine Learning Research, 20(16), 1-24.
Dai, B. and Wang, J. (2019). Query-dependent learning to rank and its asymptotic properties. Electronic Journal of Statistics, 13, 465-488.
Zhou, Y., Zhu, J., Tong, T., Wang, J., Lin, B. and Zhang, J. (2019). A statistical normalization method and differential expression analysis for RNA-seq data between different species. BMC Bioinformatics, 20:163.
He, X., Wang, J. and Lv, S. (2018). Gradient-induced model-free variable selection with composite quantile regression. Statistica Sinica, 28, 1521-1538.
Yang, L., Wang, J. and Ma, S. (2018). Reduced-rank modeling for high-dimensional model-based clustering. Journal of Computational Mathematics, 36, 426-440.
Bi, X., Qu, A., Wang, J. and Shen, X. (2017). A group-specific recommender system. Journal of the American Statistical Association, 112, 1344-1353.
Lv, S., He, X. and Wang, J. (2017). A unified penalized method for sparse additive quantile models: a RKHS approach. Annals of the Institute of Statistical Mathematics, 69, 897-923.
Wang, J., Shen, X., Sun, Y. and Qu, A. (2017). Automatic summarization with existing and novel tags. Biometrika, 104, 273-290.
Yang, L., Fang, Y., Wang, J. and Shao, Y. (2017). Variable selection for partially linear models via learning gradients. Electronic Journal of Statistics, 11, 2907-2930.
Yuan, T. and Wang, J. (2017). Reduced-rank multi-label classification. Statistics and Computing, 27, 181-191.
Shu, X., Wang, J., Shen, X. and Qu, A. (2017). Word segmentation in Chinese language processing. Statistics and Its Interface, 10, 165-173.
Wang, J., Shen, X., Sun, Y. and Qu, A. (2016). Classification with unstructured predictors and an application to sentiment analysis. Journal of the American Statistical Association, 111, 1242-1253.
Zhang, C., Liu, Y., Wang, J. and Zhu, H. (2016). Reinforced angle-based multicategory support vector machine. Journal of Computational and Graphical Statistics, 25, 806-825.
Yang, L., Lv, S. and Wang, J. (2016). Model-free variable selection in reproducing kernel Hilbert space. Journal of Machine Learning Research, 17(78), 1-24.
Wang, Y., Fang, Y. and Wang, J. (2016). Sparse optimal discriminant clustering. Statistics and Computing, 26, 629-639.
Wang, J. (2015). Joint estimation of sparse multivariate regression and conditional graphical models. Statistica Sinica, 25, 831-851.
Hedayat, S., Wang, J. and Xu, T. (2015). Minimum clinically important difference in medical studies. Biometrics, 71, 33-41.
Xu, T., Fang, Y., Rong, A. and Wang, J. (2015). Flexible combination of multiple diagnostic biomarkers to improve diagnostic accuracy. BMC Medical Research Methodology, 15:94.
Xu, T., Wang, J. and Fang, Y. (2014). Covariate-adjusted Youden index and optimal cut-points. Statistics in Medicine, 33, 4963-4974.
Sun, W., Wang, J. and Fang, Y. (2013). Consistent selection of tuning parameters via variable selection stability. Journal of Machine Learning Research, 14, 3419-3440.
Xu, T. and Wang, J. (2013). An efficient model-free estimation of multiclass conditional probability. Journal of Statistical Planning and Inference, 143, 2079-2088.
Yuan, T. and Wang, J. (2013). A coordinate descent algorithm for sparse positive definite matrix estimation. Statistical Analysis and Data Mining, 6, 431-442.
Wang, J. (2013). Boosting the generalized margin in cost-sensitive multiclass classification. Journal of Computational and Graphical Statistics, 22, 178-192.
Wang, J. and Fang, Y. (2013). Analysis of presence-only data via semisupervised learning approaches. Computational Statistics and Data Analysis, 59, 134-143.
Sun, W., Wang, J. and Fang, Y. (2012). Regularized k-means clustering of high-dimensional data and its asymptotic consistency. Electronic Journal of Statistics, 6, 148-167.
Fang, Y. and Wang, J. (2012). Selection of the number of clusters via the bootstrap. Computational Statistics and Data Analysis, 56, 468-477. R package
Fang, Y. and Wang, J. (2011). Penalized cluster analysis with applications to family data. Computational Statistics and Data Analysis, 55, 2128-2136.
Huang, X., Fang, Y. and Wang, J. (2011). Identification of functional rare variants in GWAS using stability selection based on random collapsing. BMC Proceedings, 5(Suppl 9):S58.
Wang, J. (2010). Consistent selection of the number of clusters via cross validation. Biometrika, 97, 893-904. R package
Wang, J. and Wang, L. (2010). Sparse supervised dimension reduction in high dimensional classification. Electronic Journal of Statistics, 4, 914-931.
Wang, J., Shen, X. and Pan, W. (2009). On large margin hierarchical classification with multiple paths. Journal of the American Statistical Association, 104, 1213-1223.
Wang, J., Shen, X. and Pan, W. (2009). On efficient large margin semisupervised learning: methodology and theory. Journal of Machine Learning Research, 10, 719-742.
Wang, J., Shen, X. and Liu, Y. (2008). Probability estimation for large margin classifiers. Biometrika, 95, 149-167. R package
Wang, J. and Shen, X. (2007). Large margin semi-supervised learning. Journal of Machine Learning Research, 8, 1867-1891.
Wang, J. and Shen, X. (2006). Estimation of generalization error: random and fixed inputs. Statistica Sinica, 16, 569-588.

II. Conference Papers

Wang, C., He, X., Wang, Y. and Wang, J. (2024). On the target-kernel alignment: a unified analysis with kernel complexity. NeurIPS.
Lv, S., Wang, J., Liu, J. and Liu, Y. (2021). Improved learning rates of a functional lasso-type SVM with sparse multi-kernel representation. NeurIPS, spotlight.
Yao, W., Lee, W. and Wang, J. (2020). Learning from crowds via joint probabilistic matrix factorization and clustering in latent space. ECML-PKDD.
Mukherjee, A., Kumar, A., Liu, B., Wang, J., Meraz, S., Hsu, M., Castellanos, M. and Ghosh, R. (2013). Spotting opinion spammers using behavioral footprints. SIGKDD.
Tang, P., Kannan, O., Wang, J., Oh, J. and Kwigizile, V. (2013). Importance ranking of bridge condition explanatory data items for customized data collection and bridge management. TRB Annual Meeting.
Mukherjee, A., Liu, B., Wang, J., Glance, N. and Jindal, N. (2011). Detecting group review spam. WWW.
Wang, J. (2007). Efficient large margin semisupervised learning. AISTATS.

III. Book Chapters and Discussions

He, X., Xu, S. and Wang, J. (2019). Discussion on "Entropy learning for dynamic treatment regimes". Statistica Sinica, 29, 1658-1662.
Zhou, Y., Wang, J., Zhao, Y. and Tong, T. (2018). Discriminant analysis and normalization methods for next-generation sequencing data. New Frontiers of Biostatistics and Bioinformatics, pp. 365-384. Springer, New York.
Wang, J., Shen, X. and Pan, W. (2007). On transductive support vector machine. Contemporary Mathematics, 43, Machine and Statistical Learning: Prediction and Discovery, pp. 7-19. AMS, Providence.

Acknowledgement: my research is supported in part by HK RGC Grants GRF-11311022, GRF-14306523, GRF-14303424, GRF-14302925, and CUHK Startup Grant 4937091.