Our group focuses on fundamental research in statistics, probability, and machine learning.
Rognon-Vael P, Rossell D, Zwiernik P. Improving variable selection properties by using external data (2025) arxiv 2502.15584
C. Amorino, D. Belomestny, V. Pilipauskaite, M Podolskij, S. Zhou. Polynomial rates via deconvolution for nonparametric estimation in McKean-Vlasov SDEs, Probability Theory and Related Fields.
C. Amorino, A. Gloter. Minimax rate for multivariate data under componentwise local differential privacy constraints, to appear in Annals of Statistics.
P. Zwiernik, Entropic covariance models. to appear in Annals of Statistics. arxiv:2306.03590
Torrens M, Papaspiliopoulos O, Rossell D. Confounder importance learning for treatment effect inference. Bayesian Analysis (2025, in press). arXiv:2110.00314
Rossell D, Kseung AK, Saez I, Michele G. Semi-parametric local variable selection under misspecification. Biometrika (2024). Paper DOI 10.1093/biomet/asae068
Jewson J, Li L, Battaglia L, Hansen S, Rossell D, Zwiernik P. Graphical model inference with external network data. Biometrics 2024, 80(4). ujae151. arxiv 2210.11107
S. Briend, C. Giraud, G. Lugosi, and D. Sulem, Estimating the history of a random recursive tree. Bernoulli, to appear, 2024.
Cappello, L., Padilla, O. M., Bayesian variance change point detection with credible sets. to appear in IEEE Transactions on Pattern Analysis and Machine Intelligence.
Cappello, L., Lo, W.T.J., Zhang, J.Z., Xu, P., Barrow, D., Chopra, I., Clark, A. G., Wells, M. T., Kim, J. Bayesian phylodynamic inference of population dynamics with dormancy. to appear in PNAS
2024
Foondun, M., Khoshnevisan, D. and Nualart, E. (2024), Instantaneous everywhere-blowup of parabolic SPDEs, Probability Theory and Related Fields, 190, 601-624.
G. Mesters, P. Zwiernik, Non-independent component analysis. Annals of Statistics 2024, Vol. 52, No. 6, 2506-2528.
N. Broutin, N. Kamčev, and G. Lugosi. Increasing paths in random temporal graphs. Annals of Applied Probability, Vol. 34, No. 6, 5498-5521, 2024.
Brownlees, C., & Llorens-Terrazas, J. (2024). Empirical risk minimization for time series: Nonparametric performance bounds for prediction. Journal of Econometrics, 244(1), 105849.
Cappello, L., Veber, A., Palacios, J. A., An Efficient Coalescent Model for Heterochronously Sampled Molecular Data. Journal of the American Statistical Association. Vol. 119, No. 548, 2437-2449, 2024.
2023
F. Röttger, S. Engelke, P. Zwiernik, Total positivity in multivariate extremes. Annals of Statistics 2023, Vol. 51, No. 3, 962-1004.
Cappello, L., Madrid Padilla, O. H., Palacios, J. A. (2023). Bayesian Change Point Detection with Spike-and-Slab Priors. Journal of Computational and Graphical Statistics, 1-13.
2022
Jewson J, Rossell D. Loss function selection and the use of improper models. Journal of the Royal Statistical Society B 2022 84, 1640-1665. Online version
S. Lauritzen, P. Zwiernik. Locally associated graphical models and mixed convex exponential families. Annals of Statistics 2022, Vol. 50, No. 5, 962-1004.
G. Lugosi and S. Mendelson. Multivariate mean estimation with direction-dependent accuracy. Journal of the European Mathematical Society, 2022.
L. Addario-Berry, L. Devroye, G. Lugosi, and V. Velona. Broadcasting on random recursive trees. Annals of Applied Probability, 32(1):497-528, 2022.
Rossell D. Concentration of posterior probabilities and normalized L0 criteria (2022). Bayesian Analysis, 17, 2, 565-591. Open access version
Avalos-Pacheco A., Rossell D., Savage R (2022). Heterogeneous large datasets integration using Bayesian factor regression. Bayesian Analysis. 17(1): 33-66. arXiv:1810.09894
Cappello, L., Palacios, J. A., Adaptive Preferential Sampling in Phylodynamics. Journal of Computational and Graphical Statistics, 31(2): 541-552, 2022. Open access version
2021
M. Greenacre. Compositional Data Analysis. Annual Review of Statistics and Its Application, to appear, 2021.
G. Lugosi, J. Truszkowski, V. Velona, and P. Zwiernik. Learning partial correlation graphs and graphical models by covariance queries. Journal of Machine Learning Research, 22(203):1--41, 2021.
Rossell D, Abril O, Bhattacharya A. Approximate Laplace approximations for scalable model selection (2021). Journal of the Royal Statistical Society B, 83, 4, 853-879. Online version (open access)
G. Lugosi, and S. Mendelson. Robust multivariate mean estimation: the optimality of trimmed mean. Annals of Statistics, 2021.
S. Lauritzen, C. Uhler and P. Zwiernik, Total positivity in exponential families with application to binary variables. Annals of Statistics, 2021, Vol. 49, No. 3, 1436-1459.
Rossell D, Rubio FJ. Additive Bayesian variable selection under censoring and misspecification (2021). Statistical Science, 38, 1, 13-29. Open access
Rossell D, Zwiernik P. Dependence in elliptical partial correlation graphs (2021). Electronic Journal of Statistics, 15, 2, 4236-4263. Open access version
2020
C. Bordenave, G. Lugosi, and N. Zhivotovskiy. Noise sensitivity of the top eigenvector of a Wigner matrix. Probability Theory and Related Fields, 2020.
G. Lugosi, and S. Mendelson. Risk minimization by median-of-means tournaments. Journal of the European Mathematical Society, 2020.
P.L. Bartlett, P.M. Long, G. Lugosi, and A. Tsigler. Benign overfitting in linear regression. PNAS, 117.48 (2020): 30063-30070.
A. Corral, F. Udina and E. Arcaute, Truncated lognormal distributions and scaling in the size of naturally defined population clusters. Physical Review E, 2020, 101, No. 4.
2019
G. Lugosi, and S. Mendelson, Near-optimal mean estimators with respect to general norms. Probability Theory and Related Fields, 2019.
J. Fúquene, M.F.J. Steel, and D. Rossell, On choosing mixture components via non-local priors. Journal of the Royal Statistical Society B, 2019, 81, 5, 809-837.
S. Lauritzen, C. Uhler, and P. Zwiernik, Maximum likelihood estimation in Gaussian models under total positivity. Annals of Statistics, 2019, Vol. 47, No. 4, 1835-1863.
Sub-Gaussian estimators of the mean of a random vector by G. Lugosi, and S. Mendelson. Annals of Statistics, 2019, Vol. 47, No. 2, pp 783-794.
2018
Variable selection in compositional data analysis using pairwise logratios. M. Greenacre. Mathematical Geosciences, 2018, 1-34. doi: 10.1007/s11004-018-9754-x
Tractable Bayesian variable selection: beyond normality by D. Rossell and F.J. Rubio. Journal of the American Statistical Association, 2018, pp 1-17.
2017
Nonlocal priors for high-dimensional estimation by D. Rossell and D. Telesca. Journal of the American Statistical Association, 2017, 112.517, pp 254-265.
Maximum likelihood estimation for linear Gaussian covariance models by P. Zwiernik, C. Uhler, and D. Richards. Journal of the Royal Statistical Society: Series B, 79(4), 2017, 1269–1292.
S. Fallat, S. Lauritzen, K. Sadeghi, C. Uhler, N. Wermuth, and P. Zwiernik, Total positivity in Markov structures. Annals of Statistics 2017, Vol. 45, No. 3, 1152-1184.
"Size" and "shape" in the measurement of multivariate proximity by M. Greenacre. Methods in Ecology and Evolution 2017, 8:1415-1424. doi: 10.1111/2041-210X.12776 with video abstract.
2016
Set estimation from reflected Brownian motion by A. Cholaquidis, R. Fraiman, G. Lugosi, and B. Pateiro-López. Journal of the Royal Statistical Society: Series B, 2016, 78:1057–1078.
Sub-Gaussian mean estimators by L. Devroye, M. Lerasle, G. Lugosi, and R. Imbuzeiro Oliveira. Annals of Statistics, 2016, 44:2695-2725.
Almost optimal sparsification of random geometric graphs by N. Broutin, L. Devroye, and G. Lugosi, Annals of Applied Probability, 2016, 26:5, 3078-3109.
Weighted Euclidean biplots by M. Greenacre and P. Groenen. Journal of Classification, 33:442-459.
On probability laws of solutions of differential systems driven by fractional Brownian motion by F. Baudoin, E. Nualart, C. Ouyang, and S. Tindel, Annals of Probability, 2016, 44, pp 2554-2590.
Exponential varieties by M. Michałek, B. Sturmfels, C. Uhler, and P. Zwiernik, Proceedings of the London Mathematical Society (3) 112 (2016), no. 1, 27–56.
2015
Empirical risk minimization for heavy-tailed losses by C. Brownlees, E. Joly and G. Lugosi, Annals of Statistics, 2015, 43(6), 2507-2536.
Cappello, L., Kim, J., Liu, S., Palacios, J. A., Statistical Challenges in Tracking the Evolution of SARS-CoV-2. Statistical Science, 37(2): 162-182, 2022. Open access version
Semken C, Rossell D. Specification analysis for technology use and teenager well-being. Statistical validity and a Bayesian proposal (2022). Journal of the Royal Statistical Society C
Parikh, V., Ioannidis, ... Cappello, L., ..., Rivas, M., Ashley, E. (2022) Deconvoluting complex correlates of COVID-19 severity with a multi-omic pandemic tracking strategy. Nature Communications, 13, 5107
L. Beauchemin, M. Slifker, D. Rossell, and J. Font-Burgada (2020). Characterizing MHC-I genotype predictive power for oncogenic mutation probability in cancer patients. Immunoinformatics, Methods and Protocols. Springer.
Graeve M, Greenacre M. (2020). The selection and analysis of fatty acid ratios: A new approach for the univariate and multivariate analysis of fatty acid trophic markers in marine pelagic organisms. Limnology and Oceanography: Methods, 18, 196-210. doi: 10.1002/lom3.10360 with video abstract
Greenacre M (2020). Amalgamations are valid in compositional data analysis, can be used in agglomerative clustering, and their logratios have an inverse transformation. Applied Computing and Geosciences, 5, doi: 10.1016/j.acags.2019.100017
Gavard R, Jones H, Palacio Lozano D, Thomas M, Rossell D, Spencer S, Barrow M (2020). KairosMS: A new solution for the processing of hyphenated ultrahigh resolution mass spectrometry data. Analytical Chemistry, 92.5 3775-86
Gavard R, Palacio Lozano D, Guzman A, Rossell D, Spencer S, Barrow M (2019). Rhapso: Automatic stitching of mass segments from Fourier transform ion cyclotron resonance mass spectra. Analytical Chemistry, 91:15130-37
Greenacre M (2019). Use of correspondence analysis in clustering a mixed-scale data set with missing data. Archives of Data Science, doi: 10.5445/KSP/1000085952/04
Korneliussen T, Greenacre M (2018). Information sources used by European tourists: a cross-cultural study. Journal of Travel Research, 57, 193-205.
Greenacre M (2017). Ordination with any dissimilarity measure: a weighted Euclidean solution. Ecology, 98:2293-2300.
Marty R, Kaabinejadian S, van de Haar J, Rossell D, Ideker T, Hildebrand W, Engin HB, Font-Burgada J, Carter H. (2017) MHC-I genotype restricts the oncogenic mutational landscape. Cell, 171, 1272-1283
Greenacre M (2016). Data reporting and visualization in ecology. Polar Biology, 39:2189-2205.
Font-Burgada J, Shalapour S, Ramaswamy S, Hsueh B, Rossell D, Umemura A, Taniguchi K, Nakagawa H, Valasek MA, Ye L, Kopp JL, Sander M, Carter H, Deisseroth K, Verma IM, Karin M. (2015) Hybrid Periportal Hepatocytes Regenerate the Injured Liver without Giving Rise to Cancer. Cell, 162(4):766-79.
Calon A, Lonardo E, Berenguer A, Espinet E, Hernando-Momblona X, Iglesias M, Sevillano M, Palomo-Ponce S, Tauriello DVF, Byrom D, Cortina C, Morral C, Barceló C, Tosi S, Riera A, Stephan-Otto Attolini C, Rossell D, Sancho E, Batlle E. (2015) Stromal gene expression defines poor prognosis subtypes in colorectal cancer. Nature Genetics, 47, 320-329. doi:10.1038/ng.3225
D. Nualart and E. Nualart, Introduction to Malliavin Calculus, IMS Textbooks, Cambridge University Press, 2018.
M. Greenacre. Compositional Data Analysis in Practice. Chapman&Hall, 2018.
P. Zwiernik. Semialgebraic Statistics and Latent Tree Models. Chapman&Hall, 2017.
S. Boucheron, G. Lugosi, and P. Massart, Concentration Inequalities: A Nonasymptotic Theory of Independence. Oxford University Press, 2013.
N. Cesa-Bianchi, and G. Lugosi, Prediction, Learning, and Games. Cambridge University Press, 2006.
L. Devroye and G. Lugosi, Combinatorial Methods in Density Estimation. Springer, 2000.
L. Devroye, L. Györfi, G. Lugosi, A Probabilistic Theory of Pattern Recognition. Springer, 1996.
M. Greenacre. Correspondence Analysis in Practice. Chapman&Hall, 1993.
Christian Brownlees:
Annals of Financial Economics, Econometrics, Journal of Network Theory in Finance, Journal of Risk and Financial Management
Gábor Lugosi:
Annals of Applied Probability, Journal of Machine Learning Research, Probability Theory and Related Fields
Eulàlia Nualart:
Stochastic Processes and their Applications
David Rossell:
Bayesian Analysis
Piotr Zwiernik:
Journal of the Royal Statistical Society: Series B, Biometrika, Algebraic Statistics, Scandinavian Journal of Statistics
"Statistical learning for dependent and high-dimensional data"
Reference: PID2022-138268NB-I00
Financing entity: Agencia Española de Investigación
Dates: 2023-2027
Amount: €100,000
"Modern challenges in high-dimensional data analysis"
Financing entity: Fundación BBVA
Dates: 2022-2025
Principal investigator: Gábor Lugosi
Amount: € 149,978
"Predicción, Inferencia y Computación en Modelos Estructurados de Alta Dimensión"
Reference: PGC2018-101643-B
Financing entity: Ministerio de Economía y Competitividad (MINECO)
Dates: 2019-2022
Principal investigators: Gábor Lugosi, Omiros Papaspiliopoulos
Amount: € 141,812
"Algorithms and Learning for AI"
Financing entity: Google
Dates: 2018-2020
Principal investigator: Gábor Lugosi
Amount: USD 150,000
“High-dimensional problems in structured probabilistic models”
Financing entity: Fundación BBVA
Dates: 2018-2020
Principal investigator: Gábor Lugosi
Amount: € 100,000
“Estimación de redes latentes”
Reference: MTM2015-67304-P
Financing entity: Ministerio de Economía y Competitividad (MINECO)
Dates: 2016-2018
Principal investigators: Gábor Lugosi, Omiros Papaspiliopoulos
Amount: € 52,998