Research
Our group focuses on fundamental research in statistics and machine learning.
Recent publications
Cappello, L., Madrid Padilla, O. H., Palacios, J. A. (2023). Bayesian Change Point Detection with Spike-and-Slab Priors. Journal of Computational and Graphical Statistics, 1-13. arXiv:2106.10383
Carter JS, Rossell D, Smith JQ. Partial correlation graphical LASSO (2023). Scandinavian Journal of Statistics. Open access version
G. Mesters, P. Zwiernik, Non-independent component analysis (2024+). To appear in Annals of Statistics. arXiv:2206.13668
Jewson J, Rossell D. Loss function selection and the use of improper models. Journal of the Royal Statistical Society B 2022 84, 1640-1665. Online version
Rossell D. Concentration of posterior probabilities and normalized L0 criteria (2022). Bayesian Analysis, 17, 2, 565-591. Open access version
Cappello, L., Palacios, J. A., Adaptive Preferential Sampling in Phylodynamics. Journal of Computational and Graphical Statistics, 31(2): 541-552, 2022. Open access version
Cappello, L., Kim, J., Liu, S., Palacios, J. A., Statistical Challenges in Tracking the Evolution of SARS-CoV-2. Statistical Science, 37(2): 162-182, 2022. Open access version
Semken C, Rossell D. Specification analysis for technology use and teenager well-being. Statistical validity and a Bayesian proposal (2022). Journal of the Royal Statistical Society C
Avalos-Pacheco A., Rossell D., Savage R (2022). Heterogeneous large datasets integration using Bayesian factor regression. Bayesian Analysis. 17(1): 33-66. arXiv:1810.09894
Selected publications in theory/methodology (the last 10 years)
2023
F. Röttger, S. Engelke, P. Zwiernik, Total positivity in multivariate extremes. Annals of Statistics 2023, Vol. 51, No. 3, 962-1004.
2022
Jewson J, Rossell D. Loss function selection and the use of improper models. Journal of the Royal Statistical Society B 2022 84, 1640-1665. Online version
S. Lauritzen, P. Zwiernik. Locally associated graphical models and mixed convex exponential families. Annals of Statistics 2022, Vol. 50, No. 5, 962-1004.
G. Lugosi and S. Mendelson. Multivariate mean estimation with direction-dependent accuracy. Journal of the European Mathematical Society, 2022.
L. Addario-Berry, L. Devroye, G. Lugosi, and V. Velona. Broadcasting on random recursive trees. Annals of Applied Probability, 32(1):497-528, 2022.
Rossell D. Concentration of posterior probabilities and normalized L0 criteria (2022). Bayesian Analysis, 17, 2, 565-591. Open access version
2021
M. Greenacre. Compositional Data Analysis. Annual Review of Statistics and Its Application, to appear, 2021.
G. Lugosi, J. Truszkowski, V. Velona, and P. Zwiernik. Learning partial correlation graphs and graphical models by covariance queries. Journal of Machine Learning Research, 22(203):1-41, 2021.
Rossell D, Abril O, Bhattacharya A. Approximate Laplace approximations for scalable model selection (2021). Journal of the Royal Statistical Society B, 83, 4, 853-879. Online version (open access)
G. Lugosi and S. Mendelson. Robust multivariate mean estimation: the optimality of trimmed mean. Annals of Statistics, 2021.
S. Lauritzen, C. Uhler and P. Zwiernik, Total positivity in exponential families with application to binary variables. Annals of Statistics, 2021, Vol. 49, No. 3, 1436-1459.
Rossell D, Rubio FJ. Additive Bayesian variable selection under censoring and misspecification (2021). Statistical Science, 38, 1, 13-29. Open access
Rossell D, Zwiernik P. Dependence in elliptical partial correlation graphs (2021). Electronic Journal of Statistics, 15, 2, 4236-4263. Open access version
2020
C. Bordenave, G. Lugosi, and N. Zhivotovskiy. Noise sensitivity of the top eigenvector of a Wigner matrix. Probability Theory and Related Fields, 2020.
G. Lugosi and S. Mendelson. Risk minimization by median-of-means tournaments. Journal of the European Mathematical Society, 2020.
P.L. Bartlett, P.M. Long, G. Lugosi, and A. Tsigler. Benign overfitting in linear regression. PNAS, 117.48 (2020): 30063-30070.
A. Corral, F. Udina and E. Arcaute, Truncated lognormal distributions and scaling in the size of naturally defined population clusters. Physical Review E, 2020, 101, No. 4.
2019
G. Lugosi and S. Mendelson, Near-optimal mean estimators with respect to general norms. Probability Theory and Related Fields, 2019.
J. Fúquene, M.F.J. Steel, and D. Rossell, On choosing mixture components via non-local priors. Journal of the Royal Statistical Society B, 2019, 81, 5, 809-837.
S. Lauritzen, C. Uhler, and P. Zwiernik, Maximum likelihood estimation in Gaussian models under total positivity. Annals of Statistics, 2019, Vol. 47, No. 4, 1835-1863.
Sub-Gaussian estimators of the mean of a random vector by G. Lugosi and S. Mendelson. Annals of Statistics, 2019, Vol. 47, No. 2, pp 783-794.
2018
Variable selection in compositional data analysis using pairwise logratios. M. Greenacre. Mathematical Geosciences, 2018, 1-34. doi: 10.1007/s11004-018-9754-x
Tractable Bayesian variable selection: beyond normality by D. Rossell and F.J. Rubio. Journal of the American Statistical Association, 2018, pp 1-17.
2017
Nonlocal priors for high-dimensional estimation by D. Rossell and D. Telesca. Journal of the American Statistical Association, 2017, 112.517, pp 254-265.
Maximum likelihood estimation for linear Gaussian covariance models by P. Zwiernik, C. Uhler, and D. Richards. Journal of the Royal Statistical Society: Series B, 79(4), 2017, 1269–1292.
S. Fallat, S. Lauritzen, K. Sadeghi, C. Uhler, N. Wermuth, and P. Zwiernik, Total positivity in Markov structures. Annals of Statistics 2017, Vol. 45, No. 3, 1152-1184.
"Size" and "shape" in the measurement of multivariate proximity by M. Greenacre. Methods in Ecology and Evolution 2017, 8:1415-1424. doi: 10.1111/2041-210X.12776 with video abstract.
2016
Set estimation from reflected Brownian motion by A. Cholaquidis, R. Fraiman, G. Lugosi, and B. Pateiro-López. Journal of the Royal Statistical Society: Series B, 2016, 78:1057–1078.
Sub-Gaussian mean estimators by L. Devroye, M. Lerasle, G. Lugosi, and R. Imbuzeiro Oliveira. Annals of Statistics, 2016, 44:2695-2725.
Almost optimal sparsification of random geometric graphs by N. Broutin, L. Devroye, and G. Lugosi, Annals of Applied Probability, 2016, 26:5, 3078-3109.
Weighted Euclidean biplots by M. Greenacre and P. Groenen. Journal of Classification, 33:442-459.
On probability laws of solutions of differential systems driven by fractional Brownian motion by F. Baudoin, E. Nualart, C. Ouyang, and S. Tindel, Annals of Probability, 2016, 44, pp 2554-2590.
Exponential varieties by M. Michałek, B. Sturmfels, C. Uhler, and P. Zwiernik, Proceedings of the London Mathematical Society (3) 112 (2016), no. 1, 27–56.
2015
Empirical risk minimization for heavy-tailed losses by C. Brownlees, E. Joly and G. Lugosi, Annals of Statistics, 2015, 43(6), 2507-2536.
Selected publications in applications (the last 10 years)
Parikh, V., Ioannidis, ... Cappello, L., ..., Rivas, M., Ashley, E. (2022). Deconvoluting complex correlates of COVID-19 severity with a multi-omic pandemic tracking strategy. Nature Communications, 13, 5107.
L. Beauchemin, M. Slifker, D. Rossell, and J. Font-Burgada (2020). Characterizing MHC-I genotype predictive power for oncogenic mutation probability in cancer patients. Immunoinformatics, Methods and Protocols. Springer.
Graeve M, Greenacre M. (2020). The selection and analysis of fatty acid ratios: A new approach for the univariate and multivariate analysis of fatty acid trophic markers in marine pelagic organisms. Limnology and Oceanography: Methods, 18, 196-210. doi: 10.1002/lom3.10360 with video abstract
Greenacre M (2020). Amalgamations are valid in compositional data analysis, can be used in agglomerative clustering, and their logratios have an inverse transformation. Applied Computing and Geosciences, 5, doi: 10.1016/j.acags.2019.100017
Gavard R, Jones H, Palacio Lozano D, Thomas M, Rossell D, Spencer S, Barrow M (2020). KairosMS: A new solution for the processing of hyphenated ultrahigh resolution mass spectrometry data. Analytical Chemistry, 92(5), 3775-86
Gavard R, Palacio Lozano D, Guzman A, Rossell D, Spencer S, Barrow M (2019). Rhapso: Automatic stitching of mass segments from Fourier transform ion cyclotron resonance mass spectra. Analytical Chemistry, 91:15130-37
Greenacre M (2019). Use of correspondence analysis in clustering a mixed-scale data set with missing data. Archives of Data Science, doi: 10.5445/KSP/1000085952/04
Korneliussen T, Greenacre M (2018). Information sources used by European tourists: a cross-cultural study. Journal of Travel Research, 57, 193-205.
Greenacre M (2017). Ordination with any dissimilarity measure: a weighted Euclidean solution. Ecology, 98:2293-2300.
Marty R, Kaabinejadian S, van de Haar J, Rossell D, Ideker T, Hildebrand W, Engin HB, Font-Burgada J, Carter H. (2017) MHC-I genotype restricts the oncogenic mutational landscape. Cell, 171, 1272-1283
Greenacre M (2016). Data reporting and visualization in ecology. Polar Biology, 39:2189-2205.
Font-Burgada J, Shalapour S, Ramaswamy S, Hsueh B, Rossell D, Umemura A, Taniguchi K, Nakagawa H, Valasek MA, Ye L, Kopp JL, Sander M, Carter H, Deisseroth K, Verma IM, Karin M. (2015) Hybrid Periportal Hepatocytes Regenerate the Injured Liver without Giving Rise to Cancer. Cell, 162(4):766-79.
Calon A, Lonardo E, Berenguer A, Espinet E, Hernando-Momblona X, Iglesias M, Sevillano M, Palomo-Ponce S, Tauriello DVF, Byrom D, Cortina C, Morral C, Barceló C, Tosi S, Riera A, Stephan-Otto Attolini C, Rossell D, Sancho E, Batlle E. (2015) Stromal gene expression defines poor prognosis subtypes in colorectal cancer. Nature Genetics, 47, 320-329. doi:10.1038/ng.3225
Books
D. Nualart and E. Nualart, Introduction to Malliavin Calculus, IMS Textbooks, Cambridge University Press, 2018.
M. Greenacre. Compositional Data Analysis in Practice. Chapman & Hall, 2018.
P. Zwiernik. Semialgebraic Statistics and Latent Tree Models. Chapman & Hall, 2017.
S. Boucheron, G. Lugosi, and P. Massart, Concentration Inequalities: A Nonasymptotic Theory of Independence. Oxford University Press, 2013.
N. Cesa-Bianchi and G. Lugosi, Prediction, Learning, and Games. Cambridge University Press, 2006.
L. Devroye and G. Lugosi, Combinatorial Methods in Density Estimation. Springer, 2000.
L. Devroye, L. Györfi, G. Lugosi, A Probabilistic Theory of Pattern Recognition. Springer, 1996.
M. Greenacre. Correspondence Analysis in Practice. Chapman & Hall, 1993.
Current editorial services
Christian Brownlees:
Annals of Financial Economics, Econometrics, Journal of Network Theory in Finance, Journal of Risk and Financial Management
Gábor Lugosi:
Annals of Applied Probability, Journal of Machine Learning Research, Probability Theory and Related Fields
Eulàlia Nualart:
Stochastic Processes and their Applications
David Rossell:
Bayesian Analysis
Piotr Zwiernik:
Journal of the Royal Statistical Society: Series B, Biometrika, Algebraic Statistics, Scandinavian Journal of Statistics
Research projects
"Predicción, Inferencia y Computación en Modelos Estructurados de Alta Dimensión" (Prediction, Inference and Computation in High-Dimensional Structured Models)
Reference: PGC2018-101643-B
Financing entity: Ministerio de Economía y Competitividad (MINECO)
Dates: 2019-2022
Principal investigators: Gábor Lugosi, Omiros Papaspiliopoulos
Amount: € 141,812
"Algorithms and Learning for AI"
Financing entity: Google
Dates: 2018-2020
Principal investigator: Gábor Lugosi
Amount: USD 150,000
"High-dimensional problems in structured probabilistic models"
Financing entity: Fundación BBVA
Dates: 2018-2020
Principal investigator: Gábor Lugosi
Amount: € 100,000
"Estimación de redes latentes" (Estimation of latent networks)
Reference: MTM2015-67304-P
Financing entity: Ministerio de Economía y Competitividad (MINECO)
Dates: 2016-2018
Principal investigators: Gábor Lugosi, Omiros Papaspiliopoulos
Amount: € 52,998