From June 29 to July 2, 2021, we will organize a conference on “Mathematical Statistics and Learning”. The meeting will bring together leading experts in mathematical statistics and machine learning to discuss high-dimensional and structured problems resulting from the modelling and statistical analysis of data from large complex systems. The organizing committee is: Gábor Lugosi (UPF & BGSE), Gergely Neu (UPF), Caroline Uhler (MIT), and Piotr Zwiernik (UPF & BGSE).
Data Science Research: Statistics
Main research directions
- Applied Statistics
- Graphical Models and Network Analysis
- Mathematical Statistics and Learning
- Stochastic Analysis and Malliavin Calculus
- Bayesian Statistics and Monte Carlo Methods
- Time Series and Econometrics
Research group website
Highlights
June 29 – July 2, 2021
Mathematical Statistics and Learning 2021
O. Papaspiliopoulos and N. Chopin. Springer, 2020
An Introduction to Sequential Monte Carlo
P.L. Bartlett, P.M. Long, G. Lugosi, and A. Tsigler. PNAS, 2020.
Benign overfitting in linear regression
The phenomenon of benign overfitting is one of the key mysteries uncovered by deep learning methodology: deep neural networks seem to predict well, even with a perfect fit to noisy training data. Motivated by this phenomenon, Gábor and his co-authors consider when a perfect fit to training data in linear regression is compatible with accurate prediction. Their analysis shows that overparameterization is essential for benign overfitting in this setting: the number of directions in parameter space that are unimportant for prediction must significantly exceed the sample size.
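As a rough illustration of the setting described above, the sketch below fits a minimum-norm interpolating solution to noisy data in an overparameterized linear model. The choice of the minimum-norm interpolator, the function names, and the simulated covariance (one important direction and many low-variance ones) are assumptions made for illustration, not details taken from the article.

```python
import numpy as np


def min_norm_interpolator(X, y):
    """Return the minimum-norm theta solving X @ theta = y.

    Assumes more features than samples (p > n) and full row rank,
    so that an exact fit to the training data exists.
    """
    return np.linalg.pinv(X) @ y


def excess_risk(theta_hat, theta_star, variances):
    """Excess prediction risk for zero-mean features with diagonal covariance."""
    diff = theta_hat - theta_star
    return float(np.sum(variances * diff**2))


# Illustrative, made-up setup: one important direction and many
# low-variance directions that matter little for prediction.
rng = np.random.default_rng(0)
n, p = 20, 2000
variances = np.concatenate(([1.0], np.full(p - 1, 0.05**2)))
X = rng.normal(size=(n, p)) * np.sqrt(variances)
theta_star = np.zeros(p)
theta_star[0] = 1.0
y = X @ theta_star + 0.1 * rng.normal(size=n)  # noisy labels

theta_hat = min_norm_interpolator(X, y)
# The fit reproduces the noisy labels exactly, up to floating-point rounding.
train_error = float(np.max(np.abs(X @ theta_hat - y)))
risk = excess_risk(theta_hat, theta_star, variances)
```

Here `train_error` is zero up to rounding because the fit interpolates the data; whether `risk` is small depends on how the feature variances are distributed, which is the question the article studies.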
S. Lauritzen, C. Uhler and P. Zwiernik
Total positivity in exponential families with application to binary variables
Annals of Statistics, to appear
P.L. Bartlett, P.M. Long, G. Lugosi, and A. Tsigler. Benign overfitting in linear regression. PNAS, to appear, 2020.
C. Bordenave, G. Lugosi, and N. Zhivotovskiy. Noise sensitivity of the top eigenvector of a Wigner matrix. Probability Theory and Related Fields, to appear, 2020.
G. Lugosi and S. Mendelson. Robust multivariate mean estimation: the optimality of trimmed mean. Annals of Statistics, to appear, 2020.
S. Lauritzen, C. Uhler, and P. Zwiernik. Total positivity in exponential families with application to binary variables. Annals of Statistics, to appear.
G. Lugosi and S. Mendelson. Regularization, sparse recovery, and median-of-means tournaments. Bernoulli, to appear.
G. Lugosi and S. Mendelson. Near-optimal mean estimators with respect to general norms. Probability Theory and Related Fields, to appear.
G. Lugosi and S. Mendelson. Risk minimization by median-of-means tournaments. Journal of the European Mathematical Society.
L. Beauchemin, M. Slifker, D. Rossell, and J. Font-Burgada. Characterizing MHC-I genotype predictive power for oncogenic mutation probability in cancer patients. Immunoinformatics, Methods and Protocols. Springer, in press.
Predicting election results with emerging parties by J.G. Montalvo, O. Papaspiliopoulos and T. Stumpf-Fetizon. European Journal of Political Economy.
Continuous mixtures with skewness and heavy tails by D. Rossell and M.F.J. Steel. In Handbook of Mixture Analysis, Chapter 10, CRC Press.
- A. Corral, F. Udina and E. Arcaute, Truncated lognormal distributions and scaling in the size of naturally defined population clusters. Physical Review E, 2020, 101, No. 4.
- On choosing mixture components via non-local priors by J. Fúquene, M.F.J. Steel, and D. Rossell. Journal of the Royal Statistical Society B, 2019, 81, 5, 809-837.
- Maximum likelihood estimation in Gaussian models under total positivity by S. Lauritzen, C. Uhler, and P. Zwiernik. Annals of Statistics, 2019, Vol. 47, No. 4, 1835-1863.
- Sub-Gaussian estimators of the mean of a random vector by G. Lugosi and S. Mendelson. Annals of Statistics, 2019, Vol. 47, No. 2, pp 783-794.
- Auxiliary gradient-based sampling algorithms by M.K. Titsias and O. Papaspiliopoulos. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 80(4), 2018, pp 749-767.
- Tractable Bayesian variable selection: beyond normality by D. Rossell and F.J. Rubio. Journal of the American Statistical Association, 2018, pp 1-17.
- Nonlocal priors for high-dimensional estimation by D. Rossell and D. Telesca. Journal of the American Statistical Association, 2017, 112.517, pp 254-265.
- Maximum likelihood estimation for linear Gaussian covariance models by P. Zwiernik, C. Uhler, and D. Richards. Journal of the Royal Statistical Society: Series B, 79(4), 2017, 1269–1292.
- Total positivity in Markov structures by S. Fallat, S. Lauritzen, K. Sadeghi, C. Uhler, N. Wermuth, and P. Zwiernik. Annals of Statistics 2017, Vol. 45, No. 3, 1152-1184.
- Set estimation from reflected Brownian motion by A. Cholaquidis, R. Fraiman, G. Lugosi, and B. Pateiro-López. Journal of the Royal Statistical Society: Series B, 2016, 78:1057–1078.
- Sub-Gaussian mean estimators by L. Devroye, M. Lerasle, G. Lugosi, and R. Imbuzeiro Oliveira. Annals of Statistics, 2016, 44:2695-2725.
- Almost optimal sparsification of random geometric graphs by N. Broutin, L. Devroye, and G. Lugosi, Annals of Applied Probability, 2016, 26:5, 3078-3109.
- On probability laws of solutions of differential systems driven by fractional Brownian motion by F. Baudoin, E. Nualart, C. Ouyang, and S. Tindel, Annals of Probability, 2016, 44, pp 2554-2590.
- Exact sampling of diffusions with a discontinuity in the drift by O. Papaspiliopoulos, G. Roberts, and K. Taylor, Advances in Applied Probability, 2016, 48(A), 249-259.
- Exponential varieties by M. Michałek, B. Sturmfels, C. Uhler, and P. Zwiernik, Proceedings of the London Mathematical Society (3) 112 (2016), no. 1, 27–56.
- Empirical risk minimization for heavy-tailed losses by C. Brownlees, E. Joly and G. Lugosi, Annals of Statistics, 2015, 43(6), 2507-2536.
Gavard R, Jones H, Palacio Lozano D, Thomas M, Rossell D, Spencer S, Barrow M (2020). KairosMS: A new solution for the processing of hyphenated ultrahigh resolution mass spectrometry data. Analytical Chemistry, 92(5):3775-86.
Gavard R, Palacio Lozano D, Guzman A, Rossell D, Spencer S, Barrow M (2019). Rhapso: Automatic stitching of mass segments from Fourier transform ion cyclotron resonance mass spectra. Analytical Chemistry, 91:15130-37.
M. Greenacre. Variable selection in compositional data analysis using pairwise logratios. Mathematical Geosciences, 2018, 1-34.
Marty R, Kaabinejadian S, van de Haar J, Rossell D, Ideker T, Hildebrand W, Engin HB, Font-Burgada J, Carter H. (2017) MHC-I genotype restricts the oncogenic mutational landscape. Cell, 171, 1272-1283.
Font-Burgada J, Shalapour S, Ramaswamy S, Hsueh B, Rossell D, Umemura A, Taniguchi K, Nakagawa H, Valasek MA, Ye L, Kopp JL, Sander M, Carter H, Deisseroth K, Verma IM, Karin M. (2015) Hybrid Periportal Hepatocytes Regenerate the Injured Liver without Giving Rise to Cancer. Cell, 162(4):766-79.
Calon A, Lonardo E, Berenguer A, Espinet E, Hernando-Momblona X, Iglesias M, Sevillano M, Palomo-Ponce S, Tauriello DVF, Byrom D, Cortina C, Morral C, Barceló C, Tosi S, Riera A, Stephan-Otto Attolini C, Rossell D, Sancho E, Batlle E. (2015) Stromal gene expression defines poor prognosis subtypes in colorectal cancer. Nature Genetics, 47, 320-329. doi:10.1038/ng.3225
Christian Brownlees:
Annals of Financial Economics, Econometrics, Journal of Network Theory in Finance, Journal of Risk and Financial Management
Gábor Lugosi:
Annals of Applied Probability, Journal of Machine Learning Research, Probability Theory and Related Fields
Eulàlia Nualart:
Stochastic Processes and their Applications (Associate Editor)
Omiros Papaspiliopoulos:
Biometrika (Deputy Editor), SIAM/ASA Journal on Uncertainty Quantification
David Rossell:
Bayesian Analysis (Associate Editor)
Piotr Zwiernik:
Biometrika, Journal of Algebraic Statistics, Scandinavian Journal of Statistics
“Predicción, Inferencia y Computación en Modelos Estructurados de Alta Dimensión”
- Reference: PGC2018-101643-B
- Financing entity: Ministerio de Economía y Competitividad (MINECO)
- Dates: 2019-2021
- Principal investigators: Gábor Lugosi, Omiros Papaspiliopoulos
- Amount: € 141,812
“Algorithms and Learning for AI”
- Financing entity: Google
- Dates: 2018-2020
- Principal investigator: Gábor Lugosi
- Amount: USD 150,000
“High-dimensional problems in structured probabilistic models”
- Financing entity: Fundación BBVA
- Dates: 2018-2020
- Principal investigator: Gábor Lugosi
- Amount: € 100,000
“Estimación de redes latentes”
- Reference: MTM2015-67304-P
- Financing entity: Ministerio de Economía y Competitividad (MINECO)
- Dates: 2016-2018
- Principal investigators: Gábor Lugosi, Omiros Papaspiliopoulos
- Amount: € 52,998