On choosing mixture components via non‐local priors


Jairo Fúquene, Mark Steel, David Rossell

Journal of the Royal Statistical Society: Series B (Statistical Methodology)
Volume 81, Issue 5
November 2019
Pages 809-837



Choosing the number of mixture components remains an elusive challenge. Model selection criteria can be either overly liberal or conservative and return poorly separated components of limited practical use. We formalize non‐local priors (NLPs) for mixtures and show how they lead to well‐separated components with non‐negligible weight, interpretable as distinct subpopulations. We also propose an estimator for posterior model probabilities under local priors and NLPs, showing that Bayes factors are ratios of posterior‐to‐prior empty cluster probabilities. The estimator is widely applicable and helps to set thresholds to drop unoccupied components in overfitted mixtures. We suggest default prior parameters based on multimodality for normal–T‐mixtures and minimal informativeness for categorical outcomes. We characterize theoretically the NLP‐induced sparsity, derive tractable expressions and algorithms. We fully develop normal, binomial and product binomial mixtures but the theory, computation and principles hold more generally. We observed a serious lack of sensitivity of the Bayesian information criterion, insufficient parsimony of the Akaike information criterion and a local prior, and a mixed behaviour of the singular Bayesian information criterion. We also considered overfitted mixtures; their performance was competitive but depended on tuning parameters. Under our default prior elicitation NLPs offered a good compromise between sparsity and power to detect meaningfully separated components.


