As datasets grow ever larger in scale, complexity, and variety, there is an increasing need for powerful machine learning and statistical techniques capable of learning from such data. Bayesian nonparametrics is a promising approach to data analysis that is increasingly popular in machine learning and statistics. Bayesian nonparametric models are highly flexible models with infinite-dimensional parameter spaces that can be used to directly parameterise and learn about functions, densities, conditional distributions, and other infinite-dimensional objects. This ERC-funded project aims to develop Bayesian nonparametric techniques for learning rich representations from structured data in a computationally efficient and scalable manner.
This EPSRC project, involving Yee Whye Teh (Oxford), Arnaud Doucet (Oxford), and Christophe Andrieu (Bristol), aims to develop both methodologies and theoretical foundations for scalable Markov chain Monte Carlo methods for big data. The starting point was stochastic gradient Langevin dynamics (SGLD) (Welling and Teh 2011), for which we have provided theoretical analyses in terms of both asymptotic convergence (Teh et al 2016) and weak error expansions (Vollmer et al 2016). We have also developed a range of novel scalable Monte Carlo algorithms based on a variety of techniques.
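To make the SGLD idea concrete, here is a minimal sketch (not the project's code) of the update rule from Welling and Teh (2011): at each step, the gradient of the log posterior is estimated from a random minibatch, and Gaussian noise with variance matching the step size is injected so the iterates sample from (an approximation to) the posterior rather than converging to its mode. The model below, a unit-variance Gaussian likelihood with a standard normal prior on the mean, is an illustrative assumption chosen so the posterior is known in closed form.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data: N observations from N(true_mu, 1).
N, true_mu = 10_000, 2.0
data = rng.normal(true_mu, 1.0, size=N)

def grad_log_prior(theta):
    # Standard normal prior N(0, 1): d/dtheta log p(theta) = -theta.
    return -theta

def grad_log_lik(theta, batch):
    # Unit-variance Gaussian likelihood: d/dtheta log p(x | theta) = x - theta.
    return np.sum(batch - theta)

def sgld(n_iters=5_000, batch_size=100, eps=1e-4):
    """SGLD with a fixed step size eps (constant steps give a biased but
    useful approximation; the original algorithm decreases eps over time)."""
    theta = 0.0
    samples = []
    for _ in range(n_iters):
        batch = rng.choice(data, size=batch_size, replace=False)
        # Unbiased minibatch estimate of the log-posterior gradient:
        # the likelihood term is rescaled by N / batch_size.
        grad = grad_log_prior(theta) + (N / batch_size) * grad_log_lik(theta, batch)
        # Injected noise with variance eps turns gradient ascent into
        # (approximate) Langevin sampling from the posterior.
        theta += 0.5 * eps * grad + rng.normal(0.0, np.sqrt(eps))
        samples.append(theta)
    return np.array(samples)

samples = sgld()
posterior_mean_estimate = samples[-2000:].mean()
print(posterior_mean_estimate)  # should be close to the true posterior mean
```

The per-step cost depends on the minibatch size rather than on N, which is the source of SGLD's scalability; the analyses cited above quantify the bias introduced by the stochastic gradients and the discretisation of the Langevin diffusion.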