Abstract
Over the last decade, deep generative modelling has emerged as a powerful probabilistic tool in machine learning. The idea behind generative modelling is simple: transform noise to create new data that matches a given training data set. Such transformations must adapt to the information contained in the training data, which is high-dimensional in typical machine learning applications. Generative models, which have demonstrated outstanding empirical generation capabilities for images, video, text, and many other data modalities, share a common principle: they train deep neural networks either to approximate the transformation directly (e.g., Generative Adversarial Networks) or to approximate the characteristics of a stochastic process that dynamically evolves noise into data (e.g., diffusion models). To explain this empirical success mathematically, we face the statistical task of identifying scenarios in which the distance between the target and generated distributions converges at a minimax optimal rate in terms of the sample size as well as the intrinsic dimension and smoothness of the data distribution.
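The statistical task just described can be stated more formally. The following display is an illustrative sketch, not taken from the abstract: the generator is modelled as a pushforward map of latent noise, and the benchmark rate shown is one representative form from the nonparametric literature, with $\beta$ denoting smoothness and $d^*$ the intrinsic dimension.

```latex
% Generative modelling as pushforward estimation (illustrative sketch).
% Z ~ N(0, I_p) is latent noise; g_theta is the trained generator network.
\[
  \widehat{P}_n \;=\; \bigl(g_{\hat\theta_n}\bigr)_{\#}\,\mathcal{N}(0, I_p),
  \qquad \hat\theta_n \text{ fitted to } X_1,\dots,X_n \sim P \ \text{i.i.d.}
\]
% A representative minimax benchmark in Wasserstein-1 distance: for target
% distributions of Hoelder smoothness beta supported on a d*-dimensional set,
% rates of the following form appear (up to logarithmic factors):
\[
  \inf_{\widehat{P}_n}\ \sup_{P}\ \mathbb{E}\,
  W_1\bigl(\widehat{P}_n, P\bigr)
  \;\asymp\; n^{-\frac{\beta+1}{2\beta+d^*}} .
\]
```

The point of the display is that the exponent depends on the intrinsic dimension $d^*$ rather than the ambient dimension, which is one candidate explanation for the empirical success on high-dimensional data.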
While there has been significant progress on this question in rather idealised settings, existing statistical theory is still far from providing a convincing mathematical explanation of why deep generative models perform so well across very different tasks. Due to the complex nature of the field, answering such questions requires a concerted effort from a diverse group of researchers working in probability, nonparametric statistics, functional analysis and optimisation. The aim of this Mini-Workshop was therefore to bring these experts together to foster intensive interactions and to address the statistical challenges posed by generative modelling.