To join via Zoom: please request connection details from headsec [at] stat.ubc.ca.
Title: Optimal Transportation in the Inference of Finite Mixture Models
Abstract: Finite mixture models are widely used to model data that exhibit heterogeneity and to approximate density functions of various shapes. Learning the model is the most fundamental task. In this thesis, we 1) investigate the learning of finite location-scale mixtures, where the maximum likelihood estimate (MLE) is not well defined; 2) develop novel procedures for the distributed learning of finite mixtures; and 3) apply mixture models to approximate inference in graphical models and develop algorithms that make the inference tractable. We find that the transportation divergence, a byproduct of optimal transportation theory, is useful for all three problems.
We study the minimum Wasserstein distance estimator (MWDE) for learning finite location-scale mixtures. We show that the MWDE is well defined and consistent. Our simulation study shows that, in general, the MWDE suffers some efficiency loss relative to a penalized version of the MLE, without a noticeable gain in robustness. The MWDE is also computationally more expensive than the penalized MLE.
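To make the MWDE concrete, here is a toy one-dimensional sketch. It exploits the fact that in 1-D the 1-Wasserstein distance between two distributions equals the integral of the absolute difference of their CDFs, and minimizes that distance between the empirical CDF and a two-component Gaussian mixture. The grid resolution, starting values, and use of L-BFGS-B are illustrative choices, not the thesis's procedure.

```python
import numpy as np
from scipy import stats, optimize

rng = np.random.default_rng(0)
# synthetic data from a two-component location-scale (Gaussian) mixture
data = np.concatenate([rng.normal(-2.0, 1.0, 300), rng.normal(2.0, 0.5, 200)])

def mixture_cdf(x, w, mu1, s1, mu2, s2):
    """CDF of a two-component Gaussian mixture."""
    return w * stats.norm.cdf(x, mu1, s1) + (1 - w) * stats.norm.cdf(x, mu2, s2)

def w1_objective(theta):
    # In 1-D, W1(F_n, F_theta) = integral of |F_n - F_theta|;
    # approximate the integral on a uniform grid spanning the data.
    w, mu1, log_s1, mu2, log_s2 = theta
    grid = np.linspace(data.min() - 3, data.max() + 3, 2000)
    F_n = np.searchsorted(np.sort(data), grid, side="right") / data.size
    F_t = mixture_cdf(grid, w, mu1, np.exp(log_s1), mu2, np.exp(log_s2))
    return np.sum(np.abs(F_n - F_t)) * (grid[1] - grid[0])

# minimize the (approximate) W1 distance over the mixture parameters;
# scales are parameterized on the log scale to keep them positive
res = optimize.minimize(w1_objective,
                        x0=[0.5, -1.0, 0.0, 1.0, 0.0],
                        bounds=[(0.01, 0.99), (None, None), (-5, 5),
                                (None, None), (-5, 5)],
                        method="L-BFGS-B")
```

After fitting, `res.x` holds the estimated weight, locations, and log-scales; the attained objective `res.fun` is the estimated W1 distance between the data and the fitted mixture.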
Under finite mixture models, the general split-and-conquer approach to distributed learning cannot be applied directly because the parameter space is non-Euclidean. We develop a novel split-and-conquer approach and show that the resulting estimator is root-n consistent under some general conditions. Experiments show that the proposed approach has statistical performance comparable to that of the global estimator based on the full dataset, when the latter is feasible. It can even outperform the global estimator when the model assumption does not match the real-world data. It also has better statistical and computational performance than existing methods.
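The split step is straightforward; the difficulty lies in aggregating local mixture estimates, whose components have no canonical labeling. The toy sketch below fits a small EM on each data chunk, then aligns components by sorting their means before averaging. This naive alignment is only a stand-in for the barycenter-type aggregation developed in the thesis, and the minimal EM routine is for illustration, not production use.

```python
import numpy as np

def em_gmm(x, K, iters=100):
    """Minimal EM for a 1-D K-component Gaussian mixture (illustration only)."""
    w = np.full(K, 1.0 / K)
    mu = np.quantile(x, (np.arange(K) + 0.5) / K)   # deterministic spread-out init
    sd = np.full(K, x.std())
    for _ in range(iters):
        # E-step: posterior responsibilities (unnormalized Gaussian densities)
        dens = w * np.exp(-0.5 * ((x[:, None] - mu) / sd) ** 2) / sd
        r = dens / dens.sum(axis=1, keepdims=True)
        # M-step: weighted parameter updates
        n_k = r.sum(axis=0)
        w = n_k / x.size
        mu = (r * x[:, None]).sum(axis=0) / n_k
        sd = np.sqrt((r * (x[:, None] - mu) ** 2).sum(axis=0) / n_k)
    return w, mu, sd

rng = np.random.default_rng(1)
data = rng.permutation(np.concatenate([rng.normal(-3, 1, 2000),
                                       rng.normal(3, 1, 2000)]))

# split: fit a local mixture on each "machine's" chunk
local_fits = [em_gmm(chunk, 2) for chunk in np.array_split(data, 4)]

# conquer (toy): align components by sorting means, then average the
# aligned parameters -- a crude surrogate for an aggregation scheme
# that respects the non-Euclidean mixture parameter space
orders = [np.argsort(mu) for _, mu, _ in local_fits]
agg_mu = np.mean([mu[o] for (_, mu, _), o in zip(local_fits, orders)], axis=0)
agg_w = np.mean([w[o] for (w, _, _), o in zip(local_fits, orders)], axis=0)
```

Sorting means only works in 1-D with well-separated components; in general the aggregation must solve a matching/barycenter problem, which is where the transportation divergence enters.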
When mixtures are used in graphical models to approximate density functions, the order of the mixture grows exponentially under the recursive procedures, and the inference becomes intractable. One way to restore tractability is to approximate the mixture by one of lower order. We propose to do so by minimizing the composite transportation divergence (CTD) between the two mixtures. The resulting optimization problem can be solved by a majorization-minimization (MM) algorithm. We show that many existing approaches are special cases of ours, and that the performance of some existing algorithms can be improved by choosing suitable cost functions in the CTD.