Statistical Seminar
Organizer:
Yunan Wu 吴宇楠 (YMSC)
Speaker:
Wei Luo 骆威
Center for Data Science, Zhejiang University
Time:
Mon., 14:00-15:00, Sept. 22, 2025
Venue:
C548, Shuangqing Complex Building A
Title:
Facilitating model-based clustering by dimension reduction
Abstract:
The Gaussian mixture model (GMM) is widely used for cluster analysis. It is commonly fitted by maximum likelihood, which is computationally challenging because the optimization is non-convex, especially as the dimensionality grows. To address this issue, we propose a two-step approach that recovers the intrinsic low-dimensional structure of the GMM under additional constraints on its heterogeneity; that is, there exists a low-dimensional linear transformation of the data, given which the rest of the data are normally distributed and thus redundant for clustering. Our approach first recovers the desired low-dimensional data using Stein's Lemma and then fits the GMM to the reduced data only. Its computational efficiency comes from both the lower dimensionality and the denoising of the data. Under a sparsity assumption on the clustering pattern, our approach can be generalized to high-dimensional settings. With the aid of a newly constructed pseudo-response, it can also be embedded into a general framework of sufficient dimension reduction, which encompasses a wider class of methods beyond Stein's Lemma for recovering the low-dimensional structure of the GMM. These findings are illustrated by numerical studies at the end.
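As a rough illustration of the two-step idea in the abstract, the Python sketch below first estimates a low-dimensional direction and then fits a mixture to the reduced data only. It is not the speaker's estimator. It swaps in a simple fourth-moment identity that follows from Stein's Lemma: for a standard normal vector Z in R^p, E[||Z||^2 Z Z^T] = (p + 2) I, so the deviation from this identity vanishes along directions in which the whitened data are Gaussian. The function names `stein_directions` and `fit_gmm_1d`, the simulated two-cluster data, and all parameter values are assumptions made for illustration.

```python
import numpy as np

def stein_directions(X, d):
    """Estimate d clustering directions (hypothetical helper, illustration only)."""
    n, p = X.shape
    Xc = X - X.mean(axis=0)
    cov = Xc.T @ Xc / n
    # Whiten the data with the symmetric inverse square root of the covariance.
    evals, evecs = np.linalg.eigh(cov)
    W = evecs @ np.diag(evals ** -0.5) @ evecs.T
    Z = Xc @ W
    # Stein's Lemma implies E[||Z||^2 Z Z^T] = (p + 2) I for standard normal Z,
    # so M is (approximately) zero along purely Gaussian directions.
    sq_norm = (Z ** 2).sum(axis=1)
    M = (Z * sq_norm[:, None]).T @ Z / n - (p + 2) * np.eye(p)
    vals, vecs = np.linalg.eigh(M)
    top = np.argsort(-np.abs(vals))[:d]
    # Map back so that the reduced data are Xc @ B.
    return W @ vecs[:, top]

def fit_gmm_1d(y, n_iter=200):
    """Plain EM for a two-component univariate Gaussian mixture (illustration only)."""
    mu = np.quantile(y, [0.25, 0.75])   # crude initial means
    var = np.full(2, y.var())
    w = np.full(2, 0.5)
    for _ in range(n_iter):
        # E-step: posterior membership probabilities.
        dens = w * np.exp(-0.5 * (y[:, None] - mu) ** 2 / var) / np.sqrt(2 * np.pi * var)
        resp = dens / dens.sum(axis=1, keepdims=True)
        # M-step: update weights, means, and variances.
        nk = resp.sum(axis=0)
        w = nk / len(y)
        mu = (resp * y[:, None]).sum(axis=0) / nk
        var = (resp * (y[:, None] - mu) ** 2).sum(axis=0) / nk
    return w, mu, var, resp.argmax(axis=1)

# Simulated data (assumed setup): two clusters that differ only in coordinate 0.
rng = np.random.default_rng(0)
n, p = 20000, 5
labels = rng.integers(0, 2, size=n)
X = rng.standard_normal((n, p))
X[:, 0] += 3.0 * (2 * labels - 1)

# Step 1: recover the one-dimensional structure. Step 2: fit the mixture on it.
B = stein_directions(X, d=1)
y = (X - X.mean(axis=0)) @ B[:, 0]
w, mu, var, pred = fit_gmm_1d(y)

# Compare with the known simulation truth (cluster labels may be swapped).
alignment = abs(B[0, 0]) / np.linalg.norm(B[:, 0])
agree = np.mean(pred == labels)
accuracy = max(agree, 1 - agree)
print(f"alignment with true direction: {alignment:.3f}, accuracy: {accuracy:.3f}")
```

The second step here is a hand-written EM in one dimension to keep the sketch self-contained; in practice a library mixture fitter such as scikit-learn's `GaussianMixture` could be applied to the reduced data instead. The fourth-moment test function is only one convenient choice and cannot detect directions whose excess kurtosis happens to be zero.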