Mixture Models and topography of mixtures

The main result of this article states that one can get as many as *D+1* modes from just a two-component normal mixture in *D* dimensions. Multivariate mixture models are widely used for modeling homogeneous populations and for cluster analysis. Either the components directly or modes arising from these components are often used to extract individual clusters. Although in lower dimensions these strategies work well, our results show that high dimensional mixtures are often very complex and researchers should take extra precautions when using mixture models for cluster analysis. Further, our analysis shows that the number of modes depends on the component means and eigenvalues of the ratio of the two component covariance matrices, which in turn provides a clear guideline as to when one can use mixture analysis for clustering high dimensional data.

- Dan Ren
- Bader Al-Ruwali

- Bruce G. Lindsay (PhD Supervisor)
- Marianthi Markatou
- Shu-Chuan Chen

Kernels, degrees of freedom, and power properties of quadratic distance goodness-of-fit tests

Lindsay B.G., Markatou M., and Ray S. Journal of the American Statistical Association. 109 (505)

## Abstract

In this article, we study the power properties of quadratic-distance-based goodness-of-fit tests. First, we introduce the concept of a root kernel and discuss the considerations that enter the selection of this kernel. We derive an easy to use normal approximation to the power of quadratic distance goodness-of-fit tests and base the construction of a noncentrality index, an analogue of the traditional noncentrality parameter, on it. This leads to a method akin to the Neyman-Pearson lemma for constructing optimal kernels for specific alternatives. We then introduce a midpower analysis as a device for choosing optimal degrees of freedom for a family of alternatives of interest. Finally, we introduce a new diffusion kernel, called the Pearson-normal kernel, and study the extent to which the normal approximation to the power of tests based on this kernel is valid. Supplementary materials for this article are available online. © 2014 American Statistical Association.

On the number of modes of finite mixtures of elliptical distributions

Alexandrovich G., Holzmann H., and Ray S. Studies in Classification, Data Analysis, and Knowledge Organization.

## Abstract

We extend the concept of the ridgeline from Ray and Lindsay (Ann Stat 33:2042-2065, 2005) to finite mixtures of general elliptical densities with possibly distinct density generators in each component. This can be used to obtain bounds for the number of modes of two-component mixtures of t distributions in any dimension. In case of proportional dispersion matrices, these have at most three modes, while for equal degrees of freedom and equal dispersion matrices, the number of modes is at most two. We also give numerical illustrations and indicate applications to clustering and hypothesis testing. © Springer International Publishing Switzerland 2013.

On the upper bound of the number of modes of a multivariate normal mixture

Ray S. and Ren D. Journal of Multivariate Analysis. 108

## Abstract

The main result of this article states that one can get as many as D+1 modes from just a two-component normal mixture in D dimensions. Multivariate mixture models are widely used for modeling homogeneous populations and for cluster analysis. Either the components directly or modes arising from these components are often used to extract individual clusters. Although in lower dimensions these strategies work well, our results show that high dimensional mixtures are often very complex and researchers should take extra precautions when using mixture models for cluster analysis. Further, our analysis shows that the number of modes depends on the component means and eigenvalues of the ratio of the two component covariance matrices, which in turn provides a clear guideline as to when one can use mixture analysis for clustering high dimensional data. © 2012 Elsevier Inc.

Quadratic distances on probabilities: A unified foundation

Lindsay B.G., Markatou M., Ray S., Yang K.E., and Chen S.-C. Annals of Statistics. 36 (2)

## Abstract

This work builds a unified framework for the study of quadratic form distance measures as they are used in assessing the goodness of fit of models. Many important procedures have this structure, but the theory for these methods is dispersed and incomplete. Central to the statistical analysis of these distances is the spectral decomposition of the kernel that generates the distance. We show how this determines the limiting distribution of natural goodness-of-fit tests. Additionally, we develop a new notion, the spectral degrees of freedom of the test, based on this decomposition. The degrees of freedom are easy to compute and estimate, and can be used as a guide in the construction of useful procedures in this class. © Institute of Mathematical Statistics, 2008.
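As a rough illustration of the quadratic form structure mentioned in this abstract, the Python sketch below computes an empirical version of the distance $d_K(F,G)=\iint K(s,t)\,d(F-G)(s)\,d(F-G)(t)$, with both distributions replaced by the empirical measures of two samples. The Gaussian kernel, the bandwidth `h`, and the function names are assumptions made for this example only; the paper covers general nonnegative definite kernels and model-centered versions that are not shown here.

```python
import numpy as np

def gaussian_kernel(X, Y, h):
    """Gram matrix K[i, j] = exp(-||X[i] - Y[j]||^2 / (2 h^2)).

    The Gaussian form and the bandwidth h are illustrative assumptions.
    """
    sq_dist = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-sq_dist / (2.0 * h ** 2))

def quadratic_distance(X, Y, h=1.0):
    """Empirical quadratic distance between samples X (n x d) and Y (m x d).

    Expands the double integral of K against (F - G) x (F - G) into three
    averages of kernel values: within X, within Y, and across X and Y.
    """
    k_xx = gaussian_kernel(X, X, h).mean()
    k_yy = gaussian_kernel(Y, Y, h).mean()
    k_xy = gaussian_kernel(X, Y, h).mean()
    return k_xx - 2.0 * k_xy + k_yy

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 2))            # sample from N(0, I)
    Y = rng.normal(size=(200, 2))            # second sample, same law
    Z = rng.normal(size=(200, 2)) + 3.0      # sample from a shifted law
    print("same distribution:   ", quadratic_distance(X, Y))
    print("shifted distribution:", quadratic_distance(X, Z))
```

Because the kernel is nonnegative definite, the computed value cannot be negative up to rounding, and it is exactly zero when a sample is compared with itself.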

Model selection in high dimensions: A quadratic-risk-based approach

Ray S. and Lindsay B.G. Journal of the Royal Statistical Society. Series B: Statistical Methodology. 70 (1)

## Abstract

We propose a general class of risk measures which can be used for data-based evaluation of parametric models. The loss function is defined as the generalized quadratic distance between the true density and the model proposed. These distances are characterized by a simple quadratic form structure that is adaptable through the choice of a non-negative definite kernel and a bandwidth parameter. Using asymptotic results for the quadratic distances we build a quick-to-compute approximation for the risk function. Its derivation is analogous to the Akaike information criterion but, unlike the Akaike information criterion, the quadratic risk is a global comparison tool. The method does not require resampling, which is a great advantage when point estimators are expensive to compute. The method is illustrated by using the problem of selecting the number of components in a mixture model, where it is shown that, by using an appropriate kernel, the method is computationally straightforward in arbitrarily high data dimensions. In this same context it is shown that the method has some clear advantages over the Akaike information criterion and Bayesian information criterion. © 2008 Royal Statistical Society.

The topography of multivariate normal mixtures

Ray S. and Lindsay B.G. Annals of Statistics. 33 (5)

## Abstract

Multivariate normal mixtures provide a flexible method of fitting high-dimensional data. It is shown that their topography, in the sense of their key features as a density, can be analyzed rigorously in lower dimensions by use of a ridgeline manifold that contains all critical points, as well as the ridges of the density. A plot of the elevations on the ridgeline shows the key features of the mixed density. In addition, by use of the ridgeline, we uncover a function that determines the number of modes of the mixed density when there are two components being mixed. A followup analysis then gives a curvature function that can be used to prove a set of modality theorems. © Institute of Mathematical Statistics, 2005.
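The ridgeline elevation plot described in this abstract can be sketched numerically for two components. The example below assumes the two-component ridgeline curve $x^*(\alpha) = [(1-\alpha)\Sigma_1^{-1} + \alpha\Sigma_2^{-1}]^{-1}[(1-\alpha)\Sigma_1^{-1}\mu_1 + \alpha\Sigma_2^{-1}\mu_2]$ for $\alpha \in [0,1]$, evaluates the mixture density along it, and counts grid-level peaks of the resulting elevation profile. The function names, the grid size, and the peak-counting shortcut are assumptions for illustration; the shortcut is not the modality function or the curvature function from the paper.

```python
import numpy as np

def ridgeline(alpha, mu1, mu2, S1, S2):
    """Point x*(alpha) on the ridgeline; alpha=0 gives mu1, alpha=1 gives mu2."""
    P1, P2 = np.linalg.inv(S1), np.linalg.inv(S2)
    A = (1.0 - alpha) * P1 + alpha * P2
    b = (1.0 - alpha) * (P1 @ mu1) + alpha * (P2 @ mu2)
    return np.linalg.solve(A, b)

def normal_pdf(x, mu, S):
    """Multivariate normal density at a single point x."""
    d = len(mu)
    diff = x - mu
    quad = diff @ np.linalg.solve(S, diff)
    norm = np.sqrt((2.0 * np.pi) ** d * np.linalg.det(S))
    return np.exp(-0.5 * quad) / norm

def elevation(alphas, pi1, mu1, mu2, S1, S2):
    """Mixture density evaluated along the ridgeline (the elevation profile)."""
    values = []
    for a in alphas:
        x = ridgeline(a, mu1, mu2, S1, S2)
        values.append(pi1 * normal_pdf(x, mu1, S1)
                      + (1.0 - pi1) * normal_pdf(x, mu2, S2))
    return np.array(values)

def count_interior_peaks(h):
    """Number of strict local maxima of the profile at interior grid points."""
    return int(np.sum((h[1:-1] > h[:-2]) & (h[1:-1] > h[2:])))

if __name__ == "__main__":
    alphas = np.linspace(0.0, 1.0, 401)   # grid resolution is an arbitrary choice
    I2 = np.eye(2)
    mu1 = np.array([0.0, 0.0])
    for sep in (1.0, 3.0):
        mu2 = np.array([sep, 0.0])
        h = elevation(alphas, 0.5, mu1, mu2, I2, I2)
        print("separation", sep, "peaks on ridgeline:", count_interior_peaks(h))
```

Since every critical point of the mixture lies on the ridgeline, a one-dimensional scan over $\alpha$ is enough to locate candidate modes in any dimension, although a finite grid can miss peaks that are closer together than the grid spacing or that sit exactly at an endpoint.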