64th ISI World Statistics Congress - Ottawa, Canada

Mixture of distributions in the symbolic context

Conference

64th ISI World Statistics Congress - Ottawa, Canada

Format: IPS Abstract

Session: IPS 406 - Advances in Symbolic Data Analysis

Thursday 20 July 2 p.m. - 3:40 p.m. (Canada/Eastern)

Abstract

Consider a sample of n objects, each described by a probability distribution. Probabilistic model-based clustering of these objects is carried out by estimating a mixture of distributions, where each component of the mixture is itself a distribution on the set of distributions. We will consider the case of a mixture of Dirichlet distributions (resp. of Dirichlet processes). We will also show that the PCA of n distributions on R^p can be done by finding the best projection of a mixture.

ABSTRACT OF THE SESSION

This session focuses on some of the most recent advances in Symbolic Data Analysis (SDA), including developments from inferential and predictive perspectives, with evidence of their applied contributions to the analysis of symbolic data in the form of distributions and intervals. Interval-valued variables have received increasing attention in data analysis, since this type of data represents either the uncertainty due to measurement error or the natural variability of the data. Methods and algorithms for handling interval-valued data are currently in high demand for analysing large amounts of aggregated data. New measures of concordance and discordance represent another line of research in SDA, particularly relevant for comparing a representation of a class of numerical and categorical data to the distribution of the descriptors of a set of classes. Almost all of the proposed methods are supported by software, mainly developed in R. Applications to real data allow us to evaluate the effectiveness of the methods.
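As a rough illustration of the mixture of Dirichlet distributions mentioned in the talk abstract, the finite-dimensional case can be written as follows. This is only a sketch under stated assumptions, not the speaker's formulation: it assumes each object i is summarised by a probability vector x_i over m categories, and the symbols K (number of clusters), \pi_k (mixing proportions), \alpha_k (Dirichlet parameters) and t_{ik} (posterior membership probabilities) are notation introduced here for illustration.

```latex
% Assumed setting: x_i = (x_{i1}, \dots, x_{im}) lies on the probability simplex,
% i.e. x_{ij} > 0 and \sum_{j=1}^{m} x_{ij} = 1.

% Mixture density with K Dirichlet components
f(x_i) = \sum_{k=1}^{K} \pi_k \, \mathrm{Dir}(x_i \mid \alpha_k),
\qquad \pi_k \ge 0, \quad \sum_{k=1}^{K} \pi_k = 1,

% Dirichlet density of component k, with \alpha_k = (\alpha_{k1}, \dots, \alpha_{km}), \alpha_{kj} > 0
\mathrm{Dir}(x_i \mid \alpha_k)
  = \frac{\Gamma\!\left(\sum_{j=1}^{m} \alpha_{kj}\right)}{\prod_{j=1}^{m} \Gamma(\alpha_{kj})}
    \prod_{j=1}^{m} x_{ij}^{\alpha_{kj} - 1},

% Posterior probability that object i belongs to cluster k
t_{ik} = \frac{\pi_k \, \mathrm{Dir}(x_i \mid \alpha_k)}
              {\sum_{l=1}^{K} \pi_l \, \mathrm{Dir}(x_i \mid \alpha_l)}.
```

Under these assumptions, a clustering is obtained by assigning each object to the component with the largest t_{ik}. The Dirichlet process case referred to in the abstract concerns distributions on more general spaces and is not covered by this finite-dimensional sketch.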