64th ISI World Statistics Congress - Ottawa, Canada

Symbolic t-SNE and UMAP methods for interval type variables.

Conference

64th ISI World Statistics Congress - Ottawa, Canada

Format: IPS Abstract

Session: IPS 406 - Advances in Symbolic Data Analysis

Thursday 20 July 2 p.m. - 3:40 p.m. (Canada/Eastern)

Abstract

UMAP (Uniform Manifold Approximation and Projection) is a relatively recent method for dimension reduction. It improves on t-SNE (t-Distributed Stochastic Neighbor Embedding) for data visualization and dimensionality reduction: its main advantage is that it preserves global structure better than t-SNE while also running faster. This makes UMAP well suited to the hyper-rectangles that form the rows of a symbolic data table with interval-type variables, since UMAP compresses the structure inside each hyper-rectangle well while better preserving the global structure of the clusters generated by the hyper-rectangles. This paper presents adapted versions of the t-SNE and UMAP methods for interval-type variables. In addition, R and Python code for both generalizations is presented.
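
To make the hyper-rectangle view of an interval-valued table concrete, the following minimal Python sketch expands each row of p intervals into its 2^p vertices. This is only one common representation in symbolic data analysis and is not claimed to be the adaptation presented in this paper; the function names and the toy table are hypothetical. The resulting point cloud, with its row labels, is the kind of input a standard t-SNE or UMAP implementation could then embed.

```python
from itertools import product


def hyperrectangle_vertices(intervals):
    """Return all 2^p vertices of the hyper-rectangle given by p (low, high) intervals."""
    for low, high in intervals:
        if low > high:
            raise ValueError(f"invalid interval: ({low}, {high})")
    # The Cartesian product of the interval endpoints enumerates every corner.
    return [list(vertex) for vertex in product(*intervals)]


def vertex_matrix(table):
    """Stack the vertices of every row; labels record the source row of each vertex."""
    points, labels = [], []
    for row_index, intervals in enumerate(table):
        vertices = hyperrectangle_vertices(intervals)
        points.extend(vertices)
        labels.extend([row_index] * len(vertices))
    return points, labels


# Hypothetical toy table: 2 symbolic objects described by 2 interval-type variables.
table = [
    [(1.0, 2.0), (10.0, 12.0)],
    [(3.0, 5.0), (11.0, 15.0)],
]

points, labels = vertex_matrix(table)
for label, point in zip(labels, points):
    print(label, point)
```

The number of vertices grows as 2^p, so this expansion is only practical for a modest number of interval variables; a degenerate interval (low equal to high) simply yields repeated vertices.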