64th ISI World Statistics Congress - Ottawa, Canada

Continuous-time multivariate analysis

Author

Philip Reiss

Co-author

  • Biplab Paul
  • Erjia Cui

Conference

64th ISI World Statistics Congress - Ottawa, Canada

Format: IPS Abstract

Keywords: PCA, splines

Session: IPS 268 - Functional and High-dimensional Data Analysis: New Directions and Innovations

Monday 17 July 2 p.m. - 3:40 p.m. (Canada/Eastern)

Abstract

The starting point for much of multivariate analysis (MVA) is an n × p data matrix whose n rows represent observations and whose p columns represent variables. Some multivariate data sets, however, may be best conceptualized not as n discrete p-variate observations, but as p curves or functions defined on a common time interval. Such a viewpoint may be useful for multivariate data observed at very high time resolution, with unequal time intervals, and/or with substantial missingness. We introduce a framework for extending techniques of multivariate analysis to such settings. The proposed framework rests on the assumption that the curves can be represented as linear combinations of basis functions such as B-splines. This is formally identical to the Ramsay-Silverman representation of functional data; but whereas functional data analysis extends MVA to the case of observations that are curves rather than vectors – heuristically, n × p data with p infinite – we are instead concerned with what happens when n is infinite. We demonstrate a new R package that translates the classical MVA methods of principal component analysis, Fisher’s linear discriminant analysis, and k-means clustering to the above continuous-time setting. The methods are illustrated with a novel perspective on the well-known Canadian weather data set, as well as with applications to neurobiological and environmetric data.
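
To make the basis-function idea concrete, the following is a minimal sketch of continuous-time principal component analysis. It is an illustration under stated assumptions, not the interface or the implementation of the R package mentioned in the abstract. It assumes that the p curves are already expressed as x(t) = C φ(t), where C is a p × K coefficient matrix and φ is a vector of K cubic B-spline basis functions. The mean and covariance are then time averages over the interval of length T rather than averages over n rows: the mean is C m with m = (1/T) ∫ φ(t) dt, and the covariance is C (G − m mᵀ) Cᵀ with G = (1/T) ∫ φ(t) φ(t)ᵀ dt. The function names `bspline_basis` and `continuous_time_pca`, the knot placement, and the random coefficients are all hypothetical choices made for this example.

```python
import numpy as np
from scipy.interpolate import BSpline


def bspline_basis(interior_knots, t_min, t_max, degree=3):
    """Build a clamped knot vector and a vector-valued spline whose
    k-th output component is the k-th B-spline basis function."""
    knots = np.concatenate(
        ([t_min] * (degree + 1), interior_knots, [t_max] * (degree + 1))
    )
    n_basis = len(knots) - degree - 1
    # Identity coefficients: evaluating returns all basis functions at once.
    basis = BSpline(knots, np.eye(n_basis), degree)
    return knots, basis


def continuous_time_pca(coefs, knots, basis, degree=3):
    """PCA of p curves x(t) = coefs @ phi(t), with time playing the role of rows.

    coefs has shape (p, K). Returns eigenvalues (descending), loadings
    (columns), the continuous-time mean vector, and the covariance matrix.
    """
    breaks = np.unique(knots)
    # Gauss-Legendre with degree + 1 nodes per knot interval integrates
    # products of two degree-`degree` polynomial pieces exactly.
    nodes, weights = np.polynomial.legendre.leggauss(degree + 1)
    a, b = breaks[:-1], breaks[1:]
    half = (b - a) / 2.0
    t = (half[:, None] * nodes[None, :] + ((a + b) / 2.0)[:, None]).ravel()
    w = (half[:, None] * weights[None, :]).ravel()

    phi = basis(t)                       # shape (Q, K)
    length = breaks[-1] - breaks[0]      # T, the length of the time interval
    m = phi.T @ w / length               # (1/T) * integral of phi
    gram = (phi * w[:, None]).T @ phi / length  # (1/T) * integral of phi phi^T

    mean = coefs @ m
    cov = coefs @ (gram - np.outer(m, m)) @ coefs.T
    evals, evecs = np.linalg.eigh(cov)   # eigh returns ascending order
    order = np.argsort(evals)[::-1]
    return evals[order], evecs[:, order], mean, cov


# Hypothetical example: p = 4 curves on [0, 1] with random spline coefficients.
rng = np.random.default_rng(0)
knots, basis = bspline_basis(np.linspace(0.0, 1.0, 6)[1:-1], 0.0, 1.0)
coefs = rng.normal(size=(4, len(knots) - 4))
evals, loadings, mean, cov = continuous_time_pca(coefs, knots, basis)
```

In practice the coefficient matrix would come from smoothing the observed series, which is where unequal time spacing and missingness are absorbed; once C is available, the p × p eigenproblem is the same size as in classical PCA.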