Principal components analysis is a widely used technique that provides an optimal lower-dimensional approximation to multivariate observations. In the functional case, a new and simple characterization of elliptical distributions on separable Hilbert spaces allows us to obtain an equivalent stochastic optimality property for the principal component subspaces associated with elliptically distributed random elements. This property holds even when second moments do not exist.
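To fix ideas, one common way to formalize ellipticity on a separable Hilbert space $H$ is through the characteristic functional. The following is a hedged sketch of that standard definition (the symbols $\mu$, $\Gamma$, $\psi$ are our own notation, not taken from the talk): a random element $X$ is elliptically distributed with location $\mu \in H$ and scatter operator $\Gamma$ (self-adjoint, positive semidefinite) if

```latex
% Sketch: characteristic-functional definition of an elliptical
% random element X on a separable Hilbert space H.
\varphi_X(h) \;=\; \mathbb{E}\!\left[e^{i\langle h, X\rangle}\right]
\;=\; e^{i\langle h, \mu\rangle}\,\psi\!\left(\langle \Gamma h, h\rangle\right),
\qquad h \in H,
```

for some scalar function $\psi$. Note that nothing in this definition requires finite second moments, which is why optimality statements phrased stochastically can survive in the heavy-tailed case.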
These lower-dimensional approximations can be very useful in identifying potential outliers among high-dimensional or functional observations. In this talk we propose a new class of robust estimators for principal components. For a fixed dimension q, we robustly estimate the q-dimensional linear space that best fits the data, in the sense of minimizing the sum of coordinate-wise robust residual scale estimators. The extension to the infinite-dimensional case is also studied. By analogy with the linear regression case, we call these estimators S-estimators. Our method is consistent for elliptical random vectors, and Fisher-consistent for elliptically distributed random elements on arbitrary Hilbert spaces. Numerical experiments show that our proposal is highly competitive with existing methods when the data are generated by either finite- or infinite-rank stochastic processes. We also illustrate our approach on two real functional data sets, where the robust estimator uncovers atypical observations that would otherwise have been missed.
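As a rough illustration of the fitting criterion, minimizing the sum of coordinate-wise robust residual scales over q-dimensional subspaces, here is a minimal finite-dimensional sketch. It uses the MAD as the robust scale and a generic derivative-free optimizer; the names (`mad`, `robust_subspace`) and the optimization strategy are illustrative choices of ours, not the estimator from the talk, which is built on M-scale estimators.

```python
import numpy as np
from scipy.optimize import minimize

def mad(x):
    # Median absolute deviation, scaled for consistency at the normal.
    # Stand-in for the robust residual scale (the talk uses an M-scale).
    return 1.4826 * np.median(np.abs(x - np.median(x)))

def robust_subspace(X, q, n_starts=5, seed=0):
    """Sketch: find an orthonormal basis B (p x q) minimizing the sum,
    over coordinates, of robust scales of the projection residuals."""
    n, p = X.shape
    rng = np.random.default_rng(seed)
    Xc = X - np.median(X, axis=0)  # robust (coordinate-wise median) centering

    def objective(b_flat):
        # Re-orthonormalize the candidate basis via QR at each evaluation.
        B, _ = np.linalg.qr(b_flat.reshape(p, q))
        R = Xc - Xc @ B @ B.T  # residuals after projecting onto span(B)
        return sum(mad(R[:, j]) for j in range(p))

    best = None
    for _ in range(n_starts):  # several random starts: objective is non-convex
        b0 = rng.standard_normal(p * q)
        res = minimize(objective, b0, method="Nelder-Mead",
                       options={"maxiter": 2000, "xatol": 1e-8, "fatol": 1e-10})
        if best is None or res.fun < best.fun:
            best = res
    B, _ = np.linalg.qr(best.x.reshape(p, q))
    return B
```

Because the criterion replaces squared residuals with a bounded-influence scale, a few gross outliers barely move the fitted subspace, which is exactly what makes the residuals from the fit useful for flagging atypical observations.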
This talk is the result of recent collaborations with Graciela Boente and David Tyler.