Functional data analysis and applications to evolutionary biology, nonparametric regression via splines, kernels and local polynomials; shapes of regression functions.
I am interested in nonparametric, or smoothing, regression and the related area of functional data analysis. In the general regression problem, each observation Y is assumed to depend on an observed variable x. In classical (parametric) regression, the form of this dependence is quite specific. For instance, one might assume that Y is linear in x, with random error. The least squares estimate of the linear relationship is easy to find, often without a calculator, and a linear dependence of Y on x is certainly easy to picture. The widespread use of computers has reduced the need for easily calculated estimates of the relationship between variables, and the development of computer graphics allows the statistician to literally look at forms of dependence that are far more complex than the classically assumed linear relationships. Instead of determining the "best-fitting" line, one can calculate and graph a "best-fitting smooth curve" through such computer-intensive techniques as spline and kernel smoothing.
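As a rough illustration of the contrast between a best-fitting line and a best-fitting smooth curve, the Python sketch below fits both to simulated data. Everything specific in it is an assumption made for the example: the sine-shaped true curve, the noise level, the Gaussian kernel, the bandwidth of 0.05, and the function names least_squares_line and kernel_smooth. The kernel estimate shown is the Nadaraya-Watson weighted average, one of several kernel smoothers.

```python
import numpy as np

def least_squares_line(x, y):
    """Return (intercept, slope) of the ordinary least squares line."""
    x_centered = x - x.mean()
    slope = np.sum(x_centered * (y - y.mean())) / np.sum(x_centered ** 2)
    intercept = y.mean() - slope * x.mean()
    return intercept, slope

def kernel_smooth(x, y, grid, bandwidth):
    """Nadaraya-Watson estimate on `grid` with a Gaussian kernel."""
    # weights[i, j] is the kernel weight of observation j at grid point i
    u = (grid[:, None] - x[None, :]) / bandwidth
    weights = np.exp(-0.5 * u ** 2)
    # Each fitted value is a locally weighted average of the y values
    return (weights @ y) / weights.sum(axis=1)

# Simulated data (an assumption for illustration): a nonlinear trend plus noise
rng = np.random.default_rng(0)
x = np.sort(rng.uniform(0.0, 1.0, 200))
y = np.sin(2 * np.pi * x) + rng.normal(0.0, 0.3, x.size)

grid = np.linspace(0.0, 1.0, 101)
truth = np.sin(2 * np.pi * grid)

intercept, slope = least_squares_line(x, y)
line_fit = intercept + slope * grid
smooth_fit = kernel_smooth(x, y, grid, bandwidth=0.05)

# Mean squared distance of each fit from the true curve on the grid
line_mse = np.mean((line_fit - truth) ** 2)
smooth_mse = np.mean((smooth_fit - truth) ** 2)
print(f"line MSE: {line_mse:.4f}, kernel smoother MSE: {smooth_mse:.4f}")
```

The bandwidth controls how local the averaging is: smaller values follow the data more closely, larger values give a smoother and eventually nearly flat curve. In practice it would be chosen from the data, for example by cross-validation, rather than fixed in advance as it is here.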
Functional data analysis is a suite of statistical methods for analyzing families of curves. Often a data set consists of observations from a group of curves, as opposed to a group of points or vectors. One can analyze the observations curve by curve, smoothing the data for one curve independently of all other curves, and then try to pool information across curves. However, a less ad hoc, more model-based approach will make more use of the information available in the data.
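The curve-by-curve strategy described above can be sketched in a few lines of Python. This is only an illustration of that two-step approach, not of the model-based alternative. The simulated curves, the Gaussian kernel smoother, the bandwidth, and all names are assumptions made for the example.

```python
import numpy as np

def kernel_smooth(t_obs, y_obs, grid, bandwidth):
    """Nadaraya-Watson smoother with a Gaussian kernel, evaluated on `grid`."""
    u = (grid[:, None] - t_obs[None, :]) / bandwidth
    weights = np.exp(-0.5 * u ** 2)
    return (weights @ y_obs) / weights.sum(axis=1)

rng = np.random.default_rng(1)
n_curves, n_obs = 30, 50
grid = np.linspace(0.0, 1.0, 41)   # common grid shared by all curves

# Step 1: smooth each curve separately, ignoring all the other curves
smoothed = np.empty((n_curves, grid.size))
for i in range(n_curves):
    # Each simulated individual has its own observation times and amplitude
    t_obs = np.sort(rng.uniform(0.0, 1.0, n_obs))
    amplitude = 1.0 + 0.3 * rng.normal()
    y_obs = amplitude * np.sin(np.pi * t_obs) + rng.normal(0.0, 0.1, n_obs)
    smoothed[i] = kernel_smooth(t_obs, y_obs, grid, bandwidth=0.08)

# Step 2: pool across curves only after smoothing
mean_curve = smoothed.mean(axis=0)             # pointwise mean curve
cov_surface = np.cov(smoothed, rowvar=False)   # covariance between grid points
print(mean_curve.shape, cov_surface.shape)
```

Because each curve is smoothed in isolation, the smoothing error in step 1 is carried into the pooled mean and covariance without being accounted for, which is one sense in which this approach is ad hoc.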
I've used functional data analysis techniques in many application areas. I've worked with evolutionary biologists, where each curve gives the value of a physical trait as a function of age for an individual. One goal of the curve analysis is to determine how the correlation structure of the trait over time can be used in breeding for particular values, e.g. how the correlation structure of lactation over time can be used to breed dairy cows with maximal milk production. I have several co-authored papers on developing functional data analysis techniques for the analysis of energy consumption, where each curve gives energy consumed as a function of time. Currently, a student and I are learning about biologging devices used to track the behavior and position of marine mammals. Here, a curve is a position coordinate as a function of time.