Nonparametric modelling for classification of high-dimensional data and multiple hypothesis testing
by Professor Peter Hall
Abstract: Problems involving classification of high-dimensional data, and "highly multiple" hypothesis testing, arise frequently in the analysis of genetic data and complex signals. Their theoretical elucidation raises challenges, however. We address this issue by interpreting small samples of high-dimensional data as small numbers of replicates of long segments of nonstationary time-series.
Depending to some extent on how erratic the time-series are, important features of classifiers, or of multiple hypothesis testing procedures, can be accessed by exploring properties of time-series models. For example, it can be shown that, in the context of multiple hypothesis testing, the assumption of independence is much less of an issue in high-dimensional settings than in conventional, low-dimensional ones.
This is particularly true when the null distributions of test statistics are relatively light-tailed, for instance when they can plausibly be based on Normal approximations.
Similar arguments can be employed to explore other aspects of the analysis of high-dimensional data.
For More Information: Dr Owen Jones: firstname.lastname@example.org