We consider the standard non-parametric regression model with Gaussian errors, in which the data consist of several samples. The question to be answered is whether the samples can be adequately represented by the same regression function. To do this we define, for each sample, a universal, honest and non-asymptotic confidence region for the regression function. Any subset of the samples can be represented by the same function if and only if the intersection of the corresponding confidence regions is non-empty. If the empirical supports of the samples are disjoint, then the intersection of the confidence regions is always non-empty, and a negative answer can only be obtained by placing shape or quantitative smoothness conditions on the joint approximation, or by making additional assumptions about the support points. Alternatively, a simplest joint approximating function can be calculated, which gives a measure of the cost of the joint approximation, for example the number of extra peaks required.
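The intersection criterion can be illustrated with a deliberately simplified sketch. It is not the paper's construction: it assumes a known noise level `sigma`, distinct design points within each sample, and a crude single-scale region, namely a Bonferroni band around the observations at the design points. The function names `band` and `jointly_representable` are hypothetical. Such a band is still universal, honest and non-asymptotic, since it makes no assumption on the regression function and its coverage holds for every sample size.

```python
from statistics import NormalDist


def band(sample, sigma, alpha=0.95):
    """Simultaneous confidence band for one sample.

    sample: list of (t, y) pairs with distinct design points t (assumption).
    Returns a dict mapping each design point t to (lower, upper).
    """
    n = len(sample)
    # Bonferroni critical value: for n standard normals Z_i,
    # P(max_i |Z_i| <= c) >= alpha, with no asymptotics involved.
    c = NormalDist().inv_cdf(1 - (1 - alpha) / (2 * n))
    return {t: (y - sigma * c, y + sigma * c) for t, y in sample}


def jointly_representable(samples, sigma, alpha=0.95):
    """True if some function lies in the band of every sample.

    With bands defined only at design points, the intersection is
    non-empty exactly when, at every design point, the largest lower
    bound does not exceed the smallest upper bound.
    """
    lower, upper = {}, {}
    for sample in samples:
        for t, (lo, hi) in band(sample, sigma, alpha).items():
            lower[t] = max(lower.get(t, lo), lo)
            upper[t] = min(upper.get(t, hi), hi)
    return all(lower[t] <= upper[t] for t in lower)


# Two samples that disagree strongly at the shared design point t = 0.
a = [(0, 0.0), (1, 0.0)]
b = [(0, 10.0), (1, 0.0)]
# The same values as b, but observed at design points disjoint from those of a.
b_shifted = [(2, 10.0), (3, 0.0)]

shared_support = jointly_representable([a, b], sigma=1.0)
disjoint_support = jointly_representable([a, b_shifted], sigma=1.0)
```

In this sketch `shared_support` is false, because the two bands do not overlap at t = 0, while `disjoint_support` is true: with disjoint empirical supports no design point carries two competing constraints, so the intersection cannot be empty unless shape or smoothness restrictions are added.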
|Translated title of the contribution||Quantifying the cost of simultaneous nonparametric approximation of several samples|
|Pages (from-to)||747 - 780|
|Number of pages||34|
|Journal||Electronic Journal of Statistics|
|Publication status||Published - Jan 2009|