Automated smoothing parameter estimation for quantile additive models

  • Bertrand Nortier

Student thesis: Doctoral Thesis, Doctor of Philosophy (PhD)

Abstract

Quantile Additive Models (QAMs) are statistical models where conditional quantiles of an observed variable are modelled as a sum of functions. These functions are univariate or multivariate smooth functions of a set of covariates. The model is penalized with a quadratic regularization term that controls the degree of smoothness of the functions via a vector of hyper-parameters, or smoothing parameters. Determining these smoothing parameters is particularly difficult in the case of QAMs, especially when fitting a model of extreme quantiles: the more extreme the quantile, the more imbalanced the dataset used to fit the model. In this thesis, a fully automated method to determine these smoothing parameters and the vector of parameters of the model is proposed. To achieve this goal, several contributions are made. First, a cross-validation criterion that addresses some of the shortcomings of previously proposed criteria is introduced: the Quantile Generalized Approximate Cross Validation (QGACV) criterion. Then, it is proposed to replace the exact pinball loss function by its expected value, itself replaced by a rounded surrogate loss. The degree of rounding of the surrogate loss is estimated with a method aimed at minimizing a squared distance between the surrogate loss and the expected loss, which is estimated by an empirical loss. This is followed by a method to optimize the QGACV criterion, based on graduated optimization and Generalized Iteratively Reweighted Least Squares. Finally, a method to obtain confidence intervals based on non-parametric bootstrap is introduced, together with a method to relax the smoothing parameters to obtain better coverage. The full methodology, including automated initialization methods, has been implemented in an R package named ‘qgacv’. Package ‘qgacv’ is then compared to existing R packages that can be used to fit QAMs.
The method presented is competitive and has several advantages compared to the existing methods and R packages.
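
As an illustration of the loss functions mentioned in the abstract, the following minimal Python sketch shows the exact pinball loss next to one common smooth approximation of it. The softplus-based surrogate, the function names and the parameter `eps` are assumptions chosen for illustration only; the abstract does not specify the form of the rounded surrogate used in the thesis or in the ‘qgacv’ package.

```python
import math


def pinball_loss(residual, tau):
    """Exact pinball (check) loss for a quantile level tau in (0, 1).

    residual is the observed value minus the fitted quantile.
    """
    indicator = 1.0 if residual < 0 else 0.0
    return residual * (tau - indicator)


def smoothed_pinball_loss(residual, tau, eps):
    """Illustrative smooth surrogate (an assumption, not the thesis's loss).

    Uses tau * r + eps * softplus(-r / eps), which tends to the exact
    pinball loss as eps -> 0 and is differentiable everywhere.
    """
    x = -residual / eps
    # Numerically stable softplus: log(1 + exp(x)).
    softplus = max(x, 0.0) + math.log1p(math.exp(-abs(x)))
    return tau * residual + eps * softplus


if __name__ == "__main__":
    # Compare the two losses on a few residuals for an extreme quantile.
    for r in (-2.0, -0.1, 0.0, 0.1, 2.0):
        print(r, pinball_loss(r, 0.99), smoothed_pinball_loss(r, 0.99, 0.05))
```

In this sketch a smaller `eps` gives a surrogate closer to the exact loss but with sharper curvature near zero, which is the kind of trade-off that a data-driven choice of the degree of rounding would have to balance.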
Date of Award: 23 Mar 2021
Original language: English
Awarding Institution
  • The University of Bristol
Sponsors: The Alan Turing Institute
Supervisor: Simon Wood
