A theory for model-based nonsmooth optimization

Student thesis: Doctoral Thesis › Doctor of Philosophy (PhD)

Abstract

We analyse and develop a gradient-free optimisation framework for functions l : Rd → R that are not assumed to be convex, differentiable, or even continuous. The central idea is to approximate l recursively by fitting a parametric family of probability distributions to it. Each iteration consists of a generalised Bayesian update step, which incorporates new information from the objective function, followed by a reprojection onto the chosen parametric family. A key feature of this framework is that the reprojection step reduces to the computation of expectations with respect to the updated distribution. In practice, these expectations can be approximated efficiently using Monte Carlo techniques, making the algorithm implementable in a gradient-free manner that requires only pointwise function evaluations. When the parametric family is taken to be Gaussian, we demonstrate that the resulting procedure admits a natural interpretation as a time-inhomogeneous gradient descent algorithm applied to a sequence of smoothed versions of the original function l. This interpretation provides both intuition and analytical leverage, connecting our method, from a theoretical point of view, to the class of (inhomogeneous) descent algorithms. In parallel with the algorithmic development, we provide a general convergence analysis of inhomogeneous stochastic gradient descent schemes. This analysis encompasses our proposed algorithm, classical Gaussian smoothing approaches, and their stochastic variants. The theory that emerges is of independent interest and is established under weak assumptions. A particularly noteworthy case is when the objective function l exhibits (bounded) discontinuities. To the best of our knowledge, these conditions are less restrictive than those typically found in the literature, thereby significantly broadening the scope of applicability of the corresponding algorithms.
To illustrate the practical value of our approach, we apply it to a challenging machine learning classification task, showing its effectiveness in scenarios where classical smoothness and convexity assumptions fail.
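The Gaussian case described in the abstract can be sketched as a generic Gaussian-smoothing scheme: gradients of the smoothed surrogate l_sigma(x) = E[l(x + sigma Z)] are estimated from pointwise evaluations of l only, and sigma is varied along the iterations, giving a time-inhomogeneous descent. The function names, step sizes, and toy objective below are illustrative assumptions, not the thesis's exact algorithm.

```python
import numpy as np

def smoothed_grad(l, x, sigma, n_samples, rng):
    """Monte Carlo estimate of the gradient of the Gaussian-smoothed
    surrogate l_sigma(x) = E[l(x + sigma Z)], Z ~ N(0, I), via the
    identity grad l_sigma(x) = E[l(x + sigma Z) Z] / sigma, which
    needs only pointwise evaluations of l (no derivatives)."""
    z = rng.standard_normal((n_samples, x.size))
    vals = np.array([l(x + sigma * zi) for zi in z])
    # subtracting l(x) as a baseline reduces the estimator's variance
    return ((vals - l(x))[:, None] * z).mean(axis=0) / sigma

def gradient_free_descent(l, x0, sigmas, step, n_samples=200, seed=0):
    """Time-inhomogeneous descent: each iterate takes a gradient step on
    a surrogate whose smoothing level sigma_k shrinks over time."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float)
    for sigma in sigmas:
        x = x - step * smoothed_grad(l, x, sigma, n_samples, rng)
    return x

# Nonsmooth, discontinuous toy objective with a bounded jump;
# its minimum is at the origin.
l = lambda x: np.abs(x).sum() + (1.0 if np.linalg.norm(x) > 1 else 0.0)
x_star = gradient_free_descent(l, x0=[2.0, -2.0],
                               sigmas=np.linspace(1.0, 0.1, 100), step=0.3)
```

Shrinking sigma trades off exploration (a heavily smoothed surrogate hides the discontinuity) against fidelity to the original objective, mirroring the inhomogeneous-descent viewpoint taken in the analysis.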
Date of Award: 20 Jan 2026
Original language: English
Awarding Institution
  • University of Bristol
Supervisors: Christophe Andrieu & Mathieu Gerber
