Gradient-free optimization via integration

Research output: Working paperPreprint

2 Downloads (Pure)

Abstract

In this paper we propose a novel, general purpose, algorithm to optimize functions $l\colon \mathbb{R}^d \rightarrow \mathbb{R}$ not assumed to be convex or differentiable or even continuous. The main idea is to sequentially fit a sequence of parametric probability densities, possessing a concentration property, to $l$ using a Bayesian update followed by a reprojection back onto the chosen parametric sequence. Remarkably, with the sequence chosen to be from the exponential family, reprojection essentially boils down to the computation of expectations. Our algorithm therefore lends itself to Monte Carlo approximation, ranging from plain to Sequential Monte Carlo (SMC) methods. The algorithm is therefore particularly simple to implement and we illustrate performance on a challenging Machine Learning classification problem. Our methodology naturally extends to the scenario where only noisy measurements of $l$ are available and retains ease of implementation and performance. At a theoretical level we establish, in a fairly general scenario, that our framework can be viewed as implicitly implementing a time inhomogeneous gradient descent algorithm on a sequence of smoothed approximations of $l$. This opens the door to establishing convergence of the algorithm and provide theoretical guarantees. Along the way, we establish new results for inhomogeneous gradient descent algorithms of independent interest.
Original languageEnglish
PublisherarXiv.org
Number of pages37
DOIs
Publication statusPublished - 1 Aug 2024

Keywords

  • stat.CO

Fingerprint

Dive into the research topics of 'Gradient-free optimization via integration'. Together they form a unique fingerprint.

Cite this