TY - UNPB
T1 - Gradient-free optimization via integration
AU - Andrieu, Christophe
AU - Chopin, Nicolas
AU - Fincato, Ettore
AU - Gerber, Mathieu
PY - 2024/08/01
Y1 - 2024/08/01
N2 - In this paper we propose a novel, general-purpose algorithm to optimize functions $l\colon \mathbb{R}^d \rightarrow \mathbb{R}$ that are not assumed to be convex, differentiable, or even continuous. The main idea is to sequentially fit a sequence of parametric probability densities, possessing a concentration property, to $l$ using a Bayesian update followed by a reprojection back onto the chosen parametric sequence. Remarkably, when the sequence is chosen from the exponential family, reprojection essentially boils down to the computation of expectations. Our algorithm therefore lends itself to Monte Carlo approximation, ranging from plain to Sequential Monte Carlo (SMC) methods. The algorithm is thus particularly simple to implement, and we illustrate its performance on a challenging machine learning classification problem. Our methodology naturally extends to the scenario where only noisy measurements of $l$ are available, and retains its ease of implementation and performance. At a theoretical level we establish, in a fairly general scenario, that our framework can be viewed as implicitly implementing a time-inhomogeneous gradient descent algorithm on a sequence of smoothed approximations of $l$. This opens the door to establishing convergence of the algorithm and to providing theoretical guarantees. Along the way, we establish new results for inhomogeneous gradient descent algorithms that are of independent interest.
AB - In this paper we propose a novel, general-purpose algorithm to optimize functions $l\colon \mathbb{R}^d \rightarrow \mathbb{R}$ that are not assumed to be convex, differentiable, or even continuous. The main idea is to sequentially fit a sequence of parametric probability densities, possessing a concentration property, to $l$ using a Bayesian update followed by a reprojection back onto the chosen parametric sequence. Remarkably, when the sequence is chosen from the exponential family, reprojection essentially boils down to the computation of expectations. Our algorithm therefore lends itself to Monte Carlo approximation, ranging from plain to Sequential Monte Carlo (SMC) methods. The algorithm is thus particularly simple to implement, and we illustrate its performance on a challenging machine learning classification problem. Our methodology naturally extends to the scenario where only noisy measurements of $l$ are available, and retains its ease of implementation and performance. At a theoretical level we establish, in a fairly general scenario, that our framework can be viewed as implicitly implementing a time-inhomogeneous gradient descent algorithm on a sequence of smoothed approximations of $l$. This opens the door to establishing convergence of the algorithm and to providing theoretical guarantees. Along the way, we establish new results for inhomogeneous gradient descent algorithms that are of independent interest.
KW - stat.CO
U2 - 10.48550/arXiv.2408.00888
DO - 10.48550/arXiv.2408.00888
M3 - Preprint
BT - Gradient-free optimization via integration
PB - arXiv.org
ER -