Asymptotic Optimality for Decentralised Bandits

Conor Newton, A J Ganesh, Henry Reeve

Research output: Contribution to journal, Article (Academic Journal), peer-review

Abstract

We consider a large number of agents collaborating on a multi-armed bandit problem with a large number of arms. We present an algorithm which improves upon the Gossip-Insert-Eliminate method of Chawla et al. [3]. We provide a regret bound which shows that our algorithm is asymptotically optimal, and present empirical results demonstrating lower regret on simulated data.
Original language: English
Pages (from-to): 51-53
Number of pages: 3
Journal: ACM SIGMETRICS Performance Evaluation Review
Volume: 49
Issue number: 2
Publication status: Published - 20 Jan 2022
Event: Reinforcement Learning in Networks and Queues, Sigmetrics 2021
Duration: 14 Jun 2021 to 14 Jun 2021
