Single-Agent Policies for the Multi-Agent Persistent Surveillance Problem via Artificial Heterogeneity

Tom Kent*, Arthur G Richards*, Angus Johnson

*Corresponding author for this work

Research output: Contribution to conference > Conference Paper > peer-review

Abstract

Modelling and planning techniques, as well as Machine Learning techniques such as Reinforcement Learning, often find multi-agent problems difficult because the decision space grows rapidly and is made increasingly complex by the interacting agents. This paper is motivated by the question of whether we can train single-agent policies in isolation and, without the need for explicit cooperation or coordination, still successfully deploy them to multi-agent scenarios. In particular, we look at the Multi-Agent Persistent Surveillance Problem (MAPSP), which is the problem of using a number of agents to continually visit and revisit areas of a map to maximise a metric of surveillance.
We outline five distinct single-agent policies to solve the MAPSP: Reinforcement Learning (DDPG); Neuro-Evolution (NEAT); a Gradient Descent (GD) heuristic; a random heuristic; and a pre-defined ‘ploughing pattern’ (Trail). We will compare the performance and scalability of these single-agent policies when deployed on the MAPSP. Importantly, in doing so we will demonstrate an emergent property which we call the Homogeneous-Policy Convergence Cycle (HPCC), whereby agents following homogeneous policies can get stuck together, continuously repeating the same actions as the other agents, which significantly impacts performance. This paper will show that just a small amount of noise, at the state or action level, is sufficient to break this cycle, essentially creating artificially heterogeneous policies for the agents.
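The convergence cycle described in the abstract can be illustrated with a toy simulation. The sketch below is not the paper's environment or any of its five policies: it assumes a small wrap-around grid, a hypothetical greedy "visit the stalest neighbour" rule, and action-level noise modelled as an occasional random move. All names (`simulate`, `MOVES`, `staleness`) are illustrative assumptions.

```python
import random

# Fixed move order, so every agent breaks ties in exactly the same way.
MOVES = [(0, 1), (1, 0), (0, -1), (-1, 0)]


def simulate(noise, steps=200, size=8, n_agents=2, seed=0):
    """Return the fraction of steps on which all agents occupy one cell.

    Assumed toy model: each cell's staleness grows by 1 per step and is
    reset to 0 when an agent visits it. With probability `noise` an agent
    takes a random move instead of the greedy one.
    """
    rng = random.Random(seed)  # one shared generator, separate draws per agent
    staleness = [[0] * size for _ in range(size)]
    agents = [(0, 0)] * n_agents  # all agents start in the same cell
    together = 0
    for _ in range(steps):
        # Every cell ages by one time step.
        for row in staleness:
            for c in range(size):
                row[c] += 1
        # All agents choose simultaneously from the same map.
        new_positions = []
        for x, y in agents:
            neighbours = [((x + dx) % size, (y + dy) % size) for dx, dy in MOVES]
            if rng.random() < noise:
                nxt = rng.choice(neighbours)  # action-level noise
            else:
                # Greedy: stalest neighbour, first one wins on ties.
                nxt = max(neighbours, key=lambda p: staleness[p[0]][p[1]])
            new_positions.append(nxt)
        agents = new_positions
        for x, y in agents:
            staleness[x][y] = 0  # visited cells become fresh
        together += len(set(agents)) == 1
    return together / steps


if __name__ == "__main__":
    print("co-located fraction, no noise:  ", simulate(noise=0.0))
    print("co-located fraction, 10% noise: ", simulate(noise=0.1))
```

With `noise=0.0`, identical agents that start together see the same map and make the same choice at every step, so they never separate and effectively act as a single agent. Any non-zero noise gives them a chance to diverge, after which they observe different neighbourhoods and tend to follow different paths.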
Original language: English
Publication status: Published - 14 Sept 2020
Event: European Conference on Multi-Agent Systems - Thessaloniki, Greece
Duration: 14 Sept 2020 to 15 Sept 2020
Conference number: 17
https://eumas2020.csd.auth.gr/eumas2020/

Conference

Conference: European Conference on Multi-Agent Systems
Abbreviated title: EUMAS 2020
Country/Territory: Greece
City: Thessaloniki
Period: 14/09/20 to 15/09/20

  • T-B PHASE: Prosperity Partnership with Thales

    Richards, A. G. (Principal Investigator), Wilson, R. E. (Co-Investigator), Johnson, A. (Collaborator), Bullock, S. (Co-Investigator), Lawry, J. (Co-Investigator), Noyes, J. M. (Co-Investigator), Hauert, S. (Co-Investigator), Bode, N. W. F. (Co-Investigator), Pitonakova, L. (Researcher), Kent, T. (Researcher), Crosscombe, M. (Researcher), Zanatto, D. (Researcher), Alkan, B. (Researcher), Drury, K. L. (Manager), Hogg, E. (Student), Bonnell, W. D. (Student), Bennett, C. (Student), Clarke, C. E. M. (Student), Potts, M. W. (Student), Sartor, P. N. (Collaborator), Harvey, D. (Collaborator), Rayneau-Kirkhope, B. (Collaborator), Galvin, K. (Collaborator), Lam, J. (Collaborator), Barden, E. (Collaborator), Chattington, M. (Collaborator), Radanovic, M. (Researcher), Morey, E. J. (Student), Ball, M. (Co-Principal Investigator), Hunt, E. R. (Collaborator), Steane, V. (Collaborator), Reed Edworthy, J. (Collaborator) & Hart, S. G. (Student)

    1/10/17 to 31/03/23

    Project: Research
