Contrastive Visual Explanations for Reinforcement Learning via Counterfactual Rewards

Xiaowei Liu*, Kevin McAreavey, Weiru Liu

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference Contribution (Conference Proceeding)

Abstract

Causal attribution aided by counterfactual reasoning is recognised as a key feature of human explanation. In this paper we propose a post-hoc contrastive explanation framework for reinforcement learning (RL) based on comparing learned policies under actual environmental rewards vs. hypothetical (counterfactual) rewards. The framework provides policy-level explanations by accessing learned Q-functions and identifying intersecting critical states. Global explanations are generated to summarise policy behaviour through the visualisation of sub-trajectories based on these states, while local explanations are based on the action-values in states. We conduct experiments on several grid-world examples. Our results show that it is possible to explain the difference between learned policies based on Q-functions. This demonstrates the potential for more informed human decision-making when deploying policies and highlights the possibility of developing further XAI techniques in RL.
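
The core comparison described in the abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not taken from the paper: the seven-cell corridor environment, the reward values, the names `q_value_iteration` and `disagreement_states`, and the use of exact Q-value iteration as a stand-in for a learned Q-function. In particular, "states where the two greedy policies disagree" is only one plausible reading of "critical states"; the abstract does not define the term.

```python
import numpy as np

N_STATES = 7                    # hypothetical corridor: cells 0..6, both ends terminal
ACTIONS = ("left", "right")
GAMMA = 0.9


def q_value_iteration(reward_on_entry, n_sweeps=100):
    """Return an (N_STATES, 2) Q-table for a deterministic corridor.

    reward_on_entry[s] is the reward received for moving into cell s.
    Exact Q-value iteration stands in for a learned Q-function here.
    """
    q = np.zeros((N_STATES, len(ACTIONS)))
    for _ in range(n_sweeps):
        new_q = np.zeros_like(q)
        for s in range(1, N_STATES - 1):        # terminal cells keep Q = 0
            for a, step in enumerate((-1, +1)):
                nxt = s + step
                new_q[s, a] = reward_on_entry[nxt] + GAMMA * q[nxt].max()
        q = new_q
    return q


def disagreement_states(q_actual, q_counter):
    """Non-terminal states where the two greedy policies pick different actions."""
    greedy_actual = q_actual.argmax(axis=1)
    greedy_counter = q_counter.argmax(axis=1)
    return [s for s in range(1, N_STATES - 1)
            if greedy_actual[s] != greedy_counter[s]]


# Actual rewards: small payoff at the left end, large payoff at the right end.
actual_rewards = np.zeros(N_STATES)
actual_rewards[0], actual_rewards[6] = 0.5, 1.0

# Counterfactual rewards: "what if the left end paid 0.9 instead of 0.5?"
counter_rewards = actual_rewards.copy()
counter_rewards[0] = 0.9

q_actual = q_value_iteration(actual_rewards)
q_counter = q_value_iteration(counter_rewards)
states = disagreement_states(q_actual, q_counter)

# Local, contrastive view: action-values under both reward functions.
for s in states:
    print(f"state {s}: actual Q={q_actual[s].round(3)}, "
          f"counterfactual Q={q_counter[s].round(3)}")
```

In this toy corridor the two greedy policies disagree only in the cells nearest the left end, and the printed action-values show how far the counterfactual reward shifts the preference there. A visual, trajectory-based summary of the kind the abstract describes would build on such states, but that step is not sketched here.
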
Original language: English
Title of host publication: Explainable Artificial Intelligence
Subtitle of host publication: First World Conference, xAI 2023, Lisbon, Portugal, July 26–28, 2023, Proceedings, Part II
Editors: Luca Longo
Publisher: Springer
Pages: 72-87
Number of pages: 16
Volume: 2
Edition: 1
ISBN (Electronic): 978-3-031-44067-0
ISBN (Print): 978-3-031-44066-3
Publication status: Published - 21 Oct 2023
Event: First World Conference on Explainable Artificial Intelligence - Cultural Centre of Belem, Lisbon, Portugal
Duration: 26 Jul 2023 – 28 Jul 2023
https://xaiworldconference.com/

Publication series

Name: Communications in Computer and Information Science
Volume: 1902 CCIS
ISSN (Print): 1865-0929
ISSN (Electronic): 1865-0937

Conference

Conference: First World Conference on Explainable Artificial Intelligence
Abbreviated title: xAI 2023
Country/Territory: Portugal
City: Lisbon
Period: 26/07/23 – 28/07/23

Bibliographical note

Funding Information:
The authors would like to thank the anonymous reviewers for their valuable comments. This work is partially funded by the EPSRC CHAI project (EP/T026820/1).

Publisher Copyright:
© 2023, The Author(s), under exclusive license to Springer Nature Switzerland AG.

Keywords

  • Explainable reinforcement learning
  • Contrastive explanations
  • Counterfactuals
  • Visual explanations
