Abstract
Causal attribution aided by counterfactual reasoning is recognised as a key feature of human explanation. In this paper we propose a post-hoc contrastive explanation framework for reinforcement learning (RL) based on comparing learned policies under actual environmental rewards vs. hypothetical (counterfactual) rewards. The framework provides policy-level explanations by accessing learned Q-functions and identifying intersecting critical states. Global explanations are generated to summarise policy behaviour through the visualisation of sub-trajectories based on these states, while local explanations are based on the action-values in states. We conduct experiments on several grid-world examples. Our results show that it is possible to explain the difference between learned policies based on Q-functions. This demonstrates the potential for more informed human decision-making when deploying policies and highlights the possibility of developing further XAI techniques in RL.
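The comparison described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the six-cell corridor, the two reward vectors, the use of Q-value iteration in place of a learned Q-function, and the working definition of a critical state (a non-terminal state where the greedy actions under the two Q-functions disagree) are all assumptions made for illustration.

```python
# Illustrative sketch only: environment, rewards and the "critical state"
# criterion are assumptions, not taken from the paper.
GAMMA = 0.9
N_STATES = 6                      # corridor cells 0..5
ACTIONS = (-1, +1)                # action 0 = left, action 1 = right
TERMINAL = {0, N_STATES - 1}      # both ends of the corridor end the episode


def q_value_iteration(reward, sweeps=100):
    """Compute Q(s, a) for a deterministic corridor.

    reward[s] is received on entering cell s. Value iteration stands in
    for the learned Q-function so that the example is deterministic.
    """
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(sweeps):
        for s in range(N_STATES):
            if s in TERMINAL:
                continue
            for a, step in enumerate(ACTIONS):
                nxt = s + step
                future = 0.0 if nxt in TERMINAL else GAMMA * max(q[nxt])
                q[s][a] = reward[nxt] + future
    return q


def greedy(q, s):
    """Index of the action with the highest Q-value in state s."""
    return max(range(len(ACTIONS)), key=lambda a: q[s][a])


def critical_states(q_actual, q_counter):
    """Non-terminal states where the two greedy policies disagree."""
    return [
        s for s in range(N_STATES)
        if s not in TERMINAL and greedy(q_actual, s) != greedy(q_counter, s)
    ]


# Actual reward: only the right end pays. Counterfactual: both ends pay.
actual_reward = [0.0, 0.0, 0.0, 0.0, 0.0, 1.0]
counter_reward = [1.0, 0.0, 0.0, 0.0, 0.0, 1.0]

q_actual = q_value_iteration(actual_reward)
q_counter = q_value_iteration(counter_reward)
critical = critical_states(q_actual, q_counter)

# Local, contrastive view: action-values side by side in each critical state.
for s in critical:
    print(f"state {s}: actual Q={q_actual[s]}, counterfactual Q={q_counter[s]}")
```

In this toy setting the two policies disagree only in the cells nearest the left end, where the hypothetical reward makes moving left worthwhile. A global explanation in the spirit of the abstract would then visualise sub-trajectories through those states, which this sketch does not attempt.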
Original language | English |
---|---|
Title of host publication | Explainable Artificial Intelligence |
Subtitle of host publication | First World Conference, xAI 2023, Lisbon, Portugal, July 26–28, 2023, Proceedings, Part II |
Editors | Luca Longo |
Publisher | Springer |
Pages | 72-87 |
Number of pages | 16 |
Volume | 2 |
Edition | 1 |
ISBN (Electronic) | 978-3-031-44067-0 |
ISBN (Print) | 978-3-031-44066-3 |
Publication status | Published - 21 Oct 2023 |
Event | First World Conference on Explainable Artificial Intelligence, Cultural Centre of Belem, Lisbon, Portugal. Duration: 26 Jul 2023 → 28 Jul 2023. https://xaiworldconference.com/ |
Publication series
Name | Communications in Computer and Information Science |
---|---|
Volume | 1902 CCIS |
ISSN (Print) | 1865-0929 |
ISSN (Electronic) | 1865-0937 |
Conference
Conference | First World Conference on Explainable Artificial Intelligence |
---|---|
Abbreviated title | xAI 2023 |
Country/Territory | Portugal |
City | Lisbon |
Period | 26/07/23 → 28/07/23 |
Bibliographical note
Funding Information: The authors would like to thank the anonymous reviewers for their valuable comments. This work is partially funded by the EPSRC CHAI project (EP/T026820/1).
Publisher Copyright:
© 2023, The Author(s), under exclusive license to Springer Nature Switzerland AG.
Keywords
- Explainable reinforcement learning
- Contrastive explanations
- Counterfactuals
- Visual explanations
Fingerprint
Dive into the research topics of 'Contrastive Visual Explanations for Reinforcement Learning via Counterfactual Rewards'. Together they form a unique fingerprint.
Projects
- 1 Finished
- 8463 EP/T026707/1 CHAI: Cyber Hygiene in AI enabled domestic life
Liu, W. (Principal Investigator)
1/12/20 → 28/02/24
Project: Research