Counterfactual Shapley Values for Explaining Reinforcement Learning

Yiwei Shi*, Weiru Liu

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding > Conference Contribution (Conference Proceeding)

Abstract

This paper introduces an approach based on Counterfactual Shapley Values, which enhances explainability in reinforcement learning by integrating counterfactual analysis with Shapley Values. The approach aims to quantify and compare the contributions of different state dimensions to various action choices. To more accurately analyze the impacts of these contributions, we introduce new characteristic value functions, the Counterfactual Difference based Characteristic Value functions and the Average Counterfactual Difference based Characteristic Value functions. These functions help to evaluate the differences in contributions between optimal and non-optimal actions. Experiments across several RL domains, such as GridWorld, FrozenLake, and Taxi, demonstrate the effectiveness of the Counterfactual Shapley Values method. The results show that this method not only improves transparency in complex RL systems but also quantifies the differences across various decisions.
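As a rough illustration of the idea in the abstract, the sketch below computes exact Shapley values over state dimensions for a characteristic function defined as a difference in action values. It is not the authors' implementation: the toy Q-function `toy_q`, the action names, the all-zero baseline state, the masking scheme, and the two characteristic functions `cf_difference` and `avg_cf_difference` are all assumptions made for this example, offered as one plausible reading of "counterfactual difference" and "average counterfactual difference". The paper's own definitions may differ.

```python
from itertools import combinations
from math import factorial


def shapley_values(n, v):
    """Exact Shapley values for n players and a characteristic function v(frozenset)."""
    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for k in range(n):  # k = size of the coalition that excludes player i
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = frozenset(subset)
                phi[i] += weight * (v(s | {i}) - v(s))
    return phi


# Hypothetical toy Q-function over a state (x, y, fuel); not taken from the paper.
def toy_q(state, action):
    x, y, fuel = state
    if action == "right":
        return 2.0 * x + 0.5 * fuel
    if action == "up":
        return 1.0 * y + 0.5 * fuel
    return 0.1 * fuel  # "stay"


ACTIONS = ["right", "up", "stay"]


def masked(state, baseline, coalition):
    """Keep the dimensions in the coalition; replace the rest with baseline values."""
    return tuple(state[i] if i in coalition else baseline[i] for i in range(len(state)))


def cf_difference(q, state, baseline, a_star, a_cf):
    """Assumed form: v(S) = Q(s_S, a_star) - Q(s_S, a_cf)."""
    def v(coalition):
        s = masked(state, baseline, coalition)
        return q(s, a_star) - q(s, a_cf)
    return v


def avg_cf_difference(q, state, baseline, a_star, actions):
    """Assumed form: v(S) = Q(s_S, a_star) - mean of Q(s_S, a) over the other actions."""
    alternatives = [a for a in actions if a != a_star]
    def v(coalition):
        s = masked(state, baseline, coalition)
        return q(s, a_star) - sum(q(s, a) for a in alternatives) / len(alternatives)
    return v


state = (3.0, 1.0, 4.0)
baseline = (0.0, 0.0, 0.0)  # assumed reference state for "absent" dimensions

v_pair = cf_difference(toy_q, state, baseline, "right", "up")
phi_pair = shapley_values(len(state), v_pair)

v_avg = avg_cf_difference(toy_q, state, baseline, "right", ACTIONS)
phi_avg = shapley_values(len(state), v_avg)

print("right vs up:", phi_pair)
print("right vs average alternative:", phi_avg)
```

Because the toy Q-function is additive in the state dimensions, the attributions can be read off directly: for "right" versus "up", `x` contributes about 6, `y` about -1, and `fuel` about 0, since fuel affects both actions equally. The attributions sum to `v(full state) - v(empty coalition)`, the efficiency property of Shapley values. Exact enumeration visits every coalition, so it is only practical for small state spaces such as the GridWorld, FrozenLake, and Taxi domains named in the abstract.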
Original language: English
Title of host publication: 3rd World Conference on Explainable Artificial Intelligence
Editors: Riccardo Guidotti, Ute Schmid, Luca Longo
Publisher: Springer
Pages: 169-193
Number of pages: 25
Volume: II
ISBN (Electronic): 978-3-032-08324-1
ISBN (Print): 978-3-032-08323-4
Publication status: Published - 16 Oct 2025
Event: Explainable Artificial Intelligence: Third World Conference, xAI 2025 - Istanbul, Turkey
Duration: 9 Jul 2025 to 11 Jul 2025
https://xaiworldconference.com/2025/

Publication series

Name: Communications in Computer and Information Science
Volume: 2577
ISSN (Print): 1865-0929
ISSN (Electronic): 1865-0937

Conference

Conference: Explainable Artificial Intelligence
Country/Territory: Turkey
Period: 9/07/25 to 11/07/25

Bibliographical note

Publisher Copyright:
© The Author(s) 2026.
