Counterfactual explanations of machine learning predictions: Opportunities and challenges for AI safety

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Original language: English
Title of host publication: Proceedings of the AAAI Workshop on Artificial Intelligence Safety 2019
Subtitle of host publication: co-located with the Thirty-Third AAAI Conference on Artificial Intelligence 2019 (AAAI 2019) Honolulu, Hawaii, January 27, 2019
Publisher or commissioning body: CEUR Workshop Proceedings
Number of pages: 4
Volume: 2301
Date: Accepted/In press - 26 Nov 2018
Date: Published (current) - 27 Jan 2019
Event: 2019 AAAI Workshop on Artificial Intelligence Safety, SafeAI 2019 - Honolulu, United States
Duration: 27 Jan 2019 → …

Publication series

Name: CEUR Workshop Proceedings
ISSN (Print): 1613-0073

Conference

Conference: 2019 AAAI Workshop on Artificial Intelligence Safety, SafeAI 2019
Country: United States
City: Honolulu
Period: 27/01/19 → …

Abstract

One necessary condition for creating a safe AI system is making it transparent, so that any unintended or harmful behaviour can be uncovered. Transparency can be achieved by explaining predictions of an AI system with counterfactual statements, which are becoming a de facto standard for explaining algorithmic decisions. The popularity of counterfactuals is mainly attributed to their compliance with the “right to explanation” introduced by the European Union’s General Data Protection Regulation and to their being understandable by lay audiences as well as domain experts. In this paper we describe our experience and the lessons learnt from explaining decision tree models trained on the UCI German Credit and FICO Explainable Machine Learning Challenge data sets with class-contrastive counterfactual statements. We review how counterfactual explanations can affect an artificial intelligence system and its safety by investigating their risks and benefits. We show example explanations, discuss their strengths and weaknesses, and show how they can be used to debug the underlying model, inspect its fairness and unveil the security and privacy challenges that they pose.
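
To make the idea of a class-contrastive counterfactual statement concrete, the following is a minimal sketch and not the authors' implementation. It assumes a hypothetical hand-written decision tree (`predict`), invented feature names loosely inspired by credit data, and an assumed grid of candidate values (`CANDIDATES`); it searches that grid by brute force for the smallest set of feature changes that flips the prediction.

```python
from itertools import product

# Hypothetical toy decision tree for a credit decision. The thresholds and
# feature names are illustrative assumptions, not taken from the paper.
def predict(applicant):
    if applicant["duration_months"] > 24:
        if applicant["savings"] < 500:
            return "rejected"
        return "approved"
    if applicant["credit_amount"] > 10000:
        return "rejected"
    return "approved"

# Assumed candidate values to try for each feature.
CANDIDATES = {
    "duration_months": [12, 24, 36, 48],
    "savings": [0, 500, 1000],
    "credit_amount": [2000, 5000, 12000],
}

def counterfactual(applicant, target):
    """Return a smallest set of feature changes that yields `target`.

    Returns None if no point on the candidate grid is predicted as `target`.
    """
    best = None
    names = list(CANDIDATES)
    for values in product(*(CANDIDATES[n] for n in names)):
        candidate = dict(zip(names, values))
        if predict(candidate) != target:
            continue
        # Keep only the features whose value differs from the applicant's.
        changes = {n: v for n, v in candidate.items() if applicant[n] != v}
        if best is None or len(changes) < len(best):
            best = changes
    return best

def explain(applicant, target):
    """Phrase the counterfactual as a class-contrastive statement."""
    changes = counterfactual(applicant, target)
    if changes is None:
        return f"No counterfactual for '{target}' found on the candidate grid."
    parts = [
        f"{name} been {value} rather than {applicant[name]}"
        for name, value in changes.items()
    ]
    return (
        f"Had {' and '.join(parts)}, the prediction would have been "
        f"'{target}' rather than '{predict(applicant)}'."
    )

if __name__ == "__main__":
    applicant = {"duration_months": 36, "savings": 0, "credit_amount": 5000}
    print(explain(applicant, "approved"))
```

Several equally small counterfactuals may exist for the same instance. This sketch simply returns the first one it encounters, whereas a practical explainer would need an explicit policy for choosing among them.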

Documents

  • Full-text PDF (final published version)

    Rights statement: This is the final published version of the article (version of record). It first appeared online via CEUR at http://ceur-ws.org/Vol-2301/paper_20.pdf. Please refer to any applicable terms of use of the publisher.

    Final published version, 229 KB, PDF document

    Licence: CC BY
