Explainers in the Wild: Making Surrogate Explainers Robust to Distortions Through Perception

Research output: Contribution to conference › Conference Paper › peer-review

Abstract

Explaining the decisions of models is becoming pervasive in the image processing domain, whether by using post-hoc methods or by creating inherently interpretable models. While the widespread use of surrogate explainers is a welcome addition for inspecting and understanding black-box models, assessing the robustness and reliability of the explanations is key to their success. Additionally, whilst existing work in the explainability field proposes various strategies to address this problem, the challenges of working with data in the wild are often overlooked. For instance, in image classification, distortions to images can affect not only the predictions assigned by the model but also the explanation. Given a clean and a distorted version of an image, even if the prediction probabilities are similar, the explanation may still be different. In this paper we propose a methodology to evaluate the effect of distortions on explanations by embedding perceptual distances that tailor the neighbourhoods used to train surrogate explainers. We also show that by operating in this way, we can make the explanations more robust to distortions. We generate explanations for images in the ImageNet-C dataset and demonstrate how using a perceptual distance in the surrogate explainer creates more coherent explanations for the distorted and reference images.
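As a rough illustration of the idea in the abstract, the sketch below fits a LIME-style weighted linear surrogate in which the distance used to weight the neighbourhood is pluggable. It is not the paper's implementation: every function name is an assumption made for this example, the "image" is a flat list of four pixels with one pixel per segment, and `toy_perceptual` is a hypothetical stand-in (it merely ignores global brightness shifts) rather than a real perceptual metric.

```python
import math
import random


def euclidean(a, b):
    """Plain pixel-wise distance."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def toy_perceptual(a, b):
    """Hypothetical stand-in for a perceptual metric: compares mean-removed
    signals, so a global brightness shift counts as no change."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    return euclidean([x - ma for x in a], [y - mb for y in b])


def kernel_weights(reference, samples, distance, width=1.0):
    """Exponential kernel: samples closer to the reference weigh more."""
    return [math.exp(-(distance(reference, s) / width) ** 2) for s in samples]


def solve(A, b):
    """Gaussian elimination with partial pivoting for a small dense system."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))) / M[i][i]
    return x


def fit_surrogate(masks, outputs, weights, ridge=1e-3):
    """Weighted ridge regression on the binary masks.
    Returns one coefficient per segment followed by the intercept."""
    X = [list(m) + [1.0] for m in masks]
    d = len(X[0])
    A = [[sum(w * x[i] * x[j] for w, x in zip(weights, X))
          + (ridge if i == j else 0.0) for j in range(d)] for i in range(d)]
    b = [sum(w * x[i] * y for w, x, y in zip(weights, X, outputs))
         for i in range(d)]
    return solve(A, b)


def explain(image, black_box, distance, n_samples=200, seed=0):
    """Sample on/off masks, weight each perturbed image by its distance to
    the original, and return the surrogate's per-segment coefficients."""
    rng = random.Random(seed)
    masks = [[rng.randint(0, 1) for _ in image] for _ in range(n_samples)]
    perturbed = [[p * m for p, m in zip(image, mask)] for mask in masks]
    weights = kernel_weights(image, perturbed, distance)
    outputs = [black_box(z) for z in perturbed]
    return fit_surrogate(masks, outputs, weights)[:-1]
```

Calling `explain(image, model, toy_perceptual)` and `explain(image, model, euclidean)` on a clean image and on a distorted copy is one way to compare how the choice of distance changes the resulting coefficients. In this toy setting only the neighbourhood weighting differs between the two calls; the sampling and the surrogate model are identical.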
Original language: English
Pages: 3717
Number of pages: 3721
Publication status: Published - 19 Jul 2021
Event: 28th IEEE International Conference on Image Processing - Anchorage, United States
Duration: 19 Sept 2021 to 22 Sept 2021
Conference number: 2021

Conference

Conference: 28th IEEE International Conference on Image Processing
Abbreviated title: ICIP
Country/Territory: United States
City: Anchorage
Period: 19/09/21 to 22/09/21