Abstract
Explaining the decisions of models is becoming pervasive in the image processing domain, whether by using post-hoc methods or by creating inherently interpretable models. While the widespread use of surrogate explainers is a welcome addition for inspecting and understanding black-box models, assessing the robustness and reliability of the explanations is key to their success. Additionally, whilst existing work in the explainability field proposes various strategies to address this problem, the challenges of working with data in the wild are often overlooked. For instance, in image classification, distortions to images can affect not only the predictions assigned by the model, but also the explanations. Given a clean and a distorted version of an image, even if the prediction probabilities are similar, the explanations may still differ. In this paper we propose a methodology to evaluate the effect of distortions on explanations by embedding perceptual distances that tailor the neighbourhoods used to train surrogate explainers. We also show that by operating in this way, we can make the explanations more robust to distortions. We generate explanations for images in the ImageNet-C dataset and demonstrate how using a perceptual distance in the surrogate explainer creates more coherent explanations for the distorted and reference images.
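To make the idea concrete, the sketch below illustrates how a perceptual distance could weight the neighbourhood of a LIME-style surrogate explainer. This is not the authors' implementation: it assumes a superpixel segmentation, binary occlusion masks and a weighted linear model, and uses SSIM from scikit-image as a stand-in perceptual distance. The `black_box_predict` function, the segmentation granularity and the kernel width are illustrative placeholders.

```python
# Minimal sketch (not the paper's implementation): a LIME-style surrogate
# whose locality weights come from a perceptual distance (SSIM) between the
# reference image and each perturbed neighbour.
import numpy as np
from skimage.segmentation import slic
from skimage.metrics import structural_similarity as ssim
from sklearn.linear_model import Ridge


def perceptual_surrogate(image, black_box_predict, class_idx,
                         n_samples=500, kernel_width=0.25, random_state=0):
    """Fit a weighted linear surrogate around `image` for one class."""
    rng = np.random.default_rng(random_state)
    segments = slic(image, n_segments=50, start_label=0)  # superpixel segmentation
    n_segments = segments.max() + 1

    # Interpretable representation: binary masks over superpixels.
    masks = rng.integers(0, 2, size=(n_samples, n_segments))
    masks[0, :] = 1  # keep the unperturbed image in the neighbourhood

    samples, weights = [], []
    for mask in masks:
        perturbed = image.copy()
        perturbed[~np.isin(segments, np.flatnonzero(mask))] = 0  # black out dropped segments
        samples.append(perturbed)

        # Perceptual distance between reference and perturbed image,
        # turned into a locality weight with an exponential kernel.
        similarity = ssim(image, perturbed, channel_axis=-1,
                          data_range=float(image.max()) - float(image.min()))
        distance = 1.0 - similarity
        weights.append(np.exp(-(distance ** 2) / (kernel_width ** 2)))

    # Black-box predictions (hypothetical model interface) for the target class.
    preds = black_box_predict(np.stack(samples))[:, class_idx]

    surrogate = Ridge(alpha=1.0)
    surrogate.fit(masks, preds, sample_weight=np.asarray(weights))
    return segments, surrogate.coef_  # per-superpixel importance scores
```

In a vanilla LIME setup the locality weights are computed from a distance in the binary interpretable space; swapping that distance for an image-space perceptual one, as above, is the kind of neighbourhood tailoring the abstract describes.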
Original language | English |
---|---|
Pages | 3717-3721 |
Number of pages | 5 |
Publication status | Published - 19 Jul 2021 |
Event | 28th IEEE International Conference on Image Processing - Anchorage, United States. Duration: 19 Sept 2021 → 22 Sept 2021. Conference number: 2021 |
Conference
Conference | 28th IEEE International Conference on Image Processing |
---|---|
Abbreviated title | ICIP |
Country/Territory | United States |
City | Anchorage |
Period | 19/09/21 → 22/09/21 |
Equipment
- HPC (High Performance Computing) and HTC (High Throughput Computing) Facilities
Alam, S. R. (Manager), Williams, D. A. G. (Manager), Eccleston, P. E. (Manager) & Greene, D. (Manager)
Facility/equipment: Facility