Systems based on artificial intelligence and machine learning models should be transparent, in the sense of being capable of explaining their decisions to gain humans' approval and trust. While there are a number of explainability techniques that can be used to this end, many of them are only capable of outputting a single one-size-fits-all explanation that simply cannot address all of the explainees' diverse needs. In this work we introduce a model-agnostic and post-hoc local explainability technique for black-box predictions called LIMEtree, which employs surrogate multi-output regression trees. We validate our algorithm on a deep neural network trained for object detection in images and compare it against Local Interpretable Model-agnostic Explanations (LIME). Our method comes with local fidelity guarantees and can produce a range of diverse explanation types, including contrastive and counterfactual explanations praised in the literature. Some of these explanations can be interactively personalised to create bespoke, meaningful and actionable insights into the model's behaviour. While other methods may give an illusion of customisability by wrapping, otherwise static, explanations in an interactive interface, our explanations are truly interactive, in the sense of allowing the user to "interrogate" a black-box model. LIMEtree can therefore produce consistent explanations on which an interactive exploratory process can be built.
|Publication status||Unpublished - 4 May 2020|