A Rate-Distortion Framework for Explaining Neural Network Decisions



We propose a rate-distortion framework for explaining neural network decisions. We formulate the task of determining the most relevant signal components for a classifier prediction as an optimisation problem. For binary signals and Boolean classifier functions, we show that this problem is computationally hard both to solve and to approximate. We then present a heuristic solution strategy for deep ReLU neural network classifiers, evaluate it in numerical experiments, and compare it with other established methods.

Jul 3, 2019
École nationale supérieure d’électrotechnique, d’électronique, d’informatique, d’hydraulique et des télécommunications (ENSEEIHT)
Jan Macdonald

My research is at the interface between applied and computational mathematics and scientific machine learning. I am interested in inverse problems, signal and image recovery, and robust and interpretable deep learning.