Publications

Title Model Interpretability through the Lens of Computational Complexity
Authors Pablo Barceló, Mikael Monet, Jorge Pérez, Bernardo Subercaseaux
Publication date 2020
Abstract In spite of several claims stating that some models are more interpretable than others (e.g., "linear models are more interpretable than deep neural networks"), we still lack a principled notion of interpretability that allows us to formally compare different classes of models. We take a step towards such a theory by studying whether folklore interpretability claims have a correlate in terms of computational complexity theory. We focus on post-hoc explainability queries that, intuitively, attempt to answer why individual inputs are classified in a certain way by a given model. In a nutshell, we say that a class C1 of models is more interpretable than another class C2 if the computational complexity of answering post-hoc queries for models in C2 is higher than for C1. We prove that this notion provides a good theoretical counterpart to current beliefs on the interpretability of models; in particular, we show that under our definition, and assuming standard complexity-theoretic assumptions (such as P ≠ NP), both linear and tree-based models are strictly more interpretable than neural networks. Our complexity analysis, however, does not provide a clear-cut difference between linear and tree-based models, as we obtain different results depending on the particular post-hoc explanations considered. Finally, by applying a finer complexity analysis based on parameterized complexity, we are able to prove a theoretical result suggesting that shallow neural networks are more interpretable than deeper ones.
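As a rough illustration of the comparison described in the abstract (the notation below is assumed for this sketch and is not taken from the paper), one can read the definition as a preorder over classes of models: writing compl(Q, C) for the complexity of answering a post-hoc query Q over models from a class C,

\[
\mathcal{C}_1 \preceq \mathcal{C}_2
\quad \text{iff} \quad
\text{for every post-hoc query } Q:\;
\mathrm{compl}(Q, \mathcal{C}_1) \subseteq \mathrm{compl}(Q, \mathcal{C}_2),
\]

so that C1 counts as at least as interpretable as C2 whenever answering post-hoc queries over C1 is never harder than over C2.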
Conference name Advances in Neural Information Processing Systems