Title: Interpreting Deep Networks' Predictions with BoW Models
Abstract: In this talk, I will discuss two approaches to providing a user with an interpretation of the output of a deep network. Both approaches build upon the idea of incorporating a Bag of visual Words (BoW) representation within the network. In the first, interpretability is obtained via attention maps, which highlight the image regions that were found to be important for the prediction. In the second, it is achieved by giving the BoW codewords a visual and semantic meaning. I will then describe how the resulting visual codebook can be employed to detect adversarial examples, that is, images to which a small amount of structured noise has been added so as to fool the network.
Mathieu Salzmann is a Senior Researcher at EPFL-CVLab. Previously, he was a Senior Researcher and Research Leader in NICTA’s computer vision research group, with an adjunct position at the Australian National University. Prior to this, from Sept. 2010 to Jan. 2012, he was a Research Assistant Professor at TTI-Chicago, and, from Feb. 2009 to Aug. 2010, a postdoctoral fellow at ICSI and EECS at UC Berkeley under the supervision of Prof. Trevor Darrell. He obtained his PhD in Jan. 2009 from EPFL under the supervision of Prof. Pascal Fua. His research interests lie at the intersection of machine learning and geometry for computer vision.