Is that a Panda, or a Gibbon? Investigating the mystery of Adversarial Examples
Machine Learning (ML) models show great promise in the field of computer vision, which focuses on enabling systems to model and understand digital images automatically (https://goo.gl/whsbSH). But sometimes those ML systems get it wrong. As it turns out, many machine learning models, including Neural Networks, have intriguing properties. One such property, “blind spots” (http://goo.gl/FPuCzk), causes them to misclassify adversarial examples: images formed by applying very small, but intentional, perturbations to existing correctly labeled examples. Moreover, when different models misclassify the same adversarial example, they often agree with each other on its class. But why does this happen?
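To make this concrete, an adversarial example can be written as a clean input plus a tiny, bounded perturbation. The notation below is a rough sketch of the max-norm formulation used in the paper; the symbols themselves are just for illustration:

\tilde{x} = x + \eta, \qquad \|\eta\|_\infty \le \epsilon

Here \epsilon is chosen small enough that x and \tilde{x} look identical to a human observer, yet the model assigns \tilde{x} a confidently wrong label (a panda image relabeled as a gibbon, for instance).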
At the 2015 International Conference on Learning Representations (http://goo.gl/DxCNM1), Google Research Scientists +Ian Goodfellow, +Jon Shlens, and +Christian Szegedy presented Explaining and Harnessing Adversarial Examples, where they investigate neural networks’ vulnerability to adversarial perturbation.
Previously, the prevailing explanation was that adversarial examples arise from overfitting and the highly non-linear nature of Deep Neural Networks. In this paper, the authors argue instead that existing models are too linear, and that the way adversarial examples generalize across different models can be explained by those models learning similar functions when trained to perform the same task. Building on this view, they propose a fast method of generating adversarial examples that can be used to help train models to resist adversarial perturbation.
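To give a flavor of how such a fast method can work, here is a minimal sketch of the fast gradient sign idea from the paper, applied to a toy logistic-regression classifier so the gradient with respect to the input has a closed form. The model, data, and epsilon value are invented for illustration and are not the paper's experimental setup:

import numpy as np

rng = np.random.default_rng(0)

# Toy "model": binary logistic regression, p(y=1|x) = sigmoid(w.x + b).
w = rng.normal(size=100)
b = 0.0

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def predict(x):
    return sigmoid(x @ w + b)

def input_gradient(x, y):
    # Gradient of the cross-entropy loss J(x, y) with respect to the
    # input x; for logistic regression this is (p - y) * w.
    return (predict(x) - y) * w

def fgsm(x, y, epsilon):
    # Nudge every input dimension by epsilon in the direction that
    # increases the loss: x_adv = x + epsilon * sign(grad_x J).
    return x + epsilon * np.sign(input_gradient(x, y))

# Take an input the model currently classifies confidently...
x = rng.normal(size=100)
y = 1.0 if predict(x) > 0.5 else 0.0

# ...and perturb it by at most epsilon per dimension.
x_adv = fgsm(x, y, epsilon=0.25)
print(f"clean prediction:       {predict(x):.3f}")
print(f"adversarial prediction: {predict(x_adv):.3f}")

The point of the construction is the one the authors make about linearity: each input dimension moves by at most epsilon, so the change is barely noticeable, but in a high-dimensional, largely linear model those many tiny moves add up to a large change in the output.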