Poisoning Attacks Against Support Vector Machines
Abstract
We investigate a family of poisoning attacks against Support Vector Machines (SVMs). Such attacks inject specially crafted training data in order to increase the test error of the trained SVM. The attacks are motivated by the fact that most learning algorithms assume their training data comes from a natural or well-behaved distribution, an assumption that does not generally hold in security-sensitive settings. As we demonstrate in this contribution, an intelligent adversary can, to some extent, predict how the SVM decision function changes in response to malicious input and use this ability to construct malicious data points. The proposed attack uses a gradient ascent strategy in which the gradient is computed from properties of the SVM's optimal solution. The method is easily kernelized, which allows the attack to be constructed in the input space even for non-linear kernels. We experimentally demonstrate that our gradient ascent procedure reliably identifies good local maxima of the non-convex validation error surface and significantly increases the test error of the trained classifier.
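The gradient ascent idea can be illustrated with a minimal sketch, under several simplifying assumptions that are not taken from the paper's method: the SVM is linear and is trained by plain subgradient descent, the gradient of the validation hinge loss with respect to the attack point is approximated by central finite differences rather than derived from the SVM's optimal solution, and a step is accepted only if it increases the loss. The attack point is initialized as a training point from one class with its label flipped, which is also an assumption of this sketch. All function names and parameter values are hypothetical.

```python
import numpy as np

def train_linear_svm(X, y, lam=0.01, epochs=300, lr=0.1):
    """Linear soft-margin SVM via full-batch subgradient descent (illustrative only)."""
    w, b = np.zeros(X.shape[1]), 0.0
    for _ in range(epochs):
        active = y * (X @ w + b) < 1          # points violating the margin
        grad_w = lam * w - (y[active][:, None] * X[active]).sum(axis=0) / len(y)
        grad_b = -y[active].sum() / len(y)
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

def attack_loss(x_c, y_c, X_tr, y_tr, X_val, y_val):
    """Validation hinge loss of the SVM retrained with the attack point (x_c, y_c) added."""
    w, b = train_linear_svm(np.vstack([X_tr, x_c]), np.append(y_tr, y_c))
    return np.maximum(0.0, 1.0 - y_val * (X_val @ w + b)).sum()

def numerical_gradient(f, x, eps=1e-3):
    """Central finite differences; stands in for the analytic gradient (assumption)."""
    grad = np.zeros_like(x)
    for i in range(x.size):
        e = np.zeros_like(x)
        e[i] = eps
        grad[i] = (f(x + e) - f(x - e)) / (2 * eps)
    return grad

def poison_point(x0, y_c, X_tr, y_tr, X_val, y_val, step=0.5, iters=20):
    """Gradient ascent on the validation loss; a step is kept only if the loss increases."""
    f = lambda x: attack_loss(x, y_c, X_tr, y_tr, X_val, y_val)
    x, best = x0.copy(), f(x0)
    for _ in range(iters):
        g = numerical_gradient(f, x)
        norm = np.linalg.norm(g)
        if norm == 0:
            break
        candidate = x + step * g / norm       # unit-length ascent direction
        value = f(candidate)
        if value <= best:
            step /= 2                         # backtrack and retry with a smaller step
            continue
        x, best = candidate, value
    return x, best

# Synthetic two-class data (hypothetical; not the paper's experimental setup).
rng = np.random.default_rng(0)

def make_data(n):
    X = np.vstack([rng.normal(-1.5, 1.0, (n, 2)), rng.normal(1.5, 1.0, (n, 2))])
    y = np.concatenate([-np.ones(n), np.ones(n)])
    return X, y

X_tr, y_tr = make_data(20)
X_val, y_val = make_data(50)

x0 = X_tr[0].copy()                           # a negative-class point, relabeled as +1
loss_before = attack_loss(x0, 1.0, X_tr, y_tr, X_val, y_val)
x_c, loss_after = poison_point(x0, 1.0, X_tr, y_tr, X_val, y_val)
print(f"validation hinge loss: {loss_before:.3f} -> {loss_after:.3f}")
```

Because every retraining is deterministic and only loss-increasing steps are accepted, the final loss in this sketch is never lower than the initial one. A kernelized variant would need a dual solver in place of the subgradient trainer, which this sketch does not attempt.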