Sponge Examples: Energy-Latency Attacks on Neural Networks (2021, arXiv)
Published May 12, 2021 on arXiv.
5 stars
(1 review)
The high energy costs of neural network training and inference led to the use
of acceleration hardware such as GPUs and TPUs. While this enabled us to train
large-scale neural networks in datacenters and deploy them on edge devices, the
focus so far is on average-case performance. In this work, we introduce a novel
threat vector against neural networks whose energy consumption or decision
latency are critical. We show how adversaries can exploit carefully crafted
sponge examples, which are inputs designed to maximise energy
consumption and latency.
We mount two variants of this attack on established vision and language
models, increasing energy consumption by a factor of 10 to 200. Our attacks can
also be used to delay decisions where a network has critical real-time
performance, such as in perception for autonomous vehicles. We demonstrate the
portability of our malicious inputs across CPUs and a variety of hardware
accelerator chips including GPUs, and an ASIC simulator. We conclude by
proposing a defense strategy which mitigates our attack by shifting the
analysis of energy consumption in hardware from an average-case to a worst-case
perspective.
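
The attack objective can be made concrete. As a minimal sketch, assuming a PyTorch model (all names below are illustrative, not from the paper), two measurable proxies for per-inference cost are activation density, since sparsity-exploiting hardware does less work on zero activations, and plain wall-clock latency:

```python
# Hypothetical sketch (not the authors' code): two proxies for the
# per-inference cost that sponge examples try to maximise, assuming a
# PyTorch model. `model` and `x` are placeholders supplied by the caller.
import time

import torch
import torch.nn as nn

def activation_density(model: nn.Module, x: torch.Tensor) -> float:
    """Fraction of non-zero post-ReLU activations for input x.

    Sparsity-exploiting accelerators skip zero activations, so a
    denser activation pattern is a proxy for higher energy use.
    Only counts nn.ReLU modules, not functional F.relu calls.
    """
    nonzero, total = 0, 0

    def hook(_module, _inputs, output):
        nonlocal nonzero, total
        nonzero += int((output != 0).sum())
        total += output.numel()

    handles = [m.register_forward_hook(hook)
               for m in model.modules() if isinstance(m, nn.ReLU)]
    with torch.no_grad():
        model(x)
    for h in handles:
        h.remove()
    return nonzero / max(total, 1)

def latency(model: nn.Module, x: torch.Tensor, reps: int = 10) -> float:
    """Median wall-clock inference time: a black-box fitness signal."""
    times = []
    with torch.no_grad():
        for _ in range(reps):
            start = time.perf_counter()
            model(x)
            times.append(time.perf_counter() - start)
    return sorted(times)[len(times) // 2]
```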
This paper explores inputs to DNN models that cause a dramatic increase in power usage or inference latency. Using genetic algorithms in a black-box setting (the search needs only energy or timing measurements, not model internals), the researchers found image and text inputs that drive up inference effort.
The results were impressive: sponge examples caused a 6000x slowdown on a hosted Azure translation model.
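
A rough sketch of that genetic search, reusing the `latency` helper from the snippet above as a black-box fitness (hypothetical code, not the authors' implementation):

```python
# Hypothetical sketch of the genetic search described above, reusing the
# `latency` helper from the previous snippet as a black-box fitness.
# All names and hyperparameters here are illustrative, not the paper's.
import torch

def sponge_search(model, shape=(1, 3, 224, 224),
                  pop_size=32, generations=100, sigma=0.1):
    """Evolve a population of inputs toward higher inference latency."""
    pop = [torch.rand(shape) for _ in range(pop_size)]
    for _ in range(generations):
        # Selection: keep the half of the population that the model is
        # slowest to process.
        survivors = sorted(pop, key=lambda x: latency(model, x),
                           reverse=True)[:pop_size // 2]
        # Mutation: perturb survivors with Gaussian noise, staying in
        # the valid input range [0, 1].
        children = [(p + sigma * torch.randn_like(p)).clamp(0, 1)
                    for p in survivors]
        pop = survivors + children
    return max(pop, key=lambda x: latency(model, x))
```

Because the fitness is just a timing measurement, the same loop works against a remote API, which is what makes an attack on a hosted service possible; for text models the mutation step would edit tokens rather than add pixel noise.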