PAPER DIGEST
Most Influential CVPR 2018 Paper · 2026-03 edition

Quantization and Training of Neural Networks for Efficient Integer-Arithmetic-Only Inference

Benoit Jacob, Skirmantas Kligys, Bo Chen, Menglong Zhu, Matthew Tang, Andrew Howard, Hartwig Adam, Dmitry Kalenichenko

Venue: IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 2018
Recognition: Most Influential CVPR 2018 Paper (Rank No. 13)
Edition: 2026-03
Impact factor: 9
Certificate ID: 55d88e49f9dd0278

Abstract

The rising popularity of intelligent mobile devices and the daunting computational cost of deep learning-based visual recognition models call for efficient on-device inference schemes. We propose a quantization scheme along with a co-designed training procedure allowing inference to be carried out using integer-only arithmetic while preserving an end-to-end model accuracy that is close to floating-point inference. Inference using integer-only arithmetic performs better than floating-point arithmetic on typical ARM CPUs and can be implemented on integer-arithmetic-only hardware such as mobile accelerators (e.g. Qualcomm Hexagon). By quantizing both activations and weights as 8-bit integers, we obtain a close to 4x memory footprint reduction compared to 32-bit floating-point representations. Even on MobileNets, a model family known for runtime efficiency, our quantization approach results in an improved tradeoff between latency and accuracy on popular ARM CPUs for ImageNet classification and COCO detection.
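
To make the 8-bit claim concrete, here is a minimal sketch of quantizing a small weight vector to unsigned 8-bit integers and comparing its storage against 32-bit floats. It assumes an affine mapping of the form r ≈ S(q − Z), with a real-valued scale S and an integer zero point Z. The abstract does not spell out the mapping, so treat that form, the helper names (choose_qparams, quantize, dequantize), and the sample values as illustrative assumptions rather than the authors' implementation.

```python
from array import array


def choose_qparams(values, num_bits=8):
    """Pick (scale, zero_point) so that [min, max], widened to include 0,
    maps onto the integer range [0, 2**num_bits - 1]. (Hypothetical helper.)"""
    qmin, qmax = 0, (1 << num_bits) - 1
    lo = min(0.0, min(values))
    hi = max(0.0, max(values))
    if hi == lo:  # all values are zero; any scale works
        return 1.0, 0
    scale = (hi - lo) / (qmax - qmin)
    zero_point = int(round(qmin - lo / scale))
    zero_point = max(qmin, min(qmax, zero_point))  # keep Z inside the range
    return scale, zero_point


def quantize(values, scale, zero_point, num_bits=8):
    """Map real values to clamped unsigned integers: q = round(r / S) + Z."""
    qmax = (1 << num_bits) - 1
    return array(
        "B",
        (max(0, min(qmax, int(round(v / scale)) + zero_point)) for v in values),
    )


def dequantize(q, scale, zero_point):
    """Recover approximate real values: r ≈ S * (q - Z)."""
    return [scale * (x - zero_point) for x in q]


# Sample data (assumed for illustration), stored as 32-bit floats.
weights = array("f", [-1.0, -0.25, 0.0, 0.5, 1.5])

scale, zp = choose_qparams(weights)
q = quantize(weights, scale, zp)
restored = dequantize(q, scale, zp)

# Worst-case rounding error is about half a quantization step.
max_err = max(abs(a - b) for a, b in zip(weights, restored))

# Raw storage ratio: 4 bytes per float32 versus 1 byte per uint8.
ratio = (weights.itemsize * len(weights)) / (q.itemsize * len(q))
print(f"scale={scale:.6f} zero_point={zp} max_err={max_err:.6f} ratio={ratio}")
```

The raw element storage shrinks by exactly 4x in this sketch. A real deployment also has to store the scale and zero point for each quantized array, which is one plausible reading of why the abstract says "close to 4x" rather than exactly 4x.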
