PAPER DIGEST
Most Influential ICML 2010 Paper · 2026-03 edition

3D Convolutional Neural Networks For Human Action Recognition

Shuiwang Ji; Wei Xu; Ming Yang; Kai Yu

Venue: International Conference on Machine Learning (ICML) 2010
Recognition: Most Influential ICML 2010 Paper (Rank No. 2)
Edition: 2026-03
Impact factor: 8
Certificate ID: 828bc8e7f317b082

Abstract

We consider the fully automated recognition of actions in uncontrolled environments. Most existing work relies on domain knowledge to construct complex handcrafted features from inputs. In addition, the environments are usually assumed to be controlled. Convolutional neural networks (CNNs) are a type of deep model that can act directly on the raw inputs, thus automating the process of feature construction. However, such models are currently limited to handling 2D inputs. In this paper, we develop a novel 3D CNN model for action recognition. This model extracts features from both spatial and temporal dimensions by performing 3D convolutions, thereby capturing the motion information encoded in multiple adjacent frames. The developed model generates multiple channels of information from the input frames, and the final feature representation is obtained by combining information from all channels. We apply the developed model to recognize human actions in real-world environments, and it achieves superior performance without relying on handcrafted features.
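
To make the idea of a 3D convolution over adjacent frames concrete, the following is a minimal pure-Python sketch. It is not the authors' implementation: the function name `conv3d_valid`, the toy clips, and the temporal-difference kernel are all assumptions chosen for illustration, and the sketch omits the bias term, nonlinearity, and multiple input channels that a full network layer would normally include.

```python
def conv3d_valid(video, kernel):
    """Naive 'valid' 3D cross-correlation (hypothetical helper).

    video  : nested lists indexed as [frame][row][col]
    kernel : nested lists indexed as [frame][row][col], no larger than video
    Returns nested lists of shape (T - t + 1, H - h + 1, W - w + 1).
    """
    T, H, W = len(video), len(video[0]), len(video[0][0])
    t, h, w = len(kernel), len(kernel[0]), len(kernel[0][0])
    out = []
    for z in range(T - t + 1):              # slide along time
        plane = []
        for y in range(H - h + 1):          # slide along rows
            row = []
            for x in range(W - w + 1):      # slide along columns
                acc = 0.0
                # Weighted sum over a small cube spanning several frames.
                for dz in range(t):
                    for dy in range(h):
                        for dx in range(w):
                            acc += kernel[dz][dy][dx] * video[z + dz][y + dy][x + dx]
                row.append(acc)
            plane.append(row)
        out.append(plane)
    return out


# Toy clip: 3 frames of 2x2 pixels whose brightness rises by 1 per frame.
changing_clip = [[[float(k)] * 2 for _ in range(2)] for k in range(3)]
# Toy clip: 3 identical frames, so nothing changes over time.
static_clip = [[[5.0] * 2 for _ in range(2)] for _ in range(3)]

# A kernel spanning 2 frames and 1x1 pixels: next frame minus current frame.
temporal_diff = [[[-1.0]], [[1.0]]]

changing_response = conv3d_valid(changing_clip, temporal_diff)
static_response = conv3d_valid(static_clip, temporal_diff)
```

Because the kernel spans two frames, its response is nonzero only where pixel values change over time: every entry of `changing_response` is 1.0, while every entry of `static_response` is 0.0. A 2D kernel applied to each frame separately could not distinguish the two clips in this way.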
