Abstract
Recognizing human actions from 3D skeleton data, commonly referred to as 3D action recognition, is rapidly gaining interest in the scientific community because skeleton data offers a robust, compact, and perspective-invariant representation of motion. Recent work on this problem has proposed RNN-based learning methods to model the temporal dependencies in the sequential data. In this paper, we extend this idea to a hierarchical spatio-temporal domain to exploit the local and global features embedded in long skeleton sequences. We introduce a novel temporal-contextual recurrent layer that learns local features from consecutive frames and then aggregates the extracted features hierarchically, refining the sequence representation layer by layer. Our method achieves competitive performance on three popular benchmark datasets for 3D human action analysis.
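The layer-by-layer aggregation described in the abstract can be illustrated with a minimal sketch. It is not the authors' architecture: the plain tanh recurrent cell, the non-overlapping windows, the window size of 4, the hidden size of 8, the 75-dimensional frames (assumed here as 25 joints with 3 coordinates each), and all function names are hypothetical choices made only for illustration, and the weights are random rather than trained.

```python
import math
import random


def make_rnn_params(input_dim, hidden_dim, rng):
    """Random weights for a plain tanh RNN cell (hypothetical stand-in cell)."""
    def mat(rows, cols):
        return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]
    return {
        "W_in": mat(hidden_dim, input_dim),
        "W_h": mat(hidden_dim, hidden_dim),
        "b": [0.0] * hidden_dim,
    }


def rnn_last_state(steps, params):
    """Run the tanh RNN over a short window and return its final hidden state."""
    hidden_dim = len(params["b"])
    h = [0.0] * hidden_dim
    for x in steps:
        # The new state is built from the previous h before h is reassigned.
        h = [
            math.tanh(
                sum(w * xi for w, xi in zip(params["W_in"][j], x))
                + sum(w * hi for w, hi in zip(params["W_h"][j], h))
                + params["b"][j]
            )
            for j in range(hidden_dim)
        ]
    return h


def temporal_context_layer(sequence, params, window):
    """Summarize each non-overlapping window of consecutive steps as one vector."""
    return [
        rnn_last_state(sequence[start:start + window], params)
        for start in range(0, len(sequence), window)
    ]


def hierarchical_encode(sequence, layer_params, window):
    """Stack layers so that each one shortens and refines the sequence."""
    for params in layer_params:
        sequence = temporal_context_layer(sequence, params, window)
    return sequence


# Toy input: 64 frames of 75 values each (assumed 25 joints x 3 coordinates).
rng = random.Random(0)
frames = [[rng.gauss(0.0, 1.0) for _ in range(75)] for _ in range(64)]
layers = [
    make_rnn_params(75, 8, rng),  # layer 1: local features from raw frames
    make_rnn_params(8, 8, rng),   # layer 2: aggregates layer-1 summaries
    make_rnn_params(8, 8, rng),   # layer 3: aggregates layer-2 summaries
]
encoded = hierarchical_encode(frames, layers, window=4)
print(len(encoded), len(encoded[0]))
```

With these assumed settings the sequence length shrinks from 64 to 16 to 4 to 1, so the final vector summarizes the whole sequence while the first layer only ever sees four consecutive frames at a time.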
Original language | English |
---|---|
Title of host publication | 2018 15th International Conference on Control, Automation, Robotics and Vision, ICARCV 2018 |
Publisher | IEEE, Institute of Electrical and Electronics Engineers |
Pages | 901-906 |
Number of pages | 6 |
ISBN (Electronic) | 9781538695821 |
Publication status | Published - 18 Dec 2018 |
Externally published | Yes |
Event | 15th International Conference on Control, Automation, Robotics and Vision, ICARCV 2018, Singapore, Singapore. Duration: 18 Nov 2018 → 21 Nov 2018. Conference number: 15th |
Publication series
Name | 2018 15th International Conference on Control, Automation, Robotics and Vision, ICARCV 2018 |
---|
Conference
Conference | 15th International Conference on Control, Automation, Robotics and Vision, ICARCV 2018 |
---|---|
Abbreviated title | ICARCV |
Country/Territory | Singapore |
City | Singapore |
Period | 18/11/18 → 21/11/18 |