Abstract
Human activity recognition is one of the most studied topics in computer vision. In recent years, the availability of RGB-D sensors and powerful deep learning techniques has given research on human activity recognition new momentum. Research has advanced from simple human atomic actions towards recognizing more complex human activities using RGB-D data. This paper presents a comprehensive survey of advanced deep learning based recognition methods and categorizes them into human atomic action, human–human interaction, and human–object interaction. The reviewed methods are further classified by the modality used for recognition, i.e., RGB based, depth based, skeleton based, and hybrid. We also review and categorize recent challenging RGB-D datasets for these tasks. In addition, the paper briefly reviews RGB-D datasets and methods for online activity recognition. The paper concludes with a discussion of limitations, challenges, and recent trends that point to promising future directions.
Original language | English |
---|---|
Article number | 103531 |
Journal | Journal of Visual Communication and Image Representation |
Volume | 86 |
Early online date | 7 May 2022 |
DOIs | |
Publication status | Published - Jul 2022 |
Bibliographical note
Funding Information: This research was supported by the Science and Engineering Research Board (SERB), India under project no. ECR/2016/000387, in cooperation with the Department of Science & Technology (DST), Government of India. The views and conclusions contained in this document are those of the authors and should not be interpreted as representing the official policies, either expressed or implied, of DST-SERB or the Government of India. The DST-SERB or Government of India is authorized to reproduce and distribute reprints for Government purposes notwithstanding any copyright notation thereon. We are also thankful to all the reviewers for their valuable comments and suggestions to improve the scientific value of the paper.
Publisher Copyright:
© 2022 Elsevier Inc.
Keywords
- CNN
- Deep learning
- Fusion
- GCN
- Human action recognition
- Human–human interaction
- Human–object interaction
- LSTM
- Multi-modality
- RGB-D sensors
- Skeleton