Deep learning and RGB-D based human action, human–human and human–object interaction recognition: A survey

Pushpajit Khaire*, Praveen Kumar

*Corresponding author for this work

Research output: Contribution to journal; Article (Academic Journal); peer-review

34 Citations (Scopus)

Abstract

Human activity recognition is one of the most studied topics in computer vision. In recent years, the availability of RGB-D sensors and powerful deep learning techniques has given research on human activity recognition new momentum. Starting from simple atomic actions, the research has advanced towards recognizing more complex human activities using RGB-D data. This paper presents a comprehensive survey of advanced deep learning based recognition methods and categorizes them into human atomic action, human–human interaction, and human–object interaction. The reviewed methods are further classified by the modality used for recognition, i.e., RGB based, depth based, skeleton based, and hybrid. We also review and categorize recent challenging RGB-D datasets for these tasks. In addition, the paper briefly reviews RGB-D datasets and methods for online activity recognition. The paper concludes with a discussion of limitations, challenges, and recent trends that suggest promising future directions.

Original language: English
Article number: 103531
Journal: Journal of Visual Communication and Image Representation
Volume: 86
Early online date: 7 May 2022
Publication status: Published - Jul 2022

Bibliographical note

Funding Information:
This research was supported by the Science and Engineering Research Board (SERB), India, under project no. ECR/2016/000387, in cooperation with the Department of Science & Technology (DST), Government of India. The views and conclusions contained in this document are those of the authors and should not be interpreted as representing the official policies, either expressed or implied, of DST-SERB or the Government of India. DST-SERB or the Government of India is authorized to reproduce and distribute reprints for Government purposes notwithstanding any copyright notation thereon. We are also thankful to all the reviewers for their valuable comments and suggestions to improve the scientific value of the paper.

Publisher Copyright:
© 2022 Elsevier Inc.

Keywords

  • CNN
  • Deep learning
  • Fusion
  • GCN
  • Human action recognition
  • Human–human interaction
  • Human–object interaction
  • LSTM
  • Multi-modality
  • RGB-D sensors
  • Skeleton
