
Deep learning and RGB-D based human action, human–human and human–object interaction recognition: A survey

Pushpajit Khaire*, Praveen Kumar

*Corresponding author for this work

    Research output: Contribution to journal › Article (Academic Journal) › peer-review

    53 Citations (Scopus)

    Abstract

    Human activity recognition is one of the most studied topics in computer vision. In recent years, the availability of RGB-D sensors and powerful deep learning techniques has given research on human activity recognition fresh momentum. Research has advanced from simple atomic human actions towards recognizing more complex human activities using RGB-D data. This paper presents a comprehensive survey of advanced deep learning based recognition methods and categorizes them into human atomic action, human–human interaction, and human–object interaction. The reviewed methods are further classified by the modality used for recognition, i.e., RGB based, depth based, skeleton based, and hybrid. We also review and categorize recent challenging RGB-D datasets for these tasks. In addition, the paper briefly reviews RGB-D datasets and methods for online activity recognition. The paper concludes with a discussion of limitations, challenges, and recent trends that point to promising future directions.

    Original language: English
    Article number: 103531
    Journal: Journal of Visual Communication and Image Representation
    Volume: 86
    Early online date: 7 May 2022
    Publication status: Published - Jul 2022

    Bibliographical note

    Funding Information:
    This research was supported by the Science and Engineering Research Board (SERB), India, under project no. ECR/2016/000387, in cooperation with the Department of Science & Technology (DST), Government of India. The views and conclusions contained in this document are those of the authors and should not be interpreted as representing the official policies, either expressed or implied, of DST-SERB or the Government of India. DST-SERB and the Government of India are authorized to reproduce and distribute reprints for Government purposes notwithstanding any copyright notation thereon. We also thank all the reviewers for their valuable comments and suggestions, which improved the scientific value of the paper.

    Publisher Copyright:
    © 2022 Elsevier Inc.

    Keywords

    • CNN
    • Deep learning
    • Fusion
    • GCN
    • Human action recognition
    • Human–human interaction
    • Human–object interaction
    • LSTM
    • Multi-modality
    • RGB-D sensors
    • Skeleton
