SHARP: Segmentation of Hands and Arms by Range Using Pseudo-depth for Enhanced Egocentric 3D Hand Pose Estimation and Action Recognition

Wiktor Mucha*, Michael Wray, Martin Kampel

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding > Conference Contribution (Conference Proceeding)

Abstract

Hand pose represents key information for action recognition in the egocentric perspective, where the user is interacting with objects. We propose to improve egocentric 3D hand pose estimation based on RGB frames only by using pseudo-depth images. Incorporating state-of-the-art single RGB image depth estimation techniques, we generate pseudo-depth representations of the frames and use distance knowledge to segment out irrelevant parts of the scene. The resulting depth maps are then used as segmentation masks for the RGB frames. Experimental results on the H2O dataset confirm the high accuracy of the pose estimated with our method in an action recognition task. The 3D hand pose, together with information from object detection, is processed by a transformer-based action recognition network, resulting in an accuracy of 91.73%, outperforming all state-of-the-art methods. The 3D hand pose estimates are competitive with existing methods, with a mean pose error of 28.66 mm. This method opens up new possibilities for employing distance information in egocentric 3D hand pose estimation without relying on depth sensors. The code is available at https://github.com/wiktormucha/SHARP.
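
The masking step described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation (see the linked repository for that). The function name `mask_by_range`, the cut-off `max_range`, and the convention that larger pseudo-depth values mean farther from the camera are all assumptions made for illustration. The abstract does not name the depth estimator or state a threshold.

```python
import numpy as np


def mask_by_range(rgb, pseudo_depth, max_range):
    """Zero out pixels whose pseudo-depth exceeds max_range.

    rgb: (H, W, 3) uint8 frame.
    pseudo_depth: (H, W) float array; assumed here that larger means farther.
    max_range: hypothetical cut-off, in the same units as pseudo_depth.
    """
    if rgb.shape[:2] != pseudo_depth.shape:
        raise ValueError("rgb and pseudo_depth must share height and width")
    # Boolean (H, W) mask that is True for pixels within range.
    keep = pseudo_depth <= max_range
    # Add a channel axis so the mask broadcasts over the three colour channels.
    return rgb * keep[..., None].astype(rgb.dtype)


# Toy 2x2 frame: the left column is near (0.3), the right column is far (1.5).
rgb = np.full((2, 2, 3), 200, dtype=np.uint8)
depth = np.array([[0.3, 1.5], [0.3, 1.5]])
masked = mask_by_range(rgb, depth, max_range=0.8)
```

In this toy frame the near column keeps its colour values and the far column is set to zero. If a depth estimator outputs inverse or relative depth, the comparison would need to be flipped or the values rescaled first.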
Original language: English
Title of host publication: Pattern Recognition
Subtitle of host publication: 27th International Conference, ICPR 2024, Kolkata, India, December 1–5, 2024, Proceedings, Part XV
Editors: Apostolos Antonacopoulos, Subhasis Chaudhuri, Rama Chellappa, Cheng-Lin Liu, Saumik Bhattacharya, Umapada Pal
Publisher: Springer, Cham
Pages: 178-193
Number of pages: 16
Volume: 15315
ISBN (Electronic): 9783031783548
ISBN (Print): 9783031783531
DOIs
Publication status: E-pub ahead of print - 4 Dec 2024
Event: International Conference on Pattern Recognition - Kolkata, India
Duration: 1 Dec 2024 to 5 Dec 2024
Conference number: 27
https://icpr2024.org/

Publication series

Name: Lecture Notes in Computer Science (LNCS)
Publisher: Springer Cham
Volume: 15315
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: International Conference on Pattern Recognition
Abbreviated title: ICPR
Country/Territory: India
City: Kolkata
Period: 1/12/24 to 5/12/24
Internet address

Bibliographical note

Publisher Copyright:
© 2025 The Author(s), under exclusive license to Springer Nature Switzerland AG.

Research Groups and Themes

  • Intelligent Systems Laboratory (MaVi)
  • Computer Vision
  • Hand Pose Estimation
  • Action Recognition
  • Egocentric Vision
