
Skeleton-Snippet Contrastive Learning with Multiscale Feature Fusion for Action Localization

Research output: Chapter in Book/Report/Conference proceeding > Conference Contribution (Conference Proceeding)

Abstract

The self-supervised pretraining paradigm has achieved great success in learning 3D action representations for skeleton-based action recognition using contrastive learning. However, learning effective representations for skeleton-based temporal action localization remains challenging and underexplored. Unlike video-level action recognition, detecting action boundaries requires temporally sensitive features that capture subtle differences between adjacent frames where labels change. To this end, we formulate a snippet discrimination pretext task for self-supervised pretraining, which densely projects skeleton sequences into non-overlapping segments and promotes features that distinguish them across videos via contrastive learning. Additionally, we build on strong backbones of skeleton-based action recognition models by fusing intermediate features with a U-shaped module to enhance feature resolution for frame-level localization. Our approach consistently improves existing skeleton-based contrastive learning methods for action localization on BABEL across diverse subsets and evaluation protocols. We also achieve state-of-the-art transfer learning performance on PKUMMD with pretraining on NTU RGB+D and BABEL.
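
The snippet discrimination idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the linear stand-in encoder, the noise augmentation, the InfoNCE form of the loss, and all names and sizes (`split_into_snippets`, `encode`, `info_nce`, `T`, `J`, `C`, `L`, `D`) are assumptions made purely for illustration.

```python
import numpy as np

def split_into_snippets(seq, snippet_len):
    """Cut a (T, J, C) skeleton sequence into non-overlapping (L, J, C) snippets."""
    n = seq.shape[0] // snippet_len          # drop any trailing partial snippet
    return seq[: n * snippet_len].reshape(n, snippet_len, *seq.shape[1:])

def encode(snippets, weight):
    """Hypothetical stand-in encoder: flatten, project linearly, L2-normalise."""
    flat = snippets.reshape(snippets.shape[0], -1)
    z = flat @ weight
    return z / np.linalg.norm(z, axis=1, keepdims=True)

def info_nce(z_a, z_b, temperature=0.1):
    """InfoNCE: row i of z_a is positive with row i of z_b; other rows are negatives."""
    logits = (z_a @ z_b.T) / temperature
    logits = logits - logits.max(axis=1, keepdims=True)   # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_prob))

rng = np.random.default_rng(0)
T, J, C, L, D = 64, 25, 3, 8, 32             # assumed frames, joints, coords, snippet length, feature dim
videos = [rng.normal(size=(T, J, C)) for _ in range(2)]

# Pool snippets from all videos so negatives come from the same and other videos.
snippets = np.concatenate([split_into_snippets(v, L) for v in videos])
weight = rng.normal(size=(L * J * C, D))

# Two lightly perturbed views of every snippet (assumed augmentation: Gaussian noise).
view_a = snippets + 0.01 * rng.normal(size=snippets.shape)
view_b = snippets + 0.01 * rng.normal(size=snippets.shape)
loss = info_nce(encode(view_a, weight), encode(view_b, weight))
```

Each snippet is treated as its own class here, so the loss pushes apart features of different snippets, including temporally adjacent ones within one sequence; that is one plausible reading of how temporally sensitive features could be encouraged.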
Original language: English
Title of host publication: Pattern Recognition: 28th International Conference, ICPR 2026, Lyon, France, August 17–21, 2026, Proceedings, Part I.
Publisher: Springer
Publication status: Accepted/In press - 31 Mar 2026
Event: 28th International Conference on Pattern Recognition - Lyon, France
Duration: 17 Aug 2026 to 22 Aug 2026
https://icpr2026.org/

Publication series

Name: Lecture Notes in Computer Science
Publisher: Springer
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 28th International Conference on Pattern Recognition
Abbreviated title: ICPR 2026
Country/Territory: France
City: Lyon
Period: 17/08/26 to 22/08/26
