
Multi-Temporal Convolutions for Human Action Recognition in Videos

Alexandros Stergiou, Ronald Poppe

    Research output: Chapter in Book/Report/Conference proceeding > Conference Contribution (Conference Proceeding)

    5 Citations (Scopus)

    Abstract

    Effective extraction of temporal patterns is crucial for the recognition of temporally varying actions in video. We argue that the fixed-sized spatio-temporal convolution kernels used in convolutional neural networks (CNNs) can be improved to extract informative motions that are executed at different time scales. To address this challenge, we present a novel convolution block that is capable of extracting spatio-temporal patterns at multiple temporal resolutions. Our proposed multi-temporal convolution (MTConv) blocks utilize two branches that focus on brief and prolonged spatio-temporal patterns, respectively. The extracted time-varying features are aligned in a third branch, with respect to global motion patterns through recurrent cells. The proposed blocks are lightweight and can be integrated into any 3D-CNN architecture. This introduces a substantial reduction in computational costs. Extensive experiments on Kinetics, Moments in Time and HACS action recognition benchmark datasets demonstrate competitive performance of MTConvs compared to the state-of-the-art with a significantly lower computational footprint. Our code is available at: https://git.io/JfuPi.
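The idea of reading the same frame sequence at two temporal resolutions can be illustrated with a toy example. The sketch below is not the authors' MTConv block: it works on a 1D list of per-frame values rather than 3D video features, it omits the recurrent alignment branch, and the function names, kernels, and dilation values are all hypothetical choices made for illustration. It only shows how one kernel applied with two different temporal spans responds differently to a brief spike and to a slow trend.

```python
def temporal_conv(signal, kernel, dilation=1):
    """Valid 1D correlation along the time axis with an optional dilation."""
    span = (len(kernel) - 1) * dilation
    return [
        sum(k * signal[t + i * dilation] for i, k in enumerate(kernel))
        for t in range(len(signal) - span)
    ]


def two_scale_features(signal):
    """Apply the same difference kernel over a short and a long time window.

    Both the kernel and the dilation values are assumptions for this sketch.
    """
    kernel = [-1.0, 0.0, 1.0]
    brief = temporal_conv(signal, kernel, dilation=1)      # spans 3 frames
    prolonged = temporal_conv(signal, kernel, dilation=3)  # spans 7 frames
    return brief, prolonged


# A slow ramp (one unit per frame) with a short spike at frame 5.
frames = [0.0, 1.0, 2.0, 3.0, 4.0, 15.0, 6.0, 7.0, 8.0, 9.0, 10.0, 11.0]
brief, prolonged = two_scale_features(frames)
print(brief)
print(prolonged)
```

In this toy setting the short-span output changes sharply around the spike, while the long-span output mostly reflects the steady ramp. A learned block would use trainable spatio-temporal kernels instead of a fixed difference filter.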

    Original language: English
    Title of host publication: IJCNN 2021 - International Joint Conference on Neural Networks, Proceedings
    Publisher: Institute of Electrical and Electronics Engineers (IEEE)
    ISBN (Electronic): 9781665439008
    ISBN (Print): 9781665445979
    Publication status: Published - 22 Jul 2021
    Event: 2021 International Joint Conference on Neural Networks, IJCNN 2021 - Virtual, Shenzhen, China
    Duration: 18 Jul 2021 to 22 Jul 2021

    Publication series

    Name: Proceedings of the International Joint Conference on Neural Networks (IJCNN)
    ISSN (Print): 2161-4393
    ISSN (Electronic): 2161-4407

    Conference

    Conference: 2021 International Joint Conference on Neural Networks, IJCNN 2021
    Country/Territory: China
    City: Virtual, Shenzhen
    Period: 18/07/21 to 22/07/21

    Bibliographical note

    Funding Information:
    This publication is supported by the Netherlands Organization for Scientific Research (NWO) with a TOP-C2 grant for Automatic recognition of bodily interactions (ARBITER).

    Publisher Copyright:
    © 2021 IEEE.
