Moment of Untruth: Dealing with Negative Queries in Video Moment Retrieval

Research output: Chapter in Book/Report/Conference proceeding › Conference Contribution (Conference Proceeding)

Abstract

Video Moment Retrieval is a common task to evaluate the performance of visual-language models. It involves localising the start and end times of moments in videos from query sentences. The current task formulation assumes that the queried moment is present in the video, resulting in false positive moment predictions when irrelevant query sentences are provided. In this paper we propose the task of Negative-Aware Video Moment Retrieval (NA-VMR), which considers both moment retrieval accuracy and negative query rejection accuracy. We make the distinction between In-Domain and Out-of-Domain negative queries and provide new evaluation benchmarks for two popular video moment retrieval datasets: QVHighlights and Charades-STA. We analyse the ability of current SOTA video moment retrieval approaches to adapt to Negative-Aware Video Moment Retrieval and propose UniVTG-NA, an adaptation of UniVTG designed to tackle NA-VMR. UniVTG-NA achieves high negative rejection accuracy (avg. 98.4%) while retaining moment retrieval scores to within 3.87% Recall@1. Dataset splits are available at https://github.com/keflanagan/MomentofUntruth.
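
The two quantities the abstract pairs, moment retrieval accuracy and negative query rejection accuracy, can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration and is not taken from the paper or its repository: the function names, the convention that `None` marks a rejected prediction or a negative query, the 0.5 IoU threshold, and the choice to compute rejection accuracy as the fraction of negative queries that are rejected. The paper's exact metric definitions may differ.

```python
from typing import Dict, List, Optional, Tuple

# A moment is a (start, end) pair in seconds. None means "no moment":
# as a ground truth it marks a negative query, as a prediction it marks
# a rejection. This convention is an assumption for illustration.
Span = Tuple[float, float]


def temporal_iou(pred: Span, gt: Span) -> float:
    """Intersection over union of two temporal spans."""
    inter = max(0.0, min(pred[1], gt[1]) - max(pred[0], gt[0]))
    union = (pred[1] - pred[0]) + (gt[1] - gt[0]) - inter
    return inter / union if union > 0 else 0.0


def evaluate(
    preds: List[Optional[Span]],
    gts: List[Optional[Span]],
    iou_threshold: float = 0.5,  # hypothetical threshold
) -> Dict[str, float]:
    """Score top-1 predictions on a mix of positive and negative queries."""
    hits = positives = rejected = negatives = 0
    for pred, gt in zip(preds, gts):
        if gt is None:
            # Negative query: correct only if the model rejects it.
            negatives += 1
            rejected += pred is None
        else:
            # Positive query: a rejection counts as a miss.
            positives += 1
            if pred is not None and temporal_iou(pred, gt) >= iou_threshold:
                hits += 1
    return {
        "recall_at_1": hits / positives if positives else 0.0,
        "rejection_accuracy": rejected / negatives if negatives else 0.0,
    }


if __name__ == "__main__":
    preds = [(0.0, 10.0), None, None, (5.0, 8.0)]
    gts = [(0.0, 8.0), (2.0, 4.0), None, None]
    print(evaluate(preds, gts))
```

In this sketch a model that never rejects scores zero on rejection accuracy, while a model that rejects everything scores zero on Recall@1, which is why the two numbers are read together.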
Original language: English
Title of host publication: 2025 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV)
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Pages: 5336-5345
Number of pages: 10
ISBN (Electronic): 9798331510831
ISBN (Print): 9798331510848
Publication status: Published - 8 Apr 2025
Event: IEEE/CVF Winter Conference on Applications of Computer Vision: WACV - Tucson, Arizona, United States
Duration: 28 Feb 2025 to 5 Mar 2025
https://wacv2025.thecvf.com/

Publication series

Name: IEEE Workshop on Applications of Computer Vision (WACV)
Publisher: IEEE
ISSN (Print): 2472-6737
ISSN (Electronic): 2642-9381

Conference

Conference: IEEE/CVF Winter Conference on Applications of Computer Vision
Country/Territory: United States
City: Tucson, Arizona
Period: 28/02/25 to 5/03/25

Bibliographical note

Publisher Copyright:
© 2025 IEEE.
