Abstract
Long videos contain many repeating actions, events and shots.
These repetitions are frequently given identical captions, which makes it
difficult to retrieve the exact desired clip using a text search. In this
paper, we formulate the problem of unique captioning: Given multiple
clips with the same caption, we generate a new caption for each clip
that uniquely identifies it. We propose Captioning by Discriminative
Prompting (CDP), which predicts a property that can separate identically
captioned clips, and use it to generate unique captions. We introduce
two benchmarks for unique captioning, based on egocentric footage
and timeloop movies – where repeating actions are common. We demonstrate
that captions generated by CDP improve text-to-video R@1 by
15% for egocentric videos and 10% in timeloop movies.
Original language | English |
---|---|
Title of host publication | Computer Vision – ACCV 2024 |
Subtitle of host publication | 17th Asian Conference on Computer Vision, Hanoi, Vietnam, December 8–12, 2024, Proceedings |
Publisher | Springer, Singapore |
Publication status | Accepted/In press - 11 Oct 2024 |
Event | Asian Conference on Computer Vision - Hanoi, Viet Nam Duration: 8 Dec 2024 → 12 Dec 2024 https://accv2024.org |
Publication series
Name | Lecture Notes in Computer Science |
---|---|
Publisher | Springer |
Volume | 15480 |
ISSN (Print) | 0302-9743 |
ISSN (Electronic) | 1611-3349 |
Conference
Conference | Asian Conference on Computer Vision |
---|---|
Abbreviated title | ACCV |
Country/Territory | Viet Nam |
City | Hanoi |
Period | 8/12/24 → 12/12/24 |
Internet address | https://accv2024.org |
Projects
- Visual AI - Full Programme Grant Extension
  Damen, D. (Principal Investigator)
  1/06/23 → 30/11/25
  Project: Research
- 8459 EPSRC EP/T004991/1 UMPIRE - Dima Aldamen Fellowship
  Damen, D. (Principal Investigator)
  1/02/20 → 31/01/25
  Project: Research