Abstract
Summarization of videos depicting human activities is a timely problem with important applications, e.g., in surveillance or film/TV production, and it is steadily becoming more relevant. Research on video summarization has mainly relied on global clustering or local (frame-by-frame) saliency methods to provide automated algorithmic solutions for key-frame extraction. This work presents a method that selects as key-frames those video frames which can optimally reconstruct the entire video. The novelty lies in modelling the reconstruction algebraically as a Column Subset Selection Problem (CSSP), so that the extracted key-frames correspond to elementary visual building blocks. The problem is formulated under an optimization framework and approximately solved via a genetic algorithm. The proposed video summarization method is evaluated on a publicly available annotated dataset using an objective evaluation metric. According to the quantitative results, it clearly outperforms the typical clustering approach.
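The core idea described in the abstract (frame features arranged as the columns of a matrix, a CSSP reconstruction objective, and a genetic-algorithm search over column subsets) can be illustrated with a minimal sketch. This is not the authors' implementation: the feature matrix, fitness function details, and population parameters below are illustrative assumptions.

```python
# Minimal sketch: key-frame selection as a Column Subset Selection Problem (CSSP),
# approximately solved with a simple genetic algorithm. All parameters are toy values.
import numpy as np

rng = np.random.default_rng(0)


def reconstruction_error(X, columns):
    """CSSP objective: error of reconstructing X from the selected columns C
    via the pseudo-inverse, ||X - C C^+ X||_F (lower is better)."""
    C = X[:, columns]
    return np.linalg.norm(X - C @ np.linalg.pinv(C) @ X)


def genetic_cssp(X, k, pop_size=40, generations=100, mutation_rate=0.2):
    """Evolve subsets of k column (frame) indices that minimise the
    reconstruction error; returns the best subset found."""
    n = X.shape[1]
    population = [rng.choice(n, size=k, replace=False) for _ in range(pop_size)]
    for _ in range(generations):
        # Elitist selection: keep the better half of the population.
        population.sort(key=lambda cols: reconstruction_error(X, cols))
        survivors = population[: pop_size // 2]
        children = []
        while len(survivors) + len(children) < pop_size:
            # Crossover: draw a child subset from the union of two parents.
            a, b = rng.choice(len(survivors), size=2, replace=False)
            pool = np.union1d(survivors[a], survivors[b])
            child = rng.choice(pool, size=k, replace=False)
            if rng.random() < mutation_rate:
                # Mutation: replace one index, then restore k distinct indices.
                child[rng.integers(k)] = rng.integers(n)
                child = np.unique(child)
                while child.size < k:
                    child = np.unique(np.append(child, rng.integers(n)))
            children.append(child)
        population = survivors + children
    return min(population, key=lambda cols: reconstruction_error(X, cols))


# Toy usage: 128-D features for 200 frames, pick 5 key-frames.
X = rng.standard_normal((128, 200))
key_frames = sorted(genetic_cssp(X, k=5))
print("Selected key-frame indices:", key_frames)
```

In this sketch each individual is a set of k frame indices and fitness is the Frobenius-norm reconstruction error; the paper's actual encoding, operators, and frame representation may differ.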
Original language | English |
---|---|
Title of host publication | 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2017) |
Subtitle of host publication | Proceedings of a meeting held 5-9 March 2017, New Orleans, Louisiana, USA |
Publisher | Institute of Electrical and Electronics Engineers (IEEE) |
Pages | 1627-1631 |
Number of pages | 5 |
ISBN (Electronic) | 9781509041176 |
ISBN (Print) | 9781509041183 |
DOIs | |
Publication status | Published - Aug 2017 |
Publication series
Name | |
---|---|
ISSN (Print) | 2379-190X |
Keywords
- video summarization
- sparse dictionary learning
- genetic algorithm