Abstract
Video indexing is central to efficient content-based retrieval and
browsing of visual information
stored in large multimedia databases.
This thesis presents work towards a unified framework for automated
video indexing. To create an efficient index, a set of representative
key frames is selected that captures the content of the entire
video. This is achieved by, firstly, segmenting the video into its
constituent shots and, secondly, selecting an optimal number of frames
between the identified shot boundaries. The segmentation algorithm is
designed to detect both abrupt shot transitions, or \emph{cuts}, and gradual
transitions, such as \emph{dissolves} and \emph{fades}. It does so
using a two-component frame differencing metric that takes
both image structure and colour distributions into account.
The application of hierarchical block-based normalised correlation and
local colour histogram differences leads to a method which is both
accurate and robust.
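
As a rough illustration of a two-component frame difference of this kind, the sketch below combines a block-based normalised correlation term (image structure) with a per-block colour histogram term (colour distribution). It is a minimal sketch under stated assumptions, not the metric developed in the thesis: the hierarchical part is omitted, and the function names, the block size, the number of histogram bins, and the mixing weight `alpha` are all hypothetical choices. Frames are assumed to be lists of rows of 8-bit `(r, g, b)` tuples.

```python
from math import sqrt


def luminance(pixel):
    """Approximate luminance of an (r, g, b) pixel (Rec. 601 weights)."""
    r, g, b = pixel
    return 0.299 * r + 0.587 * g + 0.114 * b


def blocks(frame, size):
    """Yield non-overlapping size x size blocks as flat lists of pixels."""
    height, width = len(frame), len(frame[0])
    for top in range(0, height - size + 1, size):
        for left in range(0, width - size + 1, size):
            yield [frame[y][x]
                   for y in range(top, top + size)
                   for x in range(left, left + size)]


def normalised_correlation(a, b):
    """Zero-mean normalised correlation of two equal-length sequences."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    da = [v - mean_a for v in a]
    db = [v - mean_b for v in b]
    denom = sqrt(sum(v * v for v in da) * sum(v * v for v in db))
    if denom < 1e-12:
        # At least one block is flat: call it a match only if the means agree.
        return 1.0 if abs(mean_a - mean_b) < 1e-9 else 0.0
    return sum(x * y for x, y in zip(da, db)) / denom


def colour_histogram(pixels, bins=4):
    """Normalised joint RGB histogram with bins**3 cells (8-bit channels)."""
    hist = [0.0] * bins ** 3
    step = 256 // bins
    for r, g, b in pixels:
        hist[(r // step) * bins * bins + (g // step) * bins + b // step] += 1
    return [h / len(pixels) for h in hist]


def frame_difference(f1, f2, block=8, alpha=0.5):
    """Difference in [0, 1]: alpha * structure term + (1 - alpha) * colour term.

    alpha and block are illustrative parameters, not values from the thesis.
    """
    structure, colour, n = 0.0, 0.0, 0
    for b1, b2 in zip(blocks(f1, block), blocks(f2, block)):
        ncc = normalised_correlation([luminance(p) for p in b1],
                                     [luminance(p) for p in b2])
        structure += (1.0 - ncc) / 2.0  # map correlation [-1, 1] to [0, 1]
        h1, h2 = colour_histogram(b1), colour_histogram(b2)
        colour += 0.5 * sum(abs(x - y) for x, y in zip(h1, h2))
        n += 1
    if n == 0:
        raise ValueError("frames are smaller than one block")
    return alpha * structure / n + (1.0 - alpha) * colour / n
```

In a sketch like this, a cut would show up as a single large value between consecutive frames, whereas a dissolve or fade would produce a run of moderate values; how such a signal is thresholded is left open here.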
After the segmentation step, the key frames are selected to minimise
representational redundancy whilst still portraying the content in each
shot. This is achieved by employing a graph-based representation of
each shot, where nodes represent frames and edge weights represent the
amount of content shared by the frames at the connected nodes. The key
frames are then selected as those corresponding to nodes on the
least-weight path through the graph. As a final step, the camera motion is characterised to
provide an additional layer of video annotation which may prove useful
for indexing.
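
The graph-based key-frame selection described above can be sketched as a least-weight path search. The abstract does not specify how the graph is constructed, so the following is a minimal sketch under several assumptions: edges only run forward in time, each edge spans at most `max_gap` frames, and the path runs from the first to the last frame of the shot. The names `least_weight_path`, `shared_content`, and `signatures` are hypothetical, and the toy weight function merely stands in for a real measure of shared content.

```python
def least_weight_path(n_frames, weight, max_gap):
    """Return the frame indices on the least-weight path from frame 0 to
    frame n_frames - 1, where weight(i, j) is the cost of edge i -> j.

    Assumes forward-only edges spanning at most max_gap frames.
    """
    if n_frames < 1 or max_gap < 1:
        raise ValueError("need at least one frame and max_gap >= 1")
    cost = [float("inf")] * n_frames
    prev = [None] * n_frames
    cost[0] = 0.0
    # Frames are already in topological order, so one forward pass suffices.
    for j in range(1, n_frames):
        for i in range(max(0, j - max_gap), j):
            candidate = cost[i] + weight(i, j)
            if candidate < cost[j]:
                cost[j], prev[j] = candidate, i
    # Walk the predecessor links back from the last frame.
    path, node = [], n_frames - 1
    while node is not None:
        path.append(node)
        node = prev[node]
    return path[::-1]


# Toy example: one scalar "signature" per frame (hypothetical data).
signatures = [0, 0, 5, 5, 9]


def shared_content(i, j):
    """Illustrative weight: high when two frames have similar signatures."""
    return 1.0 / (1.0 + abs(signatures[i] - signatures[j]))


key_frames = least_weight_path(len(signatures), shared_content, max_gap=2)
```

Because similar frames are joined by heavy edges, the least-weight path tends to hop between dissimilar frames, which is one way to read the goal of minimising representational redundancy. The `max_gap` limit is an assumption added here to stop the path from collapsing to a single direct edge between the first and last frames.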
Translated title of the contribution | Video Segmentation and Indexing using Motion Estimation |
---|---|
Original language | English |
Publication status | Published - 2004 |