The combination of visible-light and infrared imagery occurs naturally in some creatures, including the rattlesnake. This natural process, together with the widespread use of multi-spectral multi-sensor systems, has motivated research into image fusion methods. Recent advances in image fusion techniques have necessitated novel ways of assessing fused images; previous assessments have focused on subjective quality ratings combined with computational metrics. Earlier work has shown the need to apply a task to the assessment process; the current work continues this approach by extending the novel use of scanpath analysis. In our experiments, participants were shown two video sequences, one in high luminance (HL) and one in low luminance (LL), both featuring a group of people walking around a clearing of trees. Each participant viewed the visible and infrared (IR) inputs alone; side by side (SBS); and fused by averaging (AVE), by the discrete wavelet transform (DWT), and by the dual-tree complex wavelet transform (DT-CWT). Participants were asked to track one individual in each video sequence and to respond by key press when other individuals carried out secondary actions. Results showed that the SBS display led to markedly poorer accuracy than the other displays, while reaction times on the secondary task favoured AVE in the HL sequence and DWT in the LL sequence. Results are discussed in relation to previous findings regarding item saliency and task demands, and the potential for comparative experiments evaluating human performance when viewing fused sequences against naturally occurring fusion processes such as that of the rattlesnake is highlighted.
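The abstract names two of the fusion schemes under test, AVE (pixel averaging) and DWT (wavelet-domain fusion), without detailing their implementation. As a hedged illustration only, the sketch below shows how such schemes are commonly realised: the wavelet choice (one-level Haar) and the fusion rule (average the approximation band, keep the larger-magnitude detail coefficients) are assumptions, not the study's actual pipeline.

```python
import numpy as np

S2 = np.sqrt(2.0)

def dwt2_haar(x):
    """One-level 2-D Haar DWT of an even-sized greyscale frame."""
    # transform along rows
    lo = (x[:, 0::2] + x[:, 1::2]) / S2
    hi = (x[:, 0::2] - x[:, 1::2]) / S2
    # transform along columns -> approximation + three detail bands
    ll = (lo[0::2, :] + lo[1::2, :]) / S2
    lh = (lo[0::2, :] - lo[1::2, :]) / S2
    hl = (hi[0::2, :] + hi[1::2, :]) / S2
    hh = (hi[0::2, :] - hi[1::2, :]) / S2
    return ll, lh, hl, hh

def idwt2_haar(ll, lh, hl, hh):
    """Inverse of dwt2_haar (perfect reconstruction)."""
    lo = np.empty((ll.shape[0] * 2, ll.shape[1]))
    hi = np.empty_like(lo)
    lo[0::2], lo[1::2] = (ll + lh) / S2, (ll - lh) / S2
    hi[0::2], hi[1::2] = (hl + hh) / S2, (hl - hh) / S2
    x = np.empty((lo.shape[0], lo.shape[1] * 2))
    x[:, 0::2], x[:, 1::2] = (lo + hi) / S2, (lo - hi) / S2
    return x

def fuse_average(vis, ir):
    """AVE fusion: per-pixel mean of the two input frames."""
    return 0.5 * (vis + ir)

def fuse_dwt(vis, ir):
    """DWT fusion sketch: average the approximation band, and for each
    detail band keep the coefficient with the larger magnitude."""
    cv, ci = dwt2_haar(vis), dwt2_haar(ir)
    ll = 0.5 * (cv[0] + ci[0])
    details = [np.where(np.abs(a) >= np.abs(b), a, b)
               for a, b in zip(cv[1:], ci[1:])]
    return idwt2_haar(ll, *details)
```

Applied frame by frame, `fuse_dwt` tends to preserve whichever sensor carries the stronger local edge (e.g. an IR-hot person against cool foliage), whereas `fuse_average` attenuates detail present in only one input.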
|Title||Task-based scanpath assessment of multi-sensor video fusion in complex scenarios|
|Pages (from-to)||51-65|
|Number of pages||15|
|Publication status||Published - Jan 2010|