Evaluation and benchmarking in content-based image retrieval has always been a somewhat neglected research area, making it difficult to judge the efficacy of many presented approaches. In this paper we investigate the issue of benchmarking for colour-based image retrieval systems, which enable users to retrieve images from a database based on lowlevel colour content alone. We argue that current image retrieval evaluation methods are not suited to benchmarking colour-based image retrieval systems, due in main to not allowing users to reflect upon the suitability of retrieved images within the context of a creative project and their reliance on highly subjective ground-truths. As a solution to these issues, the research presented here introduces the Mosaic Test for evaluating colour-based image retrieval systems, in which test-users are asked to create an image mosaic of a predetermined target image, using the colour-based imageretrieval system that is being evaluated. We report on our findings from a user study which suggests that the Mosaic Test overcomes the major drawbacks associated with existing image retrieval evaluation methods, by enabling users to reflect upon image selections and automatically measuring image relevance in a way that correlates with the perception of many human assessors. We therefore propose that the Mosaic Test be adopted as a standardised benchmark for evaluating and comparing colour-based image retrieval systems.
|Title of host publication||Proceedings of 1st European Workshop on Human-Computer Interaction and Information Retrieval (EuroHCIR)|
|Number of pages||4|
|Publication status||Published - 2011|
|Name||CEUR workshop proceedings|
- benchmarking, content-based image retrieval, image databases, image mosaic, performance evaluation