Abstract
Non-invasive population monitoring of endangered species is possible using image data. Camera traps can generate vast amounts of uncontrolled natural images (KAYS et al., 2011), in which animal encounters must be detected before analysis. The feasibility of this approach is limited by the requirement to manually analyse the video for the presence of relevant specimens. Automation will greatly enhance the practicality of population monitoring using images. Here we present results for an automated image analysis system based on deformable part-based models (DPMs) (FELZENSZWALB et al., 2010). Three detector configurations are trained and tested using 750 images of wild and zoo chimpanzees, with manually annotated ground-truth faces for evaluation (provided by the SAISBECO Project: www.saisbeco.com). All models have comparable average precisions and qualitatively capture different aspects of the data variation. An existing chimpanzee face detection system (ERNST et al., 2011), building upon extensive human face detection work (VIOLA et al., 2004), has some robustness to illumination variance. Whilst capable of real-time detection, it has limited robustness to occlusion or pose variance, both common in natural images. DPMs have already been applied to animal head detection with some success (PARKHI et al., 2011). They offer the potential to incorporate varied poses without extensive parameter tuning, whilst retaining illumination invariance and introducing robustness to partial occlusion. Our three detector configurations use DPMs over the face region (‘Face’); over an expanded facial region (‘Expanded’); and as the basis of a linear integration of multiple detectors, which we call ‘detector fabric’ (‘Fabric’). Their cross-validated average precisions were 70.12%, 72.41% and 70.84% respectively. The models’ results differ qualitatively, favouring clear faces, distinct surroundings, or a mixture of the two.
The Face detector can detect well-resolved faces where the other two, which rely more on spatial context, fail. The reduced reliance on facial features alone in Expanded and Fabric allows detection where the face is less well resolved. Expanding the ground truth annotations to cover a greater image area has enabled detection in non-frontal poses and under some partial occlusion. Where only Fabric succeeds, faces can be further fragmented by occlusion, and detection is possible even where neither the face nor the surrounding regions are well resolved. This is a promising prototype for applying DPMs to wild animal detection, capturing pose and occlusion variance, albeit at increased computational cost. The techniques are sufficiently generic to be applied to other species.
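The average precisions quoted above summarise how well each detector ranks true faces ahead of false alarms. As a rough illustration only, the sketch below computes an uninterpolated average precision from scored bounding boxes. The box format, the 0.5 overlap threshold, the greedy matching rule and the names `iou` and `average_precision` are assumptions made for this example and are not taken from the evaluation protocol used in this work.

```python
from typing import Dict, List, Tuple

Box = Tuple[float, float, float, float]  # assumed format: (x1, y1, x2, y2)
Detection = Tuple[str, float, Box]       # assumed format: (image_id, score, box)


def iou(a: Box, b: Box) -> float:
    """Intersection-over-union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def average_precision(
    detections: List[Detection],
    ground_truth: Dict[str, List[Box]],
    iou_threshold: float = 0.5,  # assumed overlap criterion
) -> float:
    """Uninterpolated AP: mean precision at the rank of each true positive,
    normalised by the number of annotated faces."""
    n_gt = sum(len(boxes) for boxes in ground_truth.values())
    if n_gt == 0:
        return 0.0
    matched = {img: [False] * len(boxes) for img, boxes in ground_truth.items()}

    ap_sum, true_positives = 0.0, 0
    ranked = sorted(detections, key=lambda d: d[1], reverse=True)
    for rank, (img, _score, box) in enumerate(ranked, start=1):
        # Find the annotated face in the same image with the largest overlap.
        best_iou, best_idx = 0.0, -1
        for idx, gt_box in enumerate(ground_truth.get(img, [])):
            overlap = iou(box, gt_box)
            if overlap > best_iou:
                best_iou, best_idx = overlap, idx
        # Each annotation may be claimed once; duplicates count as false positives.
        if best_idx >= 0 and best_iou >= iou_threshold and not matched[img][best_idx]:
            matched[img][best_idx] = True
            true_positives += 1
            ap_sum += true_positives / rank
    return ap_sum / n_gt


# Toy data: two annotated faces, three detections (hit, false alarm, hit).
toy_gt = {"a": [(0, 0, 10, 10)], "b": [(0, 0, 10, 10)]}
toy_dets = [
    ("a", 0.9, (0, 0, 10, 10)),
    ("a", 0.8, (50, 50, 60, 60)),
    ("b", 0.7, (1, 1, 10, 10)),
]
toy_ap = average_precision(toy_dets, toy_gt)
```

In the toy data the ranked detections are a hit, a false alarm and a hit, so the result is (1/1 + 2/3) / 2. Interpolated variants of average precision, such as those used in common detection benchmarks, would give slightly different values.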
REFERENCES:
ERNST A, KUBLBECK C (2011): Fast face detection and species classification of African great apes. 8th IEEE International Conference on Advanced Video and Signal-Based Surveillance (AVSS), 279 - 284.
FELZENSZWALB P, GIRSHICK R, MCALLESTER D, RAMANAN D (2010): Object detection with discriminatively trained part-based models. IEEE Transactions on Pattern Analysis and Machine Intelligence 32, 1627 - 1645.
KAYS R, TILAK S, KRANSTAUBER B, JANSEN P, CARBONE C, ROWCLIFFE M, FOUNTAIN T, EGGERT J, HE Z (2011): Camera traps as sensor networks for monitoring animal communities. International Journal of Research and Reviews in Wireless Sensor Networks 1, 19 - 29.
PARKHI O, VEDALDI A, JAWAHAR C, ZISSERMAN A (2011): The truth about cats and dogs. Proceedings of ICCV, 1427 - 1434.
SAISBECO, http://www.saisbeco.com/, Downloaded on 14 June 2013.
VIOLA P, JONES M (2004): Robust real-time face detection. International Journal of Computer Vision 57, 137 - 154.
Original language | English |
---|---|
Title of host publication | 9th International Conference on Behaviour, Physiology and Genetics of Wildlife |
Publisher | Leibniz Institute for Zoo and Wildlife Research |
Publication status | Published - 2013 |