The Contrasting Roles of Shape in Human Vision and Convolutional Neural Networks

Research output: Chapter in Book/Report/Conference proceeding › Conference Contribution (Conference Proceeding)

Abstract

Convolutional neural networks (CNNs) were inspired by human vision and, in some settings, achieve performance comparable to human object recognition. This has led to the speculation that both systems use similar mechanisms to perform recognition. In this study, we conducted a series of simulations that indicate that there is a fundamental difference between human vision and vanilla CNNs: while object recognition in humans relies on analysing shape, these CNNs do not have such a shape bias. We teased apart the type of features selected by the model by modifying the CIFAR-10 dataset so that, in addition to containing objects with shape, the images concurrently contained non-shape features, such as a noise-like mask. When trained on this modified set of images, the model did not show any bias towards selecting shapes as features. Instead, it relied on whichever feature allowed it to make the best prediction, even when this feature was a noise-like mask or a single predictive pixel amongst 50176 pixels.
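
The single-predictive-pixel manipulation mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' code: the 224 x 224 image size is an assumption chosen because 224 * 224 = 50176, and the function names, the seeded mapping from class label to pixel location, and the pixel value are all hypothetical. The sketch uses plain Python lists and a grayscale image to stay dependency-free.

```python
import random

SIDE = 224        # assumption: 224 * 224 = 50176 pixels, matching the abstract's count
NUM_CLASSES = 10  # CIFAR-10 has ten classes


def pixel_location(label, side=SIDE, seed=0):
    """Return a fixed (row, col) for a class label (hypothetical mapping)."""
    rng = random.Random(seed)
    # Draw one distinct flat index per class, then look up this label's index.
    flat_indices = rng.sample(range(side * side), NUM_CLASSES)
    return divmod(flat_indices[label], side)


def add_predictive_pixel(image, label, value=255):
    """Return a copy of a square grayscale image (list of rows) with one
    class-specific pixel overwritten; the input image is left unchanged."""
    marked = [row[:] for row in image]
    row, col = pixel_location(label, side=len(image))
    marked[row][col] = value
    return marked


# Usage: mark a blank image as belonging to class 3.
blank = [[0] * SIDE for _ in range(SIDE)]
marked = add_predictive_pixel(blank, label=3)
```

Because the pixel's location depends only on the class label, a classifier can reach high accuracy on such images without using object shape at all, which is the kind of shortcut the abstract describes.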
Original language: English
Title of host publication: Proceedings of the 41st Annual Conference of the Cognitive Science Society
Subtitle of host publication: CogSci 2019
Editors: A.K. Goel, C.M. Seifert, C. Freksa
Publication status: Published - 24 Jul 2019

