TY - JOUR
T1 - Deep Problems with Neural Network Models of Human Vision
AU - Bowers, Jeffrey S
AU - Malhotra, Gaurav
AU - Dujmovic, Marin
AU - Llera Montero, Milton
AU - Tsvetkov, Chris I
AU - Biscione, Valerio
AU - Puebla, Guillermo
AU - Gonzalez Adolfi, Federico
AU - Hummel, John
AU - Heaton, Rachel
AU - Evans, Benjamin D
AU - Mitchell, Jeffrey
AU - Blything, Ryan
PY - 2022/12/1
Y1 - 2022/12/1
AB - Deep neural networks (DNNs) have had extraordinary successes in classifying photographic images of objects and are often described as the best models of biological vision. This conclusion is largely based on three sets of findings: (1) DNNs are more accurate than any other model in classifying images taken from various datasets, (2) DNNs do the best job in predicting the pattern of human errors in classifying objects taken from various behavioral datasets, and (3) DNNs do the best job in predicting brain signals in response to images taken from various brain datasets (e.g., single cell responses or fMRI data). However, these behavioral and brain datasets do not test hypotheses regarding what features are contributing to good predictions and we show that the predictions may be mediated by DNNs that share little overlap with biological vision. More problematically, we show that DNNs account for almost no results from psychological research. This contradicts the common claim that DNNs are good, let alone the best, models of human object recognition. We argue that theorists interested in developing biologically plausible models of human vision need to direct their attention to explaining psychological findings. More generally, theorists need to build models that explain the results of experiments that manipulate independent variables designed to test hypotheses rather than compete on making the best predictions. We conclude by briefly summarizing various promising modelling approaches that focus on psychological data.
DO - 10.1017/S0140525X22002813
M3 - Article (Academic Journal)
SP - 1
JO - Behavioral and Brain Sciences
JF - Behavioral and Brain Sciences
SN - 0140-525X
ER -