Abstract
Researchers studying the correspondences between Deep Neural Networks (DNNs) and humans often give little consideration to severe testing when drawing conclusions from empirical findings, and this is impeding progress in building better models of minds. We first detail what we mean by severe testing and highlight why it is especially important when working with opaque models with many free parameters that may solve a given task in multiple different ways. Second, we provide multiple examples of researchers making strong claims regarding DNN-human similarities without engaging in severe testing of their hypotheses. Third, we consider why severe testing is undervalued. We provide evidence that part of the fault lies with the review process. There is now a widespread appreciation in many areas of science that a bias for publishing positive results (among other practices) is leading to a credibility crisis, but there seems to be less awareness of the problem in this area.
| Original language | English |
| --- | --- |
| Article number | 101158 |
| Number of pages | 11 |
| Journal | Cognitive Systems Research |
| Volume | 82 |
| Early online date | 22 Aug 2023 |
| DOIs | |
| Publication status | Published - 1 Dec 2023 |
Bibliographical note
Publisher Copyright: © 2023
Keywords
- Memory
- Neural networks
- Perception
- Psychology
- Severe testing
- Vision