Combinatorial Generalisation in Machine Vision

  • Milton Llera Montero

Student thesis: Doctoral Thesis, Doctor of Philosophy (PhD)

Abstract

The human capacity for generalisation, i.e. the fact that we are able to successfully perform a familiar task in novel contexts, is one of the hallmarks of our intelligent behaviour. But what mechanisms enable this capacity, which is at once so impressive and yet comes so naturally to us? This question has driven copious amounts of research in both Cognitive Science and Artificial Intelligence for almost a century, with some advocating the need for symbolic systems and others the benefits of distributed representations. In this thesis we will explore which principles help AI systems to generalise to novel combinations of previously observed elements (such as colour and shape) in the context of machine vision. We will show that while approaches such as disentangled representation learning showed initial promise, they are fundamentally unable to solve this generalisation problem. In doing so we will illustrate the need to perform severe tests of models in order to properly assess their limitations. We will also see that such failures are robust across different datasets and training modalities, and are evident in the internal representations of the models. We then show that a different type of system, one that attempts to learn object-centric representations, is capable of solving the generalisation challenges that previous models could not. We conclude by discussing the implications of these results for long-standing questions regarding the kinds of cognitive systems that are required to solve generalisation problems.
Date of Award: 5 Dec 2023
Original language: English
Awarding Institution
  • University of Bristol
Supervisors: Jeffrey S Bowers & Rui Ponte Costa
