Abstract
Successful motor learning requires solving a credit assignment problem: the relationship between motor commands and their outcomes must be inferred. This problem is shared by biological and artificial agents that learn motor functions in a (shared) physical environment. In this regard, the computational principles that enable artificial agents to solve this credit assignment problem may also underlie how the brain solves it. In the reinforcement learning (RL) literature, policy gradient methods provide principled approaches to credit assignment in complex motor learning settings. This thesis investigates (policy) gradient computations in relation to motor learning in both biological and artificial agents, and makes two types of contributions. The first is practical and targeted at the machine learning community: it shows how to improve a popular RL algorithm, TD-learning, by exploiting (action) gradient computations, yielding a novel algorithm, Taylor TD-learning. The second is theoretical: the thesis builds on (action) gradient computations in RL to provide a normative framework for understanding motor learning in neuroscience. This framework yields deeper insights into the different motor learning processes taking place in the brain (e.g., error-based and reward-based learning mechanisms), while promoting closer links between the neuroscience and RL literatures.
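The abstract only names the core ingredients (TD-learning and action gradients); the sketch below is a loose, hedged illustration of what "exploiting (action) gradient computations in a TD-style update" can look like, not the thesis' actual Taylor TD-learning algorithm. All names and settings (`QNet`, `q_net`, `gamma`, the 0.1 action-noise scale) are illustrative assumptions: a critic's gradient with respect to the action is obtained by autograd and used in a first-order Taylor estimate of the next-state value inside an otherwise standard TD(0) error.

```python
import torch
import torch.nn as nn

# Hypothetical critic Q(s, a): a small MLP over concatenated state and action.
class QNet(nn.Module):
    def __init__(self, state_dim, action_dim):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim + action_dim, 64), nn.ReLU(),
            nn.Linear(64, 1),
        )

    def forward(self, s, a):
        return self.net(torch.cat([s, a], dim=-1)).squeeze(-1)

state_dim, action_dim, gamma = 3, 2, 0.99
q_net = QNet(state_dim, action_dim)

# One illustrative transition (s, a, r, s') and the mean next action a'_mu.
s = torch.randn(1, state_dim)
a = torch.randn(1, action_dim)
r = torch.tensor([0.5])
s_next = torch.randn(1, state_dim)
a_next_mu = torch.zeros(1, action_dim, requires_grad=True)

# Gradient of Q(s', a') with respect to the action, computed via autograd.
q_next = q_net(s_next, a_next_mu)
grad_a = torch.autograd.grad(q_next.sum(), a_next_mu)[0]

# First-order Taylor estimate of Q at a noisy next action, expanded around a'_mu.
a_next = a_next_mu.detach() + 0.1 * torch.randn_like(a_next_mu)
q_next_taylor = q_next.detach() + ((a_next - a_next_mu.detach()) * grad_a).sum(-1)

# Standard-looking TD(0) target and error built from the Taylor estimate.
td_target = r + gamma * q_next_taylor
td_error = td_target - q_net(s, a)
print(td_error)
```

The design choice illustrated here is that the action gradient lets the TD target vary smoothly with action perturbations around a mean action, rather than relying solely on a single sampled action; how this is turned into a lower-variance update rule is the subject of the thesis itself.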
Date of Award | 18 Jun 2024
---|---
Original language | English
Awarding Institution |
Supervisor | Laurence Aitchison (Supervisor), Casimir J H Ludwig (Supervisor) & Nathan F Lepora (Supervisor)