Resource-based Dynamic Rewards for Factored MDPs

Ronan Killough, Kim Bauters, Kevin McAreavey, Weiru Liu, Jun Hong

Research output: Chapter in Book/Report/Conference proceeding > Conference Contribution (Conference Proceeding)


Factored MDPs provide an efficient way to reduce the complexity of large, real-world domains by exploiting structure within the state space. This avoids the need for the state space to be fully enumerated, which is impractical in large domains. However, defining a reward function for state transitions is difficult in a factored MDP since transitions are not known prior to execution. In this paper, we provide a novel method for deriving rewards from information within the states in order to determine intermediate rewards for state transitions. We do this by treating some specific state variables as resources, allowing costs and rewards to be inferred from changes to the resources and ensuring the agent is resource-aware while also being goal-oriented. To facilitate this, we propose a novel variant of Dynamic Bayesian Networks specifically for modelling action transitions and capable of dealing with relative changes to real-valued state variables (such as resources) in a compact fashion. We also propose a number of reward functions which model resource types commonly found in real-world situations. We go on to show that our proposed framework offers an improvement over existing techniques involving reward functions for factored MDPs as it improves both the efficiency and decision quality of online planners when operating on these models.
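
The idea of inferring a reward from changes to resource variables can be illustrated with a minimal sketch. This is not the paper's formulation: the `Resource` class, the linear per-unit `weight`, the `resource_reward` function, and the example variables (`fuel`, `cargo`, `x`) are all hypothetical, chosen only to show a reward being computed from the difference between two factored states rather than from an enumerated transition table.

```python
from dataclasses import dataclass
from typing import Dict, List

# A factored state: one value per named state variable.
State = Dict[str, float]


@dataclass(frozen=True)
class Resource:
    """A state variable treated as a resource (hypothetical structure)."""
    name: str      # key of the state variable
    weight: float  # assumed linear value per unit gained or lost


def resource_reward(before: State, after: State,
                    resources: List[Resource]) -> float:
    """Sum the weighted change of each resource across one transition.

    Variables not listed as resources (e.g. position) contribute nothing.
    """
    return sum(r.weight * (after[r.name] - before[r.name])
               for r in resources)


resources = [Resource("fuel", 1.0), Resource("cargo", 5.0)]
before = {"fuel": 10.0, "cargo": 0.0, "x": 3.0}
after = {"fuel": 8.0, "cargo": 1.0, "x": 4.0}

# Fuel drops by 2 (cost of 2.0) and cargo rises by 1 (gain of 5.0).
print(resource_reward(before, after, resources))  # 3.0
```

Because the reward depends only on the two states, it can be evaluated for any transition an online planner encounters, without listing transitions in advance. The linear weighting is just one assumed choice; the abstract mentions several reward functions for different resource types, which this sketch does not attempt to reproduce.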
Original language: English
Title of host publication: 2017 IEEE 29th International Conference on Tools with Artificial Intelligence (ICTAI 2017)
Subtitle of host publication: Proceedings of a meeting held 6-8 November 2017, Boston, Massachusetts, USA
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Number of pages: 8
ISBN (Electronic): 9781538638767
ISBN (Print): 9781538638774
Publication status: Published - Jun 2018

Publication series

ISSN (Print): 2375-0197

