Simplified policy iteration for skip-free Markov decision processes

EJ Collins

Research output: Working paper

Abstract

We describe and analyse a new simplified policy iteration type algorithm for finite average cost Markov decision processes that are skip-free in the negative direction. We show that the algorithm is guaranteed to converge after a finite number of iterations, but the computational effort required for each iteration step is comparable with that for value iteration. We show that the analysis can be easily extended to solve continuous time models, discounted cost models and communicating models, and provides new insights into the formulation of the constraints in the linear programming approach to skip-free models. We also introduce and motivate a new class of models for multidimensional control problems which we call skip-free Markov decision processes on trees and show that the approach and the algorithm extend naturally to this wider class of models.
Translated title of the contributionSimplified policy iteration for skip-free Markov decision processes
Original languageEnglish
PublisherUniversity of Bristol
Number of pages29
Publication statusPublished - 2007

Fingerprint

Dive into the research topics of 'Simplified policy iteration for skip-free Markov decision processes'. Together they form a unique fingerprint.

Cite this