Asymptotic analysis of temporal-difference learning algorithms with constant step-sizes

Research output: Contribution to journal, Article (Academic Journal), peer-review

6 Citations (Scopus)

Abstract

This paper analyzes the asymptotic mean-square behavior of temporal-difference learning algorithms with constant step-sizes and linear function approximation. The analysis is carried out for a discounted cost function associated with a Markov chain with a finite-dimensional state space. Under mild conditions, an upper bound on the asymptotic mean-square error of these algorithms is determined as a function of the step-size. Under the same assumptions, this bound is also shown to be linear in the step-size. The main results of the paper are illustrated with examples related to M/G/1 queues and nonlinear AR models with Markov switching.
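
As a rough illustration of the kind of algorithm the abstract refers to, the following Python sketch runs TD(0) with linear function approximation and a constant step-size on a small Markov chain. Everything specific here is an assumption made for illustration and is not taken from the paper: the function name `td0_constant_step`, the two-state chain, the costs, the one-hot features, and the parameter values. TD(0) is chosen only as the simplest member of the temporal-difference family.

```python
import random


def td0_constant_step(P, cost, features, gamma, alpha, n_steps, seed=0):
    """TD(0) with linear function approximation and constant step-size alpha.

    P        -- transition matrix of the Markov chain (list of rows)
    cost     -- one-step cost incurred in each state
    features -- feature vector for each state
    gamma    -- discount factor, 0 < gamma < 1
    """
    rng = random.Random(seed)
    n_states = len(P)
    theta = [0.0] * len(features[0])
    x = 0
    for _ in range(n_steps):
        # Sample the next state from row x of the transition matrix.
        x_next = rng.choices(range(n_states), weights=P[x])[0]
        # Linear value estimates for the current and next state.
        v = sum(t * f for t, f in zip(theta, features[x]))
        v_next = sum(t * f for t, f in zip(theta, features[x_next]))
        # Temporal difference: cost plus discounted next value minus current value.
        delta = cost[x] + gamma * v_next - v
        # The step-size alpha stays fixed; it is not decreased over time.
        theta = [t + alpha * delta * f for t, f in zip(theta, features[x])]
        x = x_next
    return theta


# Hypothetical two-state chain, chosen only for illustration.
P = [[0.9, 0.1], [0.2, 0.8]]
cost = [0.0, 1.0]
features = [[1.0, 0.0], [0.0, 1.0]]
theta = td0_constant_step(P, cost, features, gamma=0.9, alpha=0.01, n_steps=50000)
print(theta)
```

Because the step-size is held constant, the iterates do not settle at a single point but keep fluctuating around the solution; the abstract's result concerns how the asymptotic mean-square size of that error is bounded as a function of the step-size.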
Original language: English
Pages (from-to): 107 - 133
Number of pages: 27
Journal: Machine Learning
Volume: 63 (2)
Publication status: Published - May 2006

Bibliographical note

Publisher: Springer
Other identifier: IDS number 041RE
