Variance-based Learning Classifier System without Convergence of Reward Estimation

Takato Tatsumi, Takahiro Komine, Masaya Nakata, Hiroyuki Sato, Tim M D Kovacs, Takadama Keiki

Research output: Chapter in Book/Report/Conference proceeding; Conference Contribution (Conference Proceeding)


Abstract

A Learning Classifier System (LCS) is an evolutionary machine learning method that combines reinforcement learning with a genetic algorithm. An important feature of LCSs is that they can acquire generalized rules that match multiple states by using the # symbol. Among LCSs, the accuracy-based LCS (XCS) [4] can acquire "accurate" generalized rules by reducing the difference between the predicted reward and the acquired reward, but XCS has difficulty estimating this difference correctly in noisy environments. To address this issue, our previous research proposed XCS-SAC (XCS with Self-adaptive Accuracy Criterion) for noisy environments. Because the estimated standard deviation of the rewards is larger for inaccurate rules than for accurate ones, XCS-SAC calculates the fitness of a rule from the estimated standard deviation of its rewards.
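
As a small illustration of how a rule containing the # symbol matches multiple states, the following Python sketch checks a ternary condition against binary state strings. The function name `matches` and the bit-string encoding of states are assumptions made for this example; the abstract does not specify an encoding.

```python
def matches(condition: str, state: str) -> bool:
    """Return True if a rule condition matches a state.

    Assumption: states are strings over {'0', '1'} and conditions are
    strings over {'0', '1', '#'}, where '#' matches either symbol.
    """
    if len(condition) != len(state):
        return False
    return all(c == "#" or c == s for c, s in zip(condition, state))


# The generalized condition "1#0" covers two distinct states.
states = ["100", "110", "101", "000"]
covered = [s for s in states if matches("1#0", s)]
print(covered)
```

A condition with more # symbols covers more states, which is why the accuracy of such a generalized rule has to be judged over every state-action pair it matches.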

However, XCS-SAC must wait until the estimated standard deviation has converged for all state-action pairs. To overcome this problem, this paper focuses on the fact that the average value of the rewards is distributed around the true value, and proposes XCS without Convergence of Reward Estimation (XCS-CRE), which determines the accuracy of rules according to the distribution range of the average reward of the matched state-action pair.
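
The abstract does not give the actual accuracy criterion of XCS-CRE, so the following Python sketch only illustrates the underlying statistical idea: the sample mean of noisy rewards lies in a range around the true value. It is not the authors' method. The class `RewardStats`, the function `looks_accurate`, the `mean ± z * std / sqrt(n)` range, and the default `z = 1.96` are all assumptions made for this example.

```python
import math
from dataclasses import dataclass
from typing import Tuple


@dataclass
class RewardStats:
    """Running mean and variance of the rewards of one state-action pair
    (Welford's online algorithm). Hypothetical helper, not from the paper."""
    n: int = 0
    mean: float = 0.0
    m2: float = 0.0  # sum of squared deviations from the current mean

    def update(self, reward: float) -> None:
        self.n += 1
        delta = reward - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (reward - self.mean)

    def std(self) -> float:
        # Sample standard deviation; 0.0 until two rewards have been seen.
        return math.sqrt(self.m2 / (self.n - 1)) if self.n > 1 else 0.0

    def mean_range(self, z: float = 1.96) -> Tuple[float, float]:
        # Assumed range for the true mean: mean +/- z * std / sqrt(n).
        if self.n == 0:
            return (-math.inf, math.inf)
        half = z * self.std() / math.sqrt(self.n)
        return (self.mean - half, self.mean + half)


def looks_accurate(prediction: float, stats: RewardStats, z: float = 1.96) -> bool:
    """Assumed check: a rule's predicted reward is consistent with a
    state-action pair if it lies inside that pair's mean range."""
    low, high = stats.mean_range(z)
    return low <= prediction <= high


stats = RewardStats()
for r in [990.0, 1010.0, 1000.0, 1005.0, 995.0]:  # noisy rewards around 1000
    stats.update(r)

print(stats.mean_range())
print(looks_accurate(1000.0, stats), looks_accurate(500.0, stats))
```

In this sketch the range narrows as more rewards are observed, so a check of this kind can be applied before the standard deviation estimate itself has converged.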

Original language: English
Title of host publication: GECCO '16 Companion
Subtitle of host publication: Proceedings of the 2016 on Genetic and Evolutionary Computation Conference Companion
Place of Publication: New York, NY, USA
Publisher: Association for Computing Machinery (ACM)
Pages: 67-68
Number of pages: 2
ISBN (Print): 9781450343237
Publication status: Published - 20 Jul 2016
Event: GECCO '16
Duration: 20 Jul 2016 → …

