Adaptive Optimal Control via Reinforcement Learning: Theory and Its Application to Automotive Engine Systems

  • Anthony Siming Chen

Student thesis: Doctoral Thesis, Doctor of Philosophy (PhD)

Abstract

This thesis presents a new adaptive optimal control framework for continuous-time nonlinear input-affine systems. The idea of combining adaptive control with optimal control has emerged recently thanks to advances in one class of machine learning: reinforcement learning. The topic is also known as approximate/adaptive dynamic programming (ADP), which is often formulated in discrete time or as a Markov decision process (MDP). This work, for the first time, extends the idea of linear discrete-time Q-learning to a nonlinear continuous-time adaptive optimal control algorithm that runs without stepwise iterations. A particular focus of the research is the automotive engine application, with the objective of developing the highly integrated and complex propulsion technology of the future while accounting for the sustainability of future transport, i.e. emission reduction and optimised energy and power use. Hence, the thesis comprises two parts:
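For context, a standard formulation of the continuous-time nonlinear input-affine optimal control problem that such a framework addresses is the following (a generic textbook setting; the thesis's exact cost and assumptions may differ):

```latex
% Generic continuous-time input-affine optimal control setting (standard
% ADP formulation; illustrative, not necessarily the thesis's exact one).
\begin{align}
  \dot{x} &= f(x) + g(x)\,u, \qquad x \in \mathbb{R}^{n},\; u \in \mathbb{R}^{m},\\
  J(x_0, u) &= \int_{0}^{\infty} \big( q(x) + u^{\top} R\, u \big)\, \mathrm{d}t,
\end{align}
where the optimal value function $V^{*}(x) = \min_{u} J(x, u)$ satisfies the
Hamilton-Jacobi-Bellman (HJB) equation
\begin{equation}
  0 = \min_{u} \big[ q(x) + u^{\top} R\, u
      + \nabla V^{*}(x)^{\top} \big( f(x) + g(x)\, u \big) \big],
\end{equation}
with minimising control $u^{*} = -\tfrac{1}{2} R^{-1} g(x)^{\top} \nabla V^{*}(x)$.
```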

The theoretical work is driven by developments in reinforcement learning and ADP: a novel online Q-learning algorithm is proposed to approximately solve the optimal control problem in real time using a new adaptive critic neural network, without requiring complete system knowledge. Finite-time convergence of the value function approximation is guaranteed by a sliding-mode technique, while the persistent excitation (PE) condition on the state trajectories can be verified directly in real time. Furthermore, the proposed Q-learning approach is extended to solve a nonlinear optimal observer design problem, for which an observer Hamilton-Jacobi-Bellman (OHJB) equation is obtained. Closed-loop stability is rigorously proved via Lyapunov analysis, and numerical simulations demonstrate the effectiveness of the proposed methods.
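As a rough illustration of what an online adaptive-critic update looks like in this family of methods, here is a minimal Python sketch: a normalised-gradient critic update on the HJB residual for a scalar benchmark system. The dynamics, basis functions, gains, and probing signal are all illustrative assumptions; this is a generic ADP critic, not the thesis's Q-learning algorithm (which uses a sliding-mode adaptation for finite-time convergence).

```python
# Minimal sketch of an online adaptive-critic (ADP) update on the scalar
# benchmark x_dot = -x + u with cost integrand q(x) + u^2, q(x) = x^2.
# The critic approximates V(x) ~ W^T phi(x) and is tuned online by a
# normalised gradient law on the HJB residual. All choices (basis, gains,
# probing noise) are illustrative assumptions.
import numpy as np

def phi(x):
    """Polynomial critic basis (an assumed choice)."""
    return np.array([x**2, x**4])

def dphi(x):
    """Gradient of the basis with respect to x."""
    return np.array([2.0 * x, 4.0 * x**3])

f = lambda x: -x          # drift dynamics f(x)
g = 1.0                   # input gain g(x), constant here
q = lambda x: x**2        # state cost q(x)
R = 1.0                   # control weight

dt, T = 1e-3, 20.0        # Euler step and simulation horizon
alpha = 5.0               # critic learning rate
x = 1.0                   # initial state
W = np.zeros(2)           # critic weights

for k in range(int(T / dt)):
    t = k * dt
    grad_V = dphi(x) @ W
    # Greedy control from the current critic, plus a decaying probing
    # signal to (heuristically) maintain persistent excitation.
    u = -0.5 * (1.0 / R) * g * grad_V
    u += 0.3 * np.exp(-0.1 * t) * np.sin(7.0 * t)

    xdot = f(x) + g * u
    sigma = dphi(x) * xdot                  # regressor: d/dt of phi(x(t))
    delta = sigma @ W + q(x) + R * u**2     # HJB (Bellman) residual
    # Normalised gradient descent on delta**2.
    W += dt * (-alpha * sigma * delta / (1.0 + sigma @ sigma) ** 2)
    x += dt * xdot

# For this linear-quadratic special case the Riccati solution gives
# V(x) = p*x^2 with p = sqrt(2) - 1 ~ 0.414, so W[0] should approach
# that value while W[1] stays near zero.
print("final critic weights:", W)
```

The linear-quadratic special case is used here only because it gives a closed-form answer to check the learned weights against; the same update structure applies with nonlinear dynamics and richer bases.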

The practical work investigates two control problems of a Wankel rotary engine, namely air-fuel ratio (AFR) control and idle speed control, with the aim of reducing emissions and improving efficiency. An adaptive optimal controller is designed for idle speed regulation, and two controllers, 1) nonlinear observer-based and 2) Q-learning-based, are developed for AFR control. The control system development covers dynamics modelling, calibration, control design and simulation, implementation, and practical experiments. The proposed controllers are successfully applied and validated through a series of simulations and engine tests under different driving cycles.
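For readers unfamiliar with the terminology, the air-fuel ratio and the normalised lambda value through which it is usually regulated are defined as follows (standard engine-control definitions; the stoichiometric figure shown is the common gasoline value, not a number taken from the thesis):

```latex
% Standard AFR and lambda definitions used in engine control.
\begin{equation}
  \mathrm{AFR} = \frac{\dot{m}_{\mathrm{air}}}{\dot{m}_{\mathrm{fuel}}},
  \qquad
  \lambda = \frac{\mathrm{AFR}}{\mathrm{AFR}_{\mathrm{stoich}}},
\end{equation}
where $\mathrm{AFR}_{\mathrm{stoich}} \approx 14.7$ for gasoline; the control
objective is typically to hold $\lambda$ near $1$, the region where a
three-way catalyst converts emissions most efficiently.
```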
Date of Award: 22 Mar 2022
Original language: English
Awarding Institution
  • University of Bristol
Supervisors: Stuart C Burgess (Supervisor), Guido Herrmann (Supervisor) & Chris Brace (Supervisor)
