Closed-Loop Q-learning Control of a Small Unmanned Aircraft

Robert Clarke, Liam Fletcher, Colin Greatwood, Antony Waldock, Thomas S Richardson

Research output: Chapter in Book/Report/Conference proceeding › Conference Contribution (Conference Proceeding)

4 Citations (Scopus)


Fixed-wing unmanned aerial vehicles (UAVs) using conventional flight controllers are limited in their manoeuvrability, as agile flight manoeuvres require the exploitation of the nonlinear post-stall flight regime. This paper extends previous work, which used a Deep Q-Network (DQN) to generate open-loop trajectories for a perching manoeuvre, by proposing a closed-loop DQN controller in which live inference of the actuator actions is performed on a small unmanned aircraft during flight. DQN models are trained and evaluated in simulation before being deployed on the vehicle. The training process uses a numerical flight dynamics model of the aircraft, combined with a baseline DQN implementation, to generate a series of trained models. Models trained with both fixed and varying initial conditions for each learning episode were evaluated to identify which approach is best for generating robust controllers. Real-world closed-loop control is demonstrated through a series of flight tests in varying wind conditions. The closed-loop controller was shown to outperform the open-loop mode, achieving a greater mean reward when performing the manoeuvre. Further, it is found that incorporating wind into the training process, as well as real-world effects such as noise and latency, is essential for further development.
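The distinction between open-loop and closed-loop use of a Q-function can be illustrated with a minimal sketch. It is not the authors' controller or flight dynamics model: the 1-D dynamics, the constant `wind` disturbance, the action set, and the hand-written `q_value` function standing in for a trained DQN are all hypothetical, chosen only to show why re-evaluating the greedy action on the observed state can earn more reward than replaying a pre-computed action sequence.

```python
# Hypothetical toy problem: the state is the distance from a target,
# and the actions are three discrete actuator commands.
ACTIONS = (-1.0, 0.0, 1.0)


def step(state, action, wind=0.0):
    """Toy 1-D dynamics with a constant disturbance; reward is negative distance."""
    next_state = state + action + wind
    return next_state, -abs(next_state)


def q_value(state, action):
    """Stand-in for a trained Q-network (assumption): scores an action by the
    distance to the target predicted by the nominal, wind-free model."""
    return -abs(state + action)


def greedy_action(state):
    """Greedy policy: pick the action with the highest Q-value."""
    return max(ACTIONS, key=lambda a: q_value(state, a))


def plan_open_loop(start, horizon):
    """Roll the greedy policy through the nominal model and record the actions."""
    state, plan = start, []
    for _ in range(horizon):
        action = greedy_action(state)
        plan.append(action)
        state, _ = step(state, action)
    return plan


def run_open_loop(start, plan, wind):
    """Replay a fixed action sequence, ignoring the states actually reached."""
    state, total = start, 0.0
    for action in plan:
        state, reward = step(state, action, wind)
        total += reward
    return total


def run_closed_loop(start, horizon, wind):
    """Re-select the greedy action from the observed state at every step."""
    state, total = start, 0.0
    for _ in range(horizon):
        state, reward = step(state, greedy_action(state), wind)
        total += reward
    return total


if __name__ == "__main__":
    plan = plan_open_loop(start=5.0, horizon=10)
    print("open-loop return:  ", run_open_loop(5.0, plan, wind=0.5))
    print("closed-loop return:", run_closed_loop(5.0, 10, wind=0.5))
```

In this toy setting the two modes give the same return when there is no disturbance, and the closed-loop mode gives the higher return once the unmodelled `wind` term is non-zero, because it keeps correcting for the drift that the pre-computed plan cannot see.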
Original language: English
Title of host publication: AIAA Scitech 2020 Forum
Publisher: American Institute of Aeronautics and Astronautics
Publication status: Published - 5 Jan 2020
Event: AIAA SciTech Forum 2020 - Hyatt Regency Orlando, Orlando, United States
Duration: 6 Jan 2020 → 10 Jan 2020


Conference: AIAA SciTech Forum 2020
Country/Territory: United States


  • HPC (High Performance Computing) Facility

    Susan L Pywell (Manager), Simon A Burbidge (Other), Polly E Eccleston (Other) & Simon H Atack (Other)

    Facility/equipment: Facility
