Integrating scientific knowledge into machine learning using interactive decision trees

Research output: Contribution to journal › Article (Academic Journal) › peer-review

Abstract

Decision trees (DT) are a type of machine learning method widely used in the geosciences to automatically extract patterns from complex, high-dimensional data. However, like any data-based method, DT are hindered by data limitations, such as significant biases, which can lead to physically unrealistic results. We develop interactive DT (iDT) that put humans in the loop, combining experts' scientific knowledge with the algorithms' ability to learn patterns automatically from large datasets. We created an open-source Python toolbox that implements the iDT framework. Users can interactively create new composite variables, change the variable and threshold used for a split, prune the tree, and group variables based on their physical meaning. We demonstrate with three case studies how iDT overcomes problems with current DT, achieving more interpretable and robust results.
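The two interactions named in the abstract, creating a composite variable and changing the variable and threshold of a split, can be illustrated with a minimal sketch. This is not the iDT toolbox's API: the functions `gini`, `split_impurity` and `best_split`, the toy data, and the aridity-index variable are all hypothetical, chosen only to show how an expert-chosen split can be scored against the one an algorithm would pick.

```python
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels (0.0 for an empty or pure list)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def split_impurity(rows, labels, var, threshold):
    """Weighted Gini impurity after splitting on rows[i][var] <= threshold."""
    left = [y for x, y in zip(rows, labels) if x[var] <= threshold]
    right = [y for x, y in zip(rows, labels) if x[var] > threshold]
    return (len(left) * gini(left) + len(right) * gini(right)) / len(labels)

def best_split(rows, labels):
    """Exhaustive search for the (variable, threshold) pair with lowest impurity."""
    candidates = [
        (split_impurity(rows, labels, var, x[var]), var, x[var])
        for x in rows
        for var in x
    ]
    score, var, threshold = min(candidates)
    return var, threshold, score

# Hypothetical toy data: precipitation P and potential evaporation PET (mm/yr).
rows = [
    {"P": 400.0, "PET": 1200.0},
    {"P": 600.0, "PET": 1000.0},
    {"P": 900.0, "PET": 800.0},
    {"P": 1200.0, "PET": 600.0},
]
labels = ["dry", "dry", "wet", "wet"]

# Expert step 1: add a composite variable with physical meaning (assumed here
# to be an aridity index, PET / P).
for x in rows:
    x["aridity"] = x["PET"] / x["P"]

# What a purely data-driven search would choose.
auto_var, auto_thr, auto_score = best_split(rows, labels)

# Expert step 2: override the split with a physically motivated variable and
# threshold, then check how much impurity that choice costs.
expert_var, expert_thr = "aridity", 1.0
expert_score = split_impurity(rows, labels, expert_var, expert_thr)

print("algorithm:", auto_var, auto_thr, auto_score)
print("expert:   ", expert_var, expert_thr, expert_score)
```

On this toy dataset the expert's split separates the classes as cleanly as the algorithm's, so the physically meaningful choice costs nothing in fit. On real data the two scores would generally differ, and comparing them is one way to judge the trade-off.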
Original language: English
Article number: 105248
Journal: COMPUTERS & GEOSCIENCES
Volume: 170
Early online date: 15 Oct 2022
Publication status: E-pub ahead of print - 15 Oct 2022

Bibliographical note

Funding Information:
This work was supported by the Engineering and Physical Sciences Research Council in the UK via grant [grant number EP/L016214/1] awarded for the Water Informatics: Science and Engineering (WISE) Centre for Doctoral Training, which is gratefully acknowledged. Francesca Pianosi is partially supported by the Engineering and Physical Sciences Research Council through an Early Career “Living with Environmental Uncertainty” Fellowship [grant number EP/R007330/1]. Support for Thorsten Wagener was provided by the Alexander von Humboldt Foundation in the framework of the Alexander von Humboldt Professorship endowed by the German Federal Ministry of Education and Research.

Publisher Copyright:
© 2022 The Authors

Research Groups and Themes

  • Water and Environmental Engineering
