Data extraction methods for systematic review (semi)automation: A living review protocol

Lena Schmidt, Kazeem Olorisade, Luke A McGuinness, Julian P T Higgins

Research output: Contribution to journal, Article (Academic Journal), peer-review


Background: Researchers in evidence-based medicine cannot keep up with the volume of primary research articles, both previously and newly published. Conducting and updating systematic reviews is time-consuming, and in practice data extraction is one of the most complex tasks in the process. Exponential improvements in computational processing speed and data storage are fostering the development of data extraction models and algorithms. Combined with quicker pathways to publication, this has led to a large landscape of tools and methods for data extraction tasks.

Objective: To review published methods and tools for data extraction to (semi)automate the systematic reviewing process.

Methods: We propose to conduct a living review. Using this methodology, we aim to run monthly search updates, along with bi-annual review updates if new evidence permits. In a cross-sectional analysis, we will extract methodological characteristics and assess the quality of reporting in the included papers.

Conclusions: We aim to increase transparency in the reporting and assessment of machine learning technologies to the benefit of data scientists, systematic reviewers and funders of health research. This living review will help to reduce duplicate efforts by data scientists who develop data extraction methods. It will also serve to inform systematic reviewers about possibilities to support their data extraction.
Original language: English
Article number: 210
Number of pages: 8
Early online date: 25 Mar 2020
Publication status: E-pub ahead of print - 25 Mar 2020


  • data extraction
  • natural language processing
  • reproducibility
  • systematic reviews
  • text mining

