Analysis of ‘One in a Million’ primary care consultation conversations using natural language processing

Yvette V Pyne, Yik Ming Wong, Haishuo Fang, Edwin D. Simpson

Research output: Contribution to journal, Article (Academic Journal), peer-reviewed

Abstract

Background

Modern patient electronic health records form a core part of primary care; they contain both clinical codes and free text entered by the clinician. Natural language processing (NLP) could be employed to generate these records through ‘listening’ to a consultation conversation.

Objectives

This study develops and assesses several text classifiers for identifying clinical codes for primary care consultations based on the doctor–patient conversation. We evaluate the possibility of training classifiers using medical code descriptions, and the benefits of processing transcribed speech from patients as well as doctors. The study also highlights steps for improving future classifiers.

Methods

Using verbatim transcripts of 239 primary care consultation conversations (the ‘One in a Million’ dataset) and novel additional datasets for distant supervision, we trained NLP classifiers (naïve Bayes, support vector machine, nearest centroid, a conventional BERT classifier and few-shot BERT approaches) to identify the International Classification of Primary Care-2 clinical codes associated with each consultation.
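As an illustration of the simplest classifier family listed above, the following is a minimal sketch of a nearest centroid classifier over bag-of-words counts, trained on short code-style descriptions rather than on labelled transcripts. It is not the study's implementation: the three labels, the description texts, the whitespace tokeniser and the cosine similarity measure are all assumptions made for this example, and it predicts a single label per transcript even though a consultation may carry several codes.

```python
import math
from collections import Counter


def bag_of_words(text):
    """Lower-case whitespace tokenisation into term counts."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def fit_centroids(examples):
    """Sum term counts per label; cosine similarity ignores the scale."""
    centroids = {}
    for text, label in examples:
        centroids.setdefault(label, Counter()).update(bag_of_words(text))
    return centroids


def predict(centroids, text):
    """Return the label whose centroid is most similar to the text."""
    vec = bag_of_words(text)
    return max(centroids, key=lambda label: cosine(vec, centroids[label]))


# Hypothetical description-style training texts (invented for illustration,
# not taken from ICPC-2 or from the study's datasets).
descriptions = [
    ("cough wheeze shortness of breath chest infection", "respiratory"),
    ("back pain knee pain joint stiffness sprain", "musculoskeletal"),
    ("rash itch eczema skin lesion", "skin"),
]
centroids = fit_centroids(descriptions)

# Hypothetical patient utterance standing in for a transcript.
print(predict(centroids, "i have had a cough and a tight chest for a week"))
```

A sketch like this only shows the mechanics of matching conversational text against description text; the fine-tuned BERT classifier reported in the Results is a different and considerably stronger approach.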

Results

Of all models tested, a fine-tuned BERT classifier performed best. Distant supervision improved its F1 score (over 16 classes) from 0.45, obtained with conventional supervision on 191 labelled transcripts, to 0.51. Incorporating patients’ speech in addition to clinicians’ speech increased the BERT classifier’s F1 score from 0.45 to 0.55 (p=0.01, paired bootstrap test).
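To make the reported comparison concrete, the following is a minimal sketch of how an F1 score over several classes and a paired bootstrap test between two classifiers might be computed. The abstract does not state which F1 averaging or which bootstrap variant was used, so macro averaging, single-label predictions, the one-sided resampling scheme and the function names `macro_f1` and `paired_bootstrap_p` are all assumptions made for this example.

```python
import random


def macro_f1(gold, pred):
    """Unweighted mean of per-class F1 over the classes present in gold."""
    scores = []
    for c in set(gold):
        tp = sum(1 for g, p in zip(gold, pred) if g == c and p == c)
        fp = sum(1 for g, p in zip(gold, pred) if g != c and p == c)
        fn = sum(1 for g, p in zip(gold, pred) if g == c and p != c)
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


def paired_bootstrap_p(gold, pred_a, pred_b, n_resamples=1000, seed=0):
    """Fraction of resamples in which system B fails to beat system A.

    The same resampled item indices are used for both systems, which is
    what makes the test paired.
    """
    rng = random.Random(seed)
    n = len(gold)
    not_better = 0
    for _ in range(n_resamples):
        idx = [rng.randrange(n) for _ in range(n)]
        g = [gold[i] for i in idx]
        a = [pred_a[i] for i in idx]
        b = [pred_b[i] for i in idx]
        if macro_f1(g, b) <= macro_f1(g, a):
            not_better += 1
    return not_better / n_resamples
```

Here `gold` would hold the reference label per test consultation, `pred_a` the predictions from the baseline (for example, clinician speech only) and `pred_b` the predictions from the system being compared (for example, clinician and patient speech). A small returned value suggests that the improvement of B over A is unlikely to be an artefact of the particular test items.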

Conclusions

Our findings demonstrate that NLP classifiers can be trained to identify clinical area(s) being discussed in a primary care consultation from audio transcriptions; this could represent an important step towards a smart digital assistant in the consultation room.
Original language: English
Journal: BMJ Health & Care Informatics
Publication status: Published, 28 Apr 2023
