Beyond temperature scaling: Obtaining well-calibrated multiclass probabilities with Dirichlet calibration

Meelis Kull, Miquel Perello Nieto, Markus Kängsepp, Telmo Silva Filho, Hao Song, Peter Flach

Research output: Chapter in Book/Report/Conference proceeding › Conference Contribution (Conference Proceeding)


Class probabilities predicted by most multiclass classifiers are uncalibrated, often tending towards over-confidence. With neural networks, calibration can be improved by temperature scaling, a method to learn a single corrective multiplicative factor for inputs to the last softmax layer. On non-neural models the existing methods apply binary calibration in a pairwise or one-vs-rest fashion. We propose a natively multiclass calibration method applicable to classifiers from any model class, derived from Dirichlet distributions and generalising the beta calibration method from binary classification. It is easily implemented with neural nets since it is equivalent to log-transforming the uncalibrated probabilities, followed by one linear layer and softmax. Experiments demonstrate improved probabilistic predictions according to multiple measures (confidence-ECE, classwise-ECE, log-loss, Brier score) across a wide range of datasets and classifiers. Parameters of the learned Dirichlet calibration map provide insights to the biases in the uncalibrated model.
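The map described in the abstract (log-transform the uncalibrated probabilities, apply one linear layer, then softmax) can be sketched in a few lines of NumPy. This is a minimal illustration rather than the authors' implementation: the function name `dirichlet_calibration_map`, the parameter names `W` and `b`, the clipping constant `eps`, and the example logits are all assumptions introduced here. In practice `W` and `b` would be learned on held-out data, which the sketch does not cover. It only checks two properties that follow from the formula: identity parameters leave the probabilities unchanged, and `W = I / T` with zero bias gives the same result as temperature scaling with temperature `T`.

```python
import numpy as np


def softmax(z):
    """Row-wise softmax, shifted by the row maximum for numerical stability."""
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)


def dirichlet_calibration_map(probs, W, b, eps=1e-12):
    """Apply softmax(W @ log(p) + b) to each row p of `probs`.

    `W` (k x k) and `b` (k,) are hypothetical names for the linear layer's
    weights and bias; `eps` guards against log(0) and is an assumption.
    """
    log_p = np.log(np.clip(probs, eps, 1.0))
    return softmax(log_p @ W.T + b)


# Illustrative logits for two instances and three classes (made-up values).
logits = np.array([[4.0, 1.0, 0.5],
                   [0.2, 3.0, 0.1]])
probs = softmax(logits)
k = probs.shape[1]

# Identity weights and zero bias return the input probabilities unchanged.
identity = dirichlet_calibration_map(probs, np.eye(k), np.zeros(k))

# W = I / T with zero bias matches temperature scaling with temperature T.
T = 2.0
tempered = dirichlet_calibration_map(probs, np.eye(k) / T, np.zeros(k))
```

Temperature scaling appears here as the special case with a single shared parameter. A full `W` and `b` give the map one weight per pair of classes plus a per-class bias.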
Original language: English
Title of host publication: Advances in Neural Information Processing Systems 32 (NIPS 2019)
Editors: Hannah Wallach, Hugo Larochelle, Alina Beygelzimer, Florence d'Alché-Buc, Emily Fox, Roman Garnett
Publisher: Neural Information Processing Systems Foundation
Number of pages: 12
Publication status: Published - 14 Dec 2019
Event: Neural Information Processing Systems - Vancouver, Canada
Duration: 8 Dec 2019 to 14 Dec 2019
Conference number: 33


Conference: Neural Information Processing Systems
Abbreviated title: NIPS 2019

Bibliographical note

provisional published date added, based on conference information, as no publication date given in online proceedings

Structured keywords

  • Digital Health
  • Jean Golding

