Density Ratio Estimation and Neyman Pearson Classification with Missing Data

Josh Givens, Song Liu, Henry W J Reeve

Research output: Chapter in Book/Report/Conference proceeding › Conference Contribution (Conference Proceeding)

Abstract

Density Ratio Estimation (DRE) is an important machine learning technique with many downstream applications. We consider the challenge of DRE with missing not at random (MNAR) data. In this setting, we show that using standard DRE methods leads to biased results, while our proposal (M-KLIEP), an adaptation of the popular DRE procedure KLIEP, restores consistency. Moreover, we provide finite sample estimation error bounds for M-KLIEP, which demonstrate minimax optimality with respect to both sample size and worst-case missingness. We then adapt an important downstream application of DRE, Neyman-Pearson (NP) classification, to this MNAR setting. Our procedure both controls Type I error and achieves high power, with high probability. Finally, we demonstrate promising empirical performance on both synthetic data and real-world data with simulated missingness.
Original language: English
Title of host publication: Proceedings of The 26th International Conference on Artificial Intelligence and Statistics
Publisher: Proceedings of Machine Learning Research
Pages: 8645-8681
Number of pages: 37
Volume: 206
Publication status: Published - 25 Apr 2023

Publication series

Name: Proceedings of Machine Learning Research
ISSN (Electronic): 2640-3498

Bibliographical note

40 pages, 11 figures. To be published in the proceedings of AISTATS 2023

Keywords

  • stat.ML
  • cs.LG
  • stat.ME
