Abstract
Mixture proportion estimation is a building block in many weakly supervised classification tasks (missing labels, label noise, anomaly detection). Estimators with finite-sample guarantees help analyse algorithms for such tasks, but so far only exist for Euclidean and Hilbert space data. We generalise the framework of Blanchard, Lee and Scott to allow extensions to other data types, and exemplify its use by deducing novel estimators for metric space data, and for randomly compressed Euclidean data – both of which make use of favourable geometry to tighten guarantees. Finally, we demonstrate a theoretical link with the state-of-the-art estimator specialised for Hilbert space data.
Original language | English |
---|---|
Pages (from-to) | 682-699 |
Number of pages | 18 |
Journal | Proceedings of Machine Learning Research |
Volume | 98 |
Publication status | Published - 24 Mar 2019 |
Event | International Conference on Algorithmic Learning Theory, Chicago, United States. Duration: 22 Mar 2019 → 24 Mar 2019. Conference number: 30. http://algorithmiclearningtheory.org/alt2019/ |
Keywords
- Mixture proportion estimation
- metric spaces
- covering dimension
- random projections
- Gaussian width