Abstract
This paper explores mechanisms for efficiently combining annotations of different quality in multiclass classification datasets, as we argue that large collections of weak labels are easier to obtain than true labels. Because the labels come from different sources, their annotations may differ in reliability (e.g., noisy labels, supersets of labels, complementary labels, or annotations performed by domain experts), and we must ensure that adding potentially inaccurate labels does not degrade the performance achieved with true labels alone. For this reason, we treat each group of annotations as weakly supervised and pose the problem as finding the optimal combination of such collections. We propose an efficient algorithm based on expectation-maximization and demonstrate its performance on both synthetic and real-world classification tasks in a variety of weak-label scenarios.
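To make the expectation-maximization idea concrete, here is a minimal sketch of EM applied to weak labels. It is not the algorithm proposed in the paper. It assumes a toy setting in which each weak label is generated from the hidden true class through a known mixing matrix, and it estimates only the true-class priors. The function name `em_class_priors`, the matrix `mixing`, and the example counts are all hypothetical and introduced purely for illustration.

```python
def em_class_priors(weak_labels, mixing, n_iter=500, tol=1e-12):
    """Estimate true-class priors from weak labels with EM (illustrative only).

    weak_labels: list of ints, each indexing a row of `mixing`.
    mixing: list of rows, mixing[z][y] = P(weak label z | true class y).
            Assumed known and strictly positive in this sketch.
    Returns (priors, posteriors) where posteriors[i][y] is the estimate of
    P(true class y | weak label of sample i) from the last E-step.
    """
    n_classes = len(mixing[0])
    priors = [1.0 / n_classes] * n_classes  # start from a uniform guess
    posteriors = []
    for _ in range(n_iter):
        # E-step: posterior over the hidden true class of every sample.
        posteriors = []
        for z in weak_labels:
            joint = [mixing[z][y] * priors[y] for y in range(n_classes)]
            total = sum(joint)
            posteriors.append([j / total for j in joint])
        # M-step: the new priors are the average posterior per class.
        new_priors = [
            sum(q[y] for q in posteriors) / len(posteriors)
            for y in range(n_classes)
        ]
        converged = max(abs(a - b) for a, b in zip(new_priors, priors)) < tol
        priors = new_priors
        if converged:
            break
    return priors, posteriors


# Hypothetical example: symmetric label noise over three classes.
mixing = [
    [0.8, 0.1, 0.1],
    [0.1, 0.8, 0.1],
    [0.1, 0.1, 0.8],
]
# Weak-label counts chosen to match true priors (0.5, 0.3, 0.2) under `mixing`.
weak_labels = [0] * 45 + [1] * 31 + [2] * 24
priors, posteriors = em_class_priors(weak_labels, mixing)
print([round(p, 3) for p in priors])
```

Because the example counts were constructed from priors of (0.5, 0.3, 0.2), the estimate should land close to those values. Other weak-label types mentioned in the abstract, such as supersets or complementary labels, would correspond to different choices of mixing matrix in this toy formulation.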
Original language | English |
---|---|
Number of pages | 20 |
Journal | Neurocomputing |
DOIs | |
Publication status | Published - 10 Mar 2020 |
Keywords
- classification
- weak label
- loss function
- cost-sensitive learning
Fingerprint
Dive into the research topics of 'Recycling Weak Labels for Multiclass Classification'. Together they form a unique fingerprint.
Student theses
- Uncertainty aware classification: augmenting classifiers to handle uncertainty
  Perello Nieto, M. (Author), Flach, P. (Supervisor) & Santos-Rodriguez, R. (Supervisor), 9 May 2023
  Student thesis: Doctoral Thesis › Doctor of Philosophy (PhD)
Equipment
- HPC (High Performance Computing) and HTC (High Throughput Computing) Facilities
  Alam, S. R. (Manager), Eccleston, P. E. (Other), Williams, D. A. G. (Manager) & Atack, S. H. (Other)
  Facility/equipment: Facility