TY - JOUR
T1 - Ensemble of a subset of kNN classifiers
AU - Gul, Asma
AU - Perperoglou, Aris
AU - Khan, Zardad
AU - Mahmoud, Osama
AU - Miftahuddin, Miftahuddin
AU - Adler, Werner
AU - Lausen, Berthold
PY - 2018/12
Y1 - 2018/12
N2 - Combining multiple classifiers, known as ensemble methods, can give substantial improvements in the prediction performance of learning algorithms, especially in the presence of non-informative features in the data sets. We propose ESkNN, an ensemble of a subset of kNN classifiers, for classification tasks, constructed in two steps. First, we choose classifiers based on their individual out-of-sample accuracy. The selected classifiers are then combined sequentially, starting from the best model, and assessed for collective performance on a validation data set. We evaluate the method on benchmark data sets with their original features and with added non-informative features. The results are compared with the usual kNN, bagged kNN, random kNN, the multiple feature subset method, random forests and support vector machines. Our experimental comparisons on benchmark classification problems and simulated data sets reveal that the proposed ensemble gives better classification performance than the usual kNN and its ensembles, and performs comparably to random forests and support vector machines.
KW - Bagging
KW - Ensemble methods
KW - Nearest neighbour classifier
KW - Non-informative features
UR - http://www.scopus.com/inward/record.url?scp=84955318531&partnerID=8YFLogxK
DO - 10.1007/s11634-015-0227-5
M3 - Article
C2 - 30931011
SN - 1862-5347
VL - 12
SP - 827
EP - 840
JO - Advances in Data Analysis and Classification
JF - Advances in Data Analysis and Classification
IS - 4
ER -