A feature selection method for classification within functional genomics experiments based on the proportional overlapping score

Osama Mahmoud*, Andrew Harrison, Aris Perperoglou, Asma Gul, Zardad Khan, Metodi V. Metodiev, Berthold Lausen

*Corresponding author for this work

Research output: Contribution to journal › Article (Academic Journal)

23 Citations (Scopus)
262 Downloads (Pure)

Abstract

Background

Microarray technology, as well as other functional genomics experiments, allows simultaneous measurement of thousands of genes within each sample. Both the prediction accuracy and the interpretability of a classifier could be enhanced by performing the classification based only on selected discriminative genes. We propose a statistical method for selecting genes based on overlapping analysis of expression data across classes. This method results in a novel measure, called the proportional overlapping score (POS), of a feature's relevance to a classification task.

Results

We apply POS, along with four widely used gene selection methods, to several benchmark gene expression datasets. The classification error rates computed using the Random Forest, k Nearest Neighbor and Support Vector Machine classifiers show that POS achieves better performance than the compared methods.

Conclusions

A novel gene selection method, POS, is proposed. POS analyzes the overlap of expression values across classes, taking into account the proportions of overlapping samples. It robustly defines a mask for each gene, which allows it to minimize the effect of expression outliers. The constructed masks, along with a novel gene score, are exploited to produce the selected subset of genes.
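The idea of an outlier-robust overlap analysis can be illustrated with a small sketch. This is a simplified illustration and not the POS definition from the paper: the outlier rule (Tukey's fences on the quartiles), the function names `core_interval` and `overlap_summary`, and the toy expression values are all assumptions made for the example. For one gene and two classes, it builds a robust interval per class, takes their intersection as the overlap region, marks each sample as unambiguous (1) or not (0), and reports the fraction of samples falling in the overlap region.

```python
from statistics import quantiles


def core_interval(values):
    """Range of the values inside Tukey's fences (an assumed outlier rule)."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    spread = 1.5 * (q3 - q1)
    kept = [v for v in values if q1 - spread <= v <= q3 + spread]
    return min(kept), max(kept)


def overlap_summary(class_a, class_b):
    """Return per-class masks and the fraction of samples in the overlap region."""
    lo_a, hi_a = core_interval(class_a)
    lo_b, hi_b = core_interval(class_b)
    # Intersection of the two class intervals; it is empty when lo > hi.
    lo, hi = max(lo_a, lo_b), min(hi_a, hi_b)

    def in_overlap(v):
        return lo <= v <= hi

    def mask(values, low, high):
        # 1: inside its own class interval and outside the overlap region.
        return [int(low <= v <= high and not in_overlap(v)) for v in values]

    n_overlap = sum(in_overlap(v) for v in list(class_a) + list(class_b))
    fraction = n_overlap / (len(class_a) + len(class_b))
    return mask(class_a, lo_a, hi_a), mask(class_b, lo_b, hi_b), fraction


# Hypothetical expression values of one gene in two classes.
gene_a = [1.0, 1.2, 1.4, 1.6, 1.8, 9.0]  # 9.0 plays the role of an outlier
gene_b = [1.7, 2.0, 2.2, 2.4, 2.6, 2.8]
mask_a, mask_b, fraction = overlap_summary(gene_a, gene_b)
```

In this toy data the value 9.0 is excluded when the interval of the first class is built, so it does not stretch that interval across the whole range of the second class. Only the two samples near the class boundary end up in the overlap region.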

Original language: English
Article number: 274
Number of pages: 20
Journal: BMC Bioinformatics
Volume: 15
Issue number: 1
Publication status: Published - 11 Aug 2014

Keywords

  • Feature selection
  • Gene mask
  • Gene ranking
  • Microarray classification
  • Minimum subset of genes
  • Proportional overlap score
