We investigate whether interactions of longer range than those typically considered by local protein secondary structure prediction methods can be captured in a simple machine learning framework to improve the prediction of beta-sheets. Using support vector machines and recursive feature elimination, we show that the weak signals available in long-range interactions can indeed be exploited. The improvement is small but statistically significant on the benchmark datasets we used. We also show that feature selection within a long window, over amino acids at specific positions, typically selects amino acids that have been shown to be more relevant to the initiation and termination of beta-sheet formation.
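To make "amino acids at specific positions" within a window concrete, the following is a minimal sketch of one common way such features can be laid out. It is an illustration only: the one-hot encoding, the zero padding at sequence ends, the window half-width, and the names `window_features` and `feature_name` are all assumptions and are not taken from the paper. Each feature corresponds to one (offset, amino acid) pair, so a feature-elimination procedure run over such vectors would retain or discard individual residue types at individual positions.

```python
# Hypothetical sketch: one-hot window encoding around a residue.
# The encoding, padding rule, and function names are assumptions for
# illustration; the paper's actual feature representation may differ.

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def window_features(sequence, center, half_width):
    """One-hot encode residues at offsets -half_width..+half_width.

    Positions that fall outside the sequence contribute an all-zero block.
    """
    features = []
    for offset in range(-half_width, half_width + 1):
        pos = center + offset
        residue = sequence[pos] if 0 <= pos < len(sequence) else None
        features.extend(1 if residue == aa else 0 for aa in AMINO_ACIDS)
    return features


def feature_name(index, half_width):
    """Map a feature index back to its (offset, amino acid) pair."""
    offset = index // len(AMINO_ACIDS) - half_width
    return offset, AMINO_ACIDS[index % len(AMINO_ACIDS)]


if __name__ == "__main__":
    vec = window_features("MKVLA", center=0, half_width=2)
    active = [feature_name(i, 2) for i, v in enumerate(vec) if v]
    print(len(vec), active)
```

Mapping indices back to (offset, residue) pairs is what allows the surviving features of a selection procedure to be read as statements about which amino acids matter at which distance from the predicted residue.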
|Translated title of the contribution||Exploiting long-range dependencies in protein beta-sheet secondary structure prediction|
|Title of host publication||the Pattern Recognition in Bioinformatics conference, Nijmegen, the Netherlands|
|Pages||337 - 345|
|Publication status||Published - 2010|