Protein structural class determination using support vector machines

IŞIK Z., Yanikoglu B., Sezerman U.

COMPUTER AND INFORMATION SCIENCES - ISCIS 2004, PROCEEDINGS, vol.3280, pp.82-89, 2004 (SCI-Expanded)


Proteins can be classified into four structural classes (all-alpha, all-beta, alpha/beta, alpha+beta) according to their secondary structure composition. In this paper, we predict the structural class of a protein from its Amino Acid Composition (AAC) using Support Vector Machines (SVM). The AAC represents a protein as a 20-dimensional vector. In addition to the AAC, we use a second feature set, the Trio Amino Acid Composition (Trio AAC), which takes amino acid neighborhood information into account. We evaluate both feature sets, in each case using an SVM as the classifier, for predicting the structural class of a protein. According to the Jackknife test results, the Trio AAC feature set gives better classification performance than the AAC feature set.
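
The two feature sets described in the abstract can be illustrated with a minimal Python sketch. The AAC part follows the abstract directly: one relative frequency per standard amino acid, giving a 20-dimensional vector. The Trio AAC part is an assumption: the abstract only says the feature captures neighborhood information, so the sketch takes it to be the relative frequency of each run of three consecutive residues (20^3 = 8000 dimensions), which may differ from the definition used in the paper. The function names `aac` and `trio_aac` and the handling of non-standard residues are illustrative choices, not taken from the paper.

```python
from collections import Counter
from itertools import product

# The 20 standard amino acids in one-letter code (fixed order = vector layout).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# All 8000 ordered triplets and their position in the Trio AAC vector.
# ASSUMPTION: "trio" is read here as three consecutive residues.
TRIOS = ["".join(t) for t in product(AMINO_ACIDS, repeat=3)]
TRIO_INDEX = {trio: i for i, trio in enumerate(TRIOS)}


def aac(sequence):
    """Return the 20-dimensional amino acid composition of a sequence.

    Each entry is the fraction of standard residues equal to that amino acid.
    Non-standard symbols (e.g. 'X') are ignored.
    """
    residues = [r for r in sequence.upper() if r in AMINO_ACIDS]
    if not residues:
        return [0.0] * len(AMINO_ACIDS)
    counts = Counter(residues)
    return [counts[a] / len(residues) for a in AMINO_ACIDS]


def trio_aac(sequence):
    """Return an 8000-dimensional vector of consecutive-triplet frequencies.

    Windows containing a non-standard symbol are skipped, so the entries
    are normalised by the number of windows actually counted.
    """
    seq = sequence.upper()
    vector = [0.0] * len(TRIOS)
    counted = 0
    for i in range(len(seq) - 2):
        index = TRIO_INDEX.get(seq[i:i + 3])
        if index is not None:
            vector[index] += 1.0
            counted += 1
    if counted == 0:
        return vector
    return [v / counted for v in vector]


if __name__ == "__main__":
    toy = "ACDAC"  # toy sequence, not real protein data
    print(len(aac(toy)), len(trio_aac(toy)))
```

Either vector could then be passed to any SVM implementation as the input features. The Jackknife test named in the abstract is commonly understood as leave-one-out evaluation, in which each protein is held out once and predicted by a classifier trained on all the others.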