  • Research Article
  • Open Access

MASSP3: A System for Predicting Protein Secondary Structure

EURASIP Journal on Advances in Signal Processing 2006, 2006:017195

  • Received: 15 May 2005
  • Accepted: 1 December 2005
  • Published:


This paper describes a system that uses multiple experts to predict protein secondary structure, with performance comparable to that of other state-of-the-art predictors. Processing consists of two main steps: first, a "sequence-to-structure" prediction is performed by a population of hybrid genetic-neural experts; then, a "structure-to-structure" prediction is performed by a feedforward artificial neural network. To assess the approach, the system was tested on the RS126 set of proteins. The experimental results (about 76% accuracy) support the validity of the approach.
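The two-step flow described in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration only: the three-state alphabet `HEC`, the averaging of expert scores, and the function names are hypothetical, and a sliding-window majority vote stands in for the feedforward network of the second step. It shows the data flow, not the authors' implementation.

```python
from collections import Counter

CLASSES = "HEC"  # assumed three-state alphabet: helix, strand, coil


def sequence_to_structure(expert_scores):
    """Step 1 stand-in: average the experts' per-residue class scores, take the argmax."""
    n_experts = len(expert_scores)
    labels = []
    for position in zip(*expert_scores):  # one tuple of expert score vectors per residue
        mean = [sum(scores[k] for scores in position) / n_experts
                for k in range(len(CLASSES))]
        labels.append(CLASSES[mean.index(max(mean))])
    return "".join(labels)


def structure_to_structure(labels, window=3):
    """Step 2 stand-in: sliding-window majority vote instead of a trained network."""
    half = window // 2
    refined = []
    for i, current in enumerate(labels):
        counts = Counter(labels[max(0, i - half): i + half + 1])
        winner, votes = counts.most_common(1)[0]
        # Keep the current label unless another class strictly outvotes it.
        refined.append(winner if votes > counts[current] else current)
    return "".join(refined)


def per_residue_accuracy(predicted, observed):
    """Fraction of residues whose predicted class matches the observed one."""
    matches = sum(p == o for p, o in zip(predicted, observed))
    return matches / len(observed)
```

In this sketch, an isolated strand label surrounded by helix labels is relabelled by the second step, which is the kind of local correction a structure-to-structure stage is meant to provide. The accuracy function is a plain per-residue match rate; whether the reported figure was computed in exactly this way is not stated in the abstract.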


  • Neural Network
  • Information Technology
  • Secondary Structure
  • Artificial Neural Network
  • Quantum Information

Authors’ Affiliations

Department of Electrical and Electronic Engineering, University of Cagliari, Piazza d'Armi, Cagliari, 09123, Italy



© Armano et al. 2006