MASSP3: A System for Predicting Protein Secondary Structure

Abstract

This paper describes a system that uses multiple experts to predict protein secondary structure, with performance comparable to that of other state-of-the-art predictors. Processing takes two main steps: first, a population of hybrid genetic-neural experts performs a "sequence-to-structure" prediction; then a feedforward artificial neural network performs a "structure-to-structure" prediction. To assess the proposed approach, the system was tested on the RS126 set of proteins. The experimental results (about 76% accuracy) support the validity of the approach.
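The two-step flow described in the abstract can be illustrated with a minimal sketch. It is not the MASSP3 implementation: the three-state alphabet (H, E, C), the averaging of expert outputs, the window-based stand-in for the trained feedforward network, and every name in the code (`combine_experts`, `refine`, `predict`, `toy_expert`) are assumptions made only for illustration.

```python
from typing import Callable, List, Sequence

CLASSES = "HEC"  # helix, strand, coil: three-state alphabet (assumed)

Profile = List[List[float]]  # one [score(H), score(E), score(C)] row per residue
Expert = Callable[[str], Profile]


def combine_experts(sequence: str, experts: Sequence[Expert]) -> Profile:
    """Step 1 (sequence-to-structure): average the experts' per-residue scores.

    Plain averaging is an assumption; the abstract does not say how the
    outputs of the expert population are combined.
    """
    outputs = [expert(sequence) for expert in experts]
    n = len(outputs)
    return [
        [sum(out[i][k] for out in outputs) / n for k in range(len(CLASSES))]
        for i in range(len(sequence))
    ]


def refine(profile: Profile, half_window: int = 1) -> str:
    """Step 2 (structure-to-structure): relabel each residue from its neighbours.

    Summing scores over a small window is a stand-in for the trained
    feedforward network, used here only to keep the sketch self-contained.
    """
    labels = []
    for i in range(len(profile)):
        lo = max(0, i - half_window)
        hi = min(len(profile), i + half_window + 1)
        window = profile[lo:hi]
        scores = [sum(row[k] for row in window) for k in range(len(CLASSES))]
        labels.append(CLASSES[scores.index(max(scores))])
    return "".join(labels)


def predict(sequence: str, experts: Sequence[Expert], half_window: int = 1) -> str:
    """Run both steps in order and return one label per residue."""
    return refine(combine_experts(sequence, experts), half_window)


def toy_expert(sequence: str) -> Profile:
    """Hypothetical expert: votes helix for alanine, coil for anything else."""
    return [[0.8, 0.1, 0.1] if aa == "A" else [0.1, 0.1, 0.8] for aa in sequence]


if __name__ == "__main__":
    print(predict("AAAGAAA", [toy_expert]))
```

With the toy expert, the first step labels the single glycine as coil, and the second step absorbs that isolated call into the surrounding helix. This shows the kind of correction a structure-to-structure stage is meant to make, under the assumptions stated above.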

Author information

Correspondence to Giuliano Armano.

About this article

Cite this article

Armano, G., Orro, A. & Vargiu, E. MASSP3: A System for Predicting Protein Secondary Structure. EURASIP J. Adv. Signal Process. 2006, 017195 (2006) doi:10.1155/ASP/2006/17195

Keywords

  • Neural Network
  • Information Technology
  • Secondary Structure
  • Artificial Neural Network
  • Quantum Information