HARD PROBLEMS IN GENE SEQUENCE ANALYSIS: CLASSICAL APPROACHES AND SUITABILITY OF GENETIC ALGORITHMS

L. Jantschi1, S.D. Bolboaca3,2 and R.E. Sestras3

(1) Technical University of Cluj-Napoca, Cluj, Romania
(2) “Iuliu Haţieganu” University of Medicine and Pharmacy Cluj-Napoca, Cluj, Romania
(3) University of Agricultural Sciences and Veterinary Medicine Cluj-Napoca, Cluj, Romania

Abstract

Genetic algorithms are based on observations of natural phenomena as well as on the simulation of the artificial selection of organisms with multiple loci controlling a measurable trait. Genetic algorithms evolved into complex and strong informatics tools able to deal with hard problems of decision, classification, optimization, or/and simulation. We aimed to show how genetic algorithms can be used to solve hard problems on gene sequence analysis.

References

  1. Altschul S.F. (1989) J. Theor. Biol., 138(3), 297-309.
  2. Bak P., Sneppen K. (1993) Phys. Rev. Lett., 71(24), 4083-4086.
  3. Baldwin B.G., Sanderson M.J. (1998) Proc. Natl. Acad. Sci. USA, 95(16), 9402-9406.
  4. Banzhaf W., Nordin P., Keller R.E., Francone F.D. (1997) Genetic Programming: An Introduction: On the Automatic Evolution of Computer Programs and Its Applications, Morgan Kaufmann Publishers, San Francisco, p. 450.
  5. Barricelli N.A. (1954) Methodos, 45-68.
  6. Benoit B., Fleurey F., Jézéquel J.-M., Le Traon Y. (2005) IEEE Software, 22(2),76-82.
  7. Bosworth J., Norman F., Zeigler B.P. (1972) Comparison of Genetic Algorithms with Conjugate Gradient Methods, NASA Contractor Reports, CR-2093.
  8. Bouktir T., Slimani L. (2005) Leonardo J. Sci., 4(7), 43-57.
  9. Bremermann H.J, Rogson J., Salaff S. (1966) Global properties of evolution processes. In: Natural Automata and useful Simulations (H.H. Pattee, Ed.), Proceedings of Symposium on Fundamental Biological Models, Stanford University 1965, pp. 3-42.
  10. Brown M.P., Grundy W.N., Lin D., Cristianini N., Sugnet C.W., Furey T.S., Ares M.Jr., Haussler D. (2000) Proc. Natl. Acad. Sci. USA, 97(1), 262-267.
  11. Califano A. (2000) Bioinformatics, 16(4), 341-357.
  12. Carrillo H., Lipman D. (1988) SIAM J. Appl. Math., 48, 1073-1082.
  13. Churchill G., Burks C., Eggert M., Engle M.L., Waterman M.S. (1993) Assembling DNA sequence fragments by shuffling and simulated annealing, Tech. Rep. LA-UR-93-2287, Los Alamos Scientific Laboratory Publication, LA-UR-2287, p. 25.
  14. Corpet F., Michot B. (1994) Comput. Applicat. Biosci., 10(4), 389-399.
  15. Darwin C.R. (1859) On the origin of species by means of natural selection, J. Murray, London, p. 459.
  16. Davis L. (1987) Genetic Algorithms and Simulated Annealing, M. Kaufmann, San Francisco, p. 216.
  17. Davis L. (1991) The Handbook of Genetic Algorithms, VN Reinhold, New York, p. 385.
  18. De Boer P.-T., Kroese D.P., Mannor S., Rubinstein R.Y. (2005) Ann. Oper. Res., 134(1), 19-67.
  19. Do C.B., Mahabhashyam M.S.P., Brudno M., Batzoglou S. (2005) Genome Res., 15(2), 330-340.
  20. Durbin R., Eddy S., Krogh A., Mitchison G. (2002) Biological sequence analysis. Probabilistic models of proteins and nucleic acids, 7th Ed., Cambridge University Press, Cambridge, UK, p. 356.
  21. Falkenauer E. (1998) Genetic Algorithms and Grouping Problems, Wiley, New York, p. 220.
  22. Feng D., Doolittle R.F. (1987) J. Mol. Evol., 25, 351-360.
  23. Fogel L.J. (1999) Intelligence Through Simulated Evolution: Forty Years of Evolutionary Programming, Wiley Interscience, New York, p. 162.
  24. Fraser A. (1957) Aust. J. Biol. Sci., 10, 484-491.
  25. Gilks W.R., Roberts G.O. (1996) Strategies for improving MCMC, In: Markov Chain Monte Carlo in Practice: Interdisciplinary Statistics (W.R. Gilks, S. Richardson, D. Spiegelhalter, Eds.), Chapman & Hall, London.
  26. Glover F. (1977) Decision Sci., 8(1), 156-166.
  27. Gondro C., Kinghorn B.P. (2007) Genet. Mol. Res., 6(4), 964-982.
  28. Gotoh O. (1986) J. Theor. Biol., 121, 327-337.
  29. Greenberg M.E., Ziff E.B. (1984) Nature, 311(5985), 433-438.
  30. Grefenstette J.J. (1984) Genesis: A system for using genetic search procedures. In: Proceedings of a Conference on Intelligent Systems and Machines, Rochester, MI, 161-165.
  31. Holland J.H. (1975) Adaptation in Natural and Artificial Systems, Univ. of Michigan Press, Ann Arbor, p. 183.
  32. Huang X. (1992) Genomics, 14, 18-25.
  33. Huelsenbeck J.P., Ronquist F., Nielsen R., Bollback J.P. (2001) Science, 294(5550), 2310-2314.
  34. Hvidsten T.R., Komorowski J., Sandvik A.K., Laegreid A. (2001) Pac. Symp. Biocomput., 6, 299-310.
  35. Iyer V.R., Eisen M.B., Ross D.T., Schuler G., Moore T., Lee J.C., Trent J.M., Staudt L.M., Hudson J.Jr., Boguski M.S., Lashkari D., Shalon D., Botstein D., Brown P.O. (1999) Science, 283(5398), 83-87.
  36. Jähner D., Hunter T. (1991) Mol. Cell. Biol., 11(7), 3682-3690.
  37. Joshi R.R. (2007) Current Bioinformatics, 2(2), 113-131.
  38. Karlin S., Altschul S.F. (1990) Proc. Natl. Acad. Sci. USA, 87(6), 2264-2268.
  39. Kjellström G. (1991) J. Optim. Theor. Appl., 71(3), 589-597.
  40. Kobti Z., Reynolds R.G., Kohler T. (2004) SwarmFest 8(online), p. 8.
  41. Krogh A., Brown M., Mian I.S., Sjolander K., Haussler D. (1994) J. Mol. Biol., 235(5), 1501-1531.
  42. Lamarck J.B.P.A. (1830) An Exposition of Zoological Philosophy, JB Baillère, Paris, p. 420 and p. 450 (in french).
  43. Lathrop R.H. (1994) Protein Engng., 7(9), 1059-1068.
  44. Lau L.F., Nathans D. (1987) Proc. Natl. Acad. Sci. USA, 84(5), 1182-1186.
  45. Lipman D.J., Altschul S.F., Kececioglu J.D. (1989) Proc. Natl. Acad. Sci USA, 86(12), 4412-4415.
  46. Löytynoja A., Goldman N. (2005) Proc. Natl. Acad. Sci. USA, 102(30), 10557-10562.
  47. Moore M.J., Bell C.D., Soltis P.S., Soltis D.E. (2007) Proc. Natl. Acad. Sci. USA 104(49), 19363-19368.
  48. Mulder N.J., Apweiler R. (2008) Curr. Protoc. Bioinformatics, Suppl. 21, 2.7.1-2.7.18.
  49. Murata M., Richardson J.S., Sussman J.L. (1985) Proc. Natl. Acad. Sci. USA, 82, 3073-3077.
  50. Notredame C., O’Brien E.A., Higgins D.G. (1997) Nucleic Acid Res., 25(22), 4570-4580.
  51. Pan K.-H., Lih C.-J., Cohen S.N. (2002) Proc. Natl. Acad. Sci. USA, 99(4), 2118-2123.
  52. Parsons R., Forrest S., Burks C. (1993) Genetic Algorithms for DNA Sequence Assembly, In: ISMB-93 Proceedings, 310-318.
  53. Rechenberg I. (1973) Evolutionsstrategies – Optimierung technischer Systeme nach Prinzipien der biologischen Information, Frommann-Holzboog Verlag, Stuttgart.
  54. Ronquist F., Huelsenbeck J.P. (2003) Bioinformatics, 19(12), 1572-1574.
  55. Russell D.J., Otu H.H., Sayood K. (2008) BMC Bioinformatics, 9, Article no. 306.
  56. Ryder K., Lau L.F., Nathans D. (1988) Proc. Natl. Acad. Sci. USA, 85(5), 1487-1491.
  57. Schefel H.-P. (1981) Numerical Optimization of Computer Models, John Wiley and Sons Ltd, New York, p. 398.
  58. Schwefel H.-P. (1995) Evolution and Optimum Seeking, Wiley & Sons, New York, p. 456.
  59. Segal E., Taskar B., Gasch A., Friedman N., Koller D. (2001) Bioinformatics, 17(S1), 243-252.
  60. Shevchenko A., Valcu C.-M., Junqueira M. (2009) J. Proteomics, 72(2), 137-144.
  61. Smith J.E. (2007) IEEE Trans. Syst. Man. Cy. B, 37(1), 6-17.
  62. Taylor W.R. (1986) J. Mol. Biol., 188, 233-258.
  63. Vingron M., Argos P. (1990) Protein Engng., 3(7), 565-569.
  64. Waterman M.S. (1984) Bull. Math. Biol., 46, 473-500.
  65. Waterman M.S., Vingron M. (1994) Proc. Natl. Acad. Sci. USA, 91(11), 4625-4628.
  66. Yuan J.S., Burris J., Stewart N.R., Mentewab A., Neal Jr. C.N. (2007) BMC Bioinformatics, 8(Suppl. 7), Article no. S6.
  67. Zheng X., Qin Y., Wang J. (2009) Math. Biosci., 217(2), 159-166.
  68. Zwickl D.J. (2006) Genetic algorithm approaches for the phylogenetic analysis of large biological sequence datasets under the maximum likelihood criterion. Ph.D. dissertation, The University of Texas at Austin.