Knighting in sequence biology
Edward N. Trifonov
Genome classification, the construction of phylogenetic trees, has today become a major approach in studying the evolutionary relatedness of various species in their vast diversity. Although modern genome clustering delivers trees which are very similar to those generated by classical means, and the basic terminology is the same, the phenotypic traits and habitats are no longer the playground for the classification. The sequence space is the playground now. The phenotypic traits are replaced by sequence characteristics, “words”, in particular. Matter-of-factually, the phenotype and genotype merged, to the confusion of both classical and modern phylogeneticists. Accordingly, a completely new vocabulary of stringology, information theory and applied mathematics took over. And a new brand of scientists emerged: those who do know the math and, simultaneously, (do?) know biology. The book is written by authors of this new brand. There is no way to test their literacy in biology, as no biologist by training would even try to enter the elite circle of those who master their almost occult language. But the army of informaticians, formal linguists and mathematicians humbly (or aggressively) longing to join modern biology got an excellent introduction to the field of genome clustering, written by a team of their kin.
Biological Background.- Biological Classification.- Mathematical Models for the Analysis of Natural-Language Documents.- DNA Texts.- N-Gram Spectra of the DNA Text.- Application of Compositional Spectra to DNA Sequences.- Marker-Function Profile-Based Clustering.- Genome as a Bag of Genes – The Whole-Genome Phylogenetics.
The study of language texts at the level of formal non-semantic models has a long history. Suffice it to say that the well-known Markov chains were first introduced as one such model. The representation of biological data as text and, consequently, the application of text-analysis models in the field of comparative genomics are substantially newer; nevertheless, the methods are well developed. In this book, we try to juxtapose linguistic and bioinformatics models of text analysis. So it can be read, in a sense, “in two directions”: the book is written so as to appeal to the bioinformatician, who may be interested in finding techniques that initially appeared in natural language analysis, and to the computational linguist, who may be surprised to discover familiar methods used in bioinformatics. In the presentation of the material, the authors nevertheless give preference to their professional field, bioinformatics. Even so, a specialist in bioinformatics can find something new in this book. For example, it includes a review of the main data mining models that generate text spectra.
The chapters of the book assume neither advanced mathematical skills nor more than a beginner's knowledge of molecular biology. Relevant biological concepts are introduced at the beginning of the book. Several computer science issues relevant to the topics of the book are reviewed in the three appendices: clustering, sequence complexity, and DNA curvature modeling.
Presents a general spectrum approach in bioinformatics
Written by experts in this field
Product details
ISBN
9783642263408
Published
2012-06-28
Publisher
Springer-Verlag Berlin and Heidelberg GmbH & Co. K
Height
235 mm
Width
155 mm
Audience level
Research, P, 06
Language
English
Format
Paperback
Number of pages
206