Cover image for Theory and mathematical methods for bioinformatics
Title:
Theory and mathematical methods for bioinformatics
Personal Author:
Series:
Biological and medical physics, biomedical engineering
Publication Information:
New York, NY : Springer, 2008
Physical Description:
xvi, 445 p. : ill. ; 24 cm.
ISBN:
9783540748908
Added Author:

Available:*

Library
Item Barcode
Call Number
Material Type
Item Category 1
Status
Searching...
30000010178995 QH324.2 S53 2008 Open Access Book Book
Searching...

On Order

Summary

Summary

Bioinformatics is an interdisciplinary science which involves molecular bi- ogy, molecular chemistry, physics, mathematics, computational sciences, etc. Mostofthebooksonbiomathematicspublishedwithinthepasttenyearshave consistedofcollectionsofstandardbioinformaticsproblemsandinformational methods,andfocus mainly onthe logisticsofimplementing andmakinguse of various websites, databases, software packages and serving platforms. While these types of books do introduce some mathematical and computational methods alongside the software packages, they are lacking in a systematic and professional treatment of the mathematics behind these methods. It is signi?cant in the ?eld of bioinformatics that not only is the amount of data increasing exponentially, but collaboration is also both widening and deepening among biologists, chemists, physicists, mathematicians, and c- puter scientists. The sheer volume of problems and databases requires - searchers to continually develop software packages in order to process the huge amounts of data, utilizing the latest mathematical methods. The - tent of this book is to provide a professional and in-depth treatment of the mathematical topics necessary in the study of bioinformatics.


Author Notes

Shiyi Shen

Since 1985, professor of mathematics, 1987-1998: chair of the department of mathematics, and the dean of the college of mathematical sciences; a standing committee of China Mathematical Society, the director of the Tianjin Mathematical Society. 1984-1986: visiting scholar of Cornell University; visiting scholar of Stanford University, and visiting researcher at the Hong Kong Chinese University. Shiyi Shen's fields of scientific interest are informatics and bioinformatics. His publications include about 60 journal papers and 6 books (Chinese).

Jack Tuszynski

Professor (from 07/1993 until present). Department of Physics, University of Alberta

Research Manager of the Neurons Group, (May 1, 2000- June 1, 2001) Starlab NV, Brussels, Belgium

Visiting Professor, Department of Physics, Ecole Normale Superieure, Lyon, France (December 2000, June-September 2001)

Senior Visiting Fellow, Laboratory of Biomolecular Dynamics, Catholic University of Leuven, Belgium (November-December 2000 and February-March 2001)

Adjunct Professor (from March 1, 2000). Department of Oncology, Division of Medical Physics, University of Alberta.

Visiting Professor (07/1995 - 09/1995). Institut für Theoretische Physik, J. Liebig-Universität GieÃYen, Germany.

Visiting Professor (07/1993 - 08/1994). Institut für Theoretische Physik, H.Heine-Universität Düsseldorf, Germany.

McCalla Professor (07/1992 - 07/1993). Department of Physics, University of Alberta.

Guest Professor (summer 1992), Visting Scientist (summer 1994, spring 1996). Institute of Mathematical Modelling, Danish Technical University, Lyngby.

Associate Professor (07/1990 - 07/1993). Department of Physics, University of Alberta, Edmonton. Tenure granted effective July 1, 1991.

Assistant Professor (01/1988 - 06/1990). Department of Physics,University of Alberta, Edmonton. Field: theoretical condensed matter physics.

Honorary Assistant Professor (01/1988 - 01/1991). Department of Physics, Memorial University of Newfoundland, St. John's, Newfoundland.

Assistant Professor (09/1983 - 01/1988). Department of Physics, Memorial University of Newfoundland, St. John's. Field: theoretical condensed matter physics. Tenure granted as of September 1, 1987.

Post-doctoral Fellow (04/1983 - 09/1983). Chemistry Department, The University of Calgary. Supervisor: Professor R. Paul. Field: Molecular biophysics.


Table of Contents

Outlinep. 1
Part I Mutations and Alignments
1 Introductionp. 5
1.1 Mutation and Alignmentp. 5
1.1.1 Classification of Biological Sequencesp. 5
1.1.2 Definition of Mutations and Alignmentsp. 6
1.1.3 Progress on Alignment Algorithms and Problems to Be Solvedp. 8
1.1.4 Mathematical Problems Driven by Alignment and Structural Analysisp. 12
1.2 Basic Concepts in Alignment and Mathematical Modelsp. 13
1.2.1 Mutation and Alignmentp. 13
1.3 Dynamic Programming-Based Algorithms for Pairwise Alignmentp. 17
1.3.1 Introduction to Dynamic Programming-Based Algorithmsp. 17
1.3.2 The Needleman-Wunsch Algorithm, the Global Alignment Algorithmp. 18
1.3.3 The Smith-Waterman Algorithmp. 21
1.4 Other Notationsp. 24
1.4.1 Correlation Functions of Local Sequencesp. 24
1.4.2 Pairwise Alignment Matrices Among Multiple Sequencesp. 25
1.5 Remarksp. 26
1.6 Exercises, Analyses, and Computationp. 27
2 Stochastic Models of Mutations and Structural Analysisp. 29
2.1 Stochastic Sequences and Independent Sequence Pairsp. 29
2.1.1 Definitions and Notations of Stochastic Sequencesp. 29
2.1.2 Independently and Identically Distributed Sequencesp. 31
2.1.3 Independent Stochastic Sequence Pairsp. 33
2.1.4 Local Penalty Function and Limit Properties of 2-Dimensional Stochastic Sequencesp. 36
2.2 Stochastic Models of Flow Raised by Sequence Mutationsp. 37
2.2.1 Bernoulli Processesp. 37
2.2.2 Poisson Flowp. 40
2.2.3 Mutated Flows Resulting from the Four Mutation Typesp. 43
2.3 Stochastic Models of Type-I Mutated Sequencesp. 45
2.3.1 Description of Type-I Mutationp. 45
2.3.2 Properties of Type-I Mutationsp. 47
2.4 Type-II Mutated Sequencesp. 50
2.4.1 Description of Type-II Mutated Sequencesp. 51
2.4.2 Stochastic Models of Type-II Mutated Sequencesp. 51
2.4.3 Error Analysis of Type-II Mutated Sequencesp. 54
2.4.4 The Mixed Stochastic Models Caused by Type-I and Type-II Mutationsp. 57
2.5 Mutated Sequences Resulting from Type-III and Type-IV Mutationsp. 58
2.5.1 Stochastic Models of Type-III and Type-IV Mutated Sequencesp. 58
2.5.2 Estimation of the Errors Caused by Type-III and Type-IV Mutationsp. 59
2.5.3 Stochastic Models of Mixed Mutationsp. 61
2.6 Exercisesp. 64
3 Modulus Structure Theoryp. 67
3.1 Modulus Structure of Expanded and Compressed Sequencesp. 67
3.1.1 The Modulus Structures of Expanded Sequences and Compressed Sequencesp. 67
3.1.2 The Order Relation and the Binary Operators on the Set of Expanded Modesp. 71
3.1.3 Operators Induced by Modesp. 73
3.2 Modulus Structure of Sequence Alignmentp. 76
3.2.1 Modulus Structures Resulting from Multiple Alignmentp. 76
3.2.2 Structure Analysis of Pairwise Alignmentp. 78
3.2.3 Properties of Pairwise Alignmentp. 81
3.2.4 The Order Relation and the Operator Induced by Modulus Structurep. 84
3.3 Analysis of Modulus Structures Resulting from Sequence Mutationsp. 85
3.3.1 Mixed Sequence Mutationsp. 85
3.3.2 Structure Analysis of Purely Shifting Mutationsp. 87
3.3.3 Structural Representation of Mixed Mutationp. 93
3.4 The Binary Operations of Sequence Mutationsp. 93
3.4.1 The Order Relationship Among the Modes of Shifting Mutationsp. 93
3.4.2 Operators Induced by Modes of Shifting Mutationsp. 96
3.5 Error Analysis for Pairwise Alignmentp. 100
3.5.1 Uniform Alignment of Mutation Sequencesp. 100
3.5.2 Optimal Alignment and Uniform Alignmentp. 102
3.5.3 Error Analysis of Uniform Alignmentp. 104
3.5.4 Local Modification of Sequence Alignmentp. 106
3.6 Exercisesp. 107
4 Super Pairwise Alignmentp. 109
4.1 Principle of Statistical Decision-Based Algorithms for Pairwise Sequencesp. 109
4.1.1 Uniform Alignment and Parameter Estimation for Pairwise Sequencesp. 109
4.1.2 The Locally Uniform Alignment Resulting from Local Mutationp. 111
4.1.3 The Estimations of Mutation Position and Lengthp. 113
4.2 Operation Steps of the SPA and Its Improvementp. 115
4.2.1 Operation Steps of the SPAp. 115
4.2.2 Some Unsolved Problems and Discussions of SPAp. 118
4.2.3 Local Modifications for Sequence Alignmentp. 121
4.3 Index Analysis of SPAp. 122
4.3.1 The Statistical Features of Estimationsp. 122
4.3.2 Improvement of the Algorithm to Estimate {{\hat s}}^\astp. 128
4.3.3 The Computational Complexity of the SPAp. 131
4.3.4 Estimation for the Error of Uniform Alignment Induced by a Hybrid Stochastic Mutation Sequencep. 132
4.4 Applications of Sequence Alignment and Examplesp. 135
4.4.1 Several Applications of Sequence Alignmentp. 135
4.4.2 Examples of Pairwise Alignmentp. 137
4.5 Exercisesp. 146
5 Multiple Sequence Alignmentp. 149
5.1 Pairwise Alignment Among Multiple Sequencesp. 149
5.1.1 Using Pairwise Alignment to Process Multiple Sequencesp. 149
5.1.2 Topological Space Induced by Pairwise Alignment of Multiple Sequencesp. 150
5.2 Optimization Criteria of MAp. 156
5.2.1 The Definition of MAp. 156
5.2.2 Uniform Alignment Criteria and SP-Optimization Criteria for Multiple Sequencesp. 156
5.2.3 Discussion of the Optimization Criterion of MAp. 160
5.2.4 Optimization Problem Based on Shannon Entropyp. 164
5.2.5 The Similarity Rate and the Rate of Virtual Symbolsp. 170
5.3 Super Multiple Alignmentp. 172
5.3.1 The Situation for MAp. 172
5.3.2 Algorithm of SMAp. 174
5.3.3 Comparison Among Several Algorithmsp. 179
5.4 Exercises, Analyses, and Computationp. 180
6 Network Structures of Multiple Sequences Induced by Mutationp. 183
6.1 General Method of Constructing the Phylogenetic Treep. 183
6.1.1 Summaryp. 183
6.1.2 Distance-Based Methodsp. 184
6.1.3 Feature-Based (Maximum Parsimony) Methodsp. 188
6.1.4 Maximum-Likelihood Method and the Bayes Methodp. 191
6.2 Network Structure Generated by MAp. 197
6.2.1 Graph and Tree Generated by MAp. 197
6.2.2 Network System Generated by Mutations of Multiple Sequencesp. 200
6.3 The Application of Mutation Network Analysisp. 206
6.3.1 Selection of the Data Samplep. 206
6.3.2 The Basic Steps to Analyze the Sequencesp. 208
6.3.3 Remarks on the Alignment and Output Analysisp. 210
6.4 Exercises, Analyses, and Computationp. 216
7 Alignment Spacep. 219
7.1 Construction of Alignment Space and Its Basic Theoremsp. 219
7.1.1 What Is Alignment Space?p. 219
7.1.2 The Alignment Space Under General Metricp. 221
7.2 The Analysis of Data Structures in Alignment Spacesp. 224
7.2.1 Maximum Score Alignment and Minimum Penalty Alignmentp. 224
7.2.2 The Structure Mode of the Envelope of Pairwise Sequencesp. 226
7.2.3 Uniqueness of the Maximum Core and Minimum Envelope of Pairwise Sequencesp. 230
7.2.4 The Envelope and Core of Pairwise Sequencesp. 231
7.2.5 The Envelope of Pairwise Sequences and Its Alignment Sequencesp. 233
7.3 The Counting Theorem of the Optimal Alignment and Alignment Spheroidp. 237
7.3.1 The Counting Theorem of the Optimal Alignmentp. 237
7.3.2 Alignment Spheroidp. 238
7.4 The Virtual Symbol Operation in the Alignment Spacep. 241
7.4.1 The Definition of the Virtual Symbol Operatorp. 241
7.4.2 The Modulus Structure of the Virtual Symbol Operatorp. 243
7.4.3 The Isometric Operation and Micro-Adapted Operation of Virtual Symbolsp. 247
7.5 Exercises, Analyses, and Computationp. 250
Part II Protein Configuration Analysis
8 Background Information Concerning the Properties of Proteinsp. 255
8.1 Amino Acids and Peptide Chainsp. 255
8.1.1 Amino Acidsp. 255
8.1.2 Basic Structure of Peptide Chainsp. 257
8.2 Brief Introduction of Protein Configuration Analysisp. 259
8.2.1 Protein Structure Databasep. 259
8.2.2 Brief Introduction to Protein Structure Analysisp. 260
8.3 Analysis and Explorationp. 263
9 Informational and Statistical Iterative Analysis of Protein Secondary Structure Predictionp. 265
9.1 Protein Secondary Structure Prediction and Informational and Statistical Iterative Analysisp. 265
9.1.1 Protein Secondary Structure Predictionp. 265
9.1.2 Data Source and Informational Statistical Model of PSSPp. 267
9.1.3 Informational and Statistical Characteristic Analysis on Protein Secondary Structurep. 269
9.2 Informational and Statistical Calculation Algorithms for PSSPp. 271
9.2.1 Informational and Statistical Calculation for PSSPp. 271
9.2.2 Informational and Statistical Calculation Algorithm for PSSPp. 273
9.2.3 Discussion of the Resultsp. 275
9.3 Exercises, Analyses, and Computationp. 277
10 Three-Dimensional Structure Analysis of the Protein Backbone and Side Chainsp. 279
10.1 Space Conformation Theory of Four-Atom Pointsp. 279
10.1.1 Conformation Parameter System of Four-Atom Space Pointsp. 279
10.1.2 Phase Analysis on Four-Atom Space Pointsp. 283
10.1.3 Four-Atom Construction of Protein 3D Structurep. 286
10.2 Triangle Splicing Structure of Protein Backbonesp. 288
10.2.1 Triangle Splicing Belts of Protein Backbonesp. 288
10.2.2 Characteristic Analysis of the Protein Backbone Triangle Splicing Beltsp. 291
10.3 Structure Analysis of Protein Side Chainsp. 292
10.3.1 The Setting of Oxygen Atom O and Atom C B on the Backbonesp. 293
10.3.2 Statistical Analysis of the Structures of Tetrahedrons V O , V Bp. 295
10.4 Exercises, Analyses, and Computationp. 297
11 Alignment of Primary and Three-Dimensional Structures of Proteinsp. 299
11.1 Structure Analysis for Protein Sequencesp. 299
11.1.1 Alignment of Protein Sequencesp. 299
11.1.2 Differences and Similarities Between the Alignment of Protein Sequences and of DNA Sequencesp. 301
11.1.3 The Penalty Functions for the Alignment of Protein Sequencesp. 302
11.1.4 Key Points of the Alignment Algorithms of Protein Sequencesp. 306
11.2 Alignment of Protein Three-Dimensional Structuresp. 307
11.2.1 Basic Idea and Method of the Alignment of Three-Dimensional Structuresp. 307
11.2.2 Example of Computation in the Discrete Casep. 310
11.2.3 Example of Computation in Consecutive Casep. 314
11.3 Exercises, Analyses, and Computationp. 323
12 Depth Analysis of Protein Spatial Structurep. 325
12.1 Depth Analysis of Amino Acids in Proteinsp. 325
12.1.1 Introduction to the Concept of Depthp. 325
12.1.2 Definition and Calculation of Depthp. 327
12.1.3 Proof of Theorem 40p. 329
12.1.4 Proof of Theorem 41p. 334
12.2 Statistical Depth Analysis of Protein Spatial Particlesp. 335
12.2.1 Calculation for Depth Tendency Factor of Amino Acidp. 335
12.2.2 Analysis of Depth Tendency Factor of Amino Acidp. 338
12.2.3 Prediction for Depth of Multiple Peptide Chains in Protein Spatial Structurep. 342
12.2.4 The Level Function in Spatial Particle Systemp. 347
12.2.5 An Examplep. 347
12.3 Exercisesp. 353
13 Analysis of the Morphological Features of Protein Spatial Structurep. 355
13.1 Introductionp. 355
13.1.1 Morphological Features of Protein Spatial Structurep. 355
13.1.2 Several Basic Definitions and Symbolsp. 357
13.1.3 Preliminary Analysis of the Morphology of Spatial Particle Systemp. 360
13.1.4 Examplep. 362
13.2 Structural Analysis of Cavities and Channels in a Particle Systemp. 366
13.2.1 Definition, Classification and Calculation of Cavityp. 366
13.2.2 Relationship Between Cavitiesp. 368
13.2.3 Examplep. 371
13.3 Analysis of ¿-Accessible Radius in Spatial Particle Systemp. 371
13.3.1 Structural Analysis of a Directed Polyhedronp. 371
13.3.2 Definition and Basic Properties of ¿-Accessible Radiusp. 377
13.3.3 Basic Principles and Methods of ¿-Accessible Radiusp. 378
13.4 Recursive Algorithm of ¿-Functionp. 382
13.4.1 Calculation of the ¿-Function Generated by 0-Level Convex Hullp. 382
13.4.2 Recursive Calculation of ¿-Functionp. 384
13.4.3 Examplep. 387
13.5 Proof of Relative Theorems and Reasoning of Computational Formulasp. 387
13.5.1 Proofs of Several Theoremsp. 387
13.5.2 Reasoning of Several Formulasp. 390
13.6 Exercisesp. 394
14 Semantic Analysis for Protein Primary Structurep. 395
14.1 Semantic Analysis for Protein Primary Structurep. 395
14.1.1 The Definition of Semantic Analysis for Protein Primary Structurep. 395
14.1.2 Information-Based Stochastic Models in Semantic Analysisp. 397
14.1.3 Determination of Local Words Using Informational and Statistical Means and the Relative Entropy Density Function for the Second and Third Ranked Wordsp. 400
14.1.4 Semantic Analysis for Protein Primary Structurep. 402
14.2 Permutation and Combination Methods for Semantic Structuresp. 412
14.2.1 Notation Used in Combinatorial Graph Theoryp. 416
14.2.2 The Complexity of Databasesp. 420
14.2.3 Key Words and Core Words in a Databasep. 421
14.2.4 Applications of Combinatorial Analysisp. 425
14.3 Exercises, Analyses, and Computationp. 429
Epiloguep. 431
Referencesp. 433
Indexp. 441