Bioinformatics : sequence alignment and Markov models

GET FULLY UP-TO-DATE ON BIOINFORMATICS: THE TECHNOLOGY OF THE 21ST CENTURY

Bioinformatics showcases the latest developments in the field along with all the foundational information you'll need. It provides in-depth coverage of a wide range of autoimmune disorders and detailed analyses of suffix trees, plus late-breaking advances regarding biochips and genomes.

Featuring helpful gene-finding algorithms, Bioinformatics offers key information on sequence alignment, HMMs, HMM applications, protein secondary structure, microarray techniques, and drug discovery and development. Helpful diagrams accompany mathematical equations throughout, and exercises appear at the end of each chapter to facilitate self-evaluation.

This thorough, up-to-date resource features:

Worked-out problems illustrating concepts and models
End-of-chapter exercises for self-evaluation
Material based on student feedback
Illustrations that clarify difficult math problems
A list of bioinformatics-related websites

Bioinformatics covers:

Sequence representation and alignment
Hidden Markov models
Applications of HMMs
Gene finding
Protein secondary structure prediction
Microarray techniques
Drug discovery and development
Internet resources and public domain databases

Author Notes

Kal Renganathan Sharma, Ph.D., P.E., Adjunct Professor, Department of Chemical Engineering, Prairie View A&M University, Prairie View, Texas

Preface	p. xi
Acknowledgments	p. xv
1 Preliminaries	p. 1
1.1 Molecular Biology	p. 2
1.1.1 Amino Acids and Proteins	p. 2
1.1.2 Structures of Proteins	p. 3
1.1.3 Sequence Distribution of Insulin	p. 6
1.1.4 Bioseparation Techniques	p. 9
1.1.5 Nucleic Acids and Genetic Code	p. 12
1.1.6 Genomes: Diversity, Size, and Structure	p. 20
1.2 Probability and Statistics	p. 23
1.2.1 Three Definitions of Probability	p. 24
1.2.2 Bayes' Theorem and Conditional Probability	p. 25
1.2.3 Independent Events and Bernoulli's Theorem	p. 25
1.2.4 Discrete Probability Distributions	p. 26
1.2.5 Continuous Probability Distributions	p. 28
1.2.6 Statistical Inference and Hypothesis Testing	p. 30
1.3 Which Is Larger, 2^n or n^2?	p. 31
1.4 Big O Notation and Asymptotic Order of Functions	p. 32
Summary	p. 33
References and Sources	p. 34
Exercises	p. 35
Part 1 Sequence Alignment and Representation
2 Alignment of a Pair of Sequences	p. 41
Objectives	p. 41
2.1 Introduction to Pairwise Sequence Alignment	p. 41
2.2 Why Study Sequence Alignment	p. 43
2.3 Alignment Grading Function	p. 47
2.4 Optimal Global Alignment of a Pair of Sequences	p. 51
2.4.1 Needleman and Wunsch Algorithm	p. 51
2.5 Dynamic Programming	p. 55
2.6 Time Analysis and Space Efficiency	p. 56
2.7 Dynamic Arrays and O(N) Space	p. 56
2.8 Subquadratic Algorithms for Longest Common Subsequence Problems	p. 57
2.9 Optimal Local Alignment of a Pair of Sequences	p. 59
2.9.1 Smith and Waterman Algorithm	p. 59
2.10 Affine Gap Model	p. 60
2.11 Greedy Algorithms for Pairwise Alignment	p. 63
2.12 Other Alignment Methods	p. 65
2.13 PAM and BLOSUM Matrices	p. 66
Summary	p. 69
References	p. 70
Further Reading	p. 71
Exercises	p. 71
3 Sequence Representation and String Algorithms	p. 85
Objectives	p. 85
3.1 Suffix Trees	p. 85
3.1.1 Overview of Suffix Trees in Sequence Analysis	p. 85
3.2 Algorithm for Suffix Tree Representation of a Sequence	p. 88
3.3 Streaming a Sequence Against a Suffix Tree	p. 89
3.4 String Algorithms	p. 91
3.4.1 Rabin-Karp Algorithm	p. 92
3.4.2 Knuth-Morris-Pratt (KMP) Algorithm	p. 92
3.4.3 Boyer-Moore Algorithm	p. 94
3.4.4 Finite Automaton	p. 96
3.5 Suffix Trees in String Algorithms	p. 97
3.6 Look-up Tables	p. 99
Summary	p. 100
References	p. 101
Exercises	p. 102
4 Multiple-Sequence Alignment	p. 115
Objectives	p. 115
4.1 What Is Multiple-Sequence Alignment?	p. 115
4.2 Definitions of Multiple Global Alignment and Sum of Pairs	p. 117
4.2.1 Multiple Global Alignment	p. 117
4.2.2 Sum of Pairs	p. 117
4.3 Optimal MSA by Dynamic Programming	p. 117
4.4 Theorem of Wang and Jiang [2]	p. 118
4.5 What Are NP-Complete Problems?	p. 118
4.6 Center-Star-Alignment Algorithm [4]	p. 119
4.6.1 Time Analysis	p. 119
4.7 Progressive Alignment Methods	p. 121
4.8 The Consensus Sequence	p. 122
4.9 Greedy Method	p. 123
4.10 Geometry of Multiple Sequences	p. 123
Summary	p. 125
References	p. 125
Exercises	p. 126
Part 2 Probability Models
5 Hidden Markov Models and Applications	p. 133
Objectives	p. 133
5.1 Introduction	p. 133
5.2 kth-order Markov Chain	p. 134
5.3 DNA Sequence and Geometric Distribution [2-4]	p. 135
5.4 Three Questions in the HMM	p. 143
5.5 Evaluation Problem and Forward Algorithm	p. 146
5.6 Decoding Problem and Viterbi Algorithm	p. 146
5.7 Relative Entropy	p. 147
5.8 Probabilistic Approach to Phylogeny	p. 149
5.9 Sequence Alignment Using HMMs	p. 152
5.10 Protein Families	p. 153
5.11 Wheel HMMs to Model Periodicity in DNA	p. 156
5.12 Generalized HMM (GHMM)	p. 157
5.13 Database Mining	p. 160
5.14 Multiple Alignments	p. 160
5.15 Classification Using HMMs	p. 161
5.16 Signal Peptide and Signal Anchor Prediction by HMMs	p. 162
5.17 Markov Model and Chargaff's Parity Rules	p. 163
Summary	p. 164
References	p. 165
Exercises	p. 166
6 Gene Finding, Protein Secondary Structure	p. 179
Objectives	p. 179
6.1 Introduction	p. 179
6.2 Relative Entropy Site-Selection Problem	p. 180
6.2.1 Greedy Approach	p. 180
6.2.2 Gibbs Sampler	p. 181
6.3 Maximum-Subsequence Problem	p. 182
6.3.1 Bates and Constable Algorithm	p. 182
6.3.2 Binomial Heap [4-7]	p. 182
6.4 Interpolated Markov Model (IMM)	p. 184
6.5 Shine-Dalgarno (SD) Sites Finding	p. 185
6.6 Gene Annotation Methods	p. 187
6.7 Secondary Structures of Proteins	p. 191
6.7.1 Neural Networks	p. 193
6.7.2 PHD Architecture of Rost and Sander	p. 196
6.7.3 Ensemble Method of Riis and Krogh [23]	p. 198
6.7.4 Protein Secondary Structure Using HMMs	p. 199
6.7.5 DAG RNNs: Directed Acyclic Graphs and Recursive NN Architecture and 3D Protein Structure Prediction	p. 200
6.7.6 Annotate Subcellular Localization for Protein Structure	p. 201
Summary	p. 203
References	p. 204
Exercises	p. 206
Part 3 Measurement Techniques
7 Biochips	p. 213
Objectives	p. 213
7.1 Introduction	p. 213
7.1.1 Microarrays, Biochips, and Disease	p. 214
7.1.2 Five Steps and Ten Tips	p. 218
7.1.3 Applications of Microarrays	p. 220
7.2 Microarray Detection	p. 223
7.2.1 Fluorescence Detection and Optical Requirements	p. 223
7.2.2 Confocal Scanning Microscope	p. 224
7.3 Microarray Surfaces	p. 227
7.4 Phosphoramidite Synthesis	p. 231
7.5 Microarray Manufacture	p. 233
7.6 Normalization for cDNA Microarray Data	p. 236
Summary	p. 240
References	p. 241
Exercises	p. 242
8 Electrophoretic Techniques and Finite Speed of Diffusion	p. 245
Objectives	p. 245
8.1 Role of Electrophoresis in the Measurement of Sequence Distribution	p. 245
8.2 Fick's Laws of Molecular Diffusion	p. 246
8.3 Generalized Fick's Law of Diffusion	p. 249
8.3.1 Derivation of a Generalized Fick's Law of Diffusion	p. 251
8.3.2 Taitel Paradox and Final Time Condition	p. 254
8.3.3 Relativistic Transformation of Coordinates	p. 259
8.3.4 Periodic Boundary Condition	p. 267
8.4 Electrophoresis Apparatus	p. 269
8.5 Electrophoretic Term, Ballistic Term, and Fick Term in the Governing Equation	p. 270
Summary	p. 274
References	p. 275
Exercises	p. 276
A Internet Hotlinks to Public-Domain Databases	p. 287
B PERL for Bioinformaticists	p. 299
Index	p. 303