Title:
Bioinformatics : sequence alignment and Markov models
Personal Author:
Sharma, Kal Renganathan
Publication Information:
New York : McGraw-Hill, 2009
Physical Description:
xvi, 320 p. : ill. ; 24 cm.
ISBN:
9780071593069

Available:*

Item Barcode:
30000010190573
Call Number:
QH324.2 S52 2009
Material Type:
Open Access Book
Item Category 1:
Book

Summary


GET FULLY UP-TO-DATE ON BIOINFORMATICS: THE TECHNOLOGY OF THE 21ST CENTURY

Bioinformatics showcases the latest developments in the field along with all the foundational information you'll need. It provides in-depth coverage of a wide range of autoimmune disorders and detailed analyses of suffix trees, plus late-breaking advances regarding biochips and genomes.

Featuring helpful gene-finding algorithms, Bioinformatics offers key information on sequence alignment, HMMs, HMM applications, protein secondary structure, microarray techniques, and drug discovery and development. Helpful diagrams accompany mathematical equations throughout, and exercises appear at the end of each chapter to facilitate self-evaluation.

This thorough, up-to-date resource features:

Worked-out problems illustrating concepts and models
End-of-chapter exercises for self-evaluation
Material based on student feedback
Illustrations that clarify difficult math problems
A list of bioinformatics-related websites

Bioinformatics covers:

Sequence representation and alignment
Hidden Markov models
Applications of HMMs
Gene finding
Protein secondary structure prediction
Microarray techniques
Drug discovery and development
Internet resources and public domain databases


Author Notes

Kal Renganathan Sharma, Ph.D., P.E., Adjunct Professor, Department of Chemical Engineering, Prairie View A&M University, Prairie View, Texas


Table of Contents

Preface
Acknowledgments
1 Preliminaries
1.1 Molecular Biology
1.1.1 Amino Acids and Proteins
1.1.2 Structures of Proteins
1.1.3 Sequence Distribution of Insulin
1.1.4 Bioseparation Techniques
1.1.5 Nucleic Acids and Genetic Code
1.1.6 Genomes: Diversity, Size, and Structure
1.2 Probability and Statistics
1.2.1 Three Definitions of Probability
1.2.2 Bayes' Theorem and Conditional Probability
1.2.3 Independent Events and Bernoulli's Theorem
1.2.4 Discrete Probability Distributions
1.2.5 Continuous Probability Distributions
1.2.6 Statistical Inference and Hypothesis Testing
1.3 Which Is Larger, 2^n or n^2?
1.4 Big O Notation and Asymptotic Order of Functions
Summary
References and Sources
Exercises
Part 1 Sequence Alignment and Representation
2 Alignment of a Pair of Sequences
Objectives
2.1 Introduction to Pairwise Sequence Alignment
2.2 Why Study Sequence Alignment
2.3 Alignment Grading Function
2.4 Optimal Global Alignment of a Pair of Sequences
2.4.1 Needleman and Wunsch Algorithm
2.5 Dynamic Programming
2.6 Time Analysis and Space Efficiency
2.7 Dynamic Arrays and O(N) Space
2.8 Subquadratic Algorithms for Longest Common Subsequence Problems
2.9 Optimal Local Alignment of a Pair of Sequences
2.9.1 Smith and Waterman Algorithm
2.10 Affine Gap Model
2.11 Greedy Algorithms for Pairwise Alignment
2.12 Other Alignment Methods
2.13 PAM and BLOSUM Matrices
Summary
References
Further Reading
Exercises
3 Sequence Representation and String Algorithms
Objectives
3.1 Suffix Trees
3.1.1 Overview of Suffix Trees in Sequence Analysis
3.2 Algorithm for Suffix Tree Representation of a Sequence
3.3 Streaming a Sequence Against a Suffix Tree
3.4 String Algorithms
3.4.1 Rabin-Karp Algorithm
3.4.2 Knuth-Morris-Pratt (KMP) Algorithm
3.4.3 Boyer-Moore Algorithm
3.4.4 Finite Automaton
3.5 Suffix Trees in String Algorithms
3.6 Look-up Tables
Summary
References
Exercises
4 Multiple-Sequence Alignment
Objectives
4.1 What Is Multiple-Sequence Alignment?
4.2 Definitions of Multiple Global Alignment and Sum of Pairs
4.2.1 Multiple Global Alignment
4.2.2 Sum of Pairs
4.3 Optimal MSA by Dynamic Programming
4.4 Theorem of Wang and Jiang [2]
4.5 What Are NP Complete Problems?
4.6 Center-Star-Alignment Algorithm [4]
4.6.1 Time Analysis
4.7 Progressive Alignment Methods
4.8 The Consensus Sequence
4.9 Greedy Method
4.10 Geometry of Multiple Sequences
Summary
References
Exercises
Part 2 Probability Models
5 Hidden Markov Models and Applications
Objectives
5.1 Introduction
5.2 kth-order Markov Chain
5.3 DNA Sequence and Geometric Distribution [2-4]
5.4 Three Questions in the HMM
5.5 Evaluation Problem and Forward Algorithm
5.6 Decoding Problem and Viterbi Algorithm
5.7 Relative Entropy
5.8 Probabilistic Approach to Phylogeny
5.9 Sequence Alignment Using HMMs
5.10 Protein Families
5.11 Wheel HMMs to Model Periodicity in DNA
5.12 Generalized HMM (GHMM)
5.13 Database Mining
5.14 Multiple Alignments
5.15 Classification Using HMMs
5.16 Signal Peptide and Signal Anchor Prediction by HMMs
5.17 Markov Model and Chargaff's Parity Rules
Summary
References
Exercises
6 Gene Finding, Protein Secondary Structure
Objectives
6.1 Introduction
6.2 Relative Entropy Site-Selection Problem
6.2.1 Greedy Approach
6.2.2 Gibbs Sampler
6.3 Maximum-Subsequence Problem
6.3.1 Bates and Constable Algorithm
6.3.2 Binomial Heap [4-7]
6.4 Interpolated Markov Model (IMM)
6.5 Shine Dalgarno SD Sites Finding
6.6 Gene Annotation Methods
6.7 Secondary Structures of Proteins
6.7.1 Neural Networks
6.7.2 PHD Architecture of Rost and Sander
6.7.3 Ensemble Method of Riis and Krogh [23]
6.7.4 Protein Secondary Structure Using HMMs
6.7.5 DAG RNNs: Directed Acyclic Graphs and Recursive NN Architecture and 3D Protein Structure Prediction
6.7.6 Annotate Subcellular Localization for Protein Structure
Summary
References
Exercises
Part 3 Measurement Techniques
7 Biochips
Objectives
7.1 Introduction
7.1.1 Microarrays, Biochips, and Disease
7.1.2 Five Steps and Ten Tips
7.1.3 Applications of Microarrays
7.2 Microarray Detection
7.2.1 Fluorescence Detection and Optical Requirements
7.2.2 Confocal Scanning Microscope
7.3 Microarray Surfaces
7.4 Phosphoramidite Synthesis
7.5 Microarray Manufacture
7.6 Normalization for cDNA Microarray Data
Summary
References
Exercises
8 Electrophoretic Techniques and Finite Speed of Diffusion
Objectives
8.1 Role of Electrophoresis in the Measurement of Sequence Distribution
8.2 Fick's Laws of Molecular Diffusion
8.3 Generalized Fick's Law of Diffusion
8.3.1 Derivation of a Generalized Fick's Law of Diffusion
8.3.2 Taitel Paradox and Final Time Condition
8.3.3 Relativistic Transformation of Coordinates
8.3.4 Periodic Boundary Condition
8.4 Electrophoresis Apparatus
8.5 Electrophoretic Term, Ballistic Term, and Fick Term in the Governing Equation
Summary
References
Exercises
A Internet Hotlinks to Public-Domain Databases
B PERL for Bioinformaticists
Index