Available:*
Library | Item Barcode | Call Number | Material Type | Item Category 1 | Status |
---|---|---|---|---|---|
Searching... | 30000010209510 | QH324.2 V34 2009 | Open Access Book | Book | Searching... |
Searching... | 30000003499468 | QH324.2 V34 2009 | Open Access Book | Book | Searching... |
On Order
Summary
Summary
Emphasizing the search for patterns within and between biological sequences, trees, and graphs, Combinatorial Pattern Matching Algorithms in Computational Biology Using Perl and R shows how combinatorial pattern matching algorithms can solve computational biology problems that arise in the analysis of genomic, transcriptomic, proteomic, metabolomic, and interactomic data. It implements the algorithms in Perl and R, two widely used scripting languages in computational biology.
The book provides a well-rounded explanation of traditional issues as well as an up-to-date account of more recent developments, such as graph similarity and search. It is organized around the specific algorithmic problems that arise when dealing with structures that are commonly found in computational biology, including biological sequences, trees, and graphs. For each of these structures, the author makes a clear distinction between problems that arise in the analysis of one structure and in the comparative analysis of two or more structures. He also presents phylogenetic trees and networks as examples of trees and graphs in computational biology.
This book supplies a comprehensive view of the whole field of combinatorial pattern matching from a computational biology perspective. Along with thorough discussions of each biological problem, it includes detailed algorithmic solutions in pseudo-code, full Perl and R implementation, and pointers to other software, such as those on CPAN and CRAN.
Author Notes
Gabriel Valiente is Associate Professor in the Department of Software at the Technical University of Catalonia. His research interests include computational biology, bioinformatics, exact and approximate matching in graphs and patterns, and graph transformation.
Table of Contents
Foreword | |
Preface | |
1 Introduction | p. 1 |
1.1 Combinatorial Pattern Matching | p. 3 |
1.2 Computational Biology | p. 4 |
1.3 A Motivating Example: Gene Prediction | p. 4 |
Bibliographic Notes | p. 17 |
I Sequence Pattern Matching | |
2 Sequences | p. 21 |
2.1 Sequences in Mathematics | p. 21 |
2.1.1 Counting Labeled Sequences | p. 22 |
2.2 Sequences in Computer Science | p. 24 |
2.2.1 Traversing Labeled Sequences | p. 26 |
2.3 Sequences in Computational Biology | p. 29 |
2.3.1 Reverse Complementing DNA Sequences | p. 31 |
2.3.2 Counting RNA Sequences | p. 33 |
2.3.3 Generating DNA Sequences | p. 35 |
2.3.4 Representing Sequences in Perl | p. 38 |
2.3.5 Representing Sequences in R | p. 40 |
Bibliographic Notes | p. 42 |
3 Simple Pattern Matching in Sequences | p. 43 |
3.1 Finding Words in Sequences | p. 43 |
3.1.1 Word Composition of Sequences | p. 43 |
3.1.2 Alignment Free Comparison of Sequences | p. 49 |
Bibliographic Notes | p. 52 |
4 General Pattern Matching in Sequences | p. 53 |
4.1 Finding Subsequences | p. 53 |
4.1.1 Suffix Arrays | p. 56 |
4.2 Finding Common Subsequences | p. 67 |
4.2.1 Generalized Suffix Arrays | p. 74 |
4.3 Comparing Sequences | p. 86 |
4.3.1 Edit Distance-Based Comparison of Sequences | p. 86 |
4.3.2 Alignment-Based Comparison of Sequences | p. 95 |
Bibliographic Notes | p. 110 |
II Tree Pattern Matching | |
5 Trees | p. 115 |
5.1 Trees in Mathematics | p. 115 |
5.1.1 Counting Labeled Trees | p. 115 |
5.2 Trees in Computer Science | p. 117 |
5.2.1 Traversing Rooted Trees | p. 118 |
5.3 Trees in Computational Biology | p. 118 |
5.3.1 The Newick Linear Representation | p. 123 |
5.3.2 Counting Phylogenetic Trees | p. 125 |
5.3.3 Generating Phylogenetic Trees | p. 126 |
5.3.4 Representing Trees in Perl | p. 128 |
5.3.5 Representing Trees in R | p. 131 |
Bibliographic Notes | p. 135 |
6 Simple Pattern Matching in Trees | p. 137 |
6.1 Finding Paths in Unrooted Trees | p. 137 |
6.1.1 Distances in Unrooted Trees | p. 138 |
6.1.2 The Partition Distance between Unrooted Trees | p. 140 |
6.1.3 The Nodal Distance between Unrooted Trees | p. 144 |
6.2 Finding Paths in Rooted Trees | p. 148 |
6.2.1 Distances in Rooted Trees | p. 150 |
6.2.2 The Partition Distance between Rooted Trees | p. 151 |
6.2.3 The Nodal Distance between Rooted Trees | p. 151 |
Bibliographic Notes | p. 152 |
7 General Pattern Matching in Trees | p. 155 |
7.1 Finding Subtrees | p. 155 |
7.1.1 Finding Subtrees Induced by Triplets | p. 156 |
7.1.2 Finding Subtrees Induced by Quartets | p. 159 |
7.2 Finding Common Subtrees | p. 161 |
7.2.1 Maximum Agreement of Rooted Trees | p. 161 |
7.2.2 Maximum Agreement of Unrooted Trees | p. 172 |
7.3 Comparing Trees | p. 172 |
7.3.1 The Triplets Distance between Rooted Trees | p. 172 |
7.3.2 The Quartets Distance between Unrooted Trees | p. 175 |
Bibliographic Notes | p. 178 |
III Graph Pattern Matching | |
8 Graphs | p. 181 |
8.1 Graphs in Mathematics | p. 181 |
8.1.1 Counting Labeled Graphs | p. 182 |
8.2 Graphs in Computer Science | p. 183 |
8.2.1 Traversing Directed Graphs | p. 183 |
8.3 Graphs in Computational Biology | p. 184 |
8.3.1 The eNewick Linear Representation | p. 193 |
8.3.2 Counting Phylogenetic Networks | p. 195 |
8.3.3 Generating Phylogenetic Networks | p. 198 |
8.3.4 Representing Graphs in Perl | p. 202 |
8.3.5 Representing Graphs in R | p. 205 |
Bibliographic Notes | p. 208 |
9 Simple Pattern Matching in Graphs | p. 211 |
9.1 Finding Paths in Graphs | p. 211 |
9.1.1 Distances in Graphs | p. 214 |
9.1.2 The Path Multiplicity Distance between Graphs | p. 220 |
9.1.3 The Tripartition Distance between Graphs | p. 228 |
9.1.4 The Nodal Distance between Graphs | p. 234 |
9.2 Finding Trees in Graphs | p. 238 |
9.2.1 The Statistical Error between Graphs | p. 243 |
Bibliographic Notes | p. 246 |
10 General Pattern Matching in Graphs | p. 247 |
10.1 Finding Subgraphs | p. 247 |
10.1.1 Finding Subgraphs Induced by Triplets | p. 248 |
10.2 Finding Common Subgraphs | p. 259 |
10.2.1 Maximum Agreement of Rooted Networks | p. 259 |
10.3 Comparing Graphs | p. 269 |
10.3.1 The Triplets Distance between Graphs | p. 269 |
Bibliographic Notes | p. 273 |
A Elements of Perl | p. 275 |
A.1 Perl Scripts | p. 275 |
A.2 Overview of Perl | p. 294 |
A.3 Perl Quick Reference Card | p. 297 |
Bibliographic Notes | p. 304 |
B Elements of R | p. 305 |
B.1 R Scripts | p. 305 |
B.2 Overview of R | p. 323 |
B.3 R Quick Reference Card | p. 329 |
Bibliographic Notes | p. 336 |
References | p. 339 |
Index | p. 351 |