Skip to:Content
|
Bottom
Cover image for Combinatorial pattern matching algorithms in computational biology using Perl and R
Title:
Combinatorial pattern matching algorithms in computational biology using Perl and R
Personal Author:
Series:
Chapman & Hall/CRC mathematical and computational biology series
Publication Information:
Boca Raton, FL : Chapman & Hall/CRC, 2009
Physical Description:
352 p. : ill. ; 24 cm.
ISBN:
9781420069730

Available:*

Library
Item Barcode
Call Number
Material Type
Item Category 1
Status
Searching...
30000010209510 QH324.2 V34 2009 Open Access Book Book
Searching...
Searching...
30000003499468 QH324.2 V34 2009 Open Access Book Book
Searching...

On Order

Summary

Summary

Emphasizing the search for patterns within and between biological sequences, trees, and graphs, Combinatorial Pattern Matching Algorithms in Computational Biology Using Perl and R shows how combinatorial pattern matching algorithms can solve computational biology problems that arise in the analysis of genomic, transcriptomic, proteomic, metabolomic, and interactomic data. It implements the algorithms in Perl and R, two widely used scripting languages in computational biology.

The book provides a well-rounded explanation of traditional issues as well as an up-to-date account of more recent developments, such as graph similarity and search. It is organized around the specific algorithmic problems that arise when dealing with structures that are commonly found in computational biology, including biological sequences, trees, and graphs. For each of these structures, the author makes a clear distinction between problems that arise in the analysis of one structure and in the comparative analysis of two or more structures. He also presents phylogenetic trees and networks as examples of trees and graphs in computational biology.

This book supplies a comprehensive view of the whole field of combinatorial pattern matching from a computational biology perspective. Along with thorough discussions of each biological problem, it includes detailed algorithmic solutions in pseudo-code, full Perl and R implementation, and pointers to other software, such as those on CPAN and CRAN.


Author Notes

Gabriel Valiente is Associate Professor in the Department of Software at the Technical University of Catalonia. His research interests include computational biology, bioinformatics, exact and approximate matching in graphs and patterns, and graph transformation.


Table of Contents

Foreword
Preface
1 Introductionp. 1
1.1 Combinatorial Pattern Matchingp. 3
1.2 Computational Biologyp. 4
1.3 A Motivating Example: Gene Predictionp. 4
Bibliographic Notesp. 17
I Sequence Pattern Matching
2 Sequencesp. 21
2.1 Sequences in Mathematicsp. 21
2.1.1 Counting Labeled Sequencesp. 22
2.2 Sequences in Computer Sciencep. 24
2.2.1 Traversing Labeled Sequencesp. 26
2.3 Sequences in Computational Biologyp. 29
2.3.1 Reverse Complementing DNA Sequencesp. 31
2.3.2 Counting RNA Sequencesp. 33
2.3.3 Generating DNA Sequencesp. 35
2.3.4 Representing Sequences in Perlp. 38
2.3.5 Representing Sequences in Rp. 40
Bibliographic Notesp. 42
3 Simple Pattern Matching in Sequencesp. 43
3.1 Finding Words in Sequencesp. 43
3.1.1 Word Composition of Sequencesp. 43
3.1.2 Alignment Free Comparison of Sequencesp. 49
Bibliographic Notesp. 52
4 General Pattern Matching in Sequencesp. 53
4.1 Finding Subsequencesp. 53
4.1.1 Suffix Arraysp. 56
4.2 Finding Common Subsequencesp. 67
4.2.1 Generalized Suffix Arraysp. 74
4.3 Comparing Sequencesp. 86
4.3.1 Edit Distance-Based Comparison of Sequencesp. 86
4.3.2 Alignment-Based Comparison of Sequencesp. 95
Bibliographic Notesp. 110
II Tree Pattern Matching
5 Treesp. 115
5.1 Trees in Mathematicsp. 115
5.1.1 Counting Labeled Treesp. 115
5.2 Trees in Computer Sciencep. 117
5.2.1 Traversing Rooted Treesp. 118
5.3 Trees in Computational Biologyp. 118
5.3.1 The Newick Linear Representationp. 123
5.3.2 Counting Phylogenetic Treesp. 125
5.3.3 Generating Phylogenetic Treesp. 126
5.3.4 Representing Trees in Perlp. 128
5.3.5 Representing Trees in Rp. 131
Bibliographic Notesp. 135
6 Simple Pattern Matching in Treesp. 137
6.1 Finding Paths in Unrooted Treesp. 137
6.1.1 Distances in Unrooted Treesp. 138
6.1.2 The Partition Distance between Unrooted Treesp. 140
6.1.3 The Nodal Distance between Unrooted Treesp. 144
6.2 Finding Paths in Rooted Treesp. 148
6.2.1 Distances in Rooted Treesp. 150
6.2.2 The Partition Distance between Rooted Treesp. 151
6.2.3 The Nodal Distance between Rooted Treesp. 151
Bibliographic Notesp. 152
7 General Pattern Matching in Treesp. 155
7.1 Finding Subtreesp. 155
7.1.1 Finding Subtrees Induced by Tripletsp. 156
7.1.2 Finding Subtrees Induced by Quartetsp. 159
7.2 Finding Common Subtreesp. 161
7.2.1 Maximum Agreement of Rooted Treesp. 161
7.2.2 Maximum Agreement of Unrooted Treesp. 172
7.3 Comparing Treesp. 172
7.3.1 The Triplets Distance between Rooted Treesp. 172
7.3.2 The Quartets Distance between Unrooted Treesp. 175
Bibliographic Notesp. 178
III Graph Pattern Matching
8 Graphsp. 181
8.1 Graphs in Mathematicsp. 181
8.1.1 Counting Labeled Graphsp. 182
8.2 Graphs in Computer Sciencep. 183
8.2.1 Traversing Directed Graphsp. 183
8.3 Graphs in Computational Biologyp. 184
8.3.1 The eNewick Linear Representationp. 193
8.3.2 Counting Phylogenetic Networksp. 195
8.3.3 Generating Phylogenetic Networksp. 198
8.3.4 Representing Graphs in Perlp. 202
8.3.5 Representing Graphs in Rp. 205
Bibliographic Notesp. 208
9 Simple Pattern Matching in Graphsp. 211
9.1 Finding Paths in Graphsp. 211
9.1.1 Distances in Graphsp. 214
9.1.2 The Path Multiplicity Distance between Graphsp. 220
9.1.3 The Tripartition Distance between Graphsp. 228
9.1.4 The Nodal Distance between Graphsp. 234
9.2 Finding Trees in Graphsp. 238
9.2.1 The Statistical Error between Graphsp. 243
Bibliographic Notesp. 246
10 General Pattern Matching in Graphsp. 247
10.1 Finding Subgraphsp. 247
10.1.1 Finding Subgraphs Induced by Tripletsp. 248
10.2 Finding Common Subgraphsp. 259
10.2.1 Maximum Agreement of Rooted Networksp. 259
10.3 Comparing Graphsp. 269
10.3.1 The Triplets Distance between Graphsp. 269
Bibliographic Notesp. 273
A Elements of Perlp. 275
A.1 Perl Scriptsp. 275
A.2 Overview of Perlp. 294
A.3 Perl Quick Reference Cardp. 297
Bibliographic Notesp. 304
B Elements of Rp. 305
B.1 R Scriptsp. 305
B.2 Overview of Rp. 323
B.3 R Quick Reference Cardp. 329
Bibliographic Notesp. 336
Referencesp. 339
Indexp. 351
Go to:Top of Page