Title:

Bioinformatics, biocomputing and Perl : an introduction to bioinformatics computing skills and practice

Personal Author:

Moorhouse, Michael

Publication Information:

Chichester : Wiley, 2004

ISBN:

9780470853313

Subject Term:

Bioinformatics

Computational biology

Perl (Computer program language)

Added Author:

Barry, Paul, 1966-

Available:

Library	Item Barcode	Call Number	Material Type	Item Category 1	Status
PSZ JB	30000004890848	QH324.2 M66 2004	Open Access Book	Book	Unknown

Bioinformatics, Biocomputing and Perl presents a modern introduction to bioinformatics computing skills and practice. Structuring its presentation around four main areas of study, this book covers the skills vital to the day-to-day activities of today's bioinformatician. Each chapter contains a series of maxims designed to highlight key points and there are exercises to supplement and cement the introduced material.

Working with Perl presents an extended tutorial introduction to programming through Perl, the premier programming technology of the bioinformatics community. Even though no previous programming experience is assumed, completing the tutorial equips the reader with the ability to produce powerful custom programs with ease.

Working with Data applies the programming skills acquired to processing a variety of bioinformatics data. In addition to advice on working with important data stores such as the Protein Data Bank, SWISS-PROT, EMBL and GenBank, considerable discussion is devoted to using bioinformatics data to populate relational database systems. The popular MySQL database is used in all examples.

Working with the Web presents a discussion of the Web-based technologies that allow the bioinformatics researcher to publish both data and applications on the Internet.

Working with Applications shifts gear from creating custom programs to using them. The tools described include Clustal-W, EMBOSS, STRIDE, BLAST and Xmgrace. An introduction to the important Bioperl Project concludes this chapter and rounds off the book.

Author Notes

Paul Barry works as a Lecturer in Computing Science at The Institute of Technology, Carlow in Ireland.

Preface	p. xv
1 Setting the Biological Scene	p. 1
1.1 Introducing Biological Sequence Analysis	p. 1
1.2 Protein and Polypeptides	p. 4
1.3 Generalised Models and their Use	p. 5
1.4 The Central Dogma of Molecular Biology	p. 6
1.5 Genome Sequencing	p. 10
1.6 The Example DNA-gene-protein system we will use	p. 12
Where to from Here	p. 13
2 Setting the Technological Scene	p. 15
2.1 The Layers of Technology	p. 15
2.2 Finding perl	p. 17
Where to from Here	p. 18
I Working with Perl	p. 19
3 The Basics	p. 21
3.1 Let's Get Started!	p. 21
3.2 Iteration	p. 26
3.3 More Iterations	p. 30
3.4 Selection	p. 34
3.5 There Really is MTOWTDI	p. 36
3.6 Processing Data Files	p. 41
3.7 Introducing Patterns	p. 44
Where to from Here	p. 46
The Maxims Repeated	p. 46
4 Places to Put Things	p. 49
4.1 Beyond Scalars	p. 49
4.2 Arrays: Associating Data with Numbers	p. 49
4.3 Hashes: Associating Data with Words	p. 60
Where to from Here	p. 68
The Maxims Repeated	p. 68
5 Getting Organised	p. 71
5.1 Named Blocks	p. 71
5.2 Introducing Subroutines	p. 73
5.3 Creating Subroutines	p. 74
5.4 Visibility and Scope	p. 85
5.5 In-built Subroutines	p. 90
5.6 Grouping and Reusing Subroutines	p. 92
5.7 The Standard Modules	p. 96
5.8 CPAN: The Module Repository	p. 96
Where to from Here	p. 100
The Maxims Repeated	p. 100
6 About Files	p. 103
6.1 I/O: Input and Output	p. 103
6.2 Reading Files	p. 105
6.3 Writing Files	p. 116
6.4 Chopping and Chomping	p. 118
Where to from Here	p. 119
The Maxims Repeated	p. 119
7 Patterns, Patterns and More Patterns	p. 121
7.1 Pattern Basics	p. 121
7.2 Introducing the Pattern Metacharacters	p. 124
7.3 Anchors	p. 132
7.4 The Binding Operators	p. 134
7.5 Remembering What Was Matched	p. 135
7.6 Greedy by Default	p. 137
7.7 Alternative Pattern Delimiters	p. 138
7.8 Another Useful Utility	p. 139
7.9 Substitutions: Search and Replace	p. 140
7.10 Finding a Sequence	p. 142
Where to from Here	p. 146
The Maxims Repeated	p. 146
8 Perl Grabbag	p. 147
8.1 Introduction	p. 147
8.2 Strictness	p. 147
8.3 Perl One-liners	p. 149
8.4 Running Other Programs from perl	p. 152
8.5 Recovering from Errors	p. 153
8.6 Sorting	p. 155
8.7 HERE Documents	p. 159
Where to from Here	p. 160
The Maxims Repeated	p. 161
II Working with Data	p. 163
9 Downloading Datasets	p. 165
9.1 Let's Get Data	p. 165
9.2 Downloading from the Web	p. 165
Where to from Here	p. 171
The Maxims Repeated	p. 171
10 The Protein Databank	p. 173
10.1 Introduction	p. 173
10.2 Determining Biomolecule Structures	p. 174
10.3 The Protein Databank	p. 177
10.4 The PDB Data-file Formats	p. 179
10.5 Accessing Data in PDB Entries	p. 182
10.6 Accessing PDB Annotation Data	p. 183
10.7 Contact Maps	p. 192
10.8 STRIDE: Secondary Structure Assignment	p. 196
10.9 Assigning Secondary Structures	p. 197
10.10 Introducing the mmCIF Protein Format	p. 205
Where to from Here	p. 210
The Maxims Repeated	p. 210
11 Non-redundant Datasets	p. 211
11.1 Introducing Non-redundant Datasets	p. 211
11.2 Non-redundant Protein Structures	p. 213
Where to from Here	p. 217
The Maxims Repeated	p. 217
12 Databases	p. 219
12.1 Introducing Databases	p. 219
12.2 Available Database Systems	p. 224
12.3 SQL: the Language of Databases	p. 226
12.4 A Database Case Study: MER	p. 227
Where to from Here	p. 269
The Maxims Repeated	p. 269
13 Databases and Perl	p. 273
13.1 Why Program Databases?	p. 273
13.2 Perl Database Technologies	p. 274
13.3 Preparing Perl	p. 275
13.4 Programming Databases with DBI	p. 276
13.5 Customising Output	p. 282
13.6 Customising Input	p. 285
13.7 Extending SQL	p. 289
Where to from Here	p. 292
The Maxims Repeated	p. 292
III Working with the Web	p. 295
14 The Sequence Retrieval System	p. 297
14.1 An Example of What's Possible	p. 297
14.2 Why SRS?	p. 298
14.3 Using SRS	p. 298
Where to from Here	p. 300
The Maxims Repeated	p. 300
15 Web Technologies	p. 303
15.1 The Web Development Infrastructure	p. 303
15.2 Creating Content for the WWW	p. 305
15.3 Preparing Apache for Perl	p. 310
15.4 Sending Data to a Web Server	p. 315
15.5 Web Databases	p. 320
Where to from Here	p. 327
The Maxims Repeated	p. 327
16 Web Automation	p. 329
16.1 Why Automate Surfing?	p. 329
16.2 Automated Surfing with Perl	p. 330
Where to from Here	p. 335
The Maxims Repeated	p. 336
IV Working with Applications	p. 337
17 Tools and Datasets	p. 339
17.1 Introduction	p. 339
17.2 Sequence Databases	p. 340
17.3 General Concepts and Methods	p. 347
17.4 Introducing Bioinformatics Tools	p. 357
17.5 BLAST	p. 362
Where to from Here	p. 371
The Maxims Repeated	p. 371
18 Applications	p. 373
18.1 Introduction	p. 373
18.2 Scientific Background to Mer Operon	p. 374
18.3 Downloading the Raw DNA Sequence	p. 377
18.4 Initial BLAST Sequence Similarity Search	p. 378
18.5 GeneMark	p. 380
18.6 Structural Prediction with SWISS-MODEL	p. 388
18.7 DeepView as a Structural Alignment Tool	p. 396
18.8 PROSITE and Sequence Motifs	p. 401
18.9 Phylogenetics	p. 407
Where to from Here?	p. 410
The Maxims Repeated	p. 411
19 Data Visualisation	p. 413
19.1 Introducing Visualisation	p. 413
19.2 Displaying Tabular Data Using HTML	p. 415
19.3 Creating High-quality Graphics with GD	p. 422
19.4 Plotting Graphs	p. 431
Where to from Here	p. 439
The Maxims Repeated	p. 439
20 Introducing Bioperl	p. 441
20.1 What is Bioperl?	p. 441
20.2 Bioperl's Relationship to Project Ensembl	p. 442
20.3 Installing Bioperl	p. 442
20.4 Using Bioperl: Fetching Sequences	p. 444
20.5 Remote BLAST Searches	p. 448
Where to from Here	p. 451
The Maxims Repeated	p. 452
A Appendix A	p. 453
B Appendix B	p. 457
C Appendix C	p. 459
D Appendix D	p. 461
E Appendix E	p. 467
F Appendix F	p. 471
Index	p. 475
