Available:
Item Barcode | Call Number | Material Type | Item Category 1 |
---|---|---|---|
30000010071206 | QH324.2 B564 2005 | Open Access Book | Book |
Summary
Bioconductor is a widely used open source and open development software project for the analysis and comprehension of data arising from high-throughput experimentation in genomics and molecular biology. Bioconductor is rooted in the open source statistical computing environment R.
This volume's coverage is broad and ranges across most of the key capabilities of the Bioconductor project, including:
Importation and preprocessing of high-throughput data from microarray, proteomic, and flow cytometry platforms
Curation and delivery of biological metadata for use in statistical modeling and interpretation
Statistical analysis of high-throughput data, including machine learning and visualization
Modeling and visualization of graphs and networks
The developers of the software, who are in many cases leading academic researchers, jointly authored chapters. All methods are illustrated with publicly available data, and a major section of the book is devoted to exposition of fully worked case studies.
This book is more than a static collection of descriptive text, figures, and code examples that were run by the authors to produce the text; it is a dynamic document. Code underlying all of the computations that are shown is made available on a companion website, and readers can reproduce every number, figure, and table on their own computers.
Table of Contents
I Preprocessing data from genomic experiments | p. 1 |
1 Preprocessing Overview | p. 3 |
1.1 Introduction | p. 3 |
1.2 Tasks | p. 4 |
1.3 Data structures | p. 6 |
1.4 Statistical background | p. 8 |
1.5 Conclusion | p. 12 |
2 Preprocessing High-density Oligonucleotide Arrays | p. 13 |
2.1 Introduction | p. 13 |
2.2 Importing and accessing probe-level data | p. 15 |
2.3 Background adjustment and normalization | p. 18 |
2.4 Summarization | p. 25 |
2.5 Assessing preprocessing methods | p. 29 |
2.6 Conclusion | p. 32 |
3 Quality Assessment of Affymetrix GeneChip Data | p. 33 |
3.1 Introduction | p. 33 |
3.2 Exploratory data analysis | p. 34 |
3.3 Affymetrix quality assessment metrics | p. 37 |
3.4 RNA degradation | p. 38 |
3.5 Probe level models | p. 41 |
3.6 Conclusion | p. 47 |
4 Preprocessing Two-Color Spotted Arrays | p. 49 |
4.1 Introduction | p. 49 |
4.2 Two-color spotted microarrays | p. 50 |
4.3 Importing and accessing probe-level data | p. 51 |
4.4 Quality assessment | p. 57 |
4.5 Normalization | p. 62 |
4.6 Case study | p. 67 |
5 Cell-Based Assays | p. 71 |
5.1 Scope | p. 71 |
5.2 Experimental technologies | p. 71 |
5.3 Reading data | p. 73 |
5.4 Quality assessment and visualization | p. 79 |
5.5 Detection of effectors | p. 85 |
6 SELDI-TOF Mass Spectrometry Protein Data | p. 91 |
6.1 Introduction | p. 91 |
6.2 Baseline subtraction | p. 93 |
6.3 Peak detection | p. 95 |
6.4 Processing a set of calibration spectra | p. 96 |
6.5 An example | p. 105 |
6.6 Conclusion | p. 108 |
II Meta-data: biological annotation and visualization | p. 111 |
7 Meta-data Resources and Tools in Bioconductor | p. 113 |
7.1 Introduction | p. 113 |
7.2 External annotation resources | p. 115 |
7.3 Bioconductor annotation concepts: curated persistent packages and Web services | p. 116 |
7.4 The annotate package | p. 119 |
7.5 Software tools for working with Gene Ontology (GO) | p. 120 |
7.6 Pathway annotation packages: KEGG and cMAP | p. 125 |
7.7 Cross-organism annotation: the homology packages | p. 130 |
7.8 Annotation from other sources | p. 132 |
7.9 Discussion | p. 133 |
8 Querying On-line Resources | p. 135 |
8.1 The Tools | p. 135 |
8.2 PubMed | p. 138 |
8.3 KEGG via SOAP | p. 142 |
8.4 Getting gene sequence information | p. 144 |
8.5 Conclusion | p. 145 |
9 Interactive Outputs | p. 147 |
9.1 Introduction | p. 147 |
9.2 A simple approach | p. 148 |
9.3 Using the annaffy package | p. 149 |
9.4 Linking to On-line Databases | p. 152 |
9.5 Building HTML pages | p. 153 |
9.6 Graphical displays with drill-down functionality | p. 156 |
9.7 Searching Meta-data | p. 159 |
9.8 Concluding Remarks | p. 160 |
10 Visualizing Data | p. 161 |
10.1 Introduction | p. 161 |
10.2 Practicalities | p. 162 |
10.3 High-volume scatterplots | p. 163 |
10.4 Heatmaps | p. 166 |
10.5 Visualizing distances | p. 170 |
10.6 Plotting along genomic coordinates | p. 174 |
10.7 Conclusion | p. 179 |
III Statistical analysis for genomic experiments | p. 181 |
11 Analysis Overview | p. 183 |
11.1 Introduction and road map | p. 183 |
11.2 Absolute and relative expression measures | p. 185 |
12 Distance Measures in DNA Microarray Data Analysis | p. 189 |
12.1 Introduction | p. 189 |
12.2 Distances | p. 191 |
12.3 Microarray data | p. 199 |
12.4 Examples | p. 201 |
12.5 Discussion | p. 208 |
13 Cluster Analysis of Genomic Data | p. 209 |
13.1 Introduction | p. 209 |
13.2 Methods | p. 210 |
13.3 Application: renal cell cancer | p. 222 |
13.4 Conclusion | p. 228 |
14 Analysis of Differential Gene Expression Studies | p. 229 |
14.1 Introduction | p. 229 |
14.2 Differential expression analysis | p. 230 |
14.3 Multifactor experiments | p. 239 |
14.4 Conclusion | p. 248 |
15 Multiple Testing Procedures: the multtest Package and Applications to Genomics | p. 249 |
15.1 Introduction | p. 249 |
15.2 Multiple hypothesis testing methodology | p. 250 |
15.3 Software implementation: R multtest package | p. 259 |
15.4 Applications: ALL microarray data set | p. 262 |
15.5 Discussion | p. 270 |
16 Machine Learning Concepts and Tools for Statistical Genomics | p. 273 |
16.1 Introduction | p. 273 |
16.2 Illustration: Two continuous features; decision regions | p. 274 |
16.3 Methodological issues | p. 276 |
16.4 Applications | p. 285 |
16.5 Conclusions | p. 291 |
17 Ensemble Methods of Computational Inference | p. 293 |
17.1 Introduction | p. 293 |
17.2 Bagging and random forests | p. 295 |
17.3 Boosting | p. 296 |
17.4 Multiclass problems | p. 298 |
17.5 Evaluation | p. 298 |
17.6 Applications: tumor prediction | p. 300 |
17.7 Applications: Survival analysis | p. 307 |
17.8 Conclusion | p. 310 |
18 Browser-based Affymetrix Analysis and Annotation | p. 313 |
18.1 Introduction | p. 313 |
18.2 Deploying webbioc | p. 315 |
18.3 Using webbioc | p. 317 |
18.4 Extending webbioc | p. 322 |
18.5 Conclusion | p. 326 |
IV Graphs and networks | p. 327 |
19 Introduction and Motivating Examples | p. 329 |
19.1 Introduction | p. 329 |
19.2 Practicalities | p. 330 |
19.3 Motivating examples | p. 331 |
19.4 Discussion | p. 336 |
20 Graphs | p. 337 |
20.1 Overview | p. 337 |
20.2 Definitions | p. 338 |
20.3 Cohesive subgroups | p. 344 |
20.4 Distances | p. 346 |
21 Bioconductor Software for Graphs | p. 347 |
21.1 Introduction | p. 347 |
21.2 The graph package | p. 348 |
21.3 The RBGL package | p. 352 |
21.4 Drawing graphs | p. 360 |
22 Case Studies Using Graphs on Biological Data | p. 369 |
22.1 Introduction | p. 369 |
22.2 Comparing the transcriptome and the interactome | p. 370 |
22.3 Using GO | p. 374 |
22.4 Literature co-citation | p. 378 |
22.5 Pathways | p. 387 |
22.6 Concluding remarks | p. 393 |
V Case studies | p. 395 |
23 limma: Linear Models for Microarray Data | p. 397 |
23.1 Introduction | p. 397 |
23.2 Data representations | p. 398 |
23.3 Linear models | p. 399 |
23.4 Simple comparisons | p. 400 |
23.5 Technical Replication | p. 403 |
23.6 Within-array replicate spots | p. 406 |
23.7 Two groups | p. 407 |
23.8 Several groups | p. 409 |
23.9 Direct two-color designs | p. 411 |
23.10 Factorial designs | p. 412 |
23.11 Time course experiments | p. 414 |
23.12 Statistics for differential expression | p. 415 |
23.13 Fitted model objects | p. 417 |
23.14 Preprocessing considerations | p. 418 |
23.15 Conclusion | p. 420 |
24 Classification with Gene Expression Data | p. 421 |
24.1 Introduction | p. 421 |
24.2 Reading and customizing the data | p. 422 |
24.3 Training and validating classifiers | p. 423 |
24.4 Multiple random divisions | p. 426 |
24.5 Classification of test data | p. 428 |
24.6 Conclusion | p. 429 |
25 From CEL Files to Annotated Lists of Interesting Genes | p. 431 |
25.1 Introduction | p. 431 |
25.2 Reading CEL files | p. 432 |
25.3 Preprocessing | p. 432 |
25.4 Ranking and filtering genes | p. 433 |
25.5 Annotation | p. 438 |
25.6 Conclusion | p. 442 |
A Details on selected resources | p. 443 |
A.1 Data sets | p. 443 |
A.1.1 ALL | p. 443 |
A.1.2 Renal cell cancer | p. 443 |
A.1.3 Estrogen receptor stimulation | p. 443 |
A.2 URLs for projects mentioned | p. 444 |
References | p. 445 |
Index | p. 465 |