Title:
Bioinformatics and computational biology solutions using R and bioconductor
Series:
Statistics for biology and health
Publication Information:
New York : Springer, 2005
ISBN:
9780387251462
Available:*

Item Barcode:
30000010071206
Call Number:
QH324.2 B564 2005
Material Type:
Open Access Book
Item Category 1:
Book

Summary

Bioconductor is a widely used open source and open development software project for the analysis and comprehension of data arising from high-throughput experimentation in genomics and molecular biology. Bioconductor is rooted in the open source statistical computing environment R.

This volume's coverage is broad and ranges across most of the key capabilities of the Bioconductor project, including: importation and preprocessing of high-throughput data from microarray, proteomic, and flow cytometry platforms; curation and delivery of biological metadata for use in statistical modeling and interpretation; statistical analysis of high-throughput data, including machine learning and visualization; and modeling and visualization of graphs and networks.

The developers of the software, who are in many cases leading academic researchers, jointly authored chapters. All methods are illustrated with publicly available data, and a major section of the book is devoted to exposition of fully worked case studies.

This book is more than a static collection of descriptive text, figures, and code examples that were run by the authors to produce the text; it is a dynamic document. Code underlying all of the computations that are shown is made available on a companion website, and readers can reproduce every number, figure, and table on their own computers.


Table of Contents

Chapter authors (in chapter order):
1 W. Huber, R. A. Irizarry, and R. Gentleman
2 B. M. Bolstad, R. A. Irizarry, L. Gautier, and Z. Wu
3 B. M. Bolstad, F. Collin, J. Brettschneider, K. Simpson, L. Cope, R. A. Irizarry, and T. P. Speed
4 Y. H. Yang and A. C. Paquet
5 W. Huber and F. Hahne
6 X. Li, R. Gentleman, X. Lu, Q. Shi, J. D. Iglehart, L. Harris, and A. Miron
7 R. Gentleman, V. J. Carey, and J. Zhang
8 V. J. Carey, D. Temple Lang, J. Gentry, J. Zhang, and R. Gentleman
9 C. A. Smith, W. Huber, and R. Gentleman
10 W. Huber, X. Li, and R. Gentleman
11 V. J. Carey and R. Gentleman
12 R. Gentleman, B. Ding, S. Dudoit, and J. Ibrahim
13 K. S. Pollard and M. J. van der Laan
14 D. Scholtens and A. von Heydebreck
15 K. S. Pollard, S. Dudoit, and M. J. van der Laan
16 V. J. Carey
17 T. Hothorn, M. Dettling, and P. Buhlmann
18 C. A. Smith
19 R. Gentleman, W. Huber, and V. J. Carey
20 W. Huber, R. Gentleman, and V. J. Carey
21 V. J. Carey, R. Gentleman, W. Huber, and J. Gentry
22 R. Gentleman, D. Scholtens, B. Ding, V. J. Carey, and W. Huber
23 G. K. Smyth
24 M. Dettling
25 R. A. Irizarry

I Preprocessing data from genomic experiments
1 Preprocessing Overview
1.1 Introduction
1.2 Tasks
1.3 Data structures
1.4 Statistical background
1.5 Conclusion
2 Preprocessing High-density Oligonucleotide Arrays
2.1 Introduction
2.2 Importing and accessing probe-level data
2.3 Background adjustment and normalization
2.4 Summarization
2.5 Assessing preprocessing methods
2.6 Conclusion
3 Quality Assessment of Affymetrix GeneChip Data
3.1 Introduction
3.2 Exploratory data analysis
3.3 Affymetrix quality assessment metrics
3.4 RNA degradation
3.5 Probe level models
3.6 Conclusion
4 Preprocessing Two-Color Spotted Arrays
4.1 Introduction
4.2 Two-color spotted microarrays
4.3 Importing and accessing probe-level data
4.4 Quality assessment
4.5 Normalization
4.6 Case study
5 Cell-Based Assays
5.1 Scope
5.2 Experimental technologies
5.3 Reading data
5.4 Quality assessment and visualization
5.5 Detection of effectors
6 SELDI-TOF Mass Spectrometry Protein Data
6.1 Introduction
6.2 Baseline subtraction
6.3 Peak detection
6.4 Processing a set of calibration spectra
6.5 An example
6.6 Conclusion
II Meta-data: biological annotation and visualization
7 Meta-data Resources and Tools in Bioconductor
7.1 Introduction
7.2 External annotation resources
7.3 Bioconductor annotation concepts: curated persistent packages and Web services
7.4 The annotate package
7.5 Software tools for working with Gene Ontology (GO)
7.6 Pathway annotation packages: KEGG and cMAP
7.7 Cross-organism annotation: the homology packages
7.8 Annotation from other sources
7.9 Discussion
8 Querying On-line Resources
8.1 The Tools
8.2 PubMed
8.3 KEGG via SOAP
8.4 Getting gene sequence information
8.5 Conclusion
9 Interactive Outputs
9.1 Introduction
9.2 A simple approach
9.3 Using the annaffy package
9.4 Linking to On-line Databases
9.5 Building HTML pages
9.6 Graphical displays with drill-down functionality
9.7 Searching Meta-data
9.8 Concluding Remarks
10 Visualizing Data
10.1 Introduction
10.2 Practicalities
10.3 High-volume scatterplots
10.4 Heatmaps
10.5 Visualizing distances
10.6 Plotting along genomic coordinates
10.7 Conclusion
III Statistical analysis for genomic experiments
11 Analysis Overview
11.1 Introduction and road map
11.2 Absolute and relative expression measures
12 Distance Measures in DNA Microarray Data Analysis
12.1 Introduction
12.2 Distances
12.3 Microarray data
12.4 Examples
12.5 Discussion
13 Cluster Analysis of Genomic Data
13.1 Introduction
13.2 Methods
13.3 Application: renal cell cancer
13.4 Conclusion
14 Analysis of Differential Gene Expression Studies
14.1 Introduction
14.2 Differential expression analysis
14.3 Multifactor experiments
14.4 Conclusion
15 Multiple Testing Procedures: the multtest Package and Applications to Genomics
15.1 Introduction
15.2 Multiple hypothesis testing methodology
15.3 Software implementation: R multtest package
15.4 Applications: ALL microarray data set
15.5 Discussion
16 Machine Learning Concepts and Tools for Statistical Genomics
16.1 Introduction
16.2 Illustration: Two continuous features; decision regions
16.3 Methodological issues
16.4 Applications
16.5 Conclusions
17 Ensemble Methods of Computational Inference
17.1 Introduction
17.2 Bagging and random forests
17.3 Boosting
17.4 Multiclass problems
17.5 Evaluation
17.6 Applications: tumor prediction
17.7 Applications: Survival analysis
17.8 Conclusion
18 Browser-based Affymetrix Analysis and Annotation
18.1 Introduction
18.2 Deploying webbioc
18.3 Using webbioc
18.4 Extending webbioc
18.5 Conclusion
IV Graphs and networks
19 Introduction and Motivating Examples
19.1 Introduction
19.2 Practicalities
19.3 Motivating examples
19.4 Discussion
20 Graphs
20.1 Overview
20.2 Definitions
20.3 Cohesive subgroups
20.4 Distances
21 Bioconductor Software for Graphs
21.1 Introduction
21.2 The graph package
21.3 The RBGL package
21.4 Drawing graphs
22 Case Studies Using Graphs on Biological Data
22.1 Introduction
22.2 Comparing the transcriptome and the interactome
22.3 Using GO
22.4 Literature co-citation
22.5 Pathways
22.6 Concluding remarks
V Case studies
23 limma: Linear Models for Microarray Data
23.1 Introduction
23.2 Data representations
23.3 Linear models
23.4 Simple comparisons
23.5 Technical Replication
23.6 Within-array replicate spots
23.7 Two groups
23.8 Several groups
23.9 Direct two-color designs
23.10 Factorial designs
23.11 Time course experiments
23.12 Statistics for differential expression
23.13 Fitted model objects
23.14 Preprocessing considerations
23.15 Conclusion
24 Classification with Gene Expression Data
24.1 Introduction
24.2 Reading and customizing the data
24.3 Training and validating classifiers
24.4 Multiple random divisions
24.5 Classification of test data
24.6 Conclusion
25 From CEL Files to Annotated Lists of Interesting Genes
25.1 Introduction
25.2 Reading CEL files
25.3 Preprocessing
25.4 Ranking and filtering genes
25.5 Annotation
25.6 Conclusion
A Details on selected resources
A.1 Data sets
A.1.1 ALL
A.1.2 Renal cell cancer
A.1.3 Estrogen receptor stimulation
A.2 URLs for projects mentioned
References
Index