Title:
Life science data mining
Series:
Science, engineering and biology informatics ; 2
Publication Information:
Hackensack, NJ : World Scientific Publishing, 2006
ISBN:
9789812700643

Item Barcode:
30000010139012
Call Number:
QH324.2 L53 2006
Material Type:
Open Access Book
Item Category 1:
Book

Summary

"This book identifies and highlights the latest data mining paradigms to analyze, combine, integrate, model and simulate vast amounts of heterogeneous multi-modal, multi-scale data for emerging real-world applications in life science."--BOOK JACKET.


Table of Contents

Preface p. v
Chapter 1 Survey of Early Warning Systems for Environmental and Public Health Applications p. 1
1 Introduction p. 1
2 Disease Surveillance p. 3
3 Reference Architecture for Model Extraction p. 5
4 Problem Domain p. 9
5 Data Sources p. 10
6 Detection Methods p. 12
7 Summary and Conclusion p. 13
References p. 14
Chapter 2 Time-Lapse Cell Cycle Quantitative Data Analysis Using Gaussian Mixture Models p. 17
1 Introduction p. 18
2 Material and Feature Extraction p. 20
2.1 Material and cell feature extraction p. 20
2.2 Model the time-lapse data using AR model p. 23
3 Problem Statement and Formulation p. 24
4 Classification Methods p. 26
4.1 Gaussian mixture models and the EM algorithm p. 26
4.2 K-Nearest Neighbor (KNN) classifier p. 28
4.3 Neural networks p. 28
4.4 Decision tree p. 29
4.5 Fisher clustering p. 30
5 Experimental Results p. 30
5.1 Trace identification p. 31
5.2 Cell morphologic similarity analysis p. 33
5.3 Phase identification p. 35
5.4 Cluster analysis of time-lapse data p. 37
6 Conclusion p. 40
Appendix A p. 41
Appendix B p. 42
References p. 43
Chapter 3 Diversity and Accuracy of Data Mining Ensemble p. 47
1 Introduction p. 47
2 Ensemble and Diversity p. 49
2.1 Why needs diversity? p. 49
2.2 Diversity measures p. 51
3 Probability Analysis p. 52
4 Coincident Failure Diversity p. 52
5 Ensemble Accuracy p. 55
5.1 Relationship between random guess and accuracy of lower bound single models p. 55
5.2 Relationship between accuracy A and the number of models N p. 56
5.3 When model's accuracy < 50% p. 57
6 Construction of Effective Ensembles p. 58
6.1 Strategies for increasing diversity p. 59
6.2 Ensembles of neural networks p. 60
6.3 Ensembles of decision trees p. 61
6.4 Hybrid ensembles p. 62
7 An Application: Osteoporosis Classification Problem p. 62
7.1 Osteoporosis problem p. 63
7.2 Results from the ensembles of neural nets p. 63
7.3 Results from ensembles of the decision trees p. 66
7.4 Results of hybrid ensembles p. 67
8 Discussion and Conclusions p. 68
References p. 70
Chapter 4 Integrated Clustering for Microarray Data p. 73
1 Introduction p. 73
2 Related Work p. 77
3 Data Preprocessing p. 81
4 Integrated Clustering p. 83
4.1 Clustering algorithms p. 83
4.2 Integration methodology p. 88
5 Experimental Evaluation p. 89
5.1 Evaluation methodology p. 89
5.2 Results p. 91
5.3 Discussion p. 93
6 Conclusions p. 94
References p. 94
Chapter 5 Complexity and Synchronization of EEG with Parametric Modeling p. 99
1 Introduction p. 100
1.1 Brief review of EEG recording analysis p. 100
1.2 AR modeling based EEG analysis p. 101
2 TVAR Modeling p. 104
3 Complexity Measure p. 105
4 Synchronization Measure p. 109
5 Conclusions p. 113
References p. 114
Chapter 6 Bayesian Fusion of Syndromic Surveillance with Sensor Data for Disease Outbreak Classification p. 119
1 Introduction p. 120
2 Approach p. 122
2.1 Bayesian belief networks p. 122
2.2 Syndromic data p. 126
2.3 Environmental data p. 128
2.4 Test scenarios p. 130
2.5 Evaluation metrics p. 130
3 Results p. 131
3.1 Scenario 1 p. 131
3.2 Scenario 2 p. 134
3.3 Promptness p. 135
4 Summary and Conclusions p. 136
References p. 137
Chapter 7 An Evaluation of Over-the-Counter Medication Sales for Syndromic Surveillance p. 143
1 Introduction p. 143
2 Background and Related Work p. 144
3 Data p. 144
4 Approaches p. 145
4.1 Lead-lag correlation analysis p. 145
4.2 Regression test of predictive ability p. 146
4.3 Detection-based approaches p. 148
4.4 Supervised algorithm for outbreak detection in OTC data p. 148
4.5 Modified Holt-Winters forecaster p. 150
4.6 Forecasting based on multi-channel regression p. 151
5 Experiments p. 153
5.1 Lead-lag correlation analysis of OTC data p. 153
5.2 Regression test of the predictive value of OTC p. 154
5.3 Results from detection-based approaches p. 156
6 Conclusions and Future Work p. 158
References p. 159
Chapter 8 Collaborative Health Sentinel p. 163
1 Introduction p. 163
2 Infectious Disease and Existing Health Surveillance Programs p. 166
3 Elements of the Collaborative Health Sentinel (CHS) System p. 170
3.1 Sampling p. 170
3.2 Creating a national health map p. 177
3.3 Detection p. 177
3.4 Reaction p. 183
3.5 Cost considerations p. 184
4 Interaction with the Health Information Technology (HCIT) World p. 185
5 Conclusion p. 188
References p. 189
Appendix A HL7 p. 192
Chapter 9 A Multi-Modal System Approach for Drug Abuse Research and Treatment Evaluation: Information Systems Needs and Challenges p. 195
1 Introduction p. 195
2 Context p. 198
2.1 Data sources p. 198
2.2 Examples of relevant questions p. 199
3 Possible System Structure p. 201
4 Challenges in System Development and Implementation p. 204
4.1 Ontology development p. 204
4.2 Data source control, proprietary issues p. 205
4.3 Privacy, security issues p. 205
4.4 Costs to implement/maintain system p. 206
4.5 Historical hypothesis-testing paradigm p. 206
4.6 Utility, usability, credibility of such a system p. 206
4.7 Funding of system development p. 207
5 Summary p. 207
References p. 208
Chapter 10 Knowledge Representation for Versatile Hybrid Intelligent Processing Applied in Predictive Toxicology p. 213
1 Introduction p. 214
2 Hybrid Intelligent Techniques for Predictive Toxicology Knowledge Representation p. 217
3 XML Schemas for Knowledge Representation and Processing in AI and Predictive Toxicology p. 218
4 Towards a Standard for Chemical Data Representation in Predictive Toxicology p. 220
5 Hybrid Intelligent Systems for Knowledge Representation in Predictive Toxicology p. 225
5.1 A formal description of implicit and explicit knowledge-based intelligent systems p. 226
5.2 An XML schema for hybrid intelligent systems p. 228
6 A Case Study p. 231
6.1 Materials and methods p. 232
6.2 Results p. 233
7 Conclusions p. 235
References p. 236
Chapter 11 Ensemble Classification System Implementation for Biomedical Microarray Data p. 239
1 Introduction p. 240
2 Background p. 241
2.1 Reasons for ensemble p. 241
2.2 Diversity and ensemble p. 241
2.3 Relationship between measures of diversity and combination method p. 243
2.4 Measures of diversity p. 243
2.5 Microarray data p. 244
3 Ensemble Classification System (ECS) Design p. 245
3.1 ECS overview p. 245
3.2 Feature subset selection p. 247
3.3 Base classifiers p. 248
3.4 Combination strategy p. 249
4 Experiments p. 250
4.1 Experimental datasets p. 250
4.2 Experimental results p. 252
5 Conclusion and Further Work p. 254
References p. 255
Chapter 12 An Automated Method for Cell Phase Identification in High Throughput Time-Lapse Screens p. 257
1 Introduction p. 258
2 Nuclei Segmentation and Tracking p. 259
3 Cell Phase Identification p. 260
3.1 Feature calculation p. 260
3.2 Identifying cell phase p. 262
3.3 Correcting cell phase identification errors p. 265
4 Experimental Results p. 266
5 Conclusion p. 272
References p. 272
Chapter 13 Inference of Transcriptional Regulatory Networks Based on Cancer Microarray Data p. 275
1 Introduction p. 275
2 Subnetworks and Transcriptional Regulatory Networks Inference p. 277
2.1 Inferring subnetworks using z-score p. 277
2.2 Inferring subnetworks based on graph theory p. 278
2.3 Inferring subnetworks based on Bayesian networks p. 279
2.4 Inferring transcriptional regulatory networks based on integrated expression and sequence data p. 283
3 Multinomial Probit Regression with Bayesian Gene Selection p. 284
3.1 Problem formulation p. 284
3.2 Bayesian variable selection p. 286
3.3 Bayesian estimation using the strongest genes p. 288
3.4 Experimental results p. 289
4 Network Construction Based on Clustering and Predictor Design p. 293
4.1 Predictor construction using reversible jump MCMC annealing p. 293
4.2 CoD for predictors p. 295
4.3 Experimental results on a Myeloid line p. 296
5 Concluding Remarks p. 298
References p. 299
Chapter 14 Data Mining in Biomedicine p. 305
1 Introduction p. 305
2 Predictive Model Construction p. 306
2.1 Derivation of unsupervised models p. 307
2.2 Derivation of supervised models p. 311
3 Validation p. 316
4 Impact Analysis p. 318
5 Summary p. 319
References p. 319
Chapter 15 Mining Multilevel Association Rules from Gene Ontology and Microarray Data p. 321
1 Introduction p. 321
2 Proposed Methods p. 323
2.1 Preprocessing p. 323
2.2 Hierarchy-information encoding p. 324
3 The MAGO Algorithm p. 326
3.1 MAGO algorithm p. 327
3.2 CMAGO (Constrained Multilevel Association rules with Gene Ontology) p. 329
4 Experimental Results p. 330
4.1 The characteristic of the dataset p. 331
4.2 Experimental results p. 331
4.3 Interpretation p. 334
5 Concluding Remarks p. 335
References p. 336
Chapter 16 A Proposed Sensor-Configuration and Sensitivity Analysis of Parameters with Applications to Biosensors p. 339
1 Introduction p. 340
2 Sensor-System Configuration p. 342
3 Optical Biosensors p. 346
3.1 Relationship between parameters p. 347
3.2 Modelling of parameters p. 351
4 Discussion p. 356
Conclusion p. 358
References p. 359
Epilogue p. 361
References p. 364
Index p. 365