Title:
Knowledge discovery in multiple databases
Personal Author:
Publication Information:
London : Springer, 2004
ISBN:
9781852337032

Available:

Item Barcode:
30000010134619
Call Number:
QA76.9.D3 Z42 2004
Material Type:
Open Access Book
Item Category 1:
Book

Summary

Many organizations have an urgent need of mining their multiple databases inherently distributed in branches (distributed data). In particular, as the Web is rapidly becoming an information flood, individuals and organizations can take into account low-cost information and knowledge on the Internet when making decisions. How to efficiently identify quality knowledge from different data sources has become a significant challenge. This challenge has attracted a great many researchers including the authors who have developed a local pattern analysis, a new strategy for discovering some kinds of potentially useful patterns that cannot be mined in traditional multi-database mining techniques. Local pattern analysis delivers high-performance pattern discovery from multiple databases. There has been considerable progress made on multi-database mining in such areas as hierarchical meta-learning, collective mining, database classification, and peculiarity discovery. While these techniques continue to be future topics of interest concerning multi-database mining, this book focuses on these interesting issues under the framework of local pattern analysis.

The book is intended for researchers and students in data mining, distributed data analysis, machine learning, and anyone else who is interested in multi-database mining. It is also appropriate for use as a text supplement for broader courses that might also involve knowledge discovery in databases and data mining.


Table of Contents

1 Importance of Multi-database Mining
1.1 Introduction
1.2 Role of Multi-database Mining in Real-world Applications
1.3 Multi-database Mining Problems
1.4 Differences Between Mono- and Multi-database Mining
1.4.1 Features of Data in Multi-databases
1.4.2 Features of Patterns in Multi-databases
1.5 Evolution of Multi-database Mining
1.6 Limitations of Previous Techniques
1.7 Process of Multi-database Mining
1.7.1 Description of Multi-database Mining
1.7.2 Practical Issues in the Process
1.8 Features of the Defined Process
1.9 Major Contributions of This Book
1.10 Organization of the Book
2 Data Mining and Multi-database Mining
2.1 Introduction
2.2 Knowledge Discovery in Databases
2.2.1 Processing Steps of KDD
2.2.2 Data Pre-processing
2.2.3 Data Mining
2.2.4 Post Data Mining
2.2.5 Applications of KDD
2.3 Association Rule Mining
2.4 Research into Mining Mono-databases
2.5 Research into Mining Multi-databases
2.5.1 Parallel Data Mining
2.5.2 Distributed Data Mining
2.5.3 Application-dependent Database Selection
2.5.4 Peculiarity-oriented Multi-database Mining
2.6 Summary
3 Local Pattern Analysis
3.1 Introduction
3.2 Previous Multi-database Mining Techniques
3.3 Local Patterns
3.4 Local Instance Analysis Inspired by Competition in Sports
3.5 The Structure of Patterns in Multi-database Environments
3.6 Effectiveness of Local Pattern Analysis
3.7 Summary
4 Identifying Quality Knowledge
4.1 Introduction
4.2 Problem Statement
4.2.1 Problems Faced by Traditional Multi-database Mining
4.2.2 Effectiveness of Identifying Quality Data
4.2.3 Needed Concepts
4.3 Nonstandard Interpretation
4.4 Proof Theory
4.5 Adding External Knowledge
4.6 The Use of the Framework
4.6.1 Applying to Real-world Applications
4.6.2 Evaluating Veridicality
4.7 Summary
5 Database Clustering
5.1 Introduction
5.2 Effectiveness of Classifying
5.3 Classifying Databases
5.3.1 Features in Databases
5.3.2 Similarity Measurement
5.3.3 Relevance of Databases and Classification
5.3.4 Ideal Classification and Goodness Measurement
5.4 Searching for a Good Classification
5.4.1 The First Step: Generating a Classification
5.4.2 The Second Step: Searching for a Good Classification
5.5 Algorithm Analysis
5.5.1 Procedure GreedyClass
5.5.2 Algorithm GoodClass
5.6 Evaluation of Application-independent Database Classification
5.6.1 Dataset Selection
5.6.2 Experimental Results
5.6.3 Analysis
5.7 Summary
6 Dealing with Inconsistency
6.1 Introduction
6.2 Problem Statement
6.3 Definitions of Formal Semantics
6.4 Weighted Majority
6.5 Mastering Local Pattern Sets
6.6 Examples of Synthesizing Local Pattern Sets
6.7 A Syntactic Characterization
6.8 Summary
7 Identifying High-vote Patterns
7.1 Introduction
7.2 Illustration of High-vote Patterns
7.3 Identifying High-vote Patterns
7.4 Algorithm Design
7.4.1 Searching for High-vote Patterns
7.4.2 Identifying High-vote Patterns: An Example
7.4.3 Algorithm Analysis
7.5 Identifying High-vote Patterns Using a Fuzzy Logic Controller
7.5.1 Needed Concepts in Fuzzy Logic
7.5.2 System Analysis
7.5.3 Setting Membership Functions for Input and Output Variables
7.5.4 Setting Fuzzy Rules
7.5.5 Fuzzification
7.5.6 Inference and Rule Composition
7.5.7 Defuzzification
7.5.8 Algorithm Design
7.6 High-vote Pattern Analysis
7.6.1 Normal Distribution
7.6.2 The Procedure of Clustering
7.7 Suggested Patterns
7.8 Summary
8 Identifying Exceptional Patterns
8.1 Introduction
8.2 Interesting Exceptional Patterns
8.2.1 Measuring the Interestingness
8.2.2 Behavior of Interest Measurements
8.3 Algorithm Design
8.3.1 Algorithm Design
8.3.2 Identifying Exceptions: An Example
8.3.3 Algorithm Analysis
8.4 Identifying Exceptions with a Fuzzy Logic Controller
8.5 Summary
9 Synthesizing Local Patterns by Weighting
9.1 Introduction
9.2 Problem Statement
9.3 Synthesizing Rules by Weighting
9.3.1 Weight of Evidence
9.3.2 Solving Weights of Databases
9.3.3 Algorithm Design
9.4 Improvement of Synthesizing Model
9.4.1 Effectiveness of Rule Selection
9.4.2 Process of Rule Selection
9.4.3 Optimized Algorithm
9.5 Algorithm Analysis
9.5.1 Procedure RuleSelection
9.5.2 Algorithm RuleSynthesizing
9.6 Summary
10 Conclusions and Future Work
10.1 Conclusions
10.2 Future Work
References
Subject Index