Item Barcode | Call Number | Material Type | Item Category 1 |
---|---|---|---|
30000010184066 | QA76.9.D343 V35 2006 | Open Access Book | Book |
Summary
Data mining has emerged as a significant technology for gaining knowledge from vast quantities of data. However, concerns are growing that use of this technology can violate individual privacy. These concerns have led to a backlash against the technology. For example, a "Data-Mining Moratorium Act" introduced in the U.S. Senate would have banned all data-mining programs (including research and development) by the U.S. Department of Defense.
Privacy Preserving Data Mining provides a comprehensive overview of available approaches, techniques and open problems in privacy preserving data mining. The book demonstrates how these approaches can achieve data mining while operating within legal and commercial restrictions that forbid release of data. It also crystallizes much of the field's underlying foundation and inspires further research in the area.
Privacy Preserving Data Mining is designed for a professional audience composed of practitioners and researchers in industry. This volume is also suitable for graduate-level students in computer science.
Table of Contents
1 Privacy and Data Mining
2 What is Privacy?
2.1 Individual Identifiability
2.2 Measuring the Intrusiveness of Disclosure
3 Solution Approaches / Problems
3.1 Data Partitioning Models
3.2 Perturbation
3.3 Secure Multi-party Computation
3.3.1 Secure Circuit Evaluation
3.3.2 Secure Sum
4 Predictive Modeling for Classification
4.1 Decision Tree Classification
4.2 A Perturbation-Based Solution for ID3
4.3 A Cryptographic Solution for ID3
4.4 ID3 on Vertically Partitioned Data
4.5 Bayesian Methods
4.5.1 Horizontally Partitioned Data
4.5.2 Vertically Partitioned Data
4.5.3 Learning Bayesian Network Structure
4.6 Summary
5 Predictive Modeling for Regression
5.1 Introduction and Case Study
5.1.1 Case Study
5.1.2 What are the Problems?
5.1.3 Weak Secure Model
5.2 Vertically Partitioned Data
5.2.1 Secure Estimation of Regression Coefficients
5.2.2 Diagnostics and Model Determination
5.2.3 Security Analysis
5.2.4 An Alternative: Secure Powell's Algorithm
5.3 Horizontally Partitioned Data
5.4 Summary and Future Research
6 Finding Patterns and Rules (Association Rules)
6.1 Randomization-based Approaches
6.1.1 Randomization Operator
6.1.2 Support Estimation and Algorithm
6.1.3 Limiting Privacy Breach
6.1.4 Other work
6.2 Cryptography-based Approaches
6.2.1 Horizontally Partitioned Data
6.2.2 Vertically Partitioned Data
6.3 Inference from Results
7 Descriptive Modeling (Clustering, Outlier Detection)
7.1 Clustering
7.1.1 Data Perturbation for Clustering
7.2 Cryptography-based Approaches
7.2.1 EM-clustering for Horizontally Partitioned Data
7.2.2 K-means Clustering for Vertically Partitioned Data
7.3 Outlier Detection
7.3.1 Distance-based Outliers
7.3.2 Basic Approach
7.3.3 Horizontally Partitioned Data
7.3.4 Vertically Partitioned Data
7.3.5 Modified Secure Comparison Protocol
7.3.6 Security Analysis
7.3.7 Computation and Communication Analysis
7.3.8 Summary
8 Future Research - Problems remaining
References
Index