Title:
Privacy preserving data mining
Personal Author:
Series:
Advances in information security ; 19
Publication Information:
New York, NY : Springer, 2006
Physical Description:
120 p. : ill. ; 24 cm.
ISBN:
9780387258867

Item Barcode:
30000010184066
Call Number:
QA76.9.D343 V35 2006
Material Type:
Open Access Book
Item Category 1:
Book

Summary

Data mining has emerged as a significant technology for gaining knowledge from vast quantities of data. However, concerns are growing that use of this technology can violate individual privacy. These concerns have led to a backlash against the technology. For example, a "Data-Mining Moratorium Act" introduced in the U.S. Senate would have banned all data-mining programs (including research and development) by the U.S. Department of Defense.

Privacy Preserving Data Mining provides a comprehensive overview of available approaches, techniques and open problems in privacy preserving data mining. This book demonstrates how these approaches can achieve data mining while operating within legal and commercial restrictions that forbid release of data. Furthermore, this research crystallizes much of the underlying foundation and inspires further research in the area.

Privacy Preserving Data Mining is designed for a professional audience composed of practitioners and researchers in industry. This volume is also suitable for graduate-level students in computer science.


Table of Contents

1 Privacy and Data Mining p. 1
2 What is Privacy? p. 7
2.1 Individual Identifiability p. 8
2.2 Measuring the Intrusiveness of Disclosure p. 11
3 Solution Approaches / Problems p. 17
3.1 Data Partitioning Models p. 18
3.2 Perturbation p. 19
3.3 Secure Multi-party Computation p. 21
3.3.1 Secure Circuit Evaluation p. 23
3.3.2 Secure Sum p. 25
4 Predictive Modeling for Classification p. 29
4.1 Decision Tree Classification p. 31
4.2 A Perturbation-Based Solution for ID3 p. 34
4.3 A Cryptographic Solution for ID3 p. 38
4.4 ID3 on Vertically Partitioned Data p. 40
4.5 Bayesian Methods p. 45
4.5.1 Horizontally Partitioned Data p. 47
4.5.2 Vertically Partitioned Data p. 48
4.5.3 Learning Bayesian Network Structure p. 50
4.6 Summary p. 51
5 Predictive Modeling for Regression p. 53
5.1 Introduction and Case Study p. 53
5.1.1 Case Study p. 55
5.1.2 What are the Problems? p. 55
5.1.3 Weak Secure Model p. 58
5.2 Vertically Partitioned Data p. 60
5.2.1 Secure Estimation of Regression Coefficients p. 60
5.2.2 Diagnostics and Model Determination p. 62
5.2.3 Security Analysis p. 63
5.2.4 An Alternative: Secure Powell's Algorithm p. 65
5.3 Horizontally Partitioned Data p. 68
5.4 Summary and Future Research p. 69
6 Finding Patterns and Rules (Association Rules) p. 71
6.1 Randomization-based Approaches p. 72
6.1.1 Randomization Operator p. 73
6.1.2 Support Estimation and Algorithm p. 74
6.1.3 Limiting Privacy Breach p. 75
6.1.4 Other work p. 78
6.2 Cryptography-based Approaches p. 79
6.2.1 Horizontally Partitioned Data p. 79
6.2.2 Vertically Partitioned Data p. 80
6.3 Inference from Results p. 82
7 Descriptive Modeling (Clustering, Outlier Detection) p. 85
7.1 Clustering p. 86
7.1.1 Data Perturbation for Clustering p. 86
7.2 Cryptography-based Approaches p. 91
7.2.1 EM-clustering for Horizontally Partitioned Data p. 91
7.2.2 K-means Clustering for Vertically Partitioned Data p. 95
7.3 Outlier Detection p. 99
7.3.1 Distance-based Outliers p. 101
7.3.2 Basic Approach p. 102
7.3.3 Horizontally Partitioned Data p. 102
7.3.4 Vertically Partitioned Data p. 105
7.3.5 Modified Secure Comparison Protocol p. 106
7.3.6 Security Analysis p. 107
7.3.7 Computation and Communication Analysis p. 110
7.3.8 Summary p. 111
8 Future Research - Problems remaining p. 113
References p. 115
Index p. 121