Privacy preserving frequent itemset mining (Paper)

2002, cited by 286
Topics: Data Mining Algorithms and Applications; Privacy-Preserving Technologies in Data; Data Quality and Management

Abstract

One crucial aspect of privacy-preserving frequent itemset mining is that the mining process involves a trade-off between privacy and accuracy: the two are typically contradictory, and improving one usually incurs a cost in the other. One way to address this problem is to look for a balance between hiding restrictive patterns and disclosing nonrestrictive ones. In this paper, we propose a new framework for enforcing privacy in mining frequent itemsets. We combine, in a single framework, techniques for efficiently hiding restrictive patterns: a transaction retrieval engine relying on an inverted file and Boolean queries, and a set of algorithms to "sanitize" a database. In addition, we introduce performance measures for mining frequent itemsets that quantify the fraction of mined patterns that are preserved after sanitizing a database. We also report the results of a performance evaluation of our research prototype and an analysis of the results.
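
The components named in the abstract can be illustrated with a minimal Python sketch. It is not the paper's implementation: the function names, the toy data, the brute-force miner, and the naive "remove one victim item" sanitization step are all assumptions made for illustration. The preservation measure is one plausible reading of "the fraction of mined patterns preserved after sanitizing", not the paper's exact definition.

```python
from collections import defaultdict
from itertools import combinations


def build_inverted_file(transactions):
    """Map each item to the set of transaction IDs that contain it."""
    index = defaultdict(set)
    for tid, items in enumerate(transactions):
        for item in items:
            index[item].add(tid)
    return index


def supporting_transactions(index, pattern):
    """Boolean AND query: IDs of transactions containing every item in pattern."""
    postings = [index.get(item, set()) for item in pattern]
    return set.intersection(*postings) if postings else set()


def frequent_itemsets(transactions, min_support):
    """Brute-force miner (toy data only): itemsets found in >= min_support transactions."""
    items = sorted({item for t in transactions for item in t})
    result = set()
    for size in range(1, len(items) + 1):
        for candidate in combinations(items, size):
            count = sum(1 for t in transactions if set(candidate) <= set(t))
            if count >= min_support:
                result.add(frozenset(candidate))
    return result


def sanitize(transactions, pattern, victim):
    """Naive stand-in for a sanitizing algorithm (an assumption, not the paper's):
    remove `victim` from every transaction that supports `pattern`."""
    index = build_inverted_file(transactions)
    hit = supporting_transactions(index, pattern)
    return [set(t) - {victim} if tid in hit else set(t)
            for tid, t in enumerate(transactions)]


def preserved_fraction(original, sanitized, restrictive, min_support):
    """Fraction of nonrestrictive frequent itemsets still frequent after sanitizing."""
    before = frequent_itemsets(original, min_support)
    after = frequent_itemsets(sanitized, min_support)
    nonrestrictive = {p for p in before
                      if not any(frozenset(r) <= p for r in restrictive)}
    if not nonrestrictive:
        return 1.0
    return len(nonrestrictive & after) / len(nonrestrictive)


# Hypothetical toy database; {"a", "b"} plays the role of a restrictive pattern.
db = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"b", "c"}, {"a", "b", "c"}]
clean = sanitize(db, {"a", "b"}, victim="b")
```

On this toy database the restrictive pattern is hidden, but some nonrestrictive itemsets containing the victim item also fall below the support threshold. That side effect is the privacy versus accuracy trade-off the abstract describes.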