Using CART to generate partially synthetic public use microdata (paper)

2005 | Cited by 219

Topics: Data Mining Algorithms and Applications; Privacy-Preserving Technologies in Data; Data Quality and Management

Abstract

To limit disclosure risks, one approach is to release partially synthetic, public use microdata sets. These comprise the units originally surveyed, but some collected values, for example sensitive values at high risk of disclosure or values of key identifiers, are replaced with multiple imputations. This article presents and evaluates the use of classification and regression trees (CART) to generate partially synthetic data. Two potential applications of CART are studied via simulation: (i) generating synthetic data for sensitive variables; and (ii) generating synthetic data for variables that are key identifiers.
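The abstract does not spell out the synthesis procedure, so the following is only a minimal sketch of the general idea, written under stated assumptions: a small regression tree is grown on non-sensitive predictors, and each record's sensitive value is replaced by a random draw from the observed values in that record's leaf, repeated m times to give multiple imputations. The function names (grow_tree, leaf_values, synthesize), the min_leaf setting, the simple resampling rule, and the toy age/income data are all hypothetical illustrations, not the paper's method or results.

```python
import random
from statistics import mean

def sse(values):
    """Sum of squared deviations from the mean (regression-tree impurity)."""
    if not values:
        return 0.0
    m = mean(values)
    return sum((v - m) ** 2 for v in values)

def grow_tree(X, y, min_leaf=5):
    """Grow a small regression tree; each leaf stores the observed y values."""
    best = None
    for j in range(len(X[0])):
        for t in sorted({row[j] for row in X}):
            left = [i for i, row in enumerate(X) if row[j] <= t]
            right = [i for i, row in enumerate(X) if row[j] > t]
            if len(left) < min_leaf or len(right) < min_leaf:
                continue
            cost = sse([y[i] for i in left]) + sse([y[i] for i in right])
            if best is None or cost < best[0]:
                best = (cost, j, t, left, right)
    # Stop when no admissible split reduces the impurity.
    if best is None or best[0] >= sse(y):
        return {"leaf": list(y)}
    _, j, t, left, right = best
    return {
        "feature": j,
        "threshold": t,
        "left": grow_tree([X[i] for i in left], [y[i] for i in left], min_leaf),
        "right": grow_tree([X[i] for i in right], [y[i] for i in right], min_leaf),
    }

def leaf_values(tree, row):
    """Return the observed y values stored in the leaf that `row` falls into."""
    while "leaf" not in tree:
        go_left = row[tree["feature"]] <= tree["threshold"]
        tree = tree["left"] if go_left else tree["right"]
    return tree["leaf"]

def synthesize(X, y, m=3, min_leaf=5, seed=0):
    """Create m synthetic copies of y by resampling within each record's leaf."""
    rng = random.Random(seed)
    tree = grow_tree(X, y, min_leaf)
    return [[rng.choice(leaf_values(tree, row)) for row in X] for _ in range(m)]

# Toy data (made up for illustration): age predicts a sensitive income value.
data_rng = random.Random(1)
ages = [data_rng.randint(20, 65) for _ in range(200)]
income = [1000 * a + data_rng.gauss(0, 5000) for a in ages]
X = [[a] for a in ages]

synthetic_sets = synthesize(X, income, m=3)
```

Each element of synthetic_sets is one synthetic replacement for the income column; the predictors in X are released unchanged, which is what makes the data partially rather than fully synthetic. Drawing directly from leaf values is a deliberately simple choice here, since it only ever re-releases values that were actually observed, and a real application would need to weigh that against disclosure risk.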

Author

Jerome P. Reiter
