Record linkage (paper)
2006, cited by 290
Data Quality and Management; Privacy-Preserving Technologies in Data; Distributed systems and fault tolerance
Abstract
This tutorial provides a comprehensive and cohesive overview of key research results on record linkage: the methodologies and algorithms for identifying approximate duplicate records, along with the tools available for this purpose. It covers techniques introduced in several communities, including databases, information retrieval, statistics, and machine learning. It aims to identify the similarities and differences among these techniques, as well as their merits and limitations.