Discovering informative patterns and data cleaning 论文

1996引用 235
Machine Learning and AlgorithmsAlgorithms and Data CompressionMachine Learning and Data Classification

摘要

We present a method for discovering informative patterns from data. With this method, large databases can be reduced to only a few representative data entries. Our framework also encompasses methods for cleaning databases containing corrupted data. Both on-line and off-line algorithms are proposed and experimentally checked on databases of handwritten images. The generality of the framework makes it an attractive candidate for new applications in knowledge discovery. Keywords: knowledge discovery, machine learning, informative patterns, data cleaning, information gain. 4.1