Proceedings of the 2017 SIAM International Conference on Data Mining (paper)
2017 · Society for Industrial and Applied Mathematics eBooks · Cited by 398
Data Mining Algorithms and Applications; Advanced Database Systems and Queries; Rough Sets and Fuzzy Logic
Abstract
Discovering the key structure of a database is one of the main goals of data mining. In pattern set mining we do so by discovering a small set of patterns that together describe the data well. The richer the class of patterns we consider, and the more powerful our description language, the better we will be able to summarise the data. In this paper we propose SQUISH, a novel greedy MDL-based method for summarising sequential data using rich patterns that are allowed to interleave. Experiments show that SQUISH is orders of magnitude faster than the state of the art, yields better models, and discovers meaningful semantics in the form of patterns that identify multiple choices of values.
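To make the idea of greedy MDL-based pattern set mining concrete, here is a minimal Python sketch of the general principle: choose the set of patterns that minimises the total encoded length L(M) + L(D | M). It is not the SQUISH algorithm. It assumes contiguous patterns only (no interleaving, gaps, or choices of values) and a deliberately crude model cost, and the names `cover`, `total_length`, and `greedy_mdl` are hypothetical, introduced purely for illustration.

```python
import math
from collections import Counter


def cover(seq, patterns):
    """Cover seq left to right, trying longer patterns first and
    falling back to single symbols where no pattern matches."""
    ordered = sorted(patterns, key=len, reverse=True)
    out, i = [], 0
    while i < len(seq):
        for p in ordered:
            if tuple(seq[i:i + len(p)]) == p:
                out.append(p)
                i += len(p)
                break
        else:
            out.append((seq[i],))
            i += 1
    return out


def total_length(seq, patterns):
    """Two-part description length in bits: L(M) + L(D | M)."""
    usage = Counter(cover(seq, patterns))
    n = sum(usage.values())
    # L(D | M): Shannon-optimal code lengths derived from usage counts.
    data_bits = -sum(c * math.log2(c / n) for c in usage.values())
    # L(M): assumed crude cost, each pattern spelled out symbol by symbol
    # plus an end marker, using a uniform code over alphabet + marker.
    symbol_bits = math.log2(len(set(seq)) + 1)
    model_bits = sum((len(p) + 1) * symbol_bits for p in patterns)
    return model_bits + data_bits


def greedy_mdl(seq, max_len=3):
    """Repeatedly add the candidate pattern that shrinks the total
    encoded length the most; stop when no candidate improves it."""
    candidates = {
        tuple(seq[i:i + k])
        for k in range(2, max_len + 1)
        for i in range(len(seq) - k + 1)
    }
    model = []
    best = total_length(seq, model)
    while True:
        scored = [
            (total_length(seq, model + [c]), c)
            for c in sorted(candidates)
            if c not in model
        ]
        if not scored:
            break
        bits, cand = min(scored)
        if bits >= best:
            break
        model.append(cand)
        best = bits
    return model, best


if __name__ == "__main__":
    model, bits = greedy_mdl(list("abc" * 6))
    print(model, bits)
```

On the toy sequence above, the single pattern `('a', 'b', 'c')` compresses the data better than any other candidate, so the greedy search selects it and then stops, since any further pattern would only add model cost.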