The UCI KDD archive of large data sets for data mining research and experimentation (Paper)

2000, ACM SIGKDD Explorations Newsletter. Cited by 234
Data Mining Algorithms and Applications; Data Stream Mining Techniques; Advanced Database Systems and Queries

Abstract

Advances in data collection and storage have allowed organizations to create massive, complex and heterogeneous databases, which have stymied traditional methods of data analysis. This has led to the development of new analytical tools that often combine techniques from a variety of fields such as statistics, computer science, and mathematics to extract meaningful knowledge from the data. To support research in this area, UC Irvine has created the UCI Knowledge Discovery in Databases (KDD) Archive (http://kdd.ics.uci.edu), which is a new online archive of large and complex data sets that encompasses a wide variety of data types, analysis tasks, and application areas. This article describes the objectives and philosophy of the UCI KDD Archive. We draw parallels with the development of the UCI Machine Learning Repository and its effect on the Machine Learning community.