Feature-based classification of time-series data (paper)

2001 · Cited by 253
Topics: Time Series Analysis and Forecasting; Anomaly Detection Techniques and Applications; Neural Networks and Applications

Abstract: In this paper we propose the use of statistical features for time-series classification. The classification is performed with a multi-layer perceptron (MLP) neural network. The proposed method is examined in the context of Control Chart Pattern data, which are time series used in Statistical Process Control. Experimental results verify the efficiency of the feature-based classification method, compared to previous methods which classify time series based on the values of each time point. Moreover, the results show the robustness of the proposed method against noise and time-series length.

Key-words: Data mining, time series, classification, statistical features

1 Introduction

Data mining is the process of pattern identification in large databases [1]. The main objectives of data mining are prediction and description. Data mining methods belong to several categories. Regression maps data to prediction values. Generalization produces a simple description from complex data, and association finds dependencies among data. Clustering identifies a set of types with which data can be categorized, whereas classification maps data to a set of predefined types.

Data mining has been mainly applied to relational data. Non-relational data present important challenges to data mining due to their size and dimensionality. Time-series data are supported by many database systems. A time series is a sequence of real numbers representing the values of a variable over time. Time series have found applications in temporal [2] and scientific databases, as well as in data warehouses containing a variety of data types, from stock market prices to electro-cardiograms. Mining time-series data can reveal important patterns, such as similarities [3], trends [4] or periodicity [5]. Since time-series data tend to grow rapidly over time, they present several performance issues to data mining algorithms.
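As a rough illustration of the feature-based idea in the abstract, the sketch below reduces a time series of any length to a short vector of statistical features that a classifier such as an MLP could take as input. The particular features (mean, standard deviation, skewness, kurtosis), the function names, and the synthetic generator are assumptions made for illustration; this excerpt does not list the features the paper actually uses.

```python
import random
from statistics import fmean, pstdev


def statistical_features(series):
    """Return [mean, std, skewness, kurtosis] for a sequence of floats.

    This feature set is an illustrative assumption, not the paper's list.
    """
    mu = fmean(series)
    sigma = pstdev(series, mu)  # population standard deviation
    if sigma == 0:
        # A constant series has no spread; higher moments are set to 0.
        return [mu, 0.0, 0.0, 0.0]
    n = len(series)
    skew = sum(((x - mu) / sigma) ** 3 for x in series) / n
    kurt = sum(((x - mu) / sigma) ** 4 for x in series) / n
    return [mu, sigma, skew, kurt]


def synthetic_pattern(kind, length, rng):
    """Hypothetical generator for a noisy 'normal' or 'upward_trend' series."""
    slope = 0.5 if kind == "upward_trend" else 0.0
    return [30.0 + slope * t + rng.gauss(0.0, 2.0) for t in range(length)]


rng = random.Random(0)
short = statistical_features(synthetic_pattern("normal", 60, rng))
long_ = statistical_features(synthetic_pattern("upward_trend", 240, rng))

# Both series map to a feature vector of the same size (four entries),
# even though one has 60 points and the other 240.
print(len(short), len(long_))
```

Because the feature vector has a fixed size, the classifier's input layer does not depend on the number of time points. That is one plausible reading of the robustness to time-series length claimed in the abstract.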