Detection and Characterization of Cluster Substructure I. Linear Structure: Fuzzy c-Lines (paper)
Details
- Journal/Conference
- SIAM Journal on Applied Mathematics
- Publication date
- 1981-04-01
- Publication year
- 1981
Abstract
In Part I, a generalization of the Fuzzy c-Means (or Fuzzy ISODATA) clustering algorithms is developed. Necessary conditions for minimization of a generalized total weighted squared orthogonal error objective function lead to a Picard iteration scheme which generates simultaneously (i) c fuzzy clusters in the data; (ii) a set of c prototypical straight lines in feature space which best fit the data in a well-defined sense; (iii) a set of c prototypical centers of mass (on the c lines) which characterize the “core” of each linear fuzzy cluster. Theoretical optimization is achieved using principal components of generalized within-cluster fuzzy scatter matrices. A convergence theorem for each algorithm in the infinite family is given. The algorithms are exemplified by five numerical examples using both real and artificial data sets having essentially “linear” substructure. In Part II, the Fuzzy c-Means and Fuzzy c-Lines algorithms are shown to be special cases of a more general class of fuzzy algorithms, the Fuzzy c-Varieties family, which generate (for prototypes) c linear varieties of arbitrary dimension up to dimension one less than the dimension of the data space. Convex combinations of the c-Varieties algorithms will also be studied in Part II [SIAM J. Appl. Math., 40 (1981), pp. 358–372].
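The iteration described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' code. It assumes the commonly cited form of the updates: each center is a membership-weighted mean, each line direction is the leading eigenvector of the fuzzy scatter matrix, and memberships are recomputed from squared orthogonal distances to the lines. The function name `fuzzy_c_lines` and the parameters `m`, `n_iter`, `tol`, and `seed` are illustrative assumptions, as is the random initialization.

```python
import numpy as np

def fuzzy_c_lines(X, c, m=2.0, n_iter=100, tol=1e-6, seed=0):
    """Hypothetical sketch of a fuzzy c-lines style Picard iteration.

    X: (n, p) data; c: number of lines; m > 1: fuzziness exponent (assumed).
    Returns memberships U (c, n), centers V (c, p), unit directions (c, p).
    """
    rng = np.random.default_rng(seed)
    n, p = X.shape
    # Random initial fuzzy partition; each column sums to 1.
    U = rng.random((c, n))
    U /= U.sum(axis=0, keepdims=True)
    for _ in range(n_iter):
        W = U ** m
        # Centers of mass: membership-weighted means of the data.
        V = (W @ X) / W.sum(axis=1, keepdims=True)
        D2 = np.empty((c, n))
        dirs = np.empty((c, p))
        for i in range(c):
            R = X - V[i]
            # Fuzzy within-cluster scatter matrix for cluster i.
            S = (W[i][:, None] * R).T @ R
            # eigh returns ascending eigenvalues; last column is the
            # principal direction (unit norm).
            _, eigvecs = np.linalg.eigh(S)
            d = eigvecs[:, -1]
            dirs[i] = d
            # Squared orthogonal distance from each point to the line.
            D2[i] = np.sum(R * R, axis=1) - (R @ d) ** 2
        # Clamp to avoid division by zero for points lying on a line.
        D2 = np.maximum(D2, 1e-12)
        inv = D2 ** (-1.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=0, keepdims=True)
        converged = np.abs(U_new - U).max() < tol
        U = U_new
        if converged:
            break
    return U, V, dirs
```

The returned centers and directions are those computed from the partition before the final membership update. Like other alternating schemes of this kind, the sketch can settle on a local minimum, so results may depend on the initial partition.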