An improved approximation algorithm for the column subset selection problem 论文
详细信息
- 发表日期
- 2009-01-04
- 发表年份
- 2009
关键词
摘要
Abstract We consider the problem of selecting the "best " sub-set of exactly k columns from an m * n matrix A. Inparticular, we present and analyze a novel two-stage algorithm that runs in O(min{mn2, m2n}) time and re-turns as output an m * k matrix C consisting of ex-actly k columns of A. In the first stage (the random-ized stage), the algorithm randomly selects O(k log k)columns according to a judiciously-chosen probability distribution that depends on information in the topk right singular subspace of A. In the second stage(the deterministic stage), the algorithm applies a deterministic column-selection procedure to select and returnexactly k columns from the set of columns selected inthe first stage. Let C be the m * k matrix containingthose k columns, let PC denote the projection matrixonto the span of those columns, and let Ak denote the"best " rankk approximation to the matrix A as com-puted with the singular value decomposition. Then, we prove that kA- PCAk2 < = O ik