OSNAP: Faster Numerical Linear Algebra Algorithms via Sparser Subspace Embeddings 论文
摘要
An oblivious subspace embedding (OSE) given some parameters ε, d is a distribution D over matrices Π ∈ R <sup xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">m×n</sup> such that for any linear subspace W ⊆ R <sup xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">n</sup> with dim(W) = d, P <sub xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">Π~D</sub> (∀x ∈ W ||Πx|| <sub xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">2</sub> ∈ (1 ± ε)||x|| <sub xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">2</sub> ) > 2/3. We show that a certain class of distributions, Oblivious Sparse Norm-Approximating Projections (OSNAPs), provides OSE's with m = O(d <sup xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">1+γ</sup> /ε <sup xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">2</sup> ), and where every matrix Π in the support of the OSE has only s = O <sub xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">γ</sub> (1/ε) non-zero entries per column, for γ > 0 any desired constant. Plugging OSNAPs into known algorithms for approximate least squares regression, ℓ <sub xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">p</sub> regression, low rank approximation, and approximating leverage scores implies faster algorithms for all these problems. Our main result is essentially a Bai-Yin type theorem in random matrix theory and is likely to be of independent interest: we show that for any fixed U ∈ R <sup xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">n×d</sup> with orthonormal columns and random sparse Π, all singular values of ΠU lie in [1 - ε, 1 + ε] with good probability. This can be seen as a generalization of the sparse Johnson-Lindenstrauss lemma, which was concerned with d = 1. Our methods also recover a slightly sharper version of a main result of [Clarkson-Woodruff, STOC 2013], with a much simpler proof. That is, we show that OSNAPs give an OSE with m = O(d <sup xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">2</sup> /ε <sup xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">2</sup> ), s = 1.