Abstract
Non-Gaussian Component Analysis (NGCA) is the statistical task of finding a non-Gaussian direction in a high-dimensional dataset. Specifically, given i.i.d. samples from a distribution PAv on Rn that behaves like a known distribution A in a hidden direction v and like a standard Gaussian in the orthogonal complement, the goal is to approximate the hidden direction. The standard formulation posits that the first k - moments of A match those of the standard Gaussian and the k-th moment differs. Under mild assumptions, this problem has sample complexity O(n). On the other hand, all known efficient algorithms require Ω(nk/2) samples. Prior work developed sharp Statistical Query and low-degree testing lower bounds suggesting an information-computation tradeoff for this problem. Here we study the complexity of NGCA in the Sum-of-Squares (SoS) framework. Our main contribution is the first super-constant degree SoS lower bound for NGCA. Specifically, we show that if the non-Gaussian distribution A matches the first (k−1) moments of N(O, 1) and satisfies other mild conditions, then with fewer than n(1−ε)k/2 many samples from the normal distribution, with high probability, degree (logn)12−on(1)SoS fails to refute the existence of such a direction v. Our result significantly strengthens prior work by establishing a super-polynomial information-computation tradeoff against a broader family of algorithms. As corollaries, we obtain SoS lower bounds for several problems in robust statistics and the learning of mixture models. Our SoS lower bound proof introduces a novel technique’ that we believe may be of broader interest, and a number of refinements over existing methods. As in previous work, we use the framework of [Barak et al. FOCS 2016], where we express the moment matrix M as a sum of graph matrices, find a factorization M≈LQLT using minimum vertex...
Original language | English |
---|---|
Title of host publication | 2024 IEEE 65th Annual Symposium on Foundations of Computer Science (FOCS) |
Publisher | Institute of Electrical and Electronics Engineers (IEEE) |
Pages | 949-958 |
Number of pages | 10 |
ISBN (Electronic) | 9798331516741 |
ISBN (Print) | 9798331516758 |
DOIs | |
Publication status | Published - 29 Nov 2024 |
Event | 65th IEEE Symposium on Foundations of Computer Science - Chicago, United States Duration: 27 Oct 2024 → 30 Oct 2024 Conference number: FOCS 2024 https://focs.computer.org/2024/ |
Publication series
Name | Annual IEEE Symposium on Foundations of Computer Science |
---|---|
Publisher | IEEE |
ISSN (Print) | 1523-8288 |
ISSN (Electronic) | 2575-8454 |
Conference
Conference | 65th IEEE Symposium on Foundations of Computer Science |
---|---|
Country/Territory | United States |
City | Chicago |
Period | 27/10/24 → 30/10/24 |
Internet address |
Research Groups and Themes
- Algorithms and Complexity
- Sum-of-Squares
- Non-Gaussian Component Analysis (NGCA)
- Algebra Representation