Sum-of-Squares Lower Bounds for Non-Gaussian Component Analysis

Ilias Diakonikolas, Sushrut Karmalkar, Shuo Pang, Aaron Potechin

Research output: Chapter in Book/Report/Conference proceedingConference Contribution (Conference Proceeding)

Abstract

Non-Gaussian Component Analysis (NGCA) is the statistical task of finding a non-Gaussian direction in a high-dimensional dataset. Specifically, given i.i.d. samples from a distribution PAv on Rn that behaves like a known distribution A in a hidden direction v and like a standard Gaussian in the orthogonal complement, the goal is to approximate the hidden direction. The standard formulation posits that the first k - moments of A match those of the standard Gaussian and the k-th moment differs. Under mild assumptions, this problem has sample complexity O(n). On the other hand, all known efficient algorithms require Ω(nk/2) samples. Prior work developed sharp Statistical Query and low-degree testing lower bounds suggesting an information-computation tradeoff for this problem. Here we study the complexity of NGCA in the Sum-of-Squares (SoS) framework. Our main contribution is the first super-constant degree SoS lower bound for NGCA. Specifically, we show that if the non-Gaussian distribution A matches the first (k−1) moments of N(O, 1) and satisfies other mild conditions, then with fewer than n(1−ε)k/2 many samples from the normal distribution, with high probability, degree (logn)12−on(1)SoS fails to refute the existence of such a direction v. Our result significantly strengthens prior work by establishing a super-polynomial information-computation tradeoff against a broader family of algorithms. As corollaries, we obtain SoS lower bounds for several problems in robust statistics and the learning of mixture models. Our SoS lower bound proof introduces a novel technique’ that we believe may be of broader interest, and a number of refinements over existing methods. As in previous work, we use the framework of [Barak et al. FOCS 2016], where we express the moment matrix M as a sum of graph matrices, find a factorization M≈LQLT using minimum vertex...
Original languageEnglish
Title of host publication2024 IEEE 65th Annual Symposium on Foundations of Computer Science (FOCS)
PublisherInstitute of Electrical and Electronics Engineers (IEEE)
Pages949-958
Number of pages10
ISBN (Electronic)9798331516741
ISBN (Print)9798331516758
DOIs
Publication statusPublished - 29 Nov 2024
Event65th IEEE Symposium on Foundations of Computer Science - Chicago, United States
Duration: 27 Oct 202430 Oct 2024
Conference number: FOCS 2024
https://focs.computer.org/2024/

Publication series

NameAnnual IEEE Symposium on Foundations of Computer Science
PublisherIEEE
ISSN (Print)1523-8288
ISSN (Electronic)2575-8454

Conference

Conference65th IEEE Symposium on Foundations of Computer Science
Country/TerritoryUnited States
CityChicago
Period27/10/2430/10/24
Internet address

Research Groups and Themes

  • Algorithms and Complexity
  • Sum-of-Squares
  • Non-Gaussian Component Analysis (NGCA)
  • Algebra Representation

Fingerprint

Dive into the research topics of 'Sum-of-Squares Lower Bounds for Non-Gaussian Component Analysis'. Together they form a unique fingerprint.

Cite this