Universal characteristics of deep neural network loss surfaces from random matrix theory

Nick P Baskerville, Jonathan Keating, Francesco Mezzadri, Joseph Najnudel, Diego Granziol

Research output: Contribution to journal · Article (Academic Journal) · Peer-review


Abstract

This paper considers several aspects of random matrix universality in deep neural networks (DNNs). Motivated by recent experimental work, we use universal properties of random matrices related to local statistics to derive practical implications for DNNs based on a realistic model of their Hessians. In particular, we derive universal aspects of outliers in the spectra of deep neural networks and demonstrate the important role of random matrix local laws in popular pre-conditioned gradient descent algorithms. We also present insights into DNN loss surfaces from quite general arguments based on tools from statistical physics and random matrix theory.
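As an illustrative sketch (not taken from the paper itself), the kind of random matrix universality the abstract refers to can be seen numerically: the bulk spectrum of a large symmetric random matrix is largely insensitive to the entry distribution and follows the Wigner semicircle law. The matrix size and scaling below are illustrative choices.

```python
import numpy as np

# Sample a Wigner (GOE-like) matrix: symmetrize an i.i.d. Gaussian matrix
# and scale so that the bulk of the spectrum lies in [-2, 2].
rng = np.random.default_rng(0)
n = 1000
a = rng.standard_normal((n, n))
h = (a + a.T) / np.sqrt(2 * n)

# Eigenvalues of the symmetric matrix.
eigs = np.linalg.eigvalsh(h)

# The semicircle law on [-2, 2] has second moment 1; the empirical
# second moment of the spectrum should be close to that value.
second_moment = np.mean(eigs**2)
print(second_moment)
```

Repeating this with non-Gaussian entries (e.g. uniform or Rademacher, suitably normalized) produces essentially the same spectral density, which is the universality phenomenon the paper exploits for DNN Hessian models.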
Original language: English
Article number: 494002
Number of pages: 42
Journal: Journal of Physics A: Mathematical and Theoretical
Volume: 55
Issue number: 49
Publication status: Published - 16 Dec 2022
