An end-to-end machine learning system for harmonic analysis of music

Yizhao Ni, Matt McVicar, Raul Santos Rodriguez, Tijl E P De Bie

Research output: Contribution to journal › Article (Academic Journal) › peer-review

51 Citations (Scopus)


We present a new system for the harmonic analysis
of popular musical audio. It is focused on chord estimation,
although the proposed system additionally estimates the key
sequence and bass notes. It is distinct from competing approaches
in two main ways. Firstly, it makes use of a new, improved
chromagram representation of audio that takes the human
perception of loudness into account. Secondly, it is the first
system for joint estimation of chords, keys, and bass notes that is
fully based on machine learning, requiring no expert knowledge
to tune the parameters. This means that it will benefit from
future increases in available annotated audio files, broadening its
applicability to a wider range of genres. In all three evaluation
scenarios, including a new one that allows evaluation on audio
for which no complete ground truth annotation is available, the
proposed system is shown to be faster, more memory efficient,
and more accurate than the state of the art.
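
The loudness-aware chromagram mentioned in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes that per-semitone band powers are already available, applies the standard A-weighting curve as one common model of perceived loudness, sums the weighted levels over octaves into 12 pitch classes, and rescales each frame to [0, 1]. The function names, the `p_ref` reference power, and the per-frame normalisation are illustrative assumptions.

```python
import numpy as np


def a_weighting_db(freq_hz):
    """Standard A-weighting gain in dB (approximately 0 dB at 1 kHz)."""
    f2 = np.asarray(freq_hz, dtype=float) ** 2
    num = (12194.0 ** 2) * f2 ** 2
    den = (
        (f2 + 20.6 ** 2)
        * np.sqrt((f2 + 107.7 ** 2) * (f2 + 737.9 ** 2))
        * (f2 + 12194.0 ** 2)
    )
    return 20.0 * np.log10(num / den) + 2.0


def loudness_chroma(pitch_power, midi_pitches, p_ref=1e-12):
    """Fold per-semitone band power into a loudness-weighted chromagram.

    pitch_power  : array of shape (n_pitches, n_frames), band power per semitone
    midi_pitches : integer MIDI pitch number for each row of pitch_power
    p_ref        : reference power for the dB conversion (assumed value)
    """
    pitch_power = np.asarray(pitch_power, dtype=float)
    midi_pitches = np.asarray(midi_pitches)

    # Centre frequency of each semitone band (A4 = MIDI 69 = 440 Hz).
    freqs = 440.0 * 2.0 ** ((midi_pitches - 69) / 12.0)

    # Power level in dB, floored at the reference to avoid log(0).
    level_db = 10.0 * np.log10(np.maximum(pitch_power, p_ref) / p_ref)

    # Perceptual correction: add the A-weighting gain of each band.
    level_db = level_db + a_weighting_db(freqs)[:, None]

    # Sum all octaves of the same pitch class (row 0 is C).
    chroma = np.zeros((12, pitch_power.shape[1]))
    np.add.at(chroma, midi_pitches % 12, level_db)

    # Rescale each frame to the range [0, 1].
    lo = chroma.min(axis=0, keepdims=True)
    span = chroma.max(axis=0, keepdims=True) - lo
    return (chroma - lo) / np.where(span > 0, span, 1.0)
```

For example, calling `loudness_chroma(power, list(range(21, 109)))` on an 88-by-T array of band powers returns a 12-by-T array with one row per pitch class, starting at C.
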
Original language: English
Pages (from-to): 1771-1783
Journal: IEEE Transactions on Audio, Speech, and Language Processing
Issue number: 6
Publication status: Published - Feb 2012


  • chord recognition
  • music information retrieval

