
All That Jazz: Improving Automated Piano Note Transcription

Posted by Lauren Cotton on Wednesday, November 17, 2021

Any fan of jazz music can attest to the beauty of musical improvisation. However, many famous improvisational piano pieces aren't recorded in sheet music. “There's a lot of music that exists in the world that doesn't have musical transcriptions because it was played improvisationally—virtuosos that never decided to write anything down,” explained Varun Rajaram. This is because transcribing the notes of a piece (especially polyphonic pieces where multiple notes play at a time) is a difficult task even for skilled musicians.

When exploring this topic, Varun realized that current transcription software had a considerable error rate when transcribing jazz music. As a piano player himself, Varun was aware that the current note transcription dataset, MAESTRO, contained only classical music, and that models trained on it were much less successful at accurately transcribing other genres. This inspired him to investigate the topic for his Turing Scholars Honors Thesis. Overseen by UT Computer Science Professor David Harwath and UT Music Professor John Mills, he began investigating the best way to improve on this dataset. Varun’s interdisciplinary research concluded that broadening the available data to include jazz pieces would be an achievable way to make a lasting impact on the music industry’s ability to efficiently auto-transcribe pieces.

The first phase of improving music transcription was collecting a significant corpus of data from the jazz genre. Professor John Mills helped provide this data, later named the MILLS dataset. A second jazz dataset, called the DMJ dataset, was created from the music of Doug McKenzie. These datasets consist of audio recordings of pieces along with the underlying MIDI recordings (MIDI encodes the pitch and duration of what was played, but it does not show the notes as written notation). Those familiar with GarageBand on Apple products may recognize MIDI represented as a piano roll.

Piano roll representation of the song All of Me by Frank Sinatra. The 88 piano keys are represented on the vertical axis and each colored bar represents a note being held for a particular duration at the corresponding pitch.
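To make the piano roll idea concrete, here is a minimal sketch of how a list of notes could be turned into an 88-row grid like the one pictured. The note format (MIDI pitch, start time, end time), the function name `notes_to_piano_roll`, and the frame rate are illustrative assumptions, not details taken from Varun's thesis or from the MILLS and DMJ datasets.

```python
from typing import List, Tuple

# Hypothetical note format: (MIDI pitch, start in seconds, end in seconds)
Note = Tuple[int, float, float]

LOWEST_PIANO_PITCH = 21  # A0 in standard MIDI note numbering
NUM_KEYS = 88            # A0 (21) through C8 (108)


def notes_to_piano_roll(notes: List[Note], frames_per_second: int = 100) -> List[List[int]]:
    """Return an 88 x T grid where cell [key][frame] is 1 while that key is held."""
    if not notes:
        return [[] for _ in range(NUM_KEYS)]
    total_frames = max(round(end * frames_per_second) for _, _, end in notes)
    roll = [[0] * total_frames for _ in range(NUM_KEYS)]
    for pitch, start, end in notes:
        key = pitch - LOWEST_PIANO_PITCH
        if not 0 <= key < NUM_KEYS:
            continue  # skip pitches outside the 88-key range
        first = round(start * frames_per_second)
        last = round(end * frames_per_second)
        for frame in range(first, last):
            roll[key][frame] = 1
    return roll


# Example: a C major triad (C4, E4, G4) held for half a second, then a lone C5
example_notes = [(60, 0.0, 0.5), (64, 0.0, 0.5), (67, 0.0, 0.5), (72, 0.5, 1.0)]
roll = notes_to_piano_roll(example_notes, frames_per_second=10)
```

Each row of `roll` corresponds to one piano key and each column to a short slice of time, so a held note shows up as a horizontal run of ones, and a chord shows up as several rows that are active in the same columns.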

With the help of Professor David Harwath, the next step was training a machine learning model on jazz pieces and comparing it against the previous classically trained model. The results were successful, and these datasets are now publicly available to assist others in this goal. By broadening the available transcription datasets, this work brings musicians one step closer to precise note recording. The industry applications of this research are far-reaching, including creative applications such as computer-generated music and easier analysis of famous pieces.
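The article does not say how the two models were scored against each other. One common way to compare transcription output with a reference is note-level precision, recall, and F1, where a predicted note counts as correct if its pitch matches and its onset falls within a small tolerance. The sketch below assumes that approach; the function name `note_f1`, the 50 ms tolerance, the simple greedy matching, and the toy data are all illustrative and do not describe the thesis's actual evaluation.

```python
from typing import List, Tuple

# Hypothetical note format: (MIDI pitch, onset in seconds)
Onset = Tuple[int, float]


def note_f1(reference: List[Onset], predicted: List[Onset],
            onset_tolerance: float = 0.05) -> Tuple[float, float, float]:
    """Greedily match predicted notes to reference notes; return (precision, recall, F1)."""
    unmatched = list(reference)
    true_positives = 0
    for pitch, onset in predicted:
        for i, (ref_pitch, ref_onset) in enumerate(unmatched):
            if pitch == ref_pitch and abs(onset - ref_onset) <= onset_tolerance:
                true_positives += 1
                del unmatched[i]  # each reference note can be matched only once
                break
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(reference) if reference else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Toy comparison of two made-up model outputs against the same reference
reference = [(60, 0.0), (64, 0.0), (67, 0.5)]
model_a_output = [(60, 0.01), (64, 0.02), (67, 0.51)]  # all three notes found
model_b_output = [(60, 0.0), (65, 0.0)]                # one wrong pitch, one note missed

scores_a = note_f1(reference, model_a_output)
scores_b = note_f1(reference, model_b_output)
```

Running both models over the same held-out jazz recordings and comparing scores like these is one way a jazz-trained model could be checked against a classically trained one.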

Varun reflected on his experience of writing an honors thesis and expressed his gratitude for both the opportunity and the dedication of his mentors. He explained that Professor Harwath “was willing to get his hands dirty and help me a lot, and be more involved than the average advisor advising a CS thesis would be.” The Turing Scholars Honors Program is UTCS’s honors program for gifted computer science students. It encourages these students to expand their skills through a cohort environment and the opportunity to write an honors thesis.

Varun is now a Digital Analyst with McKinsey & Company, where he continues to pursue interdisciplinary solutions to complex problems.