Machine learning has helped a group of researchers at the University of Washington devise Audeo, a system that creates audio from silent piano performances. In other words, the artificial intelligence recreates the sound of a musician playing an instrument using only visual cues.
Audeo
Audeo uses a series of steps to decode what is happening in the video and then translate it into music. First, it detects which keys are pressed in each video frame, building a diagram of key presses over time (a piano roll). Then it must translate that diagram into something a music synthesizer will actually recognize as the sound a piano makes. This second step cleans up the data and adds more information, such as how hard each key is pressed and for how long.
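To make the two stages concrete, here is a minimal Python sketch of the same idea. It is not the researchers' code: `detect_pressed_keys` is a placeholder for a trained vision model, and the second stage's cleanup is reduced here to dropping very short detections and assigning a fixed velocity, whereas Audeo learns that translation.

```python
# A minimal sketch of the two-stage pipeline described above, under the
# assumption that a vision model (`detect_pressed_keys`, hypothetical)
# returns the indices of pressed keys for a single frame.

import numpy as np

NUM_KEYS = 88  # standard piano keyboard

def frames_to_roll(frames, detect_pressed_keys):
    """Stage 1: build a (num_frames x 88) binary piano roll from video frames."""
    roll = np.zeros((len(frames), NUM_KEYS), dtype=np.uint8)
    for t, frame in enumerate(frames):
        for key in detect_pressed_keys(frame):
            roll[t, key] = 1
    return roll

def roll_to_notes(roll, fps, min_frames=2, velocity=80):
    """Stage 2: convert the roll into note events (key, onset_s, duration_s, velocity).

    The cleanup here is simplistic: presses shorter than `min_frames` are
    discarded as likely detection noise, and every note gets a fixed
    velocity. Audeo instead learns this step, including how hard each key
    is struck.
    """
    notes = []
    for key in range(NUM_KEYS):
        t = 0
        while t < len(roll):
            if roll[t, key]:
                start = t
                while t < len(roll) and roll[t, key]:
                    t += 1
                if t - start >= min_frames:
                    notes.append((key, start / fps, (t - start) / fps, velocity))
            else:
                t += 1
    return notes
```

A call like `roll_to_notes(frames_to_roll(frames, model), fps=25)` would then yield note events that a MIDI synthesizer could render as audio.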
The researchers trained and tested the system using YouTube videos of the pianist Paul Barton. The dataset consisted of roughly 172,000 video frames of Barton playing music by well-known classical composers, such as Bach and Mozart.
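For illustration, a training corpus like this might be assembled by slicing already-downloaded videos into individual frames; the sketch below uses OpenCV, and the filename is hypothetical.

```python
# Sketch: extract frames from a local video file with OpenCV.
import cv2

def extract_frames(video_path, every_n=1):
    """Yield frames from a video file, keeping one in every `every_n`."""
    cap = cv2.VideoCapture(video_path)
    index = 0
    while True:
        ok, frame = cap.read()
        if not ok:  # end of video or read error
            break
        if index % every_n == 0:
            yield frame
        index += 1
    cap.release()

frames = list(extract_frames("barton_bach_prelude.mp4"))  # hypothetical file
print(f"{len(frames)} frames extracted")
```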
To gauge how faithful Audeo's output is, the researchers played it to song-recognition apps: the apps correctly identified the piece Audeo was playing approximately 86% of the time, while they identified the pieces from the source videos' original audio roughly 93% of the time.
Audeo was trained and tested only on Paul Barton's piano videos. Further research is needed to see how well it can transcribe music for any musician or any piano.
The news "This AI can interpret the music played by an instrument using only visual cues" was originally published in Xataka Science by Sergio Parra.