By enlisting thousands of social gamers through the Facebook app Herd It, engineers at the University of California, San Diego (UCSD) have created a game-powered machine learning system for characterizing and recommending songs, one trained not by expensive experts but by music lovers having fun.
The project, conducted at the Computer Audition Laboratory at UCSD, can already categorize music much faster than Internet radio services like Pandora, and the researchers hope to eventually create a “zero-click” radio station that reads your mood and recommends the perfect playlist.
The researchers are aiming to do for audio—music, sound effects, voice recordings—what Google did for text documents. By using machine learning algorithms fed with data by volunteer game participants, the researchers are demonstrating an alternative to hiring people to do that categorization and recommendation work, the way industry leader Pandora Internet Radio does. And the UCSD researchers are not the only ones pursuing this goal when it comes to music: a company called Clio Music, a “search and discovery platform that uses music to find music,” is working to develop new services for musicians using machine learning techniques.
“We’re basically building computers that can listen to that audio and understand that audio like humans do,” said UCSD professor and lab co-founder Gert Lanckriet.
Lanckriet and two of his students just published a paper about their project in the April 24 issue of The Proceedings of the National Academy of Sciences.
Luke Barrington, a co-founder and former doctoral student at the lab, said that while Pandora paid experts to code music, “We didn’t have the luxury of doing that. We thought we would build a game around music where people would describe what they heard, and that’s how Herd It was born.”
With the game Herd It, the goal is to build consensus among players on Facebook about what type of music they are listening to. Players get points for participating and answering questions such as what mood a song has, or what time of day a person would listen to it. The game went live in 2009 and collected data for a year before the UCSD engineers started using it. About 8,500 people have played the game, he said, categorizing approximately 10,000 songs.
Players are rewarded for arriving at a consensus, and that labeling provided data to feed the machine learning algorithm.
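The idea of turning player consensus into training labels can be sketched in a few lines. The vote format, thresholds, and song names below are invented for illustration; the article does not describe Herd It's actual aggregation rules.

```python
# Sketch: keep only (song, tag) labels that enough players agree on.
# min_votes and threshold are assumed parameters, not Herd It's real ones.
from collections import Counter, defaultdict

def consensus_labels(votes, min_votes=3, threshold=0.6):
    """votes: iterable of (song, tag, agreed) tuples from players.
    Returns {song: [tags]} where agreement meets the threshold."""
    tallies = defaultdict(Counter)
    for song, tag, agreed in votes:
        tallies[(song, tag)][agreed] += 1
    labels = {}
    for (song, tag), counts in tallies.items():
        total = counts[True] + counts[False]
        if total >= min_votes and counts[True] / total >= threshold:
            labels.setdefault(song, []).append(tag)
    return labels

votes = [
    ("Song A", "mellow", True), ("Song A", "mellow", True),
    ("Song A", "mellow", False), ("Song A", "mellow", True),
    ("Song B", "upbeat", True), ("Song B", "upbeat", False),
]
print(consensus_labels(votes))  # -> {'Song A': ['mellow']}
```

Only "Song A" clears both the vote count and the agreement threshold, so only that label would reach the learning algorithm.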
The effort started in 2006 after Lanckriet and several students including Barrington and Douglas Turnbull (the third co-author of the research paper) started a band but were confronted with the trouble of finding an audience for their music.
“We had a small amateur band that was thinking about uploading music into MySpace, and then realized nobody would ever be able to find it there,” Lanckriet said.
Barrington, who used the project as part of his doctoral thesis, said that problem spawned the idea to create a “Google for music,” a simple text search engine using key words like “rainy day” or “saxophone” (or a combination of terms) that would include every song that’s accessible online.
“Without knowing the names of any popular music, or the hot band out there, you could input what you were looking for and that’s what would come up,” Barrington said. “But in building this ‘Google for music’ and using this machine learning method you need to teach the machine all the words that somebody could possibly search for.”
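At its simplest, such a text search over tagged songs is an inverted index: each descriptive word points to the songs labeled with it, and a multi-word query returns the songs matching every term. This is a toy sketch with invented song names and tags, not the lab's actual system.

```python
# Sketch of a tag-based "Google for music": invert song->tags into
# tag->songs, then intersect the result sets for a multi-word query.
from collections import defaultdict

def build_index(song_tags):
    index = defaultdict(set)
    for song, tags in song_tags.items():
        for tag in tags:
            index[tag].add(song)
    return index

def search(index, query):
    terms = query.lower().split()
    results = [index.get(t, set()) for t in terms]
    return set.intersection(*results) if results else set()

index = build_index({
    "Track 1": {"saxophone", "mellow"},
    "Track 2": {"saxophone", "upbeat"},
    "Track 3": {"guitar", "mellow"},
})
print(search(index, "saxophone mellow"))  # -> {'Track 1'}
```

The hard part, as Barrington notes next, is not the index but teaching a machine to supply those tags for every song a user might search for.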
A similar effort has been underway since 1999 by the Music Genome Project, which was the idea behind Pandora. Pandora, which served 51.9 million listeners in April, hires music experts to listen to every song and identify about 400 music “genes,” like rhythm syncopation or vocal harmonies, so users can listen to music of similar style.
Barrington said after 12 years Pandora features about 900,000 songs in its library, but the engineers at UCSD want the ability to go much bigger on a much smaller budget.
“The trick is: how do you get every song on MySpace and YouTube and each website… to get that genome sequencing?” he said. “That was the kind of problem we wanted to solve. We realized the machine learning would get us most of the way there.”
Barrington said they built the system to not only recognize patterns in the data distribution in the music, but also to recognize where its own deficiencies in pattern recognition lie and request more training data from game players.
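The "request more training data where the model is weakest" loop Barrington describes resembles standard uncertainty sampling from active learning. The sketch below assumes that framing (the article doesn't specify the method) and uses invented songs and probabilities.

```python
# Sketch of uncertainty sampling: given the model's predicted
# probability that a tag applies to each song, ask game players to
# label the songs where the model is closest to a 50/50 guess.
def most_uncertain(predictions, k=2):
    """predictions: {song: P(tag applies)}. Returns the k songs the
    model is least sure about, to route to players for labeling."""
    return sorted(predictions, key=lambda s: abs(predictions[s] - 0.5))[:k]

preds = {"Song A": 0.95, "Song B": 0.52, "Song C": 0.10, "Song D": 0.45}
print(most_uncertain(preds))  # -> ['Song B', 'Song D']
```

Confident predictions (Song A, Song C) need no human help; the ambiguous ones go back into the game, so player effort targets the model's blind spots.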
The other component was how to avoid paying experts to create their training data like Pandora does. That’s where the crowd-sourcing Facebook game took over.
Lanckriet said all the components for a text-based music search engine are in place, and the lab could index a million songs overnight if needed. But when you start playing in the music industry’s sandbox some significant intellectual property problems arise.
“The problem with music is licensing,” he said. “We can easily trawl the Internet and come up with a million songs, but once we have that database there is nothing we can do with it, because we can’t expose it to the world because of licensing.”
Lanckriet said there could be some good news on the horizon, however, as some Internet radio providers like Rdio will start to allow third-party programs to access and categorize their music database.
“Rdio and Spotify have about 15 million songs,” Lanckriet said. “MySpace has a multiple of that. We’re talking about content that’s several orders of magnitude bigger than what Netflix has, and that all needs to be indexed and recommended.”
Lanckriet said he hopes to start using data from smart phones or other sensors to gauge a listener’s environment and mood so the database recommends the perfect song for a rainy day, or for driving, or late night.
“Lots of people just turn on the radio station, and hopefully there is something on that they like,” Lanckriet said. “Our system will allow the program to go out there on the Internet, find that music, automatically index it, and infuse it into the radio station into something they will hopefully like.”
Lanckriet said the Herd It game continues to be crucial to the database’s development, because music isn’t a static thing. New genres constantly emerge, and it’s also highly subjective.
“What is mellow rock for a 10-year-old girl in Beverly Hills is probably different than what a retired guy in Paris thinks it is,” Lanckriet said. “If you use a bunch of experts to provide you with your training data, that data is static. The game is basically live all the time, so it’s very adaptable to the dynamic character of music, using different terms and different generations.”
Barrington has since graduated from UCSD and helped create a company called Tomnod that uses crowdsourcing and game-powered machine learning to solve map-related problems. Tomnod and National Geographic recently teamed up to try to find Genghis Khan’s lost tomb by inviting Internet users to identify possible archeological sites in Mongolia.
“We had tens of thousands of people contribute millions of points of interest, and we were actually able to go into the field and find real archeological sites,” Barrington said.
The game-powered crowdsourcing aspect of machine learning took off in 2006, Barrington said, when Google bought the ESP Game and relicensed it as Google Image Labeler, allowing users to label images to improve the quality of image search results. Google discontinued the game in September.
“Since then there has been a whole industry built around the fun side of crowdsourcing,” he said.
Barrington cited examples like Galaxy Zoo, where users help astronomers label and categorize far-away galaxies, and Foldit, a game that encourages people to find the most efficient way to fold protein structures in a 3D environment, aiding research looking to cure AIDS, cancer, and other diseases.
“I originally thought having machines solving all our big data problems for us was the right way to go,” Barrington said. “But they’re only as smart as the data that you give them, and that data almost always comes from humans.
“How do we scalably and reliably get that smart out of the human mind?” Barrington said. “That for me is the cool part of all of this.”
Email Staff Writer Ian B. Murphy at email@example.com.