Ever gotten a bum steer on a wine recommendation? It’s pretty common. California startup VinoEno is aiming to eliminate human error in wine recommendations by using analytics to process a large consumer preference data set. In short, they’ve developed a recommendation engine for wine.
VinoEno is part of an emerging trend of companies large and small deploying recommendation engines that are built on Hadoop or other MapReduce analytics platforms. Foursquare recently made headlines when it included a recommendation engine in the latest version of its eponymous location-sharing mobile app. The company uses Hadoop to store and analyze millions of user check-ins. VinoEno’s mobile app VinSpin can recommend a new wine to try. Now Foursquare can recommend where to have a glass.
Recommendation engines are a natural fit for analytics platforms. They involve processing large amounts of consumer data that’s collected online, and the results of the analysis feed real-time online applications.
New tools are emerging to make it easier to build recommendation engines on Hadoop. For example, Twitter recently open-sourced its in-house Hadoop framework, dubbed Scalding. It’s designed to simplify programming big data applications, including recommendation engines.
“We use it for everything from one-off analysis jobs (how are users in country X responding to this new change?), to generating the data that we use to surface visualizations (how is this metric changing over time, sliced by this set of users?), and for powering the algorithms behind many of our production systems (which ads should we show for this search query?),” said Edwin Chen, a data scientist at Twitter.
Scalding is integrated with the Scala programming language and has a simple syntax, said Chen, who wrote a tutorial for programmers on coding recommendation engines with Scalding. He also has advice for anyone looking to design a recommendation engine.
First, you need three things:
- A metrics framework for tracking how users respond to recommendations, and making sure that a new change isn’t a worse one
- A system for evaluating different recommendation algorithms
A key issue is determining which metrics are important, said Chen. Metrics can sometimes be in conflict, or it can be unclear what a metric means. “For example, is clickiness a good thing? It’s not always obvious whether clickiness means users like what they’re seeing or whether they can’t find what they want,” he said.
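To make the metrics idea concrete, here is a minimal Python sketch of one such metric: click-through rate per algorithm, computed from a log of recommendation events. The event format, the variant names "baseline" and "candidate", and the function name are all hypothetical, chosen for illustration rather than taken from Twitter's systems.

```python
from collections import defaultdict

def click_through_rates(events):
    """Return {variant: clicks / impressions} from (variant, clicked) log events."""
    shown = defaultdict(int)
    clicked = defaultdict(int)
    for variant, was_clicked in events:
        shown[variant] += 1
        if was_clicked:
            clicked[variant] += 1
    return {variant: clicked[variant] / shown[variant] for variant in shown}

# Hypothetical log: each event records which algorithm produced the
# recommendation and whether the user clicked on it.
events = [
    ("baseline", True), ("baseline", False), ("baseline", False), ("baseline", False),
    ("candidate", True), ("candidate", True), ("candidate", False), ("candidate", False),
]
rates = click_through_rates(events)
print(rates)
```

A higher rate for one variant is only a starting point. As Chen notes, more clicks can mean users like what they see or that they can't find what they want, so a number like this would need to be read alongside other signals.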
It’s also important that humans monitor the system, said Chen. A new algorithm might recommend movie X to people who’ve watched movie Y, and weblogs might show that people are clicking heavily on movie X. But that may be because movie X is eye-catching or even eye-catching in a negative way, said Chen. Movie X might not necessarily be relevant, and relevance is hard to automatically detect, he said. “Not everything can be measured by a computer or detected from logs.”
Another consideration is ensuring that the recommendation system can handle conditions that are beyond the reach of the recommendation algorithms. MapReduce algorithms run as batch processes. If you capture data about a new user, it may take a while to make use of it and generate recommendations for that user, said Chen. You need some way of responding to new users in real time. For example, you can use item-to-item similarities that allow you to make recommendations without knowing anything about the user, he said.
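One way to picture the item-to-item approach is the Python sketch below. It assumes a batch job has access to past viewing histories, scores each pair of items by cosine similarity over co-occurrence counts, and stores the result so that a lookup at request time needs only the item being viewed, not the user. The data, the item names and the choice of cosine similarity are illustrative assumptions, not a description of any production system mentioned here.

```python
from collections import defaultdict
from itertools import combinations
from math import sqrt

def item_similarities(histories):
    """Cosine similarity between items, based on how often they co-occur."""
    item_count = defaultdict(int)
    pair_count = defaultdict(int)
    for history in histories:
        items = sorted(set(history))
        for item in items:
            item_count[item] += 1
        for a, b in combinations(items, 2):
            pair_count[(a, b)] += 1
    sims = defaultdict(dict)
    for (a, b), together in pair_count.items():
        score = together / sqrt(item_count[a] * item_count[b])
        sims[a][b] = score
        sims[b][a] = score
    return sims

def similar_items(sims, item, k=3):
    """Top-k neighbours of an item; ties are broken alphabetically."""
    ranked = sorted(sims.get(item, {}).items(), key=lambda kv: (-kv[1], kv[0]))
    return [name for name, _ in ranked[:k]]

# Hypothetical viewing histories, one list per existing user.
histories = [
    ["movie_x", "movie_y"],
    ["movie_x", "movie_y", "movie_z"],
    ["movie_y", "movie_z"],
    ["movie_x", "movie_w"],
]
sims = item_similarities(histories)  # the slow, batch-style step
print(similar_items(sims, "movie_x", k=2))  # the fast lookup for a brand-new user
```

The expensive step runs offline over everyone's history, while the lookup is a dictionary read, which is what lets a system respond to a user it has never seen before.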
One tricky issue is the problem of diversity — balancing the accuracy of the recommendation algorithms against the business case for introducing new content. “How do you avoid showing the same recommendations over and over again, and how do you encourage your users to explore [your] content?” said Chen. Many metrics penalize diversity, because similar items are more likely to get clicked on, yet you want your users to discover new interests and avoid getting tired of your content, he said.
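Chen doesn't prescribe a fix, but one common way to trade accuracy for variety is to re-rank the candidates greedily, discounting an item's score each time its category has already been picked. The Python sketch below illustrates that idea only; the scores, categories and the `penalty` parameter are hypothetical.

```python
def diversify(candidates, k, penalty=0.3):
    """Greedily pick k items, discounting categories that were already chosen.

    candidates: list of (name, relevance_score, category) tuples.
    """
    remaining = list(candidates)
    chosen = []
    seen = {}  # category -> number of items already picked from it
    while remaining and len(chosen) < k:
        best = max(remaining, key=lambda c: c[1] - penalty * seen.get(c[2], 0))
        chosen.append(best[0])
        seen[best[2]] = seen.get(best[2], 0) + 1
        remaining.remove(best)
    return chosen

# Hypothetical candidates, already scored by some relevance model.
candidates = [
    ("thriller_a", 0.90, "thriller"),
    ("thriller_b", 0.85, "thriller"),
    ("thriller_c", 0.80, "thriller"),
    ("documentary_a", 0.70, "documentary"),
    ("comedy_a", 0.60, "comedy"),
]
picks = diversify(candidates, k=3)
print(picks)
```

With `penalty=0` the function returns the three thrillers, which is the pure-accuracy ranking. Raising the penalty swaps in the documentary and the comedy, at the cost of showing items with lower predicted relevance.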
The rise of smartphones is driving recommendation engines to take new data dimensions into account. This increases the amount of data involved, but also tends to make the systems more useful. For example, Foursquare’s new recommendation engine takes time into account, said Chen. “If it’s late at night, you’re more likely to be looking for a bar or nightclub than a breakfast joint,” he said. “It’s a pretty obvious thing to do, but neat and useful.”
Smartphones also track location. “If you know that I’m walking around a particular location, you can infer a lot about what I might want to do or be interested in,” said Chen.
Recommendation Engine Startup
A pair of data analytics veterans are attempting to seize the moment with recommendation engine startup Fabless Labs. The company offers a recommendation platform that can be tailored to specific subject domains and data types. The system, Pastèque, is based on the R programming language for statistical analysis and runs on top of Hadoop and HBase.
Pastèque analyzes a data set and figures out the optimal recommendation algorithm. It also implements business rules inside the engine. “There are times when an algorithm may be mathematically correct but for a variety of social reasons you can’t present the choices that way,” said John West, Fabless Labs’ CTO and cofounder.
Recommendation engines were spawned by e-commerce websites, but the technology can be applied to a broader range of problems, said West. Recommendation engines are also called prediction systems because they attempt to predict what users will click on next. And predictions are useful for any problem involving deciding what to do next, said West. These include problems that are traditionally thought of as logistics.
“For example, if you want to hold a meeting in a big city and a lot of people are traveling from different [locations], how do you arrange the travel schedule to make sure — based on anticipated delays in the airline system — that people make it there on time?” said West. That turns out to be more of a recommendation problem when you implement it, he said. “We’ve done a test database of that with the Department of Transportation’s Flight History as a way of scale-testing our platform.”
Big Wine Data
The wine recommendation engine VinSpin is built on Pastèque. VinoEno collected a lot of data from wine tasters who can discriminate the individual flavor components of wines. The company created a database of the attributes of many wines, and runs a recommendation engine on that data set. “When people start telling you what they like and dislike, you can start figuring out which components of the wine, or, more importantly, which combinations of components, they are valuing, and you can make a very accurate recommendation,” said VinoEno CEO Kevin Bersofsky.
The key to VinSpin is that it looks for what people mean rather than what they say. “When a consumer walks into a wine shop, very often they’ll say ‘I like really fruity wine’,” said Bersofsky. “That’s not, when you look at the data, what they actually like. You find out when they say ‘fruity’, often they mean ‘sweet’,” he said.
Instead of consumers having to try to describe what they like in wines, they can simply identify specific wines they like. “All they have to do is tell me, ‘Okay, I like Ménage à Trois Red,’” said Bersofsky. The algorithm picks up on those few attributes, rates them higher in a recommendation engine, and spits out choices, he said. “This leads them down a path of other things they like.”
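The general idea of recommending from liked wines rather than from descriptions can be sketched in a few lines of Python. This is not VinoEno's algorithm, which the company doesn't detail here. It is a generic content-based approach under stated assumptions: each wine has a vector of attribute scores, a taste profile is the average of the wines a person likes, and the remaining wines are ranked by cosine similarity to that profile. The wine names, attributes and values are all invented.

```python
from math import sqrt

# Hypothetical attribute scores, standing in for ratings from trained tasters.
WINES = {
    "wine_a": {"sweet": 0.8, "fruity": 0.6, "tannic": 0.2},
    "wine_b": {"sweet": 0.7, "fruity": 0.5, "tannic": 0.3},
    "wine_c": {"sweet": 0.1, "fruity": 0.7, "tannic": 0.8},
    "wine_d": {"sweet": 0.9, "fruity": 0.4, "tannic": 0.1},
}

def cosine(u, v):
    """Cosine similarity between two sparse attribute vectors."""
    keys = set(u) | set(v)
    dot = sum(u.get(k, 0.0) * v.get(k, 0.0) for k in keys)
    norm = sqrt(sum(x * x for x in u.values())) * sqrt(sum(x * x for x in v.values()))
    return dot / norm if norm else 0.0

def taste_profile(liked):
    """Average the attribute vectors of the wines a person says they like."""
    keys = {k for name in liked for k in WINES[name]}
    return {k: sum(WINES[name].get(k, 0.0) for name in liked) / len(liked) for k in keys}

def recommend(liked, k=2):
    """Rank the wines not yet liked by similarity to the taste profile."""
    profile = taste_profile(liked)
    others = [name for name in WINES if name not in liked]
    return sorted(others, key=lambda name: -cosine(profile, WINES[name]))[:k]

suggestions = recommend(["wine_a"])
print(suggestions)
```

Because the profile is built from the wines themselves, a drinker who says "fruity" but picks sweet wines ends up with a profile weighted toward sweetness, which is the gap between what people say and what they mean that Bersofsky describes.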
Eric Smalley is a freelance writer in Boston. He is a regular contributor to Wired.com. Follow him on Twitter at @ericsmalley.