Riot Games is the developer of the most popular video game in the world, League of Legends. The game, which is hosted entirely online and is free to play, sees on average five million players logged in simultaneously during peak hours. More than 32 million different people spend more than 1 billion hours per month playing the game.
But to keep those players coming back, and ultimately convert them into paying customers, Riot has to keep its product fresh, engaging and performing well. League of Legends has to be fun.
To get a faster look into the mountains of data its millions of users generate daily, Riot Games, based in Santa Monica, Calif., needed a product that would let the designers and developers who keep the game fun get fast access to the data without waiting on an engineer to pull reports.
Barry Livingston, the director of engineering for the big data group at Riot Games, said the product they found is Platfora, an interactive business intelligence suite that was released for general availability in March.
Previous efforts to analyze data took too long and the results did not reach enough end-users. “What we really want to do is to enable people who are focused on some particular aspect of the domain to kind of self-serve,” Livingston said. “Platfora is a very disruptive technology on that front,” he said. “Getting to insight is always an iterative process. It’s not something where you do it perfectly the first time. I think more than anything else, the benefit of Platfora is that ability to iterate quickly on the data.”
Keeping Score of Many Matches
League of Legends, Riot’s only product, generates an enormous volume and variety of data. Livingston said each match in the game averages 45 minutes; each session creates about five megabytes of data once it’s been compressed.
Riot tracks everything about each session, like how players are using their champions, how each of the 112 champions fares against the others and which abilities, whether it’s a move to stun an opponent or get a boost of speed or power, players use to knock out competitors.
It’s all in the pursuit of keeping the game a fun environment, Livingston said. If it becomes apparent that a certain champion wins far more often than others, or that specific abilities are disproportionately powerful, Riot tweaks the gameplay to level the playing field.
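The balance check described above can be sketched in a few lines of Python. This is only an illustration, not Riot's actual method: the record format, the champion names, the 50 percent baseline and the five-point tolerance are all assumptions made for the example, and the sample data is invented.

```python
from collections import defaultdict

def win_rates(records):
    """Compute the win rate per champion from (champion, won) records."""
    wins = defaultdict(int)
    games = defaultdict(int)
    for champion, won in records:
        games[champion] += 1
        if won:
            wins[champion] += 1
    return {champion: wins[champion] / games[champion] for champion in games}

def flag_overpowered(rates, baseline=0.5, tolerance=0.05):
    """Return champions whose win rate exceeds baseline + tolerance.

    The baseline and tolerance are assumed values for illustration only.
    """
    return sorted(c for c, rate in rates.items() if rate > baseline + tolerance)

# Invented sample data with hypothetical champion names.
records = [
    ("champion_a", True), ("champion_a", True),
    ("champion_a", True), ("champion_a", False),
    ("champion_b", True), ("champion_b", False),
    ("champion_c", False), ("champion_c", False),
    ("champion_c", False), ("champion_c", True),
]

rates = win_rates(records)
flagged = flag_overpowered(rates)
print(rates)
print(flagged)
```

In practice a check like this would also need a minimum number of games per champion before a win rate is trusted, since small samples swing widely.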
This is crucial because the four-year-old game is free. Privately held Riot Games makes money through microtransactions, in which League of Legends players unlock certain champions and buy access to certain powers exactly when they want to use them in the game. Players can also pay to change the themed outfits champions wear, called skins. Riot has to convince players to stick around long enough to become invested in the game and the community that plays it, and then to pay to unlock their favorite champions right away and to buy cool-looking skins.
A Quest to Avoid ETL
More than one billion hours of gameplay each month produce a lot of data to store and analyze, on the order of petabytes. Livingston said that as League of Legends, launched in 2009, became popular, it quickly outgrew its MySQL system and had to move to Hadoop.
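A back-of-the-envelope calculation shows how the figures Livingston cited add up to that scale. The sketch below uses the article's numbers (one billion hours a month, 45-minute matches, about five megabytes per session); the 10 players per match is an assumption, and because the article does not say whether a "session" is one match or one player's part in a match, both readings are computed.

```python
HOURS_PER_MONTH = 1_000_000_000  # player-hours per month, from the article
MATCH_MINUTES = 45               # average match length, from the article
MB_PER_SESSION = 5               # compressed size per session, from the article
PLAYERS_PER_MATCH = 10           # assumption: not stated in the article

MB_PER_PB = 1_000_000_000        # decimal units: 1 PB = 10^9 MB

def petabytes(sessions):
    """Convert a session count into petabytes of compressed data."""
    return sessions * MB_PER_SESSION / MB_PER_PB

# Each player-hour covers 60/45 of a match for that one player.
player_sessions = HOURS_PER_MONTH / (MATCH_MINUTES / 60)
matches = player_sessions / PLAYERS_PER_MATCH

pb_if_session_is_match = petabytes(matches)
pb_if_session_is_player = petabytes(player_sessions)

print(f"per-match reading:  {pb_if_session_is_match:.2f} PB per month")
print(f"per-player reading: {pb_if_session_is_player:.2f} PB per month")
```

Under these assumptions the monthly total comes to roughly 0.7 petabytes if a session is a whole match and roughly 6.7 petabytes if it is a single player's session, so either reading reaches petabyte scale within a few months.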
But using pure Hadoop presents its own problems when it comes to relational data, Livingston said, offering a Star Wars metaphor to describe the issue. “Hadoop is not very good at joining things,” he said. “It can do it, but it’s kind of like the Death Star. It destroys planets, that’s what it’s made for. But it’s not great for the blaster fight in the hallway. When you’re trying to do quick things, that’s really not great for Hadoop.”
To query the data quickly, Livingston had his engineers use Hive. But trying to teach Hive to people who aren’t engineers was “daunting.” In addition, Livingston said he wants everyone, from game developers and community managers to the marketing department, to be able to reach the data Riot collects on their own and make their own decisions, without having to perform a lot of extract, transform and load (ETL) processes.
“What we’re specifically after is that I don’t want to have to fight with a [data] model to get some basic insight out of things,” Livingston said. “I want to understand that, ‘Wow, players are doing X.’ I don’t want to have to model all of that, move data around 15 different ways and then pop it up inside.
“Platfora had enabled that for us,” he said. “It is the piece that we were missing. This was the vision that we had, being able to go from end to end very quickly, produce these graphs and produce this insight.”
Platfora’s Hadoop Interplay
Platfora is built with three layers, according to Ben Werther, the company’s founder and CEO. It reads data stored in the Hadoop Distributed File System (HDFS), which allows any type of data to be stored without worrying about what’s going to be used for analytics later.
“One nice thing about using HDFS in that way and about Hadoop in general, is you can treat it as a data reservoir,” Werther said. “You can land raw data in there without having to think about modeling, or architecting it to figure out what questions you want to answer ahead of time.”
That data is distilled, based on user specifications, into compressed columnar pieces (a changeable, software-based data mart) and brought into an in-memory layer so users can query it interactively. This is what Platfora calls a “lens.”
“It has a schema, but that schema is overlaid dynamically on the underlying data, so that it can turn that raw data into useful, structured things in memory,” Werther said.
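The general idea Werther describes, a schema applied to raw data at read time and the result held in memory by column, can be illustrated with a short Python sketch. It is not Platfora's implementation: the event fields, the schema and the function names are all hypothetical, and the sample events are invented.

```python
import json

# Hypothetical raw events, landed as-is with no upfront modeling.
RAW_EVENTS = [
    '{"match_id": "1", "champion": "champion_a", "duration_min": "44", "extra": "x"}',
    '{"match_id": "2", "champion": "champion_b", "duration_min": "46"}',
    '{"match_id": "3", "champion": "champion_a", "duration_min": "45"}',
]

# The schema is chosen at read time: field name -> type to cast to.
LENS_SCHEMA = {"champion": str, "duration_min": int}

def build_lens(raw_lines, schema):
    """Overlay a schema on raw JSON lines and return in-memory columns."""
    columns = {name: [] for name in schema}
    for line in raw_lines:
        record = json.loads(line)
        for name, cast in schema.items():
            value = record.get(name)
            # Fields outside the schema are ignored; missing ones become None.
            columns[name].append(cast(value) if value is not None else None)
    return columns

def mean_by(columns, key, measure):
    """Average a numeric column grouped by another column."""
    totals = {}
    for group, value in zip(columns[key], columns[measure]):
        if value is None:
            continue
        running_sum, count = totals.get(group, (0, 0))
        totals[group] = (running_sum + value, count + 1)
    return {group: s / n for group, (s, n) in totals.items()}

lens = build_lens(RAW_EVENTS, LENS_SCHEMA)
average_duration = mean_by(lens, "champion", "duration_min")
print(lens)
print(average_duration)
```

Changing the question means changing `LENS_SCHEMA` and rebuilding the columns from the same raw events, rather than remodeling the stored data.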
The third layer is an HTML5-based Web client that analysts use to explore and visualize the data. Together, the three layers are capable of plotting millions of data points.
Werther said the company learned several things over the course of its beta testing with companies like Riot Games and Edmunds.com, the car pricing and information site. Because the Platfora system is built to be self-service for business users, there is a learning curve around what is possible when interactively querying Hadoop.
“It really changes the model where traditionally there is an assumption when you use Hadoop you have to do all these manual things and [have IT] build these data marts out,” Werther said, adding that while “there is a lot of excitement about the idea that users can directly get at it and drive the show,” it takes time to get used to that access.
“Helping [business users] understand what that means, and the power of that, and doing it in a way that’s easy to understand, that’s something we’ve had to work through,” he added.
Werther said those processes are optimized in the general availability release that was announced on March 26.
“Hadoop is a technology that is now rapidly becoming mainstream,” Werther said. “The question is shifting to business value. How do you land all this data in Hadoop and make it useful for the organization? People are building data reservoirs, and the number one question is, how do I make this data reservoir useful for my business? To do so would unlock so much value.”