Kaggle Visualization Contest Yields Insights on Influence at Harvard Business Review

by Stephanie Overby   |   March 1, 2013 11:55 am

The winner in Harvard Business Review’s Kaggle design contest uses circle size to show the relative influence of the top articles the magazine has published. Image courtesy of HBR.org.

For nearly a century, the Harvard Business Review had been a largely text-laden magazine, from the front cover to the back page. In 2010, its editorial leaders unveiled a fresh look for the staid academic management journal that included, among other new visual elements, a recurring data visualization feature, in an attempt to catch up with its audience.

“Our reader is not a tweed-clad gentleman smoking a pipe in a wood-paneled room; our reader is out there making a difference in business and thinking about new ways to do things,” says James de Vries, who oversaw the redesign and now serves as HBR’s creative director. “Data has become a crucial tool for business, and the language of visualization is something that our audience understands. It’s right in our court, in a sense.”

The monthly data visualization features were mostly produced in-house by HBR’s small editorial staff. But as the magazine’s 90th birthday approached in November, de Vries seized an opportunity to incorporate outside analytics expertise. After a chance meeting with Kaggle executives at SXSW Interactive last year, de Vries decided to use the company’s platform to launch a competition for the best data visualization of the magazine’s complete archives. Harvard Business School’s Baker Library had just delivered metadata and abstracts for the 12,000 HBR articles published since the magazine’s 1922 launch. The magazine handed over the dataset along with an open-ended challenge to “find the story behind the data.”

“We thought a great way to test [Kaggle] would be to use our own data set, which was pretty flawed, and put it to them to see if they could do anything with it,” de Vries says.

With the anniversary issue deadline approaching, time was tight. HBR’s editorial team spent three weeks working with Kaggle to set up the parameters of the contest. The competition itself lasted just two weeks. Kaggle members could analyze the data, visualize patterns and relationships, and submit their proposals, which other Kaggle members could then comment and vote on. HBR made the final selection of a winner and two runners-up from the 10 most popular visualizations.

The HBR data was a particular challenge. Kaggle participants are more accustomed to working with numbers than words. And the dataset turned out to be messy. “There was a bunch of repeated information,” de Vries says. “But the people at Kaggle were able to go in razor-like and fix it.”

The response was robust. “We kind of expected a few guys working in their basements to hand us some rinky-dinky stuff, but many of them were really talented,” says Scott Berinato, a senior editor who works on the data visualization features. “I was impressed with the thoughtful ways they went about trying to make sense of the data.”

The entries included “a semantic landscape,” a map representing global citations, and a proposal that combined aspects of these and more using natural language processing. Not all the entries were stellar. The really bad ones were dismissed immediately. The more difficult part was determining which of the leading analyses would receive the $1,500 prize and be featured in the magazine.

“There was a fair bit of argument internally about which one was the best visual representation,” says de Vries. “I was looking for something visually surprising that would also respond to multiple levels of inquiries.”

“We had a very specific editorial mission in mind: to look at the history of management ideas and not just the history of HBR itself,” Berinato adds. “The ones we honed in on looked at the ideas themselves.”

The winning visualization was developed by Eamonn O’Loughlin, a consultant working with business analytics for Accenture in Ireland.

O’Loughlin used circles to represent the relative number of citations the top articles received (the 53 most cited articles had at least 1,000 citations). The circles start in the 1950s, and the biggest (most cited) ones are from the 1990s, covering topics such as global business (“The Competitive Advantage of Nations,” by Michael E. Porter, and “The Core Competence of the Corporation,” by C.K. Prahalad and Gary Hamel). Circles for the important articles are color-coded: performance management is green, IT innovation is brown and customer-related articles are orange, for example.

“It’s a lovely graphic that provides a lot of information in a very accessible way. He gave us an axis on ideas, on authors, on citations,” says Berinato. “Our guiding principle for a good visualization is that you can look at it and get it, but then spend more time with it if you want to.”

While the 90th anniversary data visualization was static, both in print and online, the HBR team has begun considering the different formats in which its data visualizations can be presented. “Some material is better off in an animated or multi-state version,” says de Vries. Over the last few issues, Berinato has been working on visualizations that will vary in presentation from the printed page to the iPad. “We’re starting to conceive of these interactively from the get-go,” he says.

De Vries is interested in working with Kaggle again. “They aren’t the only ones who run this kind of competition, but the thing I’ve found most interesting is that whether the prize is big or small, there’s some kind of fabulous magic that happens with how motivated people are,” he says. “They want to be the best. And that’s the real x factor.” Next time around, however, he would build in a much longer lead time and make sure the data set HBR offered was one that would work best in the Kaggle environment.

Stephanie Overby, a contributing editor at Data Informed, is a Boston-based freelance writer. Follow her on Twitter: @stephanieoverby
