Guide to Predictive Analytics

By Eric Smalley

March 29, 2013

People have used computers to forecast aggregate behavior like that of markets for almost as long as there have been computers. Making predictions about individuals is a much more difficult matter. The focus of predictive analytics is rapidly shifting to gazing at—and betting on—the future of individuals.

“It’s not forecasting how many ice cream cones you’re going to sell next quarter; it’s which individual person is likely to be seen eating ice cream,” said Eric Siegel, author of Predictive Analytics: The Power to Predict Who Will Click, Buy, Lie, or Die.

Today’s analytics technologies and ample sources of data have made it more practical to predict probable individual behaviors and many other potential outcomes.

“Predictive analytics is, by definition, the most actionable form of analytics,” said Siegel. “Instead of just predicting the future, you can influence it.”

A growing number of vendors, anchored by heavyweights IBM (through its acquisition of SPSS in 2009) and SAS, supply predictive analytics software. The statistical analysis algorithms these companies offer have been around for years, but two important elements have changed how they are used. First, the algorithms have far more data to work with beyond the corporate data held in traditional relational databases, including web logs tracking online user behavior; the text, audio and video musings of millions; email messages; IT system data, such as application performance logs and security intrusions; and data from sensors measuring location, time, temperature and wear on machines.

The second change is technological. Distributed architectures have enabled Hadoop, an open source software library, and NoSQL databases, a class of database management systems that do not use the relational database model, to attack very large datasets. These technologies and others, like in-memory databases, have ushered in a wave of innovation and a proliferation of use cases for predictive analytics.

Use Cases

Predictive analytics is useful in any circumstance where you need to make a decision based on the likelihood of an individual—like a person, retail outlet, or product—behaving in a certain way. Is Sam likely to buy the latest fashion? Is Juan inclined to vote Libertarian? Does Julie pose a risk to the community? Is WonderWidget going to go viral? The potential applications are growing. Here are some snapshots:

Marketing and sales. A marketing use case is churn modeling: identifying which customers are likely to leave. This helps businesses target offers like heavy discounts or free phones that would be too expensive to offer in a less targeted way. Call center operations can identify the best response to a customer request or complaint to retain that customer and identify new opportunities for a sale. Another example: recommendation engines from social networks and e-commerce companies (think Facebook, Amazon, Netflix and others) that make automated suggestions based on your online behavior and that of customers making similar choices.

Health care. Providing doctors with insights on how an individual’s condition compares to that person’s history and the outcomes of other patients is one use case. For example, researchers at Memorial Sloan-Kettering Cancer Center are experimenting with IBM analytics to scan historical data about cancer case outcomes and assess patient treatment options, with expectations that the system’s results will improve over time.

Manufacturing. Predictive analytics has the potential to sharpen forecasts about customer demand by rapidly harvesting new data sources, thus making for more efficient supply chains. Analytics applied to sensor data can detect the wear on factory equipment, jet engines and other mechanical devices, alerting operators about maintenance or replacement needs.

Human resources. By combining data about employee performance, personality tests about worker potential, data from office applications about employee contributions, and traditional metrics about compensation, human resources executives are engaging in more sophisticated approaches to workforce management. Companies can project top performers among existing workers, and predict a job candidate’s suitability for a role like customer service or sales.

Politics. A striking illustration of the power of predictive analytics comes from President Obama’s 2012 campaign. The re-election team used analytics to identify which undecided voters were inclined to vote for Obama and then determined which of those could be persuaded by contact from the campaign as well as those who might be put off by contact. They also used analytics to decide how to campaign to individual voters: whether to send a volunteer to the door, make a phone call or send a brochure, said Siegel. 

Law enforcement. Another use case that’s been getting attention is predictive policing. Rather than predicting when a person will commit a crime as in science fiction’s Minority Report, predictive policing identifies when a particular place is likely to experience a rash of break-ins or other crimes. Los Angeles, Santa Cruz, Calif., Memphis and Charleston, S.C., have lowered their crime rates by using the technology to increase patrols in the right places at the right times.

The legal profession. Civil litigation, with its mountain ranges of text documents and other data, represents another ripe field, both for those seeking to avoid court and those in court already. Lex Machina, a startup out of Stanford University, has developed a tool to identify the risk of patent lawsuits in situations where executives are evaluating corporate acquisition targets. Meanwhile, in a legal dispute involving the sale of a national restaurant chain, a Delaware judge last year ordered the use of predictive coding, a data analysis technique that would accelerate the process of identifying relevant documents, instead of using phalanxes of junior lawyers to do the work.

Interestingly, not all use cases require highly accurate predictions. Direct marketing is one example. The technology doesn’t provide much confidence that any individual will make a purchase, but it can boost a campaign’s response rate from, say, 1 percent to 3 percent, said Siegel. “That’s an example where it’s really not about accurate prediction, it’s about tipping the balance and playing the numbers game better,” he said. “And you can make a dramatic improvement on the bottom line by playing the numbers game better.”
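To see why a jump from 1 percent to 3 percent matters, here is a rough sketch of the arithmetic in Python. Only the two response rates come from Siegel's example. The campaign size, the profit per response, the cost per contact and the function name campaign_profit are hypothetical values chosen purely for illustration.

```python
def campaign_profit(contacts, response_rate_pct, profit_per_response, cost_per_contact):
    """Net profit of a direct marketing campaign (all inputs are assumptions)."""
    responses = contacts * response_rate_pct / 100
    return responses * profit_per_response - contacts * cost_per_contact

# Hypothetical campaign: 100,000 contacts, $100 profit per response, $2 per contact.
untargeted = campaign_profit(100_000, 1, 100.0, 2.0)  # 1 percent response rate
targeted = campaign_profit(100_000, 3, 100.0, 2.0)    # 3 percent response rate

print(f"untargeted: {untargeted:,.0f}")
print(f"targeted:   {targeted:,.0f}")
```

Under these assumed figures the untargeted campaign loses money and the targeted one is profitable, even though 97 percent of the people contacted still do not respond.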

How Predictive Analytics Works

Predictive analytics is driven by machine-learning algorithms, principally decision trees, log-linear regression and neural networks. These algorithms perform pattern matching: they determine how closely new data matches a reference pattern. The algorithms are trained on real data, and then compute a predictive score for each individual they analyze. This way, the systems learn from an organization’s experience, said Siegel.
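The train-then-score loop described above can be sketched in a few lines of Python. This is a minimal illustration, not any vendor's method: it uses logistic regression (a close relative of the regression models mentioned above) fitted by plain gradient descent, and the tiny churn dataset, the two features and the function names are all invented for the example.

```python
import math

def sigmoid(z):
    """Numerically stable logistic function, maps any number into (0, 1)."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def train_logistic(rows, labels, epochs=2000, lr=0.1):
    """Fit weights and a bias by stochastic gradient descent on log loss."""
    weights = [0.0] * len(rows[0])
    bias = 0.0
    for _ in range(epochs):
        for x, y in zip(rows, labels):
            p = sigmoid(bias + sum(w * xi for w, xi in zip(weights, x)))
            err = p - y  # gradient of the log loss with respect to z
            weights = [w - lr * err * xi for w, xi in zip(weights, x)]
            bias -= lr * err
    return weights, bias

def score(model, x):
    """Predictive score: estimated probability that this individual churns."""
    weights, bias = model
    return sigmoid(bias + sum(w * xi for w, xi in zip(weights, x)))

# Hypothetical history: [support calls last month, years as a customer].
history = [[5, 1], [4, 1], [6, 2], [0, 5], [1, 6], [0, 4]]
churned = [1, 1, 1, 0, 0, 0]  # 1 = the customer left

model = train_logistic(history, churned)
risky = score(model, [6, 1])  # many calls, short tenure
loyal = score(model, [0, 6])  # no calls, long tenure
print(f"risky customer score: {risky:.2f}")
print(f"loyal customer score: {loyal:.2f}")
```

The scores are what a business would act on, for example by ranking customers and sending a retention offer only to those near the top of the list.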

The current trend is to use multiple predictive analytics models. Ensembles of models perform better than any individual model. “It turns out that if you have a bunch of models come together and vote—anywhere between several and several thousand—you get the wisdom of the crowd of models; you get the collective intelligence of the models,” said Siegel.
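The voting idea Siegel describes can be illustrated with a simple majority vote. In this sketch the three "models" are hand-written rules standing in for trained models, and every name and threshold is hypothetical. A real ensemble would combine many trained models, often by averaging their scores rather than counting yes/no votes.

```python
from collections import Counter

# Three stand-in models; each looks at a customer record and votes on churn.
def many_support_calls(customer):
    return customer["support_calls"] >= 3

def short_tenure(customer):
    return customer["years"] < 2

def recent_price_increase(customer):
    return customer["price_increase"]

def majority_vote(models, customer):
    """Return the prediction that most models agree on."""
    votes = [model(customer) for model in models]
    return Counter(votes).most_common(1)[0][0]

# An odd number of yes/no voters guarantees there is never a tie.
ensemble = [many_support_calls, short_tenure, recent_price_increase]

frustrated = {"support_calls": 4, "years": 5, "price_increase": True}
newcomer = {"support_calls": 0, "years": 1, "price_increase": False}

print(majority_vote(ensemble, frustrated))  # two of three models vote True
print(majority_vote(ensemble, newcomer))    # two of three models vote False
```

Note that in the second case one model is outvoted: the ensemble's answer does not depend on any single model being right.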

Challenges in Implementation

The technology is innovative and the use cases have provoked a lot of interest—but making predictive analytics work in business is a challenging process that requires a serious assessment of an enterprise’s strategic goals, its appetite for investment and a willingness to experiment.

The first challenge in using predictive analytics is determining what technology and level of resources to deploy. This requires assessing the size of the problem an organization is trying to solve, said Omer Artun, CEO and founder of AgilOne, a cloud-based predictive marketing analytics company.

Projecting potential ROI is essential. For example, a large financial institution aiming to improve performance in a multi-billion-dollar market by a few percentage points stands to gain several hundred million dollars, so it can afford to spend a couple million dollars a year on an analytics team and custom modeling, Artun said. Conversely, a retail company looking to improve sales by a few percentage points might stand to gain a few million dollars, and therefore should probably spend less than six figures and buy a commercial analytics package, he said.
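A back-of-the-envelope version of Artun's comparison might look like the following Python sketch. The specific figures (a $10 billion market, $100 million in retail sales, a 3 percent improvement, and $2 million versus $90,000 in annual spending) are assumptions picked to fall inside the ranges he describes; they are not numbers from AgilOne.

```python
def projected_gain(base_dollars, improvement_pct):
    """Dollar gain from improving a base figure by a whole-number percentage."""
    return base_dollars * improvement_pct // 100

def gain_per_dollar(gain, annual_spend):
    """How many dollars of projected gain each dollar of spending buys."""
    return gain / annual_spend

# Assumed scenario 1: financial institution with a custom analytics team.
bank_gain = projected_gain(10_000_000_000, 3)
bank_ratio = gain_per_dollar(bank_gain, 2_000_000)

# Assumed scenario 2: retailer buying a commercial analytics package.
retail_gain = projected_gain(100_000_000, 3)
retail_ratio = gain_per_dollar(retail_gain, 90_000)

print(f"bank:   gain ${bank_gain:,}, {bank_ratio:.0f}x annual spend")
print(f"retail: gain ${retail_gain:,}, {retail_ratio:.0f}x annual spend")
```

With these assumed inputs, a $2 million team would consume about two-thirds of the retailer's entire projected gain, which is the reasoning behind matching the level of investment to the size of the problem.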

Other challenges in implementing predictive analytics are cultural and organizational, said Siegel. It’s important that the analytics are clearly connected to business goals and are aligned with business processes, particularly when the technology introduces operational changes. “It’s easy to make a model that’s cool and predicts well but doesn’t necessarily provide organizational value,” he said.

That means figuring out how the results of predictive analytics fit into existing business processes.

Organizations need to avoid what Artun calls “the last mile problem.” Sometimes a company will develop the right model and it will perform beautifully, but the results end up sitting in a file on a computer somewhere. The key is making sure results are delivered to the point in a business process where decisions are made, he said. People developing predictive analytics systems should take an API-first approach, meaning results should flow from the model to the point of execution in a seamless and real-time or near-real-time manner, said Artun. “That makes a huge difference in the outcome.”

These issues all point to the significance of the people who do the work. It’s also important for analytics teams to understand the business problems they’re tackling. Data scientists often over-optimize or pay attention to the wrong aspects of a business problem, Artun said. “It’s an art and a science,” he said.

Scott Nicholson, who left LinkedIn to become chief data scientist at Accretive Health, said in a recent presentation that it is particularly challenging to find people and that he looks to build a team with members from diverse backgrounds, such as biology, physics, computer science and electrical engineering. They have to be familiar with analytical models, but it is more important that they possess investigative zeal.

The right people are those “who think about problems and then are excited about going out and finding data to answer questions,” he said. “This is not about algorithms. It’s not about industries. It’s just about being curious about data.”

With the continued spread of predictive analytics, it also pays to monitor public perceptions of its applications. Privacy is a concern with predictive analytics because data drives accuracy and the focus is on individuals. The more you know about someone, the better you can predict his or her behavior. In 2012, these concerns prompted members of Congress and the Federal Trade Commission to ask questions about businesses’ use of consumer data, and policy debates are likely to continue both in the United States and in other countries. There are also legal questions when predictive analytics plays a role in police stops, sentencing and parole decisions.

From Prediction to Persuasion

The future of predictive analytics is persuasion. Organizations are beginning to tune their predictive analytics efforts to go beyond just assessing whether and how to take action. They are now also using their systems to predict whether their actions will have a significant impact. “There’s a difference between predicting behavior and predicting whether you’ll influence that behavior, which is persuasion,” said Siegel.

Persuasion analytics borrows the concept of a control group from medical research to assess action versus inaction. Persuasion is more difficult analytically than prediction, but it more directly informs decisions, Siegel said. Persuasion was the essence of the Obama campaign’s highly successful analytics effort. Persuasion techniques are also used in targeted marketing, online ad selection, dynamic pricing, credit risk assessment, social services and personalized medicine.
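The control-group idea can be made concrete with a small Python sketch. For each segment it compares the response rate of people who were contacted with that of a control group who were not; the difference is the estimated effect of the contact itself. The segment names and all counts below are invented for illustration and do not come from the Obama campaign or any other real effort.

```python
def uplift(treated, control):
    """Response rate when contacted minus response rate when left alone.

    Each argument is a (responders, group_size) pair.
    """
    return treated[0] / treated[1] - control[0] / control[1]

# Hypothetical test results per segment: (responders, group size).
segments = {
    "persuadable": {"treated": (300, 1000), "control": (200, 1000)},
    "sure_thing": {"treated": (500, 1000), "control": (500, 1000)},
    "put_off": {"treated": (150, 1000), "control": (250, 1000)},
}

uplifts = {
    name: round(uplift(groups["treated"], groups["control"]), 3)
    for name, groups in segments.items()
}

# Contact only the segments where contact itself improves the outcome.
worth_contacting = [name for name, value in uplifts.items() if value > 0]

print(uplifts)
print(worth_contacting)
```

A plain prediction model would rank the "sure_thing" segment highest because it responds most often. The uplift comparison shows that contacting it changes nothing, and that contacting the "put_off" segment makes matters worse.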

Eric Smalley is a freelance writer in Boston. He is a regular contributor to Wired.com. Follow him on Twitter at @ericsmalley.
