You are about to be part of something amazing, something never before seen in business. You, my friend, are about to predict the future!
That’s right, the dreams of a thousand generations are about to happen and you, my dear data wrangler, are at the center of it. You are the one who will make all this magic happen.
You can’t do this with just the wave of a hand; it takes careful planning and direction. You don’t want to be stumbling through a third-rate stage show. You want to be the data-equivalent of Harry Potter, living in a Wizarding World where anything is possible.
In today’s business environment, everyone expects their data to be a magic mix that brings forth deep insights and changes the course of their business. When Gartner released its planned big data research this year, it pointed out that the entire business intelligence and analytics market is undergoing a major change. Fed by data coming from sources as diverse as mobile devices, smart machines, digital business, and just about anything connected to the cloud, “big data analytics” stands at the heart of just about everything. And Ventana Research notes that predictive analytics is “the most significant technology trend in business today.” That same research indicates that marketing and operations are most likely to apply predictive analytics as part of their workflow.
However, there is typically a gap between the person who builds the predictive analytics and the individual who ultimately uses the output to make decisions or gather insights. Ventana found that only about half of those who design and deploy predictive analytics (52 percent) are also those who use the output.
This gap represents a specific data challenge for two reasons. First, predictive output is useless if you aren’t asking the right questions, and asking the right questions requires an in-depth understanding of the underlying business problem. Second, the algorithms designed to conduct predictive analytics can deliver actionable insights only if the underlying data that feed them are accurate. This means all information, from structured data in a relational database like SQL Server or Oracle to unstructured data that sits in Hadoop, must be quality data: prepared, cleansed, and matched. Otherwise, it’s like trying to take Professor Snape’s potions class without having purchased the right ingredients.
This doesn’t apply only to the B2C marketers who, traditionally, have been adept at using big data for key analytics practices. Forrester notes that 53 percent of B2B decision makers do their research online rather than speaking to a salesperson. This means that B2B marketers must develop the same analytics skills as their B2C brethren.
True predictive analytics is possible only when you do two things correctly: Ask the right questions and have the right data. Without addressing those two key issues, your “magic” will make you look like a foolish side-show magician.
Where the ‘Magic’ Happens
In the aforementioned Forrester report, analyst Laura Ramos writes, “More data doesn’t always deliver more valuable insight.” She advises first deciding exactly which problem needs solving, and then determining the data needed to solve it. Of course, that data can be in any of a number of different locations, so it’s important to have a strategy for combining data from different sources.
Ramos also notes that “69 percent of unstructured data – like service records and call logs with rich customer history and insight – never make it into business intelligence systems.” Without the ability to combine and use all this data, be it business data, behavioral data, social data, or technical data, trying to execute any predictive model is like Harry Potter trying to lead the wizarding world without ever having attended Hogwarts. Where would he be without his magic education?
Once you know the question and the type of data you need, you just need to find it all. If it’s scattered throughout the organization, you’ll need a tool that lets you access everything, no matter the type of data or its location. Then you need to standardize and normalize that data. This is where the real “magic” happens. For example, you may find that, in one data source, a customer’s name is broken into discrete components – prefix, first name, middle name, and last name – while in another, the entire name is in a single field. You will need to either merge or split those fields in one of the two sources.
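The merge-or-split step described above can be sketched in a few lines of Python. This is a minimal illustration, not a production name parser: the `Name` class, the `PREFIXES` set, and both function names are hypothetical, and the splitting rule (first token is the first name, last token is the last name, anything in between is the middle name) is a simplifying assumption that will not hold for every real-world name.

```python
from dataclasses import dataclass

# Hypothetical prefix list; a real system would use a fuller reference table.
PREFIXES = {"mr", "mrs", "ms", "dr", "prof"}


@dataclass
class Name:
    prefix: str = ""
    first: str = ""
    middle: str = ""
    last: str = ""


def merge_name(name: Name) -> str:
    """Join the discrete components into a single full-name field."""
    parts = [name.prefix, name.first, name.middle, name.last]
    return " ".join(p for p in parts if p)


def split_name(full: str) -> Name:
    """Split a single full-name field into discrete components.

    Assumption: first token = first name, last token = last name,
    everything in between = middle name.
    """
    tokens = full.split()
    prefix = ""
    if tokens and tokens[0].rstrip(".").lower() in PREFIXES:
        prefix = tokens.pop(0)
    if not tokens:
        return Name(prefix=prefix)
    if len(tokens) == 1:
        return Name(prefix=prefix, first=tokens[0])
    return Name(
        prefix=prefix,
        first=tokens[0],
        middle=" ".join(tokens[1:-1]),
        last=tokens[-1],
    )
```

Whichever direction you choose, apply it to every source so that all records share one name layout before you attempt any matching.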
Once you have normalized the data, you can conduct some level of heuristic matching to link and de-duplicate records. Only then will you be able to connect the browsing data of Harry Potter on the Ollivander website, which shows him looking at a maple wand with a unicorn hair core, with the offline purchasing information of H. Potter, who purchased a wand made of holly and phoenix feather. Then you can analyze the difference between his online and offline habits. You also need to know whether Harry Potter’s home address at 4 Privet Drive matches another record that appears as 4 Privet Dr. in a different source. In addition to matching addresses, you need to match against phone numbers, email addresses, and other available data elements.
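To make the matching idea concrete, here is a rough Python sketch of one possible heuristic. It is an illustration under stated assumptions, not a description of any particular product: the abbreviation table, the field names (`name`, `address`, `email`), and the weights (0.4, 0.4, 0.2) are all invented for the example, and the name comparison simply uses the standard library’s `difflib.SequenceMatcher`.

```python
import re
from difflib import SequenceMatcher

# Hypothetical abbreviation table; real address standardization uses
# much larger postal reference data.
STREET_ABBREVIATIONS = {"drive": "dr", "street": "st", "avenue": "ave", "road": "rd"}


def normalize_address(address: str) -> str:
    """Lowercase, strip punctuation, and abbreviate common street types."""
    tokens = re.sub(r"[^\w\s]", "", address.lower()).split()
    return " ".join(STREET_ABBREVIATIONS.get(t, t) for t in tokens)


def name_similarity(a: str, b: str) -> float:
    """Return a fuzzy similarity between 0.0 and 1.0 for two names."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def match_score(rec_a: dict, rec_b: dict) -> float:
    """Combine several signals into one score (weights are assumptions)."""
    score = 0.0
    if normalize_address(rec_a["address"]) == normalize_address(rec_b["address"]):
        score += 0.4
    email_a, email_b = rec_a.get("email"), rec_b.get("email")
    if email_a and email_b and email_a.lower() == email_b.lower():
        score += 0.4
    score += 0.2 * name_similarity(rec_a["name"], rec_b["name"])
    return score


online = {"name": "Harry Potter", "address": "4 Privet Drive"}
offline = {"name": "H. Potter", "address": "4 Privet Dr."}
other = {"name": "Hermione Granger", "address": "12 Grimmauld Place"}

# The two Potter records share a normalized address and a similar name,
# so they score higher than the unrelated record.
print(match_score(online, offline) > match_score(online, other))
```

In practice you would choose a threshold above which two records are treated as the same customer, and tune both the threshold and the weights against records you have already verified by hand.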
Today’s data landscape has become more complicated to navigate, with the need to match structured data with unstructured data from feeds like Twitter and Facebook. Add to that location information from cell phones, IoT information from just about anything, time-based data, and a long list of others, and the process becomes exponentially more difficult.
Once a process exists for dealing with all this data, it’s important to keep it moving forward so the predictive analytics can run in real time or be run again in the future to test other theories.
With that in place, you have the ability to transform business, and that’s truly magical.
Todd Hinton is the vice president of product strategy at RedPoint Global, a leading provider of data management and customer engagement software. Todd brings over two decades of product management and executive leadership to the company, providing strategic direction for RedPoint’s data management products, including RedPoint Data Management for Hadoop. Prior to joining RedPoint, Todd served as executive vice president for Bernard Data Solutions, where he was responsible for the overall technology direction of the company’s CRM SaaS application serving the nonprofit industry. Todd specializes in data quality and in building high-performance database applications capable of querying vast amounts of data in high-volume environments.
Todd has served in a number of executive management roles during his career, including general manager and CTO in a division at MarketModels, as well as director of postal products for Qualitative Marketing Software/Sagent Technology.