Today’s world is split-second. Whether you’re pitching an idea, selling a product, or presenting last quarter’s results, if you don’t grab your audience’s attention in the first moment, you have already lost.
The problem is that our focus on speed and split-second sound bites doesn’t allow us time to validate the truth or substance behind the flash or hype of words and striking visualizations. We are conditioning ourselves to focus on the story or on the emotion that the presenter is trying to evoke, and less so on challenging the presented results.
In the world of analytics, this means that amazing visualizations could be built on bad data and so present inaccurate results. That, in turn, leads to incorrect conclusions that can hurt your business.
The people who have their hands in the data know this. They know how frequently data errors slip through the cracks and find their way into reporting. It takes a meticulous set of eyes and processes to catch them.
If the growing volume of blogs and articles about the value of good data preparation is any indicator (do a web search on “data prep” for a sampling), then the current buzz in the world of big data and BI is beginning to shift from the flash of presentation to the question of how to ensure that the data behind the visualizations and presentations is accurate. It seems that all of the BI and data visualization vendors are turning their focus toward the important (but not-so-flashy) world of data prep: data cleaning, normalization, and so on.
Data prep is not easy, and despite the claims of ease, agility, and automation, none of today’s (or, likely, tomorrow’s) tools will replace every human link in the chain of data prep. Whether during the manual steps of data entry at the source level or the aggregation and reporting steps at the back end, some person eventually will have to have her or his hands in the data itself, either cleaning it up directly or writing scripts (or the equivalent) to do the cleanup.
The industry is beginning to recognize this, and so descriptive words like “self-service” are now in almost all vendors’ marketing materials.
Data prep is a tough nut to crack. However, here are suggestions in four areas that might help you address the issues that can arise as you shift your focus to data prep:
Recognize that ensuring data accuracy is critical to your business and that this doesn’t happen by itself. Therefore, it is essential that you provide appropriate resources in personnel and data prep tools to do it. Current visualization tools and other means to report the data are fabulous, but first you have to make sure that the data is accurate.
Keep in mind who on your staff will be doing the actual “hands-in-the-data” work or reporting. What level of technical proficiency is required to use the data prep tool(s) that you have selected? Ensure that you find a tool or set of tools that is easy to install, access, learn, and use. Otherwise, it simply won’t be used and people will default to what they already know, even if it’s slower, manual, or more error-prone.
Make sure that there is appropriate governance, transparency, and auditability within the tool(s) or your process. At any time, you (or your data prep tools) need to be able to answer precisely the question, “What, exactly, is this number and how did it get here?”
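One way to picture this kind of auditability is a number that carries its own history with it. The sketch below is a minimal illustration in Python, not any vendor's implementation; the class name `AuditedValue`, the file name `sales_q3.csv`, and all of the figures are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass(frozen=True)
class AuditedValue:
    """A number bundled with a record of every step that produced it."""
    value: float
    history: List[str] = field(default_factory=list)

    def apply(self, description: str, func: Callable[[float], float]) -> "AuditedValue":
        """Run one transformation and return a new value with the step logged."""
        new_value = func(self.value)
        step = f"{description}: {self.value} -> {new_value}"
        return AuditedValue(new_value, self.history + [step])


# Hypothetical figures and source file, for illustration only.
revenue = AuditedValue(125000.0, ["loaded from sales_q3.csv (amount in cents)"])
revenue = revenue.apply("convert cents to dollars", lambda v: v / 100)
revenue = revenue.apply("subtract refunds", lambda v: v - 50.0)

# Answer "what is this number and how did it get here?"
print(revenue.value)
for step in revenue.history:
    print(step)
```

Each transformation returns a new object rather than changing the old one, so an earlier version of the number and its trail stay intact if someone needs to go back and check them.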
At the extreme, the value of having accurate data in your reporting is most obvious when you are caught out by the consequences of having presented inaccurate data. However, that’s a difficult ROI metric to present in the hypothetical. Easier and more palatable measures of ROI are the amount of time saved and the number of data errors found and fixed when using a tool to simplify or automate your data cleaning and data prep tasks. Ask your vendor to work with you on a trial so that you can compare the cost/time of doing a task manually (or with your current tool) with the cost/time of doing it with the tool you are considering.
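The comparison from such a trial comes down to simple arithmetic. As a rough sketch, assuming you have timed the same task both ways and counted the errors each approach caught, it might look like the following; the function name `trial_summary` and the numbers passed to it are hypothetical placeholders, not measured results.

```python
def trial_summary(manual_minutes, tool_minutes, errors_found_manual, errors_found_tool):
    """Compare one task done manually (or with the current tool) against the trial tool."""
    if manual_minutes <= 0:
        raise ValueError("manual_minutes must be positive")
    minutes_saved = manual_minutes - tool_minutes
    return {
        "minutes_saved": minutes_saved,
        "percent_time_saved": round(100 * minutes_saved / manual_minutes, 1),
        "additional_errors_found": errors_found_tool - errors_found_manual,
    }


# Made-up trial figures: 240 minutes manually vs. 60 with the tool,
# 3 errors caught manually vs. 8 caught with the tool.
summary = trial_summary(240, 60, 3, 8)
print(summary)
```

Repeating the measurement over several representative tasks, rather than a single one, would give a fairer picture before you commit to a tool.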
David Lefkowich is the VP of Sales and Marketing for FreeSight Software, a data integration, cleaning, analysis and reporting tool.