Today’s availability of big data offers opportunities to gain a competitive advantage in almost every industry. This is now possible thanks to a variety of analytics technologies and innovative solutions that provide sophisticated, data-based market insights.
So, why are data analysts still not sleeping well?
Because this blessed profusion of information has also brought about the biggest fear in analytics today, data discrepancies, along with the challenge of overcoming it: data cleansing.
When the quantity of data is no longer a problem, it seems nothing can stop IT teams from producing an accurate picture of the current market reality. So, where’s the problem? Sources. When there is no single provider of information, data coming from multiple sources is a recipe for disaster. The data ends up with either ‘black holes’ (gaps) or numerous inaccuracies. The ultimate result is often a distortion of essential market data.
These data discrepancies do more than scatter inaccuracies through reports; they can potentially harm the company in a variety of ways. Let’s take the pharma industry as an example.
Case Study: the Pharma Industry
In pharma, data is gathered both by data providers and by the pharmaceutical companies themselves, with the goal of determining marketing trends and evaluating their products’ performance. However, since data is gathered from so many different sources, discrepancies occur constantly. For example, data on a patient can come from the hospital, the pharmacy or the doctor’s clinic, while data on the same physician can come from both a private clinic and the hospital where that physician works. If that’s not enough, these records can all get mixed under different categories, leaving analysts with no way of knowing whether they even refer to the same person!
Get where this is going?
You can’t gather any accurate insights on patients’ and physicians’ behavior when you’re not sure about the stats. These inaccuracies sit at the very core of the data used by Managed Markets, by the sales force and even for R&D purposes. Finally, this data reaches pharma executives, who base their decisions and strategy on it.
Worried? Don’t be.
The solution is simply to create a single version of the truth out of all that data. Or, maybe it’s not so simple…
Here comes the tricky part.
Let’s get back to our example. A recurrent issue in pharma analytics is missing data at the patient level, mostly the patient ID. Due to legal limitations and restrictions, the only information available for building a patient ID consists of segmentation indicators such as sex, age, etc.
Without a patient ID, it is almost impossible to conduct a proper patient-level analysis: you can’t tell how many new refills were made last year or even how many new patients you have. Without this key patient-level data, you can’t establish any real insights on patient behavior, your product’s performance and current market trends. In other words, you’re analytically paralyzed.
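To make the problem concrete, here is a minimal sketch of what happens when a patient key is built only from segmentation indicators. The field names (`sex`, `age_band`, `region`) and the sample claims are hypothetical and are not taken from any real data provider; the point is only that two different patients who share the same indicators collapse into one key, so counts based on such a key cannot be trusted.

```python
import hashlib

def surrogate_patient_key(sex, age_band, region):
    """Build a stand-in patient key from coarse indicators (hypothetical fields)."""
    raw = f"{sex}|{age_band}|{region}"
    return hashlib.sha256(raw.encode("utf-8")).hexdigest()[:12]

# Hypothetical claims: the first two rows may be one patient or two.
# Nothing in the available indicators lets us tell them apart.
claims = [
    {"sex": "F", "age_band": "40-49", "region": "NE", "drug": "X"},
    {"sex": "F", "age_band": "40-49", "region": "NE", "drug": "X"},
    {"sex": "M", "age_band": "50-59", "region": "W", "drug": "X"},
]

keys = {surrogate_patient_key(c["sex"], c["age_band"], c["region"]) for c in claims}
apparent_patients = len(keys)  # distinct keys, not necessarily distinct people
print(apparent_patients)
```

The count of distinct keys is at best a lower bound on the number of patients, which is why such a key alone cannot support a new-patient or refill analysis.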
So, what’s the solution? Just a little help from data analytics.
A common way to deal with data discrepancies in the pharma industry is Mastering, i.e. assigning a single new name to an ID that appears multiple times under different categories, so that it matches across all sources. But this sort of manual manipulation can only go so far; what’s required here is not a simple ‘mix & match’ but a more sophisticated solution.
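As a rough illustration of the Mastering idea described above, the sketch below gives every record that shares a normalized name and location the same master ID. The matching rule (normalized name plus ZIP code), the `PHY-` ID format and the sample records are all assumptions made for this example; real mastering typically relies on richer matching rules and manual review.

```python
def normalize(name):
    """Lowercase, replace punctuation with spaces, drop titles, sort name tokens."""
    cleaned = "".join(ch if ch.isalnum() or ch.isspace() else " " for ch in name.lower())
    tokens = [t for t in cleaned.split() if t not in {"dr", "md"}]
    return " ".join(sorted(tokens))

def assign_master_ids(records):
    """Give every record with the same (normalized name, zip) key one master ID."""
    master_ids = {}
    result = []
    for rec in records:
        key = (normalize(rec["name"]), rec["zip"])
        if key not in master_ids:
            master_ids[key] = f"PHY-{len(master_ids) + 1:04d}"
        result.append({**rec, "master_id": master_ids[key]})
    return result

# Hypothetical records for the same physician arriving from different sources.
records = [
    {"source": "hospital", "name": "Dr. Jane Smith", "zip": "10001"},
    {"source": "clinic", "name": "Smith, Jane", "zip": "10001"},
    {"source": "pharmacy", "name": "John Doe MD", "zip": "94105"},
]

mastered = assign_master_ids(records)
for rec in mastered:
    print(rec["source"], rec["master_id"])
```

An exact match on a normalized key is deliberately simplistic: it misses misspellings and physicians who practice at several addresses, which is the kind of limit the article has in mind when it says this approach can only go so far.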
The Allocation Methodology
This methodology is essentially a way to estimate the missing data from the existing data using smart technology: it employs a sophisticated filtering process to generate a specific formula, which is then used to fill in the missing data. But should the allocation be applied at the territory level or the physician level? On what data should it be based? These questions and more can only be answered by an analyst with pharma expertise. The allocation methodology doesn’t stand alone; it requires a combination of analytical sophistication and industry expertise. These are the two key ingredients for getting a clear view of the market.
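The article does not spell out the allocation formula, so the following is only a sketch of one simple variant, proportional allocation, and not the methodology itself. It assumes a territory-level total is known, some physician-level values are known, and a proxy weight (for instance, patient visits) is available for every physician. All names and numbers are hypothetical.

```python
def allocate_missing(territory_total, known, weights):
    """Spread the part of a territory total not explained by known physician
    values across the remaining physicians, in proportion to a proxy weight."""
    remainder = territory_total - sum(known.values())
    if remainder < 0:
        raise ValueError("known values exceed the territory total")
    missing = {p: w for p, w in weights.items() if p not in known}
    if not missing:
        return dict(known)
    total_weight = sum(missing.values())
    if total_weight == 0:
        raise ValueError("no proxy weight for physicians with missing data")
    estimates = {p: remainder * w / total_weight for p, w in missing.items()}
    return {**known, **estimates}

# Hypothetical territory: 1000 prescriptions in total, two physicians reported.
territory_total = 1000
known = {"phy_a": 400, "phy_b": 300}
weights = {"phy_a": 50, "phy_b": 40, "phy_c": 20, "phy_d": 10}

allocated = allocate_missing(territory_total, known, weights)
print(allocated)
```

Whether the proxy weight is a sensible basis, and whether the split belongs at the territory or physician level, is precisely the kind of judgment the article says needs an analyst with industry expertise.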
Not just in Pharma
This is hardly unique to pharma. In most industries, big data poses difficult, sophisticated challenges for data scientists that go beyond the abilities of standard, traditional BI tools. It requires much deeper, more meticulous analytical work: analyzing and comparing multi-sourced data and consolidating it into a single perspective. By reaching into the darkest corners and the finest resolutions of analytical complexity and creating a single version of the truth, such a solution should provide an accurate overall picture of the market reality, no matter the industry.
Annie Reiss is a seasoned marketing executive with over 20 years of experience in a variety of industries. Annie brings to Verix profound analytical and far-reaching visionary skills. With comprehensive experience in both technology and Consumer Packaged Goods, Annie has a proven track record of setting companies on the track for success. Annie is in charge of Verix’s marketing programs and activities as well as global corporate strategy.