According to a study conducted last year by my company, Xplenty, nearly one-third of business intelligence professionals say they spend between 50 and 90 percent of their time just cleaning raw data for analytics. With so much valuable time and talent devoted to preparing data, businesses are often slow to unlock its insights or to act on them.
That delay is why accelerating “time-to-insight” has become today’s top data challenge. Here are four tips to shorten the time between data integration and analytics.
First Identify Business Requirements, then Design Processes
Before identifying and formalizing a data-collection process, you need to start with your business objectives in mind: What are the desired outcomes and goals? What problems do you want to address? These questions guide what data to collect and the subsequent integration process. As analytics expert Ram Chandrasekar said, “Find out what the top three or four problems are and how they map to the data that is within the company, the data outside the company, and the need to integrate them.”
As data’s value has grown clearer, the cost of storing it has fallen, making collection easier than ever. Businesses can now collect all the data they can find. But without a firm focus and clear limits, more data does not yield greater insight. Instead, it only slows value extraction.
With a finite scope and a tight focus on specific goals and problems, you can begin to design your collection and integration process. Opening the floodgates only adds time to an already onerous data-analysis process.
Select an Integration Platform that is Flexible, Elastic, and Scalable
An estimated 2.5 quintillion bytes of data are created on a daily basis worldwide. The volume and velocity of data are overwhelming. Given the always-shifting priorities and demands of big data, the possibilities (and potential for dead-end insights) are limitless.
When it comes to data, you are limited only by what you don’t collect (and, sometimes, the decision about what not to collect is more important than what you do choose to collect). But here’s the data dilemma: You never know where the value might come from as business needs evolve, so you need a data-integration platform that can offer the most comprehensive and flexible service possible.
In my experience, the best data platform should have the following attributes.
- It should be broad enough to take raw data from multiple sources, such as relational databases, NoSQL document stores, and web services, to name just a few. The greater the breadth of integrations available through the platform, the better positioned you will be.
- It must provide the on-demand adaptability you need to meet your particular focus areas and goals.
- It should scale up and down to meet the inevitably shifting priorities of business today.
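To make the first attribute concrete, here is a minimal sketch of what pulling raw data from several kinds of sources into one SQL-queryable store can look like. Everything in it is an illustrative assumption rather than any particular platform's API: the in-memory SQLite database stands in for a relational source, the JSON strings stand in for a NoSQL document store and a web service response, and the table names, field names, and sample values are all hypothetical.

```python
import json
import sqlite3


def make_relational_source():
    """Hypothetical relational source: an in-memory SQLite database of orders."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (customer_id INTEGER, amount REAL)")
    conn.executemany(
        "INSERT INTO orders VALUES (?, ?)",
        [(1, 120.0), (2, 75.5), (1, 30.0)],
    )
    return conn


# Hypothetical documents, as they might come out of a NoSQL document store.
DOCUMENTS = [
    '{"customer_id": 1, "profile": {"region": "EMEA"}}',
    '{"customer_id": 2, "profile": {"region": "APAC"}}',
]

# Hypothetical payload, as a web service might return it.
WEB_PAYLOAD = (
    '{"events": [{"customer_id": 1, "clicks": 14},'
    ' {"customer_id": 2, "clicks": 3}]}'
)


def integrate(source_conn, documents, web_payload):
    """Load all three sources into one SQL-queryable store and return it."""
    store = sqlite3.connect(":memory:")
    store.execute("CREATE TABLE orders (customer_id INTEGER, amount REAL)")
    store.execute("CREATE TABLE customers (customer_id INTEGER, region TEXT)")
    store.execute("CREATE TABLE events (customer_id INTEGER, clicks INTEGER)")

    # Relational rows are copied across as they are.
    rows = source_conn.execute("SELECT customer_id, amount FROM orders").fetchall()
    store.executemany("INSERT INTO orders VALUES (?, ?)", rows)

    # Nested documents are flattened into columns.
    parsed = [json.loads(doc) for doc in documents]
    store.executemany(
        "INSERT INTO customers VALUES (?, ?)",
        [(d["customer_id"], d["profile"]["region"]) for d in parsed],
    )

    # The web service payload is unpacked into one row per event.
    events = json.loads(web_payload)["events"]
    store.executemany(
        "INSERT INTO events VALUES (?, ?)",
        [(e["customer_id"], e["clicks"]) for e in events],
    )
    store.commit()
    return store


def revenue_and_clicks_by_region(store):
    """Answer one business question across all three sources with plain SQL."""
    query = """
        SELECT c.region, o.revenue, e.clicks
        FROM customers AS c
        JOIN (
            SELECT customer_id, SUM(amount) AS revenue
            FROM orders
            GROUP BY customer_id
        ) AS o ON o.customer_id = c.customer_id
        JOIN events AS e ON e.customer_id = c.customer_id
        ORDER BY c.region
    """
    return store.execute(query).fetchall()


if __name__ == "__main__":
    store = integrate(make_relational_source(), DOCUMENTS, WEB_PAYLOAD)
    for region, revenue, clicks in revenue_and_clicks_by_region(store):
        print(region, revenue, clicks)
```

Once the three sources sit side by side in one store, a single query can answer a question that none of them could answer alone. A real integration platform would add scheduling, schema handling, and scaling on top of this basic pattern.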
Get on the Cloud
The cloud accounted for 60 percent of total IT spending growth in 2015. While on-premises data management is still popular, largely due to fear of migration, spending trends point clearly toward the cloud.
For integration, the operational advantage of cloud environments is agility. They are adaptable, enabling businesses to cope with increases in data volume from different data stores. They also make daily tasks, like transferring data between systems or writing ad hoc queries, much easier than before.
Beyond workflow, cloud services are more cost-effective than on-premises systems. Companies that use cloud environments save more than 15 percent on IT costs on average.
Choose Your Analytics Store Wisely
With data warehouse services offered by large cloud providers, such as Amazon Redshift and Google BigQuery, as well as by smaller startups vying for their piece of the pie, it can be a challenge to identify the best data store for a business. Here are a few fundamentals that can help guide your decision:
- SQL. NoSQL may be the preference of programmers, given how quickly and easily they can change data schemas, but most analytics tools, and most analysts, are still focused on SQL. For that reason, make sure your analytics data store is SQL-compatible.
- Growth Ability. The data world is not what it used to be, when a business could plan on scaling its solutions every few years based on factors like data capacity. Given the amount of data available today, businesses need a store that can grow seamlessly and without downtime.
- Ease of Use. Analytics is complex enough without adding more intricacy. Choose a service that is easy to implement and to understand, and that integrates with a number of applications.
The insights gleaned at the end of data integration are very exciting. But they don’t just fall from the sky. Data preparation and integration are arguably the most important part of the whole process, as they set the table for what ultimately drives decision making. To truly minimize the “time-to-insight” gap, every business, no matter its size, needs to get this stage right.
Yaniv Mor is co-founder and CEO of Xplenty, the big data integration platform that makes it easy to process more data more quickly. Before Xplenty, Yaniv was involved in a number of BI and data-centric projects with major international companies. Yaniv managed the NSW SQL Services practice at Red Rock Consulting, a leading consulting firm in Australia and New Zealand. Yaniv holds a Bachelor of Science degree in Information Systems Engineering from The Israeli Institute of Technology and a Master’s degree in Business and Technology from the University of NSW, Australia.