Big data as we know it today is a highly fragmented market. Over the next few years, however, it will begin to normalize through consolidation, even as "best of breed" solutions live on. Cassandra, Hadoop, and Spark are likely to integrate into a single solution that can connect to multiple platforms, lowering costs and broadening adoption of big data. This anticipated shift mirrors earlier technology trends, such as the evolution of the AppDev space with its myriad development languages.
A critical differentiator for companies that successfully leverage big data will be the ability to integrate data sources with various applications and systems quickly and cost-effectively. Doing so requires a solution that doesn't demand extensive rework and custom coding for each new integration. According to Gartner's Eric Thoo, as cloud-based services become more available and diverse, businesses and consumers using cloud-based data sources will require extendible data integration strategies and capabilities to interact and integrate with cloud-based data.
Why Is Access So Challenging?
One of the biggest challenges of leveraging big data is the fragmentation that exists between data sources. Most organizations have multiple data connectivity solutions, database platforms, data sources, and applications, all of which are used for a broad range of activities. But few organizations have policies to guide where data should and should not be stored.
According to research by Freeform Dynamics, 80 percent of survey respondents believe effective business decision making is hampered by data availability and inconsistency issues. And 83 percent are concerned about the security of their corporate data as it is increasingly dispersed across and beyond the corporate network. Performance is also a major challenge. In a 2012 survey, 59 percent of respondents said their existing analytics framework processes big data too slowly, and most believe it is simply unable to match the speed at which big data is flowing into the network.
Companies merging information from different sources must do so without disrupting the existing environment; otherwise, business could come to a halt. Migration is also costly: without an efficient data connectivity solution, the time and effort needed to connect all data sources – both on-premise and in the cloud – is prohibitive for most organizations. Estimates put the cost of migrating data from one system to another at up to 50 percent of the total project budget.
Cloud-Based Connectivity Solutions May Be the Answer
According to Gartner, data integration technology provided "as a service" is gaining traction as a way to meet entry-level requirements. But the cost savings aren't limited to smaller businesses; the approach is a viable option for everyone. Not only is it technically sound, it also costs less and eliminates the complexity of mapping to many different data sources.
And that's important because the rise of cloud-based data archiving, along with value-add data sources available in public cloud environments, has produced hybrid computing environments that span multiple clouds as well as on-premise systems. As a result, companies need to access and integrate data from one or more on-premise systems and provision it to SaaS-based transaction-processing applications and BI systems. They also need to access and integrate data from SaaS-based apps and cloud-based data stores, and provision it to on-premise transactional systems. In other words, data flows both ways.
Data Connectivity as a Service: Safe, Simple, and Sensible
In this environment, traditional data integration solutions require a separate connector for each of innumerable sources. A SaaS-based connectivity solution, by contrast, can use a single open database connectivity (ODBC) or Java database connectivity (JDBC) driver to simplify access without relying on proprietary APIs for each application. By communicating with the database directly over TCP/IP using its wire-level protocol, such drivers can outperform native database APIs, eliminating memory, CPU, and network bottlenecks. And because they bypass database client libraries entirely, they avoid complex installation, configuration, and maintenance across multiple machines, the need for different client software on each system, and the memory leaks and application crashes those libraries can introduce.
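The benefit of a single standard driver interface can be sketched in Python using the DB-API, the same connection/cursor pattern that ODBC- and JDBC-style drivers standardize. This is a hypothetical illustration, not any vendor's implementation: stdlib sqlite3 stands in for a wire-protocol driver, and in practice you would swap in an ODBC-backed connection factory while the query code stays unchanged.

```python
import sqlite3

def fetch_rows(connect, query, params=()):
    """Run a query through any DB-API-compliant driver.

    `connect` is a zero-argument factory returning a connection.
    Because standard drivers expose one uniform interface, this
    function never changes when the underlying data source does.
    """
    conn = connect()
    try:
        cur = conn.cursor()
        cur.execute(query, params)
        return cur.fetchall()
    finally:
        conn.close()

def demo_source():
    # Stand-in "data source": an in-memory SQLite database.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
    conn.executemany("INSERT INTO sales VALUES (?, ?)",
                     [("east", 120.0), ("west", 80.0)])
    conn.commit()
    return conn

rows = fetch_rows(demo_source,
                  "SELECT region, amount FROM sales ORDER BY region")
print(rows)  # [('east', 120.0), ('west', 80.0)]
```

The point is the seam: only the connection factory knows which driver is in play, so adding a new source means adding a factory, not rewriting every query site.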
Using this approach, a company can leverage data connectivity as a service to easily accomplish numerous tasks that once were tedious and expensive. For example, it can pull data from the cloud into on-premise systems, or capture changes to master data made in SaaS-based apps and bring them into an on-premise MDM system. It can also migrate data from on-premise systems to its SaaS BI systems and synchronize master data across SaaS applications with ease.
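At its simplest, one of these tasks – capturing master-data changes from a SaaS app into an on-premise store – reduces to an incremental upsert keyed on a last-modified timestamp. The sketch below assumes a hypothetical record schema and uses a plain dict as the local store; real MDM systems layer matching and survivorship rules on top of this.

```python
def capture_changes(cloud_rows, local_store, last_sync):
    """Upsert cloud records modified since the last sync into a local store.

    cloud_rows: iterable of dicts from the SaaS source (hypothetical schema).
    local_store: dict keyed by record id, standing in for an on-premise MDM store.
    last_sync: timestamp of the previous sync run.
    Returns the new high-water mark to use as last_sync next time.
    """
    high_water = last_sync
    for row in cloud_rows:
        if row["updated_at"] > last_sync:
            local_store[row["id"]] = row          # insert or overwrite
            high_water = max(high_water, row["updated_at"])
    return high_water

# Example: two records changed since sync time 100, one unchanged.
cloud = [
    {"id": 1, "name": "Acme Corp", "updated_at": 150},
    {"id": 2, "name": "Globex",    "updated_at": 90},
    {"id": 3, "name": "Initech",   "updated_at": 160},
]
store = {2: {"id": 2, "name": "Globex", "updated_at": 90}}
next_sync = capture_changes(cloud, store, last_sync=100)
print(sorted(store), next_sync)  # [1, 2, 3] 160
```

Running the same loop in the opposite direction – cloud store as the target, on-premise rows as the source – is what makes the two-way data flow described above symmetric.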
For instance, a company providing cloud-based data analysis and visualization software could leverage a data connectivity solution as a service to establish connections between its SaaS applications and cloud data sources. Customers would be able to extract data from the cloud in real time and access the data on a standard or mobile web browser. The result would be a clearly differentiated self-service BI solution that would scale to support cloud data from multiple SaaS apps and big data stores.
With such extendible connectivity, the possibilities for leveraging big data are endless. Eliminating cost and complexity, data connectivity as a service speeds time to market for solutions that bring together data from various sources in a meaningful, usable way.
Paul Nashawaty is Director of Product Marketing and Strategy at Progress Software.