Big Data Platforms in 2017: Leveraging Hybrid Clouds for Intelligent Operations

by Sumit Sarkar   |   January 13, 2017 5:30 am   |   1 Comment

Sumit Sarkar, Chief Data Evangelist at Progress

According to a recent Gartner survey, big data investments may have peaked in 2016 and show signs of contracting: 48 percent of companies invested in big data in 2016, only three percentage points more than in 2015. This plateau signals that organizations have already delivered significant big data insights and will now look to operationalize them in the coming year (and beyond).

With the rise of hybrid cloud environments, we expect that companies will operationalize existing big data insights to heighten the efficiency of critical business operations in the year ahead. We’ll see this both in the movement to streamline operations using big data sets (both in the cloud and on the ground), and in the tech advances and standards in 2017 to overcome roadblocks to operationalizing created by these hybrid cloud environments.

Movement and Access of Big Data in Hybrid Clouds

Ground to Cloud

In 2016, customer management cloud apps began to integrate big data insights from enterprises' on-premises platforms, which had been separated by both functional boundaries and disconnected technology stacks. In 2017, functional areas will continue to collaborate with IT, and improved connectivity with the big data ecosystem will expand the use of existing big data platforms through broader access to that data, access that will be central to customer experiences.

Data lakes have become a valuable repository to store all facets of customer data from a variety of data streams – the different systems used by internal functional departments such as CRM, marketing automation, web analytics, survey platforms, webinar data, and so on. This creates a single repository of insights, forging the foundation for new and advanced analytics techniques using data sets with value that has yet to be derived.

We predict that customer management cloud applications will drive this movement by accessing the detailed "big data" on demand, in the flexible spirit of the data lake (for example, asking ad hoc questions with a schema-on-read approach). In contrast, some of the more aggregated data ("not as big" data) will be physically moved to the cloud for more repeatable use cases.
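
The schema-on-read idea can be sketched in a few lines: no table schema is declared when records land in the lake; instead, each ad hoc question applies its own field interpretation at query time. A minimal illustration using only the Python standard library (the event records and field names here are hypothetical, not from any specific platform):

```python
import json

# Raw events landed in the data lake as JSON lines -- no schema was
# declared when they were written (hypothetical sample records).
raw_events = [
    '{"customer": "acme", "channel": "webinar", "minutes": 42}',
    '{"customer": "acme", "channel": "survey", "score": 9}',
    '{"customer": "globex", "channel": "webinar", "minutes": 17}',
]

def query(records, predicate, fields):
    """Apply an ad hoc 'schema' at read time: parse each record and
    project only the fields this particular question cares about."""
    for line in records:
        event = json.loads(line)
        if predicate(event):
            yield {f: event.get(f) for f in fields}

# Ad hoc question: webinar engagement per customer.
webinar_minutes = list(
    query(raw_events,
          lambda e: e.get("channel") == "webinar",
          ["customer", "minutes"]))
print(webinar_minutes)
```

A different question tomorrow (say, survey scores) would simply pass a different predicate and field list over the same untouched raw data, which is the flexibility the data lake trades against the repeatability of pre-aggregated, pre-modeled data.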

Cloud to Ground

In the enterprise, many analytics and reporting platforms continue to run on-premises, on private cloud or grid infrastructures. Meanwhile, big data volumes continue to grow in the cloud, and it is not feasible to crunch those cloud-resident big data sets in on-premises data centers. Cloud big data platforms such as Amazon EMR, IBM BigInsights on Cloud, Microsoft Azure HDInsight, or SAP Altiscale are often more scalable and cost-effective for crunching and transforming big data sets into business insights. We already saw this in 2016, and we predict 2017 will be the year organizations start to integrate those insights, which are manageable in size, by moving them into on-premises databases and analytics platforms for core business operations.
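
The pattern above, crunching in the cloud but serving operations on the ground, ends with a small, aggregated result set being loaded into an on-premises database. A minimal sketch, with sqlite standing in for the on-premises database and the metric names invented for illustration (in practice the rows would be exported from EMR, HDInsight, or a similar platform):

```python
import sqlite3

# Aggregated insights produced by a cloud big data platform
# (hypothetical output -- small enough to move on-premises).
aggregated_insights = [
    ("2016-Q4", "churn_risk_pct", 4.2),
    ("2016-Q4", "avg_session_min", 11.7),
]

# sqlite stands in here for the on-premises database that core
# business operations report against.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE insights (period TEXT, metric TEXT, value REAL)")
conn.executemany(
    "INSERT INTO insights VALUES (?, ?, ?)", aggregated_insights)
conn.commit()

# Operational systems now query the insight locally, with no
# round-trip to the cloud cluster that computed it.
row = conn.execute(
    "SELECT value FROM insights WHERE metric = 'churn_risk_pct'"
).fetchone()
print(row[0])
```

The design point is that only the insights move, not the raw big data: the heavy transformation stays where the elastic compute is, and the on-premises side receives a compact, query-ready result.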

Overcoming Disconnected Tech Stacks to Operationalize Data

Two primary technical challenges have resulted from integrating on-premises and cloud-based insights: firewalls limit developers' access when exploring data between cloud and ground, and big data platforms offer limited out-of-the-box connectivity for SaaS, web, and mobile cloud application development.

2016 delivered innovations in hybrid connectivity, from cloud access security brokers (CASBs) and security gateways to new hybrid data pipelines, to help address the first challenge. As for the second challenge, we're seeing OData emerge to provide out-of-the-box connections to any data set, including external data strategies for popular cloud applications such as Salesforce and Oracle Service Cloud, and data binding in HTML5 developer tools such as Kendo UI.

OData is an open protocol that enables the creation and consumption of queryable, interoperable RESTful APIs in a simple, standard way. An OASIS standard, it is sometimes called the "SQL of the web" because it provides a uniform way to query data over REST. This OData REST interface will augment existing big data connectivity from SQL interfaces such as Hadoop Hive, IBM Big SQL, Apache Phoenix, or HAWQ.
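
The "SQL of the web" comparison comes from OData's system query options, where `$filter`, `$select`, and `$top` play roughly the roles of `WHERE`, column projection, and `LIMIT`. A small sketch of how such query URLs are composed (the service root and entity set names are illustrative, and a real client should percent-encode the option values):

```python
def odata_query(service_root, entity_set, **options):
    """Build an OData query URL from system query options such as
    filter, select, top (mapped to $filter, $select, $top)."""
    # NOTE: values are left unencoded for readability; real clients
    # should percent-encode them.
    parts = [f"${k}={v}" for k, v in options.items()]
    url = f"{service_root}/{entity_set}"
    if parts:
        url += "?" + "&".join(parts)
    return url

# Roughly: SELECT Name, City FROM Customers WHERE Country = 'US' LIMIT 10
url = odata_query("https://example.com/odata", "Customers",
                  filter="Country eq 'US'", select="Name,City", top=10)
print(url)
```

Because the conventions are uniform, the same URL grammar works against any OData-compliant endpoint, which is what makes it attractive as an out-of-the-box bridge between big data platforms and cloud applications.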

OData URL Query Conventions

With these advances for connected cloud applications, organizations will have proven and reliable options to operationalize their big data insights into systems of engagement and web applications, both internal and external facing.

2017 and Beyond

Today businesses operate in a data-dependent universe, and those that can harness and employ data will be the most successful. This means organizations need to look beyond big data platforms and consider data access, exposing crucial data to other business systems so teams can use those insights in everyday core operations across hybrid environments. 2017 will expand the democratization of big data insights, whether derived by data scientists or by machine learning. In the future, these will simply be considered "insights," embedded into everyone's application development and deployment experiences. For now, as the cloud management space continues to fragment, a hybrid cloud approach will enable businesses to strengthen their architectures and operationalize their data resources for the next generation of business operations.


Sumit Sarkar is Chief Data Evangelist at Progress, with over 10 years' experience in the data connectivity field. A leading consultant on open data standards and connectivity with cloud data, his interests include performance tuning of the data access layer, for which he has developed patent-pending analysis technology; business intelligence and data warehousing for SaaS platforms; and data connectivity for aPaaS environments, with a focus on standards such as ODBC, JDBC, ADO.NET, and OData. He is an IBM Certified Consultant for IBM Cognos Business Intelligence and a TDWI member. He has presented sessions on data connectivity at conferences including Dreamforce, Oracle OpenWorld, Strata Hadoop, MongoDB World, and the SAP Analytics and Business Objects Conference, among many others.




One Comment

  1. Posted January 16, 2017 at 10:33 am | Permalink

    We are seeing similar trends. A lot of the companies we work with have tried / PoC’ed big data technologies. They have seen value from the technology but have also seen how complicated it is to configure and run these technologies as production systems. A lot are looking to alternative deployments for production. They do not have to deal with the day to day “keeping the lights on” but instead want to focus on the data and analytics. I find it interesting that in these examples the dev system is on premise and the production system is in the cloud. Kind of flips the traditional mantra that you should start with dev/test in the cloud on its head.
