Yield Big Results with Data Lakes and Automation

January 26, 2016 5:30 am

Abdul Razack, SVP of Platforms, Big Data, and Analytics, Infosys

Think you had a tough day? Spare a thought for recruiters who compete with each other to hire one of the country’s most sought-after specialists: the data scientist. The demand for data experts is so high that there could still be a 60 percent hiring gap for data scientists three years from now.

It’s all part of the huge effort to tap into the promise of big data. Many businesses are struggling to realize gains from their big data investments, and the reasons extend beyond a lack of qualified data scientists.

In my experience, the lack of skills to properly manage data is a very real problem, but another challenge to big data insight is the way enterprises organize data – many companies silo or fragment their data. It’s gotten to the point that some organizations are wondering what benefits big data held for them in the first place.

One immediate benefit of big data is automation – the ability to automatically identify and preemptively resolve symptoms before they become a problem, as well as to eliminate time-wasting processes. This one-two punch frees up time and resources, enabling organizations to focus on better understanding what the end user wants and needs.

To realize the benefits of automation, we must consider how data ought to be stored today. We need to discuss data lakes.

Data lakes are repositories for storing relevant data requiring analysis. The data stored in these lakes usually come in three forms: structured, unstructured, and semi-structured. These data are stored in their raw forms, which allows for deep and complex analysis without the loss of fidelity that aggregation causes. The more data that organizations pool into their data lakes, the more opportunity they have to discover previously unseen correlations and insights.
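To make the point about fidelity concrete, here is a minimal sketch in Python. The records, field names, and helper functions are hypothetical and chosen only for illustration; they do not describe any particular data lake product. The sketch shows two stores that look identical once their sales are aggregated, while the raw records still reveal that they peak at different hours.

```python
from collections import defaultdict

# Hypothetical raw events as they might land in a data lake (illustrative only).
raw_events = [
    {"store": "A", "hour": 9, "sales": 100},
    {"store": "A", "hour": 18, "sales": 300},
    {"store": "B", "hour": 9, "sales": 300},
    {"store": "B", "hour": 18, "sales": 100},
]


def total_by(events, key):
    """Aggregate sales by one field; every other detail is discarded."""
    totals = defaultdict(int)
    for event in events:
        totals[event[key]] += event["sales"]
    return dict(totals)


def peak_hour_by_store(events):
    """Find each store's busiest hour; this needs the raw, per-hour records."""
    best = {}
    for event in events:
        current = best.get(event["store"])
        if current is None or event["sales"] > current["sales"]:
            best[event["store"]] = event
    return {store: event["hour"] for store, event in best.items()}


if __name__ == "__main__":
    # The aggregated view cannot tell the two stores apart.
    print(total_by(raw_events, "store"))
    # The raw view shows that their demand patterns differ.
    print(peak_hour_by_store(raw_events))
```

If only the per-store totals had been kept, the second question could no longer be answered, which is the argument for keeping raw data in the lake.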

The ease and flexibility of using data contained in data lakes help to identify repeatable tasks and processes. In fact, data lakes, because they act as a central repository for automated systems, can be used in building a system capable of recognizing trends, learning, and acting of its own accord.

Let’s use the process of resetting a password as an example. The system monitors the actions of an administrator helping an end user reset his or her password. It observes the steps involved in resetting the password and stores this information in its data lake. Then, the next time a user submits a password reset request, a software robot, as we call them, can walk the end user through the password reset process without the need for admin intervention. In this example, the system takes what it learned by observation and applies it in practice.
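The observe-then-replay pattern in this example can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not a description of any actual product: the class `ProcedureStore`, the function `handle_request`, and the step names are all hypothetical, and an in-memory dictionary stands in for the data lake.

```python
class ProcedureStore:
    """Hypothetical stand-in for the data lake: observed steps per request type."""

    def __init__(self):
        self._observations = {}

    def record(self, request_type, steps):
        """Store one observed run of an administrator handling a request."""
        self._observations.setdefault(request_type, []).append(list(steps))

    def latest(self, request_type):
        """Return the most recently observed steps, or None if never observed."""
        runs = self._observations.get(request_type)
        return runs[-1] if runs else None


def handle_request(store, request_type, user):
    """Replay observed steps for a user; escalate when nothing has been learned."""
    steps = store.latest(request_type)
    if steps is None:
        return [f"escalate {request_type} for {user} to an administrator"]
    return [f"{step} ({user})" for step in steps]


if __name__ == "__main__":
    store = ProcedureStore()
    # Observation phase: the system watches an administrator at work.
    store.record(
        "password_reset",
        ["verify identity", "send reset link", "confirm new password"],
    )
    # Replay phase: the software robot walks the next user through the same steps.
    for line in handle_request(store, "password_reset", "alice"):
        print(line)
```

A request type that has never been observed falls back to a human, which reflects the idea that the robot only automates what it has already seen done.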

Another example we can look at is in retail. Innovative retailers have leveraged their massive customer data collections to automatically identify customer behavior, trends, inventory replenishment cycles, and more. This helps personalize a customer’s shopping experience and deliver consistency across a brand’s engagement points.

Banks have applied automation to event-ticket processing. Automated systems have reduced the number of events by 35 percent and then helped to reduce the number of tickets that needed to be processed by another 45 percent. First they reduced the noise, and then they cut the number of actionable tickets and brought those that remained to the proper employee’s attention. This helps to accelerate a bank’s ability to respond to customer concerns, thereby improving the customer experience.
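To see how two successive reductions combine, consider a small worked calculation in Python. It rests on an assumption the figures above do not confirm: that the 45 percent applies to whatever volume remains after the 35 percent cut. The starting volume of 10,000 is hypothetical, as is the function name.

```python
def remaining_after(volume, reductions):
    """Apply successive fractional reductions, each to what is left so far."""
    for reduction in reductions:
        volume *= 1 - reduction
    return volume


if __name__ == "__main__":
    start = 10_000  # hypothetical starting volume, for illustration only
    after = remaining_after(start, [0.35, 0.45])
    combined = 1 - after / start
    print(f"remaining: {after:.0f}, combined reduction: {combined:.2%}")
```

Under that assumption, the two steps together leave a little over a third of the original volume, a combined reduction of roughly 64 percent rather than the 80 percent that simply adding the two figures would suggest.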

This sort of identification, segmentation, and automation doesn’t happen overnight. Data must be accessible – automatically sent to the right spot at the right time – for companies to extract the most value from that data. Silos need to be eliminated and application interfaces managed before data can move about freely. Employees must be able to address contextual bias in data capture, among many other things. But the investments made in big data initiatives are worth it.

Big data automation does more than merely streamline and eliminate processes. It accentuates the uniquely human ability to take complex problems and deliver creative solutions to them. Automation, then, is a simple tool to enhance what we already have and to create new opportunities.

Abdul Razack is SVP of Platforms, Big Data, and Analytics at Infosys, focusing on overseeing platforms and reusable components across services, big data, automation, and the analytics business. Prior to Infosys, he worked at SAP, as Senior Vice President for Custom Development and Co-Innovation, where he was responsible for delivering unique and differentiating customer-specific solutions. In this capacity, he delivered over 40 innovations based on SAP HANA and Cloud to customers worldwide, across 12 different industry verticals. In a career that spans over two decades, he has been involved in several engineering and consulting roles at Commerce One, Sybase, KPMG Peat Marwick, and SAP. Abdul holds a master’s degree in Electrical Engineering from Southern Illinois University and a bachelor’s degree in Electronics and Communication Engineering from the University of Mysore, India.

