Each organization’s journey to become data driven is unique. Some companies might be seeking new technology to streamline costs and optimize existing systems, while others are looking for data to help achieve strategic goals and future market differentiation.
Given the myriad motivations behind the push to become data driven, judging what constitutes success can be a challenge.
For many companies, the road to becoming data driven goes through Hadoop. But a shift to Hadoop can be disruptive, requiring organizations to consider many other systems and points of integration, which in turn may necessitate further design work and solution bundling.
For organizations that are not all in with Hadoop from the start, we often see two types of adoption: strategy adoption and use case adoption.
Strategy adoption is when larger business conversations dictate the platform technology. Increasingly, Chief Data Officers are defining more explicit data strategies, integrating the organization’s strategic and maturity roadmaps around bringing online new layers of functionality that will create new revenue for the business.
With the strategy adoption approach, you can look at maturity in terms of two important considerations:
- People and process. Are my teams built and ready to face the challenge of new technology?
- Technology readiness. Will I be successful with the scoped technology for my desired outcome?
These are not mutually exclusive; we often see the best-intentioned technology projects fail due to a lack of executive buy-in or an inability to define success metrics. IT is often the first group engaged in a Hadoop environment and continues to champion its capabilities. But IT may be interested only in use cases that allow the department to save money and operate more efficiently, rather than in more strategic implementations that cement Hadoop as a design approach and usher in big wins for the business.
This brings us to the second type of adoption.
Use Case Adoption
Use case adoption is fairly common. A company will choose Hadoop to do one specific thing, whether it’s relieving Extract, Transform and Load (ETL) pressures or enabling better security monitoring capabilities. Use case adoption allows for quick wins based on case-relevant outcomes. Many times, IT will use these wins to push forward more strategic directives.
To balance these needs, we can look at an organization’s use case adoption journey through the following three phases of maturity.
Achieve Initial Cost Savings
Now more than ever, it’s important to have clear roads to the fastest win with technology. Initial wins bolster faith and buy-in for internal projects and help take big data out of the realm of experimentation and into real-world context. Use cases that present initial cost savings include the following:
- ETL offload. To relieve performance pressure on valuable Enterprise Data Warehouse (EDW) real estate, users can quickly shift ETL workloads to environments with better compute resources at a fraction of the cost of their EDW.
- Active archive. Keeping more data online for analysis at a fraction of the cost of traditional archiving solutions, often extending those solutions’ capabilities.
- Infrastructure consolidation. An enterprise data hub allows you to combine and optimize around a single data platform that can service a wide variety of requirements.
- Customer segmentation. A simple extension of the analytics you do today can present some quick wins.
- Analytics acceleration. A move from simple reporting to more advanced analytics delivery.
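To make the ETL-offload idea above concrete, here is a minimal sketch in Python. It uses the standard library's sqlite3 module as a stand-in for the EDW and a plain Python loop as a stand-in for the offloaded transform; in a real deployment the heavy aggregation would run on the Hadoop cluster (for example, in Hive or Spark), and only the compact, query-ready result would land in the warehouse. The table and data here are hypothetical.

```python
import sqlite3

# Stand-in "EDW": a small relational store. In a real ETL-offload
# scenario, the heavy transform below would run on the Hadoop cluster
# instead of consuming warehouse cycles.
edw = sqlite3.connect(":memory:")
edw.execute("CREATE TABLE daily_sales (region TEXT, total REAL)")

# Raw event data as it might arrive in HDFS: one record per transaction.
raw_events = [
    ("east", 120.0), ("west", 80.0), ("east", 40.0), ("west", 60.0),
]

# The "offloaded" transform: aggregate outside the warehouse...
totals = {}
for region, amount in raw_events:
    totals[region] = totals.get(region, 0.0) + amount

# ...then load only the small aggregated result into the EDW.
edw.executemany("INSERT INTO daily_sales VALUES (?, ?)", totals.items())
edw.commit()

result = dict(edw.execute("SELECT region, total FROM daily_sales"))
```

The design point is the split: raw, high-volume data stays in cheap storage and is transformed there, while the warehouse receives only summarized tables it is good at serving.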
Build Strategic Value
Now that we have saved some money and gotten better buy-in on our investment, we can devote some focus and development to items that will not produce return right away but will help us over time to become more strategic in our space and better differentiated from competitors. Some common use cases in this stage include the following:
- Real-time data pipeline creation. The ability to deploy pipelines that capture, transform, and utilize data in real time.
- EDW optimization. Extending the capabilities of our EDW to capture new types of data that were not a fit for the EDW itself. This may involve making design choices about which workloads are best for each.
- Customer 360. Incorporating internal and external data (clickstream, user sentiment) to create a better view of your customers.
- Advanced analytics. Moving beyond reporting to advanced analytical modeling that might be run inside your Hadoop cluster.
- Monitoring and detection. Building models to detect threats and patterns on all the data you are collecting.
- Enterprise data hub. Consolidating all of your data assets into a data hub (data lake) architecture and processing, serving, analyzing, and delivering data all in one platform.
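The real-time pipeline pattern in the list above (capture, transform, utilize) can be sketched in a few lines of plain Python. This is an illustrative shape only, not the API of any particular Hadoop ecosystem component; the stage names and the `ServingLayer` class are hypothetical.

```python
from collections import deque

def capture(events):
    """Capture stage: yield raw events as they arrive."""
    for event in events:
        yield event

def transform(stream):
    """Transform stage: clean and normalize each event in flight."""
    for event in stream:
        yield {"user": event["user"], "action": event["action"].lower()}

class ServingLayer:
    """Utilize stage: keep the most recent events queryable."""
    def __init__(self, size=100):
        self.recent = deque(maxlen=size)

    def consume(self, stream):
        for event in stream:
            self.recent.append(event)

# Wire the stages together: events flow through without batching.
incoming = [{"user": "a", "action": "CLICK"}, {"user": "b", "action": "VIEW"}]
serving = ServingLayer()
serving.consume(transform(capture(incoming)))
```

Because each stage is a generator, records move through one at a time rather than in bulk loads, which is the essential difference between a real-time pipeline and traditional batch ETL.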
Deploy to Production
We have built the queries and models. We know what data we want to utilize, and it’s being delivered to us via our pipelines. Now we need to think about deploying these capabilities into production to be consumed by our internal users and, possibly, our customers. Some of these use cases could include the following:
- Cybersecurity. Detecting threats over every vector of attack across your business.
- Recommendation engines. Adding revenue by spotlighting products and services that your customers are likely to want, based on their activity.
- Personalization. Delivering highly tailored ad content and offers based on the data you are collecting from your customers.
- Operational applications. Online dashboards and applications with Hadoop’s capabilities embedded right into the user experience.
- Self-service BI. Offering robust data access to users, with diverse tools for exploration and analysis.
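As a flavor of the recommendation-engine use case above, here is a toy co-occurrence recommender in plain Python: suggest products bought by users whose purchase histories overlap with yours. The data and the `recommend` helper are hypothetical, and production systems would compute this at scale on the cluster with far richer signals.

```python
from collections import Counter

# Toy purchase history: user -> set of products bought.
purchases = {
    "u1": {"laptop", "mouse"},
    "u2": {"laptop", "mouse", "keyboard"},
    "u3": {"laptop", "keyboard"},
}

def recommend(user, history):
    """Rank products owned by overlapping users but not by this user."""
    owned = history[user]
    scores = Counter()
    for other, items in history.items():
        if other == user or not owned & items:
            continue  # skip self and users with no shared purchases
        for item in items - owned:
            scores[item] += 1
    return [item for item, _ in scores.most_common()]
```

Calling `recommend("u1", purchases)` surfaces the keyboard, because both users who share purchases with u1 also bought one, which is the "customers like you also bought" logic behind the revenue claim in the bullet above.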
Hadoop enables a multitude of capabilities, and we can touch on only a few here. When it comes to people and process, you can also apply these stages based on how much training you are affording your team and how much strategic versus operational work the teams are doing. The key is to help users understand where they are in their unique data journey and how to measure success. Whether your organization chooses a strategy adoption or a use case adoption model when embracing Hadoop, this list of considerations can help guide your thinking on how best to engage with Hadoop and the many promises and possibilities it offers.
Sean Anderson is marketing manager for IT Solutions at Cloudera. He is a tenured infrastructure scaling and cloud strategy consultant with a strong focus on strategic partnerships and innovative hybrid technology. He has been a part of integral shifts in technology, including the rise of cloud computing, open source standardization, and big data. Sean quickly became a go-to resource and speaker for data-specific workloads focusing on technologies like Hadoop, MongoDB, Redis, Elasticsearch, SQL, and Data Warehousing. At Rackspace Hosting, Sean helped build and launch open-source cloud platforms around Hadoop, MongoDB, Elasticsearch, and Redis.
Subscribe to Data Informed for the latest information and news on big data and analytics for the enterprise, plus get instant access to more than 20 eBooks.