Hadoop has established itself as the leading software framework for distributed storage and processing of big data. Early adopters such as Facebook, Twitter, and Yahoo made this clear when they used Hadoop to build custom analytics and tackle their biggest big data challenges in ways that would not have been possible with traditional frameworks.
Hadoop utilizes clusters of commodity hardware and provides a powerful and cost-effective alternative to expensive, traditional data warehouse and BI database servers. While it’s earned its stripes as a data analytics platform designed for the storage and processing of both structured and unstructured data, it’s time for Hadoop to mature so it can continue to permeate the enterprise.
The time for Hadoop to grow up is now. A new AtScale study makes this clear: more than 80 percent of participants have been using Hadoop for more than six months, and more than three quarters of the organizations already using Hadoop said they expect to be doing more with it within the next three months. For Hadoop to transition into an enterprise-grade analytics tool, it must foster a true data democracy, implement strong standards across the ecosystem, and reduce the steep learning curve.
Foster a True Data Democracy
The real value of big data lies in the ability to do data discovery. With Hadoop-based tools, any type of user across an organization, from marketing to IT, can not only produce static reports, but also discover insights in all of their data, both structured and unstructured.
While Hadoop requires no pre-built data models, it lacks data loading and integration tools for end users. With the right integration, users can rest assured that they are accessing and analyzing all of their data rather than only a portion of it. Typically, business intelligence tools work only with structured data, which is often stored in an expensive, proprietary environment that is costly and painful to scale. With the right Hadoop-based tools, users can instead work with structured, semi-structured, and unstructured data in its raw form, leveraging commodity hardware and open-source infrastructure, which is cheaper and easier to scale.
Implement Strong Standards Across the Ecosystem
If Hadoop is to succeed in the enterprise, the goal should be to make Hadoop-based tools as easy as possible to use. However, the current bickering among platforms is creating a chaotic marketplace that lacks that ease of use.
Currently, there is no process for product standardization or for the development of formal specifications. As new versions of Hadoop are released, nothing governs what changes are made or how they are made. As a result, when developers modify a release to fix a problem for one customer, they could break applications for other developers. The result? Time and resources are wasted on fixing and working around each version of Hadoop, slowing innovation in the larger ecosystem.
As more Hadoop-native technologies are conceived and built, it will be important for players in the space to establish a committee or system to review new technology additions, similar to the Java Community Process (JCP), which standardizes the Java Enterprise Edition (JEE) platform.
Reduce the Steep Learning Curve
According to Gartner’s 2015 Hadoop Adoption Study, one of the biggest barriers to big data adoption is the complexity of Hadoop, with more than half of respondents stating that skills gaps inhibit adoption. To be enterprise ready, Hadoop-based tools must eliminate this learning curve so that all users, regardless of skill sets, can drive the use of big data across their organizations.
Typical Hadoop use cases require strong Java and scripting skills, a requirement that has long been a roadblock to adoption. The way to get Hadoop ready for “the show” is to take a user-friendly, self-service–oriented approach. Hadoop-related technologies need a quality interface that is comparable to those of other business intelligence tools. With interfaces that are friendly to business users, vendors can truly address the skills gap and more quickly expand access to both technical and non-technical users. The power of Hadoop lies in allowing everyday business users to ask questions of their data and avoid continually inundating their IT departments with data-related requests.
To have an enterprise-ready game face, Hadoop-based tools must not only support highly skilled business users, but also rapidly elevate the skills that are already available in most enterprises by making it easy for users to access the data and ask crucial business questions.
As enterprises continue to embrace Hadoop for big data analytics, addressing these issues will ensure that they continue to rely on it, and that Hadoop shines during its time in the enterprise spotlight. With Hadoop, enterprises will discover astounding business transformation based on data-driven insights.
Raghu Thiagarajan is Senior Director of Product Management at Datameer.