Hadoop is maturing. On October 15, the Apache Software Foundation released Version 2 of the open source Java-based framework.
This release comes with support for YARN, a kind of operating system for Hadoop that allows for new kinds of data processing beyond the MapReduce batch jobs that have characterized Hadoop systems until now.
On this episode of the Data Informed podcast, Shaun Connolly, vice president of corporate strategy at Hortonworks, talks about the development of Hadoop and what it means for enterprises that are using it, or considering using it, for data-intensive projects.
Hadoop makes it cheap to store large datasets. But as this article notes, while it’s straightforward to download and install its components, working with it (that is, designing, developing and deploying analytics applications on top of it) requires skill and expertise. Hortonworks, which announced support for Hadoop Version 2, is one of the vendors in a growing ecosystem of companies that build and support Hadoop-based systems.
In the podcast, Connolly explains how YARN and other enhancements in the Hadoop ecosystem open up new business use cases for Hadoop users. He also discusses what he sees as future directions for the ecosystem, including needed security improvements and innovations in machine learning systems.
Michael Goldberg is the editor of Data Informed. Email him at Michael.Goldberg@wispubs.com.