In 2017, the typical enterprise can count on one certainty: a wave of data-powered competitors is rising. Once-monolithic core business applications are both proliferating and breaking apart into microservices while generating many times more data, and the growth in the number of data-producing endpoints in enterprises shows no sign of stopping. On top of that, real-time services are being introduced across all of these boundaries. As this happens, the engineers who support the infrastructure face an existential question:
Should we pursue each new data project as a unique discrete effort, or find an approach with greater scalability?
For many, Apache Kafka has become the answer to this question. Between the traction we have seen at conferences and meetups, the volume of activity in online communities and the general excitement we see when Kafka developers get together, the enthusiasm around Kafka is tangible, and it’s creating a movement. Among other benefits, Kafka lets organizations bring an immense volume of new data and data sources into a simple, unified streaming platform at the center of the organization. Any team can join the platform, a central team can manage the service, and the platform scales to trillions of messages per day while processing and delivering that data in real time.
To find out more about why and how companies are streaming data and the impact it has on their business, Confluent surveyed more than 350 organizations from 47 countries and a wide variety of industries about the evolving Apache Kafka user base, use cases and deployments. About one in four respondents (26 percent) work for organizations with more than $1 billion in annual sales, and more than 15 percent of respondents are processing one billion messages per day, illustrating how quickly this open source technology has gained traction across large enterprises.
Below are some of the key takeaways from this year’s survey:
- Streaming data is surging across enterprises: 86 percent of respondents reported that the number of their systems that use Kafka is increasing and one-fifth (20 percent) reported that the number is “growing a lot!” A majority (52 percent) of organizations have at least 6 systems running Kafka with over one-fifth (21 percent) having more than 20. According to last year’s report, only 41 percent of organizations had at least 6 systems running Kafka and only one-tenth (10 percent) had more than 20.
- Kafka is broadly used in the cloud: This year’s survey showed Kafka is used by organizations in some combination of virtual private clouds (34 percent), public clouds (52 percent), and on premises (57 percent). Nearly one-third (32 percent) of respondents who use Kafka in the cloud have at least 6 Kafka applications.
- Kafka creates new business opportunities and improves the old ones, too: Because data is available, shared and immediate, companies can create new products and significantly transform existing ones. As Kafka is deployed in more mission-critical infrastructures, a majority (54 percent) of surveyed organizations say that their business can make more accurate and/or quicker decisions thanks to Kafka. Beyond creating new opportunities, companies use Kafka to become more efficient and to transform existing processes. Other business benefits that respondents reported include reduced operating costs (47 percent) and improved customer experience (40 percent).
- Organizations are using Kafka in many different ways: The most common use case for Kafka is data pipelines (81 percent). Two-thirds (66 percent) use it for stream processing and three out of five (60 percent) use it for data integration. Companies are also solving a newer problem with Kafka: microservices, which half (50 percent) of respondents already use it for. While microservices involve many independent services, the goal is broader than simply running them across different machines. It’s about facing up to a world that is, itself, inherently distributed. This is not meant in some narrow technical sense but rather as a broad ecosystem composed of many people, many teams and many programs, all of which need the agility that microservices afford them.
- Kafka’s growing feature list is having an impact on organizations: The Kafka Connect API, included in Kafka, makes it easy to add new data stores to your data pipelines without having to write the interfaces from scratch. There was a 25-point increase in organizations using the Kafka Connect API over last year (37 percent in 2017 vs 12 percent in 2016). While a majority (59 percent) of respondents have databases connected to their Kafka clusters, only 36 percent use the Kafka Connect API with Hadoop/HDFS, which is a 4-point drop from last year.
- There is a shortage of skilled Kafka engineers: According to a Dice report, people with Kafka skills command some of the highest salaries in the technology market. However, despite the salaries and the growth of Kafka within organizations, three-quarters (75 percent) of respondents say it is difficult to find the right talent with Kafka skills.
The results of this survey point to the increasing use of Apache Kafka for a broadening set of business and technical objectives. Many companies are implementing the distributed streaming platform for more accurate and faster decision making, reduced operating costs, improved customer experiences and reduced risk. With demands from customers to react and respond in real time and legacy technologies holding companies back, it’s no wonder companies have turned to Apache Kafka to embrace the power of streaming data for competitive advantage.
Luanne Dauber is an enterprise executive with nearly 20 years of experience in marketing, product management, product marketing, sales and leadership. She is CMO of Confluent, a company that provides the leading streaming platform based on Apache Kafka. Previously, she was head of marketing at Pure Storage, an all-flash storage provider for enterprises, where she was responsible for corporate vision and brand, thought leadership, market category creation, field marketing, strategic pipeline planning and acceleration, and direct marketing.