The market for Big Data and analytics technology is in a state of fast change and rapid growth. A recent development in that market is the emergence of a class of platforms and managed toolsets that can be termed Big Data-as-a-Service (BDaaS).
It’s easy to see the appeal of these kinds of solutions. Instead of building a data center, developing an analytics toolset stack, and investing in a team of trained data scientists – a costly and time-consuming project for any enterprise – why not simply pay as you go?
I predict that this market will grow phenomenally in the near future, driven by the increasing adoption of Big Data analytics across the mid-tier enterprise. The solutions available in this market minimize the need for large financial outlays before any results can be seen, making them a very attractive prospect for those organizations where a clear business case is a necessity rather than a luxury.
The rapid growth of the market does, however, mean that it is unsettled and constantly changing. The current standard model is that of a managed, cloud-hosted Hadoop distribution alongside an ecosystem of open source or proprietary analytics, data management, and security technology. I can see a future in which the growing number of organizations with a need for analytics may create even more transparent, highly managed service offerings. Though all of the services I cover at the end of this article are based on Hadoop, the extent to which this is mentioned in their individual marketing literature differs.
Some of the providers of BDaaS are household names for reasons other than their cloud business services, and their status as digital giants is intended to inspire confidence that they know what they are doing when it comes to security and compliance. Others exist purely to provide BDaaS.
If you’re evaluating BDaaS for your own business, there are a few key questions to ask when choosing a provider:
Does it offer low or zero start-up costs?
Many of the providers listed here offer a free trial so, in theory, you could see results before you spend a penny.
Is the solution scalable?
Big Data projects have a tendency to grow in size beyond the initial vision – can you easily and affordably buy more storage and processing resources as you need them?
Is it already in use in my industry?
If you are paying for consultancy and project planning support alongside your data hosting and analytics, does your provider have experience supporting your business cases and customers?
Does it fit my organization’s needs?
BDaaS is particularly suited to strategies that involve analysis of very large, messy, and unstructured datasets. It also requires moving large amounts of data to a third-party provider, which is likely to raise security and compliance issues.
Does it offer real-time analysis and feedback?
Today’s most exciting and rewarding Big Data projects provide insights based on what is happening now, not just what was happening last week. This means that companies can take action when it is needed rather than simply learning from the past.
Is it managed or self-service?
Most providers offer a mixture of both approaches, with technical staff working behind the scenes to provide you with services in as transparent a way as possible. However, the level of support and consultancy included in your package will vary.
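One way to keep the answers to these questions comparable across providers is to record them in a simple weighted checklist. The sketch below is only an illustration: the question keys, the `score_provider` helper, the weights, and the example answers are all hypothetical and not drawn from any provider's documentation. Each answer is a number from 0.0 (no) to 1.0 (yes), and the weights reflect how much each question matters to your own business case.

```python
# Hypothetical checklist keys, one per evaluation question in this article.
QUESTIONS = (
    "low_startup_cost",
    "scalable",
    "used_in_my_industry",
    "fits_my_needs",
    "real_time",
    "managed_or_self_service",
)


def score_provider(answers, weights=None):
    """Return a weighted score between 0.0 and 1.0 for one provider.

    answers: dict mapping each question key to a value from 0.0 to 1.0.
    weights: optional dict of relative importance per question
             (defaults to equal weighting).
    """
    if weights is None:
        weights = {q: 1.0 for q in QUESTIONS}

    missing = [q for q in QUESTIONS if q not in answers]
    if missing:
        raise ValueError(f"unanswered questions: {missing}")

    total = sum(weights[q] for q in QUESTIONS)
    earned = sum(weights[q] * answers[q] for q in QUESTIONS)
    return earned / total


# Illustrative answers for an imaginary provider (not real data).
example_answers = {
    "low_startup_cost": 1.0,
    "scalable": 1.0,
    "used_in_my_industry": 1.0,
    "fits_my_needs": 1.0,
    "real_time": 0.0,
    "managed_or_self_service": 1.0,
}

# An organization that cares most about real-time analysis might
# weight that question three times as heavily as the others.
example_weights = {q: 1.0 for q in QUESTIONS}
example_weights["real_time"] = 3.0

example_score = score_provider(example_answers, example_weights)
```

A score like this is no substitute for a free trial or a conversation with the provider, but it makes the trade-offs explicit when several candidates look similar on paper.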
Here’s a quick introduction to some of the most prominent BDaaS services available today.
Google Cloud Dataproc
Google’s managed Big Data service has enjoyed fast growth since it went into general release earlier this year. Clearly, Google has been able to leverage its presence and reputation for cloud innovation to offer a package that industry has found attractive. It runs Hadoop and Spark on Google’s Cloud Platform and integrates with the Bigtable storage and BigQuery analytics frameworks.
Amazon Web Services
AWS is the collective name for Amazon’s cloud-based business tools and services. Its managed Hadoop service is called Amazon Elastic MapReduce, and it works with Amazon’s S3 storage infrastructure. AWS is the market leader in providing business cloud computing services, and customers benefit from its world-class data security infrastructure.
Microsoft Azure HDInsight
Microsoft’s strong presence in the business software market made it a sure bet that it would play for a slice of the BDaaS pie. It has built on its Azure cloud framework by increasingly adding functionality and compatibility with open source technology such as Spark and Storm. For many organizations, a big plus will be the user interface (UI) features that are immediately familiar to millions, even if they have never been near an analytics dashboard previously.
Salesforce Wave Analytics
Salesforce built up partnerships with companies including Google and Cloudera to bring Hadoop-based Big Data analytics to its cloud data services. Wave Analytics uses a UI that will be familiar to any users of its market-leading CRM software to enable dynamic visualizations. It is also optimized for use on smartphones and watches, making it a strong contender if you want to put real-time analytics in the hands of a mobile or shop floor workforce.
Qubole Data Service
Less of a household name than those mentioned previously, Qubole was founded by former Facebook data scientists who saw a need for a self-service Big Data platform for the enterprise. It is designed to be operated from a UI that assumes no prior experience of using Hadoop. As Qubole is not a cloud storage provider, its solution can be configured to run on Amazon, Google, or Microsoft cloud infrastructure.
IBM BigInsights on Cloud
IBM’s data management systems already have high penetration, so it was natural that the company would look to move into business cloud computing. By integrating advanced analytics tools, it has put together a suite of services aimed at lowering the entry barrier to Big Data analytics. IBM has also forged partnerships with social media companies such as Twitter, making it easier to gain insights, and developed its own cognitive, natural language processing engine, Watson, allowing data to be queried and analyzed using natural human language.
I hope that this article gives you an overview of the current BDaaS market. As this market is evolving very fast, I will be keeping a close eye on the developments.
Bernard Marr is a bestselling author, keynote speaker, strategic performance consultant, and analytics, KPI, and big data guru. In addition, he is a member of the Data Informed Board of Advisers. He helps companies to better manage, measure, report, and analyze performance. His leading-edge work with major companies, organizations, and governments across the globe makes him an acclaimed and award-winning keynote speaker, researcher, consultant, and teacher.