Big data is getting bigger all the time, but that’s only half the story. The ever-growing amount of information streaming in from sensors, point-of-sale systems, social media and clickstreams means that enterprises must, now more than ever, have the capability to react quickly. Data, after all, has a shelf life. It’s all very well if your analytics framework can tell you how you should have kept your customers satisfied yesterday, but you’re likely to lose out to a competitor who has worked out how to keep them satisfied today and tomorrow.
This is the concept behind “fast data.” Of course “velocity” has always been one of the Vs of big data – along with volume, variety and veracity. But the explosion in the application of real-time, in-memory and edge analytics means that increasing efforts are going into tackling data as soon as it emerges from the firehose, where the insights which can be gleaned are at their most valuable.
For many of the most cutting-edge applications – demand forecasting, fraud detection and compliance reporting, for example – data quickly loses its value if it can’t be analyzed and acted on immediately. When data scientists at Walmart were putting together the latest iteration of the supermarket giant’s data framework, they decided that only the previous few weeks’ worth of transactional data would be streamed through their pipelines – anything older was regarded as too untimely to have any real value in demand forecasting.
Likewise, in banking and insurance, enterprises are finding that immediate access to the most relevant data is vastly more valuable than petabytes of historical data that have sat in warehouses for years, gathering virtual dust (and incurring storage and compliance expense) because someone thought they might one day be useful.
The open source community has embraced the concept of “fast data” wholeheartedly, with platforms such as Spark, Kafka and Storm becoming popular in recent years due to their ability to process streams of data with lightning speed. To achieve this, data is often processed in-memory – cutting down the time needed to spin up physical hard disks and seek the information stored on them. An important differentiator is that “fast” Big Data is generally processed as a stream, while “slow” Big Data is processed in batches.
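The stream-versus-batch distinction can be sketched in a few lines of plain Python. This is a hypothetical toy illustration, not Spark or Kafka API code: the batch function waits for all the data before answering once, while the streaming function keeps only a small in-memory window and yields a fresh answer after every event.

```python
from collections import deque
import statistics

def batch_average(events):
    # "Slow" big data: wait for the full batch to land, then compute once.
    return statistics.mean(e["value"] for e in events)

def stream_averages(events, window=3):
    # "Fast" data: process each event as it arrives, holding only a small
    # in-memory window rather than the full history on disk.
    recent = deque(maxlen=window)
    for event in events:
        recent.append(event["value"])
        yield sum(recent) / len(recent)  # an insight is available immediately

events = [{"value": v} for v in (10, 20, 30, 40)]
print(batch_average(events))          # one answer, only after all data arrives
print(list(stream_averages(events)))  # a running answer after every event
```

The trade-off the sketch makes visible is the same one the streaming platforms manage at scale: the batch result is computed over everything but arrives late, while the streaming result is always current but only ever sees a bounded slice of the data held in memory.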
One company providing fast data solutions is Nastel, whose customers include a Fortune 500 bank that processes over $1 trillion in funds per day. Several times each day, the bank is required to reconcile its vast accounting records with the Federal Reserve. Today, the bank is able to analyze these transactions in-memory and ensure that they are handled in priority order, as some can be delayed while others must be processed immediately. This requires the Nastel software to analyze the payload contents of each payment to ensure that everything that must be settled immediately with the Fed is processed first.
Similarly, another multi-national financial services customer of Nastel is now able to report on all their derivative trades as fast as technology allows. The institution acquires trade data from multiple trading systems, enriches the data, analyzes it in-memory and then immediately reports on each trade to an external regulatory agency. As this data is processed, business users attempt to forecast potential breaches in compliance and take action to avoid them. In fact, many financial services firms are now rated based on how fast and accurate their reports are.
Many analysts believe fast data to be essential for tasks such as recommendation and personalization engines – where information about a customer needs to be processed as soon as they visit a web page or walk into a store, and available to be acted on immediately.
Another application is “smart” power grids, where demand can be forecast and resources allocated across the grid to ensure supply is available when and where it is needed. This technology is being applied in smart city projects around the world.
It’s also vital for the complex fraud- and error-prevention algorithms employed by banks, where the problems caused by errant transactions can be magnified if they are not detected and rectified immediately.
What’s more, it isn’t just applications that rely on structured data. Increasingly, systems that monitor and respond to unstructured data – posts made to social media, or audio gathered from customer service calls – are proving valuable when their output is available in real or near-real time. One example is video security systems, where predictive modelling can be used to raise alerts when suspicious or abnormal activity is picked up by surveillance devices.
More and more applications of fast data are quickly turning our big data landscape into a racetrack. Is it time to take your big data projects out of the slow lane and into the fast lane?
Bernard Marr is a bestselling author, keynote speaker, strategic performance consultant, and analytics, KPI, and big data guru. In addition, he is a member of the Data Informed Board of Advisers. He helps companies to better manage, measure, report, and analyze performance. His leading-edge work with major companies, organizations, and governments across the globe makes him an acclaimed and award-winning keynote speaker, researcher, consultant, and teacher.