The total volume of data from connected devices is expected to reach 1.6 zettabytes by 2020. From smart traffic management to industrial plants, the Internet of Things is opening the door to a deluge of data that enterprises must manage. Tremendous value will come from analyzing the data to gain understanding and make better, and faster, business decisions. Much like the market for “big data” analytics – machine-learning, predictive, and real time – the IoT industry is poised for a data revolution.
IoT data is sourced primarily from sensors (fixed and mobile) and, for many companies, is a new type of data. The majority of enterprises are used to structured and “clean” data accessible through queries to databases, not streaming, heterogeneous data transmitted perpetually in real time. Dealing with this new type of data and extracting value from it is challenging and requires extensive experience and know-how. Businesses with the potential to derive value from IoT (i.e., those collecting new or additional sensor data, extracting insights, and generating business value) need to be informed about how best to utilize this new resource. For example, with the right sensors, measurements, and analytics, new information can be extracted from existing processes to generate real and quantifiable business value – such as predictive maintenance, additional efficiencies, and even new services for customers.
The mass of data generated by IoT devices is characteristically different from the transaction-oriented business data that organizations are used to. IoT data is more dynamic, heterogeneous, unstructured, and real time, and therefore demands more sophisticated, IoT-specific analytics to generate meaningful insights. Organizations need to understand IoT analytics in advance and recognize how this data can be used to make business-critical decisions, thereby saving time, effort, and expense.
“Big data” historically has referred to large, static, structured databases. Many people erroneously equate big data with the Hadoop framework, but Hadoop alone cannot, for example, deal with real-time streaming data. More recently, people have begun to associate the “3 Vs” – volume, variety, and velocity – with big data. Volume is the “big” component of big data, variety refers to the many types of data (e.g., video, audio, text), and velocity refers to the rate at which the data is collected. These characteristics also describe IoT data in many ways, so in some sense IoT is an ever-growing source of big data. Still, while most “traditional” big data analytics is geared toward conventional databases, IoT data is more complex and requires analytics that account for that complexity. For example, IoT data is:
- Messy, noisy, and sometimes intermittent, because sensors are often deployed in the field – on a telephone pole or street light, for example. Sensors often cut in and out, and the resulting data is often referred to as “messy data.”
- Often highly unstructured and sourced from a variety of sensors (fixed and mobile)
- Dynamic – “data in motion” as opposed to the traditional “data at rest”
- Sometimes indirect – a relevant quantity cannot always be measured directly and must be inferred: for example, counting people in a certain area by applying video analytics to camera footage
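The “messy” and intermittent readings described in the list above can be illustrated with a minimal Python sketch. It assumes a sensor reports `None` when it cuts out; the function names `fill_gaps` and `median_filter` and the sample temperatures are hypothetical, chosen only for illustration. Forward fill and a median filter are just two of many possible cleaning techniques, not a recommended pipeline.

```python
from statistics import median
from typing import List, Optional


def fill_gaps(readings: List[Optional[float]]) -> List[float]:
    """Replace missing readings (None) with the most recent valid value.

    Gaps at the very start are filled with the first valid value.
    Returns an empty list if no reading is valid.
    """
    first_valid = next((v for v in readings if v is not None), None)
    if first_valid is None:
        return []
    filled = []
    last = first_valid
    for value in readings:
        if value is not None:
            last = value
        filled.append(last)
    return filled


def median_filter(values: List[float], window: int = 3) -> List[float]:
    """Smooth isolated spikes by taking the median of a sliding window.

    The window is truncated at both ends of the list.
    """
    half = window // 2
    return [
        median(values[max(0, i - half): i + half + 1])
        for i in range(len(values))
    ]


# Hypothetical temperature readings: two dropouts and one spurious spike.
raw = [20.1, None, 20.3, 95.0, 20.2, None, 20.4]
cleaned = median_filter(fill_gaps(raw))
print(cleaned)
```

In this sketch the two dropouts are bridged with the previous reading, and the isolated 95.0 spike is replaced by the median of its neighbors. A real deployment would need to decide how long a gap can be before filling it becomes misleading.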
Enterprises that want to leverage analytics have to understand and incorporate a range of techniques to overcome the noisiness of IoT data. The challenges that companies will face in handling this new IoT data lie partly in infrastructure changes, such as deploying sensors or adopting new software, and these in turn lead to organizational changes. In fact, reorganization of internal staff will be necessary to the growth of IoT. For example, IT and Operational Technology (OT) cannot remain separate. As new functions emerge, a converged IT and OT team will take on the responsibilities of a maintenance officer and oversee operations, while also managing data and predictive analytics on many devices that contain hundreds of sensors. This capability will create efficiency and reduce costs, because it offers real-time visibility that enables quick decision making where necessary, such as ordering new parts or putting in a maintenance request to fix an issue before it disrupts operations.
IoT also enables a new way of handling analytics: distributed analytics, or the possibility of running some of the analytics on the “edge,” close to where the data is generated. This type of edge analytics (also known as fog computing) scales better than sending all raw data to a central location and uses computation and bandwidth resources much more efficiently. It also cuts down on the cost of transporting large amounts of unnecessary data.
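As a rough illustration of how edge analytics can reduce the data that has to be transported, the Python sketch below summarizes each window of raw readings on the device and keeps only the summary for transmission. The `WindowSummary` type, the function names, and the window size are assumptions made for this example, not part of any particular edge or fog platform.

```python
from dataclasses import dataclass
from statistics import mean
from typing import List


@dataclass
class WindowSummary:
    """Compact description of one window of raw readings."""
    count: int
    minimum: float
    maximum: float
    average: float


def summarize_window(readings: List[float]) -> WindowSummary:
    """Reduce a non-empty window of raw readings to four numbers."""
    if not readings:
        raise ValueError("cannot summarize an empty window")
    return WindowSummary(
        count=len(readings),
        minimum=min(readings),
        maximum=max(readings),
        average=mean(readings),
    )


def summarize_stream(readings: List[float], window_size: int) -> List[WindowSummary]:
    """Split readings into consecutive windows and summarize each one.

    The last window may be shorter than window_size.
    """
    return [
        summarize_window(readings[i:i + window_size])
        for i in range(0, len(readings), window_size)
    ]


# Hypothetical example: six raw readings become two summaries to send upstream.
summaries = summarize_stream([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], window_size=3)
print(summaries)
```

The trade-off in a design like this is that detail inside each window is lost, so the window size and the choice of summary statistics depend on what the central analytics actually need.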
The sheer volume of IoT data, together with constantly evolving data science techniques, is expected to create many new business opportunities. For instance, IoT data will generate new service offerings for customers. Even companies that traditionally have collected sensor data, such as those in the industrial sector, have used it in a very limited way: measuring some aspect of an industrial process or machinery, adjusting a controller, and then discarding the measured data. With the IoT, more sensors can be utilized and the data can be stored, mined, and analyzed in new ways and used for new services.
In addition, IoT data, coupled with advanced machine-learning approaches, can be used to detect anomalies in an industrial process, in vehicular traffic, and even in human activities and behavior, driving response and action in a quicker and more efficient manner. By using advanced data analytics to inspect and analyze historical IoT data, we can detect patterns indicating changes that require attention. For example, detecting the deterioration of machinery in the manufacturing sector, or of home appliances, enables predictive maintenance: the problem can be fixed while its impact is still low in terms of money and time lost.
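A very simple form of the anomaly detection described above can be sketched in Python with a rolling z-score: each reading is compared with the mean and standard deviation of the readings just before it. This is a basic statistical stand-in for the machine-learning approaches the text mentions, and the function name `detect_anomalies`, the window size, the threshold, and the vibration values are all assumptions chosen for illustration.

```python
from statistics import mean, stdev
from typing import List


def detect_anomalies(
    readings: List[float], window: int = 5, threshold: float = 3.0
) -> List[int]:
    """Return indices of readings that deviate strongly from recent history.

    A reading is flagged when it lies more than `threshold` sample standard
    deviations from the mean of the preceding `window` readings.
    """
    anomalies = []
    for i in range(window, len(readings)):
        history = readings[i - window:i]
        mu = mean(history)
        sigma = stdev(history)
        if sigma == 0:
            # Perfectly flat history: a z-score is undefined, so skip.
            continue
        if abs(readings[i] - mu) / sigma > threshold:
            anomalies.append(i)
    return anomalies


# Hypothetical vibration levels from a machine, with one sudden spike.
vibration = [1.0, 1.1, 0.9, 1.0, 1.1, 1.0, 5.0, 1.0]
print(detect_anomalies(vibration))  # flags index 6, the 5.0 spike
```

Note that in this sketch a flagged spike still enters the history used for later readings, which inflates the standard deviation for a while. A more careful implementation might exclude flagged values or use a more robust measure of spread.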
Making sense of this flood of data will be the challenge that drives better and more advanced analytics. Many organizations recognize that the gap between collected data and analyzed data is growing. We will need to be smart about how we tackle this challenge – do we collect everything? Do we store everything? Do we analyze everything just because we can?