For years, organizations have turned to security events and logs, also known as machine data, to meet compliance requirements for regulations and mandates such as PCI, HIPAA, FISMA, GLBA, NERC, ISO, COSO, and the EU Data Protection Directive. These compliance requirements typically include security event logging and retention, threat detection and alerting, and incident review and response. Additionally, organizations must measure the effectiveness of the many technical controls required by these regulations and mandates.
In the past, organizations have turned to traditional Security Information and Event Management (SIEM) software to meet these requirements. SIEMs centrally collect event and log data from security devices. In turn, these logs can be harnessed for cross-data source correlations and rules to detect threats, after-the-fact incident investigations and response, and for compliance reporting.
Traditionally, organizations have indexed only “security” data in their SIEMs, meaning data from security-focused products such as firewalls, authentication systems, and endpoint anti-virus. However, organizations are beginning to realize that “non-security” data also needs to be logged, because it carries the fingerprints of advanced threats, especially threats that use custom malware and stolen, legitimate credentials to evade detection by traditional security products. This “non-security” data includes event data from hypervisors, cloud platforms, mobile devices, badging systems, operating systems and, more recently, the Internet of Things (IoT), which has experienced massive growth.
The IoT includes a wide variety of networked physical devices with embedded operating systems, and large organizations may have tens of thousands of IoT devices. The types of IoT devices in the enterprise also vary based on the industry:
- Point-of-Sale terminals
- In-store kiosks
- Imaging or medical systems
- Industrial Control Systems (ICS) that control factory floor robots, valves on an oil pipeline, meters on an electrical grid, the systems of a nuclear plant, and more.
To meet security and compliance requirements, the event data or logs generated by these IoT devices need to be indexed, because if a device is subjected to a cyber attack, its logs will hold the fingerprints and evidence of that attack. Typically, cyber attackers use the compromised device as a beachhead to reach a primary data center where confidential data, such as credit cards, taxpayer IDs, or intellectual property, is located. The attack then progresses to stealing this valuable data. Alternatively, the IoT/ICS devices themselves may be the end target. One example is a nation-state wanting to cripple the industrial infrastructure of another country, as was suspected in the 2015 attack on the Ukrainian power grid. Another is an attacker who puts malware directly on point-of-sale systems to steal credit card information from them, as was the case in the 2013 Target data breach.
If you log events from IoT devices, you can connect the dots to see these attacks as they happen and before the attackers succeed with their mission.
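As a rough illustration of what "connecting the dots" can look like, the following Python sketch correlates two kinds of events from the same device: an unfamiliar process starting, followed shortly by a connection to a host that holds sensitive data. Everything here is an assumption made for the example: the event fields, the device and host names, the ten-minute window, and the `correlate` function itself are hypothetical and do not come from any particular product.

```python
from datetime import datetime, timedelta

# Hypothetical, already-parsed events. Field names and values are invented
# for this example and do not reflect any specific device or log format.
EVENTS = [
    {"time": "2016-03-01T12:00:05", "source": "pos-0042",
     "type": "new_process", "detail": "updater.exe"},
    {"time": "2016-03-01T12:03:40", "source": "pos-0042",
     "type": "outbound_connection", "detail": "db-cardholder-01"},
    {"time": "2016-03-01T12:10:00", "source": "kiosk-0007",
     "type": "outbound_connection", "detail": "update-server"},
]

# Assumed list of hosts that store confidential data.
SENSITIVE_HOSTS = {"db-cardholder-01"}
# Assumed correlation window.
WINDOW = timedelta(minutes=10)


def correlate(events, sensitive_hosts=SENSITIVE_HOSTS, window=WINDOW):
    """Flag devices that start a new process and then, within the window,
    connect to a sensitive host. Returns (device, process, host) tuples."""
    alerts = []
    last_new_process = {}  # device -> (time of event, process name)
    for event in sorted(events, key=lambda e: e["time"]):
        when = datetime.fromisoformat(event["time"])
        device = event["source"]
        if event["type"] == "new_process":
            last_new_process[device] = (when, event["detail"])
        elif (event["type"] == "outbound_connection"
              and event["detail"] in sensitive_hosts):
            previous = last_new_process.get(device)
            if previous and when - previous[0] <= window:
                alerts.append((device, previous[1], event["detail"]))
    return alerts


if __name__ == "__main__":
    for device, process, host in correlate(EVENTS):
        print(f"ALERT: {device} ran {process} and then contacted {host}")
```

With the sample events above, only the point-of-sale terminal is flagged; the kiosk's connection to an ordinary update server is ignored. A real deployment would tune the event types, the window, and the list of sensitive hosts to its own environment.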
Traditional SIEMs Are Failing Amid Flood of IoT Data
Even before the massive growth of the IoT, traditional SIEMs had problems keeping up with the huge volume of security data that needed to be indexed. The problem has gotten even worse with the increased volume of event data from IoT/ICS and the variety of formats in which this data is generated.
Traditional SIEMs struggle because they use brittle “connectors” to ingest data only for common, popular security products and not for IoT/ICS products. It is costly and difficult to build new connectors. Even if you build a connector for a type of IoT device, the next problem is that these traditional SIEMs use a relational datastore on the back end, which means:
- The fixed schema of the datastore prevents the indexing of all logs and security events. Usually, one has to normalize raw data to fit the schema, which means valuable log data is lost that otherwise might be needed for threat detection or investigation.
- The single datastore is a point of failure and chokepoint that prevents scale and speed.
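The data-loss effect of a fixed schema can be sketched in a few lines of Python. The raw log line, the field names, and the four-column schema below are all hypothetical, invented purely to show the idea: normalizing to a fixed set of columns discards whatever the schema did not anticipate, while keeping the raw event and extracting fields at search time preserves it.

```python
import re

# Hypothetical raw event from a point-of-sale terminal (invented for
# illustration; real device log formats vary widely).
RAW_EVENT = (
    "2016-03-01T12:00:05Z device=pos-0042 action=process_start "
    "process=updater.exe parent=svchost.exe user=svc_pos fw_version=2.1.7"
)

# Hypothetical fixed schema, as a relational back end might define it.
FIXED_SCHEMA = ("timestamp", "device", "action", "user")


def parse_fields(raw):
    """Extract the timestamp and every key=value pair from the raw event."""
    timestamp, _, rest = raw.partition(" ")
    fields = {"timestamp": timestamp}
    fields.update(re.findall(r"(\w+)=(\S+)", rest))
    return fields


def normalize(raw):
    """Keep only the columns that the fixed schema knows about."""
    fields = parse_fields(raw)
    return {column: fields.get(column) for column in FIXED_SCHEMA}


def dropped_fields(raw):
    """List the fields that normalization would discard."""
    return sorted(set(parse_fields(raw)) - set(FIXED_SCHEMA))


if __name__ == "__main__":
    print("Stored after normalization:", normalize(RAW_EVENT))
    print("Lost during normalization: ", dropped_fields(RAW_EVENT))
```

In this made-up example, the process name, its parent process, and the firmware version never reach the datastore, even though those are exactly the details an investigator might need when looking for custom malware.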
Another limitation of traditional SIEMs is their rigid user interfaces with inflexible search and report-building capabilities. This hampers the ability to create single reports that span multiple regulations or to run custom searches to satisfy an ad-hoc auditor request or facilitate an incident investigation.
Big Data Is the Answer
If this pain around logging IoT data sounds familiar, the good news is that there is an answer: using big data as a “next generation” SIEM. In this context, big data refers to technology that uses a flat file data store, scales horizontally on commodity hardware, and uses distributed, Google-like search for scale and speed. It also has flexible and powerful search and reporting capabilities that can be used for SIEM use cases such as correlations, alerting, and reporting.
Big data solutions don’t suffer from the limitations of traditional SIEM solutions:
- Flexible and easy-to-build connectors make data onboarding much easier, including for IoT event data.
- A flat file data store means the ability to index all original data without modifying it.
- Distributed architecture means scale and speed for data ingestion, searching, reporting, and alerting.
- Flexible UI and search/reporting capabilities mean the ability to easily create reports and run searches needed to show compliance status. Also, they enable you to pivot through raw logs to facilitate incident investigations and response.
Big data technology also has the capabilities required of an enterprise-ready compliance, logging, and SIEM solution, including real-time searching and alerting, role-based access control, data hashing to demonstrate that logs have not been tampered with, and granular control of how long to retain logs. Furthermore, many vendors offer big data solutions with pre-built searches and reports for various compliance regulations and mandates. An added bonus is that big data typically is software-only, so it enables flexible deployment options that meet the requirements of any organization: on-premises, in the cloud, or a hybrid of both. Big data offerings include both commercial and open-source technologies.
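To make the data-hashing capability more concrete, here is a minimal Python sketch of one common way to show that logs have not been altered: a hash chain, in which each digest covers the current log line plus the previous digest. This is an illustrative assumption, not a description of how any specific vendor implements it; the function names, the all-zero starting value, and the sample log lines are hypothetical.

```python
import hashlib

# Assumed starting value for the chain (64 zeros, the length of a SHA-256 hex digest).
GENESIS = "0" * 64


def chain_hashes(log_lines):
    """Return one SHA-256 digest per line, each covering the line and the
    digest before it, so changing any line alters every later digest."""
    digests = []
    previous = GENESIS
    for line in log_lines:
        previous = hashlib.sha256((previous + line).encode("utf-8")).hexdigest()
        digests.append(previous)
    return digests


def verify(log_lines, digests):
    """Recompute the chain and compare it with the stored digests."""
    return chain_hashes(log_lines) == digests


if __name__ == "__main__":
    # Hypothetical log lines, invented for this example.
    logs = [
        "12:00:05 pos-0042 process_start updater.exe",
        "12:03:40 pos-0042 connect db-cardholder-01",
    ]
    stored = chain_hashes(logs)
    print("Original logs verify:", verify(logs, stored))

    tampered = [logs[0], "12:03:40 pos-0042 connect update-server"]
    print("Tampered logs verify:", verify(tampered, stored))
```

The digests would need to be stored separately from the logs, with tighter access controls, for the check to be meaningful to an auditor.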
Big data solutions can scale to index the massive amounts of machine and event data generated by any data source, including the rapidly increasing number of events from IoT and ICS systems, and harness all this data to detect, investigate, and report on threats and to meet compliance requirements. Some of these solutions have scaled up in the real world to index over a petabyte of raw logs per day. Simply put, by using big data as a “next-generation SIEM,” companies can implement better, faster, and lower-cost compliance.
As an added bonus, once IoT/ICS data is in a big data solution, it also can be used to measure and manage the operational health of all these devices, including alerting if there is a performance issue or facilitating root-cause investigations.
So next time you find yourself struggling with event logging or compliance, especially with IoT/ICS, give big data a look.
Joe Goldberg is the Security/Compliance/Anti-Fraud Evangelist at Splunk. Goldberg’s responsibilities include technical product marketing and evangelism for security and compliance use cases. He is also a published contributor for Wired Magazine, Dark Reading, and SC Magazine. Prior to Splunk, he did Technical Product Marketing for the Data Loss Prevention product at Symantec. Previously, he did product marketing for both VMware and Sun Microsystems. He has also worked in the financial services industry doing venture capital and corporate development. He has an M.B.A. from the Wharton School at University of Pennsylvania and a business degree from the Haas School of Business at UC Berkeley.