Computer security pros are learning about the benefits of big data that their peers in marketing, science, and medical research already know.
According to a recent report from the Information Security Forum (ISF), an international association of organizations concerned with information risk management, big data analytics can enable malware fighters to move beyond chasing security incidents as they happen to predicting and preventing them.
While analytics has the potential to reduce cyber security risks for organizations and improve their agility in handling those risks, the report noted, it remains immature in the information security space.
When analytics has been used as a security tool, it has been deployed reactively, to monitor security incidents or discover breaches, explained ISF Global Vice President Steve Durbin. “What we’re saying, though, is that going forward there is a massive opportunity for organizations to use analytics to try to be more proactive and forward-looking about their security.”
In its report, the ISF explained how big data analytics helped one company purge an advanced persistent threat from its systems. The company correlated business information (payroll, customer and vendor data) with security information (firewall logs, network scanner reports and vulnerability analyses). Then it applied a variety of advanced analytic techniques, augmented with information from servers containing high-risk data. The brew was then displayed using a variety of visualization techniques. What the company discovered was that its internal processes were being used by criminals to launder money.
“This insight provided the organization not only with the ability to shut down the immediate cyber-criminal attack, but also with a richer understanding of the nature of the threats it faces,” the report explained.
One security company that has been on the leading edge of using analytics to fight Black Hats is Trend Micro. Its Smart Protection Network collects threat data from a variety of sources all over the world. The rapid growth of that data made analytics an operational necessity for the company, according to Senior Manager for Threat Marketing Jon Clay.
When Trend Micro launched the network in 2008, it handled five billion queries a day. Today, that number has ballooned to 16 billion. What’s more, the amount of threat data analyzed by the network on a daily basis has also burgeoned, from one terabyte to six terabytes. That information allows the company to identify and block more than 200 million threats a day.
Initially, Trend Micro used common big data analytic tools like HBase and Hadoop to massage its data. “As your data set grows, the challenge is having a tool that can search through the trillions of rows of data in a fast enough time to be proactive,” Clay explained. “So we’ve had to build custom tools that allow us to sift through the data faster.”
As cyber criminals increase the velocity at which they create new attacks, analytics is becoming increasingly important to malware fighters. “Criminals are putting a lot of stuff out there so that it’s hard to keep up if you’re not using some kind of machine-learning technology,” Clay observed.
On a regular basis, Trend Micro uses analytics to model behavior, Clay explained. “We analyze criminal techniques and tools and create models of behavior that we can plug into our big data structures,” he said. “That allows us to identify threats very quickly and build out machine-learning technology that can identify threats before they’re launched.”
Analytics has been a boon for securing company networks, too. At the University of Connecticut in Storrs, Chief Information Security Officer Jason Pufahl maintained that a big data solution from Splunk has enabled the school to strengthen its network security in ways that would be impossible without it.
Before the introduction of Splunk some five years ago, security information stored in server logs was scattered throughout the university. Splunk allowed the university to centralize that information and gave it the tools to extract meaning from the data. “With Splunk, you can run a query on something as simple as a user name and can produce log information from 20 or 30 different sources,” Pufahl observed. “So it becomes easy to do incident response tasks that would have traditionally been a lot more complicated.”
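The idea behind that kind of query can be illustrated with a minimal Python sketch. It is not Splunk's API or query language. The `search_user` function, the source names and the sample records are all hypothetical, and each log source is assumed to have already been parsed into (timestamp, user, message) tuples.

```python
from datetime import datetime

def search_user(sources, username):
    """Collect every record for one user across all sources, oldest first.

    `sources` maps a source name to a list of (timestamp, user, message)
    tuples. This layout is an assumption made for illustration only.
    """
    hits = [
        (ts, name, msg)
        for name, records in sources.items()
        for ts, user, msg in records
        if user == username
    ]
    return sorted(hits)  # tuples sort by timestamp first

# Hypothetical sample data standing in for centralized server logs.
sources = {
    "vpn": [(datetime(2024, 1, 15, 13, 0), "jdoe", "VPN session opened")],
    "mail": [
        (datetime(2024, 1, 15, 12, 5), "jdoe", "IMAP login"),
        (datetime(2024, 1, 15, 12, 7), "asmith", "IMAP login"),
    ],
    "wifi": [(datetime(2024, 1, 15, 11, 58), "jdoe", "Associated with AP")],
}

timeline = search_user(sources, "jdoe")
for ts, name, msg in timeline:
    print(ts.isoformat(), name, msg)
```

The result is a single timeline for one user drawn from every source, which is the starting point for most incident response work.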
That kind of information, combined with machine monitoring of a network, can be used for rapidly identifying security threats. For example, if a student logs in to a campus server from the library at noon and from China an hour later, that would set off red flags in the system.
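A check like that is often called "impossible travel" detection, and a rough Python sketch of it follows. Nothing here comes from UConn's or Splunk's actual configuration. The `Login` record, the 900 km/h threshold, the coordinates and the dates are all assumptions chosen for illustration, and the sketch presumes each login has already been geolocated.

```python
from dataclasses import dataclass
from datetime import datetime
from math import asin, cos, radians, sin, sqrt

MAX_PLAUSIBLE_KMH = 900.0  # assumed threshold, roughly airliner cruising speed

@dataclass
class Login:
    user: str
    when: datetime
    lat: float  # latitude in degrees, assumed to come from IP geolocation
    lon: float  # longitude in degrees

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, in kilometers."""
    lat1, lon1, lat2, lon2 = map(radians, (lat1, lon1, lat2, lon2))
    a = (sin((lat2 - lat1) / 2) ** 2
         + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * asin(sqrt(a))

def impossible_travel(logins, max_kmh=MAX_PLAUSIBLE_KMH):
    """Return (previous, current) login pairs no traveler could have made."""
    flagged = []
    last_seen = {}  # most recent login per user
    for cur in sorted(logins, key=lambda entry: entry.when):
        prev = last_seen.get(cur.user)
        if prev is not None:
            hours = (cur.when - prev.when).total_seconds() / 3600
            km = haversine_km(prev.lat, prev.lon, cur.lat, cur.lon)
            # Comparing km to max_kmh * hours avoids dividing by zero
            # when two logins carry the same timestamp.
            if km > max_kmh * hours:
                flagged.append((prev, cur))
        last_seen[cur.user] = cur
    return flagged

# Hypothetical logins: approximate coordinates for Storrs, Conn., and Beijing.
logins = [
    Login("student1", datetime(2024, 1, 15, 12, 0), 41.81, -72.25),
    Login("student1", datetime(2024, 1, 15, 13, 0), 39.90, 116.40),
    Login("student2", datetime(2024, 1, 15, 12, 0), 41.81, -72.25),
    Login("student2", datetime(2024, 1, 15, 13, 0), 41.81, -72.25),
]

alerts = impossible_travel(logins)
for prev, cur in alerts:
    print(f"Red flag for {cur.user}: {prev.when} then {cur.when}")
```

A production system would also have to account for VPNs, proxies and imprecise geolocation, all of which can make a legitimate login look like it came from far away.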
While lauding what Splunk has done for information security at UConn, Pufahl acknowledged that the system can be complex to use. “They could make the product simpler,” he said. “They should simplify how data is pulled out of it for complex queries.”
As with any burgeoning technology, there’s a lot of hype surrounding big data in the market right now, contended Paul Stamp, product marketing director for RSA, an EMC company based in Bedford, Mass. There is a gap between what security companies can do and what they’re actually doing. “When you look at use cases right now, they’re generally pretty simple,” he said. “They’re about eliminating a lot of tedious manual tasks that are associated with investigating or detecting an incident.”
They’re also about displaying information in a way that augments people’s knowledge, he noted, rather than relying on the system to do everything for them. “It’s about allowing humans to draw conclusions from the data more easily,” he said.
Simplicity isn’t necessarily a bad thing. “The companies that we find that are having the most success are those that are starting simple and building out the infrastructure that’s going to grow with them, and deploying more sophisticated techniques at a later date,” he observed.
“Anyone who approaches this as a common data science project, anyone who launches immediately into sophisticated statistical analysis on the data that they’re capturing is bound for failure,” he added.
John Mello is a freelance writer specializing in business and technology subjects, including consumer electronics, business computing and cyber security. Follow him on Twitter @jpmello.