Advanced persistent threats (APTs) have become a nightmare for organizations ranging from media titans like The New York Times and Washington Post to energy companies. Because the malware is built for stealth, conventional security tools have a devil of a time combating it. Software and hardware systems that are often described as big data solutions can be used effectively to battle APTs, but not without some challenges.
The creators of APTs use a number of techniques to keep them off the radar of system defenders. The primary purpose of those techniques is to blend the malware’s activity into the network landscape of an organization. All that activity leaves spoor on a system, but common security solutions don’t have the capability to detect it. Approaches that come under the big data heading can analyze data from many sources and correlate relationships among thousands of nodes in an enterprise network, identifying potential threats and scoring the risk of each one.
That’s good news, but as with most issues in information security management, the note comes with a caution. For example, when enterprises adopt big data analytics technologies, they need to make sure their network architectures can support the demands of the new systems. They must assess storage needs for working with huge datasets and manage the associated costs. And they need to have staff with the skills to manage newer technologies such as Hadoop.
Advanced Threat Detection
“When a threat lays dormant inside a network until a specific time, it’s very hard to detect,” said Brian Christian, CTO and founder, Zettaset, a maker of Hadoop cluster management software. “Because they happen infrequently and once in a while, they can be very deadly.”
“This is what big data is perfect for,” he continued. “We can analyze every packet on a network for a year, find out what normal traffic looks like and then plot all the outliers to that normal traffic. APTs will show up in those outliers.”
For example, a big data analysis of your systems might identify all the files on all your computers. Then it might reveal that a certain number of files appear on only 0.001 percent of your systems. Those files are executables. The executables are making requests to a particular server. Only four systems on the network have ever made a request to that server via a certain port.
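The prevalence check described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the inventory format, the host and file names, and the 1 percent cutoff are all hypothetical, and a real deployment would run the same counting over a distributed data store rather than an in-memory dictionary.

```python
from collections import defaultdict

def rare_files(inventory, max_fraction):
    """Return files found on at most max_fraction of all hosts.

    inventory maps a host name to the set of file identifiers seen on it
    (a hypothetical format chosen for this sketch).
    """
    hosts_by_file = defaultdict(set)
    for host, files in inventory.items():
        for file_id in files:
            hosts_by_file[file_id].add(host)

    total_hosts = len(inventory)
    return {
        file_id: sorted(hosts)
        for file_id, hosts in hosts_by_file.items()
        if len(hosts) / total_hosts <= max_fraction
    }

# Hypothetical inventory: 1,000 hosts share one common file,
# and two of them also hold an unusual executable.
inventory = {f"host{i}": {"common.dll"} for i in range(1000)}
inventory["host7"].add("odd.exe")
inventory["host42"].add("odd.exe")

suspects = rare_files(inventory, max_fraction=0.01)
print(suspects)
```

The rare files are only candidates for a closer look. As the example in the text suggests, the next step would be to check what those executables connect to.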
“Once you understand the baselines for your organization, you can see the anomalies and over time, you can build a big enough dataset to immediately see when something is not right within the organization,” explained Adam Ely, a former CISO and co-founder of Bluebox, a mobile security provider.
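As a rough illustration of the baseline-and-outlier idea, the sketch below flags observations that sit far from the historical average. The metric (outbound megabytes per day), the host names and the threshold of three standard deviations are assumptions made for this example, not anything the sources prescribe.

```python
from statistics import mean, stdev

def find_outliers(baseline, observations, threshold=3.0):
    """Return observations more than `threshold` standard deviations
    away from the mean of the baseline samples."""
    mu = mean(baseline)
    sigma = stdev(baseline)
    if sigma == 0:
        # A perfectly flat baseline: anything different is an outlier.
        return {name: v for name, v in observations.items() if v != mu}
    return {
        name: v
        for name, v in observations.items()
        if abs(v - mu) / sigma > threshold
    }

# Hypothetical baseline: outbound megabytes per day for a typical host.
baseline = [100, 104, 98, 101, 97, 103, 99, 102]

# Hypothetical readings for today, keyed by host name.
today = {"hostA": 101, "hostB": 96, "hostC": 450}

outliers = find_outliers(baseline, today)
print(outliers)
```

A simple z-score like this is sensitive to extreme values in the baseline itself, so a production system would likely use more robust statistics and far more than eight samples.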
Not only can big data identify malicious activity that may escape the notice of conventional anti-malware tools, but it can identify malicious inactivity. “An advanced persistent threat won’t present itself to you in the traditional sense until it’s weaponized its payload,” said Peter Tran, senior director for advanced cyber defense at RSA, the security division of EMC.
“You could be infected, but not weaponized,” Tran continued, “so that you won’t know about the APT until it’s activated itself. That’s a bad scenario to have.”
That’s not to say that all is lost after an APT has armed itself. What’s important for organizations to realize after discovering a machine on their network infected with an APT is that it’s probably just the tip of the iceberg. “A truly advanced actor probably has their tentacles all through the organization,” said Jeff Lunglhofer, financial services cyber lead with Booz Allen Hamilton.
“The only way to figure out the scope of an infection is to conduct a link analysis of all of the systems that host has touched within the enterprise,” Lunglhofer said. “You can imagine how difficult that can be in an enterprise of potentially hundreds of thousands of nodes.”
Log and net flow data going back months may need to be analyzed. “It’s an absolutely staggering amount of data that you have to have but without it, understanding an APT attack is an exponentially more difficult task,” Lunglhofer said.
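One simple way to picture the link analysis Lunglhofer describes is as a graph walk over connection records. The sketch below assumes flow logs have already been reduced to hypothetical (source, destination) pairs and uses a breadth-first search to list every system reachable from a known-infected host.

```python
from collections import defaultdict, deque

def reachable_hosts(flows, start):
    """Return every host reachable from `start` by following
    (source, destination) connection records."""
    graph = defaultdict(set)
    for src, dst in flows:
        graph[src].add(dst)

    seen = {start}
    queue = deque([start])
    while queue:
        host = queue.popleft()
        for peer in graph.get(host, ()):
            if peer not in seen:
                seen.add(peer)
                queue.append(peer)

    seen.discard(start)  # report only the other systems touched
    return seen

# Hypothetical connection records extracted from months of logs.
flows = [
    ("ws1", "fs1"),
    ("fs1", "db1"),
    ("ws2", "fs1"),
    ("db1", "backup1"),
    ("ws3", "mail1"),
]

exposed = reachable_hosts(flows, "ws1")
print(sorted(exposed))
```

This sketch ignores timestamps, ports and protocols, all of which a real investigation would need in order to rule out connections that predate the infection. At the scale of hundreds of thousands of nodes, the same traversal would have to run on a distributed platform.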
System Management Challenges
Outfitting an organization to use big data techniques to fight APTs, though, can be challenging for its storage resources and network architecture. One of the current misconceptions about storage is that it’s cheap. While that’s true relatively speaking, since the cost per megabyte of storage continues to decline, the amount of data organizations are storing is skyrocketing. According to IDC, data storage volumes will grow to 50 times current levels by 2020.
“Once organizations began storing more data, they began to realize that storage isn’t cheap on the scale of big data,” Bluebox’s Ely said.
“Storage was cheap when all I wanted to do was store some logs, email and user files,” he added, “but now I want to collect everything that comes out of everything I have in my enterprise.”
Assessing storage needs is one of the biggest challenges facing organizations with a yen to protect their information resources with a big data solution, said David Wells, a data security consultant with Axis Technology. “People don’t have clear requirements about what they need to gather,” he said.
“There’s this idea that we should gather everything and then figure it out later,” Wells said. “If you do that, you potentially wind up having a bigger headache later.” That’s because later is when it becomes clear that there’s not enough storage capacity.
Zettaset’s Christian has seen some woeful miscalculations about storage by some organizations. “We saw one customer who wanted to store 12 months of data, but they couldn’t store 12 hours of data,” he said. “It was poorly thought out from the beginning.”
Big data approaches, though, can play a role in controlling a company’s storage demands by determining what data is really important for a business and discarding the rest. For example, an organization may receive information on millions of malicious domains. That data can be reduced by applying criteria such as the domain’s age, its activity, its threat to the organization’s industry and other factors. “Through data enrichment, you can cull your data,” RSA’s Tran explained. “You don’t have to keep it all, just what’s contextually relevant and timely to your security.”
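A minimal sketch of that kind of culling might look like the following. The record fields, the 30-day activity window and the choice to list the newest domains first are all assumptions made for illustration; the criteria an organization actually applies would depend on its own threat intelligence.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class DomainRecord:
    # Hypothetical fields for one entry in a malicious-domain feed.
    name: str
    age_days: int            # days since the domain was registered
    days_since_active: int   # days since malicious activity was last seen
    industries: frozenset    # industries the domain is known to target

def cull(records, our_industry, max_idle_days=30):
    """Keep only recently active domains that target our industry,
    listing the most recently registered ones first."""
    kept = [
        r for r in records
        if r.days_since_active <= max_idle_days
        and our_industry in r.industries
    ]
    return sorted(kept, key=lambda r: r.age_days)

feed = [
    DomainRecord("a.example", 5, 2, frozenset({"energy"})),
    DomainRecord("b.example", 900, 400, frozenset({"energy"})),
    DomainRecord("c.example", 30, 1, frozenset({"retail"})),
    DomainRecord("d.example", 2, 10, frozenset({"energy", "media"})),
]

relevant = cull(feed, our_industry="energy")
print([r.name for r in relevant])
```

Here two of the four hypothetical records survive, which is the point of the exercise: storage is spent only on the entries that are timely and relevant to the organization.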
Big data security can also present a challenge to a company’s data management architecture. If a shop is used to digesting data with SQL, it will take time adjusting to Java in a Hadoop environment. In addition, any data management system designed to accommodate the needs of big data must not lose sight of its own security. “Security teams need to make sure that a big data rollout occurs in a way that the big data is protected,” observed Roopak Patel, group product manager for HP ArcSight.
“They don’t want to get caught with their pants down,” Patel said. “What’s critical when modifying an existing infrastructure for Big Data deployment is that it not create new vulnerabilities or backdoors into the network.”
As powerful as big data can be in combating threats to an organization’s data, no one weapon by itself can eradicate all the cyber threats facing organizations today. “It’s not a silver bullet,” said Matt Standart, director of threat intelligence at HBGary. “It will be another tool in the arsenal that we use to defend our networks.”
John P. Mello Jr. is a freelance writer specializing in business and technology subjects, including consumer electronics, business computing and cyber security. Follow him on Twitter @jpmello.