As Internet-based warfare heats up and computer networks become a more critical component of conventional military operations, the Pentagon is investing in improving cybersecurity analytics overall and visual analytics in particular.
For security analysts, visual analytics tools represent a way to monitor and sift through massive amounts of network security data from many different sources, identify threats, quickly plan responses and tweak security settings.
The U.S. Air Force turned to VisiTrend, a Boston-based software development and consulting firm founded by computer scientists with backgrounds in human-computer interaction and machine learning, to create a framework for interactive visual analytics.
Since its inception, the Air Force has sponsored pioneering work in human factors and ergonomics, from flight instrument and cockpit design to pressurized flight suits, but according to John McIntire, the effort to apply principles of perception science to visual analytics for cybersecurity was something new. McIntire is an engineering research psychologist in the Battlespace Visualization Branch of the 711th Human Performance Wing at Wright-Patterson Air Force Base in Ohio.
The developers consulted information security analysts and reviewed cognitive task assessments breaking down the thought processes and knowledge involved in cybersecurity analysis. Drawing on the current understanding of vision and perception, they designed their tool to efficiently deliver the necessary information in a visual display. This includes network topology, potential security vulnerabilities, data flows, firewall activity, and intrusion prevention system alerts.
The project, Visualization for Command and Control of Cyberspace Operations, began in 2008. VisiTrend delivered a desktop prototype in May 2012. The resulting application, Visualization tool for Integrated Cyber Command and Control (VIC3), allows users to tap into cybersecurity applications to monitor networks, scan logs, live data streams and other security data, and conduct sophisticated analysis. By manipulating elements of visualizations, users can identify potential network incursions and project their impact, trace information flows, correlate disparate events, weigh the effectiveness of possible responses and see how cyber operations and kinetic (real-world) operations are related.
“This triggers analytics such as machine learning, but the user doesn’t need to know that,” explains John Langton, president and principal scientist at VisiTrend. “They provide inputs by selecting items in the visualization, which transparently guides the analysis algorithms to the answers they are seeking.”
Graphical Representations of Database Queries
The application uses abstract information layers to graphically represent database queries, much like online map applications represent geospatial information, Langton says. Each layer can draw from multiple databases and assigns a visual characteristic to the results.
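The article does not describe VIC3's internals, but the layer idea can be sketched in a few lines of Python. Everything here is an assumption made for illustration: the `Layer` class, its `query` and `visual` fields, and the sample records are hypothetical and are not VisiTrend's API. The sketch shows one layer running a single query across several data sources and tagging every result with one visual characteristic.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List

Record = Dict[str, object]


@dataclass
class Layer:
    """Hypothetical layer: one query plus one visual characteristic."""
    name: str
    # Stands in for a database query; here it just filters in-memory rows.
    query: Callable[[Iterable[Record]], List[Record]]
    # The visual characteristic assigned to every result, e.g. {"color": "red"}.
    visual: Dict[str, str]

    def render(self, sources: Iterable[Iterable[Record]]) -> List[Record]:
        """Run the query over every source and tag each hit with the visual."""
        marks: List[Record] = []
        for source in sources:
            for row in self.query(source):
                marks.append({**row, **self.visual, "layer": self.name})
        return marks


# Illustrative data only; not taken from any real log.
firewall_log = [
    {"host": "10.0.0.5", "event": "deny"},
    {"host": "10.0.0.7", "event": "allow"},
]
ips_alerts = [{"host": "10.0.0.5", "event": "alert"}]

denied = Layer(
    name="denied",
    query=lambda rows: [r for r in rows if r["event"] == "deny"],
    visual={"color": "red"},
)

marks = denied.render([firewall_log, ips_alerts])
```

Each mark carries both the underlying record and the layer's visual attribute, which is roughly how a map application attaches a color or icon to the results of a geospatial query.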
Despite the availability of graphical tools, cybersecurity analysts still spend much of their time reading log files and alerts, combing through lines of text and tables in search of connections, patterns and anomalies. “Analysts and network operators don’t have much in the way of complex analytical tools, certainly not for visual exploration of their data,” notes McIntire. One reason for this text-heavy approach is that many visual interfaces fail to address the way people process visual information.
While visual analytics is both art and science, some experts are advocating greater attention to the science of perception. The goal is to optimize the first steps in the sequence of sensation, perception, cognition, and action/interaction, which informs the study of human-computer interaction.
Presenting information visually has several advantages, including speed and the ability to convey complex information and relationships. “Your visual sense is extremely high fidelity, but not if you are trying to read, when you’re converting the information to your verbal channel and speaking it inside your head,” explains McIntire. “Text is a strange way to go about looking at really complex, dense data.”
But leveraging the visual channel requires an understanding of human perception. Tools that represent large datasets in 3-D can be hard to read, and compressing data that has many variables into parallel coordinates on a two-dimensional graph can cause visual clutter. The wrong visual metaphor can obscure data, as can poor color choice and counterintuitive graphics. For example, a rainbow color gradient is a poor choice for showing a range of values, since people don’t perceive rainbows as continuous scales.
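One common alternative to a rainbow gradient is a single-hue ramp in which only lightness changes with the value. The Python sketch below is a hypothetical illustration, not something described for VIC3: the function name `sequential_color`, the default hue and the lightness range are all assumptions. It uses the standard library's `colorsys` module, and HLS lightness is only a rough stand-in for perceived brightness.

```python
import colorsys


def sequential_color(value: float, lo: float, hi: float, hue: float = 0.6) -> str:
    """Map a value to a hex color on a single-hue, light-to-dark ramp.

    Assumes hi > lo. Values outside [lo, hi] are clamped.
    """
    t = (value - lo) / (hi - lo)
    t = max(0.0, min(1.0, t))
    # Larger values get darker: lightness runs from 0.9 down to 0.3.
    lightness = 0.9 - 0.6 * t
    r, g, b = colorsys.hls_to_rgb(hue, lightness, 0.8)
    return "#{:02x}{:02x}{:02x}".format(round(r * 255), round(g * 255), round(b * 255))


# Illustrative use: color devices by alert count on a 0-100 scale.
quiet = sequential_color(5, 0, 100)
noisy = sequential_color(95, 0, 100)
```

Because the hue stays fixed, a viewer can order the colors from light to dark without consulting a legend, which is not the case for a rainbow scale.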
Vision science offers a guide to combining visual elements like spatial arrangement, size, color, texture, saturation, and brightness to get across more details. Conventional design choices often underuse this capacity. For example, pie charts are typically used to represent network activity, such as the number of alerts generated by device, but this doesn’t give contextual information like how machines are connected within the network.
While some approaches to harnessing the mechanics of perception are grounded in physiology and cognitive science, many draw on practical experience, such as those described by design guru Edward Tufte, author of the influential The Visual Display of Quantitative Information.
The VIC3 framework builds on the visual information theories of computer scientist Ben Shneiderman. In a 1996 paper for the IEEE proceedings, he described a visual information-seeking mantra for designing graphical user interfaces: “overview first, zoom and filter, then details on demand,” which, though popular, hasn’t been rigorously tested and validated, according to Langton.
VIC3 follows the mantra in its linking of data views such that when a user selects an item in one, related items in other views are also selected. “This helps users identify the relationships between items in different data sets, and also different perspectives of the same items,” says Langton.
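The linking of views can be pictured as a single selection shared by every view. The following Python sketch is a minimal illustration under stated assumptions: the `LinkedViews` class, its method names, and the choice of an `ip` field as the shared key are hypothetical and do not come from VIC3.

```python
from typing import Dict, List, Set


class LinkedViews:
    """Hypothetical selection model shared by several data views."""

    def __init__(self, views: Dict[str, List[dict]], key: str) -> None:
        self.views = views          # view name -> items shown in that view
        self.key = key              # field that identifies "the same" item
        self.selected: Set[object] = set()

    def select(self, view: str, index: int) -> None:
        """Select one item in one view; the selection is stored by key."""
        self.selected = {self.views[view][index][self.key]}

    def highlighted(self, view: str) -> List[dict]:
        """Return the items in a view that match the current selection."""
        return [item for item in self.views[view] if item[self.key] in self.selected]


# Illustrative data only.
views = {
    "topology": [
        {"ip": "10.0.0.5", "subnet": "10.0.0.0/24"},
        {"ip": "10.0.1.9", "subnet": "10.0.1.0/24"},
    ],
    "alerts": [
        {"ip": "10.0.0.5", "signature": "port scan"},
        {"ip": "10.0.1.9", "signature": "failed login"},
        {"ip": "10.0.0.5", "signature": "unusual HTTPS volume"},
    ],
}

linked = LinkedViews(views, key="ip")
linked.select("topology", 0)             # user clicks a host in the topology view
related = linked.highlighted("alerts")   # the alerts view highlights matching rows
```

Selecting one host in the topology view highlights both of its alerts in the other view, which is the kind of cross-dataset relationship Langton describes.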
In the VIC3 visualization tool, a layer can contain scripts for generating simple queries or advanced analytics, which are then presented in a user-defined visual palette. So that users don’t inadvertently short-circuit the visual connection, the framework’s front-end includes a help application to guide a user’s mapping of data to visual characteristics.
This visual mapping taps into users’ perceptual wiring to enable them to see patterns in the data, like unusual volumes of Hypertext Transfer Protocol Secure (HTTPS) traffic, which might indicate an HTTPS exploit, or the scattered footprints of a “low and slow” attack. Overlaying layers allows them to view data from several sources, for example firewalls, routers, intrusion prevention systems, or operating system logs, in various combinations. Since people detect the intersection of these overlapping layers very quickly (“pre-attentively,” in vision science terms), toggling them off and on can reveal how they relate, says Langton. This allows users to tease out critical information from massive amounts of data.
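In data terms, the intersection that a viewer spots when layers overlap is the set of items present in every visible layer. The Python sketch below makes that concrete. It is an assumption-laden illustration: the `overlay` function, the layer names and the host addresses are hypothetical, and the article does not say how VIC3 computes or draws overlaps.

```python
from typing import Dict, Iterable, Set


def overlay(layers: Dict[str, Iterable[str]], visible: Set[str]) -> Set[str]:
    """Return the hosts that appear in every currently visible layer."""
    sets = [set(hosts) for name, hosts in layers.items() if name in visible]
    return set.intersection(*sets) if sets else set()


# Illustrative data only: hosts flagged by each source.
layers = {
    "firewall": ["10.0.0.5", "10.0.0.7", "10.0.1.9"],
    "ips": ["10.0.0.5", "10.0.1.9"],
    "os_logs": ["10.0.1.9", "10.0.2.3"],
}

two_sources = overlay(layers, {"firewall", "ips"})
all_sources = overlay(layers, {"firewall", "ips", "os_logs"})
```

Toggling the `os_logs` layer on narrows the overlap from two hosts to one, the equivalent of switching a layer on in the display and watching which marks remain covered by all of them.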
Plans for Additional Use Cases
VIC3 is currently a proof-of-concept. VisiTrend plans to release NDVis, a commercial Web application based on its work for the Air Force, this summer. The NDVis front-end is primarily HTML5. The back-end component is built on the Spring open-source enterprise Java framework. The developers plan to add online analytical processing (OLAP) capabilities and better collaboration features. In its first incarnation, the software-as-a-service offering will be tailored for cybersecurity. Future releases will address other disciplines.
The developers limited the interface to three primary visualization frames, since people can only effectively track that many, Langton notes. They avoided 3-D graphics, since people have some difficulty reading information from the depth field, he adds.
They chose to represent network topology using tree maps, since the first thing people register about a graphic is its spatial arrangement and the hierarchical nature of tree maps reflects that of IP addressing. Also, the size of regions in tree maps corresponds to the size of subnets, Langton explains.
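The correspondence between IP addressing and tree maps can be illustrated with Python's standard `ipaddress` module. This is a sketch under stated assumptions, not VIC3's layout code: the function names `subnet_tree` and `area_fractions`, the choice of /16 parents and the sample subnets are hypothetical. It groups subnets under their parent network and gives each one a share of the area proportional to its address count.

```python
import ipaddress
from collections import defaultdict
from typing import Dict, List, Tuple


def subnet_tree(subnets: List[str], parent_prefix: int = 16) -> Dict[str, List[Tuple[str, int]]]:
    """Group subnets under their parent network, recording each subnet's size.

    Assumes every subnet's prefix is at least as long as parent_prefix.
    """
    tree: Dict[str, List[Tuple[str, int]]] = defaultdict(list)
    for cidr in subnets:
        net = ipaddress.ip_network(cidr)
        parent = net.supernet(new_prefix=parent_prefix)
        tree[str(parent)].append((str(net), net.num_addresses))
    return dict(tree)


def area_fractions(tree: Dict[str, List[Tuple[str, int]]]) -> Dict[str, float]:
    """Give each subnet a share of the tree map proportional to its size."""
    total = sum(size for children in tree.values() for _, size in children)
    return {name: size / total for children in tree.values() for name, size in children}


# Illustrative subnets only.
tree = subnet_tree(["10.1.0.0/24", "10.1.1.0/25", "10.2.0.0/24"])
fractions = area_fractions(tree)
```

A real tree map would go on to turn these fractions into nested rectangles. The point of the sketch is only that the hierarchy and the region sizes fall directly out of the addressing scheme.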
“A visualization tool that can show you big chunks of data in a single view, but also allow you to drill down when needed, will allow your perceptual system to recognize patterns that would otherwise go undetected in the raw data,” McIntire says.
Ted Smalley Bowen is a freelance writer and editor based in the Boston area. Reach him via email at firstname.lastname@example.org.
Correction, May 31, 2013: This story has been updated from the original version. VisiTrend plans to release NDVis this summer as a commercial product, not an open source project as first reported. In addition, the Web application’s front-end uses HTML5, not Java.