Creating a data visualization that reveals interesting relationships within data isn’t easy. Even if the data is clean and of high quality, the visualizer must mine it for correlations and find the best way to represent them to the audience.
Several enterprise-level tools, like Tableau or QlikView, specialize in dashboards and reporting but are also capable of more advanced work. Those tools, however, carry expensive enterprise licenses.
Lynn Cherny, a data consultant based in Massachusetts, said she often does her more advanced data exploration and visualization with open source tools such as NodeBox for Python, the Java-based program Processing, or the visualization library d3.js. Cherny is giving a presentation on NodeBox at PyData 2013 in Santa Clara on March 19.
Cherny said open source tools like NodeBox still have some maturing to do, but because they’re community-driven, advances can happen more quickly than with enterprise tools. She said she prefers NodeBox to Processing because she strongly dislikes coding in Java and finds Python much easier.
But, she said, the ability to code is still an important skill for exploring data visually, and lacking it can be a barrier to creating robust visualizations.
In this interview with Data Informed staff writer Ian B. Murphy, Cherny discusses the gap between enterprise and open source data visualization tools, the growing community for Python as a data processing and visualization tool, and the process of creating a good data visualization from start to finish. (Podcast running time: 19:22.)
Related articles on Data Informed: