Why does it seem I have to look hard to find good data visualization examples? Why do so few tech companies devote resources to visualization (Google’s the obvious exception)? Why are there relatively few job postings for visualization, with many of those that do exist requiring mainly graphic design skills rather than data visualization skills? I was thinking about this today and came up with a few possible reasons, some based on perceptions and others based on marketplace realities.
Reason #1: People Don’t Know What Data Visualization Is
People don’t know what data visualization is. Don’t believe me? Read the Amazon.com reviews for the book Visualizing Data by Ben Fry. They contain negative comments such as “One would expect a book with the title ‘Visualizing Data’ to be crammed with pictures”. The issue seems to be that too much of the book is devoted to data and the mapping of data properties to visual properties.
Graphic design is different from data visualization. Graphic designers are largely free from having to deal with actual data, and from having their product emerge from data. Graphic design components and data visualization components are often mixed, and with great success. But they are different. Art is not visualization. And visualization is not art…unless it is.
The above visualization (which is, in fact, by Ben Fry) is driven by the properties of two underlying datasets. One dataset is the DNA of a monkey. The genes (the data) are represented as very tiny white text. The second dataset is human DNA. It is only depicted after the difference between the two datasets has been computed: the genes that differ between the monkey and the human are represented in red. Fry obviously didn’t choose which areas of the visualization would be red; the data did. What about the monkey pic? Even that is a visual representation of a property of the dataset…the type of the DNA dataset shown in white text.
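To make "the data chose the red" concrete, here is a minimal Python sketch of that kind of mapping from a data property to a visual property. It is not Fry's code. The function name `color_by_difference`, the two short sequences, and the assumption that both sequences are already aligned to the same length are all made up for illustration.

```python
def color_by_difference(reference, other, same="white", different="red"):
    """Return one color per position in `reference`.

    A position gets `different` when the two sequences disagree there,
    and `same` otherwise. Assumes (hypothetically) that the sequences
    have already been aligned to equal length.
    """
    if len(reference) != len(other):
        raise ValueError("sequences must be aligned to the same length")
    return [same if a == b else different for a, b in zip(reference, other)]


# Made-up stand-ins for the two datasets; real genomes are vastly longer.
monkey = "GATTACAGGT"
human = "GATTGCAGCT"

colors = color_by_difference(monkey, human)

# The "view": each base of the reference sequence, with the color the data chose.
for base, color in zip(monkey, colors):
    print(base, color)
```

Nothing in the drawing step decides where the red goes. Swap in different sequences and the red moves with them, which is the distinction from graphic design made above.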
Reason #2: Crappy Existing Visualizations have Polluted Perception
The visualization on the left is the interface for the search engine Kartoo. The visualization on the right is a feature CNET used to have called The Big Picture. Both attempt to visualize data usually shown as lists (search results, related news articles) as 2D networks. It’s a nice idea, as pairwise relationship properties can be visually represented as edges. But these particular efforts both miss the boat. They don’t convey much more information than a list does, while greatly increasing the mental load placed on the user trying to extract the basic information.
Reason #3: People are Unable to Mentally Separate the View from the Data
Here’s another Ben Fry work (I was watching a video/talk of his earlier today, which is part of the reason he is so prevalent in this post). It shows six different visualizations of the same dataset.
Many times data relates to physical objects. In such cases people may have trouble with any visual representation of the data that doesn’t include those physical objects. In other cases the data has simply always been depicted in a certain way, and that habit interferes with any new depiction.
Reason #4: Visualization is Difficult to Create and Easy to Copy
This is somewhat irrelevant, but I have had a Yahoo mail account for about a decade. There was a good six-year stretch where it never changed. If Gmail hadn’t come along, who knows whether it ever would have.
When Google released Google Finance, it marked a number of firsts…the use of AJAX for stock charts (the chart itself is actually Flash), the overlay of events on the chart, and the dual time sliders. No doubt Google spent much time and effort designing this visualization tool. How long did it take Yahoo Finance to copy Google Finance’s chart once Google revealed it? Not long. Good visualization design is hard. It’s even harder when its object is to deconstruct very complex data. Reverse engineering a visualization is easy.
Reason #5: People Won’t Pay for Visualization?
I’m not so sure about this one, but our company’s CTO recently commented to me that he couldn’t think of any successful standalone visualization effort other than Processing.
Applications such as Google Maps don’t count, both because they’re free and, more importantly, because people wouldn’t have access to the underlying data without the visualization. I can think of a few commercially successful standalone visualizations such as this one, but surely the list is fairly short.