This is a quick look at the state of the art in network visualization for systems biology. It’s an interesting topic on its own (and my day job at the moment), and also in how it relates to the visualization of other types of networks, such as social networks (think Facebook). Systems biology is all about looking at proteins, pathogens, and more, within the contexts in which they interact. Naturally, then, the visualizations that tend to be particularly useful are those, such as network visualizations, that give a macro-level understanding of the interactions. They help with questions of the form “if a drug affects protein X, what else will it affect?”
Quite a bit of interesting complexity is present in these interaction networks (the data). They are often small-world, disassortative (unlike social networks), scale-free, and modular. Biologists are usually interested either in larger-scale, cell-level networks or in meaningful sub-networks called pathways, which typically contain 50 to 500 nodes.
Making life interesting, duplicate nodes representing different states are often included. The edges are directed, and may be hyperedges when multiple nodes necessarily interact together. And, in truth, the edges are often approximations of the actual interactions in the underlying network. These approximations come from experimental findings published in journals.
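To make the “what else will it affect?” question concrete, here is a minimal sketch of a downstream query over directed interactions, some of which are hyperedges. The node names, the toy data, and the `downstream` helper are all made up for illustration; real pathway data would come from a curated database and need a more careful model.

```python
from collections import deque

# Hypothetical toy data. Each interaction is (inputs, outputs); it is a
# hyperedge when more than one input must interact together.
# "Y-P" stands for a phosphorylated state of Y, i.e. a duplicate node
# representing a different state.
INTERACTIONS = [
    ({"Drug", "X"}, {"X-inhibited"}),
    ({"X"}, {"Y-P"}),
    ({"Y-P", "Z"}, {"W"}),
    ({"W"}, {"V"}),
]

def downstream(start, interactions):
    """Return every node reachable from `start` along directed interactions.

    This deliberately over-approximates: an interaction is followed as soon
    as any one of its inputs is reached, even if the others are not.
    """
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for inputs, outputs in interactions:
            if node in inputs:
                for out in outputs - seen:
                    seen.add(out)
                    queue.append(out)
    seen.discard(start)
    return seen

affected = downstream("X", INTERACTIONS)
```

Because the edges are themselves approximations, a result like this is best read as a list of candidates to investigate rather than a prediction.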
A First Look
This image is part of Roche Applied Science’s “Biochemical Pathways” series of wall charts. The charts are in the style of circuit diagrams, which seems to be the most common 2-D representation of metabolic pathways. This set seems to have been particularly influential. The appeal of this ‘map’ is likely its scale. Viewers can spend a great deal of time exploring. In visualization there is a notion of ‘information density’: the more visual attributes used to convey the data, the more information the visualization can hold. This image has a very high information density.
In general (not just in systems bio), network/graph layout (choosing where to place the nodes and edges) is done with consideration for (A) the network’s topology and (B) aesthetics. The primary topology concern is to place connected node pairs near one another and unconnected pairs apart. The primary aesthetic concerns are to ensure that nodes do not overlap, edges do not cross, and labels are readable.
However, nodes in systems biology often also have biologically significant locations associated with them (e.g., within a cell, or within the nucleus of a cell). The most common way of handling this location information is to lay out the network in the standard manner, but constrain each node to a compartment/level designated as extracellular, membrane, cytoplasm, nucleus, etc. This visualization, created with the Cerebral plugin for Cytoscape, is the best example of this that I know of.
Most of the network visualization tools for systems biology create very abstract images. However, in high-quality publications, such as the journal Nature, the abstract images are often hand-rendered to include more realistic imagery. Something I would like to do more of is look at actual microscope images and behavioral models to try to usefully bridge the gap.
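The topology concern can be sketched with a simple force-directed layout, in the spirit of Fruchterman and Reingold: every pair of nodes repels, and connected pairs also attract. This is only an illustration, not the algorithm any particular tool uses; the function name, parameters, and toy graph are assumptions of mine.

```python
import math
import random

def spring_layout(nodes, edges, iterations=300, k=1.0, seed=0):
    """Place nodes in 2-D; `nodes` is a list, `edges` a list of pairs."""
    rng = random.Random(seed)
    pos = {n: [rng.random(), rng.random()] for n in nodes}
    for step in range(iterations):
        # Cap each move by a "temperature" that cools linearly to zero.
        temperature = 0.1 * (1 - step / iterations)
        disp = {n: [0.0, 0.0] for n in nodes}
        # Repulsion between every pair pushes unconnected nodes apart.
        for i, u in enumerate(nodes):
            for v in nodes[i + 1:]:
                dx = pos[u][0] - pos[v][0]
                dy = pos[u][1] - pos[v][1]
                d = math.hypot(dx, dy) or 1e-9
                force = k * k / d
                disp[u][0] += dx / d * force
                disp[u][1] += dy / d * force
                disp[v][0] -= dx / d * force
                disp[v][1] -= dy / d * force
        # Attraction along edges pulls connected nodes together.
        for u, v in edges:
            dx = pos[u][0] - pos[v][0]
            dy = pos[u][1] - pos[v][1]
            d = math.hypot(dx, dy) or 1e-9
            force = d * d / k
            disp[u][0] -= dx / d * force
            disp[u][1] -= dy / d * force
            disp[v][0] += dx / d * force
            disp[v][1] += dy / d * force
        for n in nodes:
            length = math.hypot(disp[n][0], disp[n][1]) or 1e-9
            scale = min(length, temperature) / length
            pos[n][0] += disp[n][0] * scale
            pos[n][1] += disp[n][1] * scale
    return pos

def mean_distance(pos, pairs):
    """Average Euclidean distance over the given node pairs."""
    return sum(math.dist(pos[u], pos[v]) for u, v in pairs) / len(pairs)

# Hypothetical toy graph: two triangles joined by a single bridge edge.
NODES = ["a", "b", "c", "d", "e", "f"]
EDGES = [("a", "b"), ("b", "c"), ("a", "c"),
         ("d", "e"), ("e", "f"), ("d", "f"), ("c", "d")]
NON_EDGES = [(u, v) for i, u in enumerate(NODES) for v in NODES[i + 1:]
             if (u, v) not in EDGES and (v, u) not in EDGES]

layout = spring_layout(NODES, EDGES)
```

Comparing `mean_distance(layout, EDGES)` with `mean_distance(layout, NON_EDGES)` gives a rough check that connected pairs ended up closer together than unconnected ones. Note that this sketch does nothing about the aesthetic concerns (crossings, label readability), which real layout tools handle separately.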
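One simple way to picture the compartment constraint is as a post-processing step: take positions from any layout, keep each node’s x coordinate, and squeeze its y coordinate into the horizontal band of its compartment. This is a rough sketch under my own assumptions (the function, its parameters, and the example nodes are hypothetical) and is not a description of how Cerebral works internally.

```python
# Bands are stacked top to bottom in this order (y grows downward,
# as in screen coordinates).
COMPARTMENTS = ["extracellular", "membrane", "cytoplasm", "nucleus"]

def constrain_to_compartments(pos, compartment_of, band_height=1.0, margin=0.1):
    """Return new positions with each node's y inside its compartment band.

    `pos` maps node -> (x, y) from an unconstrained layout;
    `compartment_of` maps node -> one of COMPARTMENTS.
    """
    ys = [y for _, y in pos.values()]
    y_min, y_max = min(ys), max(ys)
    span = (y_max - y_min) or 1.0  # avoid dividing by zero
    inner = band_height - 2 * margin
    constrained = {}
    for node, (x, y) in pos.items():
        band = COMPARTMENTS.index(compartment_of[node])
        relative = (y - y_min) / span  # 0..1, keeps the vertical ordering
        constrained[node] = (x, band * band_height + margin + relative * inner)
    return constrained

# Hypothetical example nodes and positions.
example_pos = {"receptor": (0.2, 5.0), "kinase": (0.5, -3.0), "tf": (0.9, 1.0)}
example_compartments = {"receptor": "membrane", "kinase": "cytoplasm", "tf": "nucleus"}
banded = constrain_to_compartments(example_pos, example_compartments)
```

A layout that applies the constraint during force computation, rather than afterwards, would generally give better results, at the cost of a more involved implementation.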
Visual Data Mining
There are many uses of these network visualizations for biologists and others. One is simply that they can leave a more lasting impression than simple lists. A major use case, though, is visual data mining, which may take many forms. Followers of Tufte know that contrasts are often the most valuable element of a visualization. This image is a straightforward example. More sophisticated visual data mining might include clustering and classification of those clusters.
Because the Roche wall charts beg to be explored, it is only natural that a tool would be created for doing so. G-Language is an open source shell that supports, among other things, pathway visualization plugins. The Genome Projector is a module for G-Language that uses the Google Maps API to allow exploration and annotation. No doubt, as systems biology network visualization tools reach later versions, more and more will support rich interaction and, perhaps, treat the visualization as a vehicle for collaboration.
Hierarchy and Metanodes
Earlier, I mentioned that the networks are often modular. The most obvious modules are organelles. But other modules exist, such as those defined by function. As the above examples show, incorporating the modularity information into the visualization is often done in a manner that makes it even more abstract.
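As a rough illustration of what a metanode is, the sketch below collapses every module into a single node and keeps only the edges that cross module boundaries. The function name, the module labels, and the toy edges are hypothetical; actual tools typically also let you expand a metanode again and keep track of what it contains.

```python
def collapse_to_metanodes(edges, module_of):
    """Replace each node by its module and drop edges inside a module.

    `edges` is an iterable of directed (source, target) pairs;
    `module_of` maps node -> module name. Nodes with no module stay as-is.
    """
    meta_edges = set()
    for source, target in edges:
        s = module_of.get(source, source)
        t = module_of.get(target, target)
        if s != t:  # edges within one module vanish into the metanode
            meta_edges.add((s, t))
    return meta_edges

# Hypothetical toy network: a and b belong to module M1, c and d to M2.
toy_edges = [("a", "b"), ("b", "c"), ("c", "d"), ("d", "e")]
toy_modules = {"a": "M1", "b": "M1", "c": "M2", "d": "M2"}
collapsed = collapse_to_metanodes(toy_edges, toy_modules)
```

The collapsed network is smaller and easier to lay out, which is exactly the trade-off mentioned above: readability is gained at the cost of a more abstract picture.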