The Figure Model

On the right is an example of a typical figure constructed in Figure Composer. The application "sees" this figure as a hierarchical tree of component graphic objects, or elements. Each element possesses a number of defining attributes, or properties, that govern the element's size, location, and appearance. In addition, an element may admit one or more other elements as child nodes. Structural rules may restrict the numbers and kinds of elements that can appear in the content of any given element class.

At the top of this tree is the figure object itself. The width and height of the figure, its location on the printed page, and the thickness of its border are all attributes defined on the figure element. Following the lead of the more general SVG technology, DataNav assigns a variety of style properties to the figure: stroke width, stroke color, text/fill color, font family, font size, and font style. These styles are inheritable: if an attribute is not explicitly defined for an element, then it will inherit the attribute value from its parent element. Thus, the author can control the "look" of the entire figure by setting properties on the figure node only -- while still having the flexibility to make adjustments on individual elements as needed. In the example, the graph labels "A" and "B" are emphasized by a larger font size and a bold-italic style, while the tick mark labels along the vertical axis are rendered using a smaller font size.

As you might expect, the figure element is essentially a "container" for one or more graphs. It can also parent any number of label, textbox, line, or shape elements. These generic elements may be used to create a flowchart figure or to annotate the graphs in a figure (provide a title, render a line that divides the figure in half, and so on). As a container, the figure element defines a viewport relative to which all child nodes are positioned and sized. It is often advantageous to define an element's size and position as a percentage of the parent's viewport; then, if the parent is resized, the child object is automatically resized in a proportionate manner.

The graph element, naturally, serves as the container for all plotted data sets. Two mandatory axis child nodes define the appearance of the graph's "X" and "Y" axes and the range of coordinate values along those axes. In this way, the graph's viewport is mapped to a coordinate system of the user's choosing. All data traces within a graph are rendered with respect to this coordinate system. Since the user determines implicitly the "units of measure" for the coordinate system, we say that data sets are specified in user units (the term appears throughout this chapter to refer to the implied units attached to data displayed in a graph).
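The style inheritance and percentage-based sizing described above can be pictured with a small Python sketch. This is only an illustration of the idea, not Figure Composer's implementation: the Element class, the style keys, and the resolve_length helper are all hypothetical names invented for this example.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Element:
    """Hypothetical tree node; not Figure Composer's actual class."""
    name: str
    styles: dict = field(default_factory=dict)  # explicitly set styles only
    parent: Optional["Element"] = None

    def add(self, child: "Element") -> "Element":
        """Attach a child node and return it."""
        child.parent = self
        return child

    def style(self, key: str):
        """Return the explicit value, else the nearest ancestor's value."""
        node = self
        while node is not None:
            if key in node.styles:
                return node.styles[key]
            node = node.parent
        raise KeyError(key)


def resolve_length(percent: float, parent_extent: float) -> float:
    """Convert a size given as a percentage of the parent viewport."""
    return parent_extent * percent / 100.0


# Styles set once on the figure node apply to the whole tree...
figure = Element("figure", {"font_size": 10, "stroke_width": 1.0})
graph = figure.add(Element("graph"))
# ...unless an individual element overrides them.
label = graph.add(Element("label", {"font_size": 14}))
```

Here graph.style("font_size") falls back to the figure's value, while the label keeps its own larger font size. Likewise, a child sized at 50% of a parent viewport that is 6 units wide resolves to 3 units, and it shrinks or grows with the parent.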

More recently, a "Z" or "color" axis was added as a mandatory child of the graph element. This zaxis element defines the graph's color map; color is used to represent 3D data in a 2D graph -- either as a heat map or a bubble plot.

Figure Composer offers considerable flexibility in the design of the graph. In addition to the Cartesian plots depicted in the sample figure, semi-log, log-log and polar plots are all supported. Axes may be laid out in four-quadrant fashion across the graph viewport, or more typically located outside the viewport in a single-quadrant configuration. The axes, while mandatory, can be hidden. This is extremely useful when lining up two or more graphs sharing a common horizontal or vertical axis. In the example, two graphs share a common vertical axis, so the vertical axis of the rightmost graph is hidden. Furthermore, since the horizontal coordinate range happens to be the same for both graphs, both horizontal axes are hidden -- replaced by a calibration bar (the calib element) defined on the second one. Although not used in the example, a graph has a single legend child, which renders a legend for the parent graph, including a short representative trace line and an associated label for each data trace visible within that graph.

To a certain degree, DataNav separates the presentation of data from its storage. The data presentation element in this example is the trace node, which governs how a referenced data set is rendered within a graph. In the figure document, all referenced data sets are stored in a single ref node at the tail end of the document. (This organization makes it easier to review the figure's tree structure even when it contains very large data sets.)

An individual data set can be thought of as a list of data tuples, where each such tuple is a list of one or more floating-point numbers. The expected content of each tuple depends on the data set format. A ptset is an arbitrary point set {x, y}, possibly with standard deviation data in x and/or y. An mset (short for "multi-set") is a collection of individual data sets that share the same x-coordinate values: {xi : y1i y2i ... yMi}. The individual point sets {xi, ymi} typically represent repeated measures of the same stochastic phenomenon or variable, so that the variability in the phenomenon is captured by the collection. A series is a regularly sampled point set -- a time series, frequency spectrum, histogram, or any other data series in which a variable y is sampled at regular intervals in x: {x0 + n·dx, yn} for n={0,1,...}. Similarly, an mseries (or "multi-series") is a regularly sampled multi-set. For the sampled formats, the sample interval dx and initial value x0 are parameters in the data set definition. These formats are discussed in more detail elsewhere.
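To make the tuple layouts above concrete, here is a minimal Python sketch that expands a series into explicit points and splits an mset into its member point sets. The function names and the in-memory representation (lists of tuples) are assumptions made for illustration; they say nothing about how DataNav actually stores these formats.

```python
def series_points(x0, dx, ys):
    """Expand a regularly sampled series into (x, y) pairs: x_n = x0 + n*dx."""
    return [(x0 + n * dx, y) for n, y in enumerate(ys)]


def mset_point_sets(tuples):
    """Split mset tuples (x, y1, ..., yM) into M point sets sharing the x values."""
    if not tuples:
        return []
    m = len(tuples[0]) - 1  # number of member sets
    return [[(t[0], t[k + 1]) for t in tuples] for k in range(m)]


# A series needs only x0, dx, and the y samples.
series = series_points(x0=0.0, dx=0.5, ys=[1.0, 4.0, 9.0])

# An mset with M = 2 member sets sampled at x = 0 and x = 1.
mset = [(0.0, 1.0, 2.0), (1.0, 3.0, 4.0)]
sets = mset_point_sets(mset)
```

The sampled formats are more compact because the x-coordinates are implied by x0 and dx rather than stored with every tuple.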

Regardless of the format of the referenced data set, the trace element lets an author render the data in any of four possible display modes: polyline, errorband, histogram, and multitrace. The left-hand trace in the example is displayed as a multitrace. Obviously, this display mode is intended for mset or mseries data; the individual point sets are drawn as separate but identically styled polylines, while the average across all of the sets is rendered as another polyline, typically using a different stroke so that it stands out from the other polylines. On the right-hand side of the figure, one data trace is rendered in errorband mode; in this mode, two polylines are drawn at -1 and +1 standard deviation about the nominal data, and the band between these may optionally be filled with a solid color. The nominal data trace is then drawn on top of the error band, usually with different stroke styling. The other data trace on the right is drawn as a simple polyline, adorned with marker symbols and error bars; the trace element's private symbol and ebar children control the appearance of marker symbols and error bars.
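As a rough illustration of what the multitrace and errorband modes compute, the Python sketch below derives the average polyline for a multi-set and the two bounding polylines of an error band. The function names and tuple layouts are hypothetical, and the sketch assumes the standard deviation in y is supplied with each point; it is not a description of Figure Composer's rendering code.

```python
from statistics import fmean


def multitrace_average(tuples):
    """For mset tuples (x, y1, ..., yM), return the average polyline."""
    return [(t[0], fmean(t[1:])) for t in tuples]


def error_band(points):
    """For (x, y, std) triples, return the polylines at y - std and y + std."""
    lower = [(x, y - s) for x, y, s in points]
    upper = [(x, y + s) for x, y, s in points]
    return lower, upper


# Two member sets sampled at x = 0 and x = 1.
avg = multitrace_average([(0.0, 1.0, 3.0), (1.0, 2.0, 6.0)])

# Nominal data with a standard deviation attached to each point.
lower, upper = error_band([(0.0, 5.0, 1.0), (1.0, 6.0, 0.5)])
```

In multitrace mode the member sets would be drawn as identically styled polylines with avg on top; in errorband mode the region between lower and upper would optionally be filled before the nominal trace is drawn.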

In addition to the trace element, Figure Composer supports four more specialized data presentation nodes. The bar node draws a typical bar plot in stacked or grouped configurations, and oriented horizontally or vertically. It requires a collection-type data source (mset or mseries). The raster node is intended for the display of spike train rasters, a form of 1D discrete time-series data; its data source must adhere to the raster1d format. The contour and scatter nodes offer ways to render 3D data {x,y,z} in two dimensions. A contour node takes an image-like data matrix in which z is sampled at regular pixel locations (x,y) -- as defined by the xyzimg format -- and displays that data as a contour plot or a "heatmap"; in the latter case, it maps each value z(x,y) to a color via an 8-bit look-up table, or color map (as defined by the parent graph's color axis). The scatter node renders the 3D point set -- xyzset format -- as a bubble plot of unconnected marker symbols; for a given 3D point (x,y,z), the size and/or fill color of the marker drawn at (x,y) will reflect the value z (a scatter node also accepts 2D ptset data, in which case it renders a standard scatter plot in which all markers have the same size and color).
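The heat map's use of an 8-bit look-up table can be sketched in Python as follows. The linear mapping, the clamping of out-of-range values, and the grayscale table are all assumptions chosen for simplicity; the actual mapping is governed by the graph's color axis and may differ.

```python
def lut_index(z, z_min, z_max):
    """Map z linearly onto an index in 0..255, clamping out-of-range values."""
    if z_max <= z_min:
        raise ValueError("z_max must exceed z_min")
    frac = (z - z_min) / (z_max - z_min)
    frac = min(max(frac, 0.0), 1.0)  # clamp to [0, 1]
    return round(frac * 255)


# Hypothetical grayscale color map: 256 (r, g, b) entries.
GRAY_LUT = [(i, i, i) for i in range(256)]


def z_to_color(z, z_min, z_max, lut=GRAY_LUT):
    """Look up the color for z in the given 256-entry table."""
    return lut[lut_index(z, z_min, z_max)]
```

With a color axis range of 0 to 10, for example, z = 0 selects the first table entry and z = 10 the last, while values outside the range are pinned to the nearest end of the table.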

Another way to draw a trace in a graph is to use the function element, which renders a set of points generated by a mathematical function y = f(x). The function formula uses the letter x to represent the independent variable (e.g., 10.5*sin(2*pi*x/5), 32*x*exp(-x^2)). Attributes specify the range [x0 .. x1] over which the function is evaluated, as well as the sample interval dx. Functions are always rendered as "connect-the-dots" polylines. Marker symbols may be drawn in accordance with attributes defined on the function element's private symbol child node.
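The sampling of a function over [x0 .. x1] at interval dx can be sketched in Python as shown here. The sample_function helper is a hypothetical name, and treating x1 as included when it falls on a sample point is an assumption; note also that Python writes the power in the second example formula as x**2 rather than x^2.

```python
import math


def sample_function(f, x0, x1, dx):
    """Evaluate f at x0, x0 + dx, ... without stepping past x1."""
    if dx <= 0 or x1 < x0:
        raise ValueError("require dx > 0 and x1 >= x0")
    # Small tolerance so x1 is kept when it lands on a sample point.
    n = int(math.floor((x1 - x0) / dx + 1e-9))
    return [(x0 + i * dx, f(x0 + i * dx)) for i in range(n + 1)]


# The formula 32*x*exp(-x^2), sampled on [0, 2] every 0.5 units.
pts = sample_function(lambda x: 32 * x * math.exp(-x**2), 0.0, 2.0, 0.5)
```

The resulting list of (x, y) pairs is what a "connect-the-dots" polyline would pass through; a smaller dx gives a smoother curve at the cost of more points.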

This brief discussion has touched upon most of the "building blocks" used to construct scientific figures in DataNav's Figure Composer. The complete set of graphic elements and their defining attributes, along with the rules governing property values and the content of each element class, constitutes the "language" in which the program describes publication-quality figures of scientific data. Figure documents are written in this language, an XML-based "markup language" with the moniker FypML ("fyp" is short for Phyplot, Figure Composer's predecessor; it is also the preferred extension for figure document files). Knowledge of FypML makes it possible (but not easy!) to manually prepare a figure document using only a text editor, or better yet, to write programs that automatically generate a document which can later be "tweaked" in the Figure Composer app. Check out the section on FypML if you're interested in learning the details and syntax of this markup language.