Data Presentation - 2D

Raw data can be plotted within a 2D graph in a myriad of ways. Figure Composer offers seven distinct 2D data presentation objects: trace, raster, heatmap, scatter, bar, area, and pie. Of these, the trace node is the most versatile. It supports data sets in any of four different formats, and it can render histograms, poly-line traces with and without marker symbols, error bar plots, error bands, and rose and compass plots (in a polar context).

Data Trace Properties

The screenshot on the right shows the property editor for the trace node. The first two rows of widgets in the editor panel are common to the property editors for all of the different data presentation elements. In the first row you can specify a title for the element. Not only does a unique title help you identify the object in the Figure Navigator's node tree, but it is also used to label the corresponding entry in the parent graph's automated legend. To include the trace node in the legend, check the box just to the left of the title field.

The second row displays the id of the source data set, along with three pushbuttons. Press the Edit button to raise the Dataset Editor Dialog, in which you can view and/or modify the data set by hand. The other buttons raise a file chooser so that you can either load a different data set from an external source file, or save the current data set to file. See the chapter on Loading Data for more information.

The remaining parameter controls on the Data Trace Properties editor define exactly how the raw data set is rendered. Note the tabbed pane at the bottom; the screenshot shows the contents of each of the three tabs -- Line, Symbols, and Error Bars. These tabs contain mostly graphic styles and a few other parameters; with these you can separately control the appearance of the trace line, any marker symbols, and any error bars drawn.

A key parameter control is the combo box that sets the trace display mode: polyline, staircase, errorband, histogram, or multitrace. In polyline mode, the referenced data set is drawn as a "connect-the-dots" trace, optionally adorned with marker symbols and/or error bars. If the data set is very large, you can set a skip interval -- the numeric field labeled "every" -- greater than 1 to reduce the number of points actually plotted. You can also offset the trace in X and Y without altering the data source by setting nonzero offsets x0 and y0. The rendered trace line is stroked in accordance with properties on the Line tab. Marker symbols are drawn at each well-defined data point in accordance with information supplied on the Symbols tab; if you do not want to display marker symbols, the symbol size should be set to 0. Finally, if the data source includes nonzero standard deviations in X and/or Y, error bars are drawn at each data point, defined and styled in accordance with the properties on the Error Bars tab. In the specific example illustrated by the screenshot (assume the display mode is polyline instead of multitrace), the trace line would be solid green, a half point in thickness, "butt"-ended, with mitered joins. It lacks any marker symbols or error bars, since symbol size is set to 0 inches on the Symbols tab and the Hide all? box is checked on the Error Bars tab. You could also configure the trace to look like a scatter plot by setting the stroke width to 0 on the Line tab and specifying a nonzero size on the Symbols tab; however, as of version 4.6.2, you can create a scatter or bubble plot directly with the scatter node.
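The effect of the skip interval and the X/Y offsets can be pictured with a short Python sketch. This is only an illustration, not Figure Composer's code: the function name `plotted_points` is hypothetical, and it assumes that the skip interval keeps the first point and every Nth point after it.

```python
def plotted_points(points, every=1, x0=0.0, y0=0.0):
    """Return the points actually drawn for a polyline trace.

    Assumption: with a skip interval N, points 0, N, 2N, ... are kept.
    The source data list is never modified; offsets apply only to the output.
    """
    if every < 1:
        raise ValueError("skip interval must be 1 or greater")
    return [(x + x0, y + y0) for x, y in points[::every]]


data = [(0, 0), (1, 1), (2, 4), (3, 9), (4, 16)]
print(plotted_points(data, every=2, x0=10, y0=-1))
```

Note that `data` itself is unchanged afterwards, which mirrors the statement that the offsets do not alter the data source.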

The staircase mode is a slight variation on the polyline mode. Instead of being connected by straight line segments, the data points are connected with a sample-and-hold "step": for each pair of consecutive points (x1, y1) and (x2, y2), an intervening point is included at (x2, y1). Note that symbols and error bars are only rendered at the actual data point locations, not at the intervening sample-and-hold points. This display mode might be used to emphasize the discrete nature of a sampled data set.
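As a minimal sketch of the sample-and-hold rule just described, the following Python function inserts the intervening point (x2, y1) between each pair of consecutive points. The function name `staircase_vertices` is hypothetical and is not part of Figure Composer.

```python
def staircase_vertices(points):
    """Return the vertices of a staircase trace for a list of (x, y) points."""
    if not points:
        return []
    vertices = [points[0]]
    for (x1, y1), (x2, y2) in zip(points, points[1:]):
        vertices.append((x2, y1))  # hold the previous y value up to the next x
        vertices.append((x2, y2))  # then step to the actual data point
    return vertices


print(staircase_vertices([(0, 1), (1, 3), (2, 2)]))
# -> [(0, 1), (1, 1), (1, 3), (2, 3), (2, 2)]
```

Only the original three points would carry symbols or error bars; the two inserted vertices exist solely to shape the line.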

The errorband mode is intended for data sets with nonzero standard deviation data in Y. The variation in the nominal data is rendered as a band encompassing +/-1 standard deviation about the nominal trace {x, y}. One trace connects the points {x, y+dy}, while another connects the points {x, y-dy}, where dy represents one standard deviation in y at each point (x, y). Both standard deviation lines are stroked in accordance with the properties on the Error Bars tab. The band between the two lines will be filled with the text/fill color specified on that tab; use a transparent fill if you don't want to fill the band. Finally, a trace connecting the nominal data points {x, y} is stroked using the style properties on the Line tab. No marker symbols are drawn in this display mode, and the x0, y0 and every parameter fields have the same effect as in polyline mode.
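The geometry of the error band can be sketched in a few lines of Python. The function name `error_band` is hypothetical, and building the filled region as a single closed polygon (along the upper line, then back along the lower line) is an assumption made for illustration rather than a description of Figure Composer's renderer.

```python
def error_band(xs, ys, dys):
    """Return the upper line, lower line, and a fill polygon for an error band."""
    upper = [(x, y + dy) for x, y, dy in zip(xs, ys, dys)]
    lower = [(x, y - dy) for x, y, dy in zip(xs, ys, dys)]
    # Assumed fill outline: forward along the upper line, back along the lower.
    polygon = upper + lower[::-1]
    return upper, lower, polygon


upper, lower, polygon = error_band([0, 1], [2.0, 4.0], [0.5, 1.0])
print(upper)    # -> [(0, 2.5), (1, 5.0)]
print(lower)    # -> [(0, 1.5), (1, 3.0)]
print(polygon)  # -> [(0, 2.5), (1, 5.0), (1, 3.0), (0, 1.5)]
```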

In histogram mode, the data set is rendered as a histogram, possibly adorned with symbols and error bars. The barWidth ("barW") and baseline ("base") attributes define the histogram bar width and baseline level for this display mode. [Since they are applicable only to histogram mode, the corresponding widgets are disabled in any other display mode.] If the barWidth is 0, the bars are drawn as lines instead. On a polar graph, the bars become circular pie wedges (baseline = 0) or radial sections. The histogram bars are stroked and filled in accordance with the graphic styles listed on the Line tab. If any marker symbols and error bars are present (which would be unusual), their appearance is governed by the Symbols and Error Bars tabs, as in polyline mode; no trace line connects the symbols, however.

Finally, the multitrace display mode is intended for a data source which is actually a collection of two or more individual point sets sharing the same x-coordinates (the mset and mseries data formats). Typically, the individual point sets represent repeated measures of the same stochastic variable over time, so that the collection captures the variation in that variable. The individual point sets are drawn as separate traces, each styled in accordance with the properties listed on the Error Bars tab. If the error bar end-cap size on that tab is nonzero, then the specified marker symbol ("bracket" in the screenshot) is rendered at each well-defined data point across all sets; however, for best results, authors will typically specify a cap size of 0 so that no symbols are drawn. On top of all this, a nominal trace -- representing the average across the individual data sets -- is rendered as a trace line with optional marker symbols -- but only if the Avg? box is checked (again, this attribute has no meaning and is thus disabled in the other display modes). The trace line is styled by the attributes on the Line tab, while the marker symbols are styled in accordance with the Symbols tab. Note that, if the referenced data set does not contain two or more component point sets, then all that will be rendered is the "nominal" trace.
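The "nominal" trace in multitrace mode is described as the average across the individual point sets at each shared x-coordinate. A minimal Python sketch of that computation follows; the function name `nominal_trace` is hypothetical, and the sketch assumes every point set has a well-defined value at every x (how the application treats missing values is not covered here).

```python
def nominal_trace(xs, y_sets):
    """Average several point sets that share the same x-coordinates."""
    if not y_sets:
        return []
    # zip(*y_sets) groups the y values of all sets at each x position.
    return [(x, sum(column) / len(column)) for x, column in zip(xs, zip(*y_sets))]


print(nominal_trace([0, 1], [[1, 2], [3, 6]]))
# -> [(0, 2.0), (1, 4.0)]
```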

Contour Plot Properties

The screenshot on the right shows the layout of the Contour Plot Properties editor panel. The first two rows of widgets are the same as in the Data Trace Properties editor, although there's no "include in legend" checkbox next to the title for a contour node. Figure Composer does not support legend entries for contour objects.

The contour node, first introduced in version 3.0 as the heatmap node, can display a 3D "data grid" in one of four display modes. The underlying data source must adhere to the 3D xyzimg data format. This data set is essentially a WxH matrix containing the values of a dependent variable Z = f(X,Y) sampled over a rectangle [X0..X1, Y0..Y1] in (X,Y)-space. In the heatMap display mode -- which replicates the functionality of the deprecated heatmap node -- the 3D data set is rendered as a WxH-pixel image by mapping each value in the matrix onto the range [1..255], which then serves as an index into an 8-bit color map, or look-up table, that specifies the RGB color of the corresponding pixel in the image. The image is scaled as needed so that it fits in the rectangle bounded by [X0..X1, Y0..Y1] in the parent graph. The graph's color axis defines the color map used to generate the heat map image. Note that any NaN values in the image data map to color index 0 which, in turn, maps to a designated NaN color that is part of the color map description.
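The mapping from a data value to a color index might look like the following Python sketch. It is an illustration only: the function name `color_index` is hypothetical, and it assumes a linear mapping over a range [z_min, z_max] with out-of-range values clamped, whereas the actual mapping is governed by the parent graph's color axis.

```python
import math


def color_index(z, z_min, z_max):
    """Map a data value to an 8-bit color map index.

    Index 0 is reserved for NaN; well-defined values map onto 1..255.
    Assumption: linear mapping, with out-of-range values clamped.
    """
    if math.isnan(z):
        return 0
    if z_max <= z_min:
        return 1
    fraction = (z - z_min) / (z_max - z_min)
    fraction = min(1.0, max(0.0, fraction))
    return 1 + round(fraction * 254)


print(color_index(float("nan"), 0, 10))  # -> 0
print(color_index(0, 0, 10))             # -> 1
print(color_index(5, 0, 10))             # -> 128
print(color_index(10, 0, 10))            # -> 255
```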

In the levelLines display mode, contour level-curves are generated and stroked in accordance with the node's current stroke properties -- except that each curve's contour level is color-mapped in the manner just described to determine the line color. In the filledContours mode, the bands between the level-curves are color-mapped, while the level-curves are all drawn in the same color -- the node's current strokeColor. Finally, in the contouredHeatMap display mode, the heat map image is superimposed with the contour level-lines, all drawn in the same color.

Note that a contour node will NOT render at all if the parent graph is a polar plot; it is currently supported only in standard Cartesian, log-log, or semi-log plots.

Besides its data source and display mode, a contour node has very few configurable properties. Enter up to 20 different contour level values in the Levels text box. They need not be entered in any particular order, and any repeat values (or any values outside the Z-range of the source data grid) are ignored. Be sure to press the button to the right of the text box to confirm any changes. You can leave the levels list empty, in which case Figure Composer will select the levels for you. The only other parameter of note is the Smooth heat map? flag, applicable only in the display modes that generate a heat map image. If set, Figure Composer will interpolate between defined points in the referenced xyzimg data set, resulting in a smoother appearance. However, the smoothing effect can be misinterpreted as higher-resolution data, so some authors prefer to disable the feature. You can, in fact, set a user preference for this attribute via the File|Preferences command.
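The rule for which entered contour levels take effect can be sketched as a simple filter. The function name `effective_levels` is hypothetical, and treating the Z-range bounds as inclusive is an assumption; the sketch only shows that repeats and out-of-range values drop out regardless of entry order.

```python
def effective_levels(levels, z_min, z_max):
    """Drop repeated levels and levels outside the data grid's Z-range."""
    seen = set()
    kept = []
    for level in levels:
        # Assumption: a level exactly equal to z_min or z_max is kept.
        if level in seen or not (z_min <= level <= z_max):
            continue
        seen.add(level)
        kept.append(level)
    return kept


print(effective_levels([5, 1, 5, 12, 3], 0, 10))
# -> [5, 1, 3]
```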

Raster Properties

The raster object, added in version 3.0, is specialized to render a one-dimensional collection of samples, as encapsulated by the raster1d data set format. This data set is a collection of zero or more x-rasters, each a sequence of sample values in x. The node was originally introduced to render the occurrence times of action potentials as a sequence of short vertical hash marks -- a "spike train". But it can also present a histogram of sampled data, whether those be spike times or anything else. The raster element's data set and various attributes are edited in the Raster Properties panel (see right). Again, the first two rows of widgets serve the same role as in the Data Trace Properties editor. There are no font properties in the editor panel because the raster node never renders any text.

Five different display modes are available: two alternate spike train-like modes (trains, trains2) and three histogram-type modes (histogram, pdf, cdf). In trains mode, each x-raster in the data collection is rendered in spike-train fashion, with short vertical hash marks drawn at each raster sample value. All sample values are offset in accordance with the value in X Offset (a numeric text field), and the baseline of the first raster is positioned at the value in the Y Offset field. The Line Ht parameter specifies the height of each raster hash mark in stroke-width units, while the Spacer property sets the distance between the baselines of consecutive raster trains, again in stroke-width units. The trains2 display mode is similar to trains, except that the Spacer property is ignored (the numeric field is disabled) and the baseline of each individual raster is its ordinal position (starting at 0) in the raster collection, adjusted by the value in the Y Offset field; thus, the raster train is more closely tied to the vertical axis of the parent graph in this mode. In both modes, the raster hash marks are stroked in accordance with the stroke style properties listed in the last row of the editor.

In histogram mode, DataNav renders a histogram computed from the sample data. The #Bins text field specifies the number of bins into which the histogram is divided. The Range Limits text fields select the sample range [S..E] over which the data is binned; if S >= E, the actual observed sample range is used instead. For each bin in the histogram, a bar is drawn from the specified baseline value to N, the number of samples falling in that bin; the bar spans the entire bin width. Bars are stroked in accordance with the node's stroke properties and filled with its current text/fill color; transparent, translucent or opaque colors are all supported. Translucent fills are very useful when two histograms overlap. Typically, baseline will be 0, but that is not a requirement -- so a histogram bar could extend above or below the baseline. If the avg flag is set, the bin value is instead the average count per raster per bin (total count per bin divided by the number of individual rasters in the collection).
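The binning rules above can be sketched in Python as follows. This is an illustration under stated assumptions, not the application's implementation: the function name `raster_histogram` is hypothetical, bins are assumed to have equal width, and a sample equal to the end of the range is assumed to fall in the last bin.

```python
def raster_histogram(rasters, n_bins, start, end, avg=False):
    """Bin all samples of a raster collection into n_bins equal-width bins."""
    samples = [s for raster in rasters for s in raster]
    if not samples:
        return [0] * n_bins
    if start >= end:
        # Fall back to the observed sample range, as the text describes.
        start, end = min(samples), max(samples)
    if start == end:
        raise ValueError("sample range has zero width")
    width = (end - start) / n_bins
    counts = [0] * n_bins
    for s in samples:
        if start <= s <= end:
            # Assumption: the last bin is closed on the right.
            index = min(int((s - start) / width), n_bins - 1)
            counts[index] += 1
    if avg:
        # Average count per raster per bin.
        return [c / len(rasters) for c in counts]
    return counts


rasters = [[0.5, 1.5, 2.5], [0.2, 3.9]]
print(raster_histogram(rasters, 4, 0, 4))            # -> [2, 1, 1, 1]
print(raster_histogram(rasters, 4, 0, 4, avg=True))  # -> [1.0, 0.5, 0.5, 0.5]
```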

Two additional display modes were added in version 5.0.2 that offer an estimate of the data collection's probability density function, pdf, or cumulative distribution function, cdf. For the pdf, each bin value = C/(T*W), where C is the sample count in that bin, T is the total sample count over the specified sample range, and W is the bin width. In cdf mode, each bin value is S/T, where S is the cumulative total sample count across the current bin and all preceding bins; the last bin, of course, will have the value T/T = 1. The avg flag is ignored in these two modes. Also, unlike histogram mode, the histogram bars are drawn from baseline to (baseline + bin value).
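Given per-bin counts, the two formulas C/(T*W) and S/T can be worked through with a short Python sketch. The function name `pdf_cdf` is hypothetical; returning all zeros for an empty data set is an assumption made to keep the example well-defined.

```python
def pdf_cdf(counts, bin_width):
    """Compute pdf and cdf bin values from per-bin sample counts."""
    total = sum(counts)
    if total == 0:
        zeros = [0.0] * len(counts)
        return zeros, list(zeros)
    pdf = [c / (total * bin_width) for c in counts]  # C / (T * W)
    cdf = []
    running = 0
    for c in counts:
        running += c
        cdf.append(running / total)                  # S / T
    return pdf, cdf


pdf, cdf = pdf_cdf([1, 2, 1], 0.5)
print(pdf)  # -> [0.5, 1.0, 0.5]
print(cdf)  # -> [0.25, 0.75, 1.0]
```

In this example the pdf values times the bin width sum to 1, and the final cdf value is T/T = 1, as stated above.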

Bar, Area and Pie Chart Properties

Plot objects presenting data in bar, area or pie charts are relatively recent additions to Figure Composer, introduced in versions 4.6 and 4.7. The property editors for these three data presentation elements are shown in the right-hand screenshot. These elements are similar in the sense that they present relatively small groups of data: the bar groups of a bar plot, the stacked regions in an area chart, or the slices in a pie chart. Each such "data group" is typically assigned a distinct color and label, and each group gets a separate entry in the parent graph's automated legend. In all three cases, the legend entry for each data group is a rectangular bar filled with the color assigned to that group.

The bar node is specialized for rendering a relatively small collection of data sets {X: Y1 Y2 Y3 ...} as a traditional bar plot. It can be rendered only in a graph configured in standard Cartesian coordinates (if the graph has polar or logarithmic coordinates, the bar plot is simply not drawn). The data source must be an mset or mseries collection of up to 20 individual data sets, with each set in the collection plotted as a distinctly colored bar group. Any additional sets beyond the first 20 are ignored.

There are four different display modes, selected by the combo box in the third row on the editor panel: vgroup (groups of vertical bars), hgroup (groups of horizontal bars), vstack (single stacked vertical bar at each X coordinate), and hstack (single stacked horizontal bar at each Y coordinate). The raw data should always be specified for the vertical display modes; in the horizontal modes, the data is simply transposed. Bar width is specified as an integer percentage (5-100) in the numeric text field labeled Bar W; it determines the spacing between adjacent bars at a given value of X, as well as the gap between adjacent bar groups (in the grouped display modes). The Base parameter sets the baseline value for the bar plot; each bar is drawn from the baseline to the particular value Y represented by that bar. All bars are stroked in the same manner, as specified by the bar node's stroking properties.

At the bottom of the Bar Properties editor is a small table that shows the fill color and legend label assigned to each distinct data group in the bar plot. The example in the screenshot is a bar plot with four groups. The fill colors and labels are applied to the individual member sets in the raw data collection in the order they appear in the table. The number of rows in the table will match the number of member sets in the collection (up to the aforementioned limit of 20). Figure Composer will automatically generate colors and labels as needed. To change any color or label, simply double-click on the corresponding table cell and edit the value "in place". In the case of the bar color, an RGB color picker is raised by which you can choose a new color; note that a transparent fill is NOT supported here.

The area element renders a small collection of data sets {X: Y1 Y2 Y3 ...} as a stacked area chart. In this stacked presentation, each colored "band" or region in the area chart represents the relative contribution of a single member set in the collection, while the top of the chart represents the cumulative total across all member sets. The area chart can be drawn in any supported coordinate system, including polar graphs. Typically the Y values are all of the same sign across all member sets; although this is not required, the area chart will be difficult to interpret otherwise. The data source can come in any of the four 2D formats: ptset, series, mset, and mseries. If the source has just a single member set, then the area chart will only have one colored region.

The chart starts at the value specified in the Baseline text field in the Area Chart Properties editor; typically, the baseline is zero. The fill colors and legend labels assigned to each distinct region in the area chart (corresponding to each member set in the underlying data source) are listed in a table at the bottom of the editor panel, in the same manner as the Bar Plot Properties editor. All bands in the area chart are stroked identically, in accordance with the area node's stroke properties.

The pie element presents a short list of data {Y1 Y2 .. YN}, where N<=20, as a traditional pie chart. Obviously, the pie chart is rendered only within a polar graph context, and typically an author will hide both the axis lines and the grid lines of the parent graph. By convention, the data source for a pie node can be any of the four recognized 2D data formats, but the X-coordinate data are completely ignored. For the ptset and series formats, the list of Y-values is taken in order, with any NaN or negative Y value treated as 0. For the mset and mseries formats, the average value of Y across the member sets is computed at each X, omitting any NaN values. Any computed average that is NaN or negative is again treated as zero. In all cases, if the data length exceeds 20, only the first 20 values are used to build the pie chart.

The number of "slices" in the pie chart is the length of this compiled list of Y values. Let T be the sum of these values. The angular extent of the pie slice representing the value Yi is, in degrees, 360 * Yi / T.
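The slice computation for a simple list of Y values can be sketched as follows. The function name `pie_slice_angles` is hypothetical, and returning all-zero angles when the values sum to zero is an assumption made to avoid a division by zero in the example.

```python
import math


def pie_slice_angles(values, max_slices=20):
    """Return the angular extent, in degrees, of each pie slice."""
    # Only the first 20 values are used; NaN or negative values count as 0.
    cleaned = [0.0 if (math.isnan(v) or v < 0) else float(v)
               for v in values[:max_slices]]
    total = sum(cleaned)
    if total == 0:
        return [0.0] * len(cleaned)
    return [360.0 * v / total for v in cleaned]


print(pie_slice_angles([1, 1, 2]))
# -> [90.0, 90.0, 180.0]
print(pie_slice_angles([1, float("nan"), -3, 3]))
# -> [90.0, 0.0, 0.0, 270.0]
```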

Observe that the data does not define the radial extent of the pie chart. The inner and outer radii are properties of the pie object itself; they're specified in the numeric text fields labeled IR and OR in the Pie Chart Properties editor. The inner radius is always less than the outer radius and is typically zero. However, you can create a "donut"-shaped pie chart by specifying a non-zero inner radius (try it).

As with bar and area charts, the fill colors and legend labels attached to the individual pie slices are listed in a table at the bottom of the editor. There is, however, an additional column in this table, containing a check box to indicate whether or not the corresponding pie slice is displaced radially from the chart origin. Any displaced slices are offset by the same amount, specified in the offset% text field as an integer percentage of the outer radius.

Bar, Area and Pie Chart Property Editors

Scatter Plot Properties

The scatter element is another rather specialized data presentation node that renders a small 2D point set {X, Y} or 3D point set {X, Y, Z} as an X-Y scatter or "bubble" plot. Once again, a combo box selects the display mode:

- scatter : A symbol of fixed size and fill color (if it's a closed symbol!) is drawn at each well-defined data point in the data source.
- sizeBubble : The symbol size varies with the Z-coordinate of each data point (X, Y, Z); fill color is constant.
- colorBubble : Symbol size is fixed, while fill color varies with Z.
- colorSizeBubble : Both symbol size and fill color vary with Z.

The data source format must be the 2D ptset or 3D xyzset. Obviously, the three "bubble" display modes are applicable only when the data format is xyzset.

Under the display mode combo box are controls for selecting the marker symbol to use and a symbol size. While you will almost always want to use a closed, filled symbol in a scatter or bubble plot, non-closed symbols are also allowed. If the display mode is scatter or sizeBubble, all symbols are filled with the color specified by the node's text/fill color property (and since the fill color can be transparent, you can have "hollow" symbols in these modes). For the colorBubble and colorSizeBubble display modes, the symbol representing a 3D data point (X, Y, Z) is drawn at (X, Y) and filled with the (opaque) color identified by mapping Z to the parent graph's color map, in accordance with the current definition of the graph's color axis. For best results in these modes, it is important that the color axis range match that of the data source backing the scatter node.

If the display mode is scatter or colorBubble, all symbols are drawn at the same size S, as specified in the symbol size field in the property editor. In these modes the size is typically small, say 0.1 inches. In the sizeBubble and colorSizeBubble modes, the size of the symbol representing a 3D data point (X, Y, Z) is calculated as S*Z/Zmax, where Zmax is the maximum observed Z-coordinate value in the raw data set. Thus, S may be thought of as the "maximum symbol size" in these two display modes, and it may be set to a relatively large value. Note that it is expected that all Z-coordinate data will be non-negative; else expect some strangeness!
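The S*Z/Zmax sizing rule for the sizeBubble and colorSizeBubble modes is easy to check by hand with a small Python sketch. The function name `bubble_sizes` is hypothetical, and the handling of an empty data set or a non-positive Zmax (all sizes zero) is an assumption.

```python
def bubble_sizes(zs, max_size):
    """Scale each symbol's size by its Z value relative to the largest Z."""
    if not zs:
        return []
    z_max = max(zs)
    if z_max <= 0:
        # Assumption: with no positive Z values, nothing is drawn.
        return [0.0] * len(zs)
    return [max_size * z / z_max for z in zs]


print(bubble_sizes([1, 2, 4], 0.5))
# -> [0.125, 0.25, 0.5]
```

The point with the largest Z is drawn at exactly the size S entered in the editor, which is why S behaves as a maximum symbol size in these modes.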

Regardless of the display mode, all symbols in the scatter/bubble plot are outlined in accordance with the scatter node's stroking properties.

Function Properties

A FypML function node is a pseudo-data presentation node in which the "data source" is actually a mathematical function y = f(x). Like any other data presentation element, the function node has a title and can be included in the parent graph's automated legend. Uncheck the box next to the title field if you do not want to list the function object in the legend.

Immediately below the title field is another text field in which you enter the function formula. As you can see in the screenshot on the right, the formula string is a typical mathematical expression written in a natural way, with the letter "x" (or capital "X") representing the function's independent variable. In addition to "x", the formula string may contain any of the following operators:

- The typical binary operators: +, -, *, /, ^ (raise to a power), and % (modulo).
- Unary minus or negate operator, -. This is distinguished from the subtract operator based on its position in the function string.
- Grouping operators. Parentheses are used to clarify precedence or delimit the argument list for a function operator. The comma operator is valid only when used to separate the arguments of a function operator.
- The constant "pi". Useful in trigonometric functions.
- Function operators: sin, cos, tan, asin, acos, atan, sqrt, pw, exp, expm1, log, log1p, log10, abs, floor, ceil, rint (round to nearest integer). All supported operators take double-precision floating-point arguments and return double-valued results.

Standard operator precedence is enforced (listed here from highest to lowest precedence) and associativity is left to right:

1. All function operators.
2. Negate (unary -) operator.
3. Raise-to-a-power (^) binary operator.
4. Multiply, divide, and modulo operators.
5. Add, subtract operators.
6. Grouping operators.
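If the precedence list above is read literally, negation binds more tightly than ^ and chained powers group from left to right. Python's own rules differ on both counts, so the short sketch below uses explicit parentheses to show the grouping that list implies, next to what Python does by default. The variable names are illustrative only, and the results describe Python arithmetic, not verified Figure Composer output.

```python
# Grouping implied by the list above (negate before power, left-to-right).
negate_then_power = (-2) ** 2        # how "-2^2" would group
left_to_right_power = (2 ** 3) ** 2  # how "2^3^2" would group

# Python's default grouping, shown for contrast.
python_negate = -2 ** 2              # parsed as -(2 ** 2)
python_power = 2 ** 3 ** 2           # parsed as 2 ** (3 ** 2)

print(negate_then_power, left_to_right_power)  # -> 4 64
print(python_negate, python_power)             # -> -4 512
```

When in doubt about how a formula will be grouped, adding parentheses to the formula string removes the ambiguity.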

Whenever you enter or modify the function formula string, Figure Composer parses the string to verify it represents a valid, supported mathematical expression. If so, an "OK" icon (white check mark in a green circle) appears to the right of the formula field. If not, an "invalid" icon (red circle with a diagonal slash) appears; hover the mouse over that icon to see a brief description of the error detected.

Below the formula field are three numeric widgets that define the domain in X over which the function formula is evaluated. Enter the start and end of the range, and a sample interval dx. Of course, the start and end points of the range cannot be the same, and dx cannot be zero.
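Sampling a function over a domain defined by a start, an end, and an interval dx might be sketched as below. The function name `sample_function` is hypothetical. Stepping toward the end point regardless of the sign of dx, and including the end point when it falls on a sample, are assumptions made for this illustration.

```python
import math


def sample_function(f, start, end, dx):
    """Evaluate f at start, start + dx, ... without passing the end point."""
    if start == end or dx == 0:
        raise ValueError("start and end must differ, and dx must be nonzero")
    # Assumption: always step from start toward end.
    step = math.copysign(abs(dx), end - start)
    # Small tolerance so an end point that lands on a sample is included.
    count = int(abs(end - start) / abs(dx) + 1e-9) + 1
    return [(start + i * step, f(start + i * step)) for i in range(count)]


print(sample_function(lambda x: x * x, 0.0, 1.0, 0.25))
# -> [(0.0, 0.0), (0.25, 0.0625), (0.5, 0.25), (0.75, 0.5625), (1.0, 1.0)]
```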

At the bottom of the Function Properties editor is a tabbed pane with two tabs, Polyline and Symbols. The graphic styles and other parameters on these two tabs control the appearance of the line connecting the points (x, f(x)) at which the function is evaluated, and the marker symbol (if any) rendered at each point. Their usage is very similar to the Line and Symbols tabs on the Data Trace Properties editor.