Similarity Analyses

Purpose

Graphically display the similarity between Sites (or something analogous) based on the presence or absence of items. Sites that share many of the same items and are missing many of the same items are expected to be very similar.

Analysis

The goal of the analysis is to produce a dendrogram.

Typical Data

Notes:

    • The example table below shows the structure of a typical data table but is too small to be of much value.
    • A 0 in a cell means "absence" and a 1 means "presence."
    • Each cell must have either a 0 or 1.
    • Practical numbers of Sites are 8 to 50.
    • Practical numbers of Items are 20 to 200.

You can enter your data in several ways:

    • Copy and paste from a spreadsheet program such as Excel or Google Docs.
    • Type a space- or tab-delimited file (see below for an example).

. Site_1 Site_2 Site_3 Site_4 Site_5
Item_1 1 1 0 1 0
Item_2 1 1 1 0 1
Item_3 1 1 0 0 0
Item_4 1 0 1 1 0
Item_5 0 1 1 1 0
Item_6 0 1 0 0 0
Item_7 1 1 1 1 1
Item_8 0 0 1 0 1

    • Note the period at the start of the first line; it needs to be there (because you have a blank cell in the upper left corner).
    • All rows must have at least one value that is 1.
    • Row and column names (e.g., Site_1 or Item_1) must not have embedded blanks.
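The rules above can be checked before the data goes into PAST. The following is only a minimal sketch in Python and is not part of PAST; the function name parse_presence_table and the exact error messages are made up for illustration.

```python
def parse_presence_table(text):
    """Parse a space- or tab-delimited presence/absence table.

    Returns (site_names, item_names, rows), where rows[i][j] is 0 or 1.
    Raises ValueError when one of the formatting rules is broken.
    """
    lines = [ln for ln in text.splitlines() if ln.strip()]
    header = lines[0].split()
    if header[0] != ".":
        raise ValueError("the first line must start with a period")
    sites = header[1:]
    items, rows = [], []
    for ln in lines[1:]:
        fields = ln.split()
        name, cells = fields[0], fields[1:]
        # A name with an embedded blank splits into extra fields,
        # so it shows up here as a wrong count or a non-0/1 cell.
        if len(cells) != len(sites):
            raise ValueError(f"{name}: expected {len(sites)} values, got {len(cells)}")
        if any(c not in ("0", "1") for c in cells):
            raise ValueError(f"{name}: every cell must be 0 or 1")
        values = [int(c) for c in cells]
        if sum(values) == 0:
            raise ValueError(f"{name}: each row needs at least one 1")
        items.append(name)
        rows.append(values)
    return sites, items, rows


EXAMPLE = """\
. Site_1 Site_2 Site_3 Site_4 Site_5
Item_1 1 1 0 1 0
Item_2 1 1 1 0 1
Item_3 1 1 0 0 0
Item_4 1 0 1 1 0
Item_5 0 1 1 1 0
Item_6 0 1 0 0 0
Item_7 1 1 1 1 1
Item_8 0 0 1 0 1
"""

sites, items, rows = parse_presence_table(EXAMPLE)
print(len(sites), "sites and", len(items), "items read without errors")
```

A table that breaks a rule (for example, a row of all zeros) raises a ValueError naming the offending row.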

Generalizations

Dendrograms are used to show patterns. This type of analysis applies to many situations.

    • Species that are found on sites (e.g., which are the most similar sites?)
    • Food items that are found on menus (e.g., which fast food places are most similar?)
    • Garden plants that are found in different people's yards (e.g., what are the patterns displayed by different ethnic groups?)

Example Problems (test your skills)

This diagram (generated from the example table under "Typical Data") shows that the most similar sites are 3 and 5. The least similar site is 4.
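The "most similar" claim can be checked by hand. The sketch below, in Python, scores every pair of sites from the example table with 2 × (shared items) / (items at site A + items at site B), which is what the Bray-Curtis similarity works out to for 0/1 data. The variable and function names are chosen for illustration only.

```python
from itertools import combinations

# Columns of the example table: one 0/1 list per site, for Item_1..Item_8.
sites = {
    "Site_1": [1, 1, 1, 1, 0, 0, 1, 0],
    "Site_2": [1, 1, 1, 0, 1, 1, 1, 0],
    "Site_3": [0, 1, 0, 1, 1, 0, 1, 1],
    "Site_4": [1, 0, 0, 1, 1, 0, 1, 0],
    "Site_5": [0, 1, 0, 0, 0, 0, 1, 1],
}


def similarity(a, b):
    """Bray-Curtis similarity for two 0/1 lists of equal length."""
    shared = sum(x & y for x, y in zip(a, b))
    return 2 * shared / (sum(a) + sum(b))


# Score every pair of sites and sort from most to least similar.
scores = sorted(
    ((similarity(sites[p], sites[q]), p, q) for p, q in combinations(sites, 2)),
    reverse=True,
)

best_score, best_a, best_b = scores[0]
print(best_a, best_b, best_score)  # Site_3 Site_5 0.75
```

Sites 3 and 5 share three items (Items 2, 7, and 8) out of the five and three items they hold, giving 2 × 3 / (5 + 3) = 0.75, the highest score of any pair.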

Analysis Software

Use the PAST program. It is a free download (Windows PC only) and runs without the need for installation.

Start PAST and enable both of the Edit checkboxes (Edit Mode and Edit Labels).

Enter your data directly into the PAST spreadsheet or by copying and pasting from another source (e.g., an Excel worksheet). You can also prepare a space- or tab-delimited text file, as in the example under "Typical Data." When you have completed your data entry, uncheck the Edit checkboxes.

You need to Transpose your data (rotate 90 degrees) so that the Sites, which are the things being compared, end up in rows. Click on the Edit menu and choose the Transpose item.
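If you prepare your data outside PAST, the same rotation can be done there. This is a minimal Python sketch using the example table; it only illustrates what transposing means and does not reproduce anything inside PAST.

```python
# Example table as entered: one row per Item, one column per Site.
items_by_sites = [
    [1, 1, 0, 1, 0],  # Item_1
    [1, 1, 1, 0, 1],  # Item_2
    [1, 1, 0, 0, 0],  # Item_3
    [1, 0, 1, 1, 0],  # Item_4
    [0, 1, 1, 1, 0],  # Item_5
    [0, 1, 0, 0, 0],  # Item_6
    [1, 1, 1, 1, 1],  # Item_7
    [0, 0, 1, 0, 1],  # Item_8
]

# zip(*rows) regroups the values column by column, so each new row is a Site.
sites_by_items = [list(column) for column in zip(*items_by_sites)]

print(sites_by_items[0])  # Site_1 across Item_1..Item_8: [1, 1, 1, 1, 0, 0, 1, 0]
```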

After transposition, the area with your data values will already be selected. If you somehow deselect this area, you need to click-and-drag over the data area (not labels) to select it again before running an analysis.

Run the analysis by clicking on the Multivar menu and choosing the Cluster Analysis item. The analysis will run immediately and provide you with a dendrogram. However, you must choose the Bray-Curtis radio button to get the correct display (with binary data such as yours, Bray-Curtis is equivalent to the Dice, or Sørensen, coefficient).
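To see how a dendrogram is built from the similarities, the Python sketch below joins the example sites step by step. It assumes average-linkage (UPGMA) clustering, which is an assumption made for this illustration rather than something stated above, and the function names are hypothetical. It is not PAST's own code.

```python
from itertools import combinations

# Example table after transposing: one 0/1 list per site, Item_1..Item_8.
sites = {
    "Site_1": [1, 1, 1, 1, 0, 0, 1, 0],
    "Site_2": [1, 1, 1, 0, 1, 1, 1, 0],
    "Site_3": [0, 1, 0, 1, 1, 0, 1, 1],
    "Site_4": [1, 0, 0, 1, 1, 0, 1, 0],
    "Site_5": [0, 1, 0, 0, 0, 0, 1, 1],
}


def similarity(a, b):
    """Bray-Curtis similarity for two 0/1 lists: 2 * shared / (total_a + total_b)."""
    shared = sum(x & y for x, y in zip(a, b))
    return 2 * shared / (sum(a) + sum(b))


def upgma(profiles):
    """Return the merges, in order, as (cluster_a, cluster_b, similarity)."""
    sim = {
        frozenset((p, q)): similarity(profiles[p], profiles[q])
        for p, q in combinations(profiles, 2)
    }
    clusters = [(name,) for name in profiles]
    merges = []
    while len(clusters) > 1:

        def average(pair):
            # Mean similarity over every site in one cluster vs. every site in the other.
            a, b = pair
            total = sum(sim[frozenset((x, y))] for x in a for y in b)
            return total / (len(a) * len(b))

        # Join the two clusters with the highest average similarity.
        a, b = max(combinations(clusters, 2), key=average)
        merges.append((a, b, round(average((a, b)), 3)))
        clusters = [c for c in clusters if c not in (a, b)] + [a + b]
    return merges


merges = upgma(sites)
for a, b, level in merges:
    print(" + ".join(a), "joins", " + ".join(b), "at similarity", level)
```

Each printed line corresponds to one joint in the dendrogram, and the similarity value is the height at which the two branches meet.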