If the CEL data file names are not informative, we can specify alternative names for them in a “Sample information file”. This is a tab-delimited text file; if you edit it in Excel, make sure to save it in text format by “File/Save As/Save as type: Text (Tab delimited)”. The first (header) line is required. The first two columns are also required: the array file names and the corresponding sample names. Array file names are given without the directory name and without the .CEL or .DCP extension; you can copy the “Array” column from the “Array summary file” generated by “Open group” into the “Array name” column here. Each sample name should be unique and should also differ from every array name; a sample name may be left blank, in which case the array name is used as the sample name. The remaining columns are optional and describe sample properties using discrete words or numbers. Here is an example file:
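The layout and naming rules above can be sketched in Python. This is only an illustration under stated assumptions: the array names, sample names, the “Sample name” header text, and the helper `check_sample_info` are hypothetical and are not part of the software.

```python
import csv
import io

# Hypothetical file content: array names, sample names and property values are made up.
EXAMPLE = (
    "Array name\tSample name\tDay\tPair\tTreatment\n"
    "array01\t14c1\t14\t1\tC\n"
    "array02\t14t1\t14\t1\tT\n"
    "array03\t\t21\t2\tC\n"  # blank sample name: the array name is used instead
)


def check_sample_info(text):
    """Return {array name: sample name}; raise ValueError if a naming rule is broken."""
    rows = [r for r in csv.reader(io.StringIO(text), delimiter="\t") if r]
    if len(rows) < 2 or len(rows[0]) < 2:
        raise ValueError("need a header line and the two required columns")
    arrays = [row[0] for row in rows[1:]]
    names = {}
    seen = set()
    for row in rows[1:]:
        array = row[0]
        sample = row[1].strip() if len(row) > 1 else ""
        if sample == "":
            sample = array  # blank entry falls back to the array name
        elif sample in arrays:
            raise ValueError(f"sample name {sample!r} is also an array name")
        if sample in seen:
            raise ValueError(f"duplicate sample name {sample!r}")
        seen.add(sample)
        names[array] = sample
    return names


print(check_sample_info(EXAMPLE))
```

Running a check like this before loading the file catches duplicate or clashing names early; the rules it enforces are the ones stated in the paragraph above, nothing more.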
Using a “sample information file” is highly recommended. It is very useful in later functions such as Significant sample clusters and selecting samples by category, and it makes visual assessment of the sample clustering easier than textual sample names alone. For example, if the sample name “14c1” stands for “day 14, pair one, control sample”, we can create three sample information columns called “Day”, “Pair” and “Treatment”, and give this sample the values “14”, “1” and “C” in those columns.
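When sample names follow a regular pattern like “14c1”, the three columns can be filled in by a short script. The sketch below assumes a naming pattern of day digits, one treatment letter, then pair digits; that pattern and the helper `split_sample_name` are hypothetical and would need adjusting to your own names.

```python
import re

# Assumed naming pattern: <day digits><treatment letter><pair digits>, e.g. "14c1".
PATTERN = re.compile(r"^(\d+)([a-zA-Z])(\d+)$")


def split_sample_name(name):
    """Split a coded sample name into Day, Pair and Treatment column values."""
    match = PATTERN.match(name)
    if match is None:
        raise ValueError(f"sample name {name!r} does not follow the assumed pattern")
    day, treatment, pair = match.groups()
    return {"Day": day, "Pair": pair, "Treatment": treatment.upper()}


print(split_sample_name("14c1"))
# {'Day': '14', 'Pair': '1', 'Treatment': 'C'}
```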
You may also add a numerical column to the “sample information file”. The column header must contain “(numeric)”, for example “Time(numeric)”. Such a continuous variable is standardized and displayed at the top of the clustering picture.
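The text does not say how the standardization is done. As a rough sketch, assuming it means the usual z-score (subtract the mean, divide by the sample standard deviation), the handling of a “(numeric)” column might look like this; the helpers `numeric_columns` and `standardize` are hypothetical.

```python
import statistics


def numeric_columns(header):
    """Return the column headers marked as continuous by containing "(numeric)"."""
    return [name for name in header if "(numeric)" in name]


def standardize(values):
    """Z-score the values: mean 0, sample standard deviation 1 (an assumption)."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # sample standard deviation, needs >= 2 values
    if sd == 0:
        raise ValueError("all values are equal; cannot standardize")
    return [(v - mean) / sd for v in values]


header = ["Array name", "Sample name", "Day", "Time(numeric)"]
print(numeric_columns(header))
print(standardize([0.0, 7.0, 14.0]))
```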