Collapse microarray

This software replaces the probe sets of a microarray with the corresponding gene names. If several probe sets correspond to the same gene, the probe set with the highest signal is used. All gene synonyms are listed, so the number of gene names is generally higher than the number of probe sets (the collapsed table has more lines than the initial table). We provide the Affymetrix U133+2 chip and Human Gene 2.x ST databases, but the software should be able to use any other database.

Caution: if a gene corresponds to several probe sets, collapse chooses the most expressed probe set. To do so, collapse assumes that the sum of the expression values of a probe set is > 0. If this sum is < 0, collapse rejects the probe set.
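The selection rule described above can be sketched as follows. This is a minimal illustration in Python, not the software's own Perl code. It assumes that the "signal" of a probe set is the sum of its expression values across all columns. The function name collapse and the probe set and gene names are hypothetical. The case where the sum is exactly 0 is not specified above, so the sketch keeps such probe sets.

```python
def collapse(expression, chip):
    """Keep, for each gene name, the probe set with the highest signal.

    expression: dict mapping probe set name -> list of expression values
    chip:       dict mapping probe set name -> list of gene names (synonyms)
    Returns a dict mapping gene name -> expression values of the kept probe set.
    """
    best = {}  # gene name -> (signal, probe set name)
    for probe, values in expression.items():
        signal = sum(values)  # assumption: signal = sum over all columns
        if signal < 0:
            continue  # negative sums are rejected, as stated in the caution
        for gene in chip.get(probe, []):
            # Every synonym gets its own entry in the collapsed table.
            if gene not in best or signal > best[gene][0]:
                best[gene] = (signal, probe)
    return {gene: expression[probe] for gene, (_, probe) in best.items()}


if __name__ == "__main__":
    expression = {
        "probe_1": [1.0, 2.0],   # sum = 3
        "probe_2": [5.0, 6.0],   # sum = 11, wins for GENEA
        "probe_3": [-4.0, 1.0],  # sum = -3, rejected
    }
    chip = {
        "probe_1": ["GENEA", "GENEA_ALIAS"],
        "probe_2": ["GENEA"],
        "probe_3": ["GENEB"],
    }
    for gene, values in collapse(expression, chip).items():
        print(gene, values)
```

In this example GENEA takes the values of probe_2, GENEA_ALIAS keeps those of probe_1, and GENEB is absent because its only probe set has a negative sum.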

Manual:

1- if Perl (a free programming language) is not already installed, install it

2- unzip the software

3- copy your microarray files into the “data” directory. The files must be formatted as follows (an example is provided with the software):

- txt or csv files with TAB as separator

- first line: column names

- first column: probe set names

- next columns: gene expression values, with a dot as decimal separator
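To check that a file follows this layout before running the software, a small reader such as the one below can help. It is a sketch in Python, independent of the software itself. The function name read_expression and the sample data are hypothetical. A value written with a comma as decimal separator makes float() raise a ValueError, which points to the badly formatted line.

```python
import csv


def read_expression(handle):
    """Read a TAB-separated expression table.

    handle: an open text file, or any iterable of lines.
    Returns (column names of the value columns, dict probe set -> values).
    """
    reader = csv.reader(handle, delimiter="\t")
    header = next(reader)  # first line: column names
    samples = header[1:]   # first column holds the probe set names
    table = {}
    for row in reader:
        if not row:
            continue  # skip empty lines
        # float() accepts "7.5" but raises ValueError on "7,5"
        table[row[0]] = [float(value) for value in row[1:]]
    return samples, table


if __name__ == "__main__":
    lines = [
        "probe\tsample_1\tsample_2",
        "probe_1\t7.5\t8.25",
        "probe_2\t3.0\t2.5",
    ]
    samples, table = read_expression(lines)
    print(samples)
    print(table)
```

For a real file, pass an open file object instead of the list of lines, for example one opened with open(path, newline="").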

4- We provide the Affymetrix U133+2 chip database. If you want to use another chip, replace this file in the chip folder. The chip database file must be formatted as follows (U133+2 is provided as an example with the software):

- txt or csv files with TAB as separator

- no titles in the first line

- first column: probe set names

- second column: gene names. Gene synonyms are separated by ///.

- third column: gene definitions
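As an illustration of this layout, the sketch below splits one line of a chip database file into its three parts. It is written in Python and is not taken from the software. The function name parse_chip_line and the sample line are hypothetical. Spaces around /// are stripped, on the assumption that the separator may or may not be surrounded by spaces.

```python
def parse_chip_line(line):
    """Split one chip database line into (probe set, gene names, definition)."""
    fields = line.rstrip("\n").split("\t")
    probe = fields[0]
    # Second column: gene names, with synonyms separated by ///
    genes = [name.strip() for name in fields[1].split("///") if name.strip()]
    # Third column: gene definition (empty string if the column is missing)
    definition = fields[2] if len(fields) > 2 else ""
    return probe, genes, definition


if __name__ == "__main__":
    line = "probe_1\tGENEA /// GENEA_ALIAS\texample gene definition\n"
    print(parse_chip_line(line))
```

Because the file has no title line, every line, including the first one, can be parsed this way.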

5- run the software with the command: perl software_name, or double-click the .pl file

6- the processed files are in the “results” directory.