AutoCompare ZE command line

AutoCompare_ZE is a software tool that scores the enrichment of a gene set in a gene list.

This is the command-line version, which can be run remotely on a server without a graphical interface.

AutoCompare_ZE is based on Zelen's exact test. We demonstrated that this test solves the false discovery problem observed with Fisher's exact test.
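For reference, the classical Fisher's exact test for enrichment reduces to a hypergeometric tail probability. The sketch below is a minimal Python illustration of that classical test only. It is not the code used by AutoCompare_ZE and it does not implement Zelen's exact test. The function name, the parameter names and the choice of a one-sided test are assumptions made for illustration.

```python
from math import comb


def fisher_enrichment_pvalue(universe_size, set_size, list_size, overlap):
    """One-sided Fisher (hypergeometric) p-value for enrichment.

    universe_size: total number of genes considered (assumed background)
    set_size:      number of genes in the gene set (biological function)
    list_size:     number of genes in the user's gene list
    overlap:       number of genes shared by the set and the list

    Returns P(X >= overlap), where X is the overlap expected when the
    list is drawn at random from the universe.
    """
    max_overlap = min(set_size, list_size)
    total = comb(universe_size, list_size)
    # Sum the hypergeometric probabilities of every overlap at least as
    # large as the observed one. math.comb returns 0 when k > n, so
    # impossible configurations contribute nothing to the sum.
    tail = sum(
        comb(set_size, k) * comb(universe_size - set_size, list_size - k)
        for k in range(overlap, max_overlap + 1)
    )
    return tail / total
```

With a universe of 10 genes, a gene set of 3 genes and a list of 3 genes that contains the whole set, this function returns 1/120.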

 

Citation :

When using AutoCompare_ZE to process data for publication, please cite:

Curbing false discovery rates in interpretation of genome-wide expression profiles. Ycart B, Pont F, Fournié JJ. J Biomed Inform. 2014 Feb;47:58-61.

Example of a results table. FE: P-value from the classical Fisher's exact test; ZE: P-value from Zelen's exact test.

Manual :

AutoCompare ZE is multithreaded software for GNU/Linux and Unix.

1- Install the free Julia programming language.

2- Install the R programming language.

3- Install the snowfall R package.

4- Unzip the software.

5- The software was written for Linux and assumes that the command to start R is "R".

If this is not the case (for example, on Windows), edit line 85 of AutoCompareZE_CMDline_0.3.jl:

cmd = `R --vanilla --args $db $data $T $pv $cpu $ndb`

and replace R with the full path to R.exe.

Example of a full R path on Windows (note the doubled backslashes):

cmd = `C:\\Users\\CLARA\\AppData\\Local\\Programs\\R\\R-4.3.2\\bin\\x64\\R.exe --vanilla --args $db $data $T $pv $cpu $ndb`

6- Copy your lists of gene names into the data directory.

7- Copy your databases into the databases directory. A database is a directory of text files: each file contains a list of genes, the file name is the name of the biological function, and the directory name is the name of the database.

8- Open a terminal. In the AutoCompare-ZE directory, type: julia AutoCompareZE_CMDline_0.3.jl to start the software. Alternatively, you can start the software from the Julia REPL.

9- Choose your databases and the P-value threshold.

Caution : test database C1 on its own. The genes in C1 are often absent from the user's gene list; in that case the P-value cannot be computed and the software crashes.

10- The processed files are in the “results” directory.
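Steps 1, 2 and 5 assume that Julia and R can be started by name from the terminal. The following is a minimal Python sketch of a check that can be run before starting the software. It uses only the standard library; the helper name find_missing_commands is hypothetical and is not part of AutoCompare ZE.

```python
import shutil


def find_missing_commands(commands=("julia", "R")):
    """Return the commands that cannot be found on the PATH.

    The default names "julia" and "R" follow the steps above; pass
    other names if your installation uses different ones.
    """
    return [name for name in commands if shutil.which(name) is None]


if __name__ == "__main__":
    missing = find_missing_commands()
    if missing:
        print("Not found on PATH:", ", ".join(missing))
    else:
        print("julia and R were both found on PATH.")
```

If R is reported as missing, the full path to the R executable can be used as described in step 5.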

We provide databases covering more than 20,000 biological functions, pathways, etc. We also provide random samples and random databases for establishing score thresholds.
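To add a database of your own, the layout described in step 7 can be produced by hand or with a short script. The Python sketch below is one possible way to do it, under stated assumptions: one gene name per line, a ".txt" file extension, and the database, function and gene names shown in the usage example are all hypothetical. Check one of the provided databases for the exact file format before relying on it.

```python
from pathlib import Path


def write_database(databases_dir, database_name, gene_sets):
    """Write one text file per biological function.

    databases_dir: path to the "databases" directory of the software
    database_name: name of the database (becomes the directory name)
    gene_sets:     dict mapping a function name to a list of gene names

    Assumption: one gene per line and a ".txt" extension.
    """
    db_dir = Path(databases_dir) / database_name
    db_dir.mkdir(parents=True, exist_ok=True)
    for function_name, genes in gene_sets.items():
        # The file name carries the name of the biological function.
        target = db_dir / f"{function_name}.txt"
        target.write_text("\n".join(genes) + "\n")
    return db_dir


if __name__ == "__main__":
    # Hypothetical database with two hypothetical functions.
    write_database(
        "databases",
        "MyDatabase",
        {"Apoptosis": ["TP53", "BAX"], "CellCycle": ["CDK1", "CCNB1"]},
    )
```

Running the script from the software directory creates databases/MyDatabase with one file per function.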