AutoCompare ZE command line
AutoCompare_ZE is software that scores the enrichment of a gene set in a gene list.
This is the command-line version, which can be used remotely on a server without a graphical interface.
AutoCompare_ZE is based on Zelen’s exact test. We demonstrated that this test solves the false discovery problem observed with Fisher's exact test.
Citation:
When using AutoCompare_ZE to process data for publication, please cite:
Curbing false discovery rates in interpretation of genome-wide expression profiles. Ycart B, Pont F, Fournié JJ. J Biomed Inform. 2014 Feb;47:58-61.
Example of a results table. FE: P-value from the classical Fisher exact test; ZE: P-value from Zelen’s exact test.
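To make the FE column more concrete, here is a minimal Python sketch of the classical one-sided Fisher exact test for over-representation. It is not the code used by AutoCompare_ZE and it does not implement Zelen's exact test. The function name and its parameters (the size of the gene universe in particular) are assumptions made for illustration.

```python
from math import comb


def fisher_enrichment_pvalue(universe_size, set_size, list_size, overlap):
    """One-sided Fisher exact P-value for over-representation.

    universe_size: total number of genes considered (assumed known)
    set_size:      number of genes in the gene set (biological function)
    list_size:     number of genes in the user's gene list
    overlap:       number of genes common to the gene set and the gene list
    """
    # Probability of seeing at least `overlap` common genes by chance,
    # i.e. the upper tail of the hypergeometric distribution.
    max_overlap = min(set_size, list_size)
    total = comb(universe_size, list_size)
    tail = sum(
        comb(set_size, i) * comb(universe_size - set_size, list_size - i)
        for i in range(overlap, max_overlap + 1)
    )
    return tail / total


# Toy numbers: 10 genes in total, a gene set of 3, a list of 4, 2 in common.
print(fisher_enrichment_pvalue(10, 3, 4, 2))  # 70/210, about 0.333
```

The exact binomial coefficients keep the sketch dependency-free; for genome-sized inputs a statistics library would be the more usual choice.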
Manual:
AutoCompare ZE is multithreaded software for GNU/Linux and Unix.
1- Install the free Julia programming language.
2- Install the R programming language.
3- Install the snowfall R package.
4- Unzip the software.
5- The software was written for Linux and assumes that the command to start R is "R".
If this is not the case (for example, on Windows), edit line 85 of AutoCompareZE_CMDline_0.3.jl:
cmd = `R --vanilla --args $db $data $T $pv $cpu $ndb`
and replace R with the full path to R.exe; this should work.
Example of a full R path on Windows (note the doubled backslashes):
cmd = `C:\\Users\\CLARA\\AppData\\Local\\Programs\\R\\R-4.3.2\\bin\\x64\\R.exe --vanilla --args $db $data $T $pv $cpu $ndb`
6- Copy your lists of gene names into the data directory.
7- Copy your databases into the databases directory. A database is a directory of text files: each file contains a list of genes, the file name is the name of the biological function, and the directory name is the name of the database.
8- Open a terminal. In the AutoCompare-ZE directory, type: julia AutoCompareZE_CMDline_0.3.jl to start the software. Alternatively, you can start the software from within the Julia REPL.
9- Choose your databases and the P-value threshold.
Caution: test database C1 on its own. The genes in C1 are often absent from the user's gene list; in that case the P-value cannot be computed and the software crashes.
10- Processed files are written to the “results” directory.
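As a pre-check related to the caution on database C1, the following Python sketch lists the gene sets that share no gene with a gene list. It is a standalone helper, not part of AutoCompare_ZE. The function name, the toy gene sets, and the use of exact, case-sensitive name matching are assumptions; the software may match gene names differently.

```python
def functions_without_overlap(user_genes, database):
    """Return the names of gene sets that share no gene with user_genes.

    user_genes: iterable of gene names from the user's list
    database:   dict mapping a function name to an iterable of gene names
    """
    user_genes = set(user_genes)
    return sorted(
        name for name, genes in database.items() if user_genes.isdisjoint(genes)
    )


# Hypothetical toy data, for illustration only.
toy_database = {
    "apoptosis": {"TP53", "BAX", "CASP3"},
    "region_A": {"GENE_X", "GENE_Y"},
}
print(functions_without_overlap(["TP53", "MYC"], toy_database))  # ['region_A']
```

Gene sets reported by such a check are candidates for the situation described in the caution above.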
Here we provide databases covering more than 20,000 biological functions, pathways, etc. We also provide random samples and random databases to establish score thresholds.
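The database layout described in step 7 (one directory per database, one text file per biological function) can be inspected with a short Python sketch such as the one below. It is a standalone helper, not part of AutoCompare_ZE. It assumes one gene name per line and takes the function name from the file name without its extension; both points are assumptions, since the exact file format is not specified here.

```python
from pathlib import Path


def load_database(database_dir):
    """Read a database directory into (database_name, {function: set of genes})."""
    database_dir = Path(database_dir)
    gene_sets = {}
    for path in sorted(database_dir.iterdir()):
        if not path.is_file():
            continue  # skip sub-directories
        # Assumption: one gene name per line; blank lines are ignored.
        genes = {
            line.strip()
            for line in path.read_text().splitlines()
            if line.strip()
        }
        # Assumption: the function name is the file name without extension.
        gene_sets[path.stem] = genes
    return database_dir.name, gene_sets
```

For example, calling load_database on a directory of your own would return its name together with a dictionary of gene sets, which makes it easy to check file names and gene counts before starting a run.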