The GPI-AP Database lists bioinformatically predicted GPI-anchored proteins (GPI-AP) as well as proteins experimentally confirmed as GPI-AP.
For each protein listed, Uniprot, ENSEMBL, and ENSEMBLP IDs are shown, as well as the PMIDs of referring articles. The FP rate and the probable omega sites are also included.
The database was obtained by merging the predicted GPI-anchored proteins from the algorithms GPI-SOM and Pred-GPI with all the reviewed GPI-anchored proteins listed on Uniprot.
This process is described in detail in the following section.
Experimentally confirmed GPI-AP were obtained by extracting the list of all human reviewed proteins annotated as GPI-anchored on Uniprot, yielding 139 entries.
GPI-SOM predicted GPI-AP were obtained by extracting the list of all human reviewed proteins on Uniprot (20 386 entries) and using it as an input for the GPI-SOM algorithm. This process resulted in 526 entries.
Pred-GPI predicted GPI-AP were obtained by downloading the list of all human GPI-AP already predicted by the algorithm, which contains 373 unique ENSEMBL IDs. The ENSEMBL IDs were then converted to Uniprot IDs, resulting in 308 entries.
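The ID conversion step can be sketched as follows. This is a minimal illustration, not the procedure actually used to build the database: the function name `convert_ids`, the mapping table, and all IDs are hypothetical placeholders. It shows two general reasons a conversion can return fewer entries than it received (IDs with no mapping, and several IDs mapping to the same accession); the text above does not state which of these explains the drop from 373 to 308.

```python
def convert_ids(ensembl_ids, mapping):
    """Convert ENSEMBL IDs to Uniprot accessions using a lookup table.

    Returns (converted, unmapped): the unique accessions in first-seen
    order, and the input IDs that had no entry in the mapping.
    """
    converted = []
    seen = set()
    unmapped = []
    for ens_id in ensembl_ids:
        accession = mapping.get(ens_id)
        if accession is None:
            unmapped.append(ens_id)      # no Uniprot entry for this ID
        elif accession not in seen:
            seen.add(accession)          # keep each accession only once
            converted.append(accession)
    return converted, unmapped


# Placeholder IDs for illustration only (not real database entries).
toy_mapping = {"ENS_A": "ACC001", "ENS_B": "ACC001", "ENS_C": "ACC002"}
converted, unmapped = convert_ids(["ENS_A", "ENS_B", "ENS_C", "ENS_D"], toy_mapping)
print(len(converted), "accessions,", len(unmapped), "unmapped")
```

In this toy run, four input IDs collapse to two accessions, with one ID left unmapped.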
Uniprot IDs from the three sources were merged into one file, yielding 644 unique entries. Uniprot was then used to retrieve the PMIDs of referring articles for each entry.
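The merge described above amounts to taking the union of the three ID lists. The sketch below is one way to do that while also recording which source contributed each entry. The function name `merge_sources`, the source labels, and the accessions are hypothetical placeholders, and keeping provenance is an added convenience rather than something the database is stated to do.

```python
def merge_sources(*sources):
    """Merge (name, id_list) pairs into a dict: accession -> set of source names.

    The dict keys are the unique accessions across all sources.
    """
    provenance = {}
    for name, ids in sources:
        for accession in ids:
            key = accession.strip().upper()   # normalise before comparing
            provenance.setdefault(key, set()).add(name)
    return provenance


# Placeholder accessions for illustration only (not real database entries).
reviewed = ["ACC001", "ACC002"]
gpi_som = ["ACC002", "ACC003"]
pred_gpi = ["ACC003", "ACC001", "ACC004"]

merged = merge_sources(
    ("uniprot_reviewed", reviewed),
    ("gpi_som", gpi_som),
    ("pred_gpi", pred_gpi),
)
unique_ids = sorted(merged)
print(len(unique_ids), "unique entries")
```

Because overlapping entries are counted once, the merged total is smaller than the sum of the three lists, which is consistent with 139, 526, and 308 entries combining into 644 unique ones.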
The FP rate (false positive rate) indicates how likely a protein is to be GPI-anchored: the lower the FP rate, the more confident the prediction. FP rates were obtained by using the whole dataset (644 entries) as input for the Pred-GPI algorithm.
FP rates are interpreted in this way:
The omega site is the position in the protein sequence at which the GPI anchor is attached. The omega site of each entry was obtained along with its FP rate.