Supplementary Table 1 Results obtained by using PRED-TAT to classify the protein sequences simultaneously into three categories (Tat signal peptides, Sec signal peptides and non-signal peptides, i.e. transmembrane and cytoplasmic proteins), in the form of a 3x3 confusion matrix. A: Results obtained by 30-fold cross-validation on the training set. B: Results obtained on the independent test set. C: Results obtained on the training set of TatP. PRED-TAT uses the Viterbi algorithm, so no fine-tuning is required.
Supplementary Table 2 Results obtained by using PRED-TATHMMER to classify the protein sequences simultaneously into three categories (Tat signal peptides, Sec signal peptides and non-signal peptides, i.e. transmembrane and cytoplasmic proteins), in the form of a 3x3 confusion matrix. A: Results obtained by 30-fold cross-validation on the training set. B: Results obtained on the independent test set. C: Results obtained on the training set of TatP. Since we have two independent profile HMMs, the final decision here is obtained by choosing the model with the highest score (i.e. if both scores of a protein are larger than zero, the highest-scoring model is chosen).
Supplementary Table 3 Results obtained by using PRED-TATHMMER to classify the protein sequences simultaneously into three categories (Tat signal peptides, Sec signal peptides and non-signal peptides, i.e. transmembrane and cytoplasmic proteins), in the form of a 3x3 confusion matrix. A: Results obtained by 30-fold cross-validation on the training set. B: Results obtained on the independent test set. C: Results obtained on the training set of TatP. Since we have two independent profile HMMs, the final decision here is obtained by giving priority to the HMM for Tat substrates (i.e. if a protein's score from the Tat HMM is larger than zero, the protein is classified as Tat, irrespective of the score from the other HMM).
Supplementary Table 4 Results obtained by using TatP and SignalP3-NN to classify the protein sequences simultaneously into three categories (Tat signal peptides, Sec signal peptides and non-signal peptides, i.e. transmembrane and cytoplasmic proteins), in the form of a 3x3 confusion matrix. A: Results obtained on the training set. B: Results obtained on the independent test set. C: Results obtained on the training set of TatP (not cross-validated). Since we have two independent predictors, the final decision here is obtained by giving priority to TatP (i.e. if a protein's TatP score is larger than the cutoff, the protein is classified as Tat, irrespective of SignalP’s output).
Supplementary Table 5 Results obtained by using TatP and SignalP3-NN to classify the protein sequences simultaneously into three categories (Tat signal peptides, Sec signal peptides and non-signal peptides, i.e. transmembrane and cytoplasmic proteins), in the form of a 3x3 confusion matrix. A: Results obtained on the training set. B: Results obtained on the independent test set. C: Results obtained on the training set of TatP (not cross-validated). Since we have two independent predictors, the final decision here is obtained by choosing the predictor with the highest score (i.e. if both scores of a protein are larger than their respective cutoffs, the highest-scoring predictor is chosen).
Supplementary Table 6 Detailed results for the 44 proteins that contain the RR motif but have been experimentally verified not to be Tat substrates.
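The two ways of combining a pair of independent predictors described in the captions above (choosing the highest-scoring model, or giving priority to the Tat predictor) can be illustrated with a minimal Python sketch. This is not the authors' code. The function names, the label strings, the default cutoffs of zero, the tie-breaking in favour of Tat, and the convention that rows of the confusion matrix hold the true class and columns the predicted class are all assumptions made for illustration.

```python
from typing import List, Sequence

# Hypothetical class labels; the order fixes the rows/columns of the matrix.
LABELS = ("Tat", "Sec", "Non-SP")


def decide_highest_score(tat_score: float, sec_score: float,
                         tat_cutoff: float = 0.0,
                         sec_cutoff: float = 0.0) -> str:
    """Highest-score rule: if both scores pass their cutoffs, the larger wins."""
    tat_hit = tat_score > tat_cutoff
    sec_hit = sec_score > sec_cutoff
    if tat_hit and sec_hit:
        # Tie-breaking in favour of Tat is an assumption, not from the source.
        return "Tat" if tat_score >= sec_score else "Sec"
    if tat_hit:
        return "Tat"
    if sec_hit:
        return "Sec"
    return "Non-SP"


def decide_tat_priority(tat_score: float, sec_score: float,
                        tat_cutoff: float = 0.0,
                        sec_cutoff: float = 0.0) -> str:
    """Tat-priority rule: a Tat hit wins regardless of the other score."""
    if tat_score > tat_cutoff:
        return "Tat"
    if sec_score > sec_cutoff:
        return "Sec"
    return "Non-SP"


def confusion_matrix(true_labels: Sequence[str],
                     predicted_labels: Sequence[str]) -> List[List[int]]:
    """3x3 counts; rows are true classes, columns are predicted classes."""
    index = {label: i for i, label in enumerate(LABELS)}
    matrix = [[0] * len(LABELS) for _ in LABELS]
    for true, predicted in zip(true_labels, predicted_labels):
        matrix[index[true]][index[predicted]] += 1
    return matrix
```

The two rules differ only when both scores pass their cutoffs and the Sec score is the larger one: the highest-score rule then returns Sec, while the Tat-priority rule returns Tat.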