1. Download the pipeline

The source code and example data can be downloaded from https://github.com/UND-Wanglab/SMAP.

You can download it using the clone command:

git clone https://github.com/UND-Wanglab/SMAP.git

or download the archive directly from the following link and unzip it:

https://github.com/UND-Wanglab/SMAP/archive/refs/heads/main.zip

After downloading the source code, you can place it in any working directory (e.g. /home/User/Software/SMAP); the program is then ready to run.

2. Run SMAP program


2.1 Command Line Arguments

Use the help option to print usage information:

perl SMAP.pl -h

Command line:

perl SMAP.pl -vf variant_peptide_table.txt[file] -g genotype.vcf[file] -o result.txt[file]

--variant_peptide,-vf (A file containing quantitative values of variant peptides; required)

--genotype, -g (A genotype file used for sample verification; required)

--output, -o (An output filename; required)

--plex, -p (Multiplex number of the isobaric labeling approach)

--fold_change, -fc (Signal-to-noise ratio; optional; default is 3)

--noise_level, -nl (The upper threshold of a noise level)

--version, -h (Print version)

--help, -h (Print help)

--licence, -l (Print licence)


2.2 Demo and Program Testing

If you downloaded the standalone program to /home/User/Software/SMAP, you can test it from that directory using the following command:

perl src/SMAP.pl -g data/genotype_table.vcf -vf data/variant_peptide_table.txt -o output.txt -fc 5

The program takes two inputs:

2.2.1 Variant peptide quantification table. The table contains the peptide ID, gene/protein name, peptide spectrum match (PSM), SNP ID, and the quantification data for each sample.

2.2.2 Genotype file in VCF format.
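
Before running the demo, it can help to confirm that the files named in the test command are where it expects them. The following is a minimal sketch, not part of SMAP: the function name smap_check_demo is hypothetical, and the three relative paths are taken from the demo command above.

```shell
# Hypothetical helper (not shipped with SMAP): check that the script and the
# two demo input files referenced by the test command exist under a directory.
smap_check_demo() {
    smap_dir="${1:-.}"      # directory that holds src/ and data/; defaults to "."
    smap_missing=0
    for smap_file in src/SMAP.pl data/genotype_table.vcf data/variant_peptide_table.txt; do
        if [ ! -f "$smap_dir/$smap_file" ]; then
            echo "missing: $smap_dir/$smap_file" >&2
            smap_missing=1
        fi
    done
    return "$smap_missing"  # 0 if every file was found, 1 otherwise
}
```

For example, `smap_check_demo /home/User/Software/SMAP` names each missing file on standard error and returns a non-zero status if the layout is incomplete.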

3. Input data

3.1 Variant peptide table

The variant peptide table uses the following format:

Column 1: Peptide ID

Column 2: Gene/Protein

Column 3: Peptide Spectrum Match (PSM)

Column 4: SNP ID (must match the SNP ID in the genotype file)

Column 5-N: Sample Peptide Quantification (One column per sample)
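
The column layout above implies that every row has at least five fields: four annotation columns plus one or more sample columns. A rough check along those lines is sketched below. It assumes the table is tab-delimited, which this document does not state explicitly, and the function name smap_check_columns is hypothetical.

```shell
# Hypothetical helper: report rows that have fewer than 5 tab-separated
# columns (4 annotation columns plus at least one sample column).
# Assumption: the table is tab-delimited.
smap_check_columns() {
    awk -F'\t' '
        NF < 5 { printf "line %d: %d column(s)\n", NR, NF; bad = 1 }
        END    { exit bad + 0 }   # exit status 1 if any short row was seen
    ' "$1"
}
```

Running `smap_check_columns data/variant_peptide_table.txt` prints nothing and returns 0 when every row, including any header row, has five or more columns.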


[Figure 2_1_variant_peptide: An example of the variant peptide table]

3.2 Genotype file

SMAP also takes a genotype file in VCF format.

[Figure 2_2_Genotype_file: An example of the genotype data]
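
Because the SNP IDs in column 4 of the variant peptide table must match those in the genotype file, a quick cross-check before running SMAP may save a failed run. The sketch below is not part of SMAP. It relies on the VCF convention that the third column of each data line is the ID field, and it assumes the peptide table is tab-delimited with a single header row. The function name smap_missing_snps is hypothetical.

```shell
# Hypothetical helper: print SNP IDs from column 4 of the peptide table that
# do not occur in the ID column (column 3) of the VCF file.
# Usage: smap_missing_snps TABLE VCF
# Assumptions: tab-delimited table with one header row; non-empty VCF file.
smap_missing_snps() {
    awk -F'\t' '
        NR == FNR { if ($0 !~ /^#/) seen[$3] = 1; next }  # first file: VCF data lines
        FNR > 1 && !($4 in seen) { print $4 }             # second file: table, header skipped
    ' "$2" "$1" | sort -u
}
```

With the demo data, the call would be `smap_missing_snps data/variant_peptide_table.txt data/genotype_table.vcf`; empty output means every SNP ID in the table was found in the VCF file.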

3.3 Output files

SMAP generates a final report and several intermediate results.

The final report contains four columns: Sample ID, Inferred ID, CScore, and DeltaCScore.
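
One way to read the report is to list the samples whose Inferred ID differs from the labeled Sample ID. The sketch below is only an illustration: it assumes the report is tab-delimited, starts with one header row, and has the Sample ID and Inferred ID in the first two columns, none of which is confirmed here. The function name smap_mismatches is hypothetical.

```shell
# Hypothetical helper: list rows of the final report whose Inferred ID
# (column 2) differs from the Sample ID (column 1).
# Assumptions: tab-delimited report with one header row.
smap_mismatches() {
    # Appending "" forces a string comparison, so IDs such as 01 and 1 differ.
    awk -F'\t' 'NR > 1 && ($1 "") != ($2 "") { print $1 " -> " $2 }' "$1"
}
```

Applied to the demo output, `smap_mismatches output.txt` would print one line per sample whose inferred identity disagrees with its label, and nothing if all samples agree.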


[Figure 2_3_Final_report: An example of the final report]

[Figure 2_4_Sample_specific_genotype: An example of sample-specific genotype]

[Figure 2_5_Inferred_genotype: An example of inferred genotypes]

In addition, the program generates three intermediate files, which include the sample-specific genotypes and the inferred genotypes.