1. Download the pipeline
The source code and example data can be downloaded from https://github.com/UND-Wanglab/SMAP.
You can download it using the clone command:
git clone https://github.com/UND-Wanglab/SMAP.git
or download the archive directly from the following link and unzip it:
https://github.com/UND-Wanglab/SMAP/archive/refs/heads/main.zip
After downloading the source code, you can place it in any working directory (e.g. /home/User/Software/SMAP); the program is then ready to run.
2. Run the SMAP program
2.1 Command Line Arguments
Use the help option to print usage information:
perl SMAP.pl -h
Command line:
perl SMAP.pl -vf variant_peptide_table.txt[file] -g genotype.vcf[file] -o result.txt[file]
--variant_peptide, -vf (A file containing quantitative values of variant peptides; required)
--genotype, -g (A genotype file used for sample verification; required)
--output, -o (An output filename; required)
--plex, -p (Multiplex number of the isobaric labeling approach)
--fold_change, -fc (Signal-to-noise ratio; optional; default is 3)
--noise_level, -nl (The upper threshold of a noise level)
--version, -h (Print version)
--help, -h (Print help)
--licence, -l (Print licence)
2.2 Demo and Program Testing
If you downloaded the program to /home/User/Software/SMAP, you can test it by running the following command from that directory:
perl src/SMAP.pl -g data/genotype_table.vcf -vf data/variant_peptide_table.txt -o output.txt -fc 5
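The demo command above can also be assembled from a script, which helps when SMAP is run on several datasets. The following is a minimal sketch, not part of SMAP itself: the helper names build_smap_command and run_smap are hypothetical, and only the flags shown in the demo command (-g, -vf, -o, -fc) are used.

```python
import subprocess


def build_smap_command(genotype, peptides, output, fold_change=None,
                       script="src/SMAP.pl"):
    """Assemble the SMAP command as an argument list (does not run it).

    Hypothetical helper: it only mirrors the flags used in the demo command.
    """
    cmd = ["perl", script, "-g", genotype, "-vf", peptides, "-o", output]
    if fold_change is not None:
        cmd += ["-fc", str(fold_change)]
    return cmd


def run_smap(cmd, smap_dir):
    """Run the command from the SMAP folder so the relative paths resolve."""
    return subprocess.run(cmd, cwd=smap_dir, check=True)


# Rebuild the demo command from section 2.2.
demo_cmd = build_smap_command(
    "data/genotype_table.vcf",
    "data/variant_peptide_table.txt",
    "output.txt",
    fold_change=5,
)
print(" ".join(demo_cmd))
```

To execute the assembled command, call run_smap(demo_cmd, "/home/User/Software/SMAP"); this requires perl and the downloaded SMAP folder to be present.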
The program takes two inputs:
2.2.1 Variant peptide quantification table. The table contains the peptide ID, gene/protein name, peptide spectrum match (PSM), SNP ID, and the quantification data for each sample.
2.2.2 Genotype file in VCF format.
3. Input data
3.1 Variant peptide table
The variant peptide table uses the following format:
Column 1: Peptide ID
Column 2: Gene/Protein
Column 3: Peptide Spectrum Match (PSM)
Column 4: SNP ID (must match the SNP ID in the genotype file)
Column 5-N: Sample Peptide Quantification (One column per sample)
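The column layout above can be checked with a short script before running SMAP. The sketch below assumes the table is tab-delimited and starts with one header row whose columns 5 to N hold the sample names; neither detail is stated here, so adjust the delimiter and header handling to your file. The function name and the example rows are made up for illustration.

```python
import csv
import io


def read_variant_peptide_table(handle, delimiter="\t"):
    """Parse a variant peptide table laid out as described above.

    Assumptions (not confirmed by the SMAP documentation): tab-delimited
    text with a single header row naming the samples in columns 5..N.
    """
    reader = csv.reader(handle, delimiter=delimiter)
    header = next(reader)
    sample_names = header[4:]
    rows = []
    for fields in reader:
        if not fields:  # skip blank lines
            continue
        rows.append({
            "peptide_id": fields[0],
            "gene": fields[1],
            "psm": fields[2],
            "snp_id": fields[3],
            # One quantification value per sample column.
            "quant": dict(zip(sample_names, (float(v) for v in fields[4:]))),
        })
    return sample_names, rows


# Made-up two-sample table, for illustration only.
example = io.StringIO(
    "Peptide\tGene\tPSM\tSNP\tSample1\tSample2\n"
    "pep_1\tGENE_A\t3\trs0001\t1200.5\t35.0\n"
    "pep_2\tGENE_B\t1\trs0002\t18.0\t940.2\n"
)
samples, rows = read_variant_peptide_table(example)
print(samples)
print(rows[0]["snp_id"], rows[0]["quant"])
```

For a real file, pass an open file handle instead of the in-memory example, e.g. open("data/variant_peptide_table.txt", newline="").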
An example of the variant peptide table
3.2 Genotype file
SMAP also takes a genotype file in VCF format.
An example of the genotype data
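Because the SNP IDs in the variant peptide table must match those in the genotype file, it can be worth checking for unmatched IDs before a run. The sketch below is a hypothetical pre-flight check, not something SMAP provides: it relies only on the standard VCF layout, where the ID is the third tab-separated field, "." marks a missing ID, and multiple IDs are separated by semicolons. The function names and example records are invented for illustration.

```python
def read_vcf_ids(lines):
    """Collect the IDs from the third column of the VCF data lines."""
    ids = set()
    for line in lines:
        # Skip blank lines, "##" meta lines, and the "#CHROM" header line.
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 3 or fields[2] == ".":
            continue
        ids.update(fields[2].split(";"))
    return ids


def missing_snp_ids(peptide_snp_ids, vcf_ids):
    """Return the peptide-table SNP IDs that do not occur in the VCF."""
    return sorted(set(peptide_snp_ids) - vcf_ids)


# Made-up VCF content, for illustration only.
vcf_lines = [
    "##fileformat=VCFv4.2\n",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tSample1\tSample2\n",
    "1\t1000\trs0001\tA\tG\t.\tPASS\t.\tGT\t0/1\t0/0\n",
    "1\t2000\t.\tC\tT\t.\tPASS\t.\tGT\t0/0\t1/1\n",
]
vcf_ids = read_vcf_ids(vcf_lines)
unmatched = missing_snp_ids(["rs0001", "rs0002"], vcf_ids)
print(unmatched)
```

An empty list means every SNP ID in the peptide table was found in the genotype file; any ID that is listed has no genotype record to match against.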
3.3 Output files
SMAP generates a final report and several intermediate results.
The final report contains four columns: Sample ID, Inferred ID, CScore, and DeltaCScore.
An example of the final report
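The final report can be scanned for samples whose inferred ID differs from the recorded sample ID. The sketch below assumes the report is tab-delimited with one header row and the four columns in the order Sample ID, Inferred ID, CScore, DeltaCScore; the file layout beyond the column names is an assumption, and the example rows and scores are invented. Scores are kept as text because their format is not described here.

```python
import csv
import io


def read_final_report(handle, delimiter="\t"):
    """Parse a four-column report (assumed tab-delimited with a header row)."""
    reader = csv.reader(handle, delimiter=delimiter)
    next(reader)  # skip the header row
    report = []
    for fields in reader:
        if len(fields) < 4:  # skip blank or incomplete lines
            continue
        sample_id, inferred_id, cscore, delta_cscore = fields[:4]
        report.append({
            "sample_id": sample_id,
            "inferred_id": inferred_id,
            "cscore": cscore,
            "delta_cscore": delta_cscore,
        })
    return report


def find_mismatches(report):
    """Return the rows where the inferred ID differs from the sample ID."""
    return [row for row in report if row["sample_id"] != row["inferred_id"]]


# Made-up report, for illustration only.
example_report = io.StringIO(
    "Sample ID\tInferred ID\tCScore\tDeltaCScore\n"
    "S1\tS1\t0.95\t0.40\n"
    "S2\tS3\t0.90\t0.35\n"
    "S3\tS2\t0.88\t0.30\n"
)
report = read_final_report(example_report)
for row in find_mismatches(report):
    print(row["sample_id"], "->", row["inferred_id"])
```

Rows reported by find_mismatches are only candidates for a closer look; how to interpret CScore and DeltaCScore for those rows is outside the scope of this sketch.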
In addition, the program generates three intermediate files, including sample-specific genotypes and inferred genotypes.
An example of sample-specific genotypes
An example of inferred genotypes