VCF File Annotations
Annotations in high-confidence somatic SNV/INDEL VCF files
In the FILTER field:
Only the reference calls are labeled PASS. This should not be confused with the AllPASS annotation, which denotes that SomaticSeq classified the call as a somatic mutation in all 63 sample sets.
In the INFO field:
calledSamples
List of samples deemed PASS by SomaticSeq. For samples with corresponding SomaticSeq classifiers (i.e., 54 out of 63 data sets), these samples have SomaticSeq Score ≥ 0.7.
For the other 9 data sets (i.e., LL, EA, and NC), these samples have ≥ 50% somatic caller agreement.
rejectedSamples
List of samples where the variant has a SomaticSeq score < 0.1 (scores between 0.1 and 0.7 are rare and are considered ambiguous).
For the other 9 data sets, these samples have < 50% caller agreement.
noCallSamples
List of samples where the variant is not detected by any caller. For low-VAF variants, some samples are expected to fall into this list.
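The score thresholds that sort a sample into calledSamples, rejectedSamples, or noCallSamples can be sketched as below. This is only an illustration for the 54 data sets that have classifiers; the function name, the use of None to mean "no caller detected the variant", and the "ambiguous" label are assumptions made for the example, not part of the VCF files.

```python
from typing import Optional

PASS_SCORE = 0.7    # score at or above this value counts as PASS
REJECT_SCORE = 0.1  # score below this value counts as REJECT


def classify_sample(score: Optional[float]) -> str:
    """Return the INFO list a sample would fall into for one variant.

    `score` is the SomaticSeq score, or None when no caller detected
    the variant in that sample (an assumption of this sketch).
    """
    if score is None:
        return "noCallSamples"
    if score >= PASS_SCORE:
        return "calledSamples"
    if score < REJECT_SCORE:
        return "rejectedSamples"
    # Scores from 0.1 up to (but not including) 0.7 are considered ambiguous.
    return "ambiguous"


for s in (0.95, 0.7, 0.3, 0.05, None):
    print(s, classify_sample(s))
```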
bwa_PASS, bowtie_PASS, and novo_PASS
For all pairs of BAM files aligned with bwa/bowtie/novoalign, this records the number of samples per group (IL, NV, FD, NS, and Others, in that order) where the variant is classified as SomaticSeq PASS (score ≥ 0.7).
E.g., bwa_PASS=2,1,2,4,0 indicates that, for BWA-aligned data sets, this variant is classified as PASS (i.e., score ≥ 0.7) in 2, 1, 2, and 4 data sets from IL, NV, FD, and NS, respectively. The final number is always 0 because it counts the "Others", which do not have classifiers.
bwa_REJECT, bowtie_REJECT, and novo_REJECT
Similar to above, except to record the number of samples with SomaticSeq score < 0.1 (thus REJECT).
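A per-aligner count string such as bwa_PASS=2,1,2,4,0 can be unpacked into named groups as in the minimal sketch below. The group order (IL, NV, FD, NS, Others) comes from the description above; the helper name is hypothetical.

```python
# Group order as described for the *_PASS and *_REJECT fields.
GROUPS = ("IL", "NV", "FD", "NS", "Others")


def parse_group_counts(value: str) -> dict:
    """Map a comma-separated count string to its five groups."""
    counts = [int(x) for x in value.split(",")]
    if len(counts) != len(GROUPS):
        raise ValueError(f"expected {len(GROUPS)} counts, got {len(counts)}")
    return dict(zip(GROUPS, counts))


bwa_pass = parse_group_counts("2,1,2,4,0")
print(bwa_pass)                # per-group PASS counts for bwa
print(sum(bwa_pass.values()))  # total PASS data sets aligned with bwa
```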
bwaMQ0, bowtieMQ0, and novoMQ0
In tumor BAM files aligned by bwa/bowtie/novoalign, the number of reads with MQ=0.
MQ0
Sum of bwaMQ0, bowtieMQ0, and novoMQ0 above.
bwaTVAF, bowtieTVAF, novoTVAF
Combined tumor VAF of the samples aligned with bwa/bowtie/novoalign, calculated by adding up the reference reads and variant reads across all of those tumor BAM files.
TVAF
Calculated by counting reference reads and variant reads in all tumor BAM files.
bwaNVAF, bowtieNVAF, novoNVAF
Combined VAF in normal samples by each of the 3 aligners.
NVAF
Same as TVAF, except it is for normal.
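The combined VAF fields pool reads across BAM files before dividing, which is not the same as averaging per-sample VAFs. A minimal sketch, assuming the denominator is reference reads plus variant reads only (reads supporting other alleles are ignored here) and using made-up read counts:

```python
def combined_vaf(read_counts):
    """Pooled VAF from (reference_reads, variant_reads) pairs, one per BAM file."""
    pairs = list(read_counts)  # materialize so the data can be scanned twice
    total_ref = sum(ref for ref, _ in pairs)
    total_var = sum(var for _, var in pairs)
    depth = total_ref + total_var
    # With no reads at all the VAF is undefined; NaN is a choice of this sketch.
    return total_var / depth if depth else float("nan")


# Hypothetical counts from three tumor BAM files: (reference, variant).
tumor_counts = [(90, 10), (45, 5), (180, 20)]
print(combined_vaf(tumor_counts))
```

Because the reads are pooled, deeply sequenced samples contribute more to the combined value than shallow ones.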
nPASSES
Number of samples with SomaticSeq score ≥ 0.7. Max = 54.
nREJECTS
Number of samples with SomaticSeq score < 0.1. Max = 54.
FLAGS (notations indicating that the call may be an experimental artifact)
RandN: there are more data sets where this variant is classified as REJECT than there are classified as PASS, and there are more data sets where the variant is not detected at all than there are data sets where it is classified as PASS.
R: there are more samples classified as REJECT than there are classified as PASS, but the number of data sets where the variant is not detected is fewer than the number of data sets where this variant is deemed PASS.
N: there are more data sets where the variant is not detected than there are data sets where this variant is deemed PASS, but there are fewer REJECTS than PASSES.
RplusN: nREJECTS + nNoCall is greater than nPASSES
MQ0bwa: MQ0 reads make up more than 10% of bwa reads
MQ0bowtie: MQ0 reads make up more than 10% of bowtie reads
MQ0novo: MQ0 reads make up more than 10% of novoalign reads
bwa0: no PASS sample in bwa at all
bowtie0: no PASS sample in bowtie at all
novo0: no PASS sample in novo at all
bwaOnly: PASS samples exist only in bwa-aligned BAM files
bowtieOnly: PASS samples exist only in bowtie-aligned BAM files
novoOnly: PASS samples exist only in novoalign-aligned BAM files
bwa.bowtie.inconsistentVAF, bwa.novo.inconsistentVAF, or bowtie.novo.inconsistentVAF: the tumor VAFs from the two aligners yield a p-value < 0.0027 on a chi-squared contingency test
inconsistentTitration: the VAFs do not move as expected from tumor purity assessment data sets
ArmLossInNormal: the variant coordinate is found in 6p, 16q, chrX, or chrY.
NonCallable: this variant coordinate is not found in the consensus callable regions.
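The count-based flags above (RandN, R, N, RplusN, and the MQ0 flags) follow directly from a few comparisons. The sketch below implements the definitions literally; the function names are hypothetical, ties between counts are treated as not triggering R or N because the definitions do not cover them, and whether real files report RplusN alongside the other flags is not stated, so each flag is evaluated independently here.

```python
def agreement_flags(n_pass: int, n_reject: int, n_nocall: int) -> list:
    """Derive RandN / R / N / RplusN from the three sample counts."""
    flags = []
    if n_reject > n_pass and n_nocall > n_pass:
        flags.append("RandN")
    elif n_reject > n_pass and n_nocall < n_pass:
        flags.append("R")
    elif n_nocall > n_pass and n_reject < n_pass:
        flags.append("N")
    if n_reject + n_nocall > n_pass:
        flags.append("RplusN")
    return flags


def mq0_flag(aligner: str, mq0_reads: int, total_reads: int):
    """Return 'MQ0<aligner>' when MQ0 reads exceed 10% of reads, else None."""
    # Integer comparison avoids floating-point issues at exactly 10%.
    if total_reads > 0 and mq0_reads * 10 > total_reads:
        return "MQ0" + aligner
    return None


print(agreement_flags(n_pass=2, n_reject=10, n_nocall=20))
print(agreement_flags(n_pass=10, n_reject=12, n_nocall=3))
print(mq0_flag("bwa", mq0_reads=11, total_reads=100))
```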
bwaDP, novoDP, and bowtieDP
Total variant-supporting read depth and total read depth in data sets aligned by bwa, novoalign, or bowtie.
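Variant depth and total depth per aligner are the kind of counts on which a chi-squared contingency test, such as the one behind the inconsistentVAF flags, can be run. The sketch below is a plain 2x2 Pearson chi-squared test without continuity correction, written with the standard library only. Whether the original pipeline applied a continuity correction is not stated in this document (scipy.stats.chi2_contingency, for example, applies Yates' correction to 2x2 tables by default), so p-values from this sketch may differ from those used to set the flags. The function name and the read counts are hypothetical.

```python
import math


def vaf_chi2_pvalue(var_a: int, total_a: int, var_b: int, total_b: int) -> float:
    """P-value of a 2x2 Pearson chi-squared test comparing two aligners' VAFs."""
    a, b = var_a, total_a - var_a  # aligner A: variant reads, other reads
    c, d = var_b, total_b - var_b  # aligner B: variant reads, other reads
    n = a + b + c + d
    margins = (a + b) * (c + d) * (a + c) * (b + d)
    if margins == 0:
        # An empty row or column carries no evidence of a difference
        # (a convention chosen for this sketch).
        return 1.0
    chi2 = n * (a * d - b * c) ** 2 / margins
    # Survival function of chi-squared with 1 degree of freedom.
    return math.erfc(math.sqrt(chi2 / 2.0))


THRESHOLD = 0.0027  # cutoff quoted for the inconsistentVAF flags

p_same = vaf_chi2_pvalue(50, 1000, 50, 1000)
p_diff = vaf_chi2_pvalue(20, 1000, 80, 1000)
print(p_same, p_same < THRESHOLD)
print(p_diff, p_diff < THRESHOLD)
```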
In the SAMPLE columns:
Described in the VCF header.