Interpreting OncoSNP output

OncoSNP v1.3 introduces a substantially changed output format to bring it in line with OncoSNP-SEQ.

For details of the older formats, see the section "OncoSNP versions 1.2 and below" further down this page.

OncoSNP version 1.3

Standard output (.cnvs)

These output files contain a list of putative chromosomal alterations with the following columns:

Chromosome

The chromosome on which the somatic alteration is found.

StartPosition

The start position of the alteration.

EndPosition

The end position of the alteration.

CopyNumber

The copy number associated with the alteration. OncoSNP assumes that, at each locus, there exist tumour cells (A) that harbour a somatic alteration, alongside other tumour cells (B) and normal cells that have a normal copy number (2). The copy number reported here is that of the tumour cells (A).

LOH

The LOH status associated with the alteration. OncoSNP assumes that, at each locus, there exist tumour cells (A) that harbour a somatic alteration, alongside other tumour cells (B) and normal cells that are heterozygous. The LOH status reported here is that of the tumour cells (A).

Possible values are: 0 (No LOH), 1 (Putative Somatic LOH) and 2 (Putative Germline Run of Homozygosity).

Rank

OncoSNP calls are ranked to allow a multi-resolution genome-wide copy number construction. The set of calls that are ranked 1 will give a coarse representation of the genome-wide copy number distribution whilst the use of all higher ranked calls will allow more detailed profiles to be constructed. See How OncoSNP calls are ranked for further details.
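One way to read the ranking is as a set of layers. The sketch below looks up the copy number at a position using only calls up to a chosen rank. It assumes that where calls of different ranks overlap, the call with the highest rank number gives the more detailed value, and that positions not covered by any call are at the normal copy number of 2. Both assumptions, the function name and the example calls are illustrative only; the page How OncoSNP calls are ranked describes the actual scheme.

```python
def copy_number_at(calls, chromosome, position, max_rank, default=2):
    """Copy number at one position using calls ranked 1..max_rank.

    Assumption (not taken from OncoSNP): where several calls overlap, the
    call with the highest rank number is the most detailed one and wins.
    Positions not covered by any call get `default`.
    """
    covering = [
        c for c in calls
        if c["Chromosome"] == chromosome
        and c["StartPosition"] <= position <= c["EndPosition"]
        and c["Rank"] <= max_rank
    ]
    if not covering:
        return default
    return max(covering, key=lambda c: c["Rank"])["CopyNumber"]


# Made-up calls: a broad Rank 1 gain containing a focal Rank 2 amplification.
example_calls = [
    {"Chromosome": "8", "StartPosition": 1_000_000, "EndPosition": 90_000_000,
     "CopyNumber": 3, "Rank": 1},
    {"Chromosome": "8", "StartPosition": 40_000_000, "EndPosition": 42_000_000,
     "CopyNumber": 5, "Rank": 2},
]

coarse = copy_number_at(example_calls, "8", 41_000_000, max_rank=1)
detailed = copy_number_at(example_calls, "8", 41_000_000, max_rank=2)
print(coarse, detailed)
```

Restricting max_rank to 1 gives the coarse profile, while raising it adds the finer-scale changes on top.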

Loglik

A likelihood score associated with each call. This is not intended for use by end-users.

nProbes

The number of SNP array probes spanned by the call.

NormalFraction

The fraction of tumour and normal cells that do not harbour this somatic alteration. This value will vary between calls if the intra-tumour heterogeneity mode is used but is a constant if only stromal contamination is used.
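As a rough illustration of how CopyNumber and NormalFraction combine, the sketch below computes the average copy number over all cells at a locus. It assumes NormalFraction is reported on a 0 to 1 scale and that every cell without the alteration has copy number 2, as described under CopyNumber. The function name is hypothetical and not part of OncoSNP.

```python
def expected_mean_copy_number(copy_number, normal_fraction):
    """Average copy number over all cells at one locus.

    A fraction `normal_fraction` of cells is assumed to carry the normal
    copy number (2); the remaining cells carry `copy_number`.
    """
    if not 0.0 <= normal_fraction <= 1.0:
        raise ValueError("normal_fraction must be between 0 and 1")
    return normal_fraction * 2 + (1.0 - normal_fraction) * copy_number


# A single-copy gain (CopyNumber 3) with NormalFraction 0.4:
# 40% of cells at copy number 2 and 60% at copy number 3, roughly 2.6 overall.
mean_cn = expected_mean_copy_number(3, 0.4)
print(round(mean_cn, 3))
```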

TumourState

The tumour state table index of the copy number call.

PloidyNo

OncoSNP explores two different ploidy configurations during the training phase to help determine whether a sample is diploid or non-diploid.

The calls associated with the most probable ploidy configuration will have PloidyNo value 1, whilst the alternate configuration will have PloidyNo value 2.

Users should use the calls from one ploidy number or the other, not a mix of both!

See Ploidy Numbers Explained for more details.

MajorCopyNumber

The copy number of the major allele.

MinorCopyNumber

The copy number of the minor allele.
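A minimal sketch of loading a .cnvs file and keeping the calls from a single ploidy configuration follows. It assumes the file is whitespace-delimited text with a header row that uses the column names above; please check this against your own output files. The function names and the example rows are made up for illustration.

```python
import io

INT_COLUMNS = {"StartPosition", "EndPosition", "CopyNumber", "LOH", "Rank",
               "nProbes", "TumourState", "PloidyNo",
               "MajorCopyNumber", "MinorCopyNumber"}
FLOAT_COLUMNS = {"Loglik", "NormalFraction"}


def read_cnvs(handle):
    """Read calls into a list of dicts keyed by the header names."""
    header = handle.readline().split()
    calls = []
    for line in handle:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        call = {}
        for name, value in zip(header, fields):
            if name in INT_COLUMNS:
                call[name] = int(float(value))  # tolerate values such as "3.0"
            elif name in FLOAT_COLUMNS:
                call[name] = float(value)
            else:
                call[name] = value  # e.g. Chromosome may be "X"
        calls.append(call)
    return calls


def select_calls(calls, ploidy_no=1, max_rank=1):
    """Keep calls from one ploidy configuration, up to a given rank."""
    return [c for c in calls
            if c["PloidyNo"] == ploidy_no and c["Rank"] <= max_rank]


# Invented example content, not real OncoSNP output.
EXAMPLE = """Chromosome StartPosition EndPosition CopyNumber LOH Rank Loglik nProbes NormalFraction TumourState PloidyNo MajorCopyNumber MinorCopyNumber
1 1000000 5000000 3 0 1 250.0 400 0.3 5 1 2 1
1 2000000 2500000 4 0 2 80.0 60 0.3 7 1 3 1
1 1000000 5000000 2 1 1 240.0 400 0.5 4 2 2 0
"""

calls = read_cnvs(io.StringIO(EXAMPLE))
rank1_calls = select_calls(calls, ploidy_no=1, max_rank=1)
print(len(calls), len(rank1_calls))
```

With a real file, replace the io.StringIO object with an open file handle, for example open("sample.cnvs"), where the file name is a placeholder.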


Quality Control (.qc)

LogRRatioShift

OncoSNP shifts the Log R Ratio values to account for unknown ploidy. This value indicates how much the Log R Ratio values were adjusted by.

NormalContent

The degree of normal contamination.
 
Average Copy Number

The average genome-wide copy number (ploidy) of the sample.

Log-likelihood

The log-likelihood of the data.

OutlierRate

The estimated outlier rate.

LogRRatioStd

The standard deviation of the Log R Ratio values.
 
BAlleleFreqStd

The standard deviation of the B Allele Frequency values.

PloidyNo

OncoSNP explores two different ploidy configurations during the training phase to help determine whether a sample is diploid or triploid/tetraploid.

The calls associated with the most probable configuration (largest log-likelihood) will have PloidyNo value 1, whilst the alternate configuration will have PloidyNo value 2.

Users should use one configuration or the other, not both!
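The relationship between the log-likelihood and PloidyNo can be sketched as follows. The records and their values are invented, and the dictionary keys simply mirror the field names above; nothing is assumed here about how the .qc file is laid out on disk.

```python
def most_probable_configuration(qc_records):
    """Return the record with the largest log-likelihood."""
    return max(qc_records, key=lambda record: record["Log-likelihood"])


# Invented values for the two ploidy configurations of one sample.
qc_records = [
    {"PloidyNo": 1, "Log-likelihood": -120345.2, "NormalContent": 0.35},
    {"PloidyNo": 2, "Log-likelihood": -121002.7, "NormalContent": 0.10},
]

best = most_probable_configuration(qc_records)
print(best["PloidyNo"])
```

In this example the configuration labelled PloidyNo 1 is the one returned, which matches the labelling rule described above.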


Graphical Output

OncoSNP also writes graphical output to PostScript files (Windows users will need Ghostscript/GhostViewer to open these; Mac users can try PostView).

The PostScript files contain a summary of the genome-wide copy number/LOH profile, with the following plots (from top to bottom):
  1. The adjusted Log R Ratio values (black), which correct for normalisation discrepancies in non-diploid tumours. The expected Log R Ratio levels corresponding to the inferred copy number profile are also shown (red).
  2. The B Allele Frequencies (black). The expected B Allele Frequency levels corresponding to the inferred copy number profile are also shown (red).
  3. Copy number/LOH profile based on Rank 1 calls (black/red). The fraction of normal/unmutated tumour cells at each locus (blue). 
  4. Additional copy number/LOH changes (relative to Rank 1)  based on Rank 2 calls (black/red). The fraction of normal/unmutated tumour cells at each locus (blue). 
  5. Additional copy number/LOH changes (relative to Rank 1+2) based on Rank 3 calls (black/red). The fraction of normal/unmutated tumour cells at each locus (blue). 
  6. Additional copy number/LOH changes  (relative to Rank 1+2+3) based on Rank 4 calls (black/red). The fraction of normal/unmutated tumour cells at each locus (blue).
Subsequent pages of the document contain per-chromosome plots, which show the absolute copy number profiles by rank rather than relative changes.


Figure - Example OncoSNP Summary Figure




OncoSNP versions 1.2 and below

Standard output (.cnv)

These output files contain a list of putative chromosomal alterations with the following columns:
  • Chromosome
  • Start Position (bp)
  • End Position (bp)
  • Length (Mb)
  • Start Position (Probe Name)
  • End Position (Probe Name)
  • Number of Probes Spanned
  • Copy number (most probable)
  • LOH Status (most probable, 0 - No LOH, 1 - Somatic LOH, 2 - Germline LOH)
  • % Normal (estimated percentage of cells which do not carry the alteration) [NEW in v1.1]
  • Tumour State (most probable)
  • Log Bayes Factor (associated with most probable tumour state)
  • Multiple columns of Log Bayes Factors for each tumour state
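As a sketch of how the per-state Log Bayes Factor columns relate to the most probable tumour state, the snippet below ranks tumour states by Log Bayes Factor, largest first. It assumes tumour states are numbered from 1 in column order, which should be checked against the tumour state table; the function name and the values are illustrative.

```python
def rank_tumour_states(log_bayes_factors, top_n=3):
    """Return the tumour state numbers with the largest Log Bayes Factors.

    Assumption: the first value belongs to tumour state 1, the second to
    tumour state 2, and so on.
    """
    numbered = enumerate(log_bayes_factors, start=1)
    ranked = sorted(numbered, key=lambda pair: pair[1], reverse=True)
    return [state for state, _ in ranked[:top_n]]


# Invented Log Bayes Factors for five tumour states.
example_factors = [-5.0, 12.5, 3.1, 40.2, 0.0]
top_states = rank_tumour_states(example_factors)
print(top_states)  # [4, 2, 3]
```

The first entry corresponds to the most probable tumour state for the call.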

Full output (.gg)


Probe-based summary file containing the following columns:

Probe Name

Chromosome

Position

Log R Ratio 

The values given will be corrected for local GC content and baseline ploidy.

B allele frequency (same as input)

Normal Germline Genotype

In single sample analysis mode, no data is reported in this field. In paired sample analysis mode, the germline genotype is obtained from the normal sample data.

Tumour State 1-3

The three tumour states with the highest Log Bayes Factors.

Copy number 1-3

The copy numbers associated with the top three tumour states.

LOH 1-3

The LOH state associated with the top three tumour states.

Tumour genotype 1-3

Tumour genotype associated with the top three tumour states.

Germline genotype 1-3

Germline genotype associated with the top three tumour states. In single-sample analysis mode, this is estimated from the tumour data itself, whilst in paired mode, this is the same as reported in the "Normal Germline Genotype" column.

Germline content 1-3

The estimated proportion of cells possessing the germline genotype at this probe location for each of the top three tumour states. This field is NaN if the SNP is homozygous.


Plots (.ps.gz or .png)


Example summary output [NEW in v1.2]

Summary graphic output

Top Row - Local GC content and ploidy corrected Log R Ratio values.
2nd Row - B Allele Frequency.
3rd Row - Tumour copy number and LOH.
Bottom Row - Putative amplification regions.


Example per-chromosome output



Log R Ratio

This plot shows the Log R Ratio corrected for local GC content (if specified) and normalisation artefacts due to baseline ploidy. The RED line is a moving average of the Log R Ratio and the GREEN lines indicate the expected Log R Ratio levels for different copy number states.

B allele frequency

This plot shows the B allele frequency. The solid lines below the plot correspond to putative somatic LOH regions. Dotted lines indicate putative germline mutations. The colours indicate the tumour state ranking (RED - 1st, GREEN - 2nd, BLUE - 3rd). 

Tumour Copy Number

This plot shows the estimated tumour copy numbers. The colours indicate the tumour state ranking (RED - 1st, GREEN - 2nd, BLUE - 3rd).

Normal content

This plot shows the estimated segmented germline content, i.e. the percentage of cells in the sample possessing the germline genotype at that SNP location. This value is calculated by averaging over the per-SNP values in the region. The colours indicate the tumour state ranking (RED - 1st, GREEN - 2nd, BLUE - 3rd).

Smoothed Tumour Copy Number [NEW in v1.1]

This plot shows the estimated smoothed tumour copy numbers.

Normal content associated with smooth copy number profile [NEW in v1.1]

This plot shows the estimated per-SNP germline content associated with the smooth tumour copy number profile.


Quality Control (.qc)


This file contains a number of important quality control metrics:
  1. Outlier Rate - the estimated outlier rate.
  2. Std. Dev. LRR - the standard deviation of the Log R Ratio values.
  3. Std. Dev. BAF - the standard deviation of the B Allele Frequency values.
  4. Stromal contamination (x2) - estimates of the level of stromal contamination for each mode.
  5. Log R Ratio baseline shift (x2) - an estimate of the baseline for each mode.
  6. Log-likelihood (x2) - log-likelihood value associated with each mode.
  7. Log R Ratio levels (x2) - estimates of the Log R Ratio level for copy numbers 0-6 for each mode.
Note: OncoSNP v1.0 attempted and reported three different starting configurations. From v1.1 only two are examined, because in v1.0 two of the starting configurations often converged to the same final model.

Smoothed CNAs output (.cnvs) [NEW in v1.1]

These output files contain a list of putative chromosomal alterations (using smoothing) with the following columns:
  • Chromosome
  • Start Position (bp)
  • End Position (bp)
  • Copy number (most probable)
  • LOH Status (most probable, 0 - No LOH, 1 - Somatic LOH, 2 - Germline LOH)
  • Normal content % (% of cells with tumour genotype in this region -- average over all heterozygous SNPs, see .ggs files for per-SNP values) [NEW in v1.2]


Smoothed CNAs SNP output (.ggs) [NEW in v1.1]

These output files contain a list of quantities output for each probe (using smoothing) with the following columns:
  • ProbeID
  • Chromosome
  • Position (bp)
  • Tumour State
  • Copy number
  • LOH Status (0 - No LOH, 1 - Somatic LOH, 2 - Germline LOH)
  • Number of B alleles in tumour genotype
  • Normal content (% of cells with tumour genotype at this SNP - values only given for heterozygous SNPs, NaN otherwise)
  • Amplification (0 - No Amplification, 1 - Putative Amplification) [NEW in v1.2]
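The region-level Normal content in the smoothed .cnvs file is described as an average of the per-SNP values found in the .ggs file. A minimal sketch of that averaging, skipping the NaN entries reported for homozygous SNPs, might look like this. The function name and the values are illustrative and this is not OncoSNP's own code.

```python
import math


def region_normal_content(per_snp_values):
    """Mean of the per-SNP normal content, ignoring NaN entries.

    Returns NaN when the region contains no heterozygous SNPs.
    """
    valid = [v for v in per_snp_values if not math.isnan(v)]
    if not valid:
        return math.nan
    return sum(valid) / len(valid)


# Invented per-SNP values as they might be read from text with float().
raw_fields = ["40.0", "NaN", "60.0", "NaN"]
values = [float(field) for field in raw_fields]
region_mean = region_normal_content(values)
print(region_mean)  # 50.0
```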
