created by LeeTL1220
on 2016-04-06
Once you have run GATK CNV, you can run ACNV for revised segments based on both the target-coverage profile and the ref/alt counts at heterozygous SNPs. ACNV will report estimates for the posterior probabilities for copy ratio and minor-allele fraction in each segment.
Requirements
Overview of steps
Step 1. Het Pulldown
* These instructions describe one method for Het Pulldown for matched samples. For more options, including tumor-only, please see: http://gatkforums.broadinstitute.org/gatk/discussion/7719/overview-of-getbayesianhetcoverage-for-heterozygous-snp-calling *
Inputs
Outputs
Format for both output files:
CONTIG POSITION REF_COUNT ALT_COUNT REF_NUCLEOTIDE ALT_NUCLEOTIDE READ_DEPTH 1 809876 5 16 A G 21 1 881627 23 12 G A 35 1 882033 9 10 G A 19 1 900505 26 24 G C 50 ....snip....
Invocation
java -jar <path_to_gatk_protected_jar> GetBayesianHetCoverage --reference <reference_sequence> --snpIntervals <snp_file> --tumor <case_bam> --tumorHets <tumor_het_pulldown> --normal <control_bam> --normalHets <normal_het_pulldown> --hetCallingStringency 30
Step 2. Allelic CNV
Inputs
Outputs
-sim-final.seg
containing posterior summary statistics for log2 copy ratio and minor-allele fraction in each segment. Using the above outputprefix, /home/lichtens/myacnv_output/sample1-sim-final.seg-sim-final.cr.param
containing posterior summary statistics for global parameters of the copy-ratio model. Using the above outputprefix, /home/lichtens/myacnv_output/sample1-sim-final.cr.param-sim-final.af.param
containing posterior summary statistics for global parameters of the allele-fraction model. Using the above outputprefix, /home/lichtens/myacnv_output/sample1-sim-final.af.paramOther files containing intermediate results of the calculation are also generated.
Invocation
java -Xmx8g -jar <path_to_gatk_protected_jar> AllelicCNV --tumorHets <tumor_het_pulldown> --tangentNormalized <coverage_profile> --segments <called_segments> --outputPrefix <output_prefix>
Step 3. Call CNLoH and Balanced Segments
* WARNING: This tool is experimental and exists primarily for internal Broad use. *
Inputs
Outputs
-sim-final.cnv.seg
. This file is formatted identically as the output of GATK CNV. Note that this implies that the allelic fraction values are not captured in this file.-sim-final.acs.seg
. This file is formatted identically as the output of Broad CGA AllelicCapSeg. Note that this file can be used as input to Broad-internal versions of ABSOLUTE.-sim-final.titan.het.tsv
. This file can be used as the input to TITAN for the het read counts.-sim-final.titan.tn.tsv
. This file can be used as the input to TITAN for the per-target copy-ratio estimates.Invocation
java -Xmx8g -jar <path_to_gatk_protected_jar> CallCNLoHAndSplits --tumorHets <tumor_het_pulldown> --segments <acnv_segments> --tangentNormalized <coverage_profile> --outputDir <output_dir> --rhoThreshold 0.2 --numIterations 10 --sparkMaster local[*]
Updated on 2016-09-07
From Lizz on 2016-08-30
Hi, if i have the paired (tumor/normal) sample, how to use the normalHets? In the step2, could I use the normalHets file for the arguments —allelicPanelOfNormals,-aPON:File (Input file for allelic-bias panel of normals. Default value: null.)?
From slee on 2016-08-30
Hi @Lizz,
If you use GetHetCoverage or GetBayesianHetCoverage to generate het pulldowns for a large number of normals, you can use these as input to the tool CreateAllelicPanelOfNormals to generate an allelic PoN to use with ACNV. However, the allelic PoN is still under development and the code for this tool was only recently merged into master; thus, if you are using one of the tagged releases, it will not be available.
For the time being, I’d recommend running without the allelic PoN. In this case, the normal het pulldown is not typically used after step 1. However, if you are interested in treating the normal as a tumor case sample, you could pass it as the argument of tumorHet in step 2.
From Lizz on 2016-09-05
thank u!@slee
I have already accomplished the ACNV pipline, and I get the allelicCNV-sim-final.cnv.seg in the step3, not the step2.
and there are still some other diffrences from this description: 1. In step 2, the outputs is : *-MAF.seg, *-no-small.seg, *-sim-final.seg, *-sim-initial.seg, but not the -sim-final.cnv.seg file. 2. In step 3, the outputs is : *-sim-final.acs.seg, *-sim-final.cnb_called.seg, *-sim-final.cnv.seg, *-sim-final.titan.het.tsv, *-sim-final.titan.tn.tsv, not including the file -sim-final.seg. 3. in step 2 and step3, there are have the same file, just for example: /home/lichtens/my_acnv_output/sample1-sim-final.seg , is it resonable?
what’s more, I still have 2 quenstions about the file *-sim-final.acs.seg as input to ABSOLUTE: 1. when I run ABSOLTE the cmdline as follows:
From Lizz on 2016-09-05
sorry about the horrible network!
what’s more, I still have 2 quenstions about the file *-sim-final.acs.seg as input to ABSOLUTE:
1. when I run ABSOLTE the cmdline as follows:(R-3.1.3 >>library(numDeriv) >>library(ABSOLUTE) >>RunAbsolute(“FQ1.allelicCNV-sim-final.acs.seg”, 0, 0.02, 0.95, 10, “PRIMARY_DISEASE”, “Illumina_WES”, “FQ1”, “./”, 1500, 0, 0, “allelic”, maf.fn=NULL, min.mut.af=NULL, output.fn.base=“ofb”, verbose=TRUE)
and the error as : Error: bad restore file magic number (file may be corrupted) — no data loaded In addition: Warning message: file ‘FQ1.allelicCNV-sim-final.acs.seg’ has magic number ‘Chrom‘ Use of save versions prior to 2 is deprecated
the file format is not right? do you have the example file (*-sim-final.acs.seg), and how do you run the ABSOLUTE command? and the format of my -sim-final.acs.seg file is : Chromosome Start.bp End.bp n_probes length n_hets f tau sigma.tau mu.minor sigma.minor mu.major sigma.major SegLabelCNLOH chr1 924630 1298560 11 373930 44 0.5 3.895360886524925 0.017691433016881752 0.961756993262878 0 0.961756993262878 0 2 chr1 1298561 1645376 13 346815 28 0.5 3.8816952962393625 0.030571261954410678 0.956686874399459 0 0.956686874399459 0 2 chr1 1645377 2355508 9 710131 17 0.5 3.9980144929179966 0.022250610170735652 0.9992837019032069 0 0.9992837019032069 0 2
From slee on 2016-09-05
Hi @lizz,
Glad to hear you were able to run ACNV, and thanks for pointing out the differences in this description (which has fallen a little out of date, as we are still actively developing CNV and ACNV—-I will be sure to update it soon, though!) It sounds like the run is completing correctly.
I am not quite sure why the input to ABSOLUTE is not working correctly. Which version of ABSOLUTE are you running? I will double check with some members in Broad CGA (who have been successfully using ACNV as input to ABSOLUTE) to see how they are running things and will get back to you!
From Lizz on 2016-09-06
Hi, @slee ,thank you!
I forgot an important thing that the version of reference genome is hg38, not hg19. And I couldn’t plot the pictures(.png), but the INFO is SUCCESS, why?
the environment that I run ABSOLUTE: 1. ABSOLUTE-1.0.6 (http://www.broadinstitute.org/cancer/cga/absolute_download) 2. R-3.1.3 with the numDeriv / ggplot2 / Rcpp / foreach / DNAcopy / gslibs package , not recommended version of R-2.12.0 3. the input file is descrip above
From slee on 2016-09-06
Hi @lizz,
Thanks for the info—-I’ll look into things. It’s possible that the issue is due to the versions of R you are using.
Also, it may be the case that using hg38 is causing the error in plotting, as plotting only supports hg19 at the moment. That being said, we will try to change the code so that an exception is thrown in this case, instead of a SUCCESS message being returned!
From slee on 2016-09-07
Hi @lizz,
Actually, I’m not sure if the publicly available version of ABSOLUTE-1.0.6 is able to take in either AllelicCapSeg or ACNV output—-it looks like it may only take in HAPSEG output in allelic mode.
The conversion of ACNV output to an ABSOLUTE-compatible AllelicCapSeg file was enabled for Broad CGA users, who are using a version of ABSOLUTE that is (unfortunately) not available to the public at the moment. We will update the post to reflect this—-apologies for the confusion!
From amjad on 2017-01-02
how can I disable spark in AllelicCNV?
From shlee on 2017-01-03
@Lizz,
GenePattern’s [documentation of ABSOLUTE v1.0.6](http://software.broadinstitute.org/cancer/software/genepattern/modules/docs/ABSOLUTE/2) recommends R2.15 with the following packages:
- numDeriv_2012.9-1
- getopt_1.17
- optparse_0.9.5
Please be exact in the version of R and packages you use.
From amjad on 2017-01-09
> @amjad said:
> how can I disable spark in AllelicCNV?
The problem was that I was running too many samples in parallel and spark apparently ran out of ports. It works now after reducing the number of running jobs.
From amjad on 2017-01-10
I am getting an error in AllelicCNV only in few samples:
java.lang.IllegalArgumentException: Initial point in slice sampler is not within specified range. at org.broadinstitute.hellbender.utils.Utils.validateArg(Utils.java:609) at org.broadinstitute.hellbender.utils.mcmc.SliceSampler.sample(SliceSampler.java:71) at org.broadinstitute.hellbender.tools.exome.copyratio.CopyRatioSamplers$VarianceSampler.sample(CopyRatioSamplers.java:54) at org.broadinstitute.hellbender.tools.exome.copyratio.CopyRatioSamplers$VarianceSampler.sample(CopyRatioSamplers.java:29) at org.broadinstitute.hellbender.utils.mcmc.ParameterizedModel.doGibbsUpdate(ParameterizedModel.java:134) at org.broadinstitute.hellbender.utils.mcmc.ParameterizedModel.update(ParameterizedModel.java:124) at org.broadinstitute.hellbender.utils.mcmc.GibbsSampler.runMCMC(GibbsSampler.java:76) at org.broadinstitute.hellbender.tools.exome.copyratio.CopyRatioModeller.fitMCMC(CopyRatioModeller.java:96) at org.broadinstitute.hellbender.tools.exome.ACNVModeller.fitModel(ACNVModeller.java:145) at org.broadinstitute.hellbender.tools.exome.ACNVModeller.(ACNVModeller.java:106) at org.broadinstitute.hellbender.tools.exome.AllelicCNV.runPipeline(AllelicCNV.java:280) at org.broadinstitute.hellbender.engine.spark.SparkCommandLineProgram.doWork(SparkCommandLineProgram.java:38) at org.broadinstitute.hellbender.cmdline.CommandLineProgram.runTool(CommandLineProgram.java:108) at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMainPostParseArgs(CommandLineProgram.java:166) at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:185) at org.broadinstitute.hellbender.Main.instanceMain(Main.java:76) at org.broadinstitute.hellbender.Main.main(Main.java:92)
From slee on 2017-01-13
Hi @amjad,
That appears to be the result of an edge case in the slice sampling—-should be a quick fix. Can I ask which version of gatk-protected (i.e., commit hash or release number) you are using?
From FPBarthel on 2018-10-17
Hi @LeeTL1220 I successfully used your recent GATK4 [workflow](https://github.com/broadinstitute/gatk/pull/5252 “workflow”) (part of 4.0.10.1 release) to generate `acs.seg` file for some cancer genome data. However using ABSOLUTE 1.0.6, I am unable to import these files using `RunAbsolute()`, which requires and R object as input. Am currently trying to figure out how to convert the `acs.seg` file into a siutable R object format. Do you have any pointers on how to do so?
From LeeTL1220 on 2018-10-18
@FPBarthel Unfortunately, ABSOLUTE is maintained by a different group. I have always had them set it up for me to run and I don’t have access to the source code/configuration. I currently run it through FireCloud (when I run it).
From FPBarthel on 2018-10-18
Thanks @LeeTL1220! After a bit more digging on Google, Github and message boards I came across a somewhat more recent version of ABSOLUTE [here](http://bcb.dfci.harvard.edu/~scarter/clonalevolutionsuite/ “here”) which does seem to support ACS input. It’s been giving a lot of errors but I’ve been able to address some of these with small code fixes and that seems to be fruitful. Is there anyone I can reach out to from the ABSOLUTE group you mention who may be able to give some additional insights?
From petr101 on 2019-01-23
Hi,
in what version of GATK I can find CallCNLoHAndSplits, please? I run successfully a whole workflow using GATK 4.beta.6 and at the last step I found out that CallCNLoHAndSplits is not there. Or is there any alternative to generate a ‘sim-final.acs.seg’ file using this version of GATK?
From jonathanYu on 2019-08-14
Hi,
I need the *-sim-final.acs.seg file as input of deTiN, so that I have to run step 3. However, I came across a problem on the installation of gatk-protected-1.0.0.0-alpha1.2.3. When running ‘./gradlew shadowJar’ , I got an error as below
Downloading https://services.gradle.org/distributions/gradle-2.13-all.zip
Unzipping /home/user/.gradle/wrapper/dists/gradle-2.13-all/1b9wlm73elu4cqnbc0rk7r7qn/gradle-2.13-all.zip to /home/user/.gradle/wrapper/dists/gradle-2.13-all/1b9wlm73elu4cqnbc0rk7r7qn
Set executable permissions for: /home/user/.gradle/wrapper/dists/gradle-2.13-all/1b9wlm73elu4cqnbc0rk7r7qn/gradle-2.13/bin/gradle
…..
Download https://repo1.maven.org/maven2/log4j/log4j/1.2.17/log4j-1.2.17.pom
FAILURE: Build failed with an exception.
BUILD FAILED
Can anyone help me to solve this problem?
From slee on 2019-08-16
@jonathanYu I would not recommend using gatk-protected or ACNV; these have been superseded by GATK4 and ModelSegments, which introduce many improvements.
The GATK4-equivalent of the internal scripts mentioned above can be found at https://github.com/broadinstitute/gatk/tree/master/scripts/unsupported/combine_tracks_postprocessing_cnv Note that these scripts are unsupported and use a number of questionable heuristics. However, you may find studying them useful in developing your own scripts.