
GATK Mutect2 Variant Calling

Whole genome/exome sequencing (WGS/WXS) aligned CRAM files are processed through the GATK Mutect2 Variant Calling Workflow as tumour/normal pairs. The ARGO DNA Seq pipeline has adopted the Genome Analysis Toolkit (GATK) Docker image developed at the Broad Institute as the base for the workflow. For details, please see the latest version of the ARGO GATK Mutect2 Variant Calling workflow.

Inputs

  • Normal WGS/WXS aligned CRAM and index files
  • Tumour WGS/WXS aligned CRAM and index files
  • Reference files

Processing

  • BQSR Subworkflow is an optional data pre-processing step that detects systematic errors made by the sequencing machine when it estimates the accuracy of each base call. While available as part of the workflow, it is not run as part of the ARGO pipeline.
  • Mutect2 calls SNVs and indels simultaneously via local de novo assembly of haplotypes in an active region.
  • Learn Read Orientation implements the read orientation model, which produces the --orientation-bias-artifact-priors input to the Filter Variants step.
  • Calculate Contamination Subworkflow emits an estimate of the fraction of reads due to cross-sample contamination for both normal and tumour samples. It also generates an estimate of the allelic copy number segmentation of each tumour sample.
  • Filter Variants applies filters to the raw output of Mutect2.
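
As a rough illustration of how the steps above chain together, the following Python sketch builds the GATK command lines for one tumour/normal pair without running them. It is not the ARGO implementation: the function name, the output file names, and the use of a single common-variants VCF for both `-V` and `-L` in GetPileupSummaries are assumptions made for the example, and the real workflow may pass additional arguments (for example a germline resource or a panel of normals).

```python
from pathlib import Path


def build_mutect2_commands(ref, tumour_cram, normal_cram, normal_sample,
                           common_variants, out_dir):
    """Return GATK command lines (argv lists) for one tumour/normal pair, in run order.

    All output file names below are hypothetical and chosen for illustration.
    """
    out = Path(out_dir)
    f1r2 = str(out / "f1r2.tar.gz")
    unfiltered = str(out / "unfiltered.vcf.gz")
    priors = str(out / "read-orientation-model.tar.gz")
    pileups = {
        "tumour": str(out / "tumour.pileups.table"),
        "normal": str(out / "normal.pileups.table"),
    }
    segments = str(out / "tumour.segments.table")
    tumour_contamination = str(out / "tumour.contamination.table")
    normal_contamination = str(out / "normal.contamination.table")
    filtered = str(out / "filtered.vcf.gz")

    commands = [
        # Mutect2: call SNVs and indels, and collect F1R2 counts for the
        # read orientation model.
        ["gatk", "Mutect2", "-R", ref, "-I", tumour_cram, "-I", normal_cram,
         "-normal", normal_sample, "--f1r2-tar-gz", f1r2, "-O", unfiltered],
        # Learn Read Orientation: turn the F1R2 counts into artifact priors.
        ["gatk", "LearnReadOrientationModel", "-I", f1r2, "-O", priors],
    ]

    # Calculate Contamination subworkflow: summarise pileups at common
    # variant sites for each sample, then estimate contamination.
    for label, cram in (("tumour", tumour_cram), ("normal", normal_cram)):
        commands.append(
            ["gatk", "GetPileupSummaries", "-R", ref, "-I", cram,
             "-V", common_variants, "-L", common_variants,
             "-O", pileups[label]])
    commands.append(
        ["gatk", "CalculateContamination", "-I", pileups["tumour"],
         "-matched", pileups["normal"], "--tumor-segmentation", segments,
         "-O", tumour_contamination])
    commands.append(
        ["gatk", "CalculateContamination", "-I", pileups["normal"],
         "-O", normal_contamination])

    # Filter Variants: apply filters to the raw Mutect2 calls.
    commands.append(
        ["gatk", "FilterMutectCalls", "-R", ref, "-V", unfiltered,
         "--contamination-table", tumour_contamination,
         "--tumor-segmentation", segments,
         "--orientation-bias-artifact-priors", priors,
         "-O", filtered])
    return commands
```

Each returned list could be handed to `subprocess.run` in order. Keeping command construction separate from execution makes the data flow between steps (F1R2 counts, artifact priors, contamination and segmentation tables) easy to inspect.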

Collect QC Metrics

  • Cross-sample contamination is estimated by GATK:CalculateContamination for both normal and tumour samples
  • Variant callable stats file is generated by GATK:Mutect2
  • Variant filtering stats file is produced by GATK:FilterMutectCalls
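
The contamination estimate can be checked programmatically once the QC files are available. The sketch below assumes the contamination table is tab-separated with a header row naming `sample`, `contamination` and `error` columns. The helper names and the 0.02 threshold are hypothetical and are not ARGO acceptance criteria.

```python
import csv
import io


def read_contamination(table_text):
    """Parse a contamination table into {sample: (contamination, error)}.

    Assumes a tab-separated table with 'sample', 'contamination' and
    'error' columns in the header row.
    """
    reader = csv.DictReader(io.StringIO(table_text), delimiter="\t")
    return {
        row["sample"]: (float(row["contamination"]), float(row["error"]))
        for row in reader
    }


def flag_contaminated(estimates, threshold=0.02):
    """Return the sorted sample names whose estimate exceeds the threshold.

    The default threshold is an illustrative value only.
    """
    return sorted(
        sample
        for sample, (contamination, _error) in estimates.items()
        if contamination > threshold
    )
```

In practice `table_text` would be the contents of a contamination table file, for example `Path(table_path).read_text()`, with one table per sample.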

Outputs

Workflow Diagram

GATK Mutect2 Variant Calling workflow