Quality Control Metrics

ICGC ARGO provides access to data from ARGO member programs, generated through the standardized ARGO Analysis pipelines. This page lists the quality control (QC) metrics analyses and the data files that those pipelines generate.

QC metrics are collected and recorded at several checkpoints in the ARGO Analysis pipelines to ensure that only high-quality data is released. Each set of QC metrics is associated with the analysis or file set that it annotates.

Read Group QC

Data files containing read group (lane) level QC metrics.

File Types

| Filename Pattern | Description | Analysis Type | Data Category | Generating Workflow(s) |
| --- | --- | --- | --- | --- |
| *.ubam_qc_metrics.tgz | Generated by Picard CollectQualityYieldMetrics | qc_metrics | Quality Control Metrics | DNA Seq Alignment |
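Every file type on this page is distributed as a `.tgz` (gzip-compressed tar) archive. As a minimal sketch, the contents of a downloaded archive can be inspected with Python's standard `tarfile` module. The function name `list_archive_members` is hypothetical, and the names of the files inside each archive are not documented here, so the sketch only lists whatever it finds.

```python
import tarfile


def list_archive_members(archive_path):
    """Return the names of the regular files inside a .tgz archive."""
    # "r:gz" opens the archive for reading with gzip decompression.
    with tarfile.open(archive_path, mode="r:gz") as archive:
        return [member.name for member in archive.getmembers() if member.isfile()]
```

Calling `list_archive_members` on a local copy of a `*.ubam_qc_metrics.tgz` file shows which metrics files it bundles before anything is extracted.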

Alignment QC

Data files containing quality control metrics for aligned CRAM files.

File Types

| Filename Pattern | Description | Analysis Type | Data Category | Generating Workflow(s) |
| --- | --- | --- | --- | --- |
| *.qc_metrics.tgz | Generated by Samtools stats | qc_metrics | Quality Control Metrics | DNA Seq Alignment |
| *.bas_metrics.tgz | Generated by Sanger bam_stats.pl script | qc_metrics | Quality Control Metrics | Sanger WGS Variant Calling, Sanger WXS Variant Calling |
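Samtools stats writes its headline figures as tab-separated "summary number" lines that begin with `SN`. The sketch below parses those lines into a dictionary. It assumes the `*.qc_metrics.tgz` archive contains the plain-text Samtools stats report, which this page does not confirm, and the function name `parse_summary_numbers` is hypothetical.

```python
def parse_summary_numbers(stats_text):
    """Collect the SN (summary number) lines of a samtools stats report."""
    summary = {}
    for line in stats_text.splitlines():
        fields = line.split("\t")
        # SN lines look like: SN<TAB>key:<TAB>value[<TAB># comment]
        if len(fields) < 3 or fields[0] != "SN":
            continue
        key = fields[1].rstrip(":")
        try:
            summary[key] = float(fields[2])
        except ValueError:
            # Keep any non-numeric value as the original text.
            summary[key] = fields[2]
    return summary
```

Lines from other sections of the report are skipped, so the whole file can be passed in without pre-filtering.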

Duplicates Metrics

Data files containing duplicates metrics for aligned CRAM files. Reads that align to the same position in the genome are identified and tagged as duplicates in the CRAM file. Duplicate reads are defined as reads that originate from a single fragment of DNA.

File Types

| Filename Pattern | Description | Analysis Type | Data Category | Generating Workflow(s) |
| --- | --- | --- | --- | --- |
| *.duplicates_metrics.tgz | Generated by biobambam2 bammarkduplicates2 | qc_metrics | Quality Control Metrics | DNA Seq Alignment |

OxoG Metrics

Data files containing OxoG metrics, which quantify the error rate caused by oxidative artifacts in aligned CRAM files.

File Types

| Filename Pattern | Description | Analysis Type | Data Category | Generating Workflow(s) |
| --- | --- | --- | --- | --- |
| *.oxog_metrics.tgz | Generated by GATK CollectOxoGMetrics | qc_metrics | Quality Control Metrics | DNA Seq Alignment |

Ploidy and Purity Estimation

Data files containing tumour purity and ploidy estimates.

File Types

| Filename Pattern | Description | Analysis Type | Data Category | Generating Workflow(s) |
| --- | --- | --- | --- | --- |
| *.ascat_metrics.tgz | Estimated by Sanger ASCAT CNV caller | qc_metrics | Quality Control Metrics | Sanger WGS Variant Calling |

Cross Sample Contamination

Data files containing cross-sample contamination estimates.

File Types

| Filename Pattern | Description | Analysis Type | Data Category | Generating Workflow(s) |
| --- | --- | --- | --- | --- |
| *.contamination_metrics.tgz | Estimated by Sanger verifyBamHomChk.pl script, which provides information to determine whether the sample is possibly contaminated or swapped | qc_metrics | Quality Control Metrics | Sanger WGS Variant Calling |

Genotyping Inferred Gender

Data files containing genotype comparison results for CRAM files from the same donor, including the fraction of matched genotypes and the inferred donor gender.

File Types

| Filename Pattern | Description | Analysis Type | Data Category | Generating Workflow(s) |
| --- | --- | --- | --- | --- |
| *.genotyped_gender_metrics.tgz | Generated by Sanger compareBamGenotypes.pl script | qc_metrics | Quality Control Metrics | Sanger WGS Variant Calling |
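Because each QC file type is identified by its filename pattern, a downloaded file can be matched back to the section and tool that produced it. The following is a minimal sketch of such a lookup. The patterns and tool names are copied from the tables on this page, while `QC_FILE_TYPES` and `classify_qc_file` are hypothetical names introduced for illustration.

```python
from fnmatch import fnmatchcase

# (filename pattern, section on this page, generating tool)
QC_FILE_TYPES = [
    ("*.ubam_qc_metrics.tgz", "Read Group QC", "Picard CollectQualityYieldMetrics"),
    ("*.qc_metrics.tgz", "Alignment QC", "Samtools stats"),
    ("*.bas_metrics.tgz", "Alignment QC", "Sanger bam_stats.pl"),
    ("*.duplicates_metrics.tgz", "Duplicates Metrics", "biobambam2 bammarkduplicates2"),
    ("*.oxog_metrics.tgz", "OxoG Metrics", "GATK CollectOxoGMetrics"),
    ("*.ascat_metrics.tgz", "Ploidy and Purity Estimation", "Sanger ASCAT"),
    ("*.contamination_metrics.tgz", "Cross Sample Contamination", "Sanger verifyBamHomChk.pl"),
    ("*.genotyped_gender_metrics.tgz", "Genotyping Inferred Gender", "Sanger compareBamGenotypes.pl"),
]


def classify_qc_file(filename):
    """Return (section, tool) for a QC metrics filename, or None if unrecognized."""
    for pattern, section, tool in QC_FILE_TYPES:
        # fnmatchcase keeps the match case-sensitive on every platform.
        if fnmatchcase(filename, pattern):
            return section, tool
    return None
```

The patterns do not overlap, since each one requires a dot immediately before its distinguishing suffix, so the order of the entries does not affect the result.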