Quality Control Metrics
ICGC ARGO provides access to data from ARGO member programs generated through the standardized ARGO Analysis pipelines. This page lists the quality control (QC) metrics analyses and data files generated by those pipelines.
Quality control metrics are collected and recorded at several checkpoints in the ARGO Analysis pipelines to ensure that only high-quality data is released. Each set of QC metrics is associated with the analysis or file set that it annotates.
Read Group QC
Data files containing read group (lane) level QC metrics.
File Types
Filename Pattern | Description | Analysis Type | Data Category | Generating Workflow(s) |
---|---|---|---|---|
*.ubam_qc_metrics.tgz | Generated by Picard CollectQualityYieldMetrics | qc_metrics | Quality Control Metrics | DNA Seq Alignment |
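The QC files on this page all carry a `.tgz` extension. Assuming that extension denotes a gzip-compressed tar archive (the page does not describe the internal layout), the contents of a downloaded file could be inspected with Python's standard library. The function name `list_qc_archive_members` is hypothetical and used here only for illustration.

```python
import tarfile


def list_qc_archive_members(archive_path):
    """Return the sorted names of regular files inside a .tgz archive.

    Assumption: the archive is a gzip-compressed tar file, as the
    .tgz extension suggests. The names and layout of the members
    are not specified by this page.
    """
    with tarfile.open(archive_path, mode="r:gz") as archive:
        return sorted(
            member.name for member in archive.getmembers() if member.isfile()
        )
```

Listing the members first is a cheap way to see which metrics files an archive holds before extracting anything.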
Alignment QC
Data files containing quality control metrics for aligned CRAM files.
File Types
Filename Pattern | Description | Analysis Type | Data Category | Generating Workflow(s) |
---|---|---|---|---|
*.qc_metrics.tgz | Generated by Samtools stats | qc_metrics | Quality Control Metrics | DNA Seq Alignment |
*.bas_metrics.tgz | Generated by Sanger bam_stats.pl script | qc_metrics | Quality Control Metrics | Sanger WGS Variant Calling, Sanger WXS Variant Calling |
Duplicates Metrics
Data files containing duplicates metrics for aligned CRAM files. Reads that map to the same position in the genome are identified and tagged as duplicates in the CRAM file, where duplicate reads are defined as originating from a single fragment of DNA.
File Types
Filename Pattern | Description | Analysis Type | Data Category | Generating Workflow(s) |
---|---|---|---|---|
*.duplicates_metrics.tgz | Generated by biobambam2 bammarkduplicates2 | qc_metrics | Quality Control Metrics | DNA Seq Alignment |
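The idea of tagging reads that map to the same position can be illustrated with a deliberately simplified sketch. This is not the algorithm used by biobambam2: real duplicate markers also consider paired-end mates and read orientation. The record fields (`name`, `chrom`, `pos`, `strand`, `qual_sum`) and the rule of keeping the read with the highest summed base quality are assumptions made for illustration.

```python
from collections import defaultdict


def mark_duplicates(reads):
    """Return the names of reads flagged as duplicates.

    Simplified illustration only: reads are grouped by
    (chrom, pos, strand), and every read in a group except the one
    with the highest 'qual_sum' is flagged. On ties, the first read
    in input order is kept.
    """
    groups = defaultdict(list)
    for read in reads:
        groups[(read["chrom"], read["pos"], read["strand"])].append(read)

    duplicates = set()
    for group in groups.values():
        best = max(group, key=lambda r: r["qual_sum"])
        duplicates.update(r["name"] for r in group if r is not best)
    return duplicates


# Hypothetical reads: "a" and "b" share a position, "c" does not.
example_reads = [
    {"name": "a", "chrom": "chr1", "pos": 100, "strand": "+", "qual_sum": 30},
    {"name": "b", "chrom": "chr1", "pos": 100, "strand": "+", "qual_sum": 40},
    {"name": "c", "chrom": "chr1", "pos": 250, "strand": "+", "qual_sum": 35},
]
flagged = mark_duplicates(example_reads)
```

A duplicates metrics file summarizes the outcome of this kind of tagging, such as how many reads were flagged.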
OxoG Metrics
Data files containing OxoG metrics. OxoG metrics quantify the error rate resulting from oxidative artifacts in aligned CRAM files.
File Types
Filename Pattern | Description | Analysis Type | Data Category | Generating Workflow(s) |
---|---|---|---|---|
*.oxog_metrics.tgz | Generated by GATK CollectOxoGMetrics | qc_metrics | Quality Control Metrics | DNA Seq Alignment |
Ploidy and Purity Estimation
Data files containing tumour purity and ploidy estimates.
File Types
Filename Pattern | Description | Analysis Type | Data Category | Generating Workflow(s) |
---|---|---|---|---|
*.ascat_metrics.tgz | Estimated by Sanger ASCAT CNV caller | qc_metrics | Quality Control Metrics | Sanger WGS Variant Calling |
Cross Sample Contamination
Data files containing cross-sample contamination estimates.
File Types
Filename Pattern | Description | Analysis Type | Data Category | Generating Workflow(s) |
---|---|---|---|---|
*.contamination_metrics.tgz | Estimated by Sanger verifyBamHomChk.pl script, which provides information to determine whether the sample is possibly contaminated or swapped | qc_metrics | Quality Control Metrics | Sanger WGS Variant Calling |
Genotyping Inferred Gender
Data files containing genotype comparison results for CRAM files from the same donor, including the fraction of matched genotypes and the inferred donor gender.
File Types
Filename Pattern | Description | Analysis Type | Data Category | Generating Workflow(s) |
---|---|---|---|---|
*.genotyped_gender_metrics.tgz | Generated by Sanger compareBamGenotypes.pl script | qc_metrics | Quality Control Metrics | Sanger WGS Variant Calling |
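The filename patterns in the tables on this page can be used to sort a set of downloaded QC files by metric type. The following is a minimal sketch that treats each pattern as a shell-style glob. The function name `classify_qc_file` and the example filename are hypothetical.

```python
from fnmatch import fnmatchcase

# Filename patterns and section names as listed on this page.
QC_FILE_PATTERNS = {
    "*.ubam_qc_metrics.tgz": "Read Group QC",
    "*.qc_metrics.tgz": "Alignment QC",
    "*.bas_metrics.tgz": "Alignment QC",
    "*.duplicates_metrics.tgz": "Duplicates Metrics",
    "*.oxog_metrics.tgz": "OxoG Metrics",
    "*.ascat_metrics.tgz": "Ploidy and Purity Estimation",
    "*.contamination_metrics.tgz": "Cross Sample Contamination",
    "*.genotyped_gender_metrics.tgz": "Genotyping Inferred Gender",
}


def classify_qc_file(filename):
    """Return the QC section whose pattern matches filename, or None."""
    for pattern, section in QC_FILE_PATTERNS.items():
        # fnmatchcase gives case-sensitive matching on every platform.
        if fnmatchcase(filename, pattern):
            return section
    return None


# Hypothetical filename, for illustration only.
section = classify_qc_file("example_sample.oxog_metrics.tgz")
```

Because each pattern includes the leading dot, a name ending in `.ubam_qc_metrics.tgz` does not also match `*.qc_metrics.tgz`, so the lookup order does not affect the result.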