Data Releases
An ARGO data release is a curated data set of clinical and molecular data submitted to the ARGO Data Platform. Data releases happen approximately four times a year and are cumulative in nature. Released data can be browsed using the File Repository and downloaded using a client tool, provided that access to controlled data has been granted. To request access to controlled data, see the DACO application process.
Data Release 11.0
Release Date: December 03, 2024
New Updates
Data Release 11.0 features 37 new donors and updates to 405 existing donors from the following programs:
- CRUK Grand Challenge – Mutographs (MUTO-INTL)
- BC Cancer Personalized OncoGenomics Program (POG-CA)
- Polyethnic-1000 (P1000-US)
Program | | Clinical | DNA-Seq Alignment | Mutect Variant | Sanger Variant | RNA-Seq Hisat2 | RNA-Seq STAR | Total Changes |
---|---|---|---|---|---|---|---|---|
POG-CA | New | 37 | 36 | 36 | 25 | 24 | 24 | 37 |
Update | 1 | 1 | 393 | 394 | 394 | |||
MUTO-INTL | New | - | ||||||
Update | 1 | 1 | ||||||
P1000-US | New | - | ||||||
Update | 10 | 10 | ||||||
DR11.0 Summary | New | 37 | 36 | 36 | 25 | 24 | 24 | 37 |
Update | 1 | 12 | 393 | 394 | 405 |
Due to processing errors, 133 files have been rescinded.
Data Release 10.0
Release Date: September 25, 2024
New Updates
Data Release 10.0 features 296 new donors and updates to 397 existing donors from the following programs:
- CRUK Grand Challenge – Mutographs (MUTO-INTL)
- BC Cancer Personalized OncoGenomics Program (POG-CA)
- Polyethnic-1000 (P1000-US)
Program | | Clinical | DNA-Seq Alignment | Mutect Variant | Sanger Variant | Total Changes |
---|---|---|---|---|---|---|
POG-CA | New | 217 | 217 | 216 | 216 | 217 |
Update | 239 | 239 | ||||
MUTO-INTL | New | 79 | 79 | 79 | 78 | 79 |
Update | 1 | 155 | 156 | |||
P1000-US | New | |||||
Update | 1 | 1 | 1 | 2 | ||
DR10.0 Summary | New | 296 | 296 | 295 | 294 | 296 |
Update | 1 | 2 | 395 | 397 |
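The "New" rows of the Data Release 10.0 table can be cross-checked with simple arithmetic: the per-program counts in each column should add up to the DR10.0 Summary row. The sketch below is illustrative only. The variable and function names are hypothetical, and it assumes that the empty "New" row for P1000-US means zero new donors. The "Update" rows are left out because their column positions are not recoverable from the table as shown.

```python
# New-donor counts for Data Release 10.0, copied from the "New" rows of the table.
# Column order follows the table header.
COLUMNS = [
    "Clinical",
    "DNA-Seq Alignment",
    "Mutect Variant",
    "Sanger Variant",
    "Total Changes",
]

new_by_program = {
    "POG-CA": [217, 217, 216, 216, 217],
    "MUTO-INTL": [79, 79, 79, 78, 79],
    # Assumption: the empty "New" row for P1000-US means no new donors.
    "P1000-US": [0, 0, 0, 0, 0],
}

# Values reported in the "DR10.0 Summary | New" row.
reported_summary = [296, 296, 295, 294, 296]


def column_sums(rows):
    """Sum each column across all programs."""
    return [sum(values) for values in zip(*rows.values())]


computed = column_sums(new_by_program)

for name, got, expected in zip(COLUMNS, computed, reported_summary):
    status = "ok" if got == expected else "MISMATCH"
    print(f"{name}: computed {got}, reported {expected} ({status})")
```

Every column is reported as `ok`. Summing the "Total Changes" column across programs assumes each donor belongs to exactly one program.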
Data Release 9.0
Release Date: May 24, 2024
New Updates
Data Release 9.0 includes new clinical and molecular data from the CRUK Grand Challenge – Mutographs (MUTO-INTL), BC Cancer Personalized OncoGenomics Program (POG-CA) and Polyethnic-1000 (P1000-US).
This release includes:
- MUTO-INTL: 913 new donors and 102 updated donors
  - 913 new donors with clinical data, and aligned WGS reads.
  - 912/913 donors additionally have Mutect2 variant calls
  - 809/913 donors additionally have Sanger variant calls
  - 102 previously released donors have new Sanger variant calls
- POG-CA: 300 new donors
  - 300 new donors with clinical data, aligned WGS reads, and Mutect2 variant calls
  - 58/300 donors additionally have Sanger variant calls
- P1000-US: 10 new donors and 53 updated donors
  - 4/10 new donors with clinical data, aligned WGS reads, and Mutect2 variant calls
  - 6/10 new donors with clinical data, aligned RNA-Seq reads
  - 13 previously released donors have new Sanger variant calls
  - 53 previously released donors have new aligned RNA-Seq reads
Data Release 8.0
Release Date: March 11, 2024
Data Release 8.0 includes new clinical and molecular data from the Mutographs (MUTO-INTL) and Polyethnic-1000 (P1000-US) programs.
This release includes:
- MUTO-INTL: 448 donors with clinical data, aligned WGS reads, and Mutect2 variant calls
- MUTO-INTL: 285/448 donors additionally have Sanger variant calls
- MUTO-INTL: 9/448 donors have open access Sanger variant calls
- P1000-US: 67 donors with clinical data, aligned WGS reads, and Mutect2 variant calls
Data Release 7.0
Release Date: November 06, 2023
New Updates
ICGC ARGO is excited to announce the first release of new clinical data, available now for the Australian Pancreatic Cancer Genome Initiative (APGI-AU), Pancreatic Cancer (PACA-CA), and Papillary Thyroid Cancer (PTC-SA) programs. APGI-AU was a member of the original ICGC-25K data portal as PACA-AU.
This release includes:
- APGI-AU: New clinical data has been added for 84 donors
- PACA-CA: New clinical data has been added for 38 donors
- PTC-SA: New clinical data has been added for 239 donors
Data Release 6.0
Release Date: April 05, 2023
New Updates
ICGC ARGO is excited to announce the first release of a new RNA-Seq analysis workflow: RNA Seq Alignment, available now for transcriptome sequencing data from Australian Pancreatic Cancer Genome Initiative (APGI-AU). APGI-AU was a member of the original ICGC-25K data portal as PACA-AU. While clinical data is being actively submitted by the program, a selection of its RNA-Seq data has been reprocessed against the latest GRCh38 Human Reference Genome using the ARGO RNA Seq Alignment Workflow.
This release includes:
- APGI-AU: 28 previously released donors now have:
  - STAR aligned reads in both genome and transcript coordinates
  - HISAT2 aligned reads in genome coordinates
  - High confidence splice junctions generated by both STAR and HISAT2
Data Release 5.0
Release Date: March 7, 2022
New Updates
Data Release 5.0 includes the first release of a new DNA-Seq Pipeline: Open Access Variant Filtering Workflow, available now for both WXS and WGS samples.
This release includes:
- APGI-AU: 89 previously released donors now have Open Access variant calls
- PACA-CA: 136 previously released donors now have Open Access variant calls
- OCCAMS-GB: 400 previously released donors now have Open Access variant calls
- PTC-SA: 239 previously released donors now have Open Access variant calls
- LUCA-KR: 167 previously released donors now have Open Access variant calls
Data Release 4.0
Release Date: September 27, 2021
New Updates
This release makes data from a new program available:
- APGI-AU: 89 new donors with aligned WXS reads, Sanger and Mutect2 variant calls.
Additionally, more data has been released for:
- OCCAMS-GB: 168 new donors with aligned WGS reads, Sanger and Mutect2 variant calls.
- LUCA-KR: 8 new donors with aligned reads, Sanger and Mutect2 variant calls.
Data Release 3.0
Release Date: May 10, 2021
New Updates
We are happy to announce a release of a new DNA-Seq Pipeline Variant calling data type: GATK Mutect2, available now for both WXS and WGS samples.
This release includes:
- PACA-CA: 1 new donor with aligned reads, Sanger and Mutect2 variant calls. Additionally, 135 previously released donors now have GATK Mutect2 variant calls.
- OCCAMS-GB: 130 new donors with aligned WGS reads, Sanger and Mutect2 variant calls. 94 previously released donors additionally have Mutect2 variant calls.
- PTC-SA: 99 new donors with aligned WXS reads, Sanger and Mutect2 variant calls. Additionally, 140 previously released donors now have Mutect2 variant calls.
- LUCA-KR: 130 new donors with aligned reads, Sanger and Mutect2 variant calls. Additionally, 29 previously released donors now have Mutect2 variant calls.
Bug Fixes
- PACA-CA/DO35226 was removed because only its tumour sample has completed workflow processing. We are working on resolving processing issues for this donor.
Data Release 2.0
Release Date: October 23, 2020
New Updates
Data Release 2.0 includes the first release of whole exome sequencing (WXS) data from papillary thyroid cancer (PTC-SA) and the first release of whole genome sequencing (WGS) data from lung cancer (LUCA-KR). Both programs were members of the original ICGC-25K data portal. While clinical data will soon be submitted by the programs, a selection of their molecular data has been reprocessed against the latest GRCh38 Human Reference Genome using the ARGO DNA Seq Pipeline.
This release includes:
- PACA-CA: 52 new donors with WGS data, for a total of 133 donors with WGS reads and 121 donors with variant calls
- PACA-CA: 12 new donors with aligned WXS reads
- LUCA-KR: 29 donors with aligned WGS reads, and 29 donors with variant calls.
- PTC-SA: 142 donors with aligned WXS reads, and 140 donors with variant calls.
In addition to these genomic files, the first release of pipeline Quality Control (QC) metrics is also included. For a breakdown of the provided QC metrics, please review the QC metrics file type documentation. For an explanation of how QC metrics are generated through the pipeline, please review the DNA-Seq Analysis Pipeline documentation.
Bug Fixes
- Three OCCAMS-GB donors had alignment and variant data removed from the release due to a file-naming issue in the analysis workflow. The data is being reprocessed and will be included in the next release.
Known Issues
None to report.
Data Release 1.0
Release Date: June 19, 2020
New Updates
ICGC ARGO is excited to announce its initial data release, including whole genome sequencing (WGS) data from pancreatic cancer (PACA-CA) and esophageal cancer (OCCAMS-GB). Both programs were members of the original ICGC-25K data portal. While clinical data will soon be submitted by the programs, a selection of their molecular data has been reprocessed against the latest GRCh38 Human Reference Genome using the ARGO DNA Seq Pipeline. The resulting somatic mutation calls include single nucleotide variations (SNVs), insertions and deletions (indels), copy number variations (CNVs) and structural variations (SVs).
This release includes:
- PACA-CA: 81 donors with aligned WGS reads, and 62 donors with variant calls.
- OCCAMS-GB: 96 donors with aligned WGS reads, and 95 donors with variant calls.