Data Dictionary

The ICGC ARGO Data Dictionary expresses the details of the data model, which adheres to specific formats and restrictions to ensure a standard of data quality. The following describes the attributes and permissible values for all of the fields within the clinical tsv files for the ARGO Data Platform.

Version 1.24 (2024-10-01)

Compared with Version 1.23 (2024-08-19): 0 new fields, 0 updated fields, 0 deleted fields
15 files with 244 fields

Sample Registration (sample_registration)

9 Fields
The collection of data elements required to register the required Donor-Specimen-Sample data to the ARGO Data Platform. Registration of samples is required before molecular and clinical data submission can proceed.
File Name Example: sample_registration[-optional-extension].tsv
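The file name pattern above can be checked mechanically before upload. The following is a minimal sketch, not the platform's own validation: `matches_schema_file` is a hypothetical helper, and it assumes the optional extension is a hyphen followed by any characters other than path separators, since the dictionary does not state which characters are allowed.

```python
import re

def matches_schema_file(filename: str, schema: str) -> bool:
    """Return True if filename looks like '<schema>[-optional-extension].tsv'.

    Assumption: the optional extension is '-' followed by one or more
    characters that are not path separators; the dictionary does not
    define the allowed characters.
    """
    pattern = re.compile(rf"{re.escape(schema)}(-[^/\\]+)?\.tsv")
    return pattern.fullmatch(filename) is not None
```

The same helper can be reused for the other clinical files by passing their schema names, for example `donor` or `specimen`.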
Field & Description
Data Tier
Attributes
Type
Permissible Values
Notes & Scripts
program_id
Unique identifier of the ARGO program.
ID
Required
TEXT
This is the unique id that is assigned to your program. If you have logged into the platform, this is the Program Id that you see in the Program Services area. For example, TEST-CA is a Program ID.
submitter_donor_id
Unique identifier of the donor, assigned by the data provider.
ID
Required
TEXT
Values must meet the regular expression
gender
Description of the donor self-reported gender. Gender is described as the assemblage of properties that distinguish people on the basis of their societal roles.
Core
Required
TEXT
Female
Male
Other
submitter_specimen_id
Unique identifier of the specimen, assigned by the data provider.
ID
Required
TEXT
Values must meet the regular expression
specimen_tissue_source
Tissue source of the biospecimen.
Core
Required
TEXT
Blood derived - bone marrow
Blood derived - peripheral blood
Blood derived
Bone marrow
Bone
21 more
tumour_normal_designation
Description of the specimen's tumour/normal status for data processing.
Core
Required
TEXT
Normal
Tumour
specimen_type
Description of the kind of specimen that was collected with respect to tumour/normal tissue origin.
Core
Required
TEXT
Cell line - derived from metastatic tumour
Cell line - derived from normal
Cell line - derived from tumour
Cell line - derived from xenograft tumour
Metastatic tumour - additional metastatic
13 more
submitter_sample_id
Unique identifier of the sample, assigned by the data provider.
ID
Required
TEXT
Values must meet the regular expression
sample_type
Description of the type of molecular sample used for testing.
Core
Required
TEXT
Amplified DNA
ctDNA
Other DNA enrichments
Other RNA fractions
polyA+ RNA
3 more
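
All nine sample_registration fields are marked Required, so a pre-submission check can flag empty cells and out-of-vocabulary values. This is a rough sketch under stated assumptions: `check_registration_rows` is a hypothetical helper, and it only checks permissible values for the two fields whose value lists appear in full above (gender and tumour_normal_designation). The truncated lists and the ID regular expressions are not reproduced here.

```python
import csv
import io

# Field names as listed in the sample_registration table.
REQUIRED_FIELDS = [
    "program_id", "submitter_donor_id", "gender", "submitter_specimen_id",
    "specimen_tissue_source", "tumour_normal_designation", "specimen_type",
    "submitter_sample_id", "sample_type",
]

# Only fields whose permissible values are shown in full in the table.
PERMISSIBLE = {
    "gender": {"Female", "Male", "Other"},
    "tumour_normal_designation": {"Normal", "Tumour"},
}

def check_registration_rows(tsv_text: str) -> list[str]:
    """Return a list of human-readable problems found in the TSV text."""
    errors = []
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    for row_number, row in enumerate(reader, start=1):
        # Every field in this file is Required, so empty cells are errors.
        for field in REQUIRED_FIELDS:
            if not (row.get(field) or "").strip():
                errors.append(f"row {row_number}: {field} is required")
        # Check controlled vocabularies where the full list is known.
        for field, allowed in PERMISSIBLE.items():
            value = (row.get(field) or "").strip()
            if value and value not in allowed:
                errors.append(f"row {row_number}: {field}={value!r} is not permissible")
    return errors
```

An empty list means no problems were found by this sketch; it does not guarantee the file will pass the platform's own validation.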

Donor (donor)

19 Fields
The collection of data elements related to a specific donor in an ARGO program.
File Name Example: donor[-optional-extension].tsv
Field & Description
Data Tier
Attributes
Type
Permissible Values
Notes & Scripts
program_id
Unique identifier of the ARGO program.
ID
Required
TEXT
submitter_donor_id
Unique identifier of the donor, assigned by the data provider.
ID
Required
TEXT
Values must meet the regular expression
vital_status
Donor's last known state of living or deceased.
Core
Required
TEXT
Alive
Deceased
cause_of_death
Indicate the cause of a donor's death.
Core
Conditional
TEXT
Died of cancer
Died of other reasons
Unknown
Cause of death is only required to be submitted if the donor's vital_status is Deceased.
survival_time
Interval of how long the donor has survived since primary diagnosis, in days.
Core
Conditional
INTEGER
survival_time is only required to be submitted if the donor's vital_status is Deceased.
primary_site
The text term used to describe the primary site of disease, as categorized by the World Health Organization's (WHO) International Classification of Diseases for Oncology (ICD-O). This categorization groups cases into general categories.
Core
Required
TEXT
Accessory sinuses
Adrenal gland
Anus and anal canal
Base of tongue
Bladder
65 more
To include multiple values, separate values with a pipe delimiter '|' within your file.
height
Indicate the donor's height, in centimeters (cm).
Extended
NUMBER
weight
Indicate the donor's weight, in kilograms (kg).
Extended
NUMBER
bmi
Indicate the donor's Body Mass Index (BMI) in kg/m².
Extended
NUMBER
genetic_disorders
Indicate presence of any hereditary genetic disorders. Genetic diseases are diseases in which inherited genes predispose to increased risk. The genetic disorders associated with cancer often result from an alteration or mutation in a single gene. The diseases range from rare dominant cancer family syndrome to familial tendencies in which low-penetrance genes may interact with other genes or environmental factors to induce cancer. (References: NCIt C3101. Genetic disorder names were standardized using Orphanet (https://www.orpha.net/) and NCI Thesaurus)
Extended
TEXT
Alpha-1-antitrypsin Deficiency
Ataxia Telangiectasia Syndrome
BAP1-related Tumor Predisposition Syndrome
Beckwith-Wiedemann Syndrome
Birt-Hogg-Dubé Syndrome
40 more
If the genetic disorder term you use is not included in the controlled terminology, please contact us at https://platform.icgc-argo.org/contact to request it be added. To include multiple values, separate values with a pipe delimiter '|' within your file.
menopause_status
Indicate the donor's menopause status at the time of primary diagnosis. (Reference: caDSR CDE ID 2434914)
Extended
TEXT
Not applicable
Perimenopausal
Postmenopausal
Premenopausal
Unknown
age_at_menarche
Indicate the donor's age, in years, at which the first menstruation event occurred. (Reference: NCIt C19666)
Extended
INTEGER
number_of_pregnancies
Indicate the total number of pregnancy events experienced by the donor. (Reference: NCIt C106551)
Extended
INTEGER
number_of_children
Indicate the number of children the donor has birthed. (Reference: caDSR CDE ID 2486644)
Extended
INTEGER
hrt_type
Indicate the type of hormone replacement therapy (HRT) the donor has taken or is currently taking.
Extended
TEXT
Combination HRT
Estrogen-only HRT
Injectable
Never taken HRT
Not applicable
4 more
hrt_duration
If donor has taken hormone replacement therapy (HRT), indicate how long donor has been taking HRT, in months. (Reference: caDSR CDE ID 5365433)
Extended
Conditional
INTEGER
contraception_type
Indicate the type of hormonal contraception the donor has taken or is currently taking. (Reference: caDSR CDE ID 3264234)
Extended
TEXT
Combination pill
Contraceptive implant
Contraceptive patch
Injectable
Intrauterine device
7 more
contraception_duration
If donor has taken hormonal contraception, indicate duration of use, in months. (Reference: caDSR CDE ID 5206887)
Extended
Conditional
INTEGER
lost_to_followup_after_clinical_event_id
If the donor became lost to follow-up, indicate the identifier of the clinical event (e.g., submitter_primary_diagnosis_id, submitter_treatment_id or submitter_follow_up_id) after which the donor became lost to follow-up.
Extended
TEXT
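
Two rules in the donor table lend themselves to a quick check: cause_of_death and survival_time are required only when vital_status is Deceased, and multi-value fields such as primary_site and genetic_disorders use a pipe delimiter. The sketch below illustrates both. The helper names are hypothetical, and trimming whitespace around each pipe-separated value is an assumption rather than documented platform behaviour.

```python
def split_multi_value(cell: str) -> list[str]:
    """Split a pipe-delimited cell into individual values.

    Assumption: surrounding whitespace is not significant and empty
    segments are ignored.
    """
    return [part.strip() for part in cell.split("|") if part.strip()]

def check_donor_conditionals(row: dict[str, str]) -> list[str]:
    """Flag conditional donor fields that are missing for a deceased donor."""
    errors = []
    if row.get("vital_status") == "Deceased":
        for field in ("cause_of_death", "survival_time"):
            if not row.get(field, "").strip():
                errors.append(f"{field} is required when vital_status is Deceased")
    return errors
```

For a donor whose vital_status is Alive, the second function returns an empty list regardless of the other two fields.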

Specimen (specimen)

24 Fields
The collection of data elements related to a donor's specimen. A specimen is any material sample taken for testing, diagnostic or research purposes.
File Name Example: specimen[-optional-extension].tsv
Field & Description
Data Tier
Attributes
Type
Permissible Values
Notes & Scripts
program_id
Unique identifier of the ARGO program.
ID
Required
TEXT
submitter_donor_id
Unique identifier of the donor, assigned by the data provider.
ID
Required
TEXT
Values must meet the regular expression
submitter_specimen_id
Unique identifier of the specimen, assigned by the data provider.
ID
Required
TEXT
Values must meet the regular expression
submitter_primary_diagnosis_id
Indicate the primary diagnosis event in the clinical timeline that this specimen acquisition was related to.
ID
Required
TEXT
Values must meet the regular expression
pathological_tumour_staging_system
Specify the tumour staging system used to assess the cancer at the time the tumour specimen was resected. Pathological classification is based on the clinical stage information (acquired before treatment) and supplemented/modified by operative findings and pathological evaluation of the resected specimen.
Core
Conditional
TEXT
AJCC 8th edition
AJCC 7th edition
AJCC 6th edition
Ann Arbor staging system
Binet staging system
6 more
This field is only required if the specimen is a tumour.
pathological_t_category
The code to represent the stage of cancer defined by the size or contiguous extension of the primary tumour (T), according to criteria based on multiple editions of the AJCC's Cancer Staging Manual.
Core
Conditional
TEXT
T0
T1
T1a
T1a1
T1a2
47 more
This field is required only if the selected pathological_tumour_staging_system is any edition of the AJCC cancer staging system.
pathological_n_category
The code to represent the stage of cancer defined by whether or not the cancer has reached nearby lymph nodes (N), according to criteria based on multiple editions of the AJCC's Cancer Staging Manual.
Core
Conditional
TEXT
N0
N0a
N0a (biopsy)
N0b
N0b (no biopsy)
21 more
This field is required only if the selected pathological_tumour_staging_system is any edition of the AJCC cancer staging system.
pathological_m_category
The code to represent the stage of cancer defined by whether there are distant metastases (M), meaning spread of cancer to other parts of the body, according to criteria based on multiple editions of the AJCC's Cancer Staging Manual.
Core
Conditional
TEXT
M0
M0(i+)
M1
M1a
M1a(0)
13 more
This field is required only if the selected pathological_tumour_staging_system is any edition of the AJCC cancer staging system.
pathological_stage_group
Specify the tumour stage, based on pathological_tumour_staging_system, used to assess the cancer at the time the tumour specimen was resected.
Core
Conditional
TEXT
Occult Carcinoma
Stage 0
Stage 0a
Stage 0is
Stage 1
78 more
This field depends on the selected pathological_tumour_staging_system, and is only required if the specimen is a tumour. Please refer to the documentation for Tumour Staging Classifications: http://docs.icgc-argo.org/docs/submission/dictionary-overview#tumour-staging-classifications
specimen_acquisition_interval
Interval between primary diagnosis and specimen acquisition, in days.
Core
Required
INTEGER
The associated primary diagnosis is used as the reference point for this interval. To calculate this, find the number of days since the date of primary diagnosis.
tumour_histological_type
The code to represent the histology (morphology) of neoplasms that is usually obtained from a pathology report, according to the International Classification of Diseases for Oncology, 3rd Edition (WHO ICD-O-3). Refer to the ICD-O-3 manual for guidelines at https://apps.who.int/iris/handle/10665/42344.
Core
Conditional
TEXT
Values must meet the regular expression
This field is only required if the specimen is a tumour.
specimen_anatomic_location
Indicate the ICD-O-3 topography code for the anatomic location of a specimen when it was collected. Refer to the guidelines provided in the ICD-O-3 manual at https://apps.who.int/iris/handle/10665/42344.
Core
Required
TEXT
Values must meet the regular expression

Examples:
specimen_laterality
For cancer in a paired organ, indicate the side on which the specimen was obtained. (Reference caDSR CDE ID 2007875)
Extended
TEXT
Left
Not applicable
Right
Unknown
specimen_processing
Indicate the technique used to process specimen.
Extended
TEXT
Cryopreservation in liquid nitrogen (dead tissue)
Cryopreservation in dry ice (dead tissue)
Cryopreservation of live cells in liquid nitrogen
Cryopreservation - other
Formalin fixed & paraffin embedded
5 more
specimen_storage
Indicate the method of specimen storage for specimens that were not freshly extracted or immediately cultured.
Extended
TEXT
Cut slide
Frozen in -70 freezer
Frozen in liquid nitrogen
Frozen in vapour phase
Not Applicable
4 more
For specimens that were freshly extracted or immediately cultured, select Not Applicable.
reference_pathology_confirmed
Indicate whether the pathological diagnosis was confirmed by a (central) reference pathologist. (Reference caDSR CDE ID 2007007)
Core
Conditional
TEXT
Yes
No
Unknown
This field is only required if the specimen is a tumour.
tumour_grading_system
Specify the tumour grading system used to describe a tumour based on how abnormal the tumour cells and the tumour tissue look under a microscope. Tumour grade is an indicator of how quickly a tumour is likely to grow.
Core
Conditional
TEXT
FNCLCC grading system
Four-tier grading system
Gleason grade group system
Grading system for GISTs
Grading system for GNETs
6 more
This field is only required if the specimen is a tumour.
tumour_grade
Grade of the tumour as assigned by the reporting tumour_grading_system.
Core
Conditional
TEXT
Low grade
High grade
GX
G1
G2
13 more
This field depends on the selected tumour_grading_system, and is only required if the specimen is a tumour. Please refer to the documentation for Tumour Grading Classifications: http://docs.icgc-argo.org/docs/submission/dictionary-overview#tumour-grading-classifications
percent_tumour_cells
Indicate a value, in decimals, that represents the percent of tumour cells compared to the number of total cells in a specimen. (Reference: NCIt C159484)
Core
Conditional
NUMBER
This field is only required if the specimen is a tumour.
percent_tumour_cells_measurement_method
Indicate method used to measure percent_tumour_cells.
Core
Conditional
TEXT
Genomics
Image analysis
Pathology estimate by percent nuclei
This field is only required if the specimen is a tumour.
percent_proliferating_cells
Indicate a value, in decimals, that represents the count of proliferating cells determined during pathologic review of the specimen.
Extended
Conditional
NUMBER
This field should only be submitted if the specimen is a tumour.
percent_inflammatory_tissue
Indicate a value, in decimals, that represents the percent of a specimen that is positive for inflammatory markers, including the presence of capillary dilatation, edema and increased leukocytes. (Reference NCIt C159479)
Extended
Conditional
NUMBER
This field should only be submitted if the specimen is a tumour.
percent_stromal_cells
Indicate a value, in decimals, that represents the percentage of reactive cells that are present in a tumour specimen but are not malignant such as fibroblasts, vascular structures, etc. (Reference caDSR CDE ID 2841241)
Extended
Conditional
NUMBER
This field should only be submitted if the specimen is a tumour.
percent_necrosis
Indicate a value, in decimals, that represents the percent of cells undergoing necrosis compared to the number of total cells present in a tumour specimen. (Reference NCIt C159481)
Extended
Conditional
NUMBER
This field should only be submitted if the specimen is a tumour.
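
The pathological T, N and M categories in this file are required only when pathological_tumour_staging_system is any edition of the AJCC system. A minimal sketch of that rule follows. The helper names are hypothetical, and detecting an AJCC edition by the "AJCC" prefix is an assumption based on the values listed above ("AJCC 8th edition", "AJCC 7th edition", "AJCC 6th edition").

```python
TNM_FIELDS = (
    "pathological_t_category",
    "pathological_n_category",
    "pathological_m_category",
)

def is_ajcc(staging_system: str) -> bool:
    """Assumption: every AJCC edition value starts with 'AJCC'."""
    return staging_system.strip().startswith("AJCC")

def missing_tnm_fields(row: dict[str, str]) -> list[str]:
    """Return the T/N/M fields that are required but empty for this row."""
    if not is_ajcc(row.get("pathological_tumour_staging_system", "")):
        return []
    return [field for field in TNM_FIELDS if not row.get(field, "").strip()]
```

The clinical_* staging fields in the primary_diagnosis file follow the same pattern and could be checked by swapping the field names.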

Primary Diagnosis (primary_diagnosis)

19 Fields
The collection of data elements related to a donor's primary diagnosis. The primary diagnosis is the first diagnosed case of cancer in a donor. To submit multiple primary diagnoses for a single donor, submit multiple rows in the primary diagnosis file for this donor.
File Name Example: primary_diagnosis[-optional-extension].tsv
Field & Description
Data Tier
Attributes
Type
Permissible Values
Notes & Scripts
program_id
Unique identifier of the ARGO program.
ID
Required
TEXT
submitter_donor_id
Unique identifier of the donor, assigned by the data provider.
ID
Required
TEXT
Values must meet the regular expression
submitter_primary_diagnosis_id
Unique identifier of the primary diagnosis event, assigned by the data provider.
ID
Required
TEXT
Values must meet the regular expression
age_at_diagnosis
Age that the donor was first diagnosed with cancer, in years. This should be based on the earliest diagnosis of cancer.
Core
Required
INTEGER
cancer_type_code
The code to represent the cancer type using the WHO ICD-10 code (https://icd.who.int/browse10/2019/en) classification.
Core
Required
TEXT
Values must meet the regular expression
cancer_type_additional_information
Additional details related to the cancer type that are not covered by the ICD-10 code provided in the cancer_type_code field.
Extended
TEXT
basis_of_diagnosis
Indicate the most valid basis of how the primary diagnosis was identified. If more than one diagnosis technique was used, select the term that has the highest code number (see notes). (Reference: IACR Standard for Basis of Diagnosis http://www.iacr.com.fr/images/doc/basis.pdf)
Extended
TEXT
Clinical investigation
Clinical
Cytology
Death certificate only
Histology of a metastasis
3 more
0: Death certificate only: Information provided is from a death certificate.
1: Clinical: Diagnosis made before death.
2: Clinical investigation: All diagnostic techniques, including X-ray, endoscopy, imaging, ultrasound, exploratory surgery (such as laparotomy), and autopsy, without a tissue diagnosis.
4: Specific tumour markers: Including biochemical and/or immunologic markers that are specific for a tumour site.
5: Cytology: Examination of cells from a primary or secondary site, including fluids aspirated by endoscopy or needle; also includes the microscopic examination of peripheral blood and bone marrow aspirates.
6: Histology of a metastasis: Histologic examination of tissue from a metastasis, including autopsy specimens.
7: Histology of a primary tumour: Histologic examination of tissue from primary tumour, however obtained, including all cutting techniques and bone marrow biopsies; also includes autopsy specimens of primary tumour.
9: Unknown: No information on how the diagnosis has been made.
laterality
For cancer in a paired organ, indicate the side of the body on which the primary tumour or cancer first developed at the time of primary diagnosis. (Reference caDSR CDE ID 827)
Extended
TEXT
Bilateral
Left
Midline
Not a paired site
Right
2 more
lymph_nodes_examined_status
Indicate if lymph nodes were examined for metastases.
Core
Required
TEXT
Cannot be determined
No
No lymph nodes found in resected specimen
Not applicable
Yes
lymph_nodes_examined_method
Indicate the method used to examine lymph nodes.
Core
Conditional
TEXT
Imaging
Lymph node dissection/pathological exam
Physical palpation of patient
number_lymph_nodes_examined
The total number of lymph nodes tested for the presence of cancer. (Reference: caDSR CDE ID 3)
Extended
Conditional
INTEGER
This field should only be submitted if 'lymph_nodes_examined_status' is 'Yes'.
number_lymph_nodes_positive
The number of regional lymph nodes reported as being positive for tumour metastases. (Reference: caDSR CDE ID 6113694)
Core
Conditional
INTEGER
This field is only required if 'lymph_nodes_examined_status' is 'Yes'.
clinical_tumour_staging_system
Indicate the tumour staging system used to stage the cancer at the time of primary diagnosis (prior to treatment).
Core
TEXT
AJCC 8th edition
AJCC 7th edition
AJCC 6th edition
Ann Arbor staging system
Binet staging system
6 more
clinical_t_category
The code to represent the extent of the primary tumour (T) based on evidence obtained from clinical assessment parameters determined at time of primary diagnosis and prior to treatment, according to criteria based on multiple editions of the AJCC's Cancer Staging Manual.
Core
Conditional
TEXT
T0
T1
T1a
T1a1
T1a2
47 more
This field is required only if the selected clinical_tumour_staging_system is any edition of the AJCC cancer staging system.
clinical_n_category
The code to represent the stage of cancer defined by the extent of the regional lymph node (N) involvement for the cancer based on evidence obtained from clinical assessment parameters determined at time of primary diagnosis and prior to treatment, according to criteria based on multiple editions of the AJCC's Cancer Staging Manual.
Core
Conditional
TEXT
N0
N0a
N0a (biopsy)
N0b
N0b (no biopsy)
21 more
This field is required only if the selected clinical_tumour_staging_system is any edition of the AJCC cancer staging system.
clinical_m_category
The code to represent the stage of cancer defined by the extent of the distant metastasis (M) for the cancer based on evidence obtained from clinical assessment parameters determined at time of primary diagnosis and prior to treatment, according to criteria based on multiple editions of the AJCC's Cancer Staging Manual. MX is NOT a valid category and cannot be assigned.
Core
Conditional
TEXT
M0
M0(i+)
M1
M1a
M1a(0)
13 more
This field is required only if the selected clinical_tumour_staging_system is any edition of the AJCC cancer staging system.
clinical_stage_group
Stage group of the tumour, as assigned by the reporting clinical_tumour_staging_system, that indicates the overall prognostic tumour stage.
Core
Conditional
TEXT
Occult Carcinoma
Stage 0
Stage 0a
Stage 0is
Stage 1
78 more
This field is dependent on the selected clinical_tumour_staging_system. Please refer to the documentation for Tumour Staging Classifications: http://docs.icgc-argo.org/docs/submission/dictionary-overview#tumour-staging-classifications
presenting_symptoms
Indicate presenting symptoms at time of primary diagnosis.
Extended
TEXT
Abdominal Pain
Anemia
Back Pain
Bloating
Cholangitis
21 more
To include multiple values, separate values with a pipe delimiter '|' within your file.
performance_status
Indicate the donor's performance status grade at the time of primary diagnosis. (Reference: ECOG performance score grades from https://ecog-acrin.org/resources/ecog-performance-status).
Extended
TEXT
Grade 0
Grade 1
Grade 2
Grade 3
Grade 4
1 more
Grade 0: Fully active, able to carry on all pre-disease performance without restriction.
Grade 1: Restricted in physically strenuous activity but ambulatory and able to carry out work of a light or sedentary nature (e.g., light housework, office work).
Grade 2: Ambulatory and capable of all selfcare but unable to carry out any work activities; up and about more than 50% of waking hours.
Grade 3: Capable of only limited selfcare; confined to bed or chair more than 50% of waking hours.
Grade 4: Completely disabled; cannot carry on any selfcare; totally confined to bed or chair.
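
The basis_of_diagnosis rule earlier in this section (when more than one diagnosis technique was used, select the term with the highest code number) can be expressed as a small lookup. This is an illustrative sketch: the code numbers come from the field's notes, `most_valid_basis` is a hypothetical helper, and the function applies the rule literally without treating "Unknown" (code 9) as a special case, which is an assumption about the intended reading.

```python
# Code numbers as given in the basis_of_diagnosis notes.
BASIS_CODES = {
    "Death certificate only": 0,
    "Clinical": 1,
    "Clinical investigation": 2,
    "Specific tumour markers": 4,
    "Cytology": 5,
    "Histology of a metastasis": 6,
    "Histology of a primary tumour": 7,
    "Unknown": 9,
}

def most_valid_basis(techniques: list[str]) -> str:
    """Return the term with the highest code number among those used."""
    return max(techniques, key=lambda term: BASIS_CODES[term])
```

For example, a diagnosis supported by clinical investigation and cytology would be recorded as "Cytology", since 5 is greater than 2.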

Treatment (treatment)

22 Fields
The collection of data elements related to a donor's treatment at a specific point in the clinical record. To submit multiple treatments for a single donor, submit multiple rows in the treatment file for this donor.
File Name Example: treatment[-optional-extension].tsv
Field & Description
Data Tier
Attributes
Type
Permissible Values
Notes & Scripts
program_id
Unique identifier of the ARGO program.
ID
Required
TEXT
submitter_donor_id
Unique identifier of the donor, assigned by the data provider.
ID
Required
TEXT
Values must meet the regular expression
submitter_treatment_id
Unique identifier of the treatment, assigned by the data provider.
ID
Required
TEXT
Values must meet the regular expression
submitter_primary_diagnosis_id
Indicate the primary diagnosis event in the clinical timeline that this treatment was related to.
ID
Required
TEXT
Values must meet the regular expression
treatment_type
Indicate the type of treatment regimen that the donor completed.
Core
Required
TEXT
Ablation
Bone marrow transplant
Chemotherapy
Endoscopic therapy
End of life care
8 more
Depending on the treatment_type(s) selected, additional treatment details may be required to be submitted. For example, if treatment_type includes 'Chemotherapy', the supplemental Chemotherapy treatment type file is required. To include multiple values, separate values with a pipe delimiter '|' within your file.
is_primary_treatment
Indicate if the treatment was the primary treatment following the initial diagnosis.
Core
Conditional
TEXT
Yes
No
line_of_treatment
Indicate the line of treatment if it is not the primary treatment.
Extended
Conditional
INTEGER
treatment_start_interval
The interval between the primary diagnosis and initiation of treatment, in days.
Core
Conditional
INTEGER
The associated primary diagnosis is used as the reference point for this interval. To calculate this, find the number of days since the date of primary diagnosis.
treatment_duration
The duration of treatment regimen, in days.
Core
Conditional
INTEGER
days_per_cycle
Indicate the number of days in a treatment cycle.
Extended
Conditional
INTEGER
number_of_cycles
Indicate the number of treatment cycles.
Extended
Conditional
INTEGER
treatment_intent
Indicate the purpose of the treatment, or the desired effect or outcome resulting from the treatment. (Reference: mCODE/FHIR)
Core
Conditional
TEXT
Curative
Diagnostic
Forensic
Guidance
Palliative
3 more
treatment_setting
Indicate the treatment setting, which describes the treatment's purpose in relation to the primary treatment. (Reference: NCIt C124308)
Core
Conditional
TEXT
Adjuvant
Advanced/Metastatic
Conditioning
Induction
Maintenance
5 more
response_to_treatment_criteria_method
Indicate the criteria used to assess the donor's response to the applied treatment regimen.
Core
Conditional
TEXT
ELN Dohner AML 2017 Oncology Response Criteria
IWG Cheson AML 2003 Oncology Response Criteria
iRECIST
RECIST
Response Assessment in Neuro-Oncology (RANO)
1 more
response_to_treatment
The donor's response to the applied treatment regimen.
Core
Conditional
TEXT
Complete remission
Complete remission with incomplete hematologic recovery (CRi)
Complete remission without minimal residual disease (CRMRD-)
Complete response
Cytogenetic complete remission (CRc)
21 more
This field depends on the selected response_to_treatment_criteria_method. Please refer to the documentation for Response to Treatment Criteria: http://docs.icgc-argo.org/docs/submission/dictionary-overview#response-to-treatment-criteria
outcome_of_treatment
Indicate the donor's outcome of the prescribed treatment.
Extended
Conditional
TEXT
Treatment completed as prescribed
Treatment incomplete due to technical or organizational problems
Treatment incomplete because patient died
Patient choice (stopped or interrupted treatment)
Physician decision (stopped or interrupted treatment)
5 more
toxicity_type
If the treatment was terminated early due to acute toxicity, indicate whether it was due to hematological toxicity or non-hematological toxicity.
Extended
Conditional
TEXT
Hematological
Non-hematological
Not applicable
Unknown
hematological_toxicity
Indicate the hematological toxicities which caused early termination of the treatment. (Codelist reference: NCI-CTCAE (v5.0))
Extended
Conditional
TEXT
Anemia - Grade 3
Anemia - Grade 4
Anemia - Grade 5
Neutropenia - Grade 3
Neutropenia - Grade 4
5 more
To include multiple values, separate values with a pipe delimiter '|' within your file.
non-hematological_toxicity
Indicate the non-hematological toxicities which caused early termination of the treatment. (Codelist reference: NCI-CTCAE (v5.0))
Extended
Conditional
TEXT
Cardiac disorders - Grade 1
Cardiac disorders - Grade 2
Cardiac disorders - Grade 3
Cardiac disorders - Grade 4
Cardiac disorders - Grade 5
25 more
To include multiple values, separate values with a pipe delimiter '|' within your file.
adverse_events
Report any treatment related adverse events. (Codelist reference: NCI-CTCAE (v5.0))
Extended
Conditional
TEXT
Abdominal distension
Abdominal infection
Abdominal pain
Abdominal soft tissue necrosis
Abducens nerve disorder
834 more
To include multiple values, separate values with a pipe delimiter '|' within your file.
clinical_trials_database
If the donor is a participant in a clinical trial, indicate the clinical trial database where the clinical trial is registered.
Extended
TEXT
NCI Clinical Trials
EU Clinical Trials Register
Not applicable
Unknown
If the clinical trials database you use is not included in the controlled terminology, please contact us at https://platform.icgc-argo.org/contact to request it be added.
clinical_trial_number
Based on the clinical_trials_database, indicate the unique NCT or EudraCT identifier of the clinical trial in which the donor is a participant.
Extended
Conditional
TEXT
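
Interval fields such as treatment_start_interval are expressed as the number of days since the date of the associated primary diagnosis. A minimal sketch of that calculation follows, assuming the submitter holds both calendar dates locally; `interval_in_days` is a hypothetical helper, and only the resulting integer appears in the submitted file.

```python
from datetime import date

def interval_in_days(primary_diagnosis_date: date, event_date: date) -> int:
    """Number of days from the primary diagnosis to the event.

    The result is negative if the event happened before the diagnosis.
    """
    return (event_date - primary_diagnosis_date).days
```

The same calculation applies to specimen_acquisition_interval in the specimen file, which uses the same reference point.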

Chemotherapy (chemotherapy)

14 Fields
The collection of data elements describing the details of a chemotherapy treatment regimen completed by a donor. To submit multiple treatment drugs for a single regimen, submit multiple rows in the chemotherapy file.
File Name Example: chemotherapy[-optional-extension].tsv
Field & Description
Data Tier
Attributes
Type
Permissible Values
Notes & Scripts
program_id
Unique identifier of the ARGO program.
ID
Required
TEXT
submitter_donor_id
Unique identifier of the donor, assigned by the data provider.
ID
Required
TEXT
Values must meet the regular expression
submitter_treatment_id
Unique identifier of the treatment, as assigned by the data provider.
ID
Required
TEXT
Values must meet the regular expression
drug_rxnormcui
The unique RxNormID assigned to the treatment regimen drug.
Core
Conditional
TEXT
This field uses standardized vocabulary from the RxNorm database (https://www.nlm.nih.gov/research/umls/rxnorm), provided by the NIH. You can search for RxNorm values through the web interface (https://mor.nlm.nih.gov/RxNav/) or API (https://mor.nlm.nih.gov/download/rxnav/RxNormAPIs.html). For example, to find the rxnormcui based on drug name, you can use: https://rxnav.nlm.nih.gov/REST/rxcui.json?name=leucovorin or https://mor.nlm.nih.gov/RxNav/search?searchBy=String&searchTerm=leucovorin. If the drug doesn't exist in RxNorm, please indicate the drug_database, drug_id and drug_term where the drug information can be found.
drug_name
Name of agent or drug administered to donor as part of the treatment regimen.
Core
Conditional
TEXT
This field uses standardized vocabulary from the RxNorm database (https://www.nlm.nih.gov/research/umls/rxnorm), provided by the NIH. You can search for RxNorm values through the web interface (https://mor.nlm.nih.gov/RxNav/) or API (https://mor.nlm.nih.gov/download/rxnav/RxNormAPIs.html). For example, to find the rxnormcui based on drug name, you can use: https://rxnav.nlm.nih.gov/REST/rxcui.json?name=leucovorin or https://mor.nlm.nih.gov/RxNav/search?searchBy=String&searchTerm=leucovorin. If the drug doesn't exist in RxNorm, please indicate the drug_database, drug_id and drug_term where the drug information can be found.
drug_database
Indicate the drug database where drug term is found.
Core
Conditional
TEXT
KEGG
PubChem
NCI Thesaurus
If the drug doesn't exist in RxNorm, please indicate the drug_database, drug_id and drug_term where the drug information can be found.
drug_id
Indicate the identifier from the drug_database for the drug.
Core
Conditional
TEXT
If the drug doesn't exist in RxNorm, please indicate the drug_database, drug_id and drug_term where the drug information can be found.
drug_term
Indicate the drug term as it exists in the database specified in the drug_database.
Core
Conditional
TEXT
If the drug doesn't exist in RxNorm, please indicate the drug_database, drug_id and drug_term where the drug information can be found.
chemotherapy_drug_dose_units
Indicate units used to record chemotherapy drug dose.
Core
Required
TEXT
mg/m2
IU/m2
ug/m2
g/m2
mg/kg
1 more
prescribed_cumulative_drug_dose
Indicate the total prescribed cumulative drug dose in the same units specified in chemotherapy_drug_dose_units.
Core
Conditional
NUMBER
Either the 'actual_cumulative_drug_dose' or the 'prescribed_cumulative_drug_dose' field must be submitted.
actual_cumulative_drug_dose
Indicate the total actual cumulative drug dose in the same units specified in chemotherapy_drug_dose_units.
Core
Conditional
NUMBER
Either the 'actual_cumulative_drug_dose' or the 'prescribed_cumulative_drug_dose' field must be submitted.
dose_intensity_reduction
Indicate if there was a significant reduction in dose intensity.
Extended
TEXT
Yes
No
Unknown
dose_intensity_reduction_event
If there was a significant reduction in dose intensity, indicate which event caused it.
Extended
TEXT
Dose reduction
Dose delay or dose omission
Both
This field should only be submitted if 'dose_intensity_reduction' is 'Yes'.
dose_intensity_reduction_amount
If there was a significant reduction in dose intensity, indicate the amount.
Extended
TEXT
<20%
20-49%
>=50%
Unknown
This field should only be submitted if 'dose_intensity_reduction' is 'Yes'
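The conditional notes above (one of the two cumulative dose fields must be submitted; the event and amount fields apply only when 'dose_intensity_reduction' is 'Yes') can be pre-checked before submission. The following is a minimal sketch, not the platform's own validation script: the function name check_chemotherapy_row and the representation of a row as a dict of column name to string are assumptions made for illustration.

```python
def check_chemotherapy_row(row):
    """Return a list of problems found in one chemotherapy row.

    `row` is assumed to be a dict mapping column names to string values,
    for example as produced by csv.DictReader with delimiter="\t".
    """
    problems = []

    def filled(field):
        # Treat a missing column, None, or whitespace-only text as empty.
        return bool((row.get(field) or "").strip())

    # Either the actual or the prescribed cumulative dose must be submitted.
    if not (filled("actual_cumulative_drug_dose")
            or filled("prescribed_cumulative_drug_dose")):
        problems.append("missing cumulative drug dose")

    # The event and amount fields only apply when a reduction is reported.
    if (row.get("dose_intensity_reduction") or "").strip() != "Yes":
        for field in ("dose_intensity_reduction_event",
                      "dose_intensity_reduction_amount"):
            if filled(field):
                problems.append(
                    f"{field} submitted without dose_intensity_reduction = Yes"
                )
    return problems
```

An empty list means the row passes these two checks only; it says nothing about the other rules enforced by the platform.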

Hormone Therapy (hormone_therapy)

11 Fields
The collection of data elements describing the details of a hormone treatment therapy completed by a donor. To submit multiple treatment drugs for a single regimen, submit multiple rows in the hormone_therapy file.
File Name Example: hormone_therapy[-optional-extension].tsv
Field & Description
Data Tier
Attributes
Type
Permissible Values
Notes & Scripts
program_id
Unique identifier of the ARGO program.
ID
Required
TEXT
submitter_donor_id
Unique identifier of the donor, assigned by the data provider.
ID
Required
TEXT
Values must meet the regular expression
submitter_treatment_id
Unique identifier of the treatment, assigned by the data provider.
ID
Required
TEXT
Values must meet the regular expression
drug_rxnormcui
The unique RxNormID assigned to the treatment regimen drug.
Core
Conditional
TEXT
This field uses standardized vocabulary from the RxNorm database (https://www.nlm.nih.gov/research/umls/rxnorm), provided by the NIH. You can search for RxNorm values through the web interface (https://mor.nlm.nih.gov/RxNav/) or the API (https://mor.nlm.nih.gov/download/rxnav/RxNormAPIs.html). For example, to find the rxnormcui for a drug name, you can use https://rxnav.nlm.nih.gov/REST/rxcui.json?name=leucovorin or https://mor.nlm.nih.gov/RxNav/search?searchBy=String&searchTerm=leucovorin. If a drug does not exist in RxNorm, indicate the drug_database, drug_id and drug_term where the drug's information can be found.
drug_name
Name of agent or drug administered to donor as part of the treatment regimen.
Core
Conditional
TEXT
This field uses standardized vocabulary from the RxNorm database (https://www.nlm.nih.gov/research/umls/rxnorm), provided by the NIH. You can search for RxNorm values through the web interface (https://mor.nlm.nih.gov/RxNav/) or the API (https://mor.nlm.nih.gov/download/rxnav/RxNormAPIs.html). For example, to find the rxnormcui for a drug name, you can use https://rxnav.nlm.nih.gov/REST/rxcui.json?name=leucovorin or https://mor.nlm.nih.gov/RxNav/search?searchBy=String&searchTerm=leucovorin. If a drug does not exist in RxNorm, indicate the drug_database, drug_id and drug_term where the drug's information can be found.
drug_database
Indicate the drug database where drug term is found.
Core
Conditional
TEXT
KEGG
PubChem
NCI Thesaurus
If a drug does not exist in RxNorm, indicate the drug_database, drug_id and drug_term where the drug's information can be found.
drug_id
Indicate the identifier from the drug_database for the drug.
Core
Conditional
TEXT
If a drug does not exist in RxNorm, indicate the drug_database, drug_id and drug_term where the drug's information can be found.
drug_term
Indicate the drug term as it exists in the database specified in the drug_database.
Core
Conditional
TEXT
If a drug does not exist in RxNorm, indicate the drug_database, drug_id and drug_term where the drug's information can be found.
hormone_drug_dose_units
Indicate the units used to record hormone drug dose.
Core
Required
TEXT
mg/m2
IU/m2
ug/m2
g/m2
mg/kg
1 more
prescribed_cumulative_drug_dose
Indicate the total prescribed cumulative drug dose in the same units specified in hormone_drug_dose_units.
Core
Conditional
NUMBER
Either the 'actual_cumulative_drug_dose' or the 'prescribed_cumulative_drug_dose' field must be submitted.
actual_cumulative_drug_dose
Indicate the total actual cumulative drug dose in the same units specified in hormone_drug_dose_units.
Core
Conditional
NUMBER
Either the 'actual_cumulative_drug_dose' or the 'prescribed_cumulative_drug_dose' field must be submitted.
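The drug fields in this file follow the RxNorm notes above: look up the rxnormcui by drug name, and fall back to drug_database, drug_id and drug_term when the drug is not in RxNorm. The sketch below builds the lookup URL quoted in those notes and lists which fallback fields are still empty. The function names are hypothetical, and treating an empty drug_rxnormcui as "not in RxNorm" is an assumption, not a rule stated by the dictionary. No request is sent and no response format is assumed.

```python
from urllib.parse import urlencode

# Endpoint copied from the field notes above.
RXCUI_ENDPOINT = "https://rxnav.nlm.nih.gov/REST/rxcui.json"


def rxcui_lookup_url(drug_name):
    """Build the RxNav lookup URL for a drug name (does not send a request)."""
    return f"{RXCUI_ENDPOINT}?{urlencode({'name': drug_name})}"


def missing_fallback_fields(row):
    """List the empty fallback fields for a row without an RxNorm identifier.

    Assumption: an empty drug_rxnormcui means the drug was not found in RxNorm.
    `row` is a dict mapping column names to string values.
    """
    if (row.get("drug_rxnormcui") or "").strip():
        return []
    fallback = ("drug_database", "drug_id", "drug_term")
    return [f for f in fallback if not (row.get(f) or "").strip()]
```

For example, rxcui_lookup_url("leucovorin") reproduces the example URL given in the notes, and a row with only drug_database filled in reports drug_id and drug_term as missing.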

Radiation (radiation)

10 Fields
The collection of data elements describing the details of a radiation treatment completed by a donor.
File Name Example: radiation[-optional-extension].tsv
Field & Description
Data Tier
Attributes
Type
Permissible Values
Notes & Scripts
program_id
Unique identifier of the ARGO program.
ID
Required
TEXT
submitter_donor_id
Unique identifier of the donor, assigned by the data provider.
ID
Required
TEXT
Values must meet the regular expression
submitter_treatment_id
Unique identifier of the treatment, assigned by the data provider.
ID
Required
TEXT
Values must meet the regular expression
radiation_therapy_modality
Indicate the method of radiation treatment or modality.
Core
Required
TEXT
Electron
Heavy Ions
Photon
Proton
radiation_therapy_type
Indicate type of radiation therapy administered.
Core
Required
TEXT
External
Internal
Internal application includes Brachytherapy.
radiation_therapy_fractions
Indicate the total number of fractions delivered as part of treatment.
Core
Required
INTEGER
radiation_therapy_dosage
Indicate the total dose given in units of Gray (Gy).
Core
Required
NUMBER
anatomical_site_irradiated
Indicate body region where radiation therapy was administered. (Reference: Cancer Care Ontario)
Core
Required
TEXT
Abdomen
Body
Brain
Chest
Head
9 more
radiation_boost
A radiation boost is an extra radiation treatment targeted at the tumour bed, given after the regular sessions of radiation are complete (Reference NCIt: C137812). Indicate if this radiation treatment was a radiation boost.
Extended
TEXT
Yes
No
Not applicable
reference_radiation_treatment_id
If a radiation boost was given, indicate the 'submitter_treatment_id' of the primary radiation treatment the radiation boost treatment is linked to.
Extended
TEXT
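A boost row points back to its primary radiation treatment through 'reference_radiation_treatment_id'. The sketch below flags boost rows whose reference is empty, points to itself, or does not match any 'submitter_treatment_id' in the same set of rows. It assumes the primary treatment is submitted in the same radiation file, which the dictionary does not state explicitly; the function name is hypothetical.

```python
def unresolved_boost_references(rows):
    """Return submitter_treatment_id values of boost rows with a bad reference.

    `rows` is a list of dicts (column name -> string), one per radiation row.
    Assumption: the referenced primary treatment appears in the same list.
    """
    known_ids = {row["submitter_treatment_id"] for row in rows}
    unresolved = []
    for row in rows:
        if (row.get("radiation_boost") or "").strip() != "Yes":
            continue
        reference = (row.get("reference_radiation_treatment_id") or "").strip()
        # A boost must reference a different, known treatment.
        if reference not in known_ids or reference == row["submitter_treatment_id"]:
            unresolved.append(row["submitter_treatment_id"])
    return unresolved
```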

Immunotherapy (immunotherapy)

12 Fields
The collection of data elements describing the details of an immunotherapy treatment completed by a donor.
File Name Example: immunotherapy[-optional-extension].tsv
Field & Description
Data Tier
Attributes
Type
Permissible Values
Notes & Scripts
program_id
Unique identifier of the ARGO program.
ID
Required
TEXT
submitter_donor_id
Unique identifier of the donor, assigned by the data provider.
ID
Required
TEXT
Values must meet the regular expression
submitter_treatment_id
Unique identifier of the treatment, assigned by the data provider.
ID
Required
TEXT
Values must meet the regular expression
immunotherapy_type
Indicate the type of immunotherapy administered to donor.
Core
Required
TEXT
Cell-based
Immune checkpoint inhibitors
Monoclonal antibodies other than immune checkpoint inhibitors
Other immunomodulatory substances
drug_rxnormcui
The unique RxNormID assigned to the treatment regimen drug.
Core
Conditional
TEXT
This field uses standardized vocabulary from the RxNorm database (https://www.nlm.nih.gov/research/umls/rxnorm), provided by the NIH. You can search for RxNorm values through the web interface (https://mor.nlm.nih.gov/RxNav/) or the API (https://mor.nlm.nih.gov/download/rxnav/RxNormAPIs.html). For example, to find the rxnormcui for a drug name, you can use https://rxnav.nlm.nih.gov/REST/rxcui.json?name=leucovorin or https://mor.nlm.nih.gov/RxNav/search?searchBy=String&searchTerm=leucovorin. If a drug does not exist in RxNorm, indicate the drug_database, drug_id and drug_term where the drug's information can be found.
drug_name
Name of agent or drug administered to donor as part of the treatment regimen.
Core
Conditional
TEXT
This field uses standardized vocabulary from the RxNorm database (https://www.nlm.nih.gov/research/umls/rxnorm), provided by the NIH. You can search for RxNorm values through the web interface (https://mor.nlm.nih.gov/RxNav/) or the API (https://mor.nlm.nih.gov/download/rxnav/RxNormAPIs.html). For example, to find the rxnormcui for a drug name, you can use https://rxnav.nlm.nih.gov/REST/rxcui.json?name=leucovorin or https://mor.nlm.nih.gov/RxNav/search?searchBy=String&searchTerm=leucovorin. If a drug does not exist in RxNorm, indicate the drug_database, drug_id and drug_term where the drug's information can be found.
drug_database
Indicate the drug database where drug term is found.
Core
Conditional
TEXT
KEGG
PubChem
NCI Thesaurus
If a drug does not exist in RxNorm, indicate the drug_database, drug_id and drug_term where the drug's information can be found.
drug_id
Indicate the identifier from the drug_database for the drug.
Core
Conditional
TEXT
If a drug does not exist in RxNorm, indicate the drug_database, drug_id and drug_term where the drug's information can be found.
drug_term
Indicate the drug term as it exists in the database specified in the drug_database.
Core
Conditional
TEXT
If a drug does not exist in RxNorm, indicate the drug_database, drug_id and drug_term where the drug's information can be found.
immunotherapy_drug_dose_units
Indicate units used to record immunotherapy drug dose.
Core
Required
TEXT
mg/m2
IU/m2
ug/m2
g/m2
mg/kg
1 more
prescribed_cumulative_drug_dose
Indicate the total prescribed cumulative drug dose in the same units specified in immunotherapy_drug_dose_units.
Core
Conditional
NUMBER
Either the 'actual_cumulative_drug_dose' or the 'prescribed_cumulative_drug_dose' field must be submitted.
actual_cumulative_drug_dose
Indicate the total actual cumulative drug dose in the same units specified in immunotherapy_drug_dose_units.
Core
Conditional
NUMBER
Either the 'actual_cumulative_drug_dose' or the 'prescribed_cumulative_drug_dose' field must be submitted.

Surgery (surgery)

18 Fields
The collection of data elements related to a donor's surgical treatment at a specific point in the clinical record. To submit multiple surgeries, submit multiple rows in the Surgery file. If a specimen was resected during surgery, indicate the unique identifier of the specimen in the 'submitter_specimen_id' field. If multiple specimens were resected during a single surgical procedure, submit each 'submitter_specimen_id' as a new row with the same 'submitter_treatment_id' and 'surgery_type' values.
File Name Example: surgery[-optional-extension].tsv
Field & Description
Data Tier
Attributes
Type
Permissible Values
Notes & Scripts
program_id
Unique identifier of the ARGO program.
ID
Required
TEXT
submitter_donor_id
Unique identifier of the donor, assigned by the data provider.
ID
Required
TEXT
Values must meet the regular expression
submitter_specimen_id
If a specimen was resected during surgery, indicate the unique identifier of the specimen here. This submitter_specimen_id should exist in the Specimen file.
ID
TEXT
Values must meet the regular expression
Please refer to documentation for instructions on how to submit a specimen that was resected during surgery: https://docs.icgc-argo.org/docs/submission/submitting-clinical-data#submitting-data-in-surgery-file
submitter_treatment_id
Unique identifier of the treatment, assigned by the data provider.
ID
Required
TEXT
Values must meet the regular expression
surgery_type
Indicate the type of surgical procedure that was performed. (References: SNOMED, NCIt, UMLS)
Core
Required
TEXT
Axillary Clearance
Axillary lymph nodes sampling
Biopsy
Bypass Gastrojejunostomy
Cholecystectomy
48 more
surgery_site
Indicate the ICD-O-3 topography code for the anatomic site where the surgical procedure was performed, according to the International Classification of Diseases for Oncology, 3rd Edition (WHO ICD-O-3).
Core
Conditional
TEXT
Values must meet the regular expression
Refer to the ICD-O-3 manual for guidelines at https://apps.who.int/iris/handle/10665/42344. This field is not required if a specimen was resected during surgery (i.e. if `submitter_specimen_id` is submitted), since the anatomic site is collected in the Specimen table.
surgery_location
Indicate whether the surgical procedure was done at the primary, local recurrence or metastatic location.
Core
Conditional
TEXT
Local recurrence
Metastatic
Primary
This field is not required if a specimen was resected during surgery (i.e. if `submitter_specimen_id` is submitted), since the type of specimen is collected in the Specimen table.
tumour_length
Indicate the length of the tumour, in millimetres (mm).
Extended
NUMBER
tumour_width
Indicate the width of the tumour, in millimetres (mm).
Extended
NUMBER
greatest_dimension_tumour
Indicate the greatest dimension or diameter of the tumour, in millimetres (mm). (Reference: NCIt C157135)
Extended
NUMBER
tumour_focality
Indicate the characterization of the location of the tumour. (Reference: NCIt: C157425)
Extended
TEXT
Cannot be assessed
Multifocal
Not applicable
Unifocal
Unknown
residual_tumour_classification
Indicate the absence or presence of residual tumour after treatment. In some cases treated with surgery and/or with neoadjuvant therapy there will be residual tumour at the primary site after treatment because of incomplete resection or local and regional disease that extends beyond the limit of ability of resection. (Reference: AJCC 8th ed.)
Extended
TEXT
Not applicable
RX
R0
R1
R2
1 more
RX (presence of residual tumour cannot be assessed), R0 (no residual tumour), R1 (microscopic residual tumour), R2 (macroscopic residual tumour)
margin_types_involved
Indicate the margin type(s) involved.
Extended
TEXT
Circumferential resection margin
Common bile duct margin
Distal margin
Not applicable
Proximal margin
1 more
To include multiple values, separate values with a pipe delimiter '|' within your file.
margin_types_not_involved
Indicate the margin type(s) not involved.
Extended
TEXT
Circumferential resection margin
Common bile duct margin
Distal margin
Not applicable
Proximal margin
1 more
To include multiple values, separate values with a pipe delimiter '|' within your file.
margin_types_not_assessed
Indicate the margin type(s) that cannot be assessed.
Extended
TEXT
Circumferential resection margin
Common bile duct margin
Distal margin
Not applicable
Proximal margin
1 more
To include multiple values, separate values with a pipe delimiter '|' within your file.
lymphovascular_invasion
Indicate the absence or presence of lymphovascular invasion (LVI). LVI includes lymphatic invasion, vascular invasion and lymphovascular invasion. (Reference: AJCC 8th ed.)
Extended
TEXT
Absent
Both lymphatic and small vessel and venous (large vessel) invasion
Lymphatic and small vessel invasion only
Not applicable
Present
2 more
perineural_invasion
A morphologic finding referring to a tumour that has spread along and infiltrated nerve fibers. Indicate the presence or absence of perineural invasion. (Reference: NCIt: C48260, ICCR)
Extended
TEXT
Absent
Cannot be assessed
Not applicable
Present
Unknown
extrathyroidal_extension
Indicate the involvement of perithyroidal soft tissues by direct extension from the thyroid primary. (Reference: AJCC 8th Ed.)
Extended
TEXT
Absent
Cannot be assessed
Not applicable
Present
Unknown
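Two conventions in the Surgery file lend themselves to a quick check: multi-value cells (such as the margin_types fields) use a pipe delimiter, and rows that share a 'submitter_treatment_id' because several specimens were resected in one procedure should carry the same 'surgery_type'. The following is a minimal sketch with hypothetical helper names, assuming each row is a dict of column name to string.

```python
from collections import defaultdict


def split_multi_value(cell):
    """Split a pipe-delimited cell into a list of trimmed, non-empty values."""
    return [part.strip() for part in cell.split("|") if part.strip()]


def inconsistent_surgery_types(rows):
    """Return treatment ids whose rows disagree on surgery_type.

    Rows sharing a submitter_treatment_id describe one surgical procedure
    (one row per resected specimen), so surgery_type should be identical.
    """
    types_by_treatment = defaultdict(set)
    for row in rows:
        types_by_treatment[row["submitter_treatment_id"]].add(row["surgery_type"])
    return sorted(
        treatment_id
        for treatment_id, surgery_types in types_by_treatment.items()
        if len(surgery_types) > 1
    )
```

Writing a multi-value cell back out is the reverse operation: join the values with "|".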

Follow Up (follow_up)

22 Fields
The collection of data elements related to a specific follow-up visit to a donor. A follow-up is defined as any point of contact with a patient after primary diagnosis. To submit multiple follow-ups for a single donor, please submit multiple rows in the follow-up file for this donor.
File Name Example: follow_up[-optional-extension].tsv
Field & Description
Data Tier
Attributes
Type
Permissible Values
Notes & Scripts
program_id
Unique identifier of the ARGO program.
ID
Required
TEXT
submitter_donor_id
Unique identifier of the donor, assigned by the data provider.
ID
Required
TEXT
Values must meet the regular expression
submitter_follow_up_id
Unique identifier for a follow-up event in a donor's clinical record, assigned by the data provider.
ID
Required
TEXT
Values must meet the regular expression
interval_of_followup
Interval from the primary diagnosis date to the follow-up date, in days.
Core
Required
INTEGER
The associated primary diagnosis is used as the reference point for this interval. To calculate this, find the number of days since the date of primary diagnosis.
disease_status_at_followup
Indicate the donor's disease status at time of follow-up. (Reference: RECIST)
Core
Required
TEXT
Complete remission
Distant progression
Loco-regional progression
No evidence of disease
Partial remission
3 more
submitter_primary_diagnosis_id
Indicate if the follow-up is related to a specific primary diagnosis event in the clinical timeline.
ID
TEXT
Values must meet the regular expression
submitter_treatment_id
Indicate if the follow-up is related to a specific treatment event in the clinical timeline.
ID
TEXT
Values must meet the regular expression
weight_at_followup
Indicate the donor's weight, in kilograms (kg), at the time of follow-up.
Extended
NUMBER
relapse_type
Indicate the donor's relapse type.
Core
Conditional
TEXT
Distant recurrence/metastasis
Local recurrence
Local recurrence and distant metastasis
Progression (liquid tumours)
This field is required to be submitted if disease_status_at_followup indicates a state of progression, relapse, or recurrence.
relapse_interval
If the donor was clinically disease-free following primary treatment and relapse, recurrence or progression (for liquid tumours) then occurred, indicate the length of the disease-free interval, in days.
Core
Conditional
INTEGER
This field is required to be submitted if disease_status_at_followup indicates a state of progression, relapse, or recurrence.
method_of_progression_status
Indicate the method(s) used to confirm the donor's progression or relapse or recurrence disease status. (Reference: caDSR CDE ID 6161031)
Core
Conditional
TEXT
Biomarker in liquid biopsy (e.g. tumour marker in blood or urine)
Biopsy
Blood draw
Bone marrow aspirate
Core biopsy
19 more
This field is required to be submitted if disease_status_at_followup indicates a state of progression, relapse, or recurrence. To include multiple values, separate values with a pipe delimiter '|' within your file.
anatomic_site_progression_or_recurrence
Indicate the ICD-O-3 topography code for the anatomic site(s) where disease progression, relapse or recurrence occurred, according to the International Classification of Diseases for Oncology, 3rd Edition (WHO ICD-O-3). Refer to the ICD-O-3 manual for guidelines at https://apps.who.int/iris/handle/10665/42344.
Core
Conditional
TEXT
Values must meet the regular expression
This field is required to be submitted if disease_status_at_followup indicates a state of progression, relapse, or recurrence. To include multiple values, separate values with a pipe delimiter '|' within your file.
recurrence_tumour_staging_system
Specify the tumour staging system used to stage the cancer at time of retreatment for recurrence or disease progression. This may be represented as rTNM in the medical report.
Extended
Conditional
TEXT
AJCC 8th edition
AJCC 7th edition
AJCC 6th edition
Ann Arbor staging system
Binet staging system
6 more
This field is required to be submitted if disease_status_at_followup indicates a state of progression, relapse, or recurrence.
recurrence_t_category
The code to represent the extent of the primary tumour (T) based on evidence obtained from clinical assessment parameters determined at the time of retreatment for a recurrence or disease progression, according to criteria based on multiple editions of the AJCC's Cancer Staging Manual.
Extended
Conditional
TEXT
T0
T1
T1a
T1a1
T1a2
47 more
This field is required only if the selected recurrence_tumour_staging_system is any edition of the AJCC cancer staging system.
recurrence_n_category
The code to represent the stage of cancer defined by the extent of the regional lymph node (N) involvement for the cancer based on evidence obtained from clinical assessment parameters determined at the time of retreatment for a recurrence or disease progression, according to criteria based on multiple editions of the AJCC's Cancer Staging Manual.
Extended
Conditional
TEXT
N0
N0a
N0a (biopsy)
N0b
N0b (no biopsy)
21 more
This field is required only if the selected recurrence_tumour_staging_system is any edition of the AJCC cancer staging system.
recurrence_m_category
The code to represent the stage of cancer defined by the extent of the distant metastasis (M) for the cancer based on evidence obtained from clinical assessment parameters determined at the time of retreatment for a recurrence or disease progression, according to criteria based on multiple editions of the AJCC's Cancer Staging Manual.
Extended
Conditional
TEXT
M0
M0(i+)
M1
M1a
M1a(0)
13 more
This field is required only if the selected recurrence_tumour_staging_system is any edition of the AJCC cancer staging system.
recurrence_stage_group
The code to represent the stage group of the tumour, as assigned by the reporting recurrence_tumour_staging_system, that indicates the overall prognostic tumour stage (i.e. Stage I, Stage II, Stage III, etc.) at the time of retreatment for a recurrence or disease progression.
Extended
Conditional
TEXT
Occult Carcinoma
Stage 0
Stage 0a
Stage 0is
Stage 1
78 more
This field is dependent on the selected recurrence_tumour_staging_system. Please refer to the documentation for Tumour Staging Classifications: http://docs.icgc-argo.org/docs/submission/dictionary-overview#tumour-staging-classifications
posttherapy_tumour_staging_system
Specify the tumour staging system used to stage the cancer after treatment for patients receiving systemic and/or radiation therapy alone or as a component of their initial treatment, or as neoadjuvant therapy before planned surgery. This may be represented as ypTNM or ycTNM in the medical report.
Extended
TEXT
AJCC 8th edition
AJCC 7th edition
AJCC 6th edition
Ann Arbor staging system
Binet staging system
6 more
posttherapy_t_category
The code to represent the extent of the primary tumour (T) based on evidence obtained from clinical assessment parameters determined after treatment for patients receiving systemic and/or radiation therapy alone or as a component of their initial treatment, or as neoadjuvant therapy before planned surgery, according to criteria based on multiple editions of the AJCC's Cancer Staging Manual.
Extended
Conditional
TEXT
T0
T1
T1a
T1a1
T1a2
47 more
This field is required only if the selected posttherapy_tumour_staging_system is any edition of the AJCC cancer staging system.
posttherapy_n_category
The code to represent the stage of cancer defined by the extent of the regional lymph node (N) involvement for the cancer based on evidence obtained from clinical assessment parameters determined after treatment for patients receiving systemic and/or radiation therapy alone or as a component of their initial treatment, or as neoadjuvant therapy before planned surgery, according to criteria based on multiple editions of the AJCC's Cancer Staging Manual.
Extended
Conditional
TEXT
N0
N0a
N0a (biopsy)
N0b
N0b (no biopsy)
21 more
This field is required only if the selected posttherapy_tumour_staging_system is any edition of the AJCC cancer staging system.
posttherapy_m_category
The code to represent the stage of cancer defined by the extent of the distant metastasis (M) for the cancer based on evidence obtained from clinical assessment parameters determined after treatment for patients receiving systemic and/or radiation therapy alone or as a component of their initial treatment, or as neoadjuvant therapy before planned surgery, according to criteria based on multiple editions of the AJCC's Cancer Staging Manual.
Extended
Conditional
TEXT
M0
M0(i+)
M1
M1a
M1a(0)
13 more
This field is required only if the selected posttherapy_tumour_staging_system is any edition of the AJCC cancer staging system.
posttherapy_stage_group
The code to represent the stage group of the tumour, as assigned by the reporting posttherapy_tumour_staging_system, that indicates the overall prognostic tumour stage (i.e. Stage I, Stage II, Stage III, etc.) after treatment for patients receiving systemic and/or radiation therapy alone or as a component of their initial treatment, or as neoadjuvant therapy before planned surgery.
Extended
Conditional
TEXT
Occult Carcinoma
Stage 0
Stage 0a
Stage 0is
Stage 1
78 more
This field is dependent on the selected posttherapy_tumour_staging_system. Please refer to the documentation for Tumour Staging Classifications: http://docs.icgc-argo.org/docs/submission/dictionary-overview#tumour-staging-classifications
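Two of the Follow Up rules can be illustrated in code: 'interval_of_followup' is the number of days from the date of primary diagnosis to the follow-up date, and the recurrence T, N and M categories are required only when 'recurrence_tumour_staging_system' is an AJCC edition. This is a sketch under stated assumptions: the function names are hypothetical, and recognising an AJCC edition by the "AJCC" prefix relies on the permissible values shown above rather than on any documented rule.

```python
from datetime import date


def interval_in_days(primary_diagnosis_date, event_date):
    """Number of days from the primary diagnosis date to the event date."""
    return (event_date - primary_diagnosis_date).days


def missing_recurrence_tnm(row):
    """List empty recurrence T/N/M fields when an AJCC edition is selected.

    `row` is a dict mapping column names to string values.
    Assumption: every AJCC permissible value starts with "AJCC".
    """
    system = (row.get("recurrence_tumour_staging_system") or "").strip()
    if not system.startswith("AJCC"):
        return []
    fields = (
        "recurrence_t_category",
        "recurrence_n_category",
        "recurrence_m_category",
    )
    return [f for f in fields if not (row.get(f) or "").strip()]
```

For a donor diagnosed on 2020-01-15 and seen again on 2020-07-15, interval_in_days(date(2020, 1, 15), date(2020, 7, 15)) gives 182. The same subtraction applies to any interval field that uses the primary diagnosis as its reference point.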

Exposure (exposure)

15 Fields
The collection of data elements related to a donor's clinically relevant information not immediately resulting from genetic predispositions.
File Name Example: exposure[-optional-extension].tsv
Field & Description
Data Tier
Attributes
Type
Permissible Values
Notes & Scripts
program_id
Unique identifier of the ARGO program.
ID
Required
TEXT
submitter_donor_id
Unique identifier of the donor, assigned by the data provider.
ID
Required
TEXT
Values must meet the regular expression
tobacco_smoking_status
Indicate donor's self-reported smoking status and history. (Reference: caDSR CDE ID 2181650)
Extended
TEXT
Current reformed smoker for <= 15 years
Current reformed smoker for > 15 years
Current reformed smoker, duration not specified
Current smoker
Lifelong non-smoker (<100 cigarettes smoked in lifetime)
2 more
Current smoker: A person who has smoked at least 100 cigarettes in their lifetime and who currently smokes. Includes daily smokers and non-daily smokers (also known as occasional smokers). Current reformed smoker for > 15 years: A person who currently does not smoke and has been a non-smoker for more than 15 years, but has smoked at least 100 cigarettes in their lifetime. Current reformed smoker for <= 15 years: A person who currently does not smoke and has been a non-smoker for 15 years or less, but has smoked at least 100 cigarettes in their lifetime. Current reformed smoker, duration not specified: A person who currently does not smoke and has been a non-smoker for an unspecified time, but has smoked at least 100 cigarettes in their lifetime. Smoking history not documented: Smoking history has not been recorded or is unknown.
tobacco_type
Indicate the type(s) of tobacco used by donor. (Reference: NCIt CDE C177629)
Extended
Conditional
TEXT
Chewing Tobacco
Cigar
Cigarettes
Electronic cigarettes
Not applicable
5 more
To include multiple values, separate values with a pipe delimiter '|' within your file.
pack_years_smoked
This field applies to cigarettes. Indicate the smoking intensity in Pack Years, where the number of pack years is defined as the number of cigarettes smoked per day times (x) the number of years smoked divided (/) by 20. (Reference: caDSR CDE ID 2955385)
Extended
Conditional
NUMBER
alcohol_history
Indicate if the donor has consumed at least 12 drinks of any alcoholic beverage in their lifetime. (Reference: caDSR CDE ID 2201918)
Extended
TEXT
Yes
No
Not applicable
Unknown
alcohol_consumption_category
Describe the donor's current level of alcohol use as self-reported by the donor. (Reference: caDSR CDE ID 3457767)
Extended
TEXT
Daily Drinker
None
Not applicable
Occasional Drinker (< once a month)
Social Drinker (> once a month, < once a week)
2 more
alcohol_type
Indicate the type(s) of alcohol the donor consumes. (Reference: NCIt CDE C173647)
Extended
Conditional
TEXT
Beer
Liquor
Not applicable
Other
Unknown
1 more
To include multiple values, separate values with a pipe delimiter '|' within your file.
opiate_use
Indicate if the donor has ever used opium or other opiates like opium juice, heroin, or Sukhteh regularly (at least weekly over a 6-month period).
Extended
TEXT
Never
Not applicable
Unknown
Yes, currently
Yes, only in the past
hot_drinks_consumption
Indicate if the donor regularly drinks tea, coffee, or other hot drinks.
Extended
TEXT
Never
Not applicable
Unknown
Yes, currently
Yes, only in the past
red_meat_frequency
Indicate how frequently the donor eats red meat. Examples of red meat include beef, veal, pork, lamb, mutton, horse, or goat meat.
Extended
TEXT
Never
Less than once a month
1-3 times a month
Not applicable
Once or twice a week
3 more
processed_meat_frequency
Indicate how frequently the donor eats processed meat. Examples of processed meat include hams, salamis, or sausages.
Extended
TEXT
Never
Less than once a month
1-3 times a month
Not applicable
Once or twice a week
3 more
soft_drinks_frequency
Indicate the frequency of soft drink consumption by the donor.
Extended
TEXT
Never
Less than once a month
1-3 times a month
Not applicable
Once or twice a week
3 more
exercise_frequency
Indicate how many times per week the donor exercises for at least 30 minutes. (Reference: NCIt CDE C25367)
Extended
TEXT
Never
Less than once a month
1-3 times a month
Not applicable
Once or twice a week
3 more
exercise_intensity
Indicate the intensity of exercise. (Reference: NCIt CDE C25539)
Extended
Conditional
TEXT
Low: No increase in the heart beat, and no perspiration
Moderate: Increase in the heart beat slightly with some light perspiration
Not applicable
Vigorous: Increase in the heart beat substantially with heavy perspiration
Unknown
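The 'pack_years_smoked' definition above is a simple formula: cigarettes smoked per day, times years smoked, divided by 20. A minimal sketch follows; the function name and the rejection of negative inputs are illustrative choices, not part of the dictionary.

```python
def pack_years(cigarettes_per_day, years_smoked):
    """Pack years = cigarettes per day x years smoked / 20."""
    if cigarettes_per_day < 0 or years_smoked < 0:
        raise ValueError("inputs must be non-negative")
    return cigarettes_per_day * years_smoked / 20
```

For example, a donor who smoked 10 cigarettes a day for 30 years has 10 x 30 / 20 = 15 pack years.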

Family History (family_history)

11 Fields
The collection of data elements describing a donor's familial relationships and familial cancer history.
File Name Example: family_history[-optional-extension].tsv
Field & Description
Data Tier
Attributes
Type
Permissible Values
Notes & Scripts
program_id
Unique identifier of the ARGO program.
ID
Required
TEXT
submitter_donor_id
Unique identifier of the donor, assigned by the data provider.
ID
Required
TEXT
Values must meet the regular expression
family_relative_id
Unique identifier of the relative, assigned by the data provider.
ID
Required
TEXT
Values must meet the regular expression
This field is required to ensure that family members are identified in unique records. Ids can be as simple as an incremented numeral to ensure uniqueness.
relative_with_cancer_history
Indicate if donor has any genetic relatives with a history of cancer. (Reference: NCIt C159104, caDSR CDE ID 6161023)
Extended
TEXT
Yes
No
Unknown
relationship_type
Indicate genetic relationship of the relative to the donor. (Reference: caDSR CDE ID 2179937)
Extended
TEXT
Aunt
Brother
Cousin
Daughter
Father
25 more
gender_of_relative
The self-reported gender of the related individual.
Extended
TEXT
Female
Male
Other
Unknown
age_of_relative_at_diagnosis
The age (in years) when the donor's relative was first diagnosed. (Reference: caDSR CDE ID 5300571)
Extended
Conditional
INTEGER
cancer_type_code_of_relative
The code to describe the malignant diagnosis of the donor's relative with a history of cancer using the WHO ICD-10 code (https://icd.who.int/browse10/2019/en) classification.
Extended
Conditional
TEXT
Values must meet the regular expression
relative_vital_status
Relative's last known state of living or deceased.
Extended
TEXT
Alive
Deceased
Unknown
cause_of_death_of_relative
Indicate the cause of the death of the relative.
Extended
Conditional
TEXT
Died of cancer
Died of other reasons
Unknown
relative_survival_time
Indicate how long, in days, the relative survived from the time they were diagnosed with cancer.
Extended
Conditional
INTEGER

Biomarker (biomarker)

30 Fields
The collection of data elements describing a donor's biomarker tests. A biomarker is a biological molecule found in blood, other body fluids, or tissues that is indicative of the presence of cancer in the body. Each row should include one or more biomarker test(s) associated with a particular clinical event (submitter_specimen_id, submitter_primary_diagnosis_id, submitter_treatment_id or submitter_follow_up_id field). If the biomarker test is not associated with a particular clinical event, then indicate the time interval at which the biomarker test was performed (test_interval field).
File Name Example: biomarker[-optional-extension].tsv
Field & Description
Data Tier
Attributes
Type
Permissible Values
Notes & Scripts
program_id
Unique identifier of the ARGO program.
ID
Required
TEXT
submitter_donor_id
Unique identifier of the donor, assigned by the data provider.
ID
Required
TEXT
Values must meet the regular expression
submitter_specimen_id
Unique identifier of the specimen, assigned by the data provider.
ID
TEXT
Values must meet the regular expression
Only one of ['submitter_specimen_id', 'submitter_primary_diagnosis_id', 'submitter_treatment_id', 'submitter_follow_up_id'] is required. If the biomarker test is not associated with a specimen or primary diagnosis, treatment or follow up event, then the 'test_interval' field will be required.
submitter_primary_diagnosis_id
If the biomarker test was done at the time of primary diagnosis, then indicate the associated submitter_primary_diagnosis_id here.
ID
TEXT
Values must meet the regular expression
Only one of ['submitter_specimen_id', 'submitter_primary_diagnosis_id', 'submitter_treatment_id', 'submitter_follow_up_id'] is required. If the biomarker test is not associated with a specimen, primary diagnosis, treatment or follow-up event, then the 'test_interval' field will be required.
submitter_treatment_id
If the biomarker test was done at the initiation of a specific treatment regimen, indicate the associated submitter_treatment_id here.
ID
TEXT
Values must meet the regular expression
Only one of ['submitter_specimen_id', 'submitter_primary_diagnosis_id', 'submitter_treatment_id', 'submitter_follow_up_id'] is required. If the biomarker test is not associated with a specimen, primary diagnosis, treatment or follow-up event, then the 'test_interval' field will be required.
submitter_follow_up_id
If the biomarker test was done during a follow-up event, then indicate the associated submitter_follow_up_id here.
ID
TEXT
Values must meet the regular expression
Only one of ['submitter_specimen_id', 'submitter_primary_diagnosis_id', 'submitter_treatment_id', 'submitter_follow_up_id'] is required. If the biomarker test is not associated with a specimen, primary diagnosis, treatment or follow-up event, then the 'test_interval' field will be required.
test_interval
If the biomarker test was not associated with a specific specimen, primary diagnosis, treatment or follow-up event, then indicate the number of days between primary diagnosis and the biomarker test.
ID
INTEGER
This field is required if the biomarker test is not associated with a specimen, primary diagnosis, treatment or follow-up event. The associated primary diagnosis is used as the reference point for this interval. To calculate this, count the number of days from the date of primary diagnosis to the date of the biomarker test.
ca19-9_level
Indicate the level of carbohydrate antigen 19-9 (CA19-9). Carbohydrate antigen 19-9 testing is useful to monitor the response to treatment in pancreatic cancer patients. (Reference: LOINC: 24108-3)
Extended
INTEGER
crp_levels
Indicate the amount of C-reactive protein (CRP), an inflammatory marker, in the blood, in mg/L. Used for screening and monitoring for inflammatory disease, infections, and for cardiovascular disease risk assessment. (Reference: NCIt C64548, LOINC 30522-7)
Extended
INTEGER
ldh_level
Indicate the level of lactate dehydrogenase (LDH), in IU/L. An increased amount of LDH in the blood may be a sign of tissue damage and some types of cancer. (Reference: NCI)
Extended
INTEGER
anc
Indicate the value for a hematology laboratory test for the absolute number of neutrophil cells present in a sample of peripheral blood from a donor, in cells/uL. The ANC may be used to check for infection, inflammation, leukemia and other conditions. Cancer treatment such as chemotherapy may reduce the ANC. (Reference: caDSR CDE ID: 2180198)
Extended
INTEGER
alc
Indicate the absolute number of lymphocytes (ALC) found in a given volume of blood, as cells/uL. Lymphocytes help fight off infections and an altered cellular immune function has been demonstrated in patients with cancer. (Reference: NCIt: C113237)
Extended
INTEGER
brca_carrier
Indicate whether the donor is a carrier of a mutation in a BRCA gene. Mutations in these genes are associated with an increased risk of familial breast and ovarian cancer.
Extended
TEXT
BRCA1
BRCA2
Both BRCA1 and BRCA2
No
Not applicable
1 more
er_status
Indicate the expression of estrogen receptor (ER). (Reference: NAACCR 3827)
Extended
TEXT
Cannot be determined
Negative
Not applicable
Positive
Unknown
er_allred_score
Indicate the Allred score for estrogen receptor. The Allred score is based on the percentage of cells that stain positive by immunohistochemistry (IHC) for estrogen receptor (ER) and the intensity of that staining. (Reference: NAACCR: 3828, caDSR CDE ID 2725288)
Extended
TEXT
Total ER Allred score of 1
Total ER Allred score of 2
Total ER Allred score of 3
Total ER Allred score of 4
Total ER Allred score of 5
5 more
er_percent_positive
Indicate a value, in decimals, that represents the percent of cells staining estrogen receptor positive by immunohistochemistry (IHC).
Extended
NUMBER
her2_ihc_status
Indicate the expression of human epidermal growth factor receptor-2 (HER2) assessed by immunohistochemistry (IHC). (Reference: AJCC 8th Edition, Chapter 48)
Extended
TEXT
Cannot be determined
Equivocal
Negative
Not applicable
Positive
1 more
Negative: 0 or 1+ staining, Equivocal: 2+ staining, Positive: 3+ staining
her2_ish_status
Indicate the expression of human epidermal growth factor receptor-2 (HER2) assessed by in situ hybridization (ISH). (Reference: NAACCR: 3854)
Extended
TEXT
Cannot be determined
Equivocal
Positive
Negative
Not applicable
1 more
pr_status
Indicate the expression of progesterone receptor (PR). (Reference: NAACCR 3915)
Extended
TEXT
Cannot be determined
Negative
Not applicable
Positive
Unknown
pr_allred_score
Indicate the Allred score for progesterone receptor. The Allred score is based on the percentage of cells that stain positive by IHC for the progesterone receptor (PR) and the intensity of that staining. (Reference: NAACCR 3916)
Extended
TEXT
Total PR Allred score of 1
Total PR Allred score of 2
Total PR Allred score of 3
Total PR Allred score of 4
Total PR Allred score of 5
5 more
pr_percent_positive
Indicate a value, in decimals, that represents the percent of cells staining progesterone receptor positive by immunohistochemistry (IHC).
Extended
NUMBER
pd-l1_status
Indicate the immunohistochemical test result that refers to the over-expression or lack of expression of programmed death ligand 1 (PD-L1) in a tissue sample of a primary or metastatic malignant neoplasm. (Reference: NCIt C122807)
Extended
TEXT
Cannot be determined
Negative
Not applicable
Positive
Unknown
alk_ihc_status
Indicate the expression of anaplastic lymphoma receptor tyrosine kinase (ALK) as assessed by immunohistochemistry (IHC). Abnormalities of ALK can be present in lung cancers.
Extended
TEXT
Cannot be determined
Negative
Not applicable
Positive
Unknown
alk_ihc_intensity
Indicate the intensity of anaplastic lymphoma receptor tyrosine kinase (ALK) as assessed by immunohistochemistry (IHC). Abnormalities of ALK can be present in lung cancers.
Extended
TEXT
0 (No stain)
+1
+2
+3
alk_fish_status
Indicate the expression of anaplastic lymphoma receptor tyrosine kinase (ALK) as assessed by fluorescence in situ hybridization (FISH). Abnormalities of ALK can be present in lung cancers.
Extended
TEXT
Cannot be determined
Negative
Not applicable
Positive
Unknown
ros1_ihc_status
Indicate the expression of ROS proto-oncogene 1, receptor tyrosine kinase (ROS1) as assessed by immunohistochemistry (IHC). Gene fusions involving ROS1 can be present in lung cancers.
Extended
TEXT
Cannot be determined
Negative
Not applicable
Positive
Unknown
pan-trk_ihc_status
Indicate the expression of Pan-TRK as assessed by immunohistochemistry (IHC). Pan-TRK IHC screens for neurotrophic tyrosine kinase receptor (NTRK) fusions which have been described in many cancers including lung, thyroid and colorectal cancers.
Extended
TEXT
Cannot be determined
Negative
Not applicable
Positive
Unknown
ret_fish_status
Indicate the status of gene rearrangement involving the RET proto-oncogene (RET1) as assessed by fluorescence in situ hybridization (FISH). RET gene rearrangements are associated with several different neoplastic conditions. (Reference: NCIt C46005)
Extended
TEXT
Cannot be determined
Negative
Not applicable
Positive
Unknown
hpv_ihc_status
Indicate the expression of Human papillomavirus (HPV) p16 as assessed by immunohistochemistry (IHC).
Extended
TEXT
Cannot be determined
Negative
Not applicable
Positive
Unknown
hpv_dna_status
Indicate the expression of Human papillomavirus (HPV) as assessed using a laboratory test in which cells are scraped from the cervix to look for DNA of HPV. (Reference: NCIt C93141)
Extended
TEXT
Cannot be determined
Negative
Not applicable
Positive
Unknown
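The notes on the four event ID fields and on test_interval describe a rule that can be checked before submission. The following is a minimal sketch in Python, not the platform's own validation script: it assumes each biomarker row has already been read into a dict keyed by column name, and the helper names (is_blank, check_biomarker_association, days_since_diagnosis) are hypothetical. It reads "only one ... is required" as "at least one event ID, otherwise test_interval"; it does not flag rows that carry several event IDs, since the notes do not state how those are handled.

```python
from datetime import date

# Event ID columns listed in the dictionary notes for the biomarker file.
EVENT_ID_FIELDS = (
    "submitter_specimen_id",
    "submitter_primary_diagnosis_id",
    "submitter_treatment_id",
    "submitter_follow_up_id",
)


def is_blank(value):
    """Treat None and whitespace-only values as 'not provided'."""
    return value is None or str(value).strip() == ""


def check_biomarker_association(row):
    """Return a list of problems with how one biomarker row is anchored.

    Assumption: `row` is a dict of column name -> cell value for one TSV line.
    An empty list means the row names a clinical event or a test_interval.
    """
    provided = [f for f in EVENT_ID_FIELDS if not is_blank(row.get(f))]
    has_interval = not is_blank(row.get("test_interval"))
    problems = []
    if not provided and not has_interval:
        problems.append(
            "no event ID provided, so test_interval is required"
        )
    return problems


def days_since_diagnosis(diagnosis_date, test_date):
    """Days from the date of primary diagnosis to the biomarker test."""
    return (test_date - diagnosis_date).days
```

For example, a row with only submitter_donor_id filled in yields one problem, while a row with either a submitter_specimen_id or a test_interval yields none. The dates passed to days_since_diagnosis stay with the submitter; only the resulting integer goes into test_interval.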

Comorbidity (comorbidity)

8 Fields
The collection of data elements related to a donor's comorbidities. A donor's comorbidities are any medical conditions (e.g., diabetes, prior malignancies) that have existed or may occur during the clinical course of the donor who has the index disease under study. To submit multiple comorbidities for a single donor, submit multiple rows in the comorbidity file for this donor.
File Name Example: comorbidity[-optional-extension].tsv
Field & Description
Data Tier
Attributes
Type
Permissible Values
Notes & Scripts
program_id
Unique identifier of the ARGO program.
ID
Required
TEXT
submitter_donor_id
Unique identifier of the donor, assigned by the data provider.
ID
Required
TEXT
Values must meet the regular expression
prior_malignancy
Prior malignancy affecting donor.
Extended
TEXT
Yes
No
Unknown
laterality_of_prior_malignancy
If donor has history of prior malignancy, indicate laterality of previous diagnosis. (Reference: caDSR CDE ID 4122391)
Extended
Conditional
TEXT
Bilateral
Left
Midline
Not applicable
Right
2 more
age_at_comorbidity_diagnosis
Indicate the donor's age at comorbidity diagnosis, in years.
Extended
Conditional
INTEGER
comorbidity_type_code
Indicate the code for the comorbidity using the WHO ICD-10 code classification (https://icd.who.int/browse10/2019/en).
ID
Required
Conditional
TEXT
Values must meet the regular expression
This field is required because it should have a cancer or non-cancer ICD-10 code. This field is marked 'Conditional' because it depends on the value of the `prior_malignancy` field. Both these fields will need to be consistent. If `prior_malignancy` is `Yes`, then an ICD-10 code related to cancer is expected in this field. If `prior_malignancy` is `No`, then an ICD-10 code related to a non-cancer condition is expected in this field.
comorbidity_treatment_status
Indicate if the patient is being treated for the comorbidity (this includes prior malignancies).
Extended
Conditional
TEXT
Yes
No
Unknown
comorbidity_treatment
Indicate treatment details for the comorbidity (this includes prior malignancies).
Extended
Conditional
TEXT
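The consistency rule between prior_malignancy and comorbidity_type_code described in the notes above can be sketched as a small check. This is an illustration under stated assumptions, not the dictionary's actual script: the function names are hypothetical, and "an ICD-10 code related to cancer" is approximated here as any code beginning with "C" (the malignant neoplasm block C00-C97). The real rule may also count other neoplasm codes, so treat the prefix test as a placeholder.

```python
def is_cancer_icd10(code):
    """Assumption: ICD-10 codes starting with 'C' are treated as cancer codes."""
    return code.strip().upper().startswith("C")


def check_comorbidity_code(prior_malignancy, comorbidity_type_code):
    """Return None if the two fields agree, otherwise a short message.

    Only 'Yes' and 'No' are checked; the notes give no expectation for
    'Unknown', so that value is passed through without a check.
    """
    cancer_code = is_cancer_icd10(comorbidity_type_code)
    if prior_malignancy == "Yes" and not cancer_code:
        return "prior_malignancy is Yes but the code is not a cancer code"
    if prior_malignancy == "No" and cancer_code:
        return "prior_malignancy is No but the code is a cancer code"
    return None
```

Under this sketch, check_comorbidity_code("Yes", "C50.1") and check_comorbidity_code("No", "E11.9") both return None, while swapping the two codes produces a message in each case.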