Sequencing Reads
ICGC ARGO provides access to data from ARGO member programs that has been generated through the standardized ARGO Analysis pipelines. This page lists the genomic read analyses and data files those pipelines produce.
Submitted Reads
Raw reads submitted by ARGO member programs are used as inputs within the ARGO Analysis pipeline. Data is accepted in FASTQ, unaligned BAM, or aligned BAM formats, and must be compressed for submission.
File Types
Filename Pattern | Description | Analysis Type | Data Category | Generating Workflow(s) |
---|---|---|---|---|
*.fq.gz, *.fastq.gz | FASTQ file format compressed with gzip | sequencing_experiment | Sequencing Reads | N/A |
*.fq.bz2, *.fastq.bz2 | FASTQ file format compressed with bzip2 | sequencing_experiment | Sequencing Reads | N/A |
*.bam | Binary Alignment Map (BAM) file format | sequencing_experiment | Sequencing Reads | N/A |
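
The filename patterns in the table above can be checked programmatically before submission. The following is a minimal sketch, not part of any ARGO tooling: the function name `classify_submitted_file` and the format labels are hypothetical, and it inspects only the filename, not the file contents.

```python
from fnmatch import fnmatchcase

# Patterns copied from the File Types table above; the labels are illustrative only.
SUBMITTED_PATTERNS = {
    "*.fq.gz": "FASTQ (gzip)",
    "*.fastq.gz": "FASTQ (gzip)",
    "*.fq.bz2": "FASTQ (bzip2)",
    "*.fastq.bz2": "FASTQ (bzip2)",
    "*.bam": "BAM",
}


def classify_submitted_file(filename):
    """Return a format label if the filename matches an accepted pattern, else None.

    Hypothetical helper: matching is case-sensitive and based on the name alone.
    """
    for pattern, label in SUBMITTED_PATTERNS.items():
        if fnmatchcase(filename, pattern):
            return label
    return None
```

For example, `classify_submitted_file("sample1.fastq.gz")` returns `"FASTQ (gzip)"`, while an uncompressed `sample1.fastq` returns `None` because it matches none of the listed patterns.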
Aligned Reads
A read is a sequence obtained from a single sequencing experiment. An aligned read is a read that has been aligned to a common reference genome. These reads typically number from the hundreds of thousands to tens of millions. Aligned reads are generated internally by the ARGO alignment workflow(s) and maintained in CRAM format, a reference-based alternative to BAM that achieves smaller files by storing only the differences between each read and the reference sequence. ARGO reads are aligned to the GRCh38 Human Reference Genome.
File Types
Filename Pattern | Description | Analysis Type | Data Category | Generating Workflow(s) |
---|---|---|---|---|
*.aln.cram | CRAM file format represents a compressed version of the DNA Seq alignment. | sequencing_alignment | Sequencing Reads | DNA Seq Alignment |
*.genome_aln.cram | CRAM file format represents a compressed version of the RNA Seq alignment in genome coordinates. | sequencing_alignment | Sequencing Reads | RNA Seq Alignment |
*.transcriptome_aln.bam | BAM file format represents a compressed version of the RNA Seq alignment in transcript coordinates. | sequencing_alignment | Sequencing Reads | RNA Seq Alignment |
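
Because each aligned-read suffix in the table above maps to one generating workflow, the workflow can be inferred from a filename. This is a rough sketch under that assumption; `generating_workflow` is a hypothetical helper name, and the mapping simply restates the table rows.

```python
# Suffixes and workflow names copied from the File Types table above.
ALIGNED_SUFFIXES = {
    ".aln.cram": "DNA Seq Alignment",
    ".genome_aln.cram": "RNA Seq Alignment",
    ".transcriptome_aln.bam": "RNA Seq Alignment",
}


def generating_workflow(filename):
    """Return the workflow name for an aligned-read filename, or None if unrecognized.

    Hypothetical helper: the suffixes are mutually exclusive, since
    "x.genome_aln.cram" ends in "_aln.cram" rather than ".aln.cram".
    """
    for suffix, workflow in ALIGNED_SUFFIXES.items():
        if filename.endswith(suffix):
            return workflow
    return None
```

A plain `*.bam` file without the `transcriptome_aln` suffix returns `None` here, which keeps submitted BAM files distinct from workflow outputs.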
Aligned Reads Index
Secondary files that serve as external indexes for CRAM files. Each CRAI file is named after its matching CRAM file with .crai appended, so index filenames end in .cram.crai. Index files are required for selective access to genomic data inside a CRAM file via CRAM slicing.
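The naming convention makes it straightforward to confirm that every CRAM file in a download has its index alongside it. The sketch below assumes a flat list of filenames; `crai_name` and `missing_indexes` are hypothetical helper names, not ARGO functions.

```python
def crai_name(cram_filename):
    """Return the expected index filename for a CRAM file (CRAM name + ".crai")."""
    if not cram_filename.endswith(".cram"):
        raise ValueError(f"not a CRAM filename: {cram_filename}")
    return cram_filename + ".crai"


def missing_indexes(filenames):
    """Return the sorted CRAM filenames whose matching .crai is absent from the list."""
    names = set(filenames)
    return sorted(
        name
        for name in names
        if name.endswith(".cram") and crai_name(name) not in names
    )
```

Running `missing_indexes` over a directory listing before attempting CRAM slicing gives an early warning about files that cannot be accessed selectively.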
File Types
Filename Pattern | Description | Analysis Type | Data Category | Generating Workflow(s) |
---|---|---|---|---|
*.aln.cram.crai | CRAM Index file format. Requires a corresponding CRAM file. | sequencing_alignment | Sequencing Reads | DNA Seq Alignment |
*.genome_aln.cram.crai | CRAM Index file format. Requires a corresponding CRAM file. | sequencing_alignment | Sequencing Reads | RNA Seq Alignment |