Sequencing Reads

ICGC ARGO provides access to data from ARGO member programs, generated through the standardized ARGO Analysis pipelines. This page lists the genomic read analyses and data files that those pipelines produce.

Submitted Reads

Raw reads submitted by ARGO member programs are used as inputs within the ARGO Analysis pipeline. Data is accepted in FASTQ, unaligned BAM, or aligned BAM formats, and must be compressed for submission.

File Types

Filename Pattern	Description	Analysis Type	Data Category	Generating Workflow(s)
*.fq.gz, *.fastq.gz	FASTQ file format compressed with gzip	sequencing_experiment	Sequencing Reads	N/A
*.fq.bz2, *.fastq.bz2	FASTQ file format compressed with bzip2	sequencing_experiment	Sequencing Reads	N/A
*.bam	Binary Alignment Map (BAM) file format	sequencing_experiment	Sequencing Reads	N/A
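
The filename patterns in the table above can be checked programmatically before submission. The following is a minimal sketch, not part of the ARGO tooling: the function name `classify_submitted_file` and the format labels are hypothetical, and only the filename extensions are taken from the table.

```python
from fnmatch import fnmatchcase
from typing import Optional

# Glob patterns taken from the File Types table; the labels are illustrative only.
SUBMITTED_READ_PATTERNS = {
    "*.fq.gz": "FASTQ (gzip)",
    "*.fastq.gz": "FASTQ (gzip)",
    "*.fq.bz2": "FASTQ (bzip2)",
    "*.fastq.bz2": "FASTQ (bzip2)",
    "*.bam": "BAM",
}


def classify_submitted_file(filename: str) -> Optional[str]:
    """Return a format label for a submitted reads file, or None if no pattern matches."""
    for pattern, label in SUBMITTED_READ_PATTERNS.items():
        # fnmatchcase is case-sensitive on every platform, unlike fnmatch.
        if fnmatchcase(filename, pattern):
            return label
    return None


if __name__ == "__main__":
    for name in ["sample1.fastq.gz", "sample2.fq.bz2", "sample3.bam", "notes.txt"]:
        print(name, "->", classify_submitted_file(name))
```

This check looks only at the filename. It does not verify that the file contents are valid FASTQ or BAM.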

Aligned Reads

A read is a sequence obtained from a single sequencing experiment. An aligned read is a read that has been aligned to a common reference genome. Typically, these reads number from the hundreds of thousands to tens of millions. Aligned reads are generated internally by the ARGO alignment workflow(s) and maintained in CRAM format, a reference-based compressed alternative to BAM that stores the differences between each read and the reference sequence rather than the full read sequence. ARGO reads are aligned to the GRCh38 Human Reference Genome.

File Types

Filename Pattern	Description	Analysis Type	Data Category	Generating Workflow(s)
*.aln.cram	CRAM file containing a compressed version of the DNA Seq alignment.	sequencing_alignment	Sequencing Reads	DNA Seq Alignment
*.genome_aln.cram	CRAM file containing a compressed version of the RNA Seq alignment in genome coordinates.	sequencing_alignment	Sequencing Reads	RNA Seq Alignment
*.transcriptome_aln.bam	BAM file containing a compressed version of the RNA Seq alignment in transcript coordinates.	sequencing_alignment	Sequencing Reads	RNA Seq Alignment
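
Because each aligned reads suffix in the table above maps to one file format and one generating workflow, a downloaded file can be identified from its name alone. The sketch below assumes only the suffixes listed in the table; the names `AlignedFileInfo` and `describe_aligned_file` are hypothetical and do not come from any ARGO library.

```python
from typing import NamedTuple, Optional


class AlignedFileInfo(NamedTuple):
    file_format: str
    workflow: str


# Suffix -> (format, generating workflow), copied from the File Types table.
ALIGNED_SUFFIXES = {
    ".aln.cram": AlignedFileInfo("CRAM", "DNA Seq Alignment"),
    ".genome_aln.cram": AlignedFileInfo("CRAM", "RNA Seq Alignment"),
    ".transcriptome_aln.bam": AlignedFileInfo("BAM", "RNA Seq Alignment"),
}


def describe_aligned_file(filename: str) -> Optional[AlignedFileInfo]:
    """Return the format and workflow for an aligned reads file, or None if unknown."""
    for suffix, info in ALIGNED_SUFFIXES.items():
        if filename.endswith(suffix):
            return info
    return None


if __name__ == "__main__":
    for name in ["donor1.aln.cram", "donor1.genome_aln.cram", "donor1.transcriptome_aln.bam"]:
        print(name, "->", describe_aligned_file(name))
```

The three suffixes do not overlap (`.genome_aln.cram` ends in `_aln.cram`, not `.aln.cram`), so the lookup order does not affect the result.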

Aligned Reads Index

Index files are secondary files that serve as external indexes for CRAM files. Each CRAI file is named after its matching CRAM file, with .crai appended to the full CRAM filename, so index filenames end in .cram.crai. Index files are required for selective access to genomic data inside a CRAM file via CRAM slicing.

File Types

Filename Pattern	Description	Analysis Type	Data Category	Generating Workflow(s)
*.aln.cram.crai	CRAM Index file format. Requires a corresponding CRAM file.	sequencing_alignment	Sequencing Reads	DNA Seq Alignment
*.genome_aln.cram.crai	CRAM Index file format. Requires a corresponding CRAM file.	sequencing_alignment	Sequencing Reads	RNA Seq Alignment
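
The naming convention described above makes it straightforward to derive the expected index name for a CRAM file and to spot CRAM files whose index is missing. This is a minimal sketch under that assumption; `crai_name_for` and `find_unindexed_crams` are hypothetical helper names, not ARGO functions.

```python
from typing import Iterable, List


def crai_name_for(cram_filename: str) -> str:
    """Return the expected CRAI index filename for a CRAM file."""
    if not cram_filename.endswith(".cram"):
        raise ValueError(f"not a CRAM filename: {cram_filename!r}")
    # The index name is the full CRAM filename with ".crai" appended.
    return cram_filename + ".crai"


def find_unindexed_crams(filenames: Iterable[str]) -> List[str]:
    """Return the CRAM files in `filenames` that have no matching CRAI file."""
    names = set(filenames)
    return sorted(
        name for name in names
        if name.endswith(".cram") and crai_name_for(name) not in names
    )


if __name__ == "__main__":
    downloaded = [
        "donor1.aln.cram",
        "donor1.aln.cram.crai",
        "donor1.genome_aln.cram",
    ]
    print(crai_name_for("donor1.aln.cram"))
    print(find_unindexed_crams(downloaded))
```

A check like this can be run on a download directory listing before attempting CRAM slicing, since slicing depends on the index being available alongside the CRAM file.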
