
Sequencing Reads

ICGC ARGO provides access to data from ARGO member programs generated through the standardized ARGO Analysis pipelines. This page lists the genomic read analyses and data files generated by those pipelines.

Submitted Reads

Raw reads submitted by ARGO member programs are used as inputs within the ARGO Analysis pipeline. Data is accepted in FASTQ, unaligned BAM, or aligned BAM formats, and must be compressed for submission.

File Types

| Filename Pattern | Description | Analysis Type | Data Category | Generating Workflow(s) |
| --- | --- | --- | --- | --- |
| *.fq.gz, *.fastq.gz | FASTQ file format compressed with gzip | sequencing_experiment | Sequencing Reads | N/A |
| *.fq.bz2, *.fastq.bz2 | FASTQ file format compressed with bzip2 | sequencing_experiment | Sequencing Reads | N/A |
| *.bam | Binary Alignment Map (BAM) file format | sequencing_experiment | Sequencing Reads | N/A |
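
The filename patterns listed for submitted reads can be checked programmatically before submission. The following is a minimal sketch, not part of any ARGO tooling: the function name `classify_submitted_file` and the format labels are hypothetical, and only the glob patterns come from the table.

```python
from fnmatch import fnmatchcase

# Glob patterns taken from the Submitted Reads table.
# The labels on the right are illustrative, not official ARGO values.
SUBMITTED_READ_PATTERNS = {
    "*.fq.gz": "FASTQ (gzip)",
    "*.fastq.gz": "FASTQ (gzip)",
    "*.fq.bz2": "FASTQ (bzip2)",
    "*.fastq.bz2": "FASTQ (bzip2)",
    "*.bam": "BAM",
}


def classify_submitted_file(filename):
    """Return a format label for a submitted read file, or None if no pattern matches."""
    for pattern, label in SUBMITTED_READ_PATTERNS.items():
        # fnmatchcase is case-sensitive on every platform, unlike fnmatch.
        if fnmatchcase(filename, pattern):
            return label
    return None
```

A filename that returns `None`, such as an uncompressed `sample.fastq`, does not match any accepted pattern and would likely need to be compressed or converted first.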

Aligned Reads

A read is a sequence obtained from a single sequencing experiment. An aligned read is a read that has been aligned to a common reference genome. These reads typically number from the hundreds of thousands to the tens of millions. Aligned reads are generated internally by the ARGO alignment workflow(s) and maintained in CRAM format, a compressed alternative to BAM that uses reference-based compression, storing only the differences between each read and the reference sequence. ARGO reads are aligned to the GRCh38 Human Reference Genome.

File Types

| Filename Pattern | Description | Analysis Type | Data Category | Generating Workflow(s) |
| --- | --- | --- | --- | --- |
| *.aln.cram | CRAM file format represents a compressed version of the DNA Seq alignment. | sequencing_alignment | Sequencing Reads | DNA Seq Alignment |
| *.genome_aln.cram | CRAM file format represents a compressed version of the RNA Seq alignment in genome coordinates. | sequencing_alignment | Sequencing Reads | RNA Seq Alignment |
| *.transcriptome_aln.bam | BAM file format represents a compressed version of the RNA Seq alignment in transcript coordinates. | sequencing_alignment | Sequencing Reads | RNA Seq Alignment |

Aligned Reads Index

CRAM index (CRAI) files are secondary files that serve as external indexes for CRAM files. Each CRAI file is named after its matching CRAM file with .crai appended, so the index filename ends in .cram.crai. Index files are required for selective access to genomic regions within a CRAM file, known as CRAM slicing.
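
The naming convention makes it straightforward to derive the expected index name for a CRAM file and to spot CRAM files that lack one. The sketch below assumes the index name is the CRAM filename with `.crai` appended, consistent with the patterns in the table that follows; the helper names `crai_name_for` and `crams_missing_index` are hypothetical.

```python
def crai_name_for(cram_filename):
    """Return the expected CRAI index filename for a CRAM filename."""
    if not cram_filename.endswith(".cram"):
        raise ValueError(f"not a CRAM filename: {cram_filename}")
    # Assumed convention: <name>.cram -> <name>.cram.crai
    return cram_filename + ".crai"


def crams_missing_index(filenames):
    """Return the sorted CRAM filenames whose expected CRAI file is not in the list."""
    names = set(filenames)
    return sorted(
        name
        for name in names
        if name.endswith(".cram") and crai_name_for(name) not in names
    )
```

For example, given a directory listing that contains `a.aln.cram`, `a.aln.cram.crai`, and `b.genome_aln.cram`, only `b.genome_aln.cram` would be reported as missing its index.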

File Types

| Filename Pattern | Description | Analysis Type | Data Category | Generating Workflow(s) |
| --- | --- | --- | --- | --- |
| *.aln.cram.crai | CRAM Index file format. Requires a corresponding CRAM file. | sequencing_alignment | Sequencing Reads | DNA Seq Alignment |
| *.genome_aln.cram.crai | CRAM Index file format. Requires a corresponding CRAM file. | sequencing_alignment | Sequencing Reads | RNA Seq Alignment |
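
As one illustration of CRAM slicing, a region can be extracted from an indexed CRAM file with `samtools view`. The sketch below only builds the command line; it is not an ARGO-provided tool. The function name `build_slice_command`, the file paths, and the `chr`-prefixed region name are all assumptions, and the reference FASTA passed with `-T` must be the same reference the CRAM was aligned against.

```python
def build_slice_command(cram_path, region, reference_fasta, output_path):
    """Build a samtools command that extracts one region from an indexed CRAM file.

    samtools expects the index next to the CRAM file as <cram_path>.crai.
    """
    return [
        "samtools", "view",
        "-b",                    # write the slice as BAM
        "-T", reference_fasta,   # reference FASTA used to decode the CRAM
        "-o", output_path,       # output file for the slice
        cram_path,
        region,                  # e.g. "chr1:1000000-2000000" (contig naming assumed)
    ]
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` on a machine where samtools is installed and the CRAM, its CRAI index, and the reference FASTA are available locally.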