Output Files

ViralFlow generates two types of output: specific (one set per sample) and compiled (aggregated across all samples).

Specific Outputs

For each sample in the analysis, a results directory named prefix_results is created, where prefix is a code derived from the sample's FASTQ file name. Each directory contains the following files:

| File | Description |
| --- | --- |
| prefix.all.fa.pango.out.csv | Tabular file with the results of the pangolin tool |
| prefix.ann.vcf | VCF file (tab-separated text) with the variant annotations produced by the snpEff tool |
| prefix.depth5.all.fa.nextclade.csv | Tabular file with the results of the nextclade tool |
| prefix.depth5.amb.fa | Consensus genome with ambiguous nucleotides at positions with multiple alleles |
| prefix.depth5.fa | Majority consensus genome (majority nucleotide at each position); this is the sequence normally deposited in GISAID |
| prefix.depth5.fa.algn | Consensus genome aligned to the reference genome and kept at the reference length (mafft --keeplength) |
| prefix.depth5.fa.algn.minor.fa | Consensus genome with minority nucleotides |
| prefix.depth5.fa.bc.intrahost.short.tsv | Tabular file summarizing the genomic positions where intrahost variants are supported |
| prefix.depth5.fa.bc.intrahost.tsv | Tabular file with all information on intrahost variant positions |
| prefix.mapped.R1.fq.gz | FASTQ R1 file with mapped reads |
| prefix.mapped.R2.fq.gz | FASTQ R2 file with mapped reads |
| prefix.metrics.genome.tsv | Tabular file with mapping depth and coverage metrics |
| prefix.fastp.html | HTML file summarizing the results of the fastp tool |
| prefix_snpEff_summary.html | HTML file summarizing the results of the snpEff tool |
| prefix.sorted.bam | Sorted BAM file with the sample reads mapped against the reference genome |
| prefix.tsv | VCF-like tabular file generated by iVar |
| prefix.unmapped.R1.bam.fq | FASTQ R1 file with unmapped reads |
| prefix.unmapped.R2.bam.fq | FASTQ R2 file with unmapped reads |
| prefix.vcf | VCF file |
| metrics.alignment_summary_metrics | Text file summarizing various mapping metrics |
| nextclade.errors.csv | File reporting nextclade errors |
| nextclade_gene_*.translation.fasta | FASTA file with the translated protein of each gene |
| snpEff_genes.txt | Tabular file with the number of variants and estimated impact per gene |
| wgs | Text file with mapping metrics, including depth by region |
| prefix_coveragePlot.png | PNG file with graphical visualization of genome coverage |
| prefix_coveragePlot.svg | SVG file with graphical visualization of genome coverage |
| prefix_coveragePlot.html | HTML file with graphical visualization of genome coverage |
| prefix_snpPlot.png | PNG file with graphical visualization of SNPs |
| prefix_snpPlot.svg | SVG file with graphical visualization of SNPs |
| prefix_snpPlot.html | HTML file with graphical visualization of SNPs |
| prefix_depthPlot.png | PNG file with graphical visualization of depth |
| prefix_depthPlot.svg | SVG file with graphical visualization of depth |
| prefix_depthPlot.html | HTML file with graphical visualization of depth |
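
Because every per-sample directory follows the prefix_results naming pattern, a short script can locate the sample directories and flag files that are absent. The following is a minimal sketch, not part of ViralFlow: the function names find_sample_dirs and missing_files are hypothetical, the list of expected files is only a subset of the names listed above, and it assumes the files sit directly inside each prefix_results directory rather than in subdirectories.

```python
from pathlib import Path

RESULTS_SUFFIX = "_results"

# Subset of the per-sample file names listed above; "{prefix}" stands for
# the sample code. Extend this list to match the files you rely on.
EXPECTED_PATTERNS = [
    "{prefix}.depth5.fa",
    "{prefix}.depth5.amb.fa",
    "{prefix}.sorted.bam",
    "{prefix}.vcf",
    "{prefix}.metrics.genome.tsv",
]


def find_sample_dirs(output_root):
    """Return {prefix: directory} for each '<prefix>_results' directory."""
    samples = {}
    for entry in sorted(Path(output_root).iterdir()):
        name = entry.name
        # Keep only directories whose name ends with the suffix and has a
        # non-empty prefix in front of it.
        if (entry.is_dir() and name.endswith(RESULTS_SUFFIX)
                and len(name) > len(RESULTS_SUFFIX)):
            samples[name[: -len(RESULTS_SUFFIX)]] = entry
    return samples


def missing_files(prefix, sample_dir, patterns=EXPECTED_PATTERNS):
    """Return the expected file names that are absent from sample_dir.

    Assumption: files are stored at the top level of the directory.
    """
    expected = [pattern.format(prefix=prefix) for pattern in patterns]
    return [name for name in expected
            if not (Path(sample_dir) / name).is_file()]
```

A typical use would be to loop over find_sample_dirs("path/to/output").items() and print the result of missing_files(prefix, sample_dir) for each sample, which gives a quick completeness check before moving on to downstream analysis.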

Compiled Outputs

ViralFlow also generates compiled outputs that aggregate results across all samples:

| File | Description |
| --- | --- |
| assembly_statistics_summary.tsv | Summary of assembly statistics for all samples |
| alignment_metrics_summary.tsv | Summary of alignment metrics for all samples |
| all_consensus.fasta | Concatenated consensus sequences for all samples |
| intrahost_summary.tsv | Compiled intrahost variant information for all samples |
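
As an illustration of working with the compiled outputs, the sketch below reads a multi-sequence FASTA file such as all_consensus.fasta and reports the length and the number of N bases for each sequence. It uses only the Python standard library. The function names read_fasta and summarize_consensus are hypothetical, and treating N as the marker for uncovered or masked positions is an assumption (a common convention for consensus sequences, not something stated here).

```python
def read_fasta(path):
    """Yield (header, sequence) pairs from a FASTA file."""
    header, chunks = None, []
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue  # skip blank lines
            if line.startswith(">"):
                # A new record starts: emit the previous one, if any.
                if header is not None:
                    yield header, "".join(chunks)
                header, chunks = line[1:], []
            elif header is not None:
                chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def summarize_consensus(path):
    """Return one dict per sequence with its length and N-base count.

    Assumption: 'N' marks positions without a called base.
    """
    rows = []
    for header, seq in read_fasta(path):
        seq = seq.upper()
        n_count = seq.count("N")
        rows.append({
            "id": header,
            "length": len(seq),
            "n_count": n_count,
            "n_fraction": n_count / len(seq) if seq else 0.0,
        })
    return rows
```

Calling summarize_consensus("all_consensus.fasta") returns a list that can be sorted by n_fraction to spot samples whose consensus is mostly undetermined.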