Read Statistics
| Stage | Reads | Mean length |
|---|---|---|
| Unfiltered | 5000 | 564 |
| centrifuge-filtered | 1921 | 539 |
| human_dna | 116 | 456 |
| human_rna | 0 | 0 |
| reagent | 0 | 0 |
| Filtered | 2963 | 584 |
Read groups produced by host/contaminant filtering
- Stage: Host/contaminant filtering step
- Reads: Number of reads in this group
- Mean length: Mean read length within this read group
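The counts in the table above can be cross-checked with a few lines of Python. This is a minimal sketch, assuming every unfiltered read lands in exactly one group (a removed group or the final "Filtered" set); the report does not state this explicitly, but the numbers are consistent with it. The variable names are illustrative and not part of the pipeline.

```python
# Read counts copied from the Read Statistics table.
unfiltered = 5000
removed = {
    "centrifuge-filtered": 1921,
    "human_dna": 116,
    "human_rna": 0,
    "reagent": 0,
}
filtered = 2963

# Assumption: each unfiltered read is assigned to exactly one group,
# so the filtered count is what remains after all removals.
total_removed = sum(removed.values())
expected_filtered = unfiltered - total_removed
removed_share = 100 * total_removed / unfiltered

print("counts consistent:", expected_filtered == filtered)
print(f"{removed_share:.2f}% of reads removed")
```

If the check prints `False` for another run, the groups probably overlap or a filtering stage is missing from the table.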
The distributions of the reads after trimming. They still contain host reads.
The distributions of the reads after cleaning. Host reads and technical contaminants are removed.
Read classifications by centrifuge.
Consensus
| Reference | Family | Organism | Segment | Length | Coverage | Description |
|---|---|---|---|---|---|---|
| GU830839.1 | Arenaviridae | Mammarenavirus lassaense | S | 3358 | 95.32 | Lassa virus strain BA366 glycoprotein precursor (GPC) and nucleoprotein (NP) genes, complete cds |
| GU979513.1 | Arenaviridae | Mammarenavirus lassaense | L | 7207 | 96.06 | Lassa virus strain BA366 Z protein (Z) and polymerase (L) genes, complete cds |
Reference-based genome assembly statistics
- Reference: ID of the reference sequence
- Family: Virus family
- Organism: Name of the species
- Segment: Identifier of the segment; "Unsegmented" for single-segment virus genomes
- Length: Length of the reference genome
- Coverage: Percentage of the reference genome that was successfully called
- Description: Database description of the reference
| Reference | Family | Organism | Segment | Length | Coverage | Positions called | Ambiguous positions | Mapped reads | Average read coverage |
|---|---|---|---|---|---|---|---|---|---|
| GU830839.1 | Arenaviridae | Mammarenavirus lassaense | S | 3358 | 95.32 | 3201 | 157 | 997 | 176.0 |
| GU979513.1 | Arenaviridae | Mammarenavirus lassaense | L | 7207 | 96.06 | 6923 | 284 | 999 | 83.1 |
Reference-based genome assembly statistics
- Reference: ID of the reference sequence
- Family: Virus family
- Organism: Name of the species
- Segment: Identifier of the segment; "Unsegmented" for single-segment virus genomes
- Length: Length of the reference genome
- Coverage: Percentage of the reference genome that was successfully called
- Positions called: Number of bases called
- Ambiguous positions: Number of ambiguous positions set to "N"
- Mapped reads: Number of reads aligned to the reference genome
- Average read coverage: Average number of reads per reference genome position
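The relationship between the columns in the legend above can be sketched as follows. This assumes that Coverage is the number of called positions divided by the reference length, and that called and ambiguous positions together span the whole reference; neither is stated in the report, but both hold for the rows shown. The helper `coverage_percent` is a hypothetical name used only for illustration.

```python
# (reference, length, positions called, ambiguous positions),
# copied from the assembly statistics tables in this report.
rows = [
    ("GU830839.1", 3358, 3201, 157),
    ("GU979513.1", 7207, 6923, 284),
    ("LC710218.1", 3605, 3364, 241),
]

def coverage_percent(length, called):
    # Assumption: Coverage = called positions / reference length, in percent.
    return round(100 * called / length, 2)

coverages = {}
for ref, length, called, ambiguous in rows:
    # Assumption: every reference position is either called or set to "N".
    if called + ambiguous != length:
        raise ValueError(f"{ref}: called + ambiguous != length")
    coverages[ref] = coverage_percent(length, called)
    print(ref, coverages[ref])
```

The computed values agree with the Coverage column (95.32, 96.06 and 93.31).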
| Reference | Family | Organism | Segment | Length | Coverage | Description |
|---|---|---|---|---|---|---|
| LC710218.1 | Fiersviridae | Emesvirus zinderi | Unsegmented | 3605 | 93.31 | Escherichia phage MS2 GI_B RNA, complete genome |
Reference-based genome assembly statistics
- Reference: ID of the reference sequence
- Family: Virus family
- Organism: Name of the species
- Segment: Identifier of the segment; "Unsegmented" for single-segment virus genomes
- Length: Length of the reference genome
- Coverage: Percentage of the reference genome that was successfully called
- Description: Database description of the reference
| Reference | Family | Organism | Segment | Length | Coverage | Positions called | Ambiguous positions | Mapped reads | Average read coverage |
|---|---|---|---|---|---|---|---|---|---|
| LC710218.1 | Fiersviridae | Emesvirus zinderi | Unsegmented | 3605 | 93.31 | 3364 | 241 | 798 | 130.5 |
Reference-based genome assembly statistics
- Reference: ID of the reference sequence
- Family: Virus family
- Organism: Name of the species
- Segment: Identifier of the segment; "Unsegmented" for single-segment virus genomes
- Length: Length of the reference genome
- Coverage: Percentage of the reference genome that was successfully called
- Positions called: Number of bases called
- Ambiguous positions: Number of ambiguous positions set to "N"
- Mapped reads: Number of reads aligned to the reference genome
- Average read coverage: Average number of reads per reference genome position
| Reference | Family | Organism | Length | Coverage | Description |
|---|---|---|---|---|---|
Reference-based genome assembly statistics
- Reference: ID of the reference sequence
- Family: Virus family
- Organism: Name of the species
- Length: Length of the reference genome
- Coverage: Percentage of the reference genome that was successfully called
- Description: Database description of the reference
| Reference | Family | Organism | Length | Coverage | Positions called | Ambiguous positions | Mapped reads | Average read coverage |
|---|---|---|---|---|---|---|---|---|
Reference-based genome assembly statistics
- Reference: ID of the reference sequence
- Family: Virus family
- Organism: Name of the species
- Length: Length of the reference genome
- Coverage: Percentage of the reference genome that was successfully called
- Positions called: Number of bases called
- Ambiguous positions: Number of ambiguous positions set to "N"
- Mapped reads: Number of reads aligned to the reference genome
- Average read coverage: Average number of reads per reference genome position
| Reference | Family | Organism | Length | Coverage | Description |
|---|---|---|---|---|---|
| GU830839.1 | Arenaviridae | Mammarenavirus lassaense | 3358 | 95.32 | Lassa virus strain BA366 glycoprotein precursor (GPC) and nucleoprotein (NP) genes, complete cds |
| GU979513.1 | Arenaviridae | Mammarenavirus lassaense | 7207 | 96.06 | Lassa virus strain BA366 Z protein (Z) and polymerase (L) genes, complete cds |
| LC710218.1 | Fiersviridae | Emesvirus zinderi | 3605 | 93.31 | Escherichia phage MS2 GI_B RNA, complete genome |
Reference-based genome assembly statistics
- Reference: ID of the reference sequence
- Family: Virus family
- Organism: Name of the species
- Length: Length of the reference genome
- Coverage: Percentage of the reference genome that was successfully called
- Description: Database description of the reference
| Reference | Family | Organism | Length | Coverage | Positions called | Ambiguous positions | Mapped reads | Average read coverage |
|---|---|---|---|---|---|---|---|---|
| GU830839.1 | Arenaviridae | Mammarenavirus lassaense | 3358 | 95.32 | 3201 | 157 | 997 | 176.0 |
| GU979513.1 | Arenaviridae | Mammarenavirus lassaense | 7207 | 96.06 | 6923 | 284 | 999 | 83.1 |
| LC710218.1 | Fiersviridae | Emesvirus zinderi | 3605 | 93.31 | 3364 | 241 | 798 | 130.5 |
Reference-based genome assembly statistics
- Reference: ID of the reference sequence
- Family: Virus family
- Organism: Name of the species
- Length: Length of the reference genome
- Coverage: Percentage of the reference genome that was successfully called
- Positions called: Number of bases called
- Ambiguous positions: Number of ambiguous positions set to "N"
- Mapped reads: Number of reads aligned to the reference genome
- Average read coverage: Average number of reads per reference genome position
Contigs
| Filter | Contig | Length | Number of reads | Blast Hit | Organism | Hit length | Contig alignment coverage | Reference alignment coverage | Sequence Identity | Classification | Taxonomic Rank |
|---|---|---|---|---|---|---|---|---|---|---|---|
| no-target | tig00000001 | 6994 | 502 | GU979513 | Mammarenavirus lassaense | 7207 | 1.0 | 0.97 | 1.0 | Mammarenavirus lassaense | species |
| no-target | tig00000002 | 3420 | 398 | LC710218 | Emesvirus zinderi | 3605 | 1.0 | 0.95 | 1.0 | Escherichia phage MS2 | leaf |
| no-target | tig00000003 | 3209 | 509 | GU830839 | Mammarenavirus lassaense | 3358 | 1.0 | 0.96 | 1.0 | Mopeia Lassa virus reassortant 29 | species |
Contigs and targets found
- Filter: Filter used on reads before assembly
- Contig: Contig identifier
- Length: Length of the contig in base pairs
- Number of reads: Number of (corrected) reads used by canu to build this contig
- Blast Hit: Virus reference genome found with blast search
- Organism: Organism of the blast hit
- Hit length: Length of the blast hit reference genome in base pairs
- Contig alignment coverage: Share of the contig aligned by blast
- Reference alignment coverage: Share of the reference aligned by blast
- Sequence Identity: Sequence similarity of the aligned parts
- Classification: Classification according to centrifuge
- Taxonomic Rank: Taxonomic rank of the classification
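As a plausibility check on the two alignment coverage columns, the reference alignment coverage can be estimated from the contig length, the contig alignment coverage and the hit length. This is a rough sketch, assuming the aligned contig bases map roughly one-to-one onto reference bases (no large insertions or deletions); it is not how blast itself computes these values, and the function name is hypothetical.

```python
# (contig, contig length, hit length, contig alignment coverage,
#  reported reference alignment coverage), copied from the Contigs table.
contigs = [
    ("tig00000001", 6994, 7207, 1.0, 0.97),
    ("tig00000002", 3420, 3605, 1.0, 0.95),
    ("tig00000003", 3209, 3358, 1.0, 0.96),
]

def estimated_reference_coverage(contig_length, hit_length, contig_coverage):
    # Assumption: aligned length ~ contig_length * contig_coverage,
    # i.e. the alignment contains no large gaps.
    aligned = contig_length * contig_coverage
    return round(aligned / hit_length, 2)

estimates = {
    name: estimated_reference_coverage(length, hit, cov)
    for name, length, hit, cov, _ in contigs
}
for name, _, _, _, reported in contigs:
    print(name, "estimated:", estimates[name], "reported:", reported)
```

For the three contigs in this run the estimates match the reported values, which suggests each contig aligns to its reference over nearly its full length.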
| Stage | Sequence type | Reads/Contigs | Mean length |
|---|---|---|---|
| no-target | raw | 2868 | 599.1 |
| no-target | corrected | 1534 | 771.0 |
| no-target | contigs | 3 | 4541.0 |
| reassembly | input | 2868 | 599.1 |
Read and contig numbers in the different assembly runs
- Stage: Assembly run (e.g. for a given filter, no-filter or re-assemblies)
- Sequence type: Type of the sequence
- Reads/Contigs: Number of reads or contigs
- Mean length: Mean length of reads or contigs in base pairs
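The mean contig length in the table above can be reproduced from the Contigs table. A minimal sketch, assuming the mean is the unweighted arithmetic mean over the three contigs of the no-target run:

```python
# Contig lengths in base pairs, copied from the Contigs table.
contig_lengths = {
    "tig00000001": 6994,
    "tig00000002": 3420,
    "tig00000003": 3209,
}

# Unweighted arithmetic mean over all contigs.
mean_contig_length = sum(contig_lengths.values()) / len(contig_lengths)
print(round(mean_contig_length, 1))
```

The result agrees with the 4541.0 reported for the `contigs` row.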
Versions
| Database | Version | Description |
|---|---|---|
| Filters | 1.0 | Human (GRCh38), mouse (8_GRCm38), mastomys and contaminant filter set |
| Classification | 1.0 | Refseq reference genomes plus genbank virus sequences |
| Virus | 2.0 | NCBI virus genomes from 26.10.2024 with covid sequences from RVDB version 29 |
Local database versions used in this run.