Metagenomic Read Mapping Service

Overview

The Metagenomic Read Mapping Service uses KMA to align reads against antibiotic resistance genes, virulence factors, or other custom sets of genes.

See also

Using the Metagenomic Read Mapping Service

The Metagenomic Read Mapping submenu option under the Services main menu (Metagenomics category) opens the Metagenomic Read Mapping Service input form (shown below). Note: You must be logged into BV-BRC to use this service.

Metagenomic Read Mapping Service Menu

Options

Metagenomic Read Mapping Service Input Form

Input File

Paired read library

Read File 1 & 2: Many paired read libraries are given as file pairs, with each file containing half of each read pair. Paired read files are expected to be sorted such that each read in a pair occurs in the same Nth position as its mate in their respective files. These files are specified as READ FILE 1 and READ FILE 2. For a given file pair, the selection of which file is READ 1 and which is READ 2 does not matter.

Single read library

Read File: FASTQ file containing reads.

SRA run accession

Allows direct upload of read files from the NCBI Sequence Read Archive to the service. Entering the SRR accession number and clicking the arrow will add the file to the selected libraries box for use in the assembly.

Selected libraries

Read files to be mapped.

Parameters

Predefined Gene Set Name: A pre-built set of genes against which reads are mapped. Two options are available:

Feature Group: Reads can also be mapped to a previously created groups of features (genes or proteins). There are several ways to navigate to the feature group. Clicking on the drop-down box will show the feature groups, with the most recently created groups shown first. Clicking on the desired group will fill the box with that name.

Fasta File: Reads can be mapped to a fasta file describing an dna sequence. The file must be present in BV-BRC, which would be located by entering the name in the text box, clicking on the drop-down box, or navigating within the workspace. Inorder to select a file for this service the file type must be specified as one of our fasta types (aligned_dna_fasta, or feature_dna_fasta).

Output Folder: Workspace folder where the results will be saved.

Output Name: User-provided name used to uniquely identify results.

Output Results

Metagenomic Read Mapping Service Output Files

The Metagenomic Read Mapping Service generates several files that are deposited in the Private Workspace in the designated Output Folder. These include

  • MetagenomicReadMappingReport.html - A web-browser-friendly report that summarizes the results of the service (see description and image below).

  • kma.aln - The consensus alignment of the reads against their template.

  • kma.frag.gz - Mapping information on each mapped read, columns are: read, number of equally well mapping templates, mapping score, start position, end position (w.r.t. template), the choosen template.

  • kma.fsa - The consensus sequences drawn from the alignments.

  • kma.res - A result overview giving the most common statistics for each mapped template.

  • kma.mat.gz - Base counts on each position in each template, (only if “-matrix” is enabled).

Metagenomic Read Mapping Report Output

KMA Read Mapping Report

This page is a web-friendly report that summarizes the output of MKA. It provides a link to the input data, an interactive chart view (see description below), and a table of the reference genes mapped. The columns in the table are as follows:

  • Template - Identifier of the template (reference gene) sequence that match the query reads. Clicking on any of the template identifiers in the first column will open a Specialty Gene List View that shows all the genes in BV-BRC that have BLAT hits to the same template gene.

  • Function - Template gene function.

  • Genome: Genome that contains template gene. Clicking on the name in the Genome column will open a new tab that shows the Genome List view, which shows all the genomes in BV-BRC that fall under the same taxonomy of the selected name.

  • Score - Global alignment score of the template.

  • Expected - Expected alignment score if all mapping reads were smeared over all templates in the database.

  • Template_length - Template gene length in nucleotides.

  • Template_Identity - Percent identity between the query and template sequence, divided the length of the matching template sequence

  • Template_Coverage - Percent of the template that is covered by the query

  • Query_Identity - Percent identity between the query and template sequence, over the length of the matching query sequence

  • Query_Coverage - Length of the matching query sequnce divided by the template length

  • Depth - Number of times the template has been covered by the query.

  • q_value - Quantile from McNemars test, to test whether the current template is a significant hit.

  • p_value - p-vaue corresponding to the obtained q_value

Action buttons

After selecting one of the output files by clicking it, a set of options becomes available in the vertical green Action Bar on the right side of the table. These include

  • Hide/Show: Toggles (hides) the right-hand side Details Pane.

  • Guide Link to the corresponding Quick Reference Guide.

  • Download: Downloads the selected item.

  • View Displays the content of the file, typically as plain text or rendered html, depending on filetype.

  • Delete Deletes the file.

  • Rename Allows renaming of the file.

  • Copy: Copies the selected items to the clipboard.

  • Move Allows moving of the file to another folder.

More details are available in the Action Bar Quick Reference Guide.

References

  • Kent, W. James. “BLAT—the BLAST-like alignment tool.” Genome research 12.4 (2002): 656-664.

  • Jia, Baofeng, et al. “CARD 2017: expansion and model-centric curation of the comprehensive antibiotic resistance database.” Nucleic acids research (2016): gkw1004.

  • Liu, Bo, et al. “VFDB 2019: a comparative pathogenomic platform with an interactive web interface.” Nucleic acids research 47.D1 (2018): D687-D692.

  • Philip T.L.C. Clausen, Frank M. Aarestrup & Ole Lund, “Rapid and precise alignment of raw reads against redundant databases with KMA”, BMC Bioinformatics, 2018;19:307.