BCC2020 has ended
Registration closed July 15.

BCC2020 is online, global, and affordable. The meeting and training are now done, and the CoFest is under way.

The 2020 Bioinformatics Community Conference brings together the Bioinformatics Open Source Conference (BOSC) and the Galaxy Community Conference into a single event featuring training, a meeting, and a CollaborationFest. Events run from July 17 through July 25 and are held in both the eastern and western hemispheres.

Sunday, July 19 • 13:00 - 13:45
P5-04: Magic-BLAST, an accurate RNA-seq mapper 🍐


➞ Abstract

This poster will be presented live at BCC West.

Grzegorz Boratyn, Jean Thierry-Mieg, Danielle Thierry-Mieg, Tom Madden
National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health. Email: boratyng@ncbi.nlm.nih.gov

Project Website: https://ncbi.github.io/magicblast
Source Code: https://ftp.ncbi.nlm.nih.gov/blast/executables/magicblast/1.5.0/ncbi-magicblast-1.5.0-src.tar.gz
License: Public domain

Next-generation sequencing (NGS) technologies facilitate rapid analysis of gene expression across individuals, tissues, or conditions. Mapping reads against a reference genome is the first step in many genomics analysis pipelines, so it is essential to map the reads reliably. Many algorithms have been developed to tackle this problem; however, few of them map long reads well.

We present Magic-BLAST, a tool for mapping NGS runs against one or multiple genomes or transcriptomes. It incorporates ideas from the MAGIC-AceView pipeline implemented within the BLAST code base. Magic-BLAST processes NGS reads in batches. It builds an index of a batch of reads and scans a BLAST database (a genome or transcriptome) for potential word matches. Each match becomes a seed for local alignment computation. To avoid aligning to repeats, Magic-BLAST first counts word occurrences in the genome and removes frequent words from the read index. Finally, collinear local alignments are combined into spliced alignments.
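The steps in the paragraph above can be illustrated with a toy sketch. This is not Magic-BLAST's implementation: the word size, the `max_count` threshold, all function names, and the greedy chaining rule are assumptions made for illustration, and seeds are extended only by merging exact word matches rather than by real local alignment with mismatches, gaps, and splice-signal scoring.

```python
from collections import Counter, defaultdict

WORD = 4  # toy word size (assumption; real mappers use longer words)


def frequent_words(genome, word=WORD, max_count=2):
    """Count word occurrences in the genome and return the over-represented ones."""
    counts = Counter(genome[i:i + word] for i in range(len(genome) - word + 1))
    return {w for w, c in counts.items() if c > max_count}


def index_reads(reads, skip, word=WORD):
    """Index every word of every read in the batch, leaving out frequent words."""
    index = defaultdict(list)
    for read_id, seq in reads.items():
        for pos in range(len(seq) - word + 1):
            w = seq[pos:pos + word]
            if w not in skip:
                index[w].append((read_id, pos))
    return index


def find_seeds(genome, index, word=WORD):
    """Scan the genome once; each word match yields a (read_pos, genome_pos) seed."""
    seeds = defaultdict(list)
    for gpos in range(len(genome) - word + 1):
        for read_id, rpos in index.get(genome[gpos:gpos + word], ()):
            seeds[read_id].append((rpos, gpos))
    return seeds


def merge_seeds(seeds, word=WORD):
    """Merge overlapping seeds on one diagonal into (read_start, genome_start, length) blocks."""
    blocks = []
    for rpos, gpos in sorted(seeds, key=lambda s: (s[1] - s[0], s[0])):
        if blocks:
            r0, g0, n = blocks[-1]
            if g0 - r0 == gpos - rpos and rpos <= r0 + n:
                blocks[-1] = (r0, g0, max(n, rpos + word - r0))
                continue
        blocks.append((rpos, gpos, word))
    return blocks


def chain_collinear(blocks):
    """Greedily keep blocks that advance in both the read and the genome."""
    chain = []
    for r, g, n in sorted(blocks):
        if not chain or (r >= chain[-1][0] + chain[-1][2]
                         and g >= chain[-1][1] + chain[-1][2]):
            chain.append((r, g, n))
    return chain


def map_reads(genome, reads, max_count=2):
    """Run the toy pipeline: filter words, index reads, seed, merge, chain."""
    skip = frequent_words(genome, max_count=max_count)
    seeds = find_seeds(genome, index_reads(reads, skip))
    return {rid: chain_collinear(merge_seeds(s)) for rid, s in seeds.items()}


# Toy genome: exon, 12-base intron, exon. The read spans the splice junction.
genome = "ACGTTGCA" + "GTAAGGCCTTAG" + "CATGGATC"
reads = {"r1": "ACGTTGCA" + "CATGGATC"}
result = map_reads(genome, reads)
print(result)  # {'r1': [(0, 0, 8), (8, 20, 8)]}
```

The two blocks are adjacent in the read but 12 bases apart in the genome, which is how a chain of collinear local alignments implies an intron in this simplified setting.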

Magic-BLAST is very robust across a wide range of conditions. It works well with reads generated by Illumina, Roche 454, and PacBio platforms. It also provides very good performance when mapping against genomes with biased compositions or from related species. Magic-BLAST is very accurate in intron discovery and outperforms similar programs.

Magic-BLAST is convenient to use. It does not need any special tuning for different technologies and genomes, and works well in different conditions using default parameters. It directly accesses reads stored in the NCBI Sequence Read Archive (SRA), without the need to download the data beforehand. It also works with FASTA and FASTQ files. It can align reads to sequences in BLAST databases or FASTA files and integrates well with NCBI facilities and services.
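The input combinations described above might be invoked as in the following sketch, which only builds the command line and does not run it. It assumes the `magicblast` executable accepts the `-sra`, `-query`, `-infmt`, `-db`, `-subject`, and `-out` options as described in the project documentation. The helper name, the accession `SRR0000000`, and the file and database names are hypothetical placeholders.

```python
import shlex


def magicblast_command(db=None, subject=None, query=None, sra=None,
                       fastq=False, out=None):
    """Build an argument list for one magicblast run (hypothetical helper)."""
    if (db is None) == (subject is None):
        raise ValueError("give exactly one of db (BLAST database) or subject (FASTA file)")
    if (query is None) == (sra is None):
        raise ValueError("give exactly one of query (reads file) or sra (SRA accession)")
    cmd = ["magicblast"]
    if sra is not None:
        cmd += ["-sra", sra]            # reads fetched from SRA by accession
    else:
        cmd += ["-query", query]        # reads from a local file
        if fastq:
            cmd += ["-infmt", "fastq"]  # input is FASTQ rather than the FASTA default
    cmd += ["-db", db] if db is not None else ["-subject", subject]
    if out is not None:
        cmd += ["-out", out]
    return cmd


# Placeholder inputs: an SRA run against a BLAST database,
# and a local FASTQ file against a FASTA reference.
from_sra = magicblast_command(db="my_genome", sra="SRR0000000")
from_fastq = magicblast_command(subject="reference.fa", query="reads.fastq",
                                fastq=True, out="hits.sam")
print(shlex.join(from_sra))
print(shlex.join(from_fastq))
```

The resulting lists could be passed to `subprocess.run` on a machine where Magic-BLAST is installed. Building the list separately keeps the example runnable without the tool.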

Magic-BLAST is available as Linux, Mac, and Windows executables and as a Docker image, and can be installed from Bioconda. Recently added features include better handling of nanopore reads and reporting of alignments that skip over regions with too many sequencing errors for reliable alignment.

This research was supported by the Intramural Research Program of the National Library of Medicine at the NIH.


Grzegorz Boratyn

BLAST developer at NCBI/NLM/NIH

Sunday July 19, 2020 13:00 - 13:45 EDT