Abstract
This poster will be presented live at BCC West.
Grzegorz Boratyn, Jean Thierry-Mieg, Danielle Thierry-Mieg, Tom Madden
National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health. Email: boratyng@ncbi.nlm.nih.gov
Project Website: https://ncbi.github.io/magicblast
Source Code: https://ftp.ncbi.nlm.nih.gov/blast/executables/magicblast/1.5.0/ncbi-magicblast-1.5.0-src.tar.gz
License: Public domain
Next-generation sequencing (NGS) technologies facilitate rapid analysis of gene expression across individuals, tissues, or conditions. Mapping reads against a reference genome is the first step in many genomics analysis pipelines. It is therefore essential to map the reads reliably. Many algorithms have been developed to tackle this problem; however, few of them map long reads well.
We present Magic-BLAST, a tool for mapping NGS runs against one or multiple genomes or transcriptomes. It incorporates ideas from the MAGIC-AceView pipeline and is implemented within the BLAST code base. Magic-BLAST processes NGS reads in batches. It builds an index of a batch of reads and scans a BLAST database (a genome or transcriptome) for potential word matches. Each match becomes a seed for local alignment computation. To avoid aligning to repeats, Magic-BLAST first counts word occurrences in the genome and removes frequent words from the read index. Finally, collinear local alignments are combined into spliced alignments.
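The seeding steps described above can be illustrated with a toy sketch. It is not Magic-BLAST's implementation: the function names, the word size of 4, and the frequency threshold are hypothetical choices made for readability, and the sketch stops at seed detection (no local alignment or splicing).

```python
from collections import Counter, defaultdict

# Illustrative values only; assumptions, not Magic-BLAST's actual settings.
WORD_SIZE = 4
MAX_WORD_COUNT = 2


def count_genome_words(genome, k):
    """Count how often each k-letter word occurs in the genome."""
    return Counter(genome[i:i + k] for i in range(len(genome) - k + 1))


def build_read_index(reads, k, frequent_words):
    """Map each word to (read_id, offset) pairs, skipping frequent words."""
    index = defaultdict(list)
    for read_id, seq in reads.items():
        for offset in range(len(seq) - k + 1):
            word = seq[offset:offset + k]
            if word not in frequent_words:
                index[word].append((read_id, offset))
    return index


def find_seeds(genome, index, k):
    """Scan the genome and report (read_id, read_offset, genome_pos) word matches."""
    seeds = []
    for pos in range(len(genome) - k + 1):
        for read_id, offset in index.get(genome[pos:pos + k], ()):
            seeds.append((read_id, offset, pos))
    return seeds


def map_batch(genome, reads, k=WORD_SIZE, max_count=MAX_WORD_COUNT):
    """Seed one batch of reads against a genome, ignoring repeated words."""
    counts = count_genome_words(genome, k)
    frequent = {word for word, n in counts.items() if n > max_count}
    index = build_read_index(reads, k, frequent)
    return find_seeds(genome, index, k)


if __name__ == "__main__":
    genome = "ACACACACACGTTGCA"
    reads = {"r1": "GTTGCA", "r2": "ACACAC"}
    for seed in map_batch(genome, reads):
        print(seed)
```

In this toy input the words ACAC and CACA occur more than twice in the genome, so the repetitive read r2 contributes no seeds, while r1 yields three consecutive seeds that an extension step could merge into a single local alignment.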
Magic-BLAST is robust across a wide range of conditions. It works well with reads generated by Illumina, Roche 454, and PacBio platforms. It also performs very well when mapping against genomes with biased compositions or from related species. Magic-BLAST is very accurate in intron discovery and outperforms similar programs.
Magic-BLAST is convenient to use. It does not need any special tuning for different technologies and genomes. It works well in different conditions using default parameters. It directly accesses reads stored in the NCBI Sequence Read Archive (SRA), without the need to download the data beforehand. It works with FASTA and FASTQ files. It can align reads to sequences in BLAST databases or FASTA files and integrates well with NCBI facilities and services.
Magic-BLAST is available as Linux, Mac, and Windows executables and as a Docker image, and can be installed from Bioconda. Recently added features include better handling of nanopore reads and reporting alignments that skip over regions with too many sequencing errors for reliable alignment.
This research was supported by the Intramural Research Program of the National Library of Medicine at the NIH.