BCC2020 is online, global, and affordable. The meeting and training are now done, and the CoFest is under way.

The 2020 Bioinformatics Community Conference brings together the Bioinformatics Open Source Conference (BOSC) and the Galaxy Community Conference into a single event featuring training, a meeting, and a CollaborationFest. Events run from July 17 through July 25 and are held in both the eastern and western hemispheres.

Sunday, July 19 • 11:45 - 11:50
Targeted nanopore sequencing by real-time mapping of raw electrical signal with UNCALLED 🍐

The presenter(s) will be available for live Q&A in this session (BCC West).

Sam Kovaka 1, Yunfan Fan 2, Bohan Ni 1, Winston Timp 2, Michael C. Schatz 1,3,4
Email: skovaka1@jhu.edu

1 Department of Computer Science, Johns Hopkins University, Baltimore, MD
2 Department of Biomedical Engineering, Johns Hopkins University, Baltimore, MD
3 Department of Biology, Johns Hopkins University, Baltimore, MD
4 Cold Spring Harbor Laboratory, Cold Spring Harbor, NY

Project Source Code: https://github.com/skovaka/UNCALLED
License: MIT License

ReadUntil sequencing allows nanopore devices to selectively stop sequencing an individual read in real time by ejecting it from the pore and immediately switching to another read. If reads could be rapidly mapped to large references while being sequenced, this would enable targeted sequencing of specific genomic regions or even specific genomes. However, most mapping methods require basecalling, which is computationally intensive and requires a significant amount of the read to be sequenced.
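
The keep-or-eject decision described above can be pictured with a minimal Python sketch. Everything in it is an assumption made for illustration: the `Region` type, the `decide` function, the `"wait"`/`"continue"`/`"eject"` labels, and the contig names are hypothetical and are not part of any ReadUntil or UNCALLED API. It only shows how a mapping result could drive enrichment or depletion.

```python
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple


@dataclass(frozen=True)
class Region:
    """Hypothetical target interval: 0-based start, exclusive end."""
    contig: str
    start: int
    end: int


def on_target(contig: str, position: int, targets: Sequence[Region]) -> bool:
    """True if the mapped position falls inside any target region."""
    return any(t.contig == contig and t.start <= position < t.end for t in targets)


def decide(mapping: Optional[Tuple[str, int]],
           targets: Sequence[Region],
           mode: str = "enrich") -> str:
    """Choose an action for one read that is still in the pore.

    mapping is (contig, position), or None while the read is unmapped.
    """
    if mapping is None:
        return "wait"  # not enough signal yet to place the read
    hit = on_target(mapping[0], mapping[1], targets)
    if mode == "enrich":
        return "continue" if hit else "eject"
    if mode == "deplete":
        return "eject" if hit else "continue"
    raise ValueError(f"unknown mode: {mode}")


targets = [Region("contigA", 100, 200)]
print(decide(("contigA", 150), targets))             # on target: keep sequencing
print(decide(("contigB", 150), targets))             # off target: eject
print(decide(("contigA", 150), targets, "deplete"))  # depletion flips the choice
```

The sooner `mapping` becomes available, the less pore time is spent on unwanted reads, which is why mapping speed matters for this approach.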

Here we present UNCALLED (Utility for Nanopore Current ALignment to Large Expanses of DNA), an open-source mapper that rapidly matches raw streaming nanopore current signals to a large DNA reference without basecalling. This is accomplished by probabilistically considering all possible k-mers that the signal could represent, and then pruning the possibilities based on the reference genome sequence encoded using an FM-index. Importantly, UNCALLED dynamically adjusts the signal level model probability cutoffs during alignment to achieve both high accuracy and high speed when aligning the noisy signal data.
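
As a rough illustration of the idea of scoring every k-mer against each signal level and pruning by the reference, here is a toy Python sketch. It is not UNCALLED's implementation. The 3-mer pore model with evenly spaced means is invented for the example, a plain substring test stands in for the FM-index lookup, and the probability cutoff is fixed rather than dynamically adjusted. All names (`MODEL_MEAN`, `candidate_kmers`, `map_signal`) are hypothetical.

```python
import itertools
import math

K = 3  # toy k-mer length (assumption, chosen only to keep the example small)
KMERS = ["".join(p) for p in itertools.product("ACGT", repeat=K)]

# Hypothetical pore model: evenly spaced mean current levels, one per k-mer.
MODEL_MEAN = {kmer: 60.0 + i for i, kmer in enumerate(KMERS)}
MODEL_STDV = 0.5


def log_prob(level, kmer):
    """Gaussian log-density of an observed current level under one k-mer."""
    z = (level - MODEL_MEAN[kmer]) / MODEL_STDV
    return -0.5 * z * z - math.log(MODEL_STDV * math.sqrt(2.0 * math.pi))


def candidate_kmers(level, min_log_prob):
    """All k-mers whose model explains the level at least as well as the cutoff."""
    return [kmer for kmer in KMERS if log_prob(level, kmer) >= min_log_prob]


def map_signal(levels, reference, min_log_prob=-3.0):
    """Return sorted reference start positions consistent with the signal levels."""
    if not levels:
        return []
    # Seed with every plausible k-mer that occurs in the reference.
    paths = {k for k in candidate_kmers(levels[0], min_log_prob) if k in reference}
    for level in levels[1:]:
        extended = set()
        for kmer in candidate_kmers(level, min_log_prob):
            for path in paths:
                candidate = path + kmer[-1]
                # Keep an extension only if the k-mers overlap and the longer
                # sequence still occurs in the reference (FM-index stand-in).
                if path.endswith(kmer[:-1]) and candidate in reference:
                    extended.add(candidate)
        paths = extended
        if not paths:
            return []
    positions = set()
    for path in paths:
        start = reference.find(path)
        while start != -1:
            positions.add(start)
            start = reference.find(path, start + 1)
    return sorted(positions)


reference = "TTACGGATCCAGTACGTTGCAAGGCT"
read = reference[10:18]
levels = [MODEL_MEAN[read[i:i + K]] for i in range(len(read) - K + 1)]
print(map_signal(levels, reference))  # prints [10]
```

In this toy, each level admits up to three candidate k-mers under the default cutoff, and the reference check discards the ones that cannot extend an existing path, so the candidate set shrinks to the true location after a few events.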

We used UNCALLED to deplete the sequencing of known bacterial genomes within a Zymo mock microbial community, enriching the remaining yeast sequence from ~20x coverage to ~100x. We also used UNCALLED to enrich for 148 human genes associated with hereditary cancers to 29.6x coverage (a 5.6-fold increase) using a single MinION flowcell, enabling accurate detection of SNPs, indels, structural variants (SVs), and methylation in these genes. Notably, twice as many SVs were detected compared to 50x coverage Illumina sequencing, verified by whole-genome nanopore and PacBio HiFi sequencing. Finally, we show that UNCALLED could be used to enrich larger gene panels such as all 717 genes in the COSMIC Census, or be used with cDNA/RNA sequencing, for example to deplete high-abundance transcripts.


Sam Kovaka

Johns Hopkins University

Sunday July 19, 2020 11:45 - 11:50 EDT