BCC2020 has ended
Registration closed July 15.

BCC2020 is online, global, and affordable. The meeting and training are now done, and the CoFest is under way.

The 2020 Bioinformatics Community Conference brings together the Bioinformatics Open Source Conference (BOSC) and the Galaxy Community Conference into a single event featuring training, a meeting, and a CollaborationFest. Events run from July 17 through July 25 and are held in both the eastern and western hemispheres.

Sunday, July 19 • 11:50 - 11:55
THAPBI PICT -- a metabarcoding analysis pipeline developed as a Phytophthora ITS1 Classification Tool 🍐



The presenter(s) will be available for live Q&A in this session (BCC West).

Peter Cock 1, David Cooke 2, Leighton Pritchard 3

1 Information and Computational Sciences, James Hutton Institute, Invergowrie, Dundee, UK
2 Cell and Molecular Sciences, James Hutton Institute, Invergowrie, Dundee, UK
3 Strathclyde Institute of Pharmacy & Biomedical Sciences, Glasgow, UK

Repository: https://github.com/peterjc/thapbi-pict/
Documentation: https://thapbi-pict.readthedocs.io/
License: MIT

Molecular barcodes are central to environmental monitoring and to identifying the species present in a
sample; they rely on PCR primers to amplify a diagnostic genome region of the organisms of interest. We are
interested in metabarcoding, where multiple samples are multiplexed for high-throughput sequencing on the
Illumina platform using overlapping paired end reads. Each sample yields a collection of marker sequences,
and matching these to a database of known species produces a taxonomic breakdown reflecting community
composition.

THAPBI PICT is a metabarcoding tool we developed for the UK-funded Tree Health and Plant Biosecurity
Initiative (THAPBI) Phyto-Threats project, which focused on identifying Phytophthora species in
commercial tree nurseries. Phytophthora (from Greek meaning plant-destroyer) are economically important
plant pathogens in both agriculture and forestry. This project targeted an ITS1 marker (Internal
Transcribed Spacer 1, a region found in eukaryotic genomes between the 18S and 5.8S rRNA genes) with
nested primers to identify Phytophthora species. By varying primer settings and using a custom database,
THAPBI PICT can be applied to other organisms and/or barcode marker sequences, making it more than
just a Phytophthora ITS1 Classification Tool (PICT).

The analysis pipeline starts from demultiplexed paired FASTQ files, as produced by the Illumina MiSeq
platform. The reads are quality trimmed, overlapping pairs are merged, and primers are trimmed (all by
calling external tools). The merged sequences are then deduplicated, giving a much smaller list of unique
sequences with associated read counts; only sequences passing a minimum count threshold, intended to
exclude "noise", are kept. These unique sequences are matched to a curated database using a
range of methods, producing both plain text and formatted Excel output. An edit graph in XGMML format
is also produced for display in Cytoscape and other visualisation tools.

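The deduplication, thresholding, and matching steps described above can be illustrated with a minimal
Python sketch. This is not the THAPBI PICT implementation: the function names, the `min_count` and
`max_edits` values, and the toy sequences and species labels are all assumptions made for illustration,
and the edit-distance match shown here merely stands in for the tool's "range of methods".

```python
from collections import Counter


def unique_sequences(merged_reads, min_count=2):
    """Collapse identical reads and drop those seen fewer than min_count times.

    min_count is a hypothetical value, not THAPBI PICT's default.
    """
    counts = Counter(seq.upper() for seq in merged_reads)
    return {seq: n for seq, n in counts.items() if n >= min_count}


def edit_distance(a, b):
    """Levenshtein distance via the standard two-row dynamic programme."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            current.append(
                min(
                    previous[j] + 1,  # deletion
                    current[j - 1] + 1,  # insertion
                    previous[j - 1] + (char_a != char_b),  # substitution or match
                )
            )
        previous = current
    return previous[-1]


def classify(seq, reference, max_edits=1):
    """Return sorted species whose reference marker is within max_edits of seq."""
    return sorted(
        {
            species
            for ref_seq, species in reference.items()
            if edit_distance(seq, ref_seq) <= max_edits
        }
    )


# Toy data only: real input would be merged, primer-trimmed reads.
reads = ["ACGTACGT"] * 5 + ["ACGTACGA"] * 3 + ["TTTTTTTT"]
reference = {"ACGTACGT": "Species A", "GGGGCCCC": "Species B"}

unique = unique_sequences(reads, min_count=2)
for seq, count in sorted(unique.items(), key=lambda item: -item[1]):
    print(seq, count, classify(seq, reference) or ["unknown"])
```

In this toy run the singleton read is discarded as noise, and the three-copy variant is assigned to the
same species as the exact match because it lies one edit away from that reference sequence.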
THAPBI PICT is released as open source software under the MIT licence. It is written in Python, a free
open source language available on all major operating systems. The source code, documentation, and
database builds, including the hand-curated reference set of Phytophthora ITS1 sequences, are tracked
under git version control and hosted publicly on GitHub. Continuous integration of the test suite is
currently run on both TravisCI and CircleCI. The software is released to the Python Package Index (PyPI),
as is standard for the Python ecosystem, and additionally packaged for Conda via the BioConda channel.
The Conda package offers simple installation of the tool itself and all its command line dependencies on
Linux or macOS. The documentation is currently hosted on Read The Docs, updated automatically from the
GitHub repository.


Peter Cock

The James Hutton Institute
Bioinformatician at the James Hutton Institute, a member of the BOSC organizing committee, treasurer of the Open Bioinformatics Foundation, and a core developer on the Biopython project.

Sunday July 19, 2020 11:50 - 11:55 EDT