Abstract
The presenter(s) will be available for live Q&A in this session (BCC West).
Rodrigo Ortega-Polo 1, Shefali Vishwakarma 2,3, Lan Tran 4, Amanda Gregoris 4, Marta Guarna 4
1 Lethbridge Research and Development Centre, Agriculture and Agri-Food Canada; Lethbridge, Alberta,
Canada. Email: rodrigo.ortegapolo@canada.ca
2 Lethbridge Research and Development Centre, Agriculture and Agri-Food Canada; Lethbridge, Alberta,
Canada.
3 Department of Molecular Biology and Biochemistry, Simon Fraser University; Surrey, British Columbia,
Canada.
4 Beaverlodge Research Farm, Agriculture and Agri-Food Canada; Beaverlodge, Alberta, Canada.
Project Website: https://github.com/BeeCSI-Microbiome/dada2_drake_workflow
Source Code: https://github.com/BeeCSI-Microbiome/dada2_drake_workflow
License: MIT License

The use of workflow management systems promotes best practices in computational biology such
as reproducibility, provenance tracking and documentation of steps and parameters used in
analyses. Furthermore, the ability to restart a workflow from a given point in the analysis, instead of
starting over, provides an efficient way to develop data analysis pipelines. The drake R package
is a workflow management framework that allows users to design workflows and visualize their
status in a reproducible and scalable manner (Figure 1). In our work, we used drake to design a
pipeline for amplicon-based microbial community data, using DADA2 for denoising and taxonomic
classification, and phyloseq and other R packages for visualization and data tidying. We implemented
this workflow for the analysis of 16S rRNA microbial community datasets from the honey bee gut
microbiome. This workflow enables users to evaluate microbial communities from amplicon
sequencing data while working entirely within R.
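
The restart behaviour described above can be illustrated with a minimal drake sketch. This is not the authors' workflow: the functions `denoise_reads` and `count_variants` and the toy input are hypothetical stand-ins for the real DADA2 and phyloseq steps, chosen so the example runs without sequencing files. Only `drake_plan()`, `make()` and `readd()` are drake functions.

```r
# Minimal sketch of a drake plan; NOT the authors' actual pipeline.
library(drake)

# Hypothetical stand-ins for real pipeline stages (e.g. DADA2 denoising).
# They operate on toy data so the sketch runs without any sequencing files.
denoise_reads  <- function(reads) unique(reads)
count_variants <- function(variants) length(variants)

# Each named entry is a target; drake infers the dependency order
# (raw_reads -> variants -> n_variants) from the code itself.
plan <- drake_plan(
  raw_reads  = c("ACGT", "ACGT", "TTGA", "GGCC"),  # toy input (assumption)
  variants   = denoise_reads(raw_reads),
  n_variants = count_variants(variants)
)

make(plan)  # first run: builds all three targets and caches the results
make(plan)  # second run: targets are up to date, so nothing is rebuilt

n <- readd(n_variants)  # retrieve a stored result from drake's cache
print(n)
```

If `count_variants` were later edited, the next `make(plan)` would rebuild only `n_variants` and reuse the cached `variants`, which is the "restart from a given point" behaviour. In recent drake versions, `vis_drake_graph(plan)` draws the dependency graph with the status of each target (it requires the visNetwork package).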