
BCC2020 is online, global, and affordable. The meeting and training are now done, and the CoFest is under way.

The 2020 Bioinformatics Community Conference brings together the Bioinformatics Open Source Conference (BOSC) and the Galaxy Community Conference into a single event featuring training, a meeting, and a CollaborationFest. Events run from July 17 through July 25 and are held in both the eastern and western hemispheres.

Tuesday, July 21 • 11:30 - 12:15
P6-09: Testing and investigating deep learning models for promoter recognition 🍐


This poster will be presented live at BCC West: https://deskle.com/dfGMxtY

A video is also provided: https://www.youtube.com/watch?v=_9r8D394T40

➞ Abstract

Understanding DNA sequences has been an ongoing endeavour within bioinformatics research. Recognizing the functionality of DNA sequences is a non-trivial, complex task that can yield insights into DNA itself. This project explored deep learning models for recognizing gene-regulating regions of DNA, more specifically promoters. We implemented current models from the literature to replicate their results and to explore how these models might be recognizing promoters. Literature in this field typically includes web applications for the community to use, where one can submit limited data to obtain the model's result. This has become the standard in the field, and it is rare for authors to provide the source code of their work, which can create unnecessary obstacles for new research in the field.

Previous work has also relied on limited curated datasets to both train and evaluate models via cross-validation, reporting strong results across a variety of metrics. We implemented various models from the literature and compared them against each other, using their datasets interchangeably throughout the comparison tests. These comparisons highlight shortcomings in the training and testing datasets used by these models, prompting us to create a robust promoter recognition testing dataset and to develop a testing methodology that generates a wide variety of testing datasets for promoter recognition.

It is then possible to test and analyse the models from the literature on the newly created datasets, which provides a standard benchmark that mimics a realistic scenario. To avoid replicability and model-comparability issues in the future, we open-source our findings and testing methodology. New deep learning (DL) models can be implemented as PyTorch modules, while other machine learning (ML) models can be implemented using sklearn. Both can be trained and tested using sklearn's procedures; DL models can use skorch as a wrapper around PyTorch that adds an sklearn interface. Training and testing scripts for DL models are included as examples and can be extended by the open-source community. While we focus on DL models in this project, our training and testing scripts are also applicable to other ML models.
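To illustrate that workflow, below is a minimal sketch (not the authors' released code) of a PyTorch module wrapped with skorch so that sklearn's standard evaluation procedures, such as cross-validation, can be reused unchanged. The architecture, sequence length, and placeholder data are assumptions made for the example only.

```python
# Illustrative sketch of the PyTorch + skorch + sklearn workflow described in the abstract.
# The model architecture, sequence length, and random data are placeholders, not the
# project's actual promoter recognition models or datasets.

import numpy as np
import torch
import torch.nn as nn
from skorch import NeuralNetClassifier
from sklearn.model_selection import cross_val_score

SEQ_LEN = 300  # assumed length of one-hot encoded DNA windows (channels: A, C, G, T)

class PromoterCNN(nn.Module):
    """Toy 1-D convolutional classifier over one-hot encoded DNA sequences."""
    def __init__(self, seq_len=SEQ_LEN):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv1d(4, 32, kernel_size=15, padding=7),
            nn.ReLU(),
            nn.MaxPool1d(4),
        )
        self.classifier = nn.Linear(32 * (seq_len // 4), 2)

    def forward(self, x):
        x = self.features(x)
        return self.classifier(x.flatten(1))  # raw logits for two classes

# Random placeholder data standing in for one-hot encoded promoter / non-promoter windows.
X = np.random.rand(256, 4, SEQ_LEN).astype(np.float32)
y = np.random.randint(0, 2, size=256).astype(np.int64)

# skorch exposes the PyTorch model through the sklearn estimator API, so sklearn's
# training and testing procedures (here, cross-validation) work without modification.
net = NeuralNetClassifier(
    PromoterCNN,
    criterion=nn.CrossEntropyLoss,
    optimizer=torch.optim.Adam,
    lr=1e-3,
    max_epochs=5,
    batch_size=32,
    verbose=0,
)

scores = cross_val_score(net, X, y, cv=3, scoring="roc_auc")
print("Cross-validated ROC AUC:", scores.mean())
```

The same pattern applies to non-DL models: any estimator with an sklearn-compatible interface can be dropped into the evaluation scripts in place of the skorch-wrapped network.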



Tuesday July 21, 2020 11:30 - 12:15 EDT
Joint