
BCC2020 is online, global, and affordable. The meeting and training are now done, and the CoFest is under way.

The 2020 Bioinformatics Community Conference brings together the Bioinformatics Open Source Conference (BOSC) and the Galaxy Community Conference into a single event featuring training, a meeting, and a CollaborationFest. Events run from July 17 through July 25 and are held in both the eastern and western hemispheres.

Sunday, July 19 • 14:30 - 14:35
CrowdGO: Gene Ontology prediction using a meta approach 🍐




Abstract


The presenter(s) will be available for live Q&A in this session (BCC West).

Maarten JMF Reijnders 1,2 and Robert M. Waterhouse 1,2

1 University of Lausanne, Lausanne, Switzerland.
2 Swiss Institute of Bioinformatics, Lausanne, Switzerland.
Email: maarten.reijnders@unil.ch

Source code: https://gitlab.com/mreijnders/CrowdGO
License: GNU General Public License v3.0

Methods to predict protein functions, defined here as assigning Gene Ontology (GO) terms,
vary considerably in their underlying approach, with different methods employing techniques
such as sequence homology, machine learning, or text mining. This often results in dramatically
different sets of GO terms predicted for the same sets of proteins. These methods are reviewed
in the Critical Assessment of Functional Annotation competitions (CAFA) (Zhou 2019), but even
the best scoring methods can be inaccurate, and none truly stand out. To concurrently exploit
the strengths of each method, we developed a meta-predictor that evaluates the predictions of
multiple top-performing methods.
CrowdGO compares the predictions of different methods and uses a machine learning model to
improve the precision, recall, and F-max scores of the resulting meta-predictions. The model can
be trained based on user-selected prediction methods, or a pre-trained model can be used. The
pre-trained models are built using prediction tools that are exclusively open-source, easy to use,
and computationally non-demanding. CrowdGO includes Snakemake workflows to use existing
models for GO term prediction, or to train new models.
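
To illustrate the general idea of comparing predictions across methods, here is a minimal Python sketch. It is not CrowdGO's actual code: the function name `build_feature_rows`, the method names, and the scores are all hypothetical. It lines up the scores that several predictors give to the same (protein, GO term) pair, which is the kind of table a machine learning model could then be trained on.

```python
from collections import defaultdict


def build_feature_rows(predictions, methods):
    """Align per-method scores into one row per (protein, GO term) pair.

    predictions: {method_name: [(protein_id, go_term, score), ...]}
    methods:     ordered list of method names; fixes the column order.
    A method that did not predict a given pair contributes 0.0.
    """
    table = defaultdict(lambda: [0.0] * len(methods))
    for col, method in enumerate(methods):
        for protein, go_term, score in predictions.get(method, []):
            row = table[(protein, go_term)]
            # Keep the highest score if a method reports a pair twice.
            row[col] = max(row[col], score)
    return dict(table)


# Hypothetical toy input, not output from any real tool.
predictions = {
    "method_a": [("P1", "GO:0003677", 0.9), ("P1", "GO:0005515", 0.4)],
    "method_b": [("P1", "GO:0003677", 0.7)],
}
methods = ["method_a", "method_b"]
rows = build_feature_rows(predictions, methods)

for pair, scores in sorted(rows.items()):
    print(pair, scores)
```

In this toy input, a pair supported by both methods gets two non-zero columns, while a pair predicted by only one method keeps a 0.0 in the other column. A meta-model could learn from that agreement pattern.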
With a model built on four input predictors, CrowdGO increases both precision and meaningful recall compared to each input method (Figure 1). The inputs are a sequence homology-based predictor, Wei2GO (Reijnders 2020); two protein domain-based predictors, InterProScan (Mitchell 2019) and FunFams (Scheibenreif 2019); and a deep learning predictor, DeepGOPlus (Kulmanov 2019).
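
The abstract does not spell out how precision, recall, and F-max are computed. As a rough sketch, assuming the protein-centric definition commonly used in CAFA-style evaluations, the following Python function averages precision over proteins with at least one prediction at a given score threshold, averages recall over all benchmark proteins, and reports the best F-score across thresholds. The function name `f_max` and the toy data are illustrative only.

```python
def f_max(predicted, truth, thresholds=None):
    """Protein-centric F-max over a grid of score thresholds.

    predicted: {protein_id: {go_term: score}}
    truth:     {protein_id: set of true GO terms}, each set non-empty
    """
    if thresholds is None:
        thresholds = [i / 100 for i in range(1, 101)]
    best = 0.0
    for t in thresholds:
        precisions = []
        recall_sum = 0.0
        for protein, true_terms in truth.items():
            scores = predicted.get(protein, {})
            called = {go for go, s in scores.items() if s >= t}
            hits = len(called & true_terms)
            if called:
                # Precision counts only proteins with a prediction at t.
                precisions.append(hits / len(called))
            # Recall is averaged over every benchmark protein.
            recall_sum += hits / len(true_terms)
        if not precisions:
            continue
        precision = sum(precisions) / len(precisions)
        recall = recall_sum / len(truth)
        if precision + recall > 0:
            f_score = 2 * precision * recall / (precision + recall)
            best = max(best, f_score)
    return best


# Hypothetical toy data using placeholder term names.
truth = {"P1": {"A", "B"}, "P2": {"C"}}
predicted = {"P1": {"A": 0.9, "D": 0.3}, "P2": {"C": 0.6}}
print(round(f_max(predicted, truth), 3))
```

Under this definition, a meta-predictor raises F-max by ranking true terms above false ones, so that some threshold keeps most correct terms and drops most incorrect ones.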
CrowdGO is fully open source and leverages other open source tools. It is straightforward to use, thanks to both the simplicity of the software and the accompanying Snakemake pipelines. Because it is a meta-predictor, it will stay relevant even when improved function prediction software becomes available.


Speakers

Maarten Reijnders

Department of Ecology and Evolution, University of Lausanne



Sunday July 19, 2020 14:30 - 14:35 EDT
BOSC