BCC2020 has ended
Registration closed July 15.

BCC2020 is online, global, and affordable. The meeting and training are now done, and the CoFest is under way.

The 2020 Bioinformatics Community Conference brings together the Bioinformatics Open Source Conference (BOSC) and the Galaxy Community Conference into a single event featuring training, a meeting, and a CollaborationFest. Events run from July 17 through July 25 and are held in both the eastern and western hemispheres.

Sunday, July 19 • 14:35 - 14:40
Goslin - A grammar of succinct lipid nomenclature 🍐




Abstract


The presenter(s) will be available for live Q&A in this session (BCC West).

Nils Hoffmann 1, Dominik Kopczynski 1, Bing Peng 2, Robert Ahrends 3

1 Leibniz-Institut für Analytische Wissenschaften – ISAS – e.V., Otto-Hahn-Straße 6b, 44227
Dortmund, Germany. Email: nils.hoffmann@isas.de
2 Karolinska Institutet, Solna, Stockholm, Sweden.
3 Department of Analytical Chemistry, University of Vienna, Vienna, Austria.

Project Website: https://lifs.isas.de/goslin & https://apps.lifs.isas.de/goslin
Source Code: https://github.com/lifs-tools/goslin (main hub to implementations)
License: Apache License 2.0 & MIT License


Main Text of Abstract

We introduce the 'Grammar of Succinct Lipid Nomenclature' (Goslin), a polyglot grammar for
common lipid shorthand nomenclatures based on the LipidMaps nomenclature and the shorthand
nomenclature established by Liebisch et al. and used by LipidHome and SwissLipids, accompanied
by parser implementations in C++, Java, Python and R.

Lipid naming has evolved into several dialects, which complicates the unified computational
treatment and parsing of lipid names. As a consequence, long and error-prone manual curation
is often necessary to streamline lists of lipid names for processing in follow-up
analysis scripts, workflows, or tools, or for submission to research data repositories. Goslin
was designed to address two pressing issues in the lipidomics field: 1) to
simplify the implementation of lipid name handling for developers of mass spectrometry-based
lipidomics tools; 2) to offer a tool that unifies and normalizes the main existing lipid name dialects,
enabling high-throughput lipidomics analysis.

Goslin and its parser implementations are thus designed to act as a library for the development of
lipidomics tools providing a standardized data structure for storing structural lipid information.
Parsing and generating lipid names are the main functions of Goslin. We
therefore defined a context-free grammar (with ANTLR4) that defines rules and productions for all
structural properties of the lipid nomenclature, including mass spectrometry-specific information
about unlabeled and heavy isotope-labeled species, as well as fragments and adducts. We recently
added the calculation of masses and sum formulas when the head group's sum composition is
known. Currently, the grammar covers 289 lipid classes within the seven most frequently occurring lipid
categories in eukaryotic organisms, namely fatty acyls, glycerolipids, glycerophospholipids,
saccharolipids, sphingolipids, sterol lipids, and polyketides. The major advantages of using a
grammar rather than a manually coded parser are its flexibility and extensibility. Regular
expressions are also not suitable for parsing lipid names, since they are incapable of recognizing
nested patterns and can only recognize words from regular languages.
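To make the grammar argument concrete, here is a minimal sketch in Python of a recursive-descent parser for a deliberately tiny, made-up subset of lipid shorthand. It is not the Goslin grammar, which is written in ANTLR4 and covers far more, and the class names (ToyLipidParser, Lipid, FattyAcylChain) and the "normalized" spelling are hypothetical, chosen only for illustration. The sketch accepts two dialect-like spellings, one with a nested parenthesized double-bond position inside the parenthesized chain list, fills one shared data structure, and regenerates a name from it.

```python
from dataclasses import dataclass, field


@dataclass
class FattyAcylChain:
    carbons: int
    double_bonds: int
    positions: list = field(default_factory=list)  # e.g. ["9Z"]


@dataclass
class Lipid:
    head_group: str
    chains: list
    separator: str  # "/" or "_", as written in the input

    def species_name(self):
        # Sum composition over all chains, e.g. "PC 34:1".
        carbons = sum(c.carbons for c in self.chains)
        bonds = sum(c.double_bonds for c in self.chains)
        return f"{self.head_group} {carbons}:{bonds}"

    def normalized_name(self):
        # One illustrative spelling, independent of the input dialect.
        parts = []
        for c in self.chains:
            text = f"{c.carbons}:{c.double_bonds}"
            if c.positions:
                text += "(" + ",".join(c.positions) + ")"
            parts.append(text)
        return f"{self.head_group} {self.separator.join(parts)}"


class ToyLipidParser:
    # Toy grammar (an assumption for this sketch, not Goslin's):
    #   lipid  := HEAD '(' chains ')' | HEAD ' ' chains
    #   chains := chain (('/' | '_') chain)*
    #   chain  := INT ':' INT ['(' pos (',' pos)* ')']
    #   pos    := INT ('Z' | 'E')

    def __init__(self, text):
        self.text = text
        self.i = 0

    def peek(self):
        return self.text[self.i] if self.i < len(self.text) else ""

    def fail(self, what):
        raise ValueError(f"expected {what} at position {self.i} in {self.text!r}")

    def expect(self, char):
        if self.peek() != char:
            self.fail(repr(char))
        self.i += 1

    def take_while(self, predicate, what):
        start = self.i
        while self.peek() and predicate(self.peek()):
            self.i += 1
        if start == self.i:
            self.fail(what)
        return self.text[start:self.i]

    def parse(self):
        head = self.take_while(str.isalpha, "head group")
        if self.peek() == "(":  # parenthesized dialect
            self.i += 1
            chains, sep = self.chains()
            self.expect(")")
        else:  # space-separated dialect
            self.expect(" ")
            chains, sep = self.chains()
        if self.i != len(self.text):
            self.fail("end of input")
        return Lipid(head, chains, sep)

    def chains(self):
        chains, sep = [self.chain()], "/"
        while self.peek() in ("/", "_"):
            sep = self.peek()  # toy simplification: the last separator wins
            self.i += 1
            chains.append(self.chain())
        return chains, sep

    def chain(self):
        carbons = int(self.take_while(str.isdigit, "carbon count"))
        self.expect(":")
        bonds = int(self.take_while(str.isdigit, "double bond count"))
        positions = []
        if self.peek() == "(":  # nested parentheses inside the chain list
            self.i += 1
            positions.append(self.position())
            while self.peek() == ",":
                self.i += 1
                positions.append(self.position())
            self.expect(")")
        return FattyAcylChain(carbons, bonds, positions)

    def position(self):
        number = self.take_while(str.isdigit, "double bond position")
        geometry = self.peek()
        if geometry not in ("Z", "E"):
            self.fail("'Z' or 'E'")
        self.i += 1
        return number + geometry
```

With this sketch, ToyLipidParser("PC(16:0/18:1(9Z))").parse() and ToyLipidParser("PC 16:0/18:1(9Z)").parse() produce equal Lipid objects, so downstream code handles one structure regardless of the input spelling. Each grammar rule maps to one method, which is why adding a rule is a local change; a parser generator such as ANTLR4 derives this kind of code from the grammar itself.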

We provide implementations of Goslin in four major programming languages, namely C++, Java,
Python 3, and R, to kick-start adoption and integration. Further, we set up a web service for users to
work with Goslin directly and via an OpenAPI-compliant REST API. All implementations are
available free of charge under a permissive open source license; binary releases are available from
Zenodo. We are currently working on making the libraries available via BioConda/BioContainers
and other community-facing repositories.

Speakers

Nils Hoffmann

Leibniz-Institut für Analytische Wissenschaften – ISAS – e.V.



Sunday July 19, 2020 14:35 - 14:40 EDT
BOSC