Goslin

Normalization of lipid names

Description

Goslin is the Grammar of succinct lipid nomenclature project. It defines multiple grammars, one for each lipid name dialect, e.g., LipidMaps, SwissLipids, HMDB, and the Liebisch shorthand nomenclature. This makes it possible to give immediate feedback on whether a processed lipid notation string complies with a particular grammar. Goslin provides libraries for C++, Java, Python and R that read in lipid names and generate normalized lipid names, which streamlines subsequent data analysis and integration. Each library can process over 1000 lipid names within one second and reports the normalized lipid name, chemical sum formula, lipid mass and all structural details. The current version fully supports the revised Liebisch shorthand nomenclature from 2020. This is only possible because context-free parsing can handle the nested or recursive patterns that may occur in lipid names at higher structural resolution. Another advantage of Goslin is that its parsers are very robust against syntactically incorrect lipid names, which avoids misinterpretation. Given a lipid name at a certain structural level, Goslin can provide normalized lipid names at all lower levels of the hierarchy, e.g., the lipid class or category. This makes statistical analysis requests (e.g., lipid category distribution) very easy to execute.
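To illustrate what deriving lower hierarchy levels and a category distribution means in practice, here is a minimal, self-contained Python sketch. It does not use the Goslin libraries or their grammars. It replaces the context-free parsing with a single regular expression that only understands simple names such as "PC 16:0_18:1", and the function names (`lower_levels`, `category_distribution`) and the small class-to-category table are hypothetical choices made for this example.

```python
import re
from collections import Counter

# Hypothetical lookup table for this sketch: a few lipid classes mapped to
# LIPID MAPS category abbreviations (GP = glycerophospholipids,
# GL = glycerolipids). It is far from complete.
CLASS_TO_CATEGORY = {"PC": "GP", "PE": "GP", "PS": "GP", "DG": "GL", "TG": "GL"}

# A class abbreviation, a space, then chains like "16:0" joined by "_" or "/".
# This toy pattern stands in for a real grammar and rejects everything else.
NAME = re.compile(r"([A-Z]+) (\d+:\d+(?:[_/]\d+:\d+)*)")


def lower_levels(name):
    """Return the species-level name, class and category for a simple name."""
    match = NAME.fullmatch(name.strip())
    if match is None or match.group(1) not in CLASS_TO_CATEGORY:
        # Refuse names the pattern does not cover instead of guessing.
        raise ValueError(f"unsupported or malformed lipid name: {name!r}")
    lipid_class, chains = match.groups()
    pairs = [chain.split(":") for chain in re.split(r"[_/]", chains)]
    # The species level sums carbons and double bonds over all chains.
    carbons = sum(int(c) for c, _ in pairs)
    double_bonds = sum(int(d) for _, d in pairs)
    return {
        "species": f"{lipid_class} {carbons}:{double_bonds}",
        "class": lipid_class,
        "category": CLASS_TO_CATEGORY[lipid_class],
    }


def category_distribution(names):
    """Count how many of the given names fall into each category."""
    return Counter(lower_levels(name)["category"] for name in names)


if __name__ == "__main__":
    names = ["PC 16:0_18:1", "PE 18:0/20:4", "TG 16:0_18:1_18:2"]
    for name in names:
        print(name, "->", lower_levels(name))
    print(category_distribution(names))
```

In this sketch, "PC 16:0_18:1" is reduced to the species-level name "PC 34:1", and a malformed name raises an error rather than being silently misread. Names with ether bonds, hydroxylations or double-bond positions are outside the toy pattern and need a real grammar-based parser.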

Technical Information

License:
GPL & MIT (Academic)
GUI:
Yes
CLI:
Yes
Desktop client:
No
Web platform:
Yes
Input formats:
CSV
Output formats:
CSV
Platforms:
macOS,
Linux,
Windows
Programming languages:
R,
Python,
Java,
C++
Download / Web-service link:
Training datasets:
NA
Publications:
PMID: 32589019

Tasks

7.1) Lipid annotations and ID converters
Supported lipid classes:
7 out of 8 LIPID MAPS categories, 289 subclasses
Nomenclature:
LIPID MAPS classification, nomenclature, and shorthand notation (PMID: 33037133)
Supported levels of structural annotations:
Structure defined levels,
Molecular species,
Species
Input annotations or ids:
Shorthand notation (according to PMID: 33037133),
HMDB,
SwissLipids,
LIPID MAPS LMSD
Output annotations or ids:
Shorthand notation (according to PMID: 33037133),
HMDB,
SwissLipids,
LIPID MAPS LMSD
Link to the external databases:
Other features:
Chemical formula,
Mass calculation
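
As a rough illustration of how a lipid mass follows from a chemical sum formula, the following Python sketch computes a monoisotopic mass. It is not Goslin's implementation. The function name `monoisotopic_mass` is hypothetical, the element table is limited to C, H, N, O and P, and only flat formulas without brackets or charges are handled.

```python
import re

# Monoisotopic masses (in Da) of the most abundant isotope of each element.
# Only the elements needed for this sketch are listed.
MONOISOTOPIC_MASS = {
    "C": 12.0,
    "H": 1.00782503207,
    "N": 14.0030740048,
    "O": 15.99491461956,
    "P": 30.97376163,
}

# One element symbol followed by an optional count, e.g. "C42" or "N".
ELEMENT = re.compile(r"([A-Z][a-z]?)(\d*)")
FORMULA = re.compile(r"(?:[A-Z][a-z]?\d*)+")


def monoisotopic_mass(formula):
    """Sum the element masses of a flat sum formula such as 'C42H82NO8P'."""
    if FORMULA.fullmatch(formula) is None:
        raise ValueError(f"malformed sum formula: {formula!r}")
    total = 0.0
    for symbol, count in ELEMENT.findall(formula):
        if symbol not in MONOISOTOPIC_MASS:
            raise ValueError(f"element not in the sketch's table: {symbol}")
        # A missing count means the element occurs once.
        total += MONOISOTOPIC_MASS[symbol] * (int(count) if count else 1)
    return total


if __name__ == "__main__":
    # C42H82NO8P is the sum formula of PC 34:1.
    print(round(monoisotopic_mass("C42H82NO8P"), 4))
```

The result is the neutral monoisotopic mass. Adduct masses, as used for m/z values in mass spectrometry, would require adding or subtracting the adduct and electron masses, which this sketch leaves out.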