Cells need sugar to communicate
Large volumes of information have been generated on the novel coronavirus since it first emerged, and researchers continue to pore over the available data. Bioinformatics tools can help researchers draw links between different pieces of information and gain deeper insights into biological processes, a topic at the heart of Dr Frédérique Lisacek’s research.
The power of bioinformatics tools lies in their ability to help users correlate separate pieces of information that may otherwise seem to be unrelated, helping researchers to gain a deeper understanding of biological processes. This work holds particular importance in the context of the ongoing Covid-19 pandemic, and over the last year or so large amounts of data have been generated on the virus, including some strikingly detailed images of its structure. “In images we’ve seen that the spike protein on the surface is covered by carbohydrates. This is called the glycan shield, or the sugar shield,” says Dr Frédérique Lisacek, Head of the Proteome Informatics Group (PIG) at the Swiss Institute of Bioinformatics (SIB).
A number of different terms are used to describe carbohydrates, including polysaccharides, glycans and sugars, which can cause a degree of confusion. Regardless of the precise naming convention, the sugar shield is a significant component of the surface of SARS-CoV-2 and other viruses, yet much remains unclear about the role of these molecules not only in viruses, but in the cells of all organisms, including bacteria, plants and fungi. “We don’t know the extent to which proteins are covered by carbohydrates, and little is known about the role of these molecules at the surface of proteins,” explains Dr Lisacek.
As a bioinformatician, Dr Lisacek aims to help researchers work more collaboratively and share expertise, which can be challenging. The experimental means required to solve problems on the glycosylation of proteins are very different to those used in research on carbohydrate-binding proteins for example, which has consequences. “Glycoscientists are essentially working on the same topic but on different objects broken into glycoproteins, glycans, and carbohydrate-binding proteins, yet overall it’s one and the same story,” says Dr Lisacek. This is an issue she and her colleagues in the PIG are working to address. “For many decades now, experimental data of the broken pieces have piled up in silos, which are isolated from each other,” she continues. “Glycoscientists recognise this themselves and try to reconnect. We are providing bioinformatics tools to accelerate change.”
SARS-CoV-2 spike protein (left), and the same protein with its glycan shield (right) [image created by Dr. Lorenzo Casalino in Professor Rommie Amaro’s lab (UCSD)].*
Glycoproteomics
The focus for the PIG team is to develop databases and software tools that will help life scientists detect glycosylation, which could then lead to new insights into cellular communication. The technique most commonly used to resolve the structure of glycans is mass spectrometry; in comparison, DNA can be sequenced with relative ease. “There are a handful of experimental approaches to sequence DNA, while there are hundreds to identify proteins or glycans,” says Dr Lisacek. The main area of interest in her group is the field of glycoproteomics, which hinges on mass spectrometry. “Glycoproteomics is a means of breaking down the barriers described above, because it tackles the large-scale identification of both the glycoproteins and the glycans at the same time,” she continues. “So, we are of course extremely interested in collecting the data that is generated. Glycoproteomics is a young field, only around a decade old.”
Cell-cell interface featuring glycans and glycan-binders, with labels for N-glycans, glycosphingolipid glycans, mucin-type O-glycans and a glycan-binding protein [illustration inspired by nature.com/articles/nmeth0111-55/figures/1].
EU Research
The level of precision in glycoproteomics is such that glycans cannot be identified accurately; they are identified only by their composition. “You can identify what type of monosaccharide is used to build it for example, but you don’t have a really precise structure,” explains Dr Lisacek.
“With glycoproteomics you try to identify the glycans and the proteins at the same time. However, the level of precision is such that you cannot really accurately identify the glycans, you only identify them by their composition.”
A tool called Compozitor has been designed to uncover existing structural relationships between glycans. A mass spectrometry (MS) experiment simply generates a list of the molecules it identifies, and without bioinformatics processing, the items on the list would remain entirely independent of each other. “That is why bioinformatics tools are needed to parse the output of MS experiments in order to re-establish the relationships lost in the experimental process that starts with separating the molecules,” outlines Dr Lisacek. “The glycans are related because they are built with the same biosynthetic machinery.” The purpose of Compozitor is to generate a graph that captures these relations between glycans on the basis of the shared set of enzymes required for their synthesis. “This is to move away from uninformative lists. We believe this is the first tool to provide this interconnected view from a set of experimental results,” outlines Dr Lisacek.
The situation she is tackling here is essentially a protein-protein interaction, mediated by glycans. “We have a glycoprotein, on which a glycan is attached, and then a whole different category of proteins is binding this particular glycan. This is another family of proteins called lectins, or carbohydrate-binding proteins,” she explains. As previously mentioned, the aim is to build a complete picture, connecting the glycans with the glycosylated proteins and the carbohydrate-recognising proteins, which would then help researchers look at important questions around the role of these glycan molecules. “Where is a particular glycan actually located on the protein? What type of glycan is it? What kind of carbohydrate-binding protein can recognise it?” continues Dr Lisacek. “Some people describe this as the glyco-code. It is thought that there is information encoded in a glycan, a type of signal that triggers a response via the matching lectins.”
For example, blood groups are among the best-known glycan molecules. They constitute the terminal parts of larger glycans and ‘stick out’ at the surface of red blood cells. “They are a signal and their recognition by lectins has partially known consequences,” outlines Dr Lisacek. “In all likelihood, there are also other parts of full glycan molecules that are as ‘meaningful’ as blood groups, but our understanding is still limited.”
The ultimate goal would be to crack this code, yet this is an extremely challenging task, as there are huge numbers of proteins and glycans in the body. The extent of the glycome, the full set of glycans, is itself extremely difficult to define. “Some people say that there are tens of thousands of glycans in the glycome, while others argue that there are hundreds of thousands, or even more,” says Dr Lisacek. By developing new bioinformatics tools, Dr Lisacek hopes to help researchers navigate this extremely complex picture and study biological processes as well as the root causes of disease in greater detail. “In the case of the coronavirus, we have yet to understand if the glycan shield is more than camouflage, whether it drives specific protein-protein interactions that would be mediated by these sugars,” she says.
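The idea of turning a flat list of glycan compositions into a connected graph, as described for Compozitor, can be illustrated with a small sketch. This is not Compozitor’s actual method, which the article does not detail. It assumes a deliberately simple rule: two compositions are linked when one can be obtained from the other by adding a single monosaccharide, standing in for one enzymatic step. The compositions, the function names `added_unit` and `build_graph`, and the four monosaccharide classes are all hypothetical choices made for illustration.

```python
from itertools import combinations

# Monosaccharide classes used to write compositions (illustrative shorthand).
UNITS = ("Hex", "HexNAc", "Fuc", "NeuAc")

# Hypothetical list of compositions, as an MS experiment might report them.
observed = {
    "H3N4": {"Hex": 3, "HexNAc": 4},
    "H4N4": {"Hex": 4, "HexNAc": 4},
    "H5N4": {"Hex": 5, "HexNAc": 4},
    "H5N4F1": {"Hex": 5, "HexNAc": 4, "Fuc": 1},
    "H5N4S1": {"Hex": 5, "HexNAc": 4, "NeuAc": 1},
    "H9N2": {"Hex": 9, "HexNAc": 2},
}


def added_unit(smaller, larger):
    """Return the unit that turns `smaller` into `larger`, or None.

    Assumption: a link means exactly one monosaccharide was added.
    """
    changes = {u: larger.get(u, 0) - smaller.get(u, 0) for u in UNITS}
    nonzero = {u: d for u, d in changes.items() if d != 0}
    if len(nonzero) == 1:
        unit, delta = next(iter(nonzero.items()))
        if delta == 1:
            return unit
    return None


def build_graph(compositions):
    """Return directed edges (from, to, added unit) between compositions."""
    edges = []
    for (name_a, a), (name_b, b) in combinations(compositions.items(), 2):
        unit = added_unit(a, b)
        if unit is not None:
            edges.append((name_a, name_b, unit))
            continue
        unit = added_unit(b, a)
        if unit is not None:
            edges.append((name_b, name_a, unit))
    return edges


edges = build_graph(observed)
for source, target, unit in edges:
    print(f"{source} -> {target} (+{unit})")
```

In this toy list, five of the six compositions end up connected in a small network, while `H9N2` stays isolated because no other entry is a single monosaccharide away from it. A real tool would rely on curated knowledge of the enzymes involved rather than on this counting rule.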
www.euresearcher.com
Glycan expression
A further way of looking at disease would be to investigate whether there is a differential expression of glycans at the surface of a diseased cell that distinguishes it from a healthy cell. This approach can be used to monitor the progression of certain types of cancer. “There is evidence for glycans being biomarkers in cancer. Glycan biomarkers are more often a marker of the early onset of disease,” says Dr Lisacek.
The PIG works with collaborators in different parts of the world who contribute data. Dr Lisacek and her team are also part of the GlySpace Alliance, which was formed with other international bioinformatics teams to help researchers work more efficiently and effectively. “There’s a lot to do in the glycoinformatics field. We are trying to harmonise our respective developments, so that we can go forward faster,” she explains. “We don’t want to compete, because there are relatively few of us working in glycoinformatics. Gathering data on the glycome and putting it in the wider picture of the other ‘omes, such as the proteome and the genome, is a major priority in glycosciences research.”
Glycomics@Expasy
Bioinformatics to support glycoscience
Project Objectives
In cooperation with experimentalists, our project focuses on the design and development of computer-based resources for scientists who investigate the role of oligosaccharides/glycans in multiple applications, especially involving microbes and the immune response they trigger. We strive to provide user-friendly and interactive tools compliant with recognised bioinformatics standards in an attempt to bridge with multiple -omics fields.
Project Funding
This project has been supported by:
- FP7 Innovative Training Network (# 316929)
- Swiss Federal Government through the State Secretariat for Education, Research & Innovation (SERI)
and is currently supported by:
- Swiss National Science Foundation (# 31003A_179249)
- Glyco@Alps ANR PIA “Initiative of Excellence” (ANR-15-IDEX-02)
Project Partners
• Anne Imberty, CERMAV-CNRS, Grenoble, France https://www.cermav.cnrs.fr/language/en/the-teams/1788-2/
• Nicolle H Packer, Macquarie University, Sydney, Australia https://researchers.mq.edu.au/en/persons/nicki-packer
Contact Details
Project Coordinator, Frédérique Lisacek
SIB Swiss Institute of Bioinformatics
Proteome Informatics Group
CUI - Battelle - bâtiment A
7, route de Drize
1227 Geneva
T: +41 22 379 01 95
E: frederique.lisacek@sib.swiss
W: https://www.sib.swiss/frederique-lisacek-group
*Based on work published in ACS Central Science (DOI: https://doi.org/10.1021/acscentsci.0c01056).
Frédérique Lisacek
Frédérique Lisacek is Manager of the Proteome Informatics Group (PIG) at the Swiss Institute of Bioinformatics (SIB). Her group collaborates with an international network of glycoscientists to deliver increasingly popular glycoinformatics resources. She previously held research positions in biology labs in France, Japan and Australia working on biological knowledge representation and predictive methods.