Importance Sampling of Chemical Compound Space


Importance sampling of chemical compound space: Thermodynamic properties from high-throughput coarse-grained simulations

A rational design of small molecules in specific biomolecular environments

Project Objectives

Small organic molecule represented at an atomistic resolution (left) and coarse-grained resolution (right).

Illustration of a small molecule (left) close to a lipid membrane interface. Water is not represented.

Researchers continue to search for new molecules with functions relevant to the development of new products, yet identifying the right molecules from the vast number available is a demanding task. We spoke to Dr Tristan Bereau about his work in using computer simulations to systematically explore chemical space and help accelerate compound discovery in soft matter.

A lot of attention in research is focused on moving towards a more structured approach to materials design, with scientists seeking to identify molecules with specific functionalities relevant for the development of new products. This includes using not just laboratory-based experiments, but also increasingly computer simulations, a topic at the heart of Dr Tristan Bereau’s research. “In my group we are trying to explore what we call chemical compound space, which is the ensemble of all stable molecules,” he outlines. “Most molecules have never been synthesized, so computer simulations can help us to discover interesting new compounds.”

This work involves systematically exploring certain properties of these molecules, with the wider aim of moving towards a more rational approach to materials development, specifically for soft matter. Soft matter deals with materials that are structurally altered by thermal fluctuations: these include liquids, polymers, and colloids, as well as biological materials.

A rational approach to materials design involves measuring the same quantity for many molecules, thereby establishing a relationship between chemical structure and function. Certain adjustments are required to make measurements for many molecules. Unlike in the laboratory, compounds don’t need to be synthesized, but the models need to be parameterized, to define how molecules interact with one another.


Optimizing these parameters has historically been a labour-intensive, manual task, but now machine learning can be used to help do this more efficiently. “We are effectively setting up a high-throughput screening experiment, using computer simulations. We can do this because we essentially parameterize every molecule automatically, rather than spending time on each compound,” continues Dr Bereau. The result of a high-throughput screening procedure is a database relating a large number of compounds to specific properties.

Artistic representation of a drug (yellow) interacting with a lipid membrane. Our scheme allows us to explore how varying the small molecule affects drug-membrane thermodynamics at high throughput.

“We’re building databases of thermodynamic properties for small organic molecules in complex environments, like in a liquid,” continues Dr Bereau. “These properties are essential, yet we only have measurements for a small number of molecules.” Databases are often used to train machine learning models, which then predict compound properties for which there is no data, effectively filling the gaps for large portions of chemical space. Ultimately, such a computational approach can suggest compounds of interest to be tested experimentally in the laboratory, speeding up the compound discovery process.
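The idea of filling gaps in a database can be illustrated with a deliberately simple sketch. Here an ordinary least-squares line stands in for the machine-learning model; the compound names, the single descriptor, and all numerical values are made up for illustration and do not come from the project's data.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Toy database (made-up values): compound -> (descriptor, known property).
database = {
    "cmpd_1": (0.0, 1.0),
    "cmpd_2": (1.0, 3.0),
    "cmpd_3": (2.0, 5.0),
    "cmpd_4": (3.0, 7.0),
}
# Hypothetical compounds with a descriptor but no property value yet.
unmeasured = {"cmpd_5": 4.0, "cmpd_6": 1.5}

xs = [x for x, _ in database.values()]
ys = [y for _, y in database.values()]
slope, intercept = fit_line(xs, ys)

# Fill the gaps: predict the property for compounds without data.
predictions = {name: slope * x + intercept for name, x in unmeasured.items()}
for name, value in predictions.items():
    print(f"{name}: predicted property = {value:.2f}")
```

A real workflow would use many descriptors and a more flexible model, but the pattern is the same: train on compounds with known properties, then predict for those without.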

Computer simulations

This interest in developing computational high-throughput screening techniques to investigate thermodynamic properties is not new. However, so far work in this area has been limited by computational-power constraints. “Calculations of thermodynamic properties, such as free energies, are best obtained from molecular dynamics simulations, which unfortunately require significant computational investment,” explains Dr Bereau. This limits researchers to focusing on just a handful of compounds. To circumvent this barrier and reach much larger scales, Dr Bereau and his colleagues use so-called coarse-grained models, which simplify the representation by grouping several atoms together. “It’s a bit like looking at a Pointillism painting. If you look at these models from a distance, they look like molecules, but if you look closer, you don’t see every atom,” he explains.

Not only do the simulations converge much faster, but a single coarse-grained simulation also provides information about many compounds. Dr Bereau draws an analogy here with construction toy models. “If you’re asked to use Lego bricks to build two specific molecules that closely resemble each other, you might find that you don’t have enough resolution in your building blocks to tell them apart,” he explains. “So you would use the same set of bricks for the two compounds, as at this specific resolution they effectively look the same. This is something that we use to our advantage, in the sense that we only need a few simulations to screen a large part of chemical space. This makes coarse-graining an efficient strategy for high-throughput screening.”

A property that Dr Bereau and his colleagues have been studying is the propensity for a molecule to cross a cell membrane, an important quantity in drug development. “Before going into a cell, a drug has to permeate across the cell membrane, a soft architecture made primarily of phospholipids,” he explains. The researchers have been studying how likely a molecule is to cross a lipid membrane, and how quickly it does it. “In our last paper we predicted the permeability coefficient for several hundred thousand molecules, several orders of magnitude more than previously achieved from computer simulations,” outlines Dr Bereau. While the coarse-grained models are minimalistic, Dr Bereau says they are tailored to encode the relevant driving forces for this particular problem, making them extremely efficient. “We put a lot of the physical ingredients that we think are relevant into the model, making our calculations reliable,” he stresses.

“Most molecules have never been synthesized, so computer simulations can help discover new interesting compounds”

Molecular design

This research not only holds important implications for the pharmaceutical sector, but is also of intrinsic scientific interest, enabling scientists to probe deeper into the physical chemistry relevant to a problem. “Once we have generated the data, we can establish a structure-property relationship,” outlines Dr Bereau. “This can help us in the design of new molecules. For example, if you want to design a molecule that can permeate through the lipid membrane easily, then the database can help researchers identify what type of chemical group is most relevant.”

Researchers have already gathered a lot of data, and Dr Bereau hopes their work will help encourage further use of computer simulations for high-throughput screening. “This protocol could be adapted to different types of systems, materials, and environments,” he says. Researchers are also looking to analyse the results in greater depth. “There’s still a lot of information that we could tease out from the data. It will be very interesting to go back and use machine learning to gain further insights,” continues Dr Bereau.

EU Research

Illustration of the reduction of chemical space: many structurally- and thermodynamically-similar molecules map to the same coarse-grained representation.
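The many-to-one mapping described in this caption can be sketched in a few lines. The example below is a toy illustration, not the group's actual mapping scheme: the bead types, their reference scores, the per-fragment scores, and the molecule names are all hypothetical. Each fragment of a molecule is assigned the closest bead type, and molecules that end up with the same bead sequence share one coarse-grained simulation.

```python
from collections import defaultdict

# Hypothetical bead types, each labelled by a made-up hydrophobicity score
# (illustrative numbers only, not parameters of any real force field).
BEAD_TYPES = {"polar": -2.0, "intermediate": 0.0, "apolar": 2.0}


def nearest_bead(score):
    """Return the bead type whose reference score is closest to `score`."""
    return min(BEAD_TYPES, key=lambda name: abs(BEAD_TYPES[name] - score))


def coarse_grain(fragment_scores):
    """Map a molecule, given as per-fragment scores, to a tuple of bead types."""
    return tuple(nearest_bead(s) for s in fragment_scores)


def group_by_representation(molecules):
    """Group molecule names that share the same coarse-grained representation."""
    groups = defaultdict(list)
    for name, fragment_scores in molecules.items():
        groups[coarse_grain(fragment_scores)].append(name)
    return dict(groups)


# Made-up molecules: each is a list of fragment scores (toy values).
molecules = {
    "mol_A": [-1.8, 1.9],
    "mol_B": [-2.3, 2.4],
    "mol_C": [0.2, 1.7],
    "mol_D": [-0.3, 2.2],
    "mol_E": [-1.6, -2.1],
}

groups = group_by_representation(molecules)
for representation, names in groups.items():
    print(representation, "->", names)
print(f"{len(molecules)} molecules need {len(groups)} coarse-grained simulations")
```

In this toy set, five molecules collapse onto three coarse-grained representations, which is the sense in which a few simulations can cover a large part of chemical space.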

www.euresearcher.com

The Emmy Noether project “Importance Sampling in Chemical Space” aims at a systematic investigation and rational design of small molecules in specific biomolecular environments. This calls for the development of high-throughput computer simulations combined with data-driven techniques to generate and subsequently analyze large databases of thermodynamic properties.

Project Funding

Emmy Noether programme of the Deutsche Forschungsgemeinschaft (DFG)

Contact Details

Project Coordinator, Dr Tristan Bereau
Max Planck Institute for Polymer Research
Ackermannweg 10
55128 Mainz
Germany
T: +49 (0)6131 379 478
E: bereau@mpip-mainz.mpg.de
W: http://www.mpip-mainz.mpg.de/~bereau/

Menichetti, Kanekal, Bereau, Drug–Membrane Permeability across Chemical Space, ACS Central Science, 5, 290-298 (2019); https://pubs.acs.org/doi/10.1021/acscentsci.8b00718

Menichetti, Kanekal, Kremer, Bereau, In silico screening of drug-membrane thermodynamics reveals linear relations between bulk partitioning and the potential of mean force, Journal of Chemical Physics, 147, 125101 (2017); https://doi.org/10.1063/1.4987012

Dr Tristan Bereau

Dr Tristan Bereau is an independent group leader of the Theory Group at the Max Planck Institute for Polymer Research in Mainz, Germany. His research focuses on the modeling of soft-matter and biomolecular systems using a combination of physics-based computer simulations and data-driven techniques.
