MARCH 2017 VOL 3 ISSUE 3
“We cannot solve our problems with the same thinking we used when we created them.”
- Albert Einstein
Site-specific docking: Frequently Asked Questions & answers for starters
A short introduction to protein structures modification and ModFinder
Contents
March 2017
Topics
Editorial 05
03 Tools A short introduction to protein structures modification and ModFinder 07
04 Tips & Tricks Site-specific docking: Frequently Asked Questions & answers for starters 09
FOUNDER TARIQ ABDULLAH EDITORIAL EXECUTIVE EDITOR TARIQ ABDULLAH FOUNDING EDITOR MUNIBA FAIZA SECTION EDITORS FOZAIL AHMAD ALTAF ABDUL KALAM MANISH KUMAR MISHRA SANJAY KUMAR NABAJIT DAS
REPRINTS AND PERMISSIONS You must have permission before reproducing any material from Bioinformatics Review. Send E-mail requests to info@bioinformaticsreview.com. Please include contact details in your message.
BACK ISSUE Bioinformatics Review back issues can be downloaded in digital format from bioinformaticsreview.com at $5 per issue. Back issues in print format cost $2 for India delivery and $11 for international delivery, subject to availability. Pre-payment is required.
CONTACT PHONE +91. 991 1942-428 / 852 7572-667 MAIL Editorial: 101 FF Main Road Zakir Nagar, Okhla New Delhi IN 110025
STAFF ADDRESS To contact any of the Bioinformatics Review staff members, simply format the address as firstname@bioinformaticsreview.com
PUBLICATION INFORMATION Volume 1, Number 1, Bioinformatics Review™ is published monthly for one year (12 issues) by Social and Educational Welfare Association (SEWA) trust (Registered under Trust Act 1882). Copyright 2015 Sewa Trust. All rights reserved. Bioinformatics Review is a trademark of Idea Quotient Labs and used under license by SEWA trust. Published in India
EDITORIAL
Bioinformatics Review (BiR): Bridging Between The Two Worlds
Informatics and Biology are two sciences which are as different from each other as possible. One runs on the core concept of variation and the other on strict reasoning. Still, the two have combined in a most natural way under the realm of “Bioinformatics”. For a biologist today, it is difficult to imagine a world without biological databases, or without a discipline to decipher the huge enigma they bring. The Bioinformatics Review (BiR) journal is a platform to discover the latest happenings in this melting pot of two varied fields.
Dr. Roopam Sharma
Honorary Editor
The era of “omics” kick-started with the completion of the Human Genome Project (HGP) in 2003. Since then, a number of technological advancements, especially NGS, have been generating mind-boggling amounts of data for the knowledge banks. Latest inventions like single-cell transcriptomics or metagenomics of the most unusual habitats show how the evolution of technology is directly resulting in breakthroughs in the biological sciences. Among the various areas of biology that have benefited from these advancements is pathology. In fact, deciphering the molecular and genetic basis of diseases in humans was the guiding force behind the Human Genome Project. Bioinformatics has led to an impressive increase in the recognition of possible pathogenic factors in varied systems, so much so that new techniques are being devised to speed up the testing of these factors in the wet lab. Considered computationally, the smaller but ever-changing genomes and transcriptomes of these pathogens make them very suitable candidates for testing many hypotheses in bioinformatics studies. Effector bioinformatics involves building custom pipelines for distinct species based on the characteristics of the effectors and the size of the genome involved. These can be based on
homology, feature extraction, or both; e.g., the discovery of RXLR motifs in oomycete effectors allowed many more effectors to be identified. This collaboration of two sciences for plant pathology has led to the development of many general-use platforms like the Broad Fungal Genome Initiative, EuPathDB, PhytoPath and so on, but there is still a great need to develop specialized resources like PHI-base for specific areas like effector biology. The use of machine-learning techniques like the artificial neural network approach (which is itself based on biological neural networks) shows how the two branches are so distinct yet so intertwined. All in all, it is a brave new world where artificial communication is not only stimulating but also helping us understand the communication (between host and pathogen) going on within the realm of life.
In this issue, BiR focusses on reviews related to some of the very basic techniques which have been used in computational biology and their applications in various biological studies. We look forward to continued support from our readers and contributors. For suggestions and feedback, do write to us at info@bioinformaticsreview.com.
TOOLS
Recent advances in in silico approaches for enzyme engineering
“ModFinder is a novel software which offers the identification and visualization of protein modifications in the protein 3D structures available in PDB.”
Enzymes are natural biocatalysts and an attractive alternative to chemicals, providing improved efficiency for biochemical reactions. They are widely utilized in industrial biotechnology and biocatalysis to introduce new functionalities and enhance the production of enzymes. To prove beneficial for industrial purposes, enzymes need to be optimized by protein engineering. This article specifically reviews the computational approaches for enzyme engineering and structural determination of enzymes developed in recent years to improve the efficiency of enzymes and to create novel
functionalities to obtain products with high added value for industrial applications. Enzyme engineering strategies aim to create enzymes with novel and improved activities, specificities, and stabilities, and they are greatly influenced by in silico methods. In silico approaches in enzyme engineering can be applied in three main forms: structure analysis, molecular modeling, and de novo design. A detailed investigation of engineered enzymes provides valuable information about their structural origin, biochemical catalysis, and natural protein evolution [1]. A large number of enzymes have been widely utilized in biotechnology, pharmaceutical, and
industrial processes. Because enzymes can accelerate reactions by a factor of up to 10^17 even in mild environments [2], much research is focused on making enzymes applicable in academic, industrial, and commercial settings, which has resulted in the rapid progress of enzyme engineering in recent years.
Databases and tools for engineering enzyme activity
A web server named ZEBRA has been developed for analyzing enzyme functional subfamilies [3]. This web server attempts to systematically identify and analyze adaptive mutations. These subfamily-specific positions (SSPs) are conserved within each subfamily but differ between subfamilies. The implemented statistical analysis evaluates the significance of SSPs, which can then be modified by rational design or focused directed evolution. The method has been tested with the α/β-hydrolase superfamily [4]. SSPs calculated for the amidases were integrated into the sequence of the lipase CALB and a library of mutants was constructed. In silico screening of the library for reactive enzyme-substrate complexes resulted in the selection of lipases with significantly improved amidase activity. Another method, named JANUS, analyzes multiple sequence alignments to predict mutations which can be used for the interconversion of structurally related but functionally distinct enzymes [5]. This method has been verified by applying it to the interconversion of aspartate aminotransferase into tyrosine aminotransferase. The incorporation of 35 mutations resulted in a protein with the desired specificity but low catalytic activity, which had to be optimized by DNA back-shuffling [1]. A similar approach was taken by Yang et al. (2012), who proposed a computational approach to engineer allosteric regulation [6]. They performed a statistical comparison between the catalytic and allosteric binding sites, which showed that allosteric
sites are evolutionarily more variable and comprise more hydrophobic residues than the catalytic sites.
Tools for molecular modeling and structural analysis of enzymes
ROSETTA and ORBIT are the most widely used tools for de novo enzyme design. A new algorithm known as Dead-End Elimination with Perturbations (DEEPer) has been developed by Hallen et al. (2013); it calculates the global minimum-energy conformation of structures with large backbone perturbations [7]. This algorithm attempts to generate more flexible enzyme structures. A computational method has been developed by Khare et al. (2012) which redesigns the active site to catalyze many reactions [8]. Keedy et al. (2012) proposed a novel algorithm for modeling local backrub motions, which are subtle backbone adjustments that take place during amino acid substitutions in natural protein evolution; modeling them increases model accuracy [9].
Algorithms for engineering enzyme stability
Most of the consensus-based algorithms for designing thermostable proteins use the information from multiple sequence alignments to predict the most suitable and most often naturally occurring amino acid at a particular position. However, Sullivan et al. (2012) pointed out that consensus approaches are not always reliable: consensus mutations at more conserved positions were more likely to be stabilizing in the model protein triosephosphate isomerase, while mutations at highly correlated positions were destabilizing [10]. Another algorithm was developed by Wang et al. (2012), based on the opposite principle from Sullivan et al. (2012). This method, called combinatorial coevolving-site saturation mutagenesis (CCSM), is used for identifying hotspots for mutagenesis [11]. Molecular modeling approaches greatly benefit from growing computational power and parallelized calculations on graphics cards. Molecular modeling studies combine several in silico methods, including bioinformatics analysis, to describe structure-function properties and predict beneficial mutations. For example, thermostable proteins can be predicted by combining the calculation of Gibbs free energies with evolutionary analyses. Current challenges include the quantitative modeling of enzyme selectivity and activities, which require the precise estimation of binding energies and
reaction activation barriers. Since enzyme engineering has become important for biotechnological, biocatalytic, and industrial purposes, it has become a focus of research. Designed enzymes still need to be improved by many rounds of directed evolution, and this will not change in the near future.
References
1. Damborsky, J., & Brezovsky, J. (2014). Computational tools for designing and engineering enzymes. Current Opinion in Chemical Biology, 19, 8-16.
2. Radzicka, A., & Wolfenden, R. (1995). A proficient enzyme. Science, 267, 90-93.
3. Suplatov, D., Kirilin, E., Takhaveev, V., & Švedas, V. (2014). Zebra: a web server for bioinformatic analysis of diverse protein families. Journal of Biomolecular Structure and Dynamics, 32(11), 1752-1758.
4. Suplatov, D. A., Besenmatter, W., Švedas, V. K., & Svendsen, A. (2012). Bioinformatic analysis of alpha/beta-hydrolase fold enzymes reveals subfamily-specific positions responsible for discrimination of amidase and lipase activities. Protein Engineering Design and Selection, gzs068.
5. Addington, T. A., Mertz, R. W., Siegel, J. B., Thompson, J. M., Fisher, A. J., Filkov, V., ... & Toney, M. D. (2013). Janus: prediction and ranking of mutations required for functional interconversion of enzymes. Journal of Molecular Biology, 425(8), 1378-1389.
6. Yang, J. S., Seo, S. W., Jang, S., Jung, G. Y., & Kim, S. (2012). Rational engineering of enzyme allosteric regulation through sequence evolution analysis. PLoS Comput Biol, 8(7), e1002612.
7. Hallen, M. A., Keedy, D. A., & Donald, B. R. (2013). Dead-end elimination with perturbations (DEEPer): A provable protein design algorithm with continuous sidechain and backbone flexibility. Proteins: Structure, Function, and Bioinformatics, 81(1), 18-39.
8. Khare, S. D., Kipnis, Y., Takeuchi, R., Ashani, Y., Goldsmith, M., Song, Y., ... & Stoddard, B. L. (2012). Computational redesign of a mononuclear zinc metalloenzyme for organophosphate hydrolysis. Nature Chemical Biology, 8(3), 294-300.
9. Keedy, D. A., Georgiev, I., Triplett, E. B., Donald, B. R., Richardson, D. C., & Richardson, J. S. (2012). The role of local backrub motions in evolved and designed mutations. PLoS Comput Biol, 8(8), e1002629.
10. Sullivan, B. J., Nguyen, T., Durani, V., Mathur, D., Rojas, S., Thomas, M., ... & Magliery, T. J. (2012). Stabilizing proteins from sequence statistics: the interplay of conservation and correlation in triosephosphate isomerase stability. Journal of Molecular Biology, 420(4), 384-399.
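As a small illustration of two sequence-based ideas reviewed in this article, the sketch below computes a per-position consensus from a multiple sequence alignment and then flags subfamily-specific positions (SSPs), i.e. columns that are conserved within each subfamily but carry a different consensus residue in each. It is a toy under stated assumptions: the alignment, the subfamily names, and the `min_conservation` threshold are invented for the example, and the simple frequency cut-off stands in for the statistical significance test that ZEBRA actually implements.

```python
from collections import Counter

def column_consensus(column):
    """Return (most frequent residue, its frequency) for one alignment column."""
    residue, count = Counter(column).most_common(1)[0]
    return residue, count / len(column)

def consensus_sequence(sequences):
    """Consensus of an alignment: the most frequent residue at every position."""
    return "".join(column_consensus(col)[0] for col in zip(*sequences))

def subfamily_specific_positions(subfamilies, min_conservation=0.8):
    """Positions conserved inside every subfamily but different between them.

    min_conservation is a hypothetical threshold chosen for this example,
    not a value taken from ZEBRA.
    """
    length = len(next(iter(subfamilies.values()))[0])
    hits = []
    for pos in range(length):
        consensus_residues = set()
        for sequences in subfamilies.values():
            residue, freq = column_consensus([s[pos] for s in sequences])
            if residue == "-" or freq < min_conservation:
                break  # gap or not conserved within this subfamily
            consensus_residues.add(residue)
        else:
            # Conserved everywhere; keep it only if every subfamily differs.
            if len(consensus_residues) == len(subfamilies):
                hits.append(pos)
    return hits

# Invented toy alignment: two subfamilies, three aligned sequences each.
subfamilies = {
    "lipase_like": ["GHSMGA", "GHSLGA", "GHSMGA"],
    "amidase_like": ["GKSTGA", "GKSTGA", "GKSVGA"],
}

lipase_consensus = consensus_sequence(subfamilies["lipase_like"])
ssp_positions = subfamily_specific_positions(subfamilies)
print(lipase_consensus, ssp_positions)
```

In this toy alignment only the second column (index 1, H versus K) qualifies: the first column is identical in both subfamilies, and the fourth is too variable within each subfamily to pass the threshold.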
TIPS & TRICKS
Site-specific docking: Frequently Asked Questions & answers for starters
“I have collected some frequently asked questions and provided the link to their answers present in our question answer section of Bioinformatics Review.”
I have been getting several e-mails from researchers and students alike regarding in silico docking. Most questions are similar in nature, so I thought of answering them once and for all. In this article, I have collected some frequently asked questions and provided links to their answers in the question-and-answer section of Bioinformatics Review.
It is good to have questions in mind, and they can be solved in the way Albert Einstein put it:
"We cannot solve our problems with the same thinking we used when we created them"
Question: What is the difference between blind docking and binding-site-based docking?
In this article, I have collected some of the questions most frequently asked while performing site-specific and/or blind docking. You have to consider a lot of factors before performing an actual docking on a protein with a specific ligand.
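One of those factors is the search space. In practice, the difference between blind and site-specific docking largely comes down to how the search box is defined: around the whole receptor, or around the residues of a known binding site. The following is a minimal sketch of that idea in plain Python, not the procedure of any particular docking program. The coordinates, the 5 Å padding, and the `bounding_box` helper are all hypothetical and chosen only for illustration.

```python
def bounding_box(coords, padding=0.0):
    """Center and edge lengths of an axis-aligned box enclosing the points.

    padding (in Å) is added on every side of the box.
    """
    xs, ys, zs = zip(*coords)
    mins = (min(xs), min(ys), min(zs))
    maxs = (max(xs), max(ys), max(zs))
    center = tuple((lo + hi) / 2 for lo, hi in zip(mins, maxs))
    size = tuple((hi - lo) + 2 * padding for lo, hi in zip(mins, maxs))
    return center, size

# Invented coordinates (Å) standing in for receptor atoms.
protein_atoms = [
    (0.0, 0.0, 0.0), (30.0, 10.0, 5.0), (12.0, 28.0, 20.0),
    (10.0, 12.0, 8.0), (14.0, 16.0, 10.0),
]
# The subset of atoms that belongs to the known binding site.
site_atoms = [(10.0, 12.0, 8.0), (14.0, 16.0, 10.0)]

# Blind docking: the box covers the whole receptor.
blind_center, blind_size = bounding_box(protein_atoms, padding=5.0)
# Site-specific docking: the box covers only the binding-site residues.
site_center, site_size = bounding_box(site_atoms, padding=5.0)

print("blind:", blind_center, blind_size)
print("site: ", site_center, site_size)
```

The site-specific box is much smaller, so the search is concentrated where the ligand is expected to bind, whereas the blind box lets the ligand explore the entire surface at a higher computational cost.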
Question: Since the protein is a homodimer, should I get the same binding affinity values for the ligand in the two monomers?
Question: How do you predict a protein's binding sites?
Question: Since I know the binding sites in both monomers, do I necessarily need to mimic (preserve) the attachment of the ligand to these binding sites while docking, so as to get the binding energy?
Question: While doing docking, what if, in addition to the known binding-site attachments, the ligand shows a few more attachments with some residues? Do they contribute to the binding energy? If yes, do I need to dock the ligand in such a way that it shows the same attachments as in the PDB file when viewed in PyMOL?
Question: What about the conformation of the docked ligand? Should the docked ligand fall exactly on the ligand which we already have?
Question: Why are the hetatoms removed from the protein PDB file before docking?
Question: Why do we choose only one chain of the protein for docking?
Question: I have to find the 4 Ångström (Å) neighborhood of a ligand. When I take the receptor molecule without adding hydrogens and find the 4 Å neighborhood of the ligand, it gives one result. But when I first add hydrogens to the receptor molecule and then find the 4 Å neighborhood of the ligand, it obviously gives different results. I want to know which one is better. Second, if I have to find the distance between a ligand and any residue in the receptor molecule, do I have to take the distance between the central atoms, or can the minimum distance between any two atoms of the ligand and the receptor molecule be used?
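The effect described in the 4 Å neighborhood question is easy to reproduce on a toy system. The sketch below is only an illustration: the residue names, elements, and coordinates are invented, and `neighborhood`, `min_distance`, and `centroid` are hypothetical helpers rather than functions of PyMOL or any docking package. It lists the residues that have at least one atom within the cutoff of any ligand atom, with and without hydrogens, and compares the minimum atom-atom distance with a centroid-to-centroid distance.

```python
import math

def heavy_atoms(atoms):
    """Drop hydrogens from a list of (element, (x, y, z)) tuples."""
    return [(element, xyz) for element, xyz in atoms if element != "H"]

def min_distance(atoms_a, atoms_b):
    """Smallest distance between any atom of A and any atom of B."""
    return min(math.dist(a, b) for _, a in atoms_a for _, b in atoms_b)

def centroid(atoms):
    """Geometric center of a list of (element, (x, y, z)) tuples."""
    n = len(atoms)
    return tuple(sum(xyz[i] for _, xyz in atoms) / n for i in range(3))

def neighborhood(ligand, receptor, cutoff=4.0, include_hydrogens=True):
    """Residues with at least one atom within `cutoff` Å of any ligand atom."""
    if not include_hydrogens:
        ligand = heavy_atoms(ligand)
    hits = []
    for name, atoms in receptor.items():
        if not include_hydrogens:
            atoms = heavy_atoms(atoms)
        if atoms and min_distance(ligand, atoms) <= cutoff:
            hits.append(name)
    return sorted(hits)

# Invented toy coordinates in Å.
ligand = [("C", (0.0, 0.0, 0.0)), ("O", (1.2, 0.0, 0.0))]
receptor = {
    "SER195": [("O", (0.0, 0.0, 3.5)), ("H", (0.0, 0.0, 2.6))],
    "HIS57": [("N", (5.5, 0.0, 0.0)), ("H", (4.5, 0.0, 0.0))],
    "GLY193": [("C", (0.0, 8.0, 0.0))],
}

without_h = neighborhood(ligand, receptor, include_hydrogens=False)
with_h = neighborhood(ligand, receptor, include_hydrogens=True)
print(without_h)  # ['SER195']
print(with_h)     # ['HIS57', 'SER195']

ser_heavy = heavy_atoms(receptor["SER195"])
closest = min_distance(ligand, ser_heavy)
center_to_center = math.dist(centroid(ligand), centroid(ser_heavy))
print(closest, center_to_center)
```

Here HIS57 enters the neighborhood only once hydrogens are present, because its hydrogen sits closer to the ligand than its nitrogen does. The centroid-to-centroid distance is also larger than the minimum atom-atom distance for SER195, so whichever convention is chosen should be stated and applied consistently when results are compared.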
Subscribe to Bioinformatics Review newsletter to get the latest post in your mailbox and never miss out on any of your favorite topics. Log on to https://www.bioinformaticsreview.com