Dec 2016 VOL 2 ISSUE 12
“There is no law except the law that there is no law.”
- John Archibald Wheeler
How to perform docking in a specific binding site using AutoDock Vina?
Site-specific docking using Autodock Vina
Contents
December 2016
Topics
Editorial
Tutorial: How to perform docking in a specific binding site using AutoDock Vina?
EDITOR Dr. PRASHANT PANT
FOUNDER TARIQ ABDULLAH
EDITORIAL
EXECUTIVE EDITOR TARIQ ABDULLAH
FOUNDING EDITOR MUNIBA FAIZA
SECTION EDITORS FOZAIL AHMAD ALTAF ABDUL KALAM MANISH KUMAR MISHRA SANJAY KUMAR PRAKASH JHA NABAJIT DAS
REPRINTS AND PERMISSIONS
You must have permission before reproducing any material from Bioinformatics Review. Send e-mail requests to info@bioinformaticsreview.com. Please include your contact details in your message.
BACK ISSUE
Bioinformatics Review back issues can be downloaded in digital format from bioinformaticsreview.com at $5 per issue. Back issues in print format cost $2 for delivery in India and $11 for international delivery, subject to availability. Pre-payment is required.
CONTACT
PHONE +91. 991 1942-428 / 852 7572-667
MAIL Editorial: 101 FF Main Road Zakir Nagar, Okhla New Delhi IN 110025
STAFF ADDRESS
To contact any Bioinformatics Review staff member, simply format the address as firstname@bioinformaticsreview.com
PUBLICATION INFORMATION
Volume 1, Number 1, Bioinformatics Review™ is published quarterly for one year (4 issues) by Social and Educational Welfare Association (SEWA) Trust (Registered under Trust Act 1882). Copyright 2015 Sewa Trust. All rights reserved. Bioinformatics Review is a trademark of Idea Quotient Labs and used under license by SEWA Trust. Published in India.
EDITORIAL
The future and learning from your elders: experimental ecology and ‘the omics’ for emerging Bioinformaticians
Jennifer Wood
Honorary Editor
The excitement generated by the ‘omics-revolution’ has opened the door for new and innovative research. Whilst bioinformatics is advancing in many fields such as pathology, microbial ecology, agriculture, and medicine, there is room for researchers new to bioinformatics to take some important lessons from a much older field: ecology. For a biologist meeting bioinformatics for the first time, the massive amounts of data (also called ‘Big Data’) that are handled during bioinformatics projects can be overwhelming. For many, finding resources that give a clear understanding of the type of Big Data being generated and how it can be analyzed may be difficult, and the resources themselves can seem impenetrable. Additionally, there is a temptation to forget that it is the robustness of the experimental design and the quality of the data interpretation, not the amount of data generated, that will make the best science. Without a developed understanding of how data will be analyzed, it is impossible to generate a robust experimental design. We have at our fingertips a resource for exploring our world that is incredibly powerful but requires respect and forethought. The continued sharing of new bioinformatics developments, through journals such as Bioinformatics Review (BiR), will be paramount to the advancement of our field. However, I would urge those arriving in bioinformatics to draw on lessons from established fields, such as experimental and fundamental ecology. Ecology is one of the few biological disciplines that has been dealing with Big Data for decades, and many issues that arise in omics-based projects are discussed extensively in the ecology literature. Issues such as: the need for robust experimental design and replication to discern patterns amongst large environmental heterogeneity; how to deal with uneven
sampling depth; strategies for determining which species in a multivariate dataset respond to experimental treatments; the line between subsampling and true replication; and sample sizes that are too small for adequate power in tests of significance. As such, experimental ecology presents a relatively untapped resource for a) discussions around considerations for designing experiments that will yield Big Data, b) strategies for analyzing Big Data and c) theories that can enrich the interpretation of Big Data. Ecological theories such as Grime’s CSR theory on plant strategies have recently been reinterpreted for microbial ecology. Furthermore, ecological discussions of Big Data are often presented in a manner more familiar to those with biological backgrounds and can thus provide an understandable introduction to the basic concepts and challenges surrounding the use of Big Data. It is clear that the generation and treatment of Big Data are important elements of bioinformatics-based research that need to be fully considered before an experiment commences in order to adequately answer meaningful scientific questions. Journals like Bioinformatics Review (BiR), which demystify available technologies and analysis techniques, and fields such as ecology, which have worked through issues that are paralleled in bioinformatics, will be paramount in equipping researchers with the best information to design robust experiments. Please do share your comments, feedback, and suggestions at info@bioinformaticsreview.com
With Best Wishes
TUTORIAL
How to perform docking in a specific binding site using AutoDock Vina?
AutoDock Vina is a bioinformatics tool used to perform in silico docking of a protein with a ligand. It provides many options depending on the needs of the user. It supports both blind docking and docking in a specific pocket; the latter is often preferred when the binding site is already known. This article will guide you through docking a protein with a ligand in a specific binding site/pocket.
We are docking the protein Human Serum Albumin (HSA) with the ligand Sodium Octanoate (SO). In the crystal structure we use, HSA is already complexed with 3-carboxy-4-methyl-5-propyl-furanpropanoic acid (CMPF). We want to bind SO at the same site where CMPF is bound in HSA.
We need the following files prepared for docking with AutoDock Vina:
1. Pdbqt files of the protein and the ligand
2. Configuration file
3. Grid file
Preparation of PDB file before docking
1. Download a protein crystal structure from PDB. We are using Human Serum Albumin complexed with 3-carboxy-4-methyl-5-propyl-furanpropanoic acid (CMPF) (PDB ID: 2BXA).
Before proceeding further, we should be clear about whether we know the binding site of the protein. If we do, we can go straight to step 2. If we do not, we have two options: either read the literature linked on the PDB page from which we downloaded the structure, or visualize the protein structure in PyMOL and note down the interacting residues. The first option is recommended.
2. Open the PDB file and remove the HETATM records.
The structure we are using is a crystal structure complexed with ligand(s). To dock the desired ligand at that particular position, we therefore need to remove the bound ligand by deleting the HETATM records (heteroatoms) from the PDB file. If we dock our ligand without removing the already complexed ligand, we will not get correct results. The bound ligand can also be removed easily after loading the protein in PyMOL.
3. After removing the heteroatoms, keep only one chain (here, chain A) and remove the remaining chains. The extra chains are removed just to reduce complexity.
4. Now save the file as “protein.pdb”.
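Steps 2 to 4 can also be done with a few lines of script instead of by hand. The following is only a minimal sketch, assuming a standard fixed-column PDB file; the function name `keep_chain_atoms` and the three sample records are made up for illustration and are not taken from 2BXA.

```python
def keep_chain_atoms(pdb_lines, chain_id="A"):
    """Return only the ATOM records of one chain, followed by an END record."""
    kept = []
    for line in pdb_lines:
        # Columns 1-6 hold the record name; column 22 (index 21) holds the chain ID.
        is_atom = line[:6].strip() == "ATOM"
        if is_atom and len(line) > 21 and line[21] == chain_id:
            kept.append(line.rstrip("\n"))
        # HETATM records (bound ligands, waters, ions) and other chains are dropped.
    kept.append("END")
    return kept


# Made-up three-line example in PDB column layout (not taken from 2BXA).
sample = [
    "ATOM      1  N   ASP A   1      10.000  10.000  10.000  1.00 20.00           N",
    "ATOM      2  N   ASP B   1      20.000  20.000  20.000  1.00 20.00           N",
    "HETATM    3  O   HOH A 601      30.000  30.000  30.000  1.00 20.00           O",
]
filtered = keep_chain_atoms(sample)
print("\n".join(filtered))
```

For a real structure, the lines would be read from the downloaded PDB file and the result written to “protein.pdb”. It is worth opening the output in PyMOL afterwards to confirm that only chain A remains.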
Now we have prepared our protein structure for docking. Next, we will prepare the ligand that we want to dock with the protein.
Preparation of ligand before docking
5. Open PubChem (www.pubchem.ncbi.nlm.nih.gov) and search for the compound. We are using “sodium octanoate” as the ligand. The structure can also be downloaded from the ZINC database.
6. Click on Sodium octanoate, look under the “3D Structure” section, and click “Download”. You will see four different download formats. We will download the .SDF format.
7. Since both the protein and the ligand need to be in .pdb format, we have to convert .SDF to .pdb. We will use PyMOL for this purpose and avoid online converters, because they may corrupt the ligand file.
8. Open PyMOL and open the downloaded ligand. Click “File” --> “Save Molecule” --> select the molecule --> click “OK”. You can save it to your desired folder. We will name the ligand file “SO.pdb” to avoid any confusion.
Now we have PDB files of both the protein and the ligand. AutoDock Vina requires its input in .pdbqt format, so we need to prepare .pdbqt files from the .pdb files of the protein and the ligand.
Preparation of .pdbqt files
The menus used below belong to AutoDockTools (ADT), the graphical interface that is installed with MGLTools; AutoDock Vina itself is run from the command line.
First, we will prepare a .pdbqt file of the ligand.
1. Open AutoDockTools --> click “Ligand” --> click “Input” --> click “Open”. It will ask you to select your ligand; go to the folder where the ligand’s .pdb file was saved and select “SO.pdb”.
2. Click “Ligand” --> click “Torsion Tree” --> click “Detect Root”. This marks the root of the ligand, relative to which its rotatable bonds (torsions) are defined.
3. Click “Ligand” --> click “Output” --> click “Save as PDBQT”. We could rename the ligand, but we will keep the same name as before, “SO.pdbqt”, and save it in the same folder.
We have prepared a .pdbqt file of the ligand, and now we will prepare the protein file.
4. In AutoDockTools, click “File” --> click “Read Molecule” --> select protein.pdb.
5. Delete the water molecules from the protein, as they can form unwanted interactions with the ligand. Click “Edit” --> click “Delete water”.
6. Add polar hydrogens so that no polar group in the protein is left incomplete. Click “Edit” --> click “Add Hydrogens” --> click “Polar only”.
7. Save this file as .pdbqt: click “Grid” --> click “Macromolecule” --> click “Choose” --> select “protein.pdb” --> click “OK”. It will ask for a folder; save the file as “protein.pdbqt” in the same folder where the .pdbqt file of the ligand was saved.
Defining binding site
Now we will define the binding site in the protein. Look at the panel in the left corner of the AutoDockTools (ADT) window. You will see the name of the protein written there. Click on it once to show the chains present in the structure. Since we have deleted all the chains except chain A, only chain A is listed. Click on it to display all the residues present in the chain.
Now scroll down and look for your desired residues. In this protein, for example, we will look for Tyr150, Lys199, Arg222, His242, and Arg257, because these form the binding site of CMPF in human serum albumin and we want to bind another ligand in the same position.
To the right of the residue list, there are small square and oval buttons. Clicking them changes how the selected residues are shown; for example, clicking “R” displays the residue as a ribbon. Here we only need to select each of the residues by clicking “S”. The selected residues are then highlighted, for example in yellow.
Now we have selected all the residues, and we will define a grid box that encloses all of them.
Defining Grid Box for docking
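As a numerical cross-check on a box adjusted by eye, a starting center and size can be estimated from the coordinates of the selected residues. This is a rough sketch, not part of the ADT workflow: the function names, the 5 Å padding, and the three sample atoms are all assumptions made for illustration.

```python
def atom_line(serial, name, res_name, chain, res_num, x, y, z):
    """Format a minimal ATOM record in fixed PDB columns (sample data only)."""
    return (f"ATOM  {serial:5d} {name:<4s} {res_name:3s} {chain}{res_num:4d}    "
            f"{x:8.3f}{y:8.3f}{z:8.3f}")


def box_from_residues(pdb_lines, residue_numbers, chain_id="A", padding=5.0):
    """Return (center, size) of a box enclosing the given residues plus padding."""
    coords = []
    for line in pdb_lines:
        if not line.startswith("ATOM") or line[21] != chain_id:
            continue
        # Residue number: columns 23-26; x, y, z: columns 31-38, 39-46, 47-54.
        if int(line[22:26]) in residue_numbers:
            coords.append((float(line[30:38]), float(line[38:46]), float(line[46:54])))
    if not coords:
        raise ValueError("none of the requested residues were found")
    mins = [min(c[i] for c in coords) for i in range(3)]
    maxs = [max(c[i] for c in coords) for i in range(3)]
    center = [round((lo + hi) / 2, 3) for lo, hi in zip(mins, maxs)]
    # Assumed padding (in Angstrom) on every side so the ligand has room to move.
    size = [round(hi - lo + 2 * padding, 3) for lo, hi in zip(mins, maxs)]
    return center, size


# Made-up coordinates; residue 300 lies outside the site and is ignored.
lines = [
    atom_line(1, "CA", "TYR", "A", 150, 0.0, 0.0, 0.0),
    atom_line(2, "CA", "LYS", "A", 199, 10.0, 4.0, 2.0),
    atom_line(3, "CA", "GLY", "A", 300, 50.0, 50.0, 50.0),
]
center, size = box_from_residues(lines, {150, 199, 222, 242, 257})
print("center:", center, "size:", size)
```

The values obtained this way are only a starting point; the box should still be inspected in the Grid Box window to confirm that it covers every selected residue.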
Now we will define the specific site in the protein where the ligand should bind. In blind docking, the binding site is unknown, so the whole protein is enclosed in the grid box. Here we already know the binding site, so the grid box only needs to cover that site.
1. Click “Grid” --> click “Grid Box”. A small window opens showing the x, y, and z coordinates.
2. Adjust the grid box by scrolling the three coordinates until it covers all the selected residues.
3. After adjusting the grid box, click “File” --> click “Output Grid Dimension File” --> save this file as grid.txt in the same folder.
4. Click “File” --> click “Close saving current”.
5. Now close AutoDockTools.
You will get the grid file as follows:
grid.txt
Protein
spacing 0.375
npts 66 56 54
center 4.402 -8.060 8.874
Preparation of Configuration file
AutoDock Vina requires an input configuration file that contains all the parameters used for the docking, including the names of the protein and ligand files. The configuration is as follows:
receptor= protein.pdbqt
ligand= SO.pdbqt
center_x= 4.402
center_y= -8.060
center_z= 8.874
size_x= 66
size_y= 56
size_z= 54
out= vina_outSO.pdbqt
log= logSO.txt
exhaustiveness= 8
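Copying the numbers from grid.txt into the configuration can be scripted. The sketch below is one possible way to do it, and the function names are hypothetical. It rests on one assumption that differs from the listing above: Vina reads size_x, size_y, and size_z in Ångström, whereas npts counts grid points at the stated spacing, so the script writes npts × spacing as the box size.

```python
def parse_grid_file(text):
    """Read spacing, npts and center from an ADT grid dimension file."""
    values = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) >= 2 and parts[0] == "spacing":
            values["spacing"] = float(parts[1])
        elif len(parts) >= 4 and parts[0] in ("npts", "center"):
            values[parts[0]] = [float(p) for p in parts[1:4]]
    return values


def vina_config(grid, receptor, ligand, out, log, exhaustiveness=8):
    """Build the text of a Vina configuration file from parsed grid values."""
    # Assumption: box edge in Angstrom = number of grid points * spacing.
    size = [n * grid["spacing"] for n in grid["npts"]]
    lines = [f"receptor = {receptor}", f"ligand = {ligand}", ""]
    lines += [f"center_{axis} = {c:.3f}" for axis, c in zip("xyz", grid["center"])]
    lines.append("")
    lines += [f"size_{axis} = {s:.2f}" for axis, s in zip("xyz", size)]
    lines += ["", f"out = {out}", f"log = {log}", "",
              f"exhaustiveness = {exhaustiveness}"]
    return "\n".join(lines) + "\n"


# Contents of grid.txt as shown in the article.
grid_text = """Protein
spacing 0.375
npts 66 56 54
center 4.402 -8.060 8.874
"""
grid = parse_grid_file(grid_text)
conf_text = vina_config(grid, "protein.pdbqt", "SO.pdbqt",
                        "vina_outSO.pdbqt", "logSO.txt")
print(conf_text)
```

If the spacing in the Grid Box window is set to 1.0 before saving grid.txt, the same script reproduces the npts values unchanged as sizes.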
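The center_x, center_y, and center_z coordinates and the size_x, size_y, and size_z values of the grid box are taken from the “grid.txt” file. Note that Vina reads the size values in Ångström, whereas npts in grid.txt counts grid points at the listed spacing, so the two match only when the spacing in the Grid Box window is set to 1.0. Save this file as “conf.txt”.
Perform Docking
Put all the following in the same folder (i.e., dock):
1. protein.pdbqt
2. SO.pdbqt
3. conf.txt
4. All the MGL_Tools, Autodock Tools, Python.exe (for Linux) and Autodock Vina setup files.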
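A quick check that the input files are in place can save a failed run. This is a small sketch using only the Python standard library; the helper name `missing_inputs` is made up, and the demonstration runs in a temporary folder that stands in for “dock”.

```python
import tempfile
from pathlib import Path

# Input files named in the list above.
REQUIRED_FILES = ("protein.pdbqt", "SO.pdbqt", "conf.txt")


def missing_inputs(folder, required=REQUIRED_FILES):
    """Return the required file names that are not present in the folder."""
    folder = Path(folder)
    return [name for name in required if not (folder / name).is_file()]


# Demonstration in a temporary folder where conf.txt was deliberately left out.
with tempfile.TemporaryDirectory() as dock:
    for name in ("protein.pdbqt", "SO.pdbqt"):
        (Path(dock) / name).touch()
    missing = missing_inputs(dock)

print("Missing files:", missing)
```

In practice the function would be called with the path of the real “dock” folder, and docking would only be started once the returned list is empty.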
Please make sure that the files are named correctly and that all the setup files are in the same folder; otherwise, you may get errors while running the docking.
Linux
1. Open the terminal and enter the “dock” folder.
2. Type the following command: ./vina --config conf.txt --log logSO.txt
3. Press “enter”.
Windows
1. Open the command prompt and enter the folder where all the docking files are placed.
2. Type the following command: vina --config conf.txt --log logSO.txt
Now grab a cup of tea/coffee, because the docking may take a few minutes to complete. Meanwhile, enjoy your time! ;)
Vina Output
After successful docking, you will get a log file, which in this case is named “logSO.txt”.
The log file will be as follows:
[Figure: Vina output log file, logSO.txt]
This file lists all the poses generated by AutoDock Vina along with their binding affinities and RMSD values. The first pose is considered the best because it has the strongest binding affinity (the most negative value); it is also the reference pose, so its RMSD values are zero. You can, however, choose whichever pose is appropriate and visualize it in the PyMOL viewer.
Please share if you like this article! If you run into any difficulty, feel free to mail me at muniba@bioinformaticsreview.com.
References
1. Trott, O., & Olson, A. J. (2010). AutoDock Vina: improving the speed and accuracy of docking with a new scoring function, efficient optimization, and multithreading. Journal of Computational Chemistry, 31(2), 455-461.