BIOINFORMATICS REVIEW- MAY 2016


May 2016 VOL 2 ISSUE 5

“The science of today is the technology of tomorrow.” - Edward Teller

Cytoscape.js: A graph library for network visualization and analysis

Geneious: A platform for Comprehensive Analysis and Organization of Genes




Contents

May 2016

Topics

Editorial

Bioinformatics News
Geneious: A Platform for Comprehensive Analysis and Organization of Genes

Metabolomics
Chemometrics in Metabolomics (Part-I): Overview of Biomarker Discovery

Software
Cytoscape.js: A graph library for network visualization and analysis


EDITOR Dr. PRASHANT PANT
FOUNDER TARIQ ABDULLAH

EDITORIAL
EXECUTIVE EDITOR FOZAIL AHMAD
FOUNDING EDITOR MUNIBA FAIZA
SECTION EDITORS ALTAF ABDUL KALAM MANISH KUMAR MISHRA SANJAY KUMAR PRAKASH JHA NABAJIT DAS

REPRINTS AND PERMISSIONS
You must have permission before reproducing any material from Bioinformatics Review. Send e-mail requests to info@bioinformaticsreview.com. Please include contact details in your message.

BACK ISSUE
Bioinformatics Review back issues can be downloaded in digital format from bioinformaticsreview.com at $5 per issue. Back issues in print format cost $2 for India delivery and $11 for international delivery, subject to availability. Pre-payment is required.

CONTACT
PHONE +91. 991 1942-428 / 852 7572-667
MAIL Editorial: 101 FF Main Road Zakir Nagar, Okhla New Delhi IN 110025

STAFF ADDRESS
To contact any Bioinformatics Review staff member, simply format the address as firstname@bioinformaticsreview.com

PUBLICATION INFORMATION
Volume 1, Number 1, Bioinformatics Review™ is published quarterly for one year (4 issues) by Social and Educational Welfare Association (SEWA) trust (Registered under Trust Act 1882). Copyright 2015 Sewa Trust. All rights reserved. Bioinformatics Review is a trademark of Idea Quotient Labs and used under license by SEWA trust. Published in India


EDITORIAL

Bioinformatics Review (BiR): Bridging Between The Two Worlds

Informatics and Biology are two sciences which are as different from each other as possible. One runs on the core concept of variation and the other on strict reasoning. But still, these two have combined in a most natural way under the realm of “Bioinformatics”. For a biologist today it’s difficult to imagine a world without biological databases, or without a discipline to decipher the huge enigma they present. Bioinformatics Review (BiR) journal is a platform to discover the latest happenings in this melting pot of two varied fields.

Dr. Roopam Sharma

Honorary Editor

The era of “omics” kick-started with the completion of the Human Genome Project (HGP) in 2003. Since then, technological advances, especially next-generation sequencing (NGS), have been generating mind-boggling volumes of data for the knowledge banks. Recent developments like single-cell transcriptomics or metagenomics of the most unusual habitats show how technological progress is directly resulting in breakthroughs in the biological sciences.

Pathology is among the areas of biology that have benefited from these advances. In fact, deciphering the molecular and genetic basis of diseases in humans was the guiding force behind the Human Genome Project. Bioinformatics has led to an impressive increase in the recognition of possible pathogenic factors in varied systems, so much so that new techniques are being devised to speed up the testing of these factors in the wet lab. From a computational standpoint, the smaller but ever-changing genomes and transcriptomes of pathogens make them suitable candidates for testing many hypotheses in bioinformatics studies.

Effector bioinformatics involves building custom pipelines for distinct species based on the characteristics of the effectors and the size of the genome involved. These pipelines can be based on homology, on feature extraction, or on both; e.g., the discovery of RXLR motifs in oomycete effectors allowed many more effectors to be identified. This collaboration of two sciences for plant pathology has led to the development of many general-use platforms like Broad-Fungal Genome Initiative, EuPathDB, PhytoPath and so on, but there is a great need to develop specialized resources like PHI-base for specific areas like effector biology. The use of machine-learning techniques like the artificial neural network approach (which is itself inspired by biological neural networks) shows how the two branches are so distinct yet so intertwined. All in all, it’s a brave new world where artificial communication is not only stimulating but also helping us understand the communication (between host and pathogen) going on within the realm of life.

In this issue, BiR focusses on reviews related to some of the very basic techniques which have been used in computational biology and their applications in various biological studies. We look forward to continued support from our readers and contributors. For suggestions and feedback, do write to us at info@bioinformaticsreview.com

Letters and responses: info@bioinformaticsreview.com


BIOINFORMATICS NEWS

Geneious: A platform for Comprehensive Analysis and Organization of Genes


The two main functions of bioinformatics are the organization and analysis of biological data using computational resources. Geneious is designed to be an easy-to-use and flexible desktop software application framework for the organization and analysis of biological data, with a focus on molecular sequences and related data types. It integrates numerous industry-standard discovery analysis tools, with interactive visualizations to generate publication-ready images. One key contribution to researchers in the life sciences is the Geneious public Application Programming Interface (API) that affords the ability to leverage the existing framework of the Geneious Basic software platform for virtually unlimited extension and customization. The result is an increase in the speed and quality of development of computational tools for the life sciences, due to the functionality and graphical user interface available to the developer through the public API. Geneious Basic represents an ideal platform for the bioinformatics community to leverage existing components and to integrate their own specific requirements for the discovery, analysis, and visualization of biological data.

The software is also available from the Bio-Linux package repository at http://nebc.nerc.ac.uk/news/geneiousonbl.

Availability and implementation: Binaries and public API freely available for download at http://www.geneious.com/basic, implemented in Java and supported on Linux, Apple OSX and MS Windows.


METABOLOMICS

Chemometrics in Metabolomics (Part-I): Overview of Biomarker Discovery


Diagnosis is the process of identifying the exact cause of the adverse symptoms experienced by a subject. To identify the cause, the diagnostic process often examines the constituents of a biofluid and checks for the presence of a marker that is unique to the disease. A biomarker is defined as a cellular, biochemical or molecular alteration that is measurable in biological media such as human tissues, cells, or fluids. Surface proteins in a biopsy sample for cancer, bilirubin in urine for jaundice, and blood glucose for diabetes are common examples of biomarkers. An ideal biomarker should a) be sensitive and specific to a particular disease condition, b) be present in a fluid that can be sampled non-invasively or minimally invasively, c) be detectable at a very early stage of disease onset, d) allow rapid analysis and e) be cost-effective. Therefore, although diagnostic markers for many diseases are already available, the hunt for markers that best match the above criteria is still on. In this article, I will confine my discussion to the identification of biochemical markers using metabolomics.

Metabolomics is emerging as the latest revolution in the functional genomics arena. The complete set of metabolites in a biological system, together with their abundances/concentrations, is known as the metabolome, a term coined in 1998. The technological approach to capture the closest form of this metabolome information is known as metabolomics. Currently, gas chromatography-mass spectrometry (GC-MS), liquid chromatography-mass spectrometry (LC-MS) and nuclear magnetic resonance (NMR) are the major technology platforms used for metabolomics analysis. However, no single platform can identify all the metabolites present in a biological matrix such as blood, serum, plasma or urine. It is estimated that there are close to 3000 metabolites present in the human body; however, depending on the sensitivity, resolution and type of instrument, a single platform can identify up to 1000 or a little more.

In fact, the techniques used for capturing metabolome information are not new. All these analytical platforms, be they mass spectrometric or magnetic resonance based, have been known since the 1960s or 1970s. However, with continuous development in their sensitivity and resolution, the number of molecules detected by these instruments has improved manyfold. This was aided by concomitant advances in the field of chemometrics. Chemometrics is relevant to every step involved in metabolomics, be it data acquisition, raw data pre-processing, pattern analysis or identification of important feature(s).

The real challenge is to identify the biomarker(s) of a particular disease from the hundreds of metabolites identified by metabolomics. Here chemometrics plays an important role. Chemometrics can be defined as the analysis of chemical data using mathematical, statistical and informatics tools and techniques. For diagnostic marker discovery, a case-control design is used, i.e., a comparative analysis between well-characterized patients and healthy controls. In some cases, a patient cohort is followed from the diseased condition to the clinically treated condition following therapeutic intervention, for a comparative analysis between the diseased and the treated condition. Figure 1 demonstrates the steps involved in metabolomics-based biomarker discovery.
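The case-control comparison described above can be illustrated with a very small univariate screen. The following Python sketch is only one possible way to do it, not the method used in any particular study: it assumes a data matrix already reduced to per-metabolite intensity lists, and it ranks metabolites by Welch's t statistic alongside a log2 fold change. The metabolite values and the helper names `welch_t` and `screen_metabolites` are hypothetical and chosen purely for illustration.

```python
import math
from statistics import mean, variance


def welch_t(cases, controls):
    """Welch's t statistic for two independent samples with unequal variances."""
    standard_error = math.sqrt(
        variance(cases) / len(cases) + variance(controls) / len(controls)
    )
    return (mean(cases) - mean(controls)) / standard_error


def screen_metabolites(case_matrix, control_matrix):
    """Return (name, log2 fold change, t statistic) tuples, strongest |t| first."""
    results = []
    for name in case_matrix:
        cases, controls = case_matrix[name], control_matrix[name]
        log2_fold_change = math.log2(mean(cases) / mean(controls))
        results.append((name, log2_fold_change, welch_t(cases, controls)))
    return sorted(results, key=lambda row: abs(row[2]), reverse=True)


# Hypothetical intensities (arbitrary units) for four patients and four controls.
patients = {
    "glucose": [9.1, 8.7, 9.5, 8.9],
    "lactate": [1.9, 2.1, 2.0, 2.2],
    "alanine": [0.42, 0.38, 0.40, 0.41],
}
controls = {
    "glucose": [5.0, 5.2, 4.8, 5.1],
    "lactate": [2.0, 1.8, 2.1, 2.0],
    "alanine": [0.40, 0.41, 0.39, 0.42],
}

ranked = screen_metabolites(patients, controls)
for name, log2_fc, t_stat in ranked:
    print(f"{name}: log2 fold change = {log2_fc:.2f}, t = {t_stat:.2f}")
```

In this toy data set only glucose separates the two groups clearly, so it ranks first. A real analysis would also need multiple-testing correction and validation in an independent set of subjects.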

Fig. 1: Steps involved in metabolomics-based biomarker discovery.

Following data acquisition on an appropriate platform, each sample generates a data file commonly called a raw data file. The process of mining meaningful information from the raw data file for further analysis is called data pre-processing. On mass spectrometric platforms (GC-MS, LC-MS), data pre-processing includes the following steps: baseline correction, noise filtering, peak picking, deconvolution, spectral matching, library annotation, alignment, and data integration. Currently, most instrument manufacturers develop their own software for data pre-processing, although external software is also available for pre-processing MS raw data.

Following data alignment and integration in raw data analysis, the analyst has the metadata, or data matrix, to analyze further for pattern analysis and identification of important features. This involves multiple statistical steps to identify a robust biomarker or a set of biomarkers. Considering the complexity of the data matrix and the variation within or between the groups, metabolomics researchers use both uni- and multivariate statistical tools to identify biomarkers. In practice, they develop a statistical model using a set of samples known as the discovery set to identify important features that differ between or among groups, viz., disease/non-disease, before/after disease, mild/moderate/severe disease. The validity of the model is then checked in a different set of subjects, to see whether these tentative biomarkers can place the subjects in the appropriate group.

In the upcoming articles on Chemometrics in Metabolomics, I will be discussing the raw data analysis part using bioinformatics tools and techniques, and also the statistical investigation for biomarker discovery.



SOFTWARE

Cytoscape.js: A graph library for network visualization and analysis


Network visualization has become essential for studying molecular interactions, whether between proteins or between genes. Network information is used in many contexts, from cellular functions to the identification of gene functions. Thus, an ever-increasing volume of research uses network visualization to gain deep insight into the molecular interactions that influence our body functions. Now that the days of Flash have gone by, alternative technologies are on the rise. Interactive presentation used to be the domain of Flash; that is no longer the case.

It is easier for researchers to study molecular interactions in silico, so it is equally important to be able to visualize the network (the linkage among the molecules) on modern portable devices. Many web platforms have been developed for this purpose using standard technologies such as HTML, CSS and JavaScript (JS). Cytoscape.js leads the game. Cytoscape.js is a JavaScript library with an API (Application Program Interface, i.e., a specification of how software components should interact) which allows the user to integrate graphs into interaction models and web user interfaces.

Cytoscape.js is a stand-alone tool, and its architecture is broadly divided into two parts:
1. Core: The core is the main entry point for the developer. It represents the graph, gives access to the graph elements, and performs various operations on the graph.
2. Collection: A collection is a set of graph elements. It allows the elements to be filtered, traversed and operated on. A collection is sometimes taken as input by core functions.
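As a rough illustration of what "graph elements" means here, the sketch below builds a small element list in Python. It assumes the JSON element layout accepted by Cytoscape.js (a `data` object carrying an `id`, plus `source` and `target` for edges). The gene names and the `to_elements` helper are hypothetical, and the final list comprehension only mimics the kind of filtering a collection offers; it is not the Cytoscape.js API itself.

```python
import json


def to_elements(interactions):
    """Convert (source, target) pairs into a flat list of node and edge elements."""
    node_ids = sorted({gene for pair in interactions for gene in pair})
    nodes = [{"group": "nodes", "data": {"id": gene}} for gene in node_ids]
    edges = [
        {"group": "edges", "data": {"id": f"{a}-{b}", "source": a, "target": b}}
        for a, b in interactions
    ]
    return nodes + edges


# Hypothetical gene-gene interactions.
interactions = [("geneA", "geneB"), ("geneB", "geneC")]
elements = to_elements(interactions)

# Collection-style filtering: keep only the edges that touch geneB.
touching_b = [
    element
    for element in elements
    if element["group"] == "edges"
    and "geneB" in (element["data"]["source"], element["data"]["target"])
]

# The JSON text could then be handed to a web page that renders the graph.
print(json.dumps(elements, indent=2))
```

Keeping nodes and edges in one flat list, distinguished by a `group` field, makes it easy to add or delete individual elements without rebuilding the rest of the graph.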



In this demo, you can drag individual nodes, zoom in and perform other routine activities.

Fig. 1: A gene-gene interaction network visualized in Cytoscape.js.

Cytoscape.js supports different types of graphs, such as directed graphs, undirected graphs, traditional graphs, multigraphs and hypergraphs. A graph can also be modified by adding or deleting graph elements such as edges. The library includes graph theory algorithms that enable the user to search for the shortest path in the interaction graph. The graph consists of nodes, each representing a unit such as a protein or a gene. The set of nodes can also be modified by removing nodes or by selecting only the nodes of interest. The graph can be saved in either PNG or JPG format.
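To make the idea of a shortest-path search concrete, here is a minimal sketch in plain Python. It is not the Cytoscape.js implementation: it assumes an undirected, unweighted interaction graph and uses a breadth-first search, which finds a path with the fewest edges. The gene names and the `shortest_path` function are hypothetical.

```python
from collections import deque


def shortest_path(edges, source, target):
    """Breadth-first search on an undirected, unweighted graph.

    Returns the list of nodes from source to target, or None if no path exists.
    """
    neighbours = {}
    for a, b in edges:
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)

    previous = {source: None}  # also serves as the set of visited nodes
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            # Walk back through the predecessors to rebuild the path.
            path = []
            while node is not None:
                path.append(node)
                node = previous[node]
            return path[::-1]
        for neighbour in sorted(neighbours.get(node, ())):
            if neighbour not in previous:
                previous[neighbour] = node
                queue.append(neighbour)
    return None


# Hypothetical gene-gene interactions.
interactions = [
    ("geneA", "geneB"),
    ("geneB", "geneC"),
    ("geneC", "geneD"),
    ("geneA", "geneE"),
    ("geneE", "geneD"),
]
print(shortest_path(interactions, "geneA", "geneD"))
```

For weighted interactions (for example, edges scored by confidence), an algorithm such as Dijkstra's would be the usual replacement for the breadth-first search.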


Note: An exhaustive list of references for this article is available from the author on request; for more details, write to muniba@bioinformaticsreview.com.

Cytoscape.js is open source software and is available at http://js.cytoscape.org. It is an improvement over the Adobe Flash-based Cytoscape Web.


