Nov 2016 VOL 2 ISSUE 11
“Medicine is a science of uncertainty and an art of probability.” - William Osler
New features to predict miRNA target sites in mRNAs
Machine learning
Contents
November 2016
Topics
Editorial
Machine Learning: New features to predict miRNA target sites in mRNAs
EDITOR
Dr. PRASHANT PANT

FOUNDER
TARIQ ABDULLAH

EDITORIAL
EXECUTIVE EDITOR: TARIQ ABDULLAH
FOUNDING EDITOR: MUNIBA FAIZA
SECTION EDITORS: FOZAIL AHMAD, ALTAF ABDUL KALAM, MANISH KUMAR MISHRA, SANJAY KUMAR, PRAKASH JHA, NABAJIT DAS

REPRINTS AND PERMISSIONS
You must have permission before reproducing any material from Bioinformatics Review. Send e-mail requests to info@bioinformaticsreview.com. Please include your contact details in your message.

BACK ISSUE
Bioinformatics Review back issues can be downloaded in digital format from bioinformaticsreview.com at $5 per issue. Back issues in print format cost $2 for delivery within India and $11 for international delivery, subject to availability. Pre-payment is required.

CONTACT
PHONE: +91. 991 1942-428 / 852 7572-667
MAIL: Editorial: 101 FF Main Road Zakir Nagar, Okhla New Delhi IN 110025

STAFF ADDRESS
To contact any Bioinformatics Review staff member, simply format the address as firstname@bioinformaticsreview.com
PUBLICATION INFORMATION
Volume 1, Number 1, Bioinformatics Review™ is published quarterly for one year (4 issues) by Social and Educational Welfare Association (SEWA) Trust (Registered under Trust Act 1882). Copyright 2015 Sewa Trust. All rights reserved. Bioinformatics Review is a trademark of Idea Quotient Labs and used under license by SEWA Trust. Published in India.
EDITORIAL: Welcoming BiR in its 2nd year
EDITORIAL
Bioinformatics, being one of the best fields in terms of future prospects, lacks one thing: a news source. Although many journals publish a large volume of quality research on topics such as genome analysis, algorithms and sequence analysis, this work rarely gets any notice in the popular press.
Dr. Prashant Pant
Editor
One reason behind this rather disturbing trend is that very few people can successfully read a research paper and turn it into a news story. In addition, the bioinformatics community has not yet been introduced to research reporting. These factors are common to every relatively new (and rising) discipline such as bioinformatics. Although there are a number of science reporting websites and portals, very few accept entries from their audience, which can be expected to have expertise in one field or another. Bioinformatics Review has been conceived to address all these concerns. We will provide insight into bioinformatics, both as an industry and as a research discipline. We will post new developments and the latest research in bioinformatics. We will also accept entries from our audience and, where possible, reward them. To create an ecosystem of bioinformatics research reporting, we will engage all kinds of people involved in bioinformatics: students, professors, instructors and industries. We will also provide a free job listing service for anyone who can benefit from it.
Letters and responses: info@bioinformaticsreview.com
MACHINE LEARNING
New features to predict miRNA target sites in mRNAs
In order to study gene regulation, it is necessary to identify the target sites of miRNAs in mRNAs. miRNAs have been a major focus of research because their binding to an mRNA degrades the target mRNA and also prevents its translation [1-4]. Once miRNA target sites are identified, the target mRNAs and the potential functional roles of the miRNAs may be assigned.
There are seven features commonly used to predict miRNA target sites (described in the previous article, "Common features used to develop miRNA target prediction tools"), and these are considered the conventional features for target site prediction. Many target prediction tools are based on these conventional features. For example, miRanda uses seed match, free energy, and conservation [5], while TargetScan utilizes seed match, pairing of mRNAs with the 3' end of miRNAs, local AU content, etc. [6,7].

Recently, Ding et al. (2016) introduced several new features and applied four different machine learning approaches to CLASH data [8]. CLASH (Crosslinking Ligation And Sequencing of Hybrids) is an experimental procedure that implements a high-throughput approach to identify the sites of RNA-RNA interaction [9]. The CLASH method is optimized to study miRNA targets using Argonaute proteins [9]. Compared with other high-throughput experimental methods such as HITS-CLIP [10,11] and PAR-CLIP [12], CLASH experiments provide a better understanding of miRNA target sites and help to develop better computational methods for miRNA target prediction.

Ding et al. (2016) developed a random forest approach known as TarPmiR (http://hulab.ucf.edu/research/projects/miRNA/TarPmiR/), in which they incorporated six conventional features and seven new features for miRNA target site prediction [8]. According to Ding et al. (2016), the newly incorporated features are:
m/e motif
It is the pairing probability of the miRNA. If the miRNA residues at positions in the seed region match the residues at the corresponding positions in the target site, the position is marked as 'm' (match); if they do not match and tend to form mismatches or bulges, it is marked as 'e' (else). The probability score of m/e at each position of the miRNA is calculated as:
[Equation not recoverable from the source.]
where x is the length of the miRNA, which is smaller than 24.
Total number of paired positions
The total number of paired positions for each miRNA-mRNA binding site is calculated.
Target mRNA region length
It is the length, in residues, of the region of the target mRNA that is bound by the miRNA.
Largest consecutive pairs length
It is calculated as the length of the largest run of consecutive pairs.
Length of the largest consecutive pairs allowing 2 mismatches
This feature is the length of the largest run of consecutive pairs when 2 mismatches are allowed within the run.
Largest consecutive pairs position
It is calculated as the position of the largest run of consecutive pairs relative to the 5' end of the miRNA.
Position of the largest consecutive pairs allowing 2 mismatches
This feature is similar to the largest consecutive pairs position and is calculated as the position, relative to the 5' end of the miRNA, of the largest run of consecutive pairs allowing 2 mismatches.
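The pairing-based features described so far can be illustrated with a small sketch. This is not the TarPmiR implementation: the function names (`me_motif`, `total_paired`, `longest_run`) are hypothetical, the input is assumed to be a per-position paired/unpaired flag for the miRNA read from its 5' end, the "position" is taken to be the 1-based start of the run, and a run "allowing 2 mismatches" is assumed to start and end on a paired position and to count the mismatched positions in its length.

```python
from typing import List, Tuple


def me_motif(paired: List[bool]) -> str:
    """Encode each miRNA position (5' to 3') as 'm' if paired, else 'e'."""
    return "".join("m" if p else "e" for p in paired)


def total_paired(paired: List[bool]) -> int:
    """Total number of paired positions in the binding site."""
    return sum(1 for p in paired if p)


def longest_run(paired: List[bool], max_mismatches: int = 0) -> Tuple[int, int]:
    """Return (length, start) of the longest run of pairs.

    The run must begin and end on a paired position and may contain at
    most `max_mismatches` unpaired positions (an assumption of this sketch).
    `start` is 1-based from the 5' end; (0, 0) means nothing is paired.
    Ties are resolved in favour of the run closest to the 5' end.
    """
    best_len, best_start = 0, 0
    n = len(paired)
    for i in range(n):
        if not paired[i]:
            continue  # a run has to start on a paired position
        mismatches = 0
        for j in range(i, n):
            if not paired[j]:
                mismatches += 1
                if mismatches > max_mismatches:
                    break
            elif j - i + 1 > best_len:
                best_len, best_start = j - i + 1, i + 1
    return best_len, best_start


# Hypothetical pairing pattern for a 12-nt stretch of a miRNA.
example = [c == "m" for c in "mmmemmmmeemm"]

print(me_motif(example))        # m/e string
print(total_paired(example))    # total number of paired positions
print(longest_run(example))     # largest consecutive pairs: (length, position)
print(longest_run(example, 2))  # same, allowing 2 mismatches
```

For the example pattern, the strict run is the four pairs starting at position 5, while allowing 2 mismatches extends the best run to eight positions starting at position 1. The quadratic scan is kept for readability, which is acceptable here because miRNAs are only about 22 nucleotides long.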
Exon Preference
It considers the preference of miRNA-mRNA binding in terms of exons. If the miRNA binds to a specific exon, this feature is assigned a score; otherwise, it remains zero.
Difference between the number of paired positions in the seed region and in the 3' end of miRNA
This feature is the difference between the number of paired residues in the seed region and the number within the 3' end of the miRNA.
TarPmiR was developed using all the conventional and the new features [8]. The tool has been reported to be more efficient than other available tools and to produce fewer false positives and false negatives [8]. It is also available for Linux (http://hulab.ucf.edu/research/projects/miRNA/TarPmiR/).
References
1. Axtell,M.J. et al. (2011) Vive la difference: biogenesis and evolution of microRNAs in plants and animals. Genome Biol., 12, 221.
2. Bartel,D.P. (2009) MicroRNAs: target recognition and regulatory functions. Cell, 136, 215–233.
3. Muljo,S.A. et al. (2010) MicroRNA targeting in mammalian genomes: genes and mechanisms. Wiley Interdisc. Rev. Syst. Biol. Med., 2, 148–161.
4. Wang,Y. et al. (2011) Transcriptional regulation of co-expressed microRNA target genes. Genomics, 98, 445–452.
5. Enright,A.J. et al. (2004) MicroRNA targets in Drosophila. Genome Biol., 5, R1.
6. Friedman,R.C. et al. (2009) Most mammalian mRNAs are conserved targets of microRNAs. Genome Res., 19, 92–105.
7. Grimson,A. et al. (2007) MicroRNA targeting specificity in mammals: determinants beyond seed pairing. Mol. Cell, 27, 91–105.
8. Ding,J. et al. (2016) TarPmiR: a new approach for microRNA target site prediction. Bioinformatics, 32, 2768–2775. doi: 10.1093/bioinformatics/btw318
9. Helwak,A. et al. (2013) Mapping the human miRNA interactome by CLASH reveals frequent noncanonical binding. Cell, 153, 654–665.
10. Chi,S.W. et al. (2009) Argonaute HITS-CLIP decodes microRNA–mRNA interaction maps. Nature, 460, 479–486.
11. Licatalosi,D.D. et al. (2008) HITS-CLIP yields genome-wide insights into brain alternative RNA processing. Nature, 456, 464–469.
12. Hafner,M. et al. (2010) Transcriptome-wide identification of RNA-binding protein and microRNA target sites by PAR-CLIP. Cell, 141, 129–141.