Volume-1, Issue-1
SREEPATHY JOURNAL OF COMPUTER SCIENCE & ENGINEERING
Published by
Department of Computer Science and Engineering Sreepathy Institute of Management and Technology, Vavanoor Palakkad - 679 533 June 2014
Sreepathy Journal of Computer Science and Engg.
Contents
Secret sharing of Color Images by N out N POB System, Ganesh P, G Santhosh Kumar, A Sreekumar. In this paper, a new secret sharing scheme for color images is proposed. Our scheme uses a method to construct an N out of N secret sharing scheme for color images based on a new number system called the Permutation Ordered Binary (POB) Number System. This scheme is an efficient way to hide secret information in different shares. Furthermore, the size of the shares is less than or equal to the size of the secret. This method reconstructs the secret image with original quality.
Locating Emergency Responders in Disaster Area Using Wireless Sensor Network, Aparna M. A worldwide increase in the number of natural hazards is causing heavy loss of human life and infrastructure. An effective disaster management system is required to reduce the impact of natural hazards on common life. The first hand responders play a major role in effective and efficient disaster management. Locating and tracking the first hand responders is necessary to organize and manage real-time delivery of medical and food supplies for disaster hit people. This requires effective communication and information processing between various groups of emergency responders in harsh and remote environments. Locating, tracking, and communicating with emergency responders can be achieved by devising a body sensor system for the emergency responders. This work discusses an algorithm and its implementation for localization of emergency responders in a disaster hit area. The indoor and outdoor experimentation results are also presented.
Brute Force Attack Defensing With Online Password Guessing Resistant Protocol, Jyothis K P, Padmada M P, Krishnan N. Brute force and dictionary attacks on password-only remote login services are now widespread and ever increasing. Enabling convenient login for legitimate users while preventing such attacks is a difficult problem. Automated Turing Tests (ATTs) continue to be an effective, easy-to-deploy approach to identify automated malicious login attempts with reasonable cost of inconvenience to users. In this paper, we discuss the inadequacy of existing and proposed login protocols designed to address large-scale online dictionary attacks (e.g., from a botnet of hundreds of thousands of nodes). We propose a new Password Guessing Resistant Protocol (PGRP), derived upon revisiting prior proposals designed to restrict such attacks. While PGRP limits the total number of login attempts from unknown remote hosts to as low as a single attempt per username, legitimate users in most cases (e.g., when attempts are made from known, frequently-used machines) can make several failed login attempts before being challenged with an ATT. We analyze the performance of PGRP with two real-world data sets and find it more promising than existing proposals. PGRP accommodates both graphical user interfaces (e.g., browser-based logins) and character-based interfaces (e.g., SSH logins), while the previous protocols deal exclusively with the former, requiring the use of browser cookies. PGRP uses either cookies or IP addresses, or both, for tracking legitimate users.
Intelligent Image Interpreter, Jayasree N Vettath. The intrinsic information present in an image is very hard for a computer to interpret. Many approaches focus on finding the salient object in an image. Here we propose a novel architectural approach to find the relation between salient objects using local and global analysis. The local analysis focuses on salient object detection with efficient relation mining in the context of the image being processed. For an effective global analysis we created an ontology tree by considering a wide set of natural images. From these natural images we create an affinity based ontology graph; with the help of this local and global contextual graph we construct an annotated parse tree. The tree formed above is helpful in large image search. So our proposal will give new heights to content based image retrieval and image related interpretation problems.
Self Interpretive Algorithm Generator, Jayanthan K S. Thinking is a complex procedure which is necessary to deal with a complex world. Machines that help us handle this complexity can be regarded as intelligent tools, tools that support our thinking capabilities. The need for such intelligent tools will grow as new forms of complexity evolve in our increasingly globally networked information society. Algorithms are considered the key component of any problem solving activity. In order for machines to be capable of intelligently predigesting information, they should perform in a way similar to the way humans think. Humans have developed sophisticated methods for dealing with the world's complexity, and it is worthwhile to adapt some of them. Sometimes the principles of human thinking are combined with the functional principles of biological cell systems. Here the consideration of human thinking is extended to the interpretive capacity of the machine. This interpretive capacity measures the capacity of the machine to generate the algorithm. Many definitions of algorithm are based on the single idea that input is converted to output in a finite number of steps. An algorithm is a step by step procedure to complete a task. It is any set of detailed instructions which results in a predictable end-state from a known beginning. This project is an attempt to generate algorithms by machine without human intervention. Evolutionary computation is a major tool which is competitive with humans in many different areas of problem solving. So here a Genetically Evolved Cellular Automata is used for giving the system the self interpretive capacity to generate algorithms.
Natural Language Generation from Ontologies, Manu Madhavan. Natural Language Generation (NLG) is the task of generating natural language text suitable for human consumption from a machine representation of facts, which can be pre-structured in some linguistically amenable fashion or completely unstructured. An ontology is a formal explicit description of concepts in a domain of discourse. An ontology is considered a formal knowledge repository which can be used as a resource for NLG tasks. A domain ontology provides the input for the content determination and micro-planning stages of the NLG task. A linguistic ontology can be used for lexical realization. The logically structured manner of knowledge organization within an ontology makes it possible to perform reasoning tasks like Consistency Checking, Concept Satisfiability, Concept Subsumption and Instance Checking. These types of logical inferencing actions will be applied to derive descriptive texts as answers to user queries from ontologies. Thus a simple natural language based Question Answering system can be implemented, guided by robust NLG techniques that act upon ontologies. Some of the tools for constructing ontologies (Protege, Natural OWL, etc.) and their combination with the NLG process will also be discussed.
Speech synthesis using Artificial Neural Network, Anjali Krishna C R, Arya M B, Neeraja P N, Sreevidya K M, Jayanthan K S. Text to speech conversion is a large area which has shown very fast development in the last few decades. Our goal is to study and implement the specific tasks involved in text to speech conversion, namely text normalization, grapheme to phoneme conversion, phoneme concatenation and speech engine processing. Usage of a neural network for grapheme to phoneme conversion provides more accuracy than the normal corpus or dictionary based approach. Usually in text to speech, grapheme to phoneme conversion is performed using a dictionary based method. The main limitation of this technique is that it cannot give the phoneme of a word which is not in the dictionary, and to have more efficiency in phoneme generation we require a large collection of word-pronunciation pairs. Using a large dictionary also requires large storage space. This limitation can be overcome using a neural network. The main advantage of this approach is that it can adapt to unknown situations, i.e. it can predict the phoneme of a grapheme which is not defined so far. The neural network system requires less memory than a dictionary based system and performed well in tests. The system will be very useful for illiterate and vision impaired people to hear and understand content, where they face many problems in their day to day life due to the differences in their script system.
Cross Domain Sentiment Classification, S. Abilasha, C. H. Chithira, Fazlu Rahman, P. K. Megha, P. Safva Mol, Manu Madhavan. Sentiment analysis refers to the use of natural language processing and machine learning techniques to identify and extract subjective information in source material such as product reviews. Due to revolutionary developments in web technology and social media, reviews can span so many different domains that it is difficult to gather annotated training data for all of them. Cross domain sentiment analysis involves adaptation of information learned from some (labeled) source domain to an unlabelled target domain. The method proposed in this project uses an automatically created sentiment sensitive thesaurus for domain adaptation. Based on the survey conducted on related literature, we identified that L1 regularized logistic regression is a good binary classifier for our area of interest. In addition to the previous work, we propose the use of SentiWordNet and adjective-adverb combinations for effective feature learning.
Sreepathy Journal of Computer Science and Engg.,Vol 1, Issue 1, June 2014
Secret sharing of Color Images by N out N POB System
Ganesh P¹, G Santhosh Kumar², A Sreekumar³
¹Dept. of Computer Science and Engg, SIMAT, Vavanoor, Palakkad, ganesh.p@simat.ac.in
²,³Department of Computer Application, Cochin University of Science & Technology, Kochi
Abstract—In this paper, a new secret sharing scheme for color images is proposed. Our scheme uses a method to construct an N out of N secret sharing scheme for color images based on a new number system called the Permutation Ordered Binary (POB) Number System. This scheme is an efficient way to hide secret information in different shares. Furthermore, the size of the shares is less than or equal to the size of the secret. This method reconstructs the secret image with original quality.
Keywords—Visual Secret Sharing, Visual cryptography, POB number system.
I. INTRODUCTION
A secret sharing scheme is a method of distributing a secret among a group of participants, each of whom is allocated a share of the secret. The secret can only be reconstructed when the shares are combined together; individual shares are of no use on their own. Secret sharing was introduced by Shamir [1] in 1979. Shamir's solution is based on polynomial interpolation in a finite field. Several secret sharing schemes were proposed, but most of them need a lot of computation to decode the secret. In 1994, Naor and Shamir [2] introduced the visual cryptography scheme (VCS), which allows visual information (pictures, text, etc.) to be encrypted in such a way that the decryption can be performed by the human visual system, without the aid of computers. The drawback of the scheme is that it works only with black and white images. In 1997, Verheul and Van Tilborg [7] used the concept of arcs to construct a colored visual cryptography scheme. The major disadvantage is that the number of colors and the number of subpixels determine the resolution of the recovered secret image. F. Liu et al. [12] (2008) present different work done to improve the quality of the recovered images. The major disadvantages of these schemes are that the size of the shares increases with the number of participants and that the quality of the recovered secret is lower than that of the original secret.
The rest of the paper is organized as follows. Section 2 briefly reviews the permutation ordered binary number system and the algorithms related to the N out of N threshold scheme. Our approach for color images is given in Section 3. Results and discussions are given in Section 4. Section 5 concludes the work.
II. PERMUTATION ORDERED BINARY (POB) NUMBER SYSTEM
The POB number system [14] is defined with two non-negative integral parameters n and r, where n ≥ r, and is denoted as POB(n,r). The system represents all decimal integers in the range $0, 1, \ldots, \binom{n}{r} - 1$ as a binary string $B = b_{n-1} b_{n-2} \ldots b_0$ of length n having exactly r 1s. The binary string B is known as a POB number. There exists a POB value for each POB number B, denoted as V(B), which is calculated using the following formula:
$$V(B) = \sum_{j=0}^{n-1} b_j \binom{j}{p_j}, \qquad \text{where } p_j = \sum_{i=0}^{j} b_i .$$
Here $\binom{j}{p_j}$ is the binomial coefficient ${}^{j}C_{p_j}$, which is zero when $p_j > j$.
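The POB value formula, and the XOR split of an even-parity 9-bit string into two POB(9,4) numbers that the N out of N construction of this section relies on, can be sketched in Python as follows. This is only an illustrative sketch: the function names `pob_value`, `pob_number` and `split_even_parity` are hypothetical, the inverse mapping uses a standard greedy decoding that is assumed (not taken from [14]), and the split chooses positions at random rather than following any particular rule from the paper.

```python
import random
from math import comb


def pob_value(bits):
    """POB value V(B) of a string B = b_{n-1} ... b_0 (leftmost char is b_{n-1})."""
    value = 0
    ones_seen = 0  # p_j: number of 1s among b_0 .. b_j
    for j, bit in enumerate(reversed(bits)):
        if bit == "1":
            ones_seen += 1
            value += comb(j, ones_seen)  # comb(j, p) is 0 when p > j
    return value


def pob_number(value, n, r):
    """Inverse mapping (assumed greedy decoding): POB(n, r) string with this POB value."""
    if not 0 <= value < comb(n, r):
        raise ValueError("value out of range for POB(n, r)")
    bits = []
    remaining = r  # how many 1s still have to be placed
    for j in range(n - 1, -1, -1):
        if remaining > 0 and comb(j, remaining) <= value:
            bits.append("1")
            value -= comb(j, remaining)
            remaining -= 1
        else:
            bits.append("0")
    return "".join(bits)


def split_even_parity(t, rng=random):
    """Split a 9-bit even-parity string t into A, B with four 1s each and A xor B == t."""
    if len(t) != 9 or set(t) - {"0", "1"} or t.count("1") % 2 != 0:
        raise ValueError("t must be a 9-bit string with an even number of 1s")
    ones = [i for i, c in enumerate(t) if c == "1"]
    zeros = [i for i, c in enumerate(t) if c == "0"]
    m = len(ones) // 2
    # A takes m of the positions where t is 1 and 4 - m of the positions where t is 0.
    chosen = set(rng.sample(ones, m) + rng.sample(zeros, 4 - m))
    a = "".join("1" if i in chosen else "0" for i in range(9))
    # B complements A where t is 1 and copies A where t is 0.
    b = "".join(str(int(x) ^ int(y)) for x, y in zip(a, t))
    return a, b


if __name__ == "__main__":
    print(pob_value("000001111"), pob_value("111100000"))  # smallest and largest POB(9,4)
    print(split_even_parity("110110000"))
```

The random choice inside `split_even_parity` mirrors the freedom left by the proof (any P with m ones and any Q with 4-m ones works); a deterministic choice would serve equally well for checking the theorem.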
The POB(9,4) system is used in this paper, so each binary string B is of length 9 with exactly 4 ones. There are 126 POB values, in the range [0, 125]. For the POB(9,4) number system the smallest POB number is 000001111 (decimal value 15; POB value $\binom{0}{1} + \binom{1}{2} + \binom{2}{3} + \binom{3}{4} = 0$) and the largest POB number is 111100000 (decimal value 480; POB value $\binom{5}{1} + \binom{6}{2} + \binom{7}{3} + \binom{8}{4} = 5 + 15 + 35 + 70 = 125$). The binary representation of 125 is 1111101, so every POB(9,4) value fits in 7 bits.
A. N out of N Construction using POB [14]
The N out of N secret sharing scheme encrypts the secret image into N shares so that only someone with all N shares can decrypt the image, while any N-1 shares reveal no information about the original image. Like traditional visual cryptography, the POB scheme uses only XOR operations for encryption and decryption of the secret. The N out of N construction is based on the following theorem.
Theorem: Let T be a binary string of even parity having length 9. Then we can find two binary strings A and B, each having exactly four 1s and five 0s, such that T = A ⊕ B.
Proof: We can assume without loss of generality that the leading 2m digits of T are 1s, where 0 ≤ m ≤ 4, and the remaining 9-2m digits are 0s. Now let A = PQ be the binary string obtained by concatenating the strings P and Q, where P is a string of length 2m having exactly m 1s and m 0s, and Q is a string of length 9-2m having exactly 4-m 1s and 5-m 0s. Then the choice B = P'Q, where P' is the complement of P, proves the theorem: A and B each contain four 1s, and A ⊕ B = (P ⊕ P')(Q ⊕ Q) consists of 2m 1s followed by 9-2m 0s, which is T.
B. Recovery Procedure
To reconstruct the secret image/information from the N shares, first find the decimal equivalent of each 7-bit binary string, divide the second POB value by 14 and store the quotient as r. Then calculate the POB numbers of each decimal value and perform the XOR operation on these POB numbers. The result is stored in an array T of size 9. In order to get the secret information K, the rth bit of T is dropped. The details of the N out of N construction using the POB scheme are discussed in [14].
© Dept. of Computer Science & Engg., Sreepathy Institute of Management And Technology, Vavanoor, Palakkad, 679533.
III. PROPOSED METHOD
The objective of the proposed scheme is to generate an (N,N) scheme by applying the POB(9,4) system to color images. The aim is to get better quality decrypted images with the same size as the original image. Figure 1 shows the flowchart of the encryption algorithm. In this encryption algorithm a color image is decomposed into three channels and each channel is considered as a gray-level image. Halftoning is applied to each gray-level image and a single gray scale image is generated. The POB(9,4) encryption scheme is then applied to create the N shares. Figure 2 shows the flowchart of decryption, where we superimpose (using the XOR operation) the shares of each channel to get the decrypted image of each channel. These three decrypted channels are combined to get the decrypted color image.
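The step that turns three color channels into a single gray scale image can be illustrated with a short Python sketch. It uses the relation Gray = 32*R + 4*G + B given in the encryption steps, with R and G quantized to 8 levels and B to 4 levels. The level description tables (Table 1 and Table 2) are not reproduced in this text, so the equal-width bins used below are an assumption, and the function names `to_level`, `level_bounds`, `pack_rgb` and `unpack_gray` are hypothetical.

```python
def to_level(value, levels):
    """Map an 8-bit channel value (0..255) to one of `levels` equal-width bins.

    Equal-width bins are an assumption; the paper's level tables may differ.
    """
    if not 0 <= value <= 255:
        raise ValueError("channel value must be in 0..255")
    return value * levels // 256


def level_bounds(level, levels):
    """Inclusive range of 8-bit values that fall into `level` (same assumption)."""
    width = 256 // levels
    return level * width, (level + 1) * width - 1


def pack_rgb(r, g, b):
    """Gray = 32*R + 4*G + B, with R, G in 0..7 (3 bits each) and B in 0..3 (2 bits)."""
    return 32 * to_level(r, 8) + 4 * to_level(g, 8) + to_level(b, 4)


def unpack_gray(gray):
    """Recover the (R, G, B) levels: first three bits, next three bits, last two bits."""
    return gray >> 5, (gray >> 2) & 0b111, gray & 0b11


if __name__ == "__main__":
    gray = pack_rgb(200, 100, 50)
    print(gray, unpack_gray(gray), level_bounds(unpack_gray(gray)[0], 8))
```

Because only the level survives the packing, decryption can recover each channel value only up to its bin, which is why the decryption step picks a value inside the range of the recovered level.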
Fig. 1: Encryption Algorithm.
Fig. 2: Decryption Algorithm.
A. Encryption
In the encryption algorithm the shares are generated from the color image. The color image is decomposed into CR, CG and CB channels. From these channels the shares are created using the following steps:
1) The values of the CR and CG channels are mapped into 8 levels and the values of the CB channel into 4 levels, as shown in Table 1 and Table 2.
2) Generate a single gray scale image from the R, G and B components obtained from the level description, using a halftone method defined by the equation Gray scale image = 32*R + 4*G + B. The value of each element of the gray scale image is in the range [0,255].
3) The N out of N POB(9,4) encoding method described in Section II-A is used for creating N encrypted shares. The encrypted shares are ES1, ES2, ..., ESN.
B. Decryption
In the decryption algorithm the color image channels are reconstructed by stacking the shares of the channels. These color image channels are combined to get the secret color image.
1) Find the 9-bit POB numbers corresponding to the POB values (elements) represented by the encrypted shares ES1, ES2, ..., ESN and store them in A1, ..., AN respectively. Perform the XOR operation on A1, ..., AN and store the result in T. T is an m × n matrix of POB numbers, where m is the width and n is the height of the share image.
2) Divide each element of share ES2 by 14 and store the quotient in Q.
3) Shift method: Let r = Qij, where 1 ≤ i ≤ m, 1 ≤ j ≤ n. Delete the bit in the rth position of Tij, shifting the remaining bits left by 1. The resultant T represents the recovered gray scale image.
4) For each element of the gray scale image, the first three bits represent the red component level, the next three bits represent the green component level and the remaining two bits represent the blue component level. Select random numbers in the value range of the corresponding level of each component, based on their level description table, and assign them to the corresponding channel values. Combine the channels to get the decrypted color image.
IV. RESULT AND DISCUSSION
As described in Section 3, three color channels are extracted from the color image. The halftoning method with the level descriptor table is applied to produce a single gray scale image. The 3 out of 3 POB(9,4) scheme is then applied to the resultant gray scale image to produce 3 shares. The standard Lena color image of size 256 × 256 is used for the experiment. The results are shown in Fig. 3 (a)-(h). It is clear that the secret is revealed only when the three shares are combined. The clarity of the obtained image is comparable with the original image. Table 3 represents the size of decrypted image
using different (n,n) color visual cryptography schemes with c number of colors. In all these methods the size of the decrypted image is increased, whereas in the proposed method the size remains the same as that of the original image.
Fig. 3: (a) Secret image (b)-(d) Shares of participants (e) Reconstructed image using share1 and share2 (f) Reconstructed image using share1 and share3 (g) Reconstructed image using share2 and share3 (h) Reconstructed image using share1, share2 and share3.
A. Analysis of Attack
We have analyzed the effects of attack on the proposed scheme. In the construction under the POB(9,4) number system, there are 126 shares corresponding to one byte of secret. The probability of a correct guess of a share is 1/126 per byte of secret. This means that for a secret of m bytes, the probability of a correct guess of a share is as low as $(1/126)^m$, which tends to 0 as m becomes large.
V. CONCLUSION
In this paper we have proposed a (9,4) POB scheme for color images which uses halftoning on the color channels. The XOR operation is used in stacking, which produces a better quality image, and there is no expansion in the size of the decrypted image. The quality of the decrypted image is shown to be better than that of the other schemes. The basic (n,n) threshold visual cryptography scheme is used for color images in which the size of share image is nn-1. By optimizing the storage of POB values in memory it is possible to reduce the size of the shares. To improve the quality of the images, the POB number system can be directly applied to each of the RGB components of the image. These are treated as future work.
REFERENCES
[1] Shamir, Adi (1979), "How to share a secret", Communications of the ACM 22 (11): pp. 612-613. doi:10.1145/359168.359176.
[2] Moni Naor, Adi Shamir, Visual Cryptography, EUROCRYPT 1994, pp. 1-12.
[3] Ateniese, G., Blundo, C., De Santis, A., and Stinson, D. R. (1996), Constructions and bounds for visual cryptography, 23rd International Colloquium on Automata, Languages and Programming (ICALP 96), Lecture Notes in Computer Science, Vol. 1099, pp. 416-428, Springer-Verlag, Berlin.
[4] Ateniese, G., Blundo, C., De Santis, A., and Stinson, D. R. (1996), Visual cryptography for General Access Structures, Information and Computation, vol. 129(2), pp. 86-106.
[5] Atici, M., Stinson, D. R., and Wei, R., A New Practical Algorithm for the construction of a Perfect Hash Function, final version, 1995.
[6] Ateniese, G., Blundo, C., De Santis, A., and Stinson, D. R. (1996), Extended Schemes for Visual Cryptography, submitted to Discrete Applied Mathematics.
[7] Verheul, E., and Tilborg, H. V., Constructions and Properties of k out of n Visual Secret Sharing Schemes, Designs, Codes and Cryptography, vol. 11, no. 2, 1997, pp. 179-196.
[8] Hsien-Chu Wu, Hao-Cheng Wang and Rui-Wen Yu, Color visual Cryptography Scheme using Meaningful Shares, Eighth International Conference on Intelligent Systems Design and Applications, Volume 3, pp. 173-178, 2008.
[9] Kirankumari, Shalinibhatia, Multi-pixel Visual Cryptography for color images with Meaningful Shares, International Journal of Engineering Science and Technology, Vol. 2(6), 2010, 2398-2407.
[10] Nagaraj V. Dharwadkar, B. B. Amberker, Sushil Raj Joshi, Visual Cryptography for Color Image using Color Error Diffusion, ICGST-GVIP Journal, vol. 10, issue 1, February 2010.
[11] B. SaiChandana, S. Anuradha, A New Visual Cryptography Scheme for Color Images, International Journal of Engineering Science and Technology, vol. 2(6), 1997-2000, 2010.
[12] F. Liu, C. K. Wu, X. J. Lin, Colour visual cryptography schemes, IET Information Security, 2008, Vol. 2, No. 4, pp. 151-165. doi:10.1049/iet-ifs:20080066.
[13] Chin-Chen Chang, Chwei-Shyong Tsai, Tung-Shou Chen, A New Scheme for Sharing Secret Color Images in Computer Network, ICPADS '00 Proceedings of the Seventh International Conference on Parallel and Distributed Systems, pp. 21-27.
[14] Sreekumar, A., BabuSundar, S., An Efficient Secret Sharing Scheme for n out of n scheme using POB-number system, IIT Kanpur Hackers workshop 2009.
Locating Emergency Responders in Disaster Area Using Wireless Sensor Network
Aparna M
Dept. of Computer Science and Engg, SIMAT, Vavanoor, Palakkad, aparna.m@simat.ac.in
Abstract—A worldwide increase in number of natural hazards is causing heavy loss of human life and infrastructure. An effective disaster management system is required to reduce the impacts of natural hazards on common life. The first hand responders play a major role in effective and efficient disaster management. Locating and tracking the first hand responders are necessary to organize and manage real-time delivery of medical and food supplies for disaster hit people. This requires effective communication and information processing between various groups of emergency responders in harsh and remote environments. Locating, tracking, and communicating with emergency responders can be achieved by devising a body sensor system for the emergency responders. In phase 1 of this research work, we have developed an enhanced trilateration algorithm for static and mobile wireless sensor nodes. This work discusses an algorithm and its implementation for localization of emergency responders in a disaster hit area. The indoor and outdoor experimentation results are also presented. Keywords—Wireless sensor networks, Localization, Disaster area, Emergency Responders.
I. INTRODUCTION
The advent of the Wireless Sensor Network (WSN) has marked an era in the sensing and monitoring field. The technology has made it possible to monitor otherwise remote and inaccessible areas such as active volcanoes, avalanches and so on. WSN is widely used in various areas, such as environmental monitoring, medical care, and disaster prevention and mitigation. This paper details yet another application of WSN in the post disaster scenario and comes up with an algorithm for localization. When a disaster has struck an area, it is important to act immediately to rescue and give first line help in the form of medical aid, food and so on to the people in that area. Thus the role of first line emergency responders becomes a vital part of the post disaster scenario. In a disaster scenario, locating and tracking the first hand responders is essential to organize and manage real-time delivery of medicine and food to disaster hit people. This requires effective communication and information processing between various groups of emergency responders in harsh and remote environments. Locating, tracking, and communicating with emergency responders can be achieved by devising a body sensor system for the emergency responders. This project aims to locate the emergency responders in different locations. In a disaster hit area whole communication networks may get damaged, and communication between responders may not be possible.
So localization of responders is very difficult with technologies other than WSN. This research project is an application of Wireless Sensor Networks in disaster management. The paper addresses the development of an algorithm that can perform precise localization and tracking of the responders with indirect line-of-sight. Responders will be randomly located in the area, so an ad-hoc network will be formed between the sensor nodes. The following sections brief the role of responders and the location tracking algorithms used.
II. RELATED WORK
Many indoor localization algorithms have been developed using the Received Signal Strength Indicator (RSSI). A method for distance measurement using RSSI has been discussed in [7]. The accuracy of the RSSI method can be strongly influenced by multipath, fading, non-line of sight (NLoS) conditions and other sources of interference [11]. In a disaster prone area these effects are more pronounced, and as such the RSSI method cannot be used in our study. A coordinate estimation method of localization using the principle of GPS has been suggested in [1]. Since GPS modules could be costly, it has been discarded for our study. In this project work we focused on a category of localization methods which estimate coordinates based on distance measurement. Instead of considering signal strength, the time of arrival of each packet is considered, which increases the accuracy of location. The algorithm is based on the Time Difference of Arrival (TDOA) method, the accuracy of which is much higher than that of RSSI, and we also present optimization methods to decrease the error of estimating the location using TDOA.
III. SYSTEM ARCHITECTURE
As suggested in [13], creating a common operating picture for all responders in an emergency situation is essential to take appropriate action in the disaster hit area and for the safety of the responders.
Protective suits used by the responders to be safe from hazardous materials create unique problems for response teams because their protective suits often make it difficult to read instrument screens, and if subject-matter experts arent on scene, responders must find ways to relay the information back to them. The aim is to develop an algorithm that can perform precise localization of sensor nodes with indirect line-of-sight by
c Dept. of Computer Science & Engg., Sreepathy Institute of Management And Technology, Vavanoor, Palakkad, 679533.
Aparna M, Locating Emergency Responders in Disaster Area Using Wireless Sensor Network utilizing location information and distance measurements over multiple hops. To achieve this goal, nodes use their ranging sensors to measure distance to their neighbors and share their measurement and location information with their neighbors to collectively estimate their locations. Multilateration is a suitable method for localization in outdoor navigation. As described in [10], when the receiver sends the signal to locate itself, it finds out at least three nearest anchor nodes which know their positions. The receiver then calculates the distance between one satellite and the receiver. If the distance is X, then it draws an imaginary sphere with X as the radius from the receiver to the satellite and also the node as the centre [13]. The same process is repeated for the next two nodes. Thus three spheres are drawn with just two possible positions. Out of these one point will be in space and the other will be the location of the receiver. Thus the exact position of the receiver is found out. Usually the receivers try to locate more than four satellites so as to increase the accuracy of the location. The Earth is made as the fourth sphere so that two points converge with the imaginary spheres of the other three satellites. This method is commonly called 3-D Trilateration method. Here in this paper we have tried to implement a modified version of the above mentioned method. Fig: 5 shows the entire architecture of the system. The entire wireless sensor network is formed by required number of MicaZ mote. MicaZ mote includes the program for Localization, tracking and monitoring the position of unknown node. Also it includes the program for time synchronization. The
6
Fig. 2: TinyDB Architecture
use a GPS receiver to find out the coordinate points of three nodes. Algorithm Localization Algorithm 1. Set N numbers of nodes in the field and synchronize all the nodes. 2. Each of them continuously sent RF signals with packet containing a field for time stamp, i.e. whenever RF signal is sent by an unknown node, unknown node add the time of sending RF signal to the packet it broadcasts. 3. In addition there are B beacon nodes. 4. Each node ni ,1 ≤ i ≤ N estimate its distance dij from each beacon j 1 ≤ j ≤ M where M is the no: of beacon nodes in its transmission range(assume M as 3 nodes) 5. Each beacon node will calculate the distance from node Distance = speed × time 6. Making this distance as radius each beacon create a circle. 7. The intersecting point of these 3 circles will be the position of unknown node. 8. Go to Step5 if the unknown node is moving and track its position at regular instance of time. Generate the genetically evolved population of cellular automata rules. 9. Stop.
Fig. 1: System Architecture

TinyDB plays an important role here. TinyDB extracts query information from a network of motes. It provides a simple Java API for writing PC applications that query and extract data from the network; it also comes with a simple graphical query-builder and result display that uses the API.

IV. ALGORITHM DESIGN
The algorithm used here is a trilateration algorithm, implemented using TinyOS in the nesC language. This is the first step for responder localization. Trilateration is the method of using the relative positions of nearby objects to calculate the exact location of the object of interest. Instead of using a known distance and an angle measurement, as in normal triangulation methods, we use three known distances to perform the calculation. Here we
Fig. 3: Graphical Representation of Trilateration

For example (see Figure 3), point B is where the object of interest is located. The distances to the nearby objects P1 and P2 are known. From geometry, it can be concluded that only two possible locations, A and B, satisfy the criteria. To avoid ambiguity, the distance to a third nearby object is introduced, and now there is only one point B that could
possibly exist [14]. If we apply the concept of 2-D trilateration to a GPS application, which exists in 3-D space, the circles in Figure 3 become spheres. In order for the receiver to calculate its distance from B, B sends RF signals with time stamps. We take the speed of the RF signal as 3 × 10^8 m/s, and we know the time of sending from the time stamp value. From these, the distance can be measured using equation (1).

$\text{Distance} = \text{speed} \times \text{time} \quad (1)$

Using this calculated distance as the radius and p1 as the centre, draw a circle. Repeat the same steps for p2 and p3 to obtain three circles. Solving the three circle equations gives one intersecting point. Consider three circles with centres (x1, y1, z1), (x2, y2, z2) and (x3, y3, z3):

$X = (4k_1 + 3k_2 - 2k_1)/6 \quad (2)$

$Y = (2k_1 + 3k_2 - k_3)/6 \quad (3)$

$Z = (k_1 + k_2 - k_3)/2 \quad (4)$

X, Y, Z are the co-ordinates of the unknown node.

Fig. 4: Distance measurement as viewed using XSniffer

V. IMPLEMENTATION

Field testing was also carried out with the entire setup. Readings from the motes were taken outdoors and tabulated as shown in the table. The setup includes one unknown node and three beacon nodes. The beacon nodes calculate the distance of the unknown node from them and calculate the intersection point of the three circles formed by the calculated distances. The actual distance, the time stamp and the speed of the wave are tabulated in Table 1. The beacon node uses the time stamp to calculate the time of arrival. Knowing the speed of the RF waves, the distance can be computed as discussed in section 3.1.1. This distance is calculated in the beacon node, which localizes the unknown node. The data from the beacon node is viewed using XSniffer (Figure 4).

Fig. 5: Distance measurement

A second method was tried out in which the unknown node itself acts as a sink node. The beacon nodes calculate the distance and send it to the unknown node, which does the localization calculations. This avoids the need for additional nodes. It was found that the accuracy of the calculation increases. The XSniffer output is as shown in Figure 5. The actual distance is measured using measuring instruments. Using the MicaZ mote we calculated the distance and viewed the result using XSniffer. We analysed the actual and calculated distances, which are plotted in Figure 6, and came to the following three conclusions:
1) There are small differences between the actual and calculated values.
2) The error rate always remains in the range of 0.025%-0.5%.
3) The error rate is minimum when the unknown node and the beacon node are close. When they are far from each other, the error rate increases due to interference and delay.

Multilateration with a number of beacon nodes is a good method to reduce error in calculations. Multilateration, also known as hyperbolic positioning, is the process of locating an object by accurately computing the time difference of arrival (TDOA) of a signal emitted from that object to three or more receivers. It also refers to the case of locating a receiver by measuring the TDOA of a signal transmitted from three or more synchronized transmitters. In practice, errors in the measurement of the time of arrival of pulses mean that enhanced accuracy can be obtained with more than four receivers. In general, N receivers provide N − 1 hyperboloids. When there are N > 4 receivers, the N − 1 hyperboloids should, assuming a perfect model and measurements, intersect at a single point. In reality, the surfaces rarely intersect, because of various errors. In this case, the location problem can be posed as an optimization problem and solved using, for example, the least squares method or an extended Kalman filter. Additionally, the TDOA of multiple transmitted pulses from the emitter can be averaged to improve accuracy.
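The least squares formulation mentioned above can be sketched in Python under simplifying assumptions. The sketch works on measured ranges to N ≥ 3 beacons in a plane rather than on TDOA hyperboloids, so it is a range-based stand-in for the method described, not an implementation of hyperbolic positioning. The name `multilaterate_2d` is hypothetical.

```python
def multilaterate_2d(beacons, distances):
    """Least-squares position estimate from ranges to N >= 3 beacons.

    Each beacon i > 0 contributes one linear equation obtained by
    subtracting its circle equation from that of beacon 0. The
    overdetermined system is solved through its normal equations.
    """
    if len(beacons) < 3 or len(beacons) != len(distances):
        raise ValueError("need at least three beacons with one range each")
    x1, y1 = beacons[0]
    d1 = distances[0]
    s_aa = s_ab = s_bb = s_ac = s_bc = 0.0
    for (xi, yi), di in zip(beacons[1:], distances[1:]):
        a = 2 * (xi - x1)
        b = 2 * (yi - y1)
        c = d1**2 - di**2 + xi**2 - x1**2 + yi**2 - y1**2
        # Accumulate the entries of A^T A and A^T c.
        s_aa += a * a
        s_ab += a * b
        s_bb += b * b
        s_ac += a * c
        s_bc += b * c
    det = s_aa * s_bb - s_ab * s_ab
    if abs(det) < 1e-12:
        raise ValueError("beacon geometry is degenerate")
    x = (s_ac * s_bb - s_ab * s_bc) / det
    y = (s_aa * s_bc - s_ab * s_ac) / det
    return x, y
```

With noisy ranges the extra beacons average out part of the error, which is the effect the paragraph above attributes to using more receivers.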
VI. ADVANTAGES

The main advantage of the system is its use of distance measurements, as opposed to the RSSI method, which is based on the measurement of signal strength. Signal strength fluctuates constantly, so accurate localization of the object in the plane may not be possible with RSSI. In the algorithm described above, the relationship between speed and time is used for measuring the distance. The time stamping interface is used to keep track of the time. Knowing the speed of the RF signal, it is possible to calculate the distance.

VII. CONCLUSION AND FUTURE WORKS

This project develops a localization method for first responders. The algorithm has been implemented and tested; a trilateration algorithm is used for localization, and the time value is extracted from time stamping. From this the distance is measured, so there is no ranging problem. When distance is measured using the strength of the radio signal, the signal strength is reduced by several factors. We instead use the relationship between time and speed to calculate the distance. This method helps to locate the real coordinates of an object in the plane. This work could be extended by implementing the algorithm in a tag that can perform the functions of a GPS receiver, so that the exact location can be identified with great accuracy in an outdoor environment. The developed algorithm could be extended to multilateration with minor changes. When the number of nodes increases, the accuracy increases. The unknown node can also act as a sink node and reduce the complexity of the network.

REFERENCES
[1] Hu, L., Evans, D.: Localization of mobile sensor networks. In: IEEE InfoCom 2000 (March 2000)
[2] Sabatto, S.Z., Elangovan, V., Chen, W., Mgaya, R.: Localization Strategies for Large-Scale Airborne Deployed Wireless Sensors. IEEE, Los Alamitos (2009)
[3] Kang, Li, X.: Power-Aware Markov Chain Based Tracking. IEEE Computer 37(8), 41-49 (2004)
[4] Park, J.Y., Song, H.Y.: Multilevel Localization for Mobile Sensor Network Platforms. Proceedings of the IMCSIT 3 (2008)
[5] http://www.tinyos.net/tinyos-1.x/doc/tutorial
[6] Ramesh, M.V., Kumar, S., Rangan, P.V.: Wireless Sensor Network for Landslide Detection. In: Proceedings of the Third International Conference on Sensor Technologies
[7] V.: RADAR: An In-Building RF-based User Location and Tracking System. In: Proc. of IEEE INFOCOM 2000, pp. 775-784 (2000)
[8] Biswas, P., Ye, Y.: Semidefinite Programming for Ad Hoc Wireless Sensor Network Localization. In: 3rd International Symposium on Information Processing
[9] Bulusu, N., Estrin, D., Girod, L., Heidemann, J.: Scalable Coordination for Wireless Sensor Networks: Self-Configuring Localization Systems. In: Proceedings of the 6th IEEE International Symposium.
[10] Kannan, A., Mao, G., Vucetic, B.: Simulated Annealing based Localization in Wireless Sensor Network. In: Vehicular Technology Conference (2006), ieeexplore.ieee.org.
[11] Mao, G., Fidan, B., Anderson, B.D.O.: Wireless Sensor Network Localization Techniques. The International Journal of Computer and Telecommunications Networking (2007).
[12] Sichitiu, M.L., Ramadurai, V.: Localization of Wireless Sensor Networks with a Mobile Beacon. In: Proceedings of the IEEE ICMASS (2004).
[13] Ladd, M., Bekris, K.E., Rudys, A., Kavraki, L.E., Wallach, D.S.: Robotics-Based Location Sensing Using Wireless Ethernet. In: International Conference on Mobile Computing and Networking (2002).
[14] Simic, S.N., Sastry, S.: Distributed Localization in Wireless Ad Hoc Networks. Tech. report, UC Berkeley, Memorandum No. UCB/ERL M02/26 (2002).
[15] Evans, D., Hu, L.: Localization for mobile sensor networks. Proc. IEEE FGCN 2007 workshop chairs (2007).
Sreepathy Journal of Computer Sc. & Engg.
Vol.1, Issue.1, June-2014
Sreepathy Journal of Computer Science and Engg.,Vol 1, Issue 1, June 2014
Brute Force Attack Defensing With Online Password Guessing Resistant Protocol

Jyothis K P (1), Padmadas M P (2), Krishnan N (3)

(1) Dept. of Computer Science & Engg., Sreepathy Institute of Management And Technology, Vavanoor, jyothis.kp@simat.ac.in
(2) Research Scholar, Centre for Information Technology and Engineering, M.S University, Tirunelveli, India, mpadmadas@gmail.com
(3) Professor, Centre for Information Technology and Engineering, M.S University, Tirunelveli, India, krishnan17563@gmail.com
Abstract—Brute force and dictionary attacks on password-only remote login services are now widespread and ever increasing. Enabling convenient login for legitimate users while preventing such attacks is a difficult problem. Automated Turing Tests (ATTs) continue to be an effective, easy-to-deploy approach to identify automated malicious login attempts with reasonable cost of inconvenience to users. In this paper, we discuss the inadequacy of existing and proposed login protocols designed to address large-scale online dictionary attacks (e.g., from a botnet of hundreds of thousands of nodes). We propose a new Password Guessing Resistant Protocol (PGRP), derived upon revisiting prior proposals designed to restrict such attacks. While PGRP limits the total number of login attempts from unknown remote hosts to as low as a single attempt per username, legitimate users in most cases (e.g., when attempts are made from known, frequently-used machines) can make several failed login attempts before being challenged with an ATT. We analyze the performance of PGRP with two real-world data sets and find it more promising than existing proposals. PGRP accommodates both graphical user interfaces (e.g., browser-based logins) and character-based interfaces (e.g., SSH logins), while the previous protocols deal exclusively with the former, requiring the use of browser cookies. PGRP uses either cookies or IP addresses, or both, for tracking legitimate users.

Keywords—Online password guessing attacks, brute force attacks, password dictionary, ATTs.
I. INTRODUCTION

Online guessing attacks on password-based systems are inevitable and commonly observed against web applications and SSH logins. In a recent report, SANS identified password guessing attacks on websites as a top cyber security risk. As an example of SSH password guessing attacks, one experimental Linux honeypot setup has been reported to suffer on average 2,805 malicious SSH login attempts per computer per day. Interestingly, SSH servers that disallow standard password authentication may also suffer guessing attacks, e.g., through the exploitation of a lesser known and lesser used SSH server configuration called keyboard interactive authentication. However, online attacks have some inherent disadvantages compared to offline attacks: attacking machines must engage in an interactive protocol, thus allowing easier detection; and in most cases, attackers can try only a limited number of guesses from a single machine before being locked out, delayed, or challenged to answer Automated Turing Tests (ATTs, e.g., CAPTCHAs). Consequently, attackers often must employ a large number of machines to avoid detection or lock-out.
Fig. 1: Use Case Diagram for Password Attack

One effective defense against automated online password guessing attacks is to restrict the number of failed trials without ATTs to a very small number (e.g., three), limiting automated programs (or bots) as used by attackers to three free password guesses for a targeted account, even if different machines from a botnet are used.

II. LITERATURE SURVEY

Even the best current guidelines for designing password-composition policies, for instance, are based on theoretical estimates or small-scale laboratory studies (e.g., [12, 20]). What makes designing an appropriate password-composition policy even trickier is that such policies affect not only the passwords users create, but also users' behavior. For example, certain password-composition policies that lead to more-difficult-to-predict passwords may also lead users to write down their passwords more readily, or to become more averse to changing passwords because of the additional effort of memorizing
© Dept. of Computer Science & Engg., Sreepathy Institute of Management And Technology, Vavanoor, Palakkad, 679533.
the new ones. Such behavior may also affect an adversary's ability to predict passwords and should therefore be taken into account when selecting a policy. For instance, we compared two password-composition policies: one required only that passwords be at least 16 characters long; the other required at least eight characters but also an uppercase letter, a number, a symbol, and a dictionary check. According to the best available guidelines, these two policies should result in passwords of approximately the same entropy. We find, however, that the 16-character policy yields significantly less predictable passwords, and that it is, by several metrics, less onerous for users. We believe this and other findings will be useful both to security professionals seeking to establish or review password-composition policies, and to researchers interested in examining how to make passwords more secure and usable. Among the alternatives, OpenID has little evidence of user adoption [10], object-based passwords [10] have emerged only recently and little is known about them yet, and the same holds for many other schemes [10]. C. Herley and P. C. van Oorschot argue that no silver bullet will meet all requirements, and that not only will passwords be with us for some time, but in many instances they are the solution which best fits the scenario of use [10]. C. Herley, P. C. van Oorschot, J. Bonneau, and F. Stajano show that the benefit analyses offered by the authors of the schemes selected in their survey paper are not only optimistic but also incomplete under the framework they define. Cormac Herley argues that most security advice simply offers a poor cost-benefit trade-off to users, and that users' rejection of the security advice they receive is entirely rational from an economic perspective [10].
The password security community has therefore looked back at the research history of password security and usability and come up with new paradigms of solutions, yet it also shows that the practice of text passwords will not end soon.

III. ONLINE PGRP

In this section, we present the PGRP protocol, including the goals and design choices.
A. Goals, Operational Assumptions and Overview

1) Protocol Goals: Our objectives for PGRP include the following:
• The login protocol should make brute force and dictionary attacks ineffective even for adversaries with access to large botnets (i.e., capable of launching the attack from many remote hosts).
• The protocol should not have any significant impact on usability (user convenience). For example, for legitimate users, any additional steps besides entering login credentials should be minimal. Increasing the security of the protocol must have minimal effect in decreasing the login usability.
• The protocol should be easy to deploy and scalable, requiring minimum computational resources in terms of memory, processing time, and disk space.
2) Assumptions: We assume that adversaries can solve a small percentage of ATTs, e.g., through automated programs, brute force mechanisms, and low paid workers (e.g., Amazon Mechanical Turk). Incidents of attackers using IP addresses of known machines and cookie theft for targeted password guessing are also assumed to be minimal. Traditional password-based authentication is not suitable for any untrusted environment (e.g., a key logger may record all keystrokes, including passwords in a system, and forward those to a remote attacker). We do not prevent such existing attacks in untrusted environments, and thus essentially assume any machines that legitimate users use for login are trustworthy. The data integrity of cookies must be protected (e.g., by a MAC using a key known only to the login server).

3) Overview: The general idea behind PGRP is that except for the following two cases, all remote hosts must correctly answer an ATT challenge prior to being informed whether access is granted or the login attempt is unsuccessful:
1) when the number of failed login attempts for a given username is very small;
2) when the remote host has successfully logged in using the same username in the past (however, such a host must pass an ATT challenge if it generates more failed login attempts than a prespecified threshold).
In contrast to previous protocols, PGRP uses either IP addresses, cookies, or both to identify machines from which users have been successfully authenticated. The decision to require an ATT challenge upon receiving incorrect credentials is based on the received cookie (if any) and/or the remote host's IP address. In addition, if the number of failed login attempts for a specific username is below a threshold, the user is not required to answer an ATT challenge even if the login attempt is from a new machine for the first time (whether the provided username-password pair is correct or incorrect).

B. Data Structure and Function Description

1) Data Structures: PGRP maintains three data structures:
1) W. A list of {source IP address, username} pairs such that for each pair, a successful login from the source IP address has been initiated for the username previously.
2) FT. Each entry in this table represents the number of failed login attempts for a valid username, un. A maximum of k2 failed login attempts are recorded. Accessing a nonexisting index returns 0.
3) FS. Each entry in this table represents the number of failed login attempts for each pair of (srcIP, un). Here, srcIP is the IP address for a host in W or a host with a valid cookie, and un is a valid username attempted from srcIP. A maximum of k1 failed login attempts are recorded; crossing this threshold may mandate passing an ATT (e.g., depending on FT[un]). An entry is set to 0 after a successful login attempt. Accessing a nonexisting index returns 0.
Each entry in W, FT, and FS has a "write-expiry" interval such that the entry is deleted when the given period of time (t1, t2, or t3) has lapsed since the last time the entry was inserted or modified. There are different ways to implement write-expiry intervals (e.g., hashbelt). A simple approach is to store a timestamp of the insertion time with each entry such that the timestamp is updated whenever the entry is modified. Any time the entry is accessed, if the delta between the access time and the entry timestamp is greater than the data structure write-expiry interval (i.e., t1, t2, or t3), the entry is deleted.

2) Functions: PGRP uses the following functions (IN denotes input and OUT denotes output):
1) ReadCredential(OUT: un, pw, cookie). Shows a login prompt to the user and returns the entered username and password, and the cookie received from the user's browser (if any).
2) LoginCorrect(IN: un, pw; OUT: true/false). If the provided username-password pair is valid, the function returns true; otherwise, it returns false.
3) GrantAccess(IN: un, cookie). The function sends the cookie to the user's browser and then enables access to the specified user account.

C. Cookies versus Source IP Addresses

Similar to the previous protocols, PGRP keeps track of user machines from which successful logins have been initiated previously. Browser cookies seem a good choice for this purpose if the login server offers a web-based interface. Typically, if no cookie is sent by the user's browser to the login server, the server sends a cookie to the browser after a successful login to identify the user on the next login attempt. However, if the user uses multiple browsers or more than one OS on the same machine, the login server will be unable to identify the user in all cases. Cookies may also be deleted by users, or automatically as enabled by the private browsing mode of most modern browsers. Moreover, cookie theft (e.g., through session hijacking) might enable an adversary to impersonate a user who has been successfully authenticated in the past.
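The W, FT and FS tables with their write-expiry intervals, described in the data structure subsection above, can be sketched in Python using the simple timestamp approach. This is an illustrative sketch, not the authors' implementation: the class name `ExpiringTable`, the injectable `clock` parameter and the interval values chosen for t1, t2 and t3 are all assumptions made for the example.

```python
import time


class ExpiringTable:
    """Key/value table whose entries expire `ttl` seconds after the last write."""

    def __init__(self, ttl, clock=time.time):
        self.ttl = ttl
        self.clock = clock          # injectable so expiry can be tested
        self._data = {}             # key -> (value, timestamp of last write)

    def get(self, key, default=0):
        """Return the stored value, or `default` for missing or expired entries."""
        entry = self._data.get(key)
        if entry is None:
            return default
        value, stamp = entry
        if self.clock() - stamp > self.ttl:
            del self._data[key]     # write-expiry interval has lapsed
            return default
        return value

    def set(self, key, value):
        """Insert or modify an entry and refresh its timestamp."""
        self._data[key] = (value, self.clock())

    def increment(self, key, cap):
        """Count one more failed attempt, recording at most `cap` of them."""
        value = min(self.get(key) + 1, cap)
        self.set(key, value)
        return value


# Hypothetical intervals (seconds); the paper does not fix t1, t2, t3 here.
T1, T2, T3 = 30 * 86400, 86400, 86400
W = ExpiringTable(T1)    # (srcIP, un) -> True after a successful login
FT = ExpiringTable(T2)   # un -> failed attempts, capped at k2
FS = ExpiringTable(T3)   # (srcIP, un) -> failed attempts, capped at k1
```

A lookup such as `FT.get("alice")` returns 0 for a nonexisting index, and `W.get(("10.0.0.7", "alice"), False)` tests whitelist membership.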
In addition, using cookies requires a browser interface (which, e.g., is not applicable to SSH). Alternatively, a user machine can be identified by the source IP address. Relying on source IP addresses to trace users may result in inaccurate identification for various reasons, including:
• The same machine might be assigned different IP addresses over time (e.g., through the network DHCP server and dial-up Internet).
• A group of machines might be represented by a smaller number of, or even a single, Internet-addressable IP address if a NAT mechanism is in place.
However, most NATs serve few hosts and DHCP servers usually rotate IP addresses on the order of several days (also, techniques to identify machines behind a NAT exist). Drawbacks of identifying a user by means of either a browser cookie or a source IP address include:
1) Failing to identify a machine from which the user has authenticated successfully in the past.
2) Wrongly identifying a machine from which the user has not authenticated before.
Case 1) decreases usability since the user might be asked to answer an ATT challenge for both correct and incorrect login credentials. Case 2) affects security since some users or attackers may not be asked to answer an ATT challenge even though they have not logged in successfully from those machines in the past. However, the probability of launching a dictionary or brute force attack from these machines appears to be low. First, for identification through cookies, the adversary must mount a directed attack to steal users' cookies. Second, for identification through IP addresses, the adversary must have access to a machine in the same subnet as the user. Consequently, we choose to use both browser cookies and source IP addresses (or only one of them if the other is not applicable) in PGRP to minimize user inconvenience during the login process. Also, by using IP addresses only, PGRP can be used in character-based login interfaces such as SSH. An SSH server can be adapted to use PGRP using text-based ATTs (e.g., textcaptcha.com). For example, a prototype of a text-based CAPTCHA for SSH is available as a source code patch for OpenSSH. The security implications of mistakenly treating a machine as one that a user has previously successfully logged in from are limited by a threshold such that after a specific number of failed login attempts (k1 in Fig. 2), an ATT challenge is imposed. For identification through a source IP address, the condition FS[srcIP, un] < k1 in line 4 (for correct credentials) and in line 16 (for incorrect credentials) limits the number of failed login attempts an identified user can make without answering ATTs. The function Valid(cookie, un, k1, true) in line 4 updates a counter in the received cookie, and the cookie is considered invalid once this counter hits or exceeds k1. This function is also called in line 16 to check this counter in case of a failed login attempt.
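A cookie that carries a failed-attempt counter and is integrity-protected by a MAC, as assumed by the protocol, could look like the following Python sketch. The cookie layout `username|counter|tag`, the helper names `make_cookie`, `parse_cookie`, `bump_cookie` and `valid`, and the choice of HMAC-SHA256 are assumptions made for illustration; the paper only states that a MAC with a server-side key protects the cookie and that the cookie becomes invalid once its counter reaches k1.

```python
import hashlib
import hmac


def make_cookie(username, failed, key):
    """Build 'username|failed|tag', where tag is an HMAC over the payload."""
    payload = f"{username}|{failed}"
    tag = hmac.new(key, payload.encode(), hashlib.sha256).hexdigest()
    return f"{payload}|{tag}"


def parse_cookie(cookie, key):
    """Return (username, failed) if the MAC verifies, otherwise None."""
    try:
        username, failed, tag = cookie.rsplit("|", 2)
    except (ValueError, AttributeError):
        return None
    expected = hmac.new(key, f"{username}|{failed}".encode(),
                        hashlib.sha256).hexdigest()
    if not hmac.compare_digest(tag, expected):
        return None                      # tampered or forged cookie
    try:
        return username, int(failed)
    except ValueError:
        return None


def bump_cookie(cookie, key):
    """Return a new cookie with the failed-attempt counter increased by one."""
    parsed = parse_cookie(cookie, key)
    if parsed is None:
        return None
    username, failed = parsed
    return make_cookie(username, failed + 1, key)


def valid(cookie, username, k1, key):
    """Cookie is valid if authentic, issued for `username`, and counter < k1."""
    parsed = parse_cookie(cookie, key)
    return parsed is not None and parsed[0] == username and parsed[1] < k1
```

Because the tag covers both the username and the counter, a client cannot reset its own counter without the server key. Expiry of the cookie is left out of the sketch.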
D. Decision Function for Requesting ATTs

Below we discuss issues related to ATT challenges as provided by the login server in Fig. 2. The decision to challenge the user with an ATT depends on two factors:
• whether the user has authenticated successfully from the same machine previously;
• the total number of failed login attempts for a specific user account.
For definitions of W, FT, and FS, see Section III-B.

1) Username-Password Pair Is Valid: As in the condition in line 4, upon entering a correct username-password pair, the user will not be asked to answer an ATT challenge in the following cases:
• A valid cookie is received from the user machine (i.e., the function Valid returns true) and the number of failed login attempts from the user machine's IP address for that username, FS[srcIP, un], is less than k1 over a time period determined by t3.
• The user machine's IP address is in the whitelist W and the number of failed login attempts from this IP address for that username, FS[srcIP, un], is less than k1 over a time period determined by t3.
• The number of failed login attempts from any machine for that username, FT[un], is below a threshold k2 over a time period determined by t2.
The last case enables a user who tries to log in from a new machine or IP address for the first time before k2 is reached to proceed without an ATT. However, if the number of failed login attempts for the username exceeds the threshold k2 (default 3), this might indicate a guessing attack and hence the user must pass an ATT challenge.

Fig. 2: PGRP: Password Guessing Resistant Protocol

2) Username-Password Pair Is Invalid: Upon entering an incorrect username-password pair, the user will not be asked to answer an ATT challenge in the following cases:
• A valid cookie is received from the user machine (i.e., the function Valid returns true) and the number of failed login attempts from the user machine's IP address for that username, FS[srcIP, un], is less than k1 (line 16) over a time period determined by t3.
• The user machine's IP address is in the whitelist W and the number of failed login attempts from this IP address for that username, FS[srcIP, un], is less than k1 (line 16) over a time period determined by t3.
• The username is valid and the number of failed login attempts (from any machine) for that username, FT[un], is below a threshold k2 (line 19) over a time period determined by t2.
A failed login attempt from a user with a valid cookie or in the whitelist W will not increase the total number of failed login attempts in the FT table, since it is expected that legitimate users may forget or mistype their password (lines 16-18). Nevertheless, if the user machine is identified by a cookie, a corresponding counter of the failed login attempts in the cookie will be updated. In addition, the FS entry indexed by the {source IP address, username} pair will also be incremented (line 17). Once the cookie counter or the corresponding FS entry hits or exceeds the threshold k1 (default value 30), the user must correctly answer an ATT challenge.

3) Output Messages: PGRP shows different messages in case of an incorrect {username, password} pair (lines 21 and 24) and an incorrect answer to the given ATT challenge (lines 14 and 26). While a human is shown that the entered {username, password} pair is incorrect, an automated program unwilling to answer the ATT challenge cannot confirm whether it is the pair or the ATT that was incorrect. However, while this is more convenient for legitimate users, it gives more information to the attacker about the answered ATTs. PGRP can be modified to display only one message in lines 14, 21, 24, and 26 (e.g., "login fails" as in the PS and VS protocols) to prevent such information leakage.

4) Why Not to Black-List Offending IP Addresses: We choose not to create a blacklist for IP addresses making many failed login attempts for the following reasons:
• this list may consume considerable memory;
• legitimate users from blacklisted IP addresses could be blocked (e.g., using compromised machines);
• hosts using dynamic IP addresses seem more attractive targets (compared to hosts with static IP addresses) for adversaries to launch their attacks from (e.g., spammers).
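The decision rules listed above for valid and invalid username-password pairs can be condensed into a short Python sketch. It is a simplified reading of the prose, not a transcription of the protocol figure: the function names `att_required` and `record_failed_attempt` are hypothetical, `known_machine` stands for "valid cookie or IP address in W", plain dictionaries replace the expiring tables, and incrementing FT for failures from unknown machines is an assumption inferred from the definition of FT.

```python
K1 = 30  # default failed-attempt threshold for known machines
K2 = 3   # default failed-attempt threshold per username


def att_required(known_machine, fs_count, ft_count, username_valid,
                 k1=K1, k2=K2):
    """Return True if the login attempt must first pass an ATT challenge."""
    if known_machine and fs_count < k1:
        return False            # identified machine still under its own limit
    if username_valid and ft_count < k2:
        return False            # few failures so far for this username
    return True


def record_failed_attempt(fs, ft, src_ip, un, known_machine, username_valid,
                          k1=K1, k2=K2):
    """Update the counters after an incorrect username-password pair."""
    if known_machine and fs.get((src_ip, un), 0) < k1:
        # Failures from known machines are counted per (srcIP, un) only.
        fs[(src_ip, un)] = fs.get((src_ip, un), 0) + 1
    elif username_valid and ft.get(un, 0) < k2:
        # Assumed: failures from unknown machines count against the username.
        ft[un] = ft.get(un, 0) + 1
```

For example, a botnet guessing passwords for one username from many unknown hosts gets three free guesses in total under the defaults, after which every further attempt requires an ATT regardless of the source address.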
If the cookie mechanism is not available for the login server, PGRP can operate by using only source IP addresses to keep track of user machines.

IV. METHODOLOGY

A. Existing Method

Restricting the number of failed trials without ATTs to a very small number, however, inconveniences the legitimate user, who then must answer an ATT on the next login attempt. Several other techniques are deployed in practice, including: allowing login attempts without ATTs from a different machine when a certain number of failed attempts occur from a given machine; allowing more attempts without ATTs after a time-out period; and time-limited account locking. Many existing techniques and proposals involve ATTs, with the underlying assumption that these challenges are sufficiently difficult for bots and easy for most people. However, users increasingly dislike ATTs as these are perceived as an (unnecessary) extra step; see Yan and Ahmad for usability issues related to commonly used CAPTCHAs.

B. Disadvantages

Due to successful attacks which break ATTs without human solvers, ATTs perceived to be more difficult for bots are being deployed. As a consequence of this arms race, present-day ATTs are becoming increasingly difficult for human users, fueling a growing tension between the security and usability of ATTs. Therefore, we focus on reducing user annoyance by challenging users with fewer ATTs, while at the same time subjecting bot logins to more ATTs, to drive up the economic cost to attackers. Two well-known proposals for limiting online guessing attacks using ATTs are Pinkas and Sander (herein denoted PS), and van Oorschot and Stubblebine (herein denoted VS). For convenience, a review of these protocols

C. Proposed Method

The PS proposal reduces the number of ATTs sent to legitimate users, but at some meaningful loss of security; for example, in an example setup (with p = 0.05, the fraction of incorrect login attempts requiring an ATT) PS allows attackers to eliminate 95 percent of the password space without answering any ATTs. The VS proposal reduces this, but at a significant cost to usability; for example, VS may require all users to answer ATTs in certain circumstances. The proposal in the present paper, called Password Guessing Resistant Protocol (PGRP), significantly improves the security-usability trade-off, and can be more generally deployed beyond browser-based authentication. PGRP builds on these two previous proposals. In particular, to limit attackers in control of a large botnet (e.g., comprising hundreds of thousands of bots), PGRP enforces ATTs after a few (e.g., three) failed login attempts are made from unknown machines. On the other hand, PGRP allows a high number (e.g., 30) of failed attempts from known machines without answering any ATTs. We define known machines as those from which a successful login has occurred within a fixed period of time. These are identified by their IP addresses saved on the login server as a white list, or by cookies stored on client machines.
A white-listed IP address and/or client cookie expires after a certain time.

D. Advantages

PGRP accommodates both graphical user interfaces (e.g., browser-based logins) and character-based interfaces (e.g., SSH logins), while the previous protocols deal exclusively with the former, requiring the use of browser cookies. PGRP uses either cookies or IP addresses, or both, for tracking legitimate users. Tracking users through their IP addresses also allows PGRP to increase the number of ATTs for password guessing attacks and meanwhile to decrease the number of ATTs for legitimate login attempts. Although NATs and web proxies may (slightly) reduce the utility of IP address information, in practice, the use of IP addresses for client identification appears feasible [4]. In recent years, the trend of logging in to online accounts through multiple personal devices (e.g., PCs, laptops, smart phones) is growing. When used from a home environment, these devices often share a single public IP address (i.e., a simple NAT address), which makes IP-based history tracking more user friendly than cookies. For example, cookies must be stored, albeit transparently to the user, in all devices used for login.
V. EXPERIMENT RESULT
In this section, we provide the details of our test setup, empirical results, and analysis of PGRP on two different data sets. PGRP results are also compared to those obtained from testing the PS and VS protocols on the same data sets.
VI. DATA SETS
We used two data sets from an operational university network environment. Each data set logs the events of a particular remote login service over a one-year period.
SSH server log. The first data set was a log file for an SSH server serving about 44 user accounts. The SSH server recorded details of each authentication event, including: date, time, authentication status (success, failed, or invalid username), username, source IP address, and source port. Log files were for the period of January 4, 2009 to January 22, 2010 (thus, slightly over one year). Table 4 shows that the majority of the login events (95 percent) are for invalid usernames, suggesting that most login attempts are due to SSH guessing attacks. Note that attack login attempts involving valid usernames are not distinguishable from incorrect logins by legitimate users, since there is no indication whether the source is malicious or benign. However, there were only a few failed login attempts for valid usernames, either over short bursts or over the whole log capture period. The number of invalid usernames that appear to be mistyped valid usernames represents less than one percent.
Fig. 3: Dataset
Email server log (web interface). The second data set consisted of log files of a Horde IMP email client for the period of January 15, 2009 to January 25, 2010. The Horde email platform is connected to an IMAP email server in a university environment. For each authentication event, a log entry contained: date, time, authentication status (success, failed, or invalid username), username, and source IP address. Although the number of registered user accounts on this server is 1,758, only 147 accounts were accessed. Compared to the SSH log, Table 4 shows that malicious login attempts are far less prevalent, at only about one percent. Login attempts with valid usernames generated by guessing attacks are, as above, not distinguishable. We were unable to determine the percentage of misspelled valid usernames since the log file data, including the usernames, was anonymized.
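The per-status percentages quoted above can be computed from the log fields listed for each data set. The sketch below is only an illustration: the `LoginEvent` record and the `status_shares` helper are hypothetical names, the status strings are assumed spellings, and the real log format is not given in the paper.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class LoginEvent:
    """One authentication event; the field names are assumptions."""
    timestamp: str   # date and time of the event
    status: str      # "success", "failed", or "invalid username"
    username: str
    source_ip: str


def status_shares(events):
    """Return the fraction of events recorded for each authentication status."""
    counts = Counter(event.status for event in events)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {status: n / total for status, n in counts.items()}
```

A large share for the "invalid username" status, as reported for the SSH log, is the signal read here as evidence of guessing attacks.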
A. Simulation Method and Assumptions
We performed a series of experiments with a Python-based implementation of PGRP with different settings of the configuration variables (k1, k2, t1, t2, and t3). The login events in each data set are ordered according to date (older entries first). Each event is processed by PGRP as if it runs in real time, with protocol tables updated according to the events. Since entries in the tables W, FT, and FS have write-expiry intervals, they are updated at each login event according to the date/time of the current event (i.e., the current time of the protocol is the time of the login event being processed). We assume that users always answer ATT challenges correctly. While some users will fail in answering some ATTs in practice (see, e.g., [3]), the percentage of failed ATTs depends on the mechanism used to generate the ATTs, the chosen challenge degree of difficulty (if configurable), and the type of the service and its users. The number of ATTs generated by the server can be adjusted accordingly; for example, if the probability of answering an ATT correctly is p, then the total number of generated ATTs must be multiplied by a factor of 1/p. Since no browser cookie mechanism was implemented in our tests, for either service in the data sets, the function Valid(cookie, un, k1, status) always returns false. In the absence of a browser cookie mechanism, a machine from which a user has previously logged in successfully would not be identified by the login system if the machine uses a different IP address that is not in W (see Section 3.3 for further discussion). Such legitimate users will be challenged with ATTs in this case. For a comparative analysis, we also implemented the PS and VS protocols under the same assumptions. The cookie mechanism in these protocols is replaced by IP address tracking of user machines, since cookies are not used in either data set. The probability p of the deterministic function is set to 0.05, 0.30, and 0.60 in each experiment. For VS, b1 and b2 are both set to 5 (10 was suggested as an upper bound for both b1 and b2).
Fig. 4: Result Table
B. Analysis of Results
In Fig. 4, we list the protocol parameter settings of eight experiments. For both the SSH and email data sets, the total number of ATTs that would be served over the log period and the maximum number of entries in the W, FT, and FS tables are reported. In the first five experiments, we change the parameter k2 from 0 to 4. k2 bounds the number of failed login attempts after which an ATT challenge will be triggered for the following login attempt. Note that the total number of ATTs served over the log period decreases slightly with a larger k2 for both data sets. Other parameters have minor effects on the number of ATTs served. The number of entries in W in the email data set is larger than in the SSH data set since there are more email users. Note that although the number of failed login attempts is larger in the SSH data set, the number of entries in FT is smaller than in the email data set, because the SSH data set has fewer usernames, with very few common usernames (e.g., common first or last names that can be used in brute force attacks). Given that the protocol requires an ATT for each failed login attempt from a source not in W (and with no valid cookie) when k2 is set to 0, the FT table is empty in the first experiment for both data sets (as the second condition in line 19 is always false).
VII. CONCLUSION AND FUTURE WORKS
Online password guessing attacks on password-only systems have been observed for decades (see, e.g., [21]). Present-day attackers targeting such systems are empowered by having control of thousand- to million-node botnets. In previous ATT-based login protocols, there exists a security-usability trade-off with respect to the number of free failed login attempts (i.e., with no ATTs) versus user login convenience (e.g., fewer ATTs and other requirements). In contrast, PGRP is more restrictive against brute force and dictionary attacks while safely allowing a large number of free failed attempts for legitimate users. Moreover, the adversary is expected to need to correctly answer about N/2 ATTs in order to guess a password correctly, as opposed to (1/2)pN in the PS protocol. Our empirical experiments on two data sets (of one-year duration) gathered from operational network environments show that while PGRP is apparently more effective in preventing password guessing attacks (without answering ATT challenges), it also offers a more convenient login experience, e.g., fewer ATT challenges for legitimate users even if no cookies are available. However, we reiterate that no user testing of PGRP has been conducted so far. PGRP appears suitable for organizations with both small and large numbers of user accounts. The required system resources (e.g., memory space) are linearly proportional to the number of users in a system. PGRP can also be used with remote login services where cookies are not applicable (e.g., SSH and FTP).
REFERENCES
[1] Amazon Mechanical Turk, https://www.mturk.com/mturk/, June 2010.
[2] S.M. Bellovin, "A Technique for Counting Natted Hosts," Proc. ACM SIGCOMM Workshop Internet Measurement, pp. 267-272, 2002.
[3] E. Bursztein, S. Bethard, J.C. Mitchell, D. Jurafsky, and C. Fabry, "How Good Are Humans at Solving CAPTCHAs? A Large Scale Evaluation," Proc. IEEE Symp. Security and Privacy, May 2010.
[4] M. Casado and M.J. Freedman, "Peering through the Shroud: The Effect of Edge Opacity on IP-Based Client Identification," Proc. Fourth USENIX Symp. Networked Systems Design and Implementation (NSDI 07), 2007.
[5] S. Chiasson, P.C. van Oorschot, and R. Biddle, "A Usability Study and Critique of Two Password Managers," Proc. USENIX Security Symp., pp. 1-16, 2006.
[6] D. Florencio, C. Herley, and B. Coskun, "Do Strong Web Passwords Accomplish Anything?," Proc. USENIX Workshop Hot Topics in Security (HotSec 07), pp. 1-6, 2007.
[7] K. Fu, E. Sit, K. Smith, and N. Feamster, "Dos and Don'ts of Client Authentication on the Web," Proc. USENIX Security Symp., pp. 251-268, 2001.
[8] P. Hansteen, "Rickrolled? Get Ready for the Hail Mary Cloud!," http://bsdly.blogspot.com/2009/11/rickrolled-get-ready-for-hail-mary.html, Feb. 2010.
[9] Y. He and Z. Han, "User Authentication with Provable Security against Online Dictionary Attacks," J. Networks, vol. 4, no. 3, pp. 200-207, May 2009.
[10] T. Kohno, A. Broido, and K.C. Claffy, "Remote Physical Device Fingerprinting," Proc. IEEE Symp. Security and Privacy, pp. 211-225, 2005.
[11] M. Motoyama, K. Levchenko, C. Kanich, D. McCoy, G.M. Voelker, and S. Savage, "Re: CAPTCHAs: Understanding CAPTCHA-Solving Services in an Economic Context," Proc. USENIX Security Symp., Aug. 2010.
[12] C. Namprempre and M.N. Dailey, "Mitigating Dictionary Attacks with Text-Graphics Character Captchas," IEICE Trans. Fundamentals of Electronics, Comm. and Computer Sciences, vol. E90-A, no. 1, pp. 179-186, 2007.
[13] A. Narayanan and V. Shmatikov, "Fast Dictionary Attacks on Human-Memorable Passwords Using Time-Space Tradeoff," Proc. ACM Computer and Comm. Security (CCS 05), pp. 364-372, Nov. 2005.
[14] Nat'l Inst. of Standards and Technology (NIST), Hashbelt, http://www.itl.nist.gov/div897/sqg/dads/HTML/hashbelt.html, Sept. 2010.
[15] "The Biggest Cloud on the Planet Is Owned by ... the Crooks," NetworkWorld.com, http://www.networkworld.com/community/node/58829, Mar. 2010.
[16] J. Nielsen, "Stop Password Masking," http://www.useit.com/alertbox/passwords.html, June 2009.
[17] B. Pinkas and T. Sander, "Securing Passwords against Dictionary Attacks," Proc. ACM Conf. Computer and Comm. Security (CCS 02), pp. 161-170, Nov. 2002.
[18] D. Ramsbrock, R. Berthier, and M. Cukier, "Profiling Attacker Behavior following SSH Compromises," Proc. 37th Ann. IEEE/IFIP Int'l Conf. Dependable Systems and Networks (DSN 07), pp. 119-124, June 2007.
[19] SANS.org, "Important Information: Distributed SSH Brute Force Attacks," SANS Internet Storm Center Handler's Diary, http://isc.sans.edu/diary.html?storyid=9034, June 2010.
[20] "The Top Cyber Security Risks," SANS.org, http://www.sans.org/top-cyber-security-risks/, Sept. 2009.
[21] C. Stoll, The Cuckoo's Egg: Tracking a Spy through the Maze of Computer Espionage. Doubleday, 1989.
[22] "Botnet Pierces Microsoft Live through Audio Captchas," TheRegister.co.uk, http://www.theregister.co.uk/2010/03/22/microsoft_live_captcha_bypass/, Mar. 2010.
[23] P.C. van Oorschot and S. Stubblebine, "On Countering Online Dictionary Attacks with Login Histories and Humans-in-the-Loop," ACM Trans. Information and System Security, vol. 9, no. 3, pp. 235-258, 2006.
[24] L. von Ahn, M. Blum, N. Hopper, and J. Langford, "CAPTCHA: Using Hard AI Problems for Security," Proc. Eurocrypt, pp. 294-311, May 2003.
[25] M. Weir, S. Aggarwal, M. Collins, and H. Stern, "Testing Metrics for Password Creation Policies by Attacking Large Sets of Revealed Passwords," Proc. 17th ACM Conf. Computer and Comm. Security, pp. 162-175, 2010.
[26] Y. Xie, F. Yu, K. Achan, E. Gillum, M. Goldszmidt, and T. Wobber, "How Dynamic Are IP Addresses?," SIGCOMM Computer Comm. Rev., vol. 37, no. 4, pp. 301-312, 2007.
[27] J. Yan and A.S.E. Ahmad, "A Low-Cost Attack on a Microsoft CAPTCHA," Proc. ACM Computer and Comm. Security (CCS 08), pp. 543-554, Oct. 2008.
[28] J. Yan and A.S.E. Ahmad, "Usability of CAPTCHAs or Usability Issues in CAPTCHA Design," Proc. Symp. Usable Privacy and Security (SOUPS 08), pp. 44-52, July 2008.
Sreepathy Journal of Computer Science and Engg.,Vol 1, Issue 1, June 2014
Intelligent Image Interpreter
Jayasree N V
Dept. of Computer Science and Engg, SIMAT, Vavanoor, Palakkad
jayasree.nv@simat.ac.in
Abstract—The intrinsic information present in an image is very hard for a computer to interpret. Many approaches focus on finding the salient object in an image. Here we propose a novel architectural approach to find the relations between salient objects using local and global analysis. The local analysis focuses on salient object detection with efficient relation mining in the context of the image being processed. For an effective global analysis, we create an ontology tree by considering a wide set of natural images. From these natural images we create an affinity-based ontology graph; with the help of this local and global contextual graph we construct an annotated parse tree. The resulting tree is helpful in large-scale image search. Our proposal can thus advance content-based image retrieval and image interpretation problems.
Keywords—Ontology graph, Intelligent interpretation, Local analysis, Global analysis, Global knowledge base.
I. INTRODUCTION
INTERPRETATION is the act of analyzing data and drawing conclusions from it. Nowadays we have to work with huge amounts of data, including text, video, and audio. The fundamental issue is therefore to make decisions based on the appearance, context, or other factors present in the raw data. The symbolic interpretation of images is a challenging task. A large body of literature describes the application of symbolic computation, such as natural language processing, to decision-making systems. Here we present a state-of-the-art solution for such decision systems, based on a step-by-step refinement of an image or video. The refinement process includes all the symbolic interpretations of the image, such as object detection, grammatical arrangement of objects, semantic annotation of objects, and global analysis of objects based on a global knowledge base.
II. LITERATURE REVIEW
Jiang et al. [7] propose an algorithm for salient region detection that integrates three important visual properties of an image: uniqueness, focusness, and objectness (UFO). Uniqueness captures the appearance-derived visual contrast; focusness reflects the fact that salient regions are often photographed in focus; and objectness accounts for the completeness of the detected salient region. While uniqueness has long been used for saliency detection, integrating focusness and objectness for this purpose is new. In fact, focusness and objectness both provide important saliency information complementary to uniqueness. Humans can prioritize external visual stimuli and quickly localize what interests them most in a scene. How to simulate this capability with a computer, i.e., how to identify the most salient pixels or regions in a digital image that first attract human visual attention, has therefore become an important task in computer vision. Further, the results of saliency detection can be used to facilitate other computer vision tasks such as image resizing, thumbnailing, image segmentation, and object detection. Due to its importance, saliency detection has received intensive research attention, resulting in many recently proposed algorithms. The majority of those algorithms are based on low-level features of the image, such as appearance uniqueness at the pixel or superpixel level. One basic idea is to derive the saliency value from the local contrast of various channels, i.e., in terms of uniqueness. While uniqueness often helps generate good saliency detection results, it sometimes produces high values for non-salient regions, especially for regions with complex structures. As a result, it is desirable to integrate complementary cues to address the issue.
Detecting visually salient regions in images is one of the fundamental problems in computer vision [2]. Cheng et al. [2] propose a method to decompose an image into large-scale, perceptually homogeneous elements for efficient salient region detection, using a soft image abstraction representation. By considering both the appearance similarity and the spatial distribution of image pixels, the proposed representation abstracts out unnecessary image details, allowing the assignment of comparable saliency values across similar regions and producing perceptually accurate salient region detection. The approach is evaluated on the largest publicly available dataset with pixel-accurate annotations. The soft image abstraction captures large-scale perceptually homogeneous elements, thus enabling effective estimation of global saliency cues.
Unlike previous techniques that rely on superpixels for image abstraction, the method uses histogram quantization to collect appearance samples for a global Gaussian Mixture Model (GMM) based decomposition. Components sharing the same spatial support are further grouped to provide a more compact and meaningful representation. This soft abstraction avoids the hard decision boundaries of superpixels, allowing abstraction components with very large spatial support, so that the subsequent global saliency cues can uniformly highlight entire salient object regions. Finally, the two global saliency cues, Global Uniqueness (GU) and Color Spatial Distribution (CSD), are integrated by automatically identifying which one is more likely to provide the correct identification of the salient region.
© Dept. of Computer Science & Engg., Sreepathy Institute of Management And Technology, Vavanoor, Palakkad, 679533.
Semantic-based image retrieval has attracted great interest in recent years [6]. Liu et al. [6] propose a region-based image retrieval system with high-level semantic learning. The key features of the system are: 1) It supports both query by keyword and query by region of interest. The system segments an image into different regions and extracts low-level features of each region. From these features, high-level concepts are obtained using a proposed decision tree-based learning algorithm named DT-ST. During retrieval, a set of images whose semantic concept matches the query is returned. Experiments on a standard real-world image database confirm that the proposed system significantly improves the retrieval performance compared with a conventional content-based image retrieval system. 2) The proposed decision tree induction method DT-ST for image semantic learning differs from other decision tree induction algorithms in that it makes use of semantic templates to discretize continuous-valued region features, and so avoids the difficult image feature discretization problem. Furthermore, it introduces a hybrid tree simplification method to handle the noise and tree fragmentation problems, thereby improving the classification performance of the tree.
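The idea of discretizing continuous region features with semantic templates can be illustrated with a short Python sketch. It assumes, for illustration only, that a semantic template is the mean feature vector of sample regions for a concept and that a region is assigned the label of its nearest template; the function names and the example values are hypothetical and are not taken from [6].

```python
import math


def build_template(samples):
    """Assumed definition: a template is the mean of the sample feature vectors."""
    n = len(samples)
    return tuple(sum(column) / n for column in zip(*samples))


def nearest_template(feature, templates):
    """Map a continuous feature vector to the label of the closest template."""
    return min(templates, key=lambda label: math.dist(feature, templates[label]))


# Hypothetical colour templates (normalized RGB) built from sample regions.
templates = {
    "grass": build_template([(0.1, 0.7, 0.2), (0.3, 0.7, 0.2)]),
    "sky": build_template([(0.4, 0.6, 0.9), (0.4, 0.6, 0.9)]),
}
label = nearest_template((0.25, 0.65, 0.3), templates)
```

The resulting discrete labels, rather than the raw feature values, would then serve as attribute values when the decision tree is induced.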
Fig. 1: Architecture of our Automatic Image Annotation process.
Olszewska [8] proposes an approach for automatic image annotation (AIA) that automatically and efficiently assigns linguistic concepts to visual data such as digital images, based on both numeric and semantic features. The method first computes multi-layered active contours. The first-layer active contour corresponds to the main object or foreground, while the next-layer active contours delineate the object's subparts. Then, visual features are extracted within the regions segmented by these active contours and are mapped into semantic notions. Next, decision trees are trained based on these attributes, and the image is semantically annotated using the resulting decision rules. Experiments carried out on several standard datasets have demonstrated the reliability and the computational effectiveness of the AIA system. The method is thus a fully automatic image annotation method based on efficiently implemented active contours and decision trees. The approach consists of the automatic recursive segmentation of the image in multiple layers using multi-feature active contours, and the automatic semantic labeling of the image based on decision trees. The multi-layered active contours are used 1) to precisely and automatically segment the image into background and semantically meaningful foreground regions and 2) to extract coherent and semantically meaningful subregions of the extracted main object. While being an unsupervised segmentation technique, the multi-layered multi-feature active contour approach does not use any prior knowledge about the foreground, unlike top-down segmentation methods, and reaches a semantically coherent segmentation of the objects more accurately than bottom-up segmentation techniques and faster than combined ones. The segmentation method also provides the background region. However, in this work only the information about the main object and its subparts is exploited, in order to train the corresponding decision trees and automatically label the dataset images in a more computationally efficient way than background-based systems. The AIA system illustrated in Fig. 1 performs both the automatic visual segmentation of the image and its automatic semantic annotation. The main steps of the process are the multi-layered partition of the image into background, foreground, and the foreground's semantically meaningful subregions; the extraction of the corresponding metric features from these delineated regions, together with the definition of the semantic attributes based on the visual features; and the labeling of the image, followed by the final online annotation of the image using offline-trained decision trees.
A new framework for annotating images automatically using ontologies [11] is described here. An ontology is constructed holding characteristics from multiple information sources, including text descriptions and low-level image features. Image annotation is implemented as a retrieval process by comparing an input (query) image with representative images of all classes.
Handling uncertainty in class descriptions is a distinctive feature of SIA. Average Retrieval Rank (AVR) is applied to compute the likelihood that the input image belongs to each of the ontology classes. SIA is a complete prototype system for image annotation. Given a query image as input, SIA computes its description, consisting of a class name and the description of this class. This description may be augmented by class (ontology) properties depicting shape, size, color, and texture (e.g., "has long hair", "small size"). The system consists of several modules. The image ontology has two main components, namely the class hierarchy of the image domain and the descriptions hierarchy [5]. Various associations between concepts or features in the two parts are also defined.
Class hierarchy: The class hierarchy of the image domain is generated based on the respective noun hierarchy of Wordnet. In this work, a class hierarchy for dog breeds is constructed (e.g., dog, working group, Alsatian). The leaf classes in the hierarchy represent the different semantic categories of the ontology (i.e., the dog breeds). A leaf class (i.e., a dog breed) may also be represented by several image instances to handle variations in scaling and posing. For example, in SIA the leaf class Labrador has 6 instances.
Descriptions hierarchy: Descriptions are distinguished into high-level and low-level descriptions. High-level descriptions are further divided into concept descriptions (corresponding to the glosses of Wordnet categories) and visual text descriptions (high-level narrative information). The latter are descriptions that humans would give to images and are further specialized based on animal shape and size properties (i.e., small, medium, and big). The low-level descriptions hierarchy represents features extracted by 7 image descriptors. Because an image class is represented by more than one image instance (6 in this work), each class is represented by a set of 7 features for each image instance. An association between image instances and low-level features is also defined, denoting the existence of such features (e.g., "hasColorLayout", "hasCEDD"). Fig. 2 illustrates part of the SIA ontology (not all classes and class properties are shown).
Fig. 2: Part of the SIA ontology.
The input image may contain several regions, some of which may be more relevant to the application than others. In this work, the dog's head is chosen as the most representative part of a dog image for further analysis. This task is implemented by manual Region of Interest (ROI) placement (the user drags a rectangle around a region), followed by background subtraction using GrabCut and noise reduction.
III. PROPOSED METHOD
The different modules here are:
• Image Acquisition
• Salient Object Detection
• Syntax Tree Generation
• Ontology Tree Generation
• Semantic Tree Generation
The architecture modules of the Intelligent Image Interpreter system can be viewed as much the same as those of a compiler, which include lexical analysis, syntax analysis, semantic analysis, intermediate code generation, etc., as shown in Fig. 3.
Fig. 3: System Architecture
A. Image Acquisition
In the image acquisition phase, the image to be annotated is captured using a digital camera and converted into machine-readable form. This is the input collection stage, similar to a program that is given as the input to a compiler. The program will first be converted to a machine-friendly form such as ASCII characters.
B. Salient Object Detection
Humans can quickly prioritize external visual stimuli and localize what interests them most in a scene. How to simulate this capability with a computer, i.e., how to identify the most salient pixels or regions in a digital image that first attract human visual attention, has therefore become an important task in computer vision. In this phase, the captured image is analyzed and the module returns the salient objects identified in it. This can be compared to the tokens that a compiler's lexical analyzer returns as lexemes from the input program; here the salient objects are identified and returned from the input image.
C. Syntax Tree Generation
This phase can be compared with the syntax analysis phase of a compiler, which generates a parse tree from the lexemes. It checks only the syntax; no semantics are analyzed here. We create a syntax tree from the input image by analyzing the image locally. The local analysis is done with the salient objects returned from phase II. This module returns a syntax tree, which is much the same as the parse tree in the second phase of a compiler.
D. Ontology Tree Generation
In this phase, phase I of the global analysis is completed by creating a knowledge base from a set of possible natural images. Salient regions are detected in these images, and an ontology tree is constructed from the set of natural images already stored in the system knowledge base. The module returns an ontology tree that is used as the input to phase V. It shows the relationships between the salient objects, found by performing data mining operations.
E. Semantic Tree Generation
The final phase of the project creates an annotated parse tree based on the syntax tree and ontology tree created in the previous phases. The syntax tree returned from phase III is analyzed with the help of the ontology tree returned from phase IV to interpret its meaning and construct an annotated parse tree.
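The compiler analogy of phases III to V can be sketched in Python as a small data structure. Everything below is a hypothetical illustration rather than the system's actual design: the `Node` class, the triple format for local relations, and the ontology represented as a dictionary from object pairs to meanings are all assumptions.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Parse-tree node; `annotation` is filled in by the semantic phase."""
    label: str
    children: list = field(default_factory=list)
    annotation: str = ""


def build_syntax_tree(salient_objects, relations):
    """Local analysis: one subtree per relation between detected objects."""
    root = Node("image")
    for subject, relation, obj in relations:
        if subject in salient_objects and obj in salient_objects:
            root.children.append(Node(relation, [Node(subject), Node(obj)]))
    return root


def annotate(tree, ontology):
    """Semantic phase: attach the ontology's meaning to each relation node."""
    for relation_node in tree.children:
        pair = (relation_node.children[0].label, relation_node.children[1].label)
        relation_node.annotation = ontology.get(pair, "unknown")
    return tree


# Hypothetical example: tokens from phase II, one local relation, a tiny ontology.
objects = {"person", "horse"}
tree = build_syntax_tree(objects, [("person", "above", "horse")])
annotated = annotate(tree, {("person", "horse"): "riding"})
```

Here the syntax tree records only that a person appears above a horse; the interpretation "riding" comes from the ontology, mirroring how semantic analysis annotates a parse tree in a compiler.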
IV. PROPOSED ALGORITHM
• Capture the image (I) using a high-quality digital camera.
• Detect the salient objects using an edge detection algorithm and deep-level segmentation. Salient objects: P1, P2, P3.
• Create a grammatical rule based on fuzzy inference: IF X P1 && X M EM P2 THEN P1 R E
• Perform local analysis on the image to obtain a meaningful structure of the objects and the relations that exist among them.
• Create an ontology tree for a set of images using global analysis.
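The last step, building a global ontology from a set of images, could for instance be driven by how often salient objects appear together. The sketch below is one possible reading and not the paper's method: it assumes each image has already been reduced to a list of salient object labels, and it uses the Jaccard index of co-occurrence as the affinity measure, which is a choice made here for illustration.

```python
from collections import Counter
from itertools import combinations


def affinity_graph(labelled_images):
    """Return {(a, b): affinity} for label pairs (a < b) that co-occur."""
    label_counts = Counter()
    pair_counts = Counter()
    for labels in labelled_images:
        unique = sorted(set(labels))
        label_counts.update(unique)
        pair_counts.update(combinations(unique, 2))
    # Jaccard index: images containing both labels / images containing either.
    return {
        (a, b): n / (label_counts[a] + label_counts[b] - n)
        for (a, b), n in pair_counts.items()
    }


# Hypothetical knowledge base of three labelled images.
graph = affinity_graph([
    ["person", "horse"],
    ["person", "horse", "tree"],
    ["person", "car"],
])
```

Pairs with high affinity are candidates for edges in the ontology graph, which the local syntax tree can then be checked against.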
V. CONCLUSION
In this work, the problem of image interpretation using semantics is investigated, and a new method is proposed for interpreting the semantics of a given image and constructing a semantic tree structure based on the salient regions in the image. The output of the system is an annotated semantic tree from which relationships between objects can be found, and this can be extended to implement security systems. The intelligent image interpreter can be deployed in security systems such as those used in ATMs and safety lockers.
REFERENCES
[1] Ying Liu, Dengsheng Zhang, Guojun Lu, Wei-Ying Ma, "A survey of content-based image retrieval with high-level semantics", 2012.
[2] Ming-Ming Cheng, Jonathan Warrell, Wen-Yan Lin, Shuai Zheng, Vibhav Vineet, Nigel Crook, "Efficient Salient Region Detection with Soft Image Abstraction", Vision Group, Oxford Brookes University, 2011.
[3] Radhakrishna Achanta, Sheila Hemami, Francisco Estrada, and Sabine Susstrunk, "Frequency-tuned Salient Region Detection", 2010.
[4] Hao Fu, Guoping Qiu, "Fast Semantic Image Retrieval Based on Random Forest", 2012.
[5] Ming-Ming Cheng, Guo-Xin Zhang, "Global contrast based salient region detection", 2011.
[6] Ying Liu, Dengsheng Zhang, Guojun Lu, "Region-based image retrieval with high-level semantics using decision tree learning", 2007.
[7] Peng Jiang, Haibin Ling, Jingyi Yu, Jingliang Peng, "Salient Region Detection by UFO: Uniqueness, Focusness and Objectness", 2013.
[8] Joanna Isabelle Olszewska, "Semantic, Automatic Image Annotation Based On Multi-Layered Active Contours and Decision Trees", 2013.
[9] Dengsheng Zhang, Md Monirul Islam, Guojun Lu and Jin Hou, "Semantic Image Retrieval Using Region Based Inverted File", 2009.
[10] Wei Wang, Yuqing Song and Aidong Zhang, "Semantics Retrieval by Content and Context of Image Regions", 2012.
Sreepathy Journal of Computer Science and Engg.,Vol 1, Issue 1, June 2014
Self Interpretive Algorithm Generator
Jayanthan K S
Dept. of Computer Science and Engg, SIMAT, Vavanoor, Palakkad
jayanthan.ks@simat.ac.in
Abstract—Thinking is a complex procedure which is necessary to deal with a complex world. Machines that help us handle this complexity can be regarded as intelligent tools, tools that support our thinking capabilities. The need for such intelligent tools will grow as new forms of complexity evolve in our increasingly globally networked information society. Algorithms are considered the key component of any problem-solving activity. For machines to be capable of intelligently predigesting information, they should perform in a way similar to the way humans think. Humans have developed sophisticated methods for dealing with the world's complexity, and it is worthwhile to adopt some of them. Sometimes the principles of human thinking are combined with the functional principles of biological cell systems. Here the consideration of human thinking is extended to the interpretive capacity of the machine. This interpretive capacity measures the capacity of the machine to generate an algorithm. Many definitions of an algorithm are based on the single idea that input is converted to output in a finite number of steps. An algorithm is a step-by-step procedure to complete a task. It is any set of detailed instructions which results in a predictable end state from a known beginning. This project is an attempt to have a machine generate algorithms without human intervention. Evolutionary computation is a major tool that is competitive with humans in many different areas of problem solving. Here, a Genetically Evolved Cellular Automaton is used to give the system the self-interpretive capacity to generate algorithms.
Keywords—Genetically evolved cellular automata, Genetic algorithm, Symbolic knowledge base, Problem solving
I. INTRODUCTION
Computer science concentrates mainly on the development of algorithms to solve complex problems. An algorithm is a step-by-step procedure to complete a task: any set of detailed instructions that results in a predictable end state from a known beginning. Algorithms are only as good as the instructions given; the result will be incorrect if the algorithm is not defined properly. Algorithm development requires a huge amount of human effort, and automation is the use of information technology to reduce human effort. Every computer program is simply a series of instructions, of varying complexity, listed in a specific order and designed to perform a specific task. Mathematics also uses algorithms to solve equations by hand, without the use of a calculator. One good example is the human brain: most conceptions of the human brain define all behavior, from the acquisition of food to falling in love, as the result of a complex algorithm. The capabilities of today's machines are still far from the capabilities of humans. As natural computing has not yet reached its limits, existing approaches combined with new concepts and ideas promise significant progress within the next few years.
The field of artificial intelligence encompasses several approaches to modeling natural thinking. They include semantic networks, Bayesian networks, neural networks and, most prominently, cellular automata. Cellular automata are tools for computing complex situations. The idea behind cellular automata, or cellular machines, is quite natural: neighboring objects influence each other. A cellular automaton uses a large number of so-called cells in a regular geometrical arrangement. The cells usually have discrete states that are influenced by their relationships with their neighbors. Like many natural processes that are spatially extended, cellular automata configurations often organize over time into spatial regions that are dynamically homogeneous. In space-time diagrams these regions are sometimes obvious to the eye as domains: regions in which the same pattern appears. These domain patterns are described using deterministic finite automata.
Problems in computing may be classified by the dimensionality of their input. Here a set of numbers is considered a one-dimensional input, and an image or a graph a two-dimensional input. The system accepts the input and output as cellular automata entities and finds the emergence of the output from the input using a genetically evolved cellular automata rule. Previous experiments have demonstrated that rules found by a genetic algorithm can perform better than rules designed by humans.
II. LITERATURE SURVEY
A cellular automaton (CA) is a mathematical machine or tool that lends itself to some very remarkable and beautiful ideas [12]. It is a discrete dynamical system that performs computations in a finely distributed fashion on a spatial grid. Cellular automata are often described as a counterpart to partial differential equations, which describe continuous dynamical systems. Discrete means that space, time and the properties of the automaton can take only a finite, countable number of states. The basic idea is not to describe a complex system from "above" using difficult equations, but to simulate it through the interaction of cells following simple rules. The basic element of a CA is the cell [28]. A cell is a kind of memory element that stores states. In the simplest case, each cell has one of the binary states 1 or 0. In a more complex simulation the cells can have more states. It is even conceivable that each cell has more than one property or attribute, each of which can take two or more states. The cells are the common elements that make the CA a widely applicable tool. These cells are arranged in a spatial web, a lattice. The simplest is the one-dimensional lattice, in which all cells are arranged in a line like a string of pearls. The most common CAs are built in one or two dimensions. The one-dimensional CA has the big advantage that it is very easy to visualize: the states of one time step are plotted in one dimension, and the dynamic development is shown in the second dimension. A flat plot of a one-dimensional CA hence shows the states from time step 0 to time step n. A two-dimensional plot of a two-dimensional CA can evidently show only the state of one time step, so visualizing the dynamics of a 2D CA is more difficult. For these reasons, and because 1D CAs are generally easier to handle and their rules are comparatively simple, most available theoretical papers deal with properties of 1D CAs.
The idea behind genetic algorithms (GAs) is to extract the optimization strategies that nature uses successfully, known as Darwinian evolution, and transform them for application in mathematical optimization theory to find the global optimum in a defined phase space. One can imagine a population of individual explorers sent into the optimization phase space. Each explorer is defined by its position inside the phase space, which is coded in its genes. Every explorer has the duty to find a value for the quality of its position in the phase space.
Natural Language Processing (NLP) is a subfield of artificial intelligence and linguistics devoted to making computers understand statements written in human languages. A natural language is a language spoken or written by humans for general-purpose communication [24]. The models developed by NLP are useful for writing computer programs that perform useful tasks involving language processing, and thereby for gaining a better understanding of human communication. To some extent these goals are complementary.
© Dept. of Computer Science & Engg., Sreepathy Institute of Management And Technology, Vavanoor, Palakkad, 679533.
A better understanding of human communication is the major goal of all Natural Language Processing systems. The process of building computer programs that understand natural language involves three major problems: the first relates to the thought process, the second to the representation and meaning of the linguistic input, and the third to world knowledge. Thus, an NLP system may begin at the word level to determine the morphological structure and nature of the word, then move on to the sentence level to determine the word order, grammar and meaning of the entire sentence, and then to the context and the overall environment or domain.
III. METHODOLOGY
A GA was used to evolve CAs for two computational tasks: density classification and synchronization. In both cases, the GA discovered rules that gave rise to sophisticated emergent computational strategies [3]. The GA worked by evolving the CA rule table and the number of iterations that the model was to run. After the final chromosomes were obtained for all shapes, the CA model was run starting from a single cell in the middle of the lattice until the allowed number of iterations was reached and a shape was formed. In all cases, the mean fitness values of the evolved chromosomes were above 80%.
The Density Classification Task (DCT) is one of the most studied examples of collective computation in cellular automata [10]. The goal is to find a binary CA rule that best classifies the majority state in a randomized initial configuration (IC). If the majority of cells in the IC are in the quiescent (active) state, then after a number of time steps M the lattice should converge to a homogeneous state in which every cell is in the quiescent (active) state. Since the outcome could be undecidable in lattices with an even number of cells (N), the task is applied only to lattices with an odd number of cells. Devising CA rules that perform this task is not trivial, because cells in a CA lattice update their states based only on local neighborhood information, whereas the task requires information to be transferred across time and space to achieve a correct global classification. The definition of the DCT used in our studies is the same as the one by Mitchell et al.
The density classification task has been studied for many years. In this task, a one-dimensional binary CA is initialized with a random IC and iterated for a maximum number of steps I or until a fixed point is reached [27]. If the IC contains more ones than zeros, the CA is deemed to have solved the task if a fixed point of all ones is reached, and vice versa. The situation where the IC contains an equal number of ones and zeros is excluded by using an odd number of cells. This is a difficult task for a CA because a solution requires coordinating the global state of the system while using only the local communication between cells provided by the neighborhood. For this reason, the density classification task is widely used as a standard test function to explore CA behavior [12]. The ability of a particular Evolutionary Cellular Automaton (EvCA) to solve the density classification task depends on the IC.
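The density classification task described above can be illustrated with a minimal Python sketch. It is not the implementation used in this work: the function names, the ring (periodic) boundary, and the local majority rule used as a baseline are all assumptions made for illustration. The sketch iterates a one-dimensional binary CA to a fixed point and estimates performance as the fraction of randomly sampled ICs that are classified correctly.

```python
import random

def step(cells, rule_table, radius):
    """One synchronous update of a 1D binary CA on a ring (assumed boundary)."""
    n = len(cells)
    new_cells = []
    for i in range(n):
        index = 0
        # Read the neighbourhood left to right as a binary number.
        for offset in range(-radius, radius + 1):
            index = (index << 1) | cells[(i + offset) % n]
        new_cells.append(rule_table[index])
    return new_cells

def classify(ic, rule_table, radius, max_steps):
    """Iterate until a fixed point or max_steps; return 1, 0, or None."""
    cells = list(ic)
    for _ in range(max_steps):
        nxt = step(cells, rule_table, radius)
        if nxt == cells:
            break
        cells = nxt
    if all(c == 1 for c in cells):
        return 1
    if all(c == 0 for c in cells):
        return 0
    return None  # no homogeneous state reached

def performance(rule_table, radius, n_cells, n_samples, max_steps, rng):
    """Fraction of random ICs (n_cells odd) classified correctly."""
    correct = 0
    for _ in range(n_samples):
        ic = [rng.randint(0, 1) for _ in range(n_cells)]
        majority = 1 if 2 * sum(ic) > n_cells else 0
        if classify(ic, rule_table, radius, max_steps) == majority:
            correct += 1
    return correct / n_samples

def local_majority_rule(radius):
    """Hypothetical baseline: each cell adopts its neighbourhood majority."""
    size = 2 ** (2 * radius + 1)
    return [1 if bin(i).count("1") > radius else 0 for i in range(size)]

rule = local_majority_rule(1)
print(classify([1, 1, 1, 0, 0], rule, 1, 10))  # None: stuck in a mixed fixed point
print(performance(rule, 1, 9, 100, 20, random.Random(0)))
```

The first call shows why the task is hard: a purely local majority vote freezes into blocks of ones and zeros instead of reaching a homogeneous state, so a rule that solves the task must move information across the lattice.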
Intuitively, ICs containing many ones or many zeros are closer in Hamming distance to one of the solution fixed points, making it easier for a CA to iterate to the correct fixed point than from an IC containing a more or less equal mix of ones and zeros. For this reason, the performance of a CA on the density classification task is estimated by sampling many ICs generated from a known distribution. Performance is then the fraction of ICs for which the CA reaches the correct fixed point. It has been proven that no binary CA solves the density classification task for all possible ICs [19] [20]. Thus, a binary CA can solve the problem only for specific ICs or to a particular degree over multiple ICs [21]. Generating ICs with an equal probability of each cell being in the one or zero state produces a binomial distribution of densities.
The bitmap problem and the checkerboard problem are widely discussed by Breukelaar et al. [23]. The bitmap problem is defined as follows: given an initial state and a specific desired end state, find the rule that iterates from the initial state to the desired end state in fewer than I iterations.
IV. ALGORITHM
Algorithm: Cellular automata based algorithm
Input: input image I(x, y) and output image O(x, y)
Output: the transformation rule or algorithm
1. Convert the input and output images to bit patterns by comparing I(x, y) and O(x, y) with a threshold: $I(x, y), O(x, y) \equiv B(I(x, y)), B(O(x, y))$, where $B(I(x, y))$ and $B(O(x, y))$ are the binary image values.
2. Generate the genetically evolved population of cellular automata rules, Rand[(0, 1), Rulesize], with $\text{Rulesize} = 2^{n}$ and total number of population $= 2^{\text{Rulesize}}$, where n is the number of neighbors.
3. Find the transformation for each rule using the Moore neighborhood (refer Fig. 7). The selection is based on the work of Olivera et al. [11]. I(x, y) depends on I(x + 1, y), I(x, y + 1), I(x, y − 1), I(x − 1, y), I(x + 1, y + 1), I(x − 1, y − 1), I(x + 1, y − 1), I(x − 1, y + 1).
4. Apply the cellular automata rule to the input image until it converges to the output image in the minimum number of iterations:
   while B(I(x, y)) != B(O(x, y))
       apply new transformations
   end
5. Interpret the transformations using a symbolic knowledge base: $\forall\, \text{rule}(\text{rulebit}) \Rightarrow$ interpretation of transformations.
6. Using the intelligently interpreted transformations, transform the rules formed above into a natural language based algorithm: apply mathematical tools $\forall$ transformation to generate the algorithm.
Fig. 1: Initial Stage.
Fig. 2: Moore Neighborhood.
Fig. 3: Algorithm Generation.
Fig. 4: GUI for input and output image.
Fig. 5: Plot for Cellular automata rule.
Fig. 6: Plot for rule transformation.
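Steps 1, 3 and 4 of the algorithm can be sketched in Python as follows. This is only an illustrative sketch under stated assumptions, not the MATLAB implementation used in this work: the function names, the bit order of the eight Moore neighbours, the zero padding outside the image border, and the example rule are all hypothetical. With n = 8 neighbours the rule table has 2^8 = 256 entries, matching Rulesize in step 2.

```python
# Offsets (dy, dx) of the eight Moore neighbours, in an assumed fixed bit order.
MOORE = [(-1, -1), (-1, 0), (-1, 1), (0, -1), (0, 1), (1, -1), (1, 0), (1, 1)]

def binarize(image, threshold):
    """Step 1: threshold a grey-level image into a binary image."""
    return [[1 if value > threshold else 0 for value in row] for row in image]

def ca_step(grid, rule):
    """Step 3: apply a 256-entry rule table once to every cell."""
    height, width = len(grid), len(grid[0])
    out = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            index = 0
            for dy, dx in MOORE:
                yy, xx = y + dy, x + dx
                inside = 0 <= yy < height and 0 <= xx < width
                bit = grid[yy][xx] if inside else 0  # assumed zero padding
                index = (index << 1) | bit
            out[y][x] = rule[index]
    return out

def iterations_to_target(source, target, rule, max_iters):
    """Step 4: number of iterations until the target appears, else None."""
    grid = source
    for t in range(max_iters + 1):
        if grid == target:
            return t
        grid = ca_step(grid, rule)
    return None

# Hypothetical example rule: a cell becomes 1 if any neighbour is 1.
any_neighbour = [0] + [1] * 255

source = [[0, 0, 0], [0, 1, 0], [0, 0, 0]]
ring = [[1, 1, 1], [1, 0, 1], [1, 1, 1]]
print(iterations_to_target(source, ring, any_neighbour, 5))  # 1
```

In a genetic search, the value returned by `iterations_to_target` could serve as part of the fitness of a candidate rule, since the algorithm prefers rules that reach the output image in the fewest iterations.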
V. IMPLEMENTATION
The proposed method is implemented using MATLAB and Prolog. The cellular automata are implemented with the RBN toolbox. The symbolic knowledge base is implemented in Prolog. Mathematica is used to generate the algorithm from the transformations.
VI. CONCLUSION
This paper proposes a method that gives an insight into symbolic and non-symbolic knowledge. In this work, genetically evolved cellular automata outperform other soft computing paradigms. Better rule transformation can give more accurate results. Here the Prolog knowledge base plays the role of a compiler. This paper focuses on the intelligent interpretation of systems that traditionally rely on human knowledge for their working. The proposed method can be used in robot navigation, stock exchange signal valuation and other real-world problems with specified input and output. Future work will focus mainly on the wide area of application-specific method generation.
Fig. 7: Rule Generator.
REFERENCES
[1] Adam Callahan. Genetic Algorithm for Evolving CA to Solve the Majority Problem. August 6, 2009.
[2] Adriana Popovici and Dan Popovici. Cellular Automata in Image Processing.
[3] Ajith Abraham, Nadia Nedjah, Luzia de Macedo Mourelle. Evolutionary Computation: from Genetic Algorithms to Genetic Programming. Springer Verlag Berlin Heidelberg, 2006.
[4] A. M. Turing. Computing Machinery and Intelligence. Mind, New Series, Vol. 59, No. 236 (Oct. 1950), pp. 433-460.
[5] Carlos A. Coello Coello. An Introduction to Evolutionary Algorithms and Their Applications. CINVESTAV-IPN Evolutionary Computation Group.
[6] Christopher R. Houck, Jeffery A. Joines, Micheal G. Kay. A Genetic Algorithm for Function Optimization: A Matlab Implementation.
[7] David Andre, Forrest H Bennett, John R. Koza. Discovery by Genetic Programming of a Cellular Automata Rule that is Better than any Known Rule for the Majority Classification Problem. Stanford University.
[8] Eric Cantu Paz. A Summary of Research on Parallel Genetic Algorithms. University of Illinois Genetic Algorithm Laboratory.
[9] Franciszek Seredynski, Pascal Bouvry. Multiprocessor Scheduling Algorithms Based on Cellular Automata Training.
[10] G. Binnig, M. Baatz, J. Clenk, G. Schmidt. Will Machines Start to Think Like Humans? Artificial versus Natural Intelligence.
[11] Gina M B. Olivera, Luiz G A. Martins, Laura B. de Carvalho, Enrich Fynn. Some Investigations About Synchronization and Density Classification Tasks in One Dimensional and Two Dimensional Cellular Automata Rule Space. Elsevier Electronic Notes in Theoretical Computer Science 252 (2009), pp. 121-142.
[12] Herald Niesche. Introduction to Cellular Automata. Organic Computing, SS2006.
[13] James P. Crutchfield, Melanie Mitchell. The Evolution of Emergent Computation. National Academy of Science.
[14] Jan Paredis. Coevolving Cellular Automata: Be Aware of the Red Queen. University of Maastricht.
[15] John R. Koza, Forrest H Bennett, David Andre, Martin A. Keane. Four Problems for which a Computer Program Evolved by Genetic Programming is Competitive with Human Performance. IEEE, 1996.
[16] Lu Yuming, Li Ming, Li Ling. Cellular Genetic Algorithms with Evolutional Rule. IEEE, 2009.
[17] Melanie Mitchell, James P. Crutchfield, Rajarshi Das. Evolving Cellular Automata with Genetic Algorithms: A Review of Recent Work. EvCA96.
[18] Micheal O Neill, Leonardo Vanneschi, Steven Gustafson, Wolfgang Banzhaf. Open Issues in Genetic Programming. Springer, 14 May 2010.
[19] Niloy Ganguly, Pradipta Maji, Sandip Dhar, Biplab K. Sikdar, P. Pal Chaudhuri. Evolving Cellular Automata as Pattern Classifier. Springer Verlag Berlin Heidelberg, 2002.
[20] Niloy Ganguly, Biplab K. Sikdar, Andreas Deutsch, Geofferey Canright, P. Pal Chaudhuri. A Survey on Cellular Automata. Centre for High Performance Computing, 2004.
[21] Rajarshi Das, Melanie Mitchell, James P. Crutchfield. A Genetic Algorithm Discovers Particle Based Computation in Cellular Automata. Parallel Problem Solving from Nature (PPSN). Berlin: Springer Verlag.
[22] R. Breukelaar, Th. Back. Using a Genetic Algorithm to Evolve Behavior in Multi Dimensional Cellular Automata. GECCO 05, June 25-29, ACM, 2005.
[23] Ron Breukelaar, Thomas Back. Evolving Transition Rules for Multi Dimensional Cellular Automata. ACRI 2004, LNCS 3305, pp. 182-191, Springer-Verlag Berlin Heidelberg, 2004.
[24] Sang Ho Shin, Kee-Young Yoo. Analysis of 2 State, 3 Neighbourhood Cellular Automata Rules for Cryptographic Pseudorandom Number Generation. IEEE Computer Society Press, 2009.
[25] Stuart Bain, John Thornton, Abdul Sattar. Methods of Automatic Algorithm Generation. Institute for Integrated and Intelligent Systems.
[26] Sinisa Petric. Painterly Rendering Using Cellular Automata. SIGMAPI DESIGN.
[27] S. Wolfram. A New Kind of Science. Wolfram Media, Inc., 2002.
[28] Arun P.V., S.K. Katiyar. Automatic Object Extraction from Satellite Images using Cellular Automata Based Algorithm. IEEE TGRS, 50(3)-2, pp. 92-102, 2012.
Sreepathy Journal of Computer Science and Engg.,Vol 1, Issue 1, June 2014
Natural Language Generation from Ontologies Manu Madhavan,
Dept. of Computer Science and Engg, SIMAT, Vavanoor, Palakkad manumadhavan@simat.ac.in
Abstract—Natural Language Generation is the task of generating natural language text suitable for human consumption from a machine representation of facts, which can be pre-structured in some linguistically amenable fashion or completely unstructured. An ontology is a formal, explicit description of concepts in a domain of discourse, and can be considered a formal knowledge repository that serves as a resource for NLG tasks. A domain ontology provides the input for the content determination and micro-planning stages of NLG. A linguistic ontology can be used for lexical realization. The logically structured organization of knowledge within an ontology makes it possible to perform reasoning tasks such as consistency checking, concept satisfiability, concept subsumption and instance checking. These logical inference actions are applied to derive descriptive texts from ontologies as answers to user queries. Thus a simple natural language based Question Answering system can be implemented, guided by robust NLG techniques that act upon ontologies. Some of the tools for constructing ontologies (Protege, Natural OWL, etc.) and their combination with the NLG process are also discussed.
Keywords—Natural Language Generation (NLG), Ontology, Protege, Question Answering.
I. INTRODUCTION
Natural Language Generation (NLG) is a young and fascinating area of Computational Linguistics. It is the intelligent process of generating information in some natural language. The most important capability that makes humans intelligent is common sense. It has been accepted that without substantial bodies of background information concerning commonsense, everyday knowledge about the world, or detailed information concerning particular domains of application, it will not be possible to construct systems that can support the use of natural language. Hence the development of NLG systems has reached the stage where concentrated efforts are necessary in the area of representing more 'abstract', more 'knowledge'-related bodies of information. Systems need to represent concrete details of the 'worlds' that their texts describe: for example, the resolution of anaphors, the induction of text coherence by recognizing regularities present in the world and not in the text, and the recognition of plans by knowing what kinds of plans make sense for speakers and hearers in real situations all require world modeling to various depths. This need creates two interrelated problem areas [3]. The first problem is how knowledge of the world is to be represented. The second problem is how such organizations of knowledge are to be related to linguistic levels of organization such as grammar and lexicons. For both problem areas, ontologies for NLG have been suggested as a potential solution.
With the advent of the "Semantic Web Vision" [6], ontologies have become the formalism of choice for knowledge representation and reasoning. Usually, ontologies are authored to represent real-world knowledge in terms of concepts, individuals and relations in some variety of Description Logic. With suitable interpretation, one can consider an ontology to be an organized knowledge repository that can serve as an input to NLG systems [9].
The remainder of this section gives a general introduction to NLG, ontologies and QA systems. Section II gives the details of the literature survey and some related work. The concept of ontology and the tools and approaches for developing an ontology are described in Section III. The task of generating natural language from an ontology and extending it into a QA system is described in the next two sections. Finally, a summary of the discussion and some directions for future work are given.
A. Natural Language Generation
Natural language generation is the process of converting an input knowledge representation into an expression in natural language (either text or speech) according to the application. The input to the system is a four-tuple [7], (K, C, U, D), where K is the knowledge source, a database of world knowledge; C is the communication goal, specified independently of the language being used; U is the user model on which the system is based; and D is the discourse history, which deals with the ordering of information in the output text. Probabilistic models are most commonly used in the generation process. The output is natural language text, which can be followed by a speech synthesizer according to the application. In most systems, NLG is achieved by a pipeline of tasks: document planning, micro-planning and surface realization. This architecture is represented in Figure 1.
In the first stage, the system identifies the message from a non-linguistic representation of the concept and represents it as a text plan. The micro-planning module selects the lexical items that can be used to express the message in natural language. It also performs the necessary sentence aggregation and pronominalization to improve readability. The final step applies grammar rules to the micro plan to produce a syntactically and semantically valid output sentence.
Fig. 1: NLG Architecture
B. Ontology
The concept of ontology is adapted from philosophy. Considering it the first philosophy, Aristotle defined ontology as an explicit formal specification of how to represent the objects, concepts and other entities that are assumed to exist in some area of interest and the relationships that hold among them. According to Webster's Revised Unabridged Dictionary [18], the word ontology means: "That department of the science of metaphysics which investigates and explains the nature and essential properties and relations of all beings, as such, or the principles and causes of being". An ontology is an explicit specification of a conceptualization. A conceptualization is an abstract, simplified view of the world that we wish to represent for some purpose. The common application of ontology in NLG is domain modeling. The ontology acts as the knowledge base for generating messages. It represents the different objects in a world by a set of hierarchies and relations. In CL applications, this provides a common vocabulary for the domain of interest.
C. Question Answering Systems
Question Answering (QA) is the task of automatically deriving an answer to a question posed in natural language. A good definition of a QA system is as follows: "A Question Answering system is an information retrieval application whose aim is to provide inexperienced users with flexible access to information, allowing them to write a query in natural language and obtaining not a set of documents that contain the answer, but the concise answer itself" [6]. In this work, the queries put by the user in natural language are used as the content determiner for NLG. The output is the answer to the user query, in natural language. The work therefore involves all the processes of natural language understanding, ontology and NLG to perform the task of QA.
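The three-stage pipeline of Section I-A (document planning, micro-planning, surface realization) can be made concrete with a toy Python sketch. Everything in it is an assumption made for illustration: the fact triples, the `LEXICON` table, the function names, and the very simple pronominalization rule are hypothetical and do not describe any particular NLG system.

```python
from dataclasses import dataclass

@dataclass
class Message:
    subject: str
    relation: str
    obj: str

# Hypothetical lexicon mapping relation names to English verb phrases.
LEXICON = {"is_a": "is a kind of", "made_from": "is made from"}

def document_plan(facts, goal):
    """Content determination: keep only the facts about the requested subject."""
    return [Message(*fact) for fact in facts if fact[0] == goal]

def micro_plan(messages):
    """Lexical choice plus a simple pronominalization of repeated subjects."""
    planned = []
    for position, message in enumerate(messages):
        subject = message.subject if position == 0 else "it"
        planned.append((subject, LEXICON[message.relation], message.obj))
    return planned

def realize(planned):
    """Surface realization: order the words, capitalize and punctuate."""
    sentences = []
    for subject, verb_phrase, obj in planned:
        sentence = f"{subject} {verb_phrase} {obj}."
        sentences.append(sentence[0].upper() + sentence[1:])
    return " ".join(sentences)

def generate(facts, goal):
    return realize(micro_plan(document_plan(facts, goal)))

# Toy knowledge source (illustrative facts only).
facts = [
    ("Bordeaux", "is_a", "wine"),
    ("Bordeaux", "made_from", "grapes"),
    ("Merlot", "is_a", "grape"),
]
print(generate(facts, "Bordeaux"))
# Bordeaux is a kind of wine. It is made from grapes.
```

In a QA setting, the `goal` argument would be derived from the user's question, so that the question acts as the content determiner for the generator.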
II. RELATED WORK
The use of ontologies for natural language processing is an interesting area of Knowledge Representation. Researchers began constructing domain ontologies in the early 1980s. The success of PENNMAN [13], a text generation system, is a milestone in this development. The PENNMAN system consists of a knowledge acquisition system, a text planner and a language generation system that uses a large systemic grammar of English. ONTOGENERATION [5] reuses domain and linguistic ontologies for text generation. It proposes a general approach to reusing domain and linguistic ontologies with natural language generation technology and describes a practical system for the generation of Spanish texts in the domain of chemical substances. It uses a Generalized Upper Model (GUM) for NLG.
The OntoSum project aims at developing new methods and techniques for ontology learning; ontology coordination, mapping and merging; and ontology-based summarization [5]. The OntoSum methodology consists of the following steps:
• Specification of the theoretical framework
• Development of methods and of a prototype system
• Specification of the evaluation methodology
• Exploitation and evaluation of the developed methods and system in two case studies: (a) in the context of the e-Centric EXODUS platform for document management, and (b) in biomedical applications that are being developed in DEVLAB of Dartmouth College
• Dissemination of results
An ontology-based approach to question answering is discussed by Gyawali [9], who also describes a generalized architecture for it. The thesis identifies a set of factoid questions that can be asked of a domain ontology. The ideas for developing a sample ontology are taken from [14]. An ontology-based multilingual sentence generator [15] for English, Spanish, Japanese and Chinese combined example-based, rule-based and statistical components.
This generator also provided application-driven generation of sentences using feature-based grammars. SimpleNLG [1] is a realization engine for English that aims to provide simple and robust interfaces to generate syntactic structures and linearize them. John A Batesman suggested a systemic functional grammar (SFG) for representing the semantic features of a sentence. Based on SFG, a system network is developed and used as an internal representation for NLG [15]. In automatic story generation using ontology [10], a rule-based model generates the language and an ontology-based internal representation checks the semantic and pragmatic validity of the generated sentence.
A. Applications of NLG
• Canned Text Generation: Sometimes the general form of the sentences or their constructions in a text is sufficiently invariant that it can be predetermined and stored as a text string. For example, a compiler shows the line in which an error occurred. This approach to generation is called canned text. It is the simplest case of generation, in which the text is produced not by the system but by the author of the program [9].
• Weather Forecasting Systems: Generate textual weather forecasts from representations of graphical weather maps.
• Machine Translation: NLG can be considered a translation process that converts a non-linguistic input representation into language-specific output. If the system can generate this input representation from a source language, multilingual translation can be achieved efficiently [13].
• Authoring Tools: NLG technology can also be used to build authoring aids, systems that help people create routine documents.
• Text Summarization: NLG can be extended to automatic summary generation in the medical field, news analysis, etc.
• Question Answering: QA is the task of automatically answering a question posed in natural language. To generate an answer to a question, a QA program may use either a pre-structured database or a collection of natural language documents.
III. ONTOLOGY ENGINEERING AND TOOLS
This section discusses the formal definition of an ontology, the approaches for constructing ontologies, and ontology engineering tools.
A. Definition
Formally, an ontology is defined by a seven-tuple as follows [9]:
$O = (C, H_C, R_C, H_R, I, R_I, A)$
An ontology O consists of the following. The concepts C of the schema are arranged in a subsumption hierarchy $H_C$. Relations $R_C$ exist between concepts. Relations (properties) can also be arranged in a hierarchy $H_R$. Instances I of a specific concept are interconnected by property instances $R_I$. Additionally, one can define axioms A, which can be used to infer knowledge from already existing knowledge.
With a formal Description Logic based knowledge representation scheme and the support of complex reasoners that can perform inference on the knowledge specified within an ontology, the uses of ontologies have broadened from the purely theoretical inquiry initially carried out within Artificial Intelligence to practical applications by domain experts across heterogeneous fields.
Classes are the focus of most ontologies. Classes describe concepts in the domain. For example, a class of wines represents all wines. Specific wines are instances of this class. The Bordeaux wine in the glass in front of you while you read this document is an instance of the class of Bordeaux wines. A class can have subclasses that represent concepts that are more specific than the superclass.
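As a rough illustration of the seven-tuple $O = (C, H_C, R_C, H_R, I, R_I, A)$ and of the wine example above, the following Python sketch stores each component in a plain container and answers concept subsumption and instance checking queries. The class, its field names, and the restriction to a single, acyclic superconcept per concept are assumptions made for brevity; this is not how OWL reasoners are implemented.

```python
from dataclasses import dataclass, field

@dataclass
class Ontology:
    concepts: set                      # C
    concept_hierarchy: dict            # H_C: concept -> direct superconcept
    relations: dict                    # R_C: relation name -> (domain, range)
    relation_hierarchy: dict = field(default_factory=dict)  # H_R
    instances: dict = field(default_factory=dict)           # I: instance -> concept
    property_instances: set = field(default_factory=set)    # R_I: (subject, relation, object)
    axioms: list = field(default_factory=list)               # A (not used in this sketch)

    def subsumes(self, general, specific):
        """Concept subsumption: walk up H_C from `specific` (assumed acyclic)."""
        current = specific
        while current is not None:
            if current == general:
                return True
            current = self.concept_hierarchy.get(current)
        return False

    def is_instance_of(self, instance, concept):
        """Instance checking via the instance's direct concept and H_C."""
        direct = self.instances.get(instance)
        return direct is not None and self.subsumes(concept, direct)

# Hypothetical wine ontology mirroring the example in the text.
wine = Ontology(
    concepts={"Wine", "Bordeaux", "Winery"},
    concept_hierarchy={"Bordeaux": "Wine"},
    relations={"producedBy": ("Wine", "Winery")},
    instances={"wine_in_my_glass": "Bordeaux"},
)
print(wine.subsumes("Wine", "Bordeaux"))                # True
print(wine.is_instance_of("wine_in_my_glass", "Wine"))  # True
```

Answers to such queries are the kind of inferred facts that a generator could verbalize as natural language sentences.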
B. Ontology and NLP
Most systems that currently deal with NLP already adopt some kind of ontology for their more abstract levels of information. However, theoretical principles for the design and development of ontologies meeting the goals of generality and detail remain weak. This is due not only to a lack of theoretical accounts at these more abstract levels of information, but also to the co-existence of a range of sometimes poorly differentiated functions that such bodies of information are expected to fulfill. The following list gives an idea of the range of functions adopted in NLP. Ontologies are often expected to fulfill at least one (and often more) of [4]:
• organizing 'world knowledge'
• organizing the world itself
• organizing 'meaning' or 'semantics' of natural language expressions
• providing an interface between system external components, domain models etc. and NLP linguistic components
• ensuring expressibility of input expressions
• offering an interlingua for machine translation
• supporting the construction of 'conceptual dictionaries'
C. Types of Ontologies
NLP applications use an ontology for the representation of world knowledge in a formal way, without including the linguistic details. Another kind of ontology explicitly contains the linguistic details for the system. Each of these variants has been adopted in some system where a concrete ontology has been attempted. This gives rise to three distinct kinds of ontology that can be found in NLP work [4]:
• Conceptual Ontology: an abstract organization of real-world knowledge (commonsense or otherwise) that is essentially non-linguistic.
• Mixed Ontology: an abstract semantico-conceptual representation of real-world knowledge that also functions as a semantics for use of grammar and lexis.
• Interface Ontology: an abstract organization underlying our use of grammar and lexis that is separate from the conceptual, world knowlege ontology, but which acts as an interface between grammar and lexis and that ontology. D. Ontology Representation 1) OWL: An ontology language is a formal language used to encode the ontology[14]. They are usually declarative languages, are almost always generalizations of frame languages, and are commonly based on either first-order logic or on description logic. The Web Ontology Language (OWL) is a family of knowledge representation languages for authoring ontologies. The languages are characterised by formal semantics and RDF/XML-based serializations for the Semantic Web. Current ontologies are authored in one of several varieties of OWL based representation, namely OWL FULL, OWL DL and OWL Lite. The choice among these varieties for representing a knowledge base will eventually affect the expressiveness of Vol.1, Issue.1, June-2014
the ontology and whether or not reasoning algorithms will be able to guarantee completeness and/or decidability.
2) RDF: RDF (Resource Description Framework) [17] can be used to describe ontology metadata. The RDF data model is similar to classic conceptual modeling approaches such as entity-relationship or class diagrams, as it is based upon the idea of making statements about resources (in particular Web resources) in the form of subject-predicate-object expressions. These expressions are known as triples in RDF terminology. The subject denotes the resource, and the predicate denotes traits or aspects of the resource and expresses a relationship between the subject and the object.
3) SPARQL: The predominant query language for RDF graphs is SPARQL [17]. SPARQL is an SQL-like language, and has been a W3C recommendation since January 15, 2008. SPARQL allows users to write unambiguous queries. For example, the following query returns the names and emails of every person in the dataset:
    PREFIX foaf: <http://xmlns.com/foaf/0.1/>
    SELECT ?name ?email
    WHERE {
        ?person a foaf:Person .
        ?person foaf:name ?name .
        ?person foaf:mbox ?email .
    }
This query can be distributed to multiple SPARQL endpoints (services that accept SPARQL queries and return results), computed, and the results gathered, a procedure known as federated query.
E. Ontology Tools
Protégé, OilEd, Apollo, RDFedt, OntoLingua, OntoEdit, WebODE, KAON, ICOM, DOE and WebOnto are some of the ontology development tools available for research work. Medius Visual Ontology Modeler, LinKFactory Workbench and K-Infinity are some commercially available ontology tools.
1) Protégé: Protégé [16] is an open-source tool developed at Stanford Medical Informatics. Like most other modeling tools, the architecture of Protégé is cleanly separated into a "model" part and a "view" part. The Protégé model is the internal representation mechanism for ontologies and knowledge bases.
Protégé's view components provide a user interface to display and manipulate the underlying model. Protégé's model is based on a simple yet flexible metamodel, which is comparable to object-oriented and frame-based systems. It can represent ontologies consisting of classes, properties (slots), property characteristics (facets and constraints), and instances. Protégé provides an open Java API to query and manipulate models. An important strength of Protégé is that the Protégé metamodel itself is a Protégé ontology, with classes that represent classes, properties, and so on. For example, the default class in the Protégé base system is called :STANDARD-CLASS, and has properties such as :NAME and :DIRECT-SUPERCLASSES. This structure of the metamodel enables easy extension and adaptation to other representations.
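The subject-predicate-object triples and the pattern-based query described under RDF and SPARQL above can be illustrated with a few lines of plain Python. This is only a sketch under stated assumptions: the triple list, the prefixed names (ex:alice, ex:bob) and the helper functions match and query are invented for illustration, and the matcher covers only conjunctions of triple patterns, not the SPARQL language or any real RDF library.

```python
# Illustrative only: a tiny in-memory triple list, not a real RDF store.
# All resource names below are made-up sample data.
TRIPLES = [
    ("ex:alice", "rdf:type", "foaf:Person"),
    ("ex:alice", "foaf:name", "Alice"),
    ("ex:alice", "foaf:mbox", "mailto:alice@example.org"),
    ("ex:bob", "rdf:type", "foaf:Person"),
    ("ex:bob", "foaf:name", "Bob"),
    ("ex:pizza", "rdf:type", "owl:Class"),
]


def match(pattern, triple, bindings):
    """Unify one triple pattern with one triple.

    Terms starting with '?' are variables. Returns an extended copy of
    `bindings`, or None when the triple does not fit the pattern.
    """
    result = dict(bindings)
    for term, value in zip(pattern, triple):
        if term.startswith("?"):
            # Bind the variable, or check it against an earlier binding.
            if result.setdefault(term, value) != value:
                return None
        elif term != value:
            return None
    return result


def query(patterns, triples):
    """Return every set of variable bindings that satisfies all patterns."""
    solutions = [{}]
    for pattern in patterns:
        next_solutions = []
        for bindings in solutions:
            for triple in triples:
                extended = match(pattern, triple, bindings)
                if extended is not None:
                    next_solutions.append(extended)
        solutions = next_solutions
    return solutions


# Same shape as the SPARQL example: persons with both a name and a mailbox.
people = query(
    [
        ("?person", "rdf:type", "foaf:Person"),
        ("?person", "foaf:name", "?name"),
        ("?person", "foaf:mbox", "?email"),
    ],
    TRIPLES,
)
```

In this sample data only ex:alice has all three properties, so ex:bob is dropped by the third pattern, which mirrors how every triple pattern in a WHERE clause must hold for a result row to be returned.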
Protégé 3.4 alpha is used in this work for demonstration.
IV. NLG FROM ONTOLOGY
A less conventional approach to utilizing ontologies in computational linguistics is in the area of Natural Language Generation (NLG). In this approach, an ontology is considered as a formal knowledge repository which can be utilized as a resource for NLG tasks. The objective, then, is to generate a linguistically interesting and relevant descriptive text summarizing part or all of the concisely encoded knowledge within the given ontology. It has been argued that ontologies contain linguistically interesting patterns in the choice of words they use for representing knowledge, and that this in itself makes the task of mapping from ontologies to natural language easier. The present work builds upon this line of thought. It aims at utilizing ontologies for the sake of NLG and seeks to justify the motive and rationale for doing so. Further, it identifies a set of generic questions that are suitable to be asked concerning an input ontology. The logically structured manner of knowledge organization within an ontology makes it possible to perform reasoning tasks such as Consistency Checking, Concept Satisfiability, Concept Subsumption and Instance Checking [9]. These types of logical inference motivate a set of natural language questions which can be asked. NLG, in turn, is applied to derive descriptive texts as answers to such queries. This eventually helps in implementing a simple natural language Question Answering system guided by robust NLG techniques that act upon ontologies.
A. Architecture
The architecture of the present work combines the NLG pipeline discussed in Chapter 1 with some other steps. NLG is achieved here mainly through the following four steps [9]:
• Extracting the knowledge from the ontology
• Document Planning
• Micro Planning
• Realisation
The detailed architecture is shown in Figure 2.
Fig. 2: NLG Architecture
In this work, the ontology is constructed using the Protégé tool. The sample ontology used for the explanation is pizza.owl, which represents the details of the different pizzas available. The structure of the pizza ontology is given in Figure 3. From Protégé, the abstract syntax of the OWL ontology can be obtained in RDF format. This abstract syntax representation is semi-linguistic in nature, which helps in extracting the content for the NLG task.
Fig. 3: Pizza Ontology
B. Extracting Data from Ontology
The input to the NLG system comes from the knowledge base. In this case, the input is obtained from the ontology (pizza.owl). From the Protégé-OWL interface, the abstract syntax of the ontology is obtained, represented in RDF. As mentioned earlier, this representation is semi-linguistic. The purpose of this stage is to extract axioms from this RDF representation. A sample RDF representation is given in Figure 4.
Fig. 4: RDF Representation
The corresponding axiom will be SubClassOf(<pizza> <Food>), which is more intuitive, cleaner and easier for further processing. The available axioms in the ontology are then classified into the following categories, and all of the axioms corresponding to each category are retrieved [9]:
1) Subsumers
   a) Stated Subsumers with Named Concept
   b) Stated Subsumers with Property Restriction
   c) Implied Subsumers with Named Concept
2) Equivalents
   a) Stated Equivalents with Property Restriction
   b) Stated Equivalents with Enumeration
   c) Stated Equivalents with Cardinality Restriction
   d) Stated Equivalents with Set Operator
   e) Implied Equivalents with Named Concept
3) Disjoints
4) Siblings
Subsumer axioms state that a concept (the child concept) inherits properties from some other concept (the parent concept) in the ontology. Typically, such axioms begin with the word "SubClassOf". For a given concept, its subsumer can either be a named concept in the ontology or an anonymous concept. An anonymous concept is an unnamed concept which represents a new set of individuals, obtained after the specified property restrictions are applied to the specified named concept in the ontology. In addition to the explicitly stated subsumers (both named and anonymous) for a given concept in the ontology, the reasoner is used to infer its additional named subsumers.
Equivalent axioms state that the concepts involved describe exactly the same set of individuals. Typically, such axioms begin with the word "EquivalentClasses". For a given concept, its equivalent concept can either be a named concept in the ontology or an anonymous concept. Here, the anonymous concepts can be defined in terms of property restrictions, cardinality restrictions, enumeration or set operations over some named concepts in the ontology.
Disjoint axioms state that the concepts involved have no individuals in common. Typically, such axioms begin with the word "DisjointClasses".
Concepts are considered siblings of each other when they are classified as subconcepts under the same parent concept, i.e., if a concept X has children Y and Z, then Y is a sibling of Z and vice versa (Y and Z are also siblings of themselves).
C. Document Planning
The task of the document planner is to determine 'what to say' and 'how to say it'. This is achieved through two subtasks: content determination and document structuring. The input to this stage is the set of axioms from the ontology, and the output is a text (document) plan.
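The set of axioms that forms the input to this stage, grouped by the categories of Section B, can be sketched as follows. This is a minimal illustration and not the implementation used in this work: the axiom strings are hand-written samples in the SubClassOf(...) notation used above, the function names are hypothetical, only the stated (not the reasoner-inferred) categories are covered, and every anonymous concept is simply labelled a property restriction.

```python
from collections import defaultdict

# Hand-written sample axioms (assumed data, loosely based on the pizza ontology).
AXIOMS = [
    "SubClassOf(<Pizza> <Food>)",
    "SubClassOf(<Margherita> <NamedPizza>)",
    "SubClassOf(<Napoletana> <NamedPizza>)",
    "SubClassOf(<Margherita> ObjectSomeValuesFrom(<hasTopping> <TomatoTopping>))",
    "EquivalentClasses(<CheeseyPizza> ObjectSomeValuesFrom(<hasTopping> <CheeseTopping>))",
    "DisjointClasses(<Margherita> <Napoletana>)",
]

CATEGORY_BY_KEYWORD = {
    "SubClassOf": "Subsumers",
    "EquivalentClasses": "Equivalents",
    "DisjointClasses": "Disjoints",
}


def split_axiom(axiom):
    """Return (keyword, first concept name, remaining argument text)."""
    keyword, _, body = axiom.partition("(")
    body = body[:-1]  # drop the closing ')' of the outer axiom
    first, _, rest = body.partition(" ")
    return keyword, first.strip("<>"), rest


def classify(axioms):
    """Group axioms by category, using the leading keyword of each axiom."""
    categories = defaultdict(list)
    for axiom in axioms:
        keyword, _, rest = split_axiom(axiom)
        category = CATEGORY_BY_KEYWORD[keyword]
        if category != "Disjoints":
            # A bare <Name> is a named concept; anything else is anonymous.
            kind = "Named Concept" if rest.startswith("<") else "Property Restriction"
            category = f"Stated {category} with {kind}"
        categories[category].append(axiom)
    return dict(categories)


def siblings(axioms):
    """Map each named parent to its children when it has more than one."""
    children = defaultdict(set)
    for axiom in axioms:
        keyword, child, rest = split_axiom(axiom)
        if keyword == "SubClassOf" and rest.startswith("<"):
            children[rest.strip("<>")].add(child)
    return {p: sorted(kids) for p, kids in children.items() if len(kids) > 1}


by_category = classify(AXIOMS)
sibling_groups = siblings(AXIOMS)
```

Here Margherita and Napoletana come out as siblings because both are stated subclasses of NamedPizza. The "Implied" categories would additionally need the output of a reasoner, which this sketch does not model.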
1) Content Determination: Content determination is the process of selecting axioms from the ontology according to the goal. Identifying the content for generation is not an easy task. In an ontology-based model, the axioms act as the messages for the NLG system. There are two approaches to content selection: top-down and bottom-up. "Top-down problems" need to identify specific contents that can address a specific goal, while "bottom-up problems" have the more diffuse goal of identifying contents that can produce a general expository or descriptive text.
2) Document Structuring: This stage organizes the different messages (axioms) identified during content determination. Its output is the structure of the text to be generated, which is called the text plan. The text plan should consider the rhetorical relations between the message units. The pragmatic and discourse structure are also considered in the text plan. The text plan is represented as a tree, where the leaves are axioms and the intermediate nodes are the category names of the axioms. A sample text plan is given in Figure 5.
Fig. 5: Text Plan
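A text plan of the kind described above, a tree whose intermediate nodes are category names and whose leaves are axioms, can be sketched as a nested structure. The representation below is an assumption made for illustration: the dictionary layout, the fixed category order and the function names are hypothetical, and rhetorical relations between messages are not modelled.

```python
# Hypothetical category order, used here only to fix the order of the tree nodes.
CATEGORY_ORDER = ["Subsumers", "Equivalents", "Disjoints", "Siblings"]


def build_text_plan(concept, axioms_by_category):
    """Build a tree: root (concept) -> category nodes -> axiom leaves.

    Empty categories are skipped. A category missing from CATEGORY_ORDER
    raises ValueError, which keeps the sketch honest about its limits.
    """
    ordered = sorted(
        (c for c in axioms_by_category if axioms_by_category[c]),
        key=CATEGORY_ORDER.index,
    )
    return {
        "topic": concept,
        "children": [
            {"category": c, "leaves": list(axioms_by_category[c])} for c in ordered
        ],
    }


def leaves_in_order(plan):
    """Flatten the tree into the order in which messages would be verbalised."""
    return [leaf for node in plan["children"] for leaf in node["leaves"]]


plan = build_text_plan(
    "Margherita",
    {
        "Disjoints": ["DisjointClasses(<Margherita> <Napoletana>)"],
        "Subsumers": [
            "SubClassOf(<Margherita> <NamedPizza>)",
            "SubClassOf(<Margherita> ObjectSomeValuesFrom(<hasTopping> <TomatoTopping>))",
        ],
        "Equivalents": [],
    },
)
```

Walking the leaves from left to right gives the sequence of messages handed to micro planning, with all subsumer axioms verbalised before the disjointness axiom.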
D. Micro Planning
Micro planning comprises lexicalization and aggregation.
1) Pre-Lexicalization: In NLG, lexicalization is the task of identifying lexical items (words of natural language) that will serve to build up natural language sentences. The lexical items are the vocabulary for the sentence. During the lexicalization phase, we identify the lexical items that will help in mapping the factual knowledge within each category of OWL axioms to natural language text. In this approach lexicalization is carried out in two phases: pre-lexicalization and lexicalization proper. In the pre-lexicalization stage, an initial plan of the sentence structure is prepared, and a preliminary choice of lexical items is made for each category of the statements presented in section 4.2. It is easy to notice that the statements retrieved in content determination are semi-linguistic in nature. Each example has statements which contain concepts and relations that are either legitimate words of English (e.g., Pizza, Food, Country) or concatenations of legitimate words of English (e.g., hasBase, MeatyPizza, PizzaTopping). Based on such observations, it is feasible to derive lexical items from the semi-linguistic concept and relation names in the ontology itself. Further, the semantics of the predicates used in the statements guide us in identifying what linguistic roles such lexical items play in the output sentence to be generated. Consider the following statement, for example:
    SubClassOf(<Margherita> ObjectSomeValuesFrom(<hasTopping> <TomatoTopping>))
The concept name "Margherita" is a legitimate word of English, and the concept name "TomatoTopping" is formed by concatenating the two English words "Tomato" and "Topping". Likewise, the relation "hasTopping" is formed by concatenating the two legitimate words "has" and "Topping". Additionally, the semantics of the predicate SubClassOf guides us in mapping the concept "Margherita" to the subject and the concept "TomatoTopping" to the object of the sentence that we intend to verbalize from the axiom. Similarly, the relation "hasTopping" is a good candidate for the verb in the sentence to be generated; since relationships in OWL act as binders between concepts, they can serve to identify the linguistic roles of subject and object in the output sentence. The pre-lexicalization stage exploits such semantic information to plan the output sentence structure for each category of the statements presented in section 4.2. With regard to the task of identifying lexical items to represent concepts, the concept names that are legitimate words of English are directly approved as lexical items; we shall refer to such a concept name as a "Simplified Concept Name". The other concept names, which are formed by concatenating two or more English words, are broken down into possible Simplified Concept Names, which then serve as lexical items.
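The observation that names such as TomatoTopping or hasTopping are concatenations of English words can be sketched as a simple split at capital letters. This is an illustrative assumption about how such names might be decomposed, not the procedure used in this work; the function names are hypothetical and the pattern handles only plain camel-case names without digits or acronyms.

```python
import re


def split_name(name):
    """Split a concatenated ontology name at each capital letter."""
    return re.findall(r"[A-Z]?[a-z]+", name)


def is_simplified_concept_name(name):
    """A name that cannot be split further is treated as a single lexical item."""
    return len(split_name(name)) == 1


concept_parts = split_name("TomatoTopping")
relation_parts = split_name("hasTopping")
```

For the axiom above, concept_parts holds "Tomato" and "Topping", and relation_parts holds "has" and "Topping", while "Margherita" is already a Simplified Concept Name.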
The strategy used to identify the simplified concept name from a complex concept is as follows. Since there can be multiple super concepts for a concept in the ontology (either stated or inferred via reasoning), a number of super concepts can be checked for such a breakdown. The super concept that satisfies this requirement is chosen and designated as the "Best Parent". This allows us to model the features describing a concept (in our feature structure representation) in terms of a base form and its modifier. The base form is set to the value of the "Best Parent" name and the modifier is set to the value of the Simplified Concept Name. For example, for the concept "CheeseyPizza", the base form is set to "Pizza" and the modifier is set to "Cheesey". This strategy of generating the base form and modifier for a given concept is referred to as the "Concept Lexicalisation Algorithm" [9]. The idea can be represented as below:
Extracted axiom: SubClassOf(<Concept_X> <Concept_Y>)
Feature Set:
    Subject: Concept_X
    Object: Best Parent of Concept_Y
    ObjectModifier: Simplified Concept_Y
The following example shows how verbs can be identified from the axiom structure.
Extracted axiom: SubClassOf(<Concept_X> ObjectAllValuesFrom(<Relation_A> <Concept_Y>))
Feature Set:
    Subject: Concept_X
    Verb: Relation_A
    Object: Concept_Y
2) Aggregation: Aggregation is the task of grouping two or more simple structures to generate a single sentence, a frequent phenomenon in natural languages. There has been a lot of work on aggregation that concentrates on the syntactic level. This work concentrates on the aggregation of feature structures at the semantic level. Figure 6 shows an example of aggregation. For the concept "Napoletana" in the pizza ontology, the concept "NamedPizza" is stated to be its subsumer, and the concepts "InterestingPizza", "CheeseyPizza", "RealItalianPizza" and "NonVegetarianPizza" are inferred to be its subsumers. During the pre-lexicalisation phase, the following feature structures represent them, respectively.
Fig. 6: Feature Structures
During the aggregation phase, a new feature structure is generated and preserved for further processing, replacing the above five feature structures, as in Figure 7. The aggregation should also consider the categories of the axioms to be grouped together. This results in better readability of the output text.
Fig. 7: Aggregation of Feature structures
3) Lexicalization Proper: The completion of the aggregation phase opens up an opportunity to generate further lexical items, which it was infeasible or unsuitable to obtain during the pre-lexicalisation phase. In particular, the process attempts to generate lexical items for the Verb features (whenever present) in the feature structures; identifying the Verb lexical item also helps in augmenting the information pertaining to the Object feature in those feature structures. The need to postpone such activities until the aggregation phase has been completed stems from the fact that the aggregation task is highly dependent on the values assigned to the Verb features in the feature structures; they serve as a criterion for judging whether an aggregation should be carried out on the available set of feature structures or not. Thus it is desirable that those values come directly from the name (string) representing the relation in the axiom and remain "intact" throughout the aggregation phase. Let us consider an example. The feature structure
    Subject: Margherita
    Verb: hasTopping
    Object: TomatoTopping
is represented as
    Subject: Margherita
    Verb: has
    ObjectDescriptor: topping
    Object: Tomato
E. Realisation
Realisation is the task of generating actual natural language sentences from the intermediary (syntactic) representations obtained during the micro planning phase. For an NLG developer, a number of general purpose realisation modules are available to achieve such functionality. In particular, such modules help in transforming syntactic information into natural language text by taking care of the various transformations that the contents from the micro planning phase need to undergo to produce grammatically valid sentences: syntactic (for example, the arrangement of Subject, Verb and Object in a sentence), morphological (for example, the generation of inflected forms of words when required, such as the plural of child being children and not childs) and orthographical (for example, the placement of appropriate punctuation marks, such as a comma to describe an aggregation of things). SimpleNLG [1] has Java classes which allow a programmer to specify the content of a sentence (such as Subject, Verb, Object, Modifiers, Tense, Prepositional phrase, etc.) by setting values for the attributes in the classes. Once such attributes are set, methods can be executed to generate output sentences; the package takes care of the linguistic transformations and ensures grammatically well-formed sentences. An example is shown in Figure 8 [9].
Fig. 8: Realisation
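The Concept Lexicalisation Algorithm and the feature sets of the pre-lexicalisation stage can be sketched in a few lines. The text does not spell out the exact test used to select the "Best Parent", so the sketch assumes that it is the longest super concept whose name is a proper suffix of the concept name (as in CheeseyPizza and Pizza). The function names and the dictionary layout are hypothetical.

```python
def lexicalise_concept(concept, super_concepts):
    """Return (base_form, modifier) for a concept name.

    Assumption: the "Best Parent" is the longest super concept whose name
    is a proper suffix of the concept name; without one, the concept name
    itself is used as the base form and there is no modifier.
    """
    candidates = [s for s in super_concepts if concept.endswith(s) and s != concept]
    if not candidates:
        return concept, None
    best_parent = max(candidates, key=len)
    return best_parent, concept[: -len(best_parent)]


def feature_set(subject, obj, super_concepts_of_obj, verb=None):
    """Build the feature set for SubClassOf(<subject> <obj>) style axioms."""
    base, modifier = lexicalise_concept(obj, super_concepts_of_obj)
    features = {"Subject": subject, "Object": base}
    if modifier:
        features["ObjectModifier"] = modifier
    if verb:
        features["Verb"] = verb
    return features


# SubClassOf(<Napoletana> <CheeseyPizza>), with Pizza and Food as super concepts.
named = feature_set("Napoletana", "CheeseyPizza", ["Pizza", "Food"])

# SubClassOf(<Margherita> ObjectAllValuesFrom(<hasTopping> <TomatoTopping>))
restricted = feature_set("Margherita", "TomatoTopping", [], verb="hasTopping")
```

The first call yields Pizza as the base form with Cheesey as its modifier, matching the example in the text. The second keeps the relation name intact as the Verb, which is what the aggregation phase relies on.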
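Aggregation of feature structures followed by realisation can be sketched as below. This is a simplified stand-in under stated assumptions: structures are merged when they share Subject, Verb and Object, which is only one plausible reading of the criterion described in the text; the default verb phrase "is a" for subsumer structures is invented; and the string-joining realise function merely imitates what a realisation engine such as SimpleNLG would do with proper morphology and orthography.

```python
def aggregate(structures):
    """Merge feature structures that share Subject, Verb and Object."""
    groups = {}
    for fs in structures:
        key = (fs["Subject"], fs.get("Verb"), fs["Object"])
        merged = groups.setdefault(
            key,
            {
                "Subject": fs["Subject"],
                "Verb": fs.get("Verb"),
                "Object": fs["Object"],
                "ObjectModifiers": [],
            },
        )
        if fs.get("ObjectModifier"):
            merged["ObjectModifiers"].append(fs["ObjectModifier"])
    return list(groups.values())


def join_with_and(items):
    """Join items as 'a, b and c'."""
    if len(items) <= 1:
        return "".join(items)
    return ", ".join(items[:-1]) + " and " + items[-1]


def realise(fs):
    """Naive template realiser; 'is a' is an assumed default for subsumers."""
    verb = fs["Verb"] or "is a"
    parts = [fs["Subject"], verb, join_with_and(fs["ObjectModifiers"]), fs["Object"]]
    return " ".join(p for p in parts if p) + "."


structures = [
    {"Subject": "Napoletana", "Object": "Pizza", "ObjectModifier": "Named"},
    {"Subject": "Napoletana", "Object": "Pizza", "ObjectModifier": "Interesting"},
    {"Subject": "Napoletana", "Object": "Pizza", "ObjectModifier": "Cheesey"},
]
aggregated = aggregate(structures)
sentence = realise(aggregated[0])
```

The three structures collapse into one, so a single sentence is produced instead of three near-identical ones. Lower-casing the modifiers and choosing the right article are examples of the work a real realiser would add.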
V. TOWARDS QUESTION ANSWERING
This section discusses how the technique of NLG from an ontology can be utilized to develop a QA system based on an ontology. The general architecture, implementation tools and examples are also explained.
A. Question Answering
Question Answering (QA) is a computer science discipline within the fields of information retrieval and natural language processing (NLP) which is concerned with building systems that automatically answer questions posed by humans in a natural language. A QA implementation, usually a computer program, may construct its answers by querying a structured database of knowledge or information, usually a knowledge base. The basic idea of this implementation is to generate a natural language answer to the query given by the user, using the ontology as the knowledge base. The system determines a clue for content determination from the user query. The NLG technique described in Section IV is then used to generate the output answer. A common architecture is explained in the next section.
B. Architecture
The architecture [9] shown in Figure 9 is an augmentation of the architecture discussed in the previous section. In the QA system, the ontology serves as the knowledge base. The question asked by the user is analyzed to identify axioms for content determination. These axioms are given as input to the NLG system. The NLG pipeline, after grammatical realisation, generates an output, which is the answer to the user query.
Fig. 9: Architecture of QA system with NLG from Ontology
The output of the system should be structured according to the query asked. By analysing the question asked, the kind of answer expected can be identified. Some examples are given in Table I [8]. These expected-answer tags can be used for identifying the axiom category, the rhetorical structuring of the features and the final grammatical realization.
TABLE I: Classification of Questions and Answers
Question                      Expected Answer
What did you buy?             [THING]
Where is my coat?             [PLACE]
Where did they go?            [PLACE]
What happened next?           [EVENT]
How did you cook the eggs?    [MANNER]
Since an ontology is domain specific, the possible set of questions (types of questions) asked by the user can be predicted. This observation helps to limit unnecessary searches in the ontology and improves the results. For example, the question "What is RealItalianPizza?" posed to the pizza ontology gives the following result:
    RealItalianPizza is a pizza that falls under the class of thin and crispy pizza. RealItalianPizza can have base of thinandcrispy only. However, it might be the case that some instances of RealItalianPizza don't have any base at all. RealItalianPizza has Italy as its countryoforigin.
C. Implementation Tools
The tools used for the implementation of the system include Python, the rdflib package, SPARQL, and NLG realisation systems. Figure 10 shows the implementation stages.
Fig. 10: Implementation tools
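The question analysis described in Section B, in which the form of the question suggests the expected answer tag of Table I, can be sketched with a small rule table. The rules below are assumptions fitted to the five rows of Table I only; the mapping, the special case for "happened" and the function name are hypothetical, and a practical system would need a much richer analysis of the question.

```python
# Hypothetical rule table derived from the rows of Table I.
ANSWER_TYPE_BY_QUESTION_WORD = {
    "what": "THING",
    "where": "PLACE",
    "how": "MANNER",
}


def expected_answer_type(question):
    """Guess the expected answer tag from the first word of a question."""
    words = question.lower().rstrip("?").split()
    if not words:
        return None
    # "What happened ...?" asks for an event rather than a thing.
    if words[0] == "what" and "happened" in words:
        return "EVENT"
    return ANSWER_TYPE_BY_QUESTION_WORD.get(words[0])


tags = [
    expected_answer_type(q)
    for q in (
        "What did you buy?",
        "Where is my coat?",
        "Where did they go?",
        "What happened next?",
        "How did you cook the eggs?",
    )
]
```

The resulting tag could then be used to narrow down which axiom categories are searched, in the spirit of limiting unnecessary searches in a domain-specific ontology.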
The ontology can be developed using Protégé [16], which provides a Java-based interface for ontology development. Protégé has options to export the ontology as an RDF file. The NLP-related work is then done by Python programs. Python has a library package, rdflib, which can be used to interact with RDF files. The rdflib package is also equipped with SPARQL query processing capability. The query result is then processed by realisers such as SimpleNLG to produce the output.
D. Example
The concept of this seminar is presented with a practical example, an M.Tech ontology. The ontology contains the details of M.Tech courses conducted in Kerala state. The hierarchy of ontology classes is shown in Figure 11. The whole ontology is considered as a subclass of the class Thing (the default class). The classes in the hierarchy are Institutions, Branches, Specilizations and Universities. All these classes have subclasses. Some relations (properties) in the ontology are IsOfferedBy (course IsOfferedBy college), ApprovedBy (college ApprovedBy university) and HasSpecilizaion (branch HasSpecilizaion course). An RDF code snippet is shown in Figure 12.
Fig. 11: Example: M.Tech Ontology
Fig. 12: Code segment from RDF File
The processing of the user query 'which college offer M.Tech CL?' can be explained as follows. The sentence is split into words, and the keyword that selects an axiom is identified. Here it is 'offered', which selects the axiom OfferedBy. Figure 13 shows the SPARQL query for this question.
Fig. 13: SPARQL Query
The final output will be: GECSKP offer M.Tech in CL. The system is capable of answering around five types of questions, such as 'which are the colleges offering course ...' and 'which university approves ... (course or college)'. The algorithms can be extended by adding more semantic features, such as synonyms and hyponyms, to handle more complex queries.
VI. CONCLUSIONS & FUTURE SCOPE
NLG has good real-time applications, which are currently limited by the lack of world knowledge. The construction of domain-specific ontologies can solve the problem of world knowledge. This work discusses approaches for combining ontologies and NLG systems. The method discussed here classifies the axioms in the ontology into different categories and runs the NLG pipeline on them. With the necessary modifications, an ontology-supported NLG system can be extended into a QA system. A handful of tools are available for efficient implementation of the approach. The development of such a QA system is explained with a sample ontology of M.Tech courses. The system performance can be improved by applying more linguistic techniques to analyse the natural language input efficiently. Such systems have an important role in the web age of semantic search.
REFERENCES
[1] Gatt A and Reiter E, "SimpleNLG: A realisation engine for practical applications," in Proceedings of the 12th European Workshop on Natural Language Generation, 2009.
[2] Allen J, Natural Language Understanding. Benjamin/Cummings Publishing Company, California, 1988.
[3] Bateman J A, "Sentence generation and systemic grammar: an introduction," in Iwanami Lecture Series: Language Sciences, Volume 8. Tokyo: Iwanami Shoten Publishers, 1997.
[4] Bateman J A, "The Theoretical Status of Ontologies in Natural Language Processing," in Proceedings of the Workshop on Text Representation and Domain Modelling: Ideas from Linguistics and AI, Technical University Berlin, 1991.
[5] Bateman J A, "Ontology Construction and Natural Language," in Proceedings of the Workshop on Formal Ontology in Conceptual Analysis and Knowledge Representation, Padova, 1993.
[6] Davies J, Studer R, and Warren P, Semantic Web Technologies. John Wiley & Sons Ltd, England, 2006.
[7] Reiter E, "Building Applied Natural Language Generation Systems," in Proceedings of the Applied Natural Language Processing Conference, Washington DC, 1997.
[8] Ghorka W, Bownik L, Piasecki A, "Information System Based on Natural Language Generation from Ontology," in Proceedings of the International Multiconference on Computer Science and Information Technology, pp. 357–364, 2007.
[9] Gyawali B, "Answering Factoid Questions via Ontologies: A Natural Language Generation Approach," M.Sc. Dissertation, Dept of Intelligent Computer Systems, University of Malta, 2011.
[10] Jaya A and Uma G V, "A Novel Approach for Construction of Sentences for Automatic Story Generation Using Ontology," in Proceedings of the International Conference on Computing, Communication and Networking (ICCCN), 2008.
[11] Jurafsky D and Martin J H, Speech and Language Processing. Prentice Hall Inc., 2008.
[12] Bontcheva K, "Generating tailored textual summaries from ontologies," in ESWC, volume 3532 of Lecture Notes in Computer Science, pp. 531–545. Springer, 2005.
[13] Mann W C, "An Overview of the Penman Text Generation System," in AAAI-83 Proceedings, pp. 261–265, 1983.
[14] Noy N F and McGuinness D L, "Ontology Development 101: A Guide to Creating Your First Ontology," Stanford University, Stanford, 2005.
[15] Aikawa T, Melero M, Schwartz L, and Wu A, "Multilingual Sentence Generation," in Proceedings of the 8th European Workshop on Natural Language Generation, 2001.
[16] http://protege.stanford.edu, referred on April 2012.
[17] http://code.google.com/p/rdflib, referred on April 2012.
[18] Webster's Revised Unabridged Dictionary: http://machaut.uchicago.edu/websters, referred on April 2014.
Sreepathy Journal of Computer Science and Engg.,Vol 1, Issue 1, June 2014
Speech synthesis using Artificial Neural Network
Anjali Krishna C R, Arya M B, Neeraja P N, Sreevidya K M, Jayanthan K S
Dept. of Computer Science and Engg, SIMAT, Vavanoor, Palakkad
theerthaanjali@gmail.com
Abstract—Text to speech conversion is a large area which has developed very fast in the last few decades. Our goal is to study and implement the specific tasks involved in text to speech conversion, namely text normalization, grapheme to phoneme conversion, phoneme concatenation and speech engine processing. Using a neural network for grapheme to phoneme conversion provides more accuracy than the usual corpus or dictionary based approach. Usually, grapheme to phoneme conversion in text to speech systems is performed using a dictionary based method. The main limitation of this technique is that it cannot give the phonemes of a word which is not in the dictionary, and efficient phoneme generation requires a large collection of word-pronunciation pairs. Using a large dictionary also requires large storage space. This limitation can be overcome using a neural network. The main advantage of this approach is that it can adapt to unknown situations, i.e., it can predict the phoneme of a grapheme which has not been defined so far. The neural network system requires less memory than a dictionary based system and performed well in tests. The system will be very useful for illiterate and vision impaired people, who face many problems in their day to day life due to differences in script systems, to hear and understand content.
I. INTRODUCTION
A text to speech (TTS) synthesizer is a computer based system that can automatically read a source text aloud. The field has improved very fast over the last couple of decades, and many TTS systems are now available for commercial use. A text to speech system converts written text into speech. The speech is often based on the concatenation of speech units that are taken from natural speech and put together to form a word or a sentence. Concatenative speech synthesis has become very popular in recent years due to its improved sensitivity compared with other approaches. Many TTS systems are developed on the principle of corpus-based speech synthesis. Although many speech systems have been developed, none of them deals with quality [1]. Speech is the most used and most natural way for people to communicate. From the beginning of man-machine interface research, speech has been one of the most desired mediums for interacting with computers. Therefore, speech recognition and speech synthesis have been studied to make communication with machines more human-like. In order to increase the naturalness of oral communication between humans and machines, all aspects of speech must be involved. Speech does not only transmit ideas and concepts, but also carries information about the attitude, emotion and individuality of the speaker. Different applications of TTS in our day-to-day life are the
following [9]:
• Telephony: automation of telephone transactions (e.g., banking operations), automatic call centers for information services (e.g., access to weather reports), etc.
• Automotive: information released by in-car equipment such as the radio, the air conditioning system, the navigation system, the mobile phone (e.g., voice dialing), embedded telemetric systems, etc.
• Multimedia: reading of electronic documents (web pages, emails, bills) or scanned pages (output of an Optical Character Recognition system).
• Medical: assistance for disabled people, such as personal computer handling, domotics and mail reading.
• Industrial: voice-based management of control tools, by drawing operators' attention to important events divided among several screens.
Evolution of TTS: In 1779 the Danish scientist Christian Kratzenstein, working at the Russian Academy of Sciences, built models of the human vocal tract that could produce the five long vowel sounds a, e, i, o and u. In 1791 an Austrian scientist developed a system based on the previous one that included a tongue, lips and a "mouth" made of rubber, and a "nose" with two nostrils, which was able to pronounce consonants. In 1837, Joseph Faber developed a system which implemented a pharyngeal cavity and could be used for singing. It was controlled by a keyboard. Bell Labs developed the VOCODER, a clearly intelligible, keyboard-operated electronic speech analyzer and synthesizer. In 1939, Homer Dudley developed the VODER, which was an improvement over the VOCODER. The Pattern Playback was built by Dr. Franklin S. Cooper and his colleagues at Haskins Laboratories. The first electronic TTS system was designed in 1968. The concatenation technique was developed by the 1970s. Many computer operating systems have included speech synthesizers since the early 1980s. From the 1990s there was progress in unit selection and diphone synthesis. A lot of development is still taking place in this area.
A. Problem definition
Most existing systems use a dictionary based method for converting graphemes to phonemes. Phoneme generation using this technique is not efficient. In a dictionary based method, each entry of the dictionary is a word-pronunciation pair. This technique cannot give the phoneme of a grapheme which is not in the dictionary. Higher efficiency requires a large dictionary with a large number of word-pronunciation pairs, which in turn requires large storage space. We can avoid these limitations by using a neural network method.
© Dept. of Computer Science & Engg., Sreepathy Institute of Management And Technology, Vavanoor, Palakkad, 679533.
II. REQUIREMENT ANALYSIS
The requirements of the TTS system include NLTK, NumPy, Python and the speech generation tool MBROLA. Python is a free and open source, general-purpose, high-level programming language that is widely used. NLTK is a leading platform for building Python programs that work with human language data, and it provides easy-to-use interfaces. NumPy is an open source extension to the Python programming language. All the requirements of the speech synthesizer are therefore cost effective.
A. Existing System
Most text to speech engines can be categorized by the method they use to translate phonemes into audible sound. Some types of TTS system are listed below:
• Prerecorded: In this kind of TTS system a database of prerecorded words is maintained. The main advantage of this method is good voice quality, but the limited vocabulary and the need for large storage space make it less efficient.
• Formant: Here the voice is generated by simulating the behavior of the human vocal tract. Unlimited vocabulary, low storage requirements and the ability to produce voices with multiple characteristics make it highly efficient, but the voice sounds robotic, which users sometimes do not appreciate.
• Concatenated: In this kind of TTS system, text is phonetically represented by the combination of its syllables. These syllables are concatenated at run time to produce the phonetic representation of the text. Key features of this technique are unlimited vocabulary and good voice quality, but it cannot produce voices with multiple characteristics and it needs large storage space.
The TTS system described here is implemented using the concatenation method.
Some existing speech software packages are:
• eSpeak [9]: eSpeak uses a "formant synthesis" method. This allows many languages to be provided in a small size. The speech is clear and can be used at high speeds, but is not as natural or smooth as larger synthesizers that are based on human speech recordings.
eSpeak is available as:
◦ A command line program (Linux or Windows) to speak text from a file or from stdin.
◦ A shared library version for use by other programs (on Windows this is a DLL).
◦ A SAPI5 version for Windows, so it can be used with screen readers and other programs that support the Windows SAPI5 interface.
◦ eSpeak has been ported to other platforms, including Android, Mac OS X and Solaris.
Features:
◦ Includes different voices, whose characteristics can be altered.
◦ Can produce speech output as a WAV file.
◦ SSML (Speech Synthesis Markup Language) is supported (not complete), and also HTML.
◦ Compact size. The program and its data, including many languages, total about 2 Mbytes.
◦ Can be used as a front end to MBROLA diphone voices, see mbrola.html. eSpeak converts text to phonemes with pitch and length information.
◦ Can translate text into phoneme codes, so it could be adapted as a front end for another speech synthesis engine.
◦ Potential for other languages. Several are included in varying stages of progress. Help from native speakers for these or other languages is welcome.
◦ Development tools are available for producing and tuning phoneme data.
◦ Written in C.
• Festival [9]: The Festival Speech Synthesis System is a general multilingual speech synthesis system originally developed by Alan W. Black at the Centre for Speech Technology Research (CSTR) at the University of Edinburgh. It offers a full text to speech system with various APIs, as well as an environment for development and research of speech synthesis techniques. It is written in C++ with a Scheme-like command interpreter for general customization and extension.
Festival usage: When you pass a text file to Festival, it converts the contents of the file into voice. For example, to have a letter (mail) stored in a text file (say letter.txt) read out loud, run:
$ festival --tts letter.txt
Advantages:
◦ Available for free under an open source license.
◦ The quality of the voice and the pronunciation is very good.
◦ Supports 3 languages: English, Spanish, and Welsh.
All the above mentioned software packages have limitations, some of which are:
• No emotion in the speaking style.
• Needs improvement in the collaboration between linguists and technologists.
• Text to speech should communicate information to the user audibly.
• The sound produced is not natural.
B. Proposed System
Existing systems have many limitations. Here we try to improve the performance of the existing system by doing the phonetic analysis with a neural network (NN) approach. Artificial intelligence and NNs have been used more and more in recent decades, and the potential in this area is huge. NNs are used in cases where the rules or criteria for finding an answer are not clear (that is why NNs are often called black boxes: they can solve the problem, but at times it is hard to explain how the problem was solved). Some applications of neural networks [6]:
• Character Recognition: Character recognition has become very important as handheld devices like the Palm Pilot become increasingly popular. Neural networks can be used to recognize handwritten characters.
• Image Compression: Neural networks can receive and process vast amounts of information at once, making them useful in image compression. With the Internet explosion and more sites using more images, using neural networks for image compression is worth a look.
• Stock Market Prediction: The day-to-day business of the stock market is extremely complicated. Many factors weigh in on whether a given stock will go up or down on any given day. Since neural networks can examine a lot of information quickly and sort it all out, they can be used to predict stock prices.
• Traveling Salesman Problem: Interestingly enough, neural networks can solve the traveling salesman problem, but only to a certain degree of approximation.
• Medicine, Electronic Nose, Security, and Loan Applications: These are applications that are in their proof-of-concept stage, with the exception of a neural network that decides whether or not to grant a loan, which has already been used more successfully than many humans.
• Miscellaneous Applications: These are some very interesting (albeit at times a little absurd) applications of neural networks.
The advantages of NNs are that they can adapt to new scenarios, are fault tolerant and can deal with noisy data.
The speech synthesizer has four main modules:
1. Text normalization.
2. Grapheme to phoneme conversion, using neural networks.
3. Phoneme concatenation.
4. Speech engine processing.
The main advantage of the system is its efficiency in finding the phoneme corresponding to the input. By using a neural network (trained with the back propagation algorithm) to find the phonemes, we do not need a pronunciation dictionary corpus such as CMU. Existing speech systems such as Festival and eSpeak cannot produce emotions or feelings. Another limitation is that they produce voices only for words that are predefined in the corpus, and the output sounds artificial rather than smooth. We try to overcome this last problem using a neural network, thereby improving the efficiency of the existing system. Fig. 1 shows the different phases of a text to speech system.
Fig. 1: Text to Speech Conversion
III. MODULE DESCRIPTION
The important challenge that needs to be addressed in any TTS system is how to produce natural sounding speech for plain text given as input. It is not feasible to solve this by recording and storing all the words of a language and then concatenating the words of the given text to produce the corresponding speech. TTS systems first convert the input text into its corresponding linguistic or phonetic representation and then produce the sounds corresponding to that representation. Because the input is plain text, the generated phonetic representation also needs to be augmented with information about the intonation and rhythm that the synthesized speech should have. In most speech synthesizers this task is done by a text analysis module. The transcription from the text analysis module is then given to a signal processing module that produces synthetic speech.
In speech synthesizers two actions commonly take place. The front end receives the text as input and outputs a symbolic linguistic representation. The back end receives the symbolic linguistic representation as input and outputs the speech. These two tasks are divided into four modules:
1) Text Normalization
2) Grapheme to Phoneme Conversion
3) Phoneme Concatenation
4) Speech Engine Processing
Text normalization is the first phase of text to speech conversion. It is the process of transforming text into a single canonical form. Normalizing text before storing or processing it allows for separation of concerns, since the input is guaranteed to be consistent before operations are performed on it. Text normalization requires being aware of what type of text is to be normalized and how it is to be processed afterwards. Grapheme to phoneme conversion is done using a neural network. MBROLA is used for speech generation.
IV. TEXT NORMALIZATION
Text normalization is the first fundamental component of the text to speech system. This phase also performs the text analysis, which analyses the input text and organizes it into a manageable list of words. Text normalization is the transformation of text into a pronounceable form, and it is performed before the text is processed in any other way. The main objective of this process is to identify punctuation marks and pauses between words. Text normalization usually also converts all letters to lowercase or uppercase and removes accent marks, stopwords and so on.
The input is taken from the user, who can enter any text they would like to hear as voice. A typical user may enter strings that contain punctuation and acronyms. Consider the example input string "hello how are you?". When we produce the voice corresponding to this input, we do not want to read out the question mark. So we have to remove the tokens that have no pronunciation in the corpus; in the above case the punctuation must be removed from the input before we can move on to the later phases. Similarly, we have to create a corpus that contains acronyms and their expansions. When we see an acronym in the input string, we replace it with the corresponding expansion to get an accurate result from the system. For example, when the user inputs the string HIV, we convert it to "aitch eye vee" and produce the voice for that.
In our system we incorporate Python code that removes the punctuation from the input string. For that we created a list called pun, which contains the punctuation marks to be removed from the text. Each word is checked against the members of the pun list; if the word is not in the list it is written to a file, otherwise it is not written.
For example:
Input: how r ? u?
Output: how r u
Fig. 2 explains the input-output relation of the text normalization phase. The input is normal text that contains punctuation. This text is given to the normalization unit, which returns normalized text. Each word in that text is also known as a grapheme.
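The normalization step described above can be sketched roughly as follows. This is a minimal illustration and not the authors' code: the `PUN` set stands in for the paper's `pun` list, the function name `normalize` is hypothetical, and the acronym table holds only the HIV example from the text.

```python
import string

# Hypothetical acronym table; the paper describes such a corpus
# but does not list its entries.
ACRONYMS = {"HIV": "aitch eye vee"}

# Punctuation to strip, playing the role of the paper's `pun` list.
PUN = set(string.punctuation)


def normalize(text):
    """Return the list of pronounceable graphemes (words) in `text`."""
    words = []
    for token in text.split():
        # Drop punctuation characters attached to the token.
        cleaned = "".join(ch for ch in token if ch not in PUN)
        if not cleaned:
            continue  # the token was punctuation only, e.g. "?"
        # Expand known acronyms, otherwise lowercase the word.
        words.append(ACRONYMS.get(cleaned.upper(), cleaned.lower()))
    return words


print(" ".join(normalize("how r ? u?")))
```

Keeping the acronym table separate from the punctuation filter means new expansions can be added without touching the filtering logic.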
Fig. 2: Text to Graphemes Conversion
V. GRAPHEME TO PHONEME CONVERSION
This task is performed with the help of a neural network. Neural networks (NN) are an important data mining tool used for classification and clustering. An NN is an attempt to build a machine that will mimic brain activities and be able to learn. An NN usually learns by example. If an NN is supplied with enough examples, it should be able to perform classification and even discover new trends or patterns in data. A basic NN is composed of three layers: input, hidden and output. Each layer can have a number of nodes. Nodes from the input layer are connected to the nodes of the hidden layer, and nodes from the hidden layer are connected to the nodes of the output layer. Those connections represent weights between nodes.
Fig. 3 shows the architecture of a simple NN. It is made up of an input layer, an output layer and one or more hidden layers. Each node of the input layer is connected to the nodes of the hidden layer, and every node of the hidden layer is connected to the nodes of the output layer. There is usually a weight associated with every connection. The input layer represents the raw information that is fed into the network. This part of the network never changes its values. Every single input to the network is duplicated and sent down to the nodes in the hidden layer. The hidden layer accepts data from the input layer. It takes the input values and modifies them using weight values. The new value is then sent to the output layer, but it is also modified by the weight of the connection between the hidden and output layers. The output layer processes the information received from the hidden layer and produces an output. This output is then processed by an activation function.
Fig. 3: Simple Neural Network
A. Number of Nodes and Layers
Choosing the number of nodes for each layer depends on the problem the NN is trying to solve, the types of data the network is dealing with, the quality of the data and some other parameters. The number of input and output nodes depends on the training set in hand. Larose argued that choosing the number of nodes in the hidden layer can be a challenging task. If there are too many nodes in the hidden layer, the number of computations the algorithm has to perform increases. Picking just a few nodes in the hidden layer can limit the network's learning ability. The right balance needs to be found. It is very important to monitor the progress of the NN during its training; if the results are not improving, some modification of the model might be needed. The way to control an NN is by setting and adjusting the weights between nodes. Initial weights are usually set to random numbers and are then adjusted during NN training.
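The layered structure and random initial weights described above can be illustrated with a small NumPy sketch. It is only an illustration under stated assumptions: the layer sizes, the weight range, the absence of bias terms, and the use of a sigmoid activation on the hidden layer as well as the output layer are choices made here, not details given in the paper.

```python
import numpy as np


def sigmoid(x):
    """Logistic activation, 1 / (1 + exp(-x))."""
    return 1.0 / (1.0 + np.exp(-x))


def init_weights(n_in, n_hidden, n_out, seed=0):
    """Random initial weights for input->hidden and hidden->output connections."""
    rng = np.random.default_rng(seed)
    w_hidden = rng.uniform(-0.5, 0.5, size=(n_in, n_hidden))
    w_out = rng.uniform(-0.5, 0.5, size=(n_hidden, n_out))
    return w_hidden, w_out


def forward(x, w_hidden, w_out):
    """Propagate one input vector through the hidden and output layers."""
    hidden = sigmoid(x @ w_hidden)    # weighted sum into each hidden node, then activation
    output = sigmoid(hidden @ w_out)  # weighted sum into each output node, then activation
    return hidden, output


w_hidden, w_out = init_weights(n_in=2, n_hidden=2, n_out=1)
hidden, output = forward(np.array([1.0, 1.0]), w_hidden, w_out)
print(hidden.shape, output.shape)
```

Changing `n_hidden` is the knob discussed in the text: more hidden nodes enlarge both weight matrices and therefore the amount of computation per example.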
Phonemer can be split into two parts: preprocessing / feature selection, and classification. The feature set we used is quite simple. Our features for each grapheme of a word are: a) the character before, b) the character afterward, c) the current character, d) the character two steps afterward, and e) the class of the character according to the Soundex algorithm. For the last feature, several characters such as h and y do not belong to any of the classes, so they got their own class. All vowels, however, were grouped into one class. Phonemer can be expanded, and more features could easily be added for later trials. Some examples for future review are the number of vowels before and after the current character, or whether the character duplicates the previous one, as with the character t in the word bottle. Pronunciation classes other than the Soundex class could be used here as well, but we ignored these and instead focused on simple and widely used features to put emphasis on the abilities of neural networks. The feature set generated for each character is then turned into a binary input vector for the neural network. Each entry represents one possible value of one feature class.
B. Dataset
The data set we used was provided by Sproat with his work on the pmtools toolkit for word pronunciation modeling. The data set has almost 50,000 entries, which we split randomly into a dev set (80 percent) and a final set (20 percent). With neural nets, the amount of training data strongly affects accuracy. To avoid a testing error from this source, we could not validate our application by training on only a percentage of the final set and testing on the rest. Instead, for the final review we trained on all of dev and then tested using final. For incremental tests, however, we trained using 80% of the data. The dataset was already aligned by character, so we could ignore the problem of character alignment; however, the phoneme classes were fairly complex. They included stresses, as well as silent and added pronunciation. As mentioned above, silent pauses were indicated using special characters.
C. Back Propagation (BP) Algorithm
One of the most popular NN algorithms is the back propagation algorithm. Rojas claimed that the BP algorithm can be broken down into four main steps. After choosing the weights of the network randomly, the back propagation algorithm is used to compute the necessary corrections. The algorithm can be decomposed into the following four steps:
1) Feed-forward computation.
2) Back propagation to the output layer.
3) Back propagation to the hidden layer.
4) Weight updates.
The algorithm repeats until the value of the error function has become sufficiently small. This is a very rough and basic formulation of the BP algorithm. There are some variations proposed by other scientists, but Rojas's definition seems to be quite accurate and easy to follow. The BP algorithm is explained using Fig. 4.
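To make the Phonemer input encoding described earlier in this section more concrete, the sketch below turns features a) to e) into a binary vector using one-hot encoding. Several details are assumptions made here for illustration: the padding symbol `_` for positions outside the word, giving w its own class alongside h and y, the ordering of the vector, and the names `features` and `one_hot`. Words are assumed to contain only the letters a to z.

```python
import string

ALPHABET = string.ascii_lowercase + "_"  # "_" pads positions outside the word (assumption)

# Standard Soundex consonant groups.
SOUNDEX = {}
for digit, letters in {"1": "bfpv", "2": "cgjkqsxz", "3": "dt",
                       "4": "l", "5": "mn", "6": "r"}.items():
    for letter in letters:
        SOUNDEX[letter] = digit
for vowel in "aeiou":
    SOUNDEX[vowel] = "V"       # all vowels share one class, as in the text
for letter in "hwy":
    SOUNDEX[letter] = letter   # letters outside Soundex get their own class (w is an assumption)

CLASSES = sorted(set(SOUNDEX.values()))


def one_hot(symbol, symbols):
    """Binary vector with a single 1 at the position of `symbol`."""
    return [1 if s == symbol else 0 for s in symbols]


def features(word, i):
    """Binary input vector for the grapheme at position `i` of `word`."""
    padded = "_" + word.lower() + "__"
    j = i + 1  # index of the current character inside `padded`
    vec = []
    vec += one_hot(padded[j - 1], ALPHABET)      # a) character before
    vec += one_hot(padded[j + 1], ALPHABET)      # b) character afterward
    vec += one_hot(padded[j], ALPHABET)          # c) current character
    vec += one_hot(padded[j + 2], ALPHABET)      # d) character two steps afterward
    vec += one_hot(SOUNDEX[padded[j]], CLASSES)  # e) Soundex-style class
    return vec


vec = features("bottle", 2)  # the first "t" in "bottle"
print(len(vec), sum(vec))
```

Each of the five features contributes exactly one set bit, so the vector stays sparse however large the alphabet or class list becomes.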
Fig. 4: Example
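The four steps of the BP algorithm listed above can be sketched for a network with two input nodes, two hidden nodes and one output node. This is a simplified illustration, not the calculation from Fig. 4: the initial weights, the learning rate of 0.5 and the stopping threshold are arbitrary choices made here, bias terms are omitted, and the momentum term mentioned in the text is left out to keep the sketch short.

```python
import numpy as np


def sigmoid(x):
    """Logistic activation, 1 / (1 + exp(-x))."""
    return 1.0 / (1.0 + np.exp(-x))


def train_step(x, target, w_hidden, w_out, lr=0.5):
    """One back propagation iteration; returns updated weights and the squared error."""
    # 1) Feed-forward computation.
    hidden = sigmoid(x @ w_hidden)
    output = sigmoid(hidden @ w_out)
    error = target - output

    # 2) Back propagation to the output layer (sigmoid derivative is y * (1 - y)).
    delta_out = error * output * (1.0 - output)

    # 3) Back propagation to the hidden layer, using the weights before the update.
    delta_hidden = (w_out @ delta_out) * hidden * (1.0 - hidden)

    # 4) Weight updates.
    w_out = w_out + lr * np.outer(hidden, delta_out)
    w_hidden = w_hidden + lr * np.outer(x, delta_hidden)
    return w_hidden, w_out, float(np.sum(error ** 2))


rng = np.random.default_rng(0)
w_hidden = rng.uniform(-0.5, 0.5, size=(2, 2))  # random initial weights
w_out = rng.uniform(-0.5, 0.5, size=(2, 1))
x, target = np.array([1.0, 1.0]), np.array([1.0])  # inputs 1 and 1, desired output 1

errors = []
for _ in range(1000):
    w_hidden, w_out, err = train_step(x, target, w_hidden, w_out)
    errors.append(err)
    if err < 1e-3:  # stop once the error is sufficiently small
        break

print(errors[0], errors[-1])
```

In real training every example set is passed through the same four steps; a single example is used here only to keep the loop readable.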
D. Worked example
The NN in Fig. 4 has two nodes (N0,0 and N0,1) in the input layer, two nodes in the hidden layer (N1,0 and N1,1) and one node in the output layer (N2,0). Input layer nodes are connected to hidden layer nodes with weights (W0,1-W0,4). Hidden layer nodes are connected to the output layer node with weights (W1,0 and W1,1). The values given to the weights are chosen randomly and are changed during BP iterations. A table with the input node values, the desired output, the learning rate and the momentum is also given in Fig. 4, together with the sigmoid function formula $f(x) = 1/(1 + e^{-x})$. The calculations for this simple network are shown only for example set 1 (input values of 1 and 1 with output value 1). In NN training all example sets are processed, but the logic behind the calculation is the same.
E. Advantages And Disadvantages
Artificial intelligence and NNs have been used more and more in recent decades, and the potential in this area is huge. Here are some NN advantages, disadvantages and industries where they are being used. NNs are used in cases where the rules or criteria for finding an answer are not clear (that is why NNs are often called black boxes: they can solve the problem, but at times it is hard to explain how the problem was solved). They have found their way into a broad spectrum of industries, from medicine to marketing and the military, to name just a few. The financial sector is known for using NNs in classifying credit ratings and in market forecasts. Marketing is another field where NNs have been used, for customer classification (groups that will buy some product, identifying new markets for certain products, relationships between customer and company). Many companies use direct marketing (sending offers by mail) to attract customers. If NNs could be employed to raise the response rate to direct marketing, it could save companies a lot of money. At the end of the day, it is all about the money. Post offices are known to use NNs for sorting the post (based on postal code recognition). Those were just a few examples of where NNs are being used. The advantages of NNs are that they can adapt to new scenarios, are fault tolerant and can deal with noisy data. The time needed to train an NN is probably its biggest disadvantage. NNs also require very large sample sets to train a model efficiently. It is hard to explain the results and what is going on inside an NN.
VI. PHONEME CONCATENATION
The third phase of the system is phoneme concatenation. The second phase, grapheme to phoneme conversion, uses the neural network to produce the phonemes of the given grapheme (that is, a word). The input to the third phase is obtained from that previous phase: the word, separated by spaces, and the corresponding phonemes, also separated by spaces. At this point the word has been split into its constituent phonemes. In this phase the separated phoneme syllables are concatenated to reconstruct the desired words. To implement this phase we wrote Python code that outputs the concatenated phonemes, i.e. the phonemes corresponding to the letters in the word.
VII. SPEECH ENGINE PROCESSING
A speech engine is a generic entity that either processes speech input or produces speech output. Each type of speech engine has a well-defined set of states of operation, and well-defined behavior for transitions between states.
MBROLA is an algorithm for speech synthesis, and software that is distributed at no financial cost but in binary form only. The MBROLA project provides diphone databases for a large number of spoken languages. The MBROLA software is not a complete text to speech system for all those languages; the text must first be transformed into phoneme and prosodic information in MBROLA's format, and separate software to do this is available for some but not all of MBROLA's languages and can require extra setup. Although it is diphone based, the quality of MBROLA's synthesis is considered to be higher than that of most diphone synthesizers, because it preprocesses the diphones, imposing constant pitch and harmonic phases, which enhances their concatenation while only slightly degrading their segmental quality. MBROLA is a time-domain algorithm, like PSOLA, which implies a very low computational load at synthesis time. Unlike PSOLA, however, MBROLA does not require a preliminary marking of pitch periods. This feature has made it possible to develop the MBROLA project around the MBROLA algorithm, through which many speech research labs, companies and individuals around the world have provided diphone databases for many languages and voices (the number of which is by far a world record for speech synthesis, although there are some notable omissions such as Chinese).
MBROLA is a speech synthesizer based on the concatenation of diphones. It takes a list of phonemes as input, together with prosodic information (duration of phonemes and a piecewise linear description of pitch), and produces 16-bit linear speech samples at the sampling frequency of the diphone database. It is therefore not a text to speech (TTS) synthesizer, since it does not accept raw text as input. To obtain a full TTS system, this synthesizer must be used in combination with a text processing system that produces phonetic and prosodic commands.
A. Related Terms
• Diphone: In phonetics, a diphone is an adjacent pair of phones. The term usually refers to a recording of the transition between two phones.
• Prosody: Prosody reflects various features of the speaker or the utterance: the emotional state of the speaker; the form of the utterance (statement, question, or command); the presence of irony or sarcasm; emphasis, contrast, and focus; or other elements of language that may not be encoded by grammar or choice of vocabulary.
The MBROLA voices are cost-free but are not open source. eSpeak can be used as a front end to MBROLA. It provides the spelling-to-phoneme translation and intonation, which MBROLA then uses to generate speech sound. To use an MBROLA voice, eSpeak needs information to translate from its own phonemes to the equivalent MBROLA phonemes. This has been set up for only some voices so far.
VIII. SYSTEM ARCHITECTURE
Fig. 5 shows the complete set of phases of a TTS system and indicates the input and output of each phase.
Fig. 5: steps in text to speech conversion
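The flow of data through the four phases can be sketched end to end as below. This is a rough outline rather than the authors' implementation: all function names are hypothetical, the `TOY_G2P` table is a placeholder for the trained neural network, the phoneme symbols depend on the diphone database in use, and the fixed 100 ms duration and flat 120 Hz pitch are arbitrary values. The last stage formats MBROLA-style input, with one line per phoneme giving the symbol, a duration in milliseconds and optional pitch points (position in percent, pitch in Hz), and `_` for silence.

```python
def normalize(text):
    """Phase 1: raw text -> list of graphemes (words) without punctuation."""
    return ["".join(ch for ch in tok if ch.isalnum()).lower()
            for tok in text.split() if any(ch.isalnum() for ch in tok)]


# Phase 2 placeholder: the paper uses a trained neural network here.
# This toy table only stands in for it so that the sketch can run.
TOY_G2P = {"hello": ["h", "@", "l", "@U"]}


def grapheme_to_phoneme(word):
    """Phase 2: grapheme -> list of phoneme symbols."""
    return TOY_G2P[word]


def concatenate(phoneme_lists):
    """Phase 3: per-word phonemes -> one sequence with "_" (silence) around words."""
    sequence = ["_"]
    for phonemes in phoneme_lists:
        sequence.extend(phonemes)
        sequence.append("_")
    return sequence


def to_pho(sequence, duration_ms=100, pitch_hz=120):
    """Phase 4 input: phoneme sequence -> MBROLA-style phoneme/prosody lines."""
    lines = []
    for phoneme in sequence:
        if phoneme == "_":
            lines.append(f"_ {duration_ms}")
        else:
            # One pitch point in the middle (50%) of the phoneme.
            lines.append(f"{phoneme} {duration_ms} 50 {pitch_hz}")
    return "\n".join(lines)


words = normalize("Hello?")
pho = to_pho(concatenate([grapheme_to_phoneme(w) for w in words]))
print(pho)
```

The resulting text would typically be saved to a file and passed to the MBROLA binary together with a diphone database to obtain the audio.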
IX. SUMMARY OF RESULTS
Each module was tested individually, and after integration the framework was tested for the expected result. In text normalization the expected output is obtained: all punctuation in the input text is removed. The expected phonemes for the given input are also obtained using the neural network. Phoneme concatenation appears to be successful. With the speech engine working through MBROLA, almost 90 percent accuracy is obtained. Accuracy was lower when pronouncing words beginning with S.
X. CONCLUSION AND FUTURE WORK
In line with the goal of this project, an attempt is made to show how the computer speaks out English text. The user is given a way to input English text and listen to it. The present system only pronounces the English characters; the naturalness of the synthetic speech needs to be improved in order to convey human expression. By developing such systems, the relationship between human and computer becomes much closer. This helps in overcoming the problem of the digital divide.
This work can be extended into an interactive communication system that communicates with human beings emotionally. Since the face and its expressions play the most important role in natural communication, we are thinking of developing a face robot that can produce facial expressions similar to those of human beings, together with lifelike behaviour and gestures. The program should use neck and head movement, corresponding to the synthesized text, to create a more lifelike imitation. Emotion detection and emotion mimicry: in the future, a system that uses language processing to detect emotions in a given text and react with the appropriate facial expression, and perhaps tone, could be integrated. An additional speech analysis system may be developed in order to detect feelings by analyzing the tone of the speaker and other parameters.
REFERENCES
[1] A. Black, H. Zen and K. Tokuda, Statistical parametric speech synthesis, in Proc. ICASSP, Honolulu, HI, 2007, vol. IV, pp. 1229-1232.
[2] Deana L. Pennell, Normalization of Informal Text for Text-to-Speech, approved by Supervisory Committee Dr. Yang Liu (Co-Chair), Dr. Vincent Ng (Co-Chair), Dr. John H. L. Hansen, Dr. Haim Schweitzer.
[3] Frances Alias, Xavier Servillano, Joan Claudi Socoro and Xavier Gonzalvo, Towards High-Quality Next Generation Text-to-Speech Synthesis: A Multi Domain Approach by Automatic Domain Classification, IEEE Transactions on Audio, Speech and Language Processing, vol. 16, no. 7, September 2008.
[4] G. Bailly, N. Campbell and B. Mobius, ISCA special session: Hot topics in speech synthesis, in Proc. Eurospeech, Geneva, Switzerland, 2003, pp. 37-40.
[5] Gopalakrishna Anumanchipalli, Rahul Chitturi, Sachin Joshi, Rohit Kumar, Satinder Pal Singh, R. N. V. Sitaram, D. P. Kishore, Development of Indian Language Speech Databases for Large Vocabulary Speech Recognition System.
[6] Mirza Cilimkovic, Neural Networks and Back Propagation Algorithm, Institute of Technology Blanchardstown, Blanchardstown Road North, Dublin 15, Ireland.
[7] M. Ostendorf and I. Bulyko, The impact of speech recognition on speech synthesis, in Proc. IEEE Workshop on Speech Synthesis, Santa Monica, 2002, pp. 99-106.
[8] Richard Sproat, "Pmtools: A Pronunciation Modeling Toolkit", Proceedings of the Fourth ISCA Tutorial and Research Workshop on Speech Synthesis, Blair Atholl, Scotland, 2001.
[9] Text To Speech Synthesis, a knol by Jaibatrik Dutta.
[10] Qing Guo, Jie Zhang, Nobuyuki Katae, Hao Yu, High Quality Prosody Generation in Mandarin Text-to-Speech System, Fujitsu Sci. Tech. J., vol. 46, no. 1, pp. 40-46, 2010.
Sreepathy Journal of Computer Science and Engg.,Vol 1, Issue 1, June 2014
Cross Domain Sentiment Classification
S. Abilasha, C. H. Chithira, Fazlu Rahman, P. K. Megha, P. Safva Mol, Manu Madhavan
Dept. of Computer Science and Engg, SIMAT, Vavanoor, Palakkad
fmasc@googlegroups.com
Abstract: Sentiment analysis refers to the use of natural language processing and machine learning techniques to identify and extract subjective information from source material such as product reviews. Due to the revolutionary development of web technology and social media, reviews can span so many different domains that it is difficult to gather annotated training data for all of them. Cross domain sentiment analysis involves adapting the information learned from a (labeled) source domain to an unlabeled target domain. The method proposed in this project uses an automatically created sentiment sensitive thesaurus for domain adaptation. Based on a survey of the related literature, we identified L1 regularized logistic regression as a good binary classifier for our area of interest. In addition to the previous work, we propose the use of SentiWordNet and adjective-adverb combinations for effective feature learning.
Keywords: Sentiment Classification, Opinion Mining, Logistic Regression, Cross Domain.
I. INTRODUCTION
Cross domain sentiment classification is a method of classifying sentiments as positive or negative. Sentiment analysis refers to the use of text analysis for extracting subjective information. Sentiment classification has been applied in numerous tasks such as opinion mining, opinion summarization, contextual advertising and market analysis. Sentiments in this system refer to reviews in various domains. Users express opinions about the products or services they consume in blog posts, shopping sites, or review sites. It is useful for both consumers and producers to know what the general public thinks about a particular product or service. Automatic document level sentiment classification is the task of classifying a given review with respect to the sentiment expressed by the author of the review. For example, a sentiment classifier might classify a user review about a movie as positive or negative depending on the sentiment expressed in the review. We define cross domain sentiment classification as the problem of learning a binary classifier (i.e. positive or negative sentiment) given a small set of labeled data for the source domain, and unlabeled data for both source and target domains. In particular, no labeled data is provided for the target domain. In this work we describe a cross domain sentiment classification method. The lexical elements (unigrams or bigrams) in a review are taken and the score of each lexical element is calculated using SentiWordNet. Using this, a training dataset is created and the test data is classified according to this training data [3]. A logistic regression based algorithm is used for sentiment classification. The remaining part of this paper is organized as follows. Section 2 describes the analysis of the related literature. Section 3 contains the details of the design and implementation. The test results and experiments are presented in Section 4, and Section 5 gives the conclusion and future work of this system.
II. RELATED WORKS
Our requirement is to develop a cross domain sentiment classifier for classifying the reviews from different domains as either positive or negative. Thus this system helps to analyze on a review. In previous work [4],various methods have been used for classification in single domain.Some of the classification method involves Bayesian classification,Entropy based method,Support Vector machine,Structural correspondence learning etc.They are described as follows: 1) Domain Adaptation with Structural Correspondence Learning: Structural Correspondence Learning [6] is one of the first algorithm for domain adaptation. Many NLP tasks suffers lack of training data in the domain. To face this challenge the possible solution is adapt a source domain (known domain) to a target domain (new domain). This is called Domain Adaptation. Structural correspondence learning (SCL) is a general technique (a Domain adaptation algorithm) which can be applied to feature based classifiers, proposed by Blitzer[6]. The key idea of SCL is to identify correspondences among features from different domains by modeling their correlations with pivot features. Pivot features are features which behave in the same way for discriminative learning in both domains. Structural correspondence learning involves a source domain and a target domain. Both domains have ample unlabeled data, but only the source has labeled training data. The SCL algorithm involves selection of pivot features, training a binary classifier for every pivot features. The simplest criterion for selecting pivot feature is that it should occur frequently in the unlabeled data of both domains. The binary classifier here acts as prediction function. These binary classification problems can be trained from the unlabeled data, since they merely represent properties of the input. If the features are represented as a binary vector x, these can be solved by using m linear predictors. 
$$ f_\ell(x) = \mathrm{sgn}(\bar{w}_\ell \cdot x), \quad \ell = 1, \ldots, m $$

Since each instance contains features that are totally predictive of the pivot feature, these features are never used when making the binary prediction. That is, no feature derived from the right word is used when solving a right-token pivot predictor. The pivot predictor weight vectors are then arranged in a matrix W. Singular value decomposition is applied to W, and the top h left singular vectors are selected. A new model is then trained on the source data, with each instance x augmented by its projection onto these singular vectors. Singular value decomposition (SVD) decomposes a matrix A of order m × n into the product of three matrices, $A = L S V^{T}$, where L is an orthogonal matrix of order m × m, S is a diagonal matrix of order m × n, and V is an orthogonal matrix of order n × n.

© Dept. of Computer Science & Engg., Sreepathy Institute of Management And Technology, Vavanoor, Palakkad, 679533.
S. Abilasha, et al., Cross Domain Sentiment Classification

2) Sentiment Classification Using Machine Learning Techniques: This work [2] examines the effectiveness of applying machine learning techniques to the sentiment classification problem. A challenging aspect of this problem that seems to distinguish it from traditional topic-based classification is that, while topics are often identifiable by keywords alone, sentiment can be expressed in a more subtle manner. Sentiment classification would be helpful in business intelligence applications and recommender systems, where user input and feedback could be quickly summarized. The main aim of this work was to examine whether it suffices to treat sentiment classification simply as a special case of topic-based categorization, with the two topics being positive sentiment and negative sentiment, or whether special sentiment-categorization methods need to be developed. Three standard algorithms are evaluated in this work: Naive Bayes classification, maximum entropy classification, and support vector machines.

Naive Bayes classification: One approach to text classification is to assign to a given document d the class $c^{*} = \arg\max_{c} p(c \mid d)$. The Naive Bayes classifier is derived by first observing that, by Bayes' rule,

$$ p(c \mid d) = \frac{p(c)\, p(d \mid c)}{p(d)} $$
where p(d) plays no role in selecting $c^{*}$. To estimate the term p(d | c), Naive Bayes decomposes it by assuming that the $f_i$ are conditionally independent given d's class:

$$ P_{NB}(c \mid d) = \frac{p(c) \left( \prod_{i=1}^{m} p(f_i \mid c)^{n_i(d)} \right)}{p(d)} $$
where $f_i$ is a predefined feature that can appear in a document and $n_i(d)$ is the number of times $f_i$ occurs in d. The training method consists of relative-frequency estimation of p(c) and p($f_i$ | c), using add-one smoothing. Naive Bayes is optimal for certain problem classes with highly dependent features.

Maximum entropy classification is an alternative technique that has proven effective in a number of natural language processing applications. Its estimate of p(c | d) takes the following exponential form:

$$ P_{ME}(c \mid d) = \frac{1}{Z(d)} \exp\left( \sum_{i} \lambda_{i,c} F_{i,c}(d, c) \right) $$

where Z(d) is a normalization function and $F_{i,c}$ is a feature/class function for feature $f_i$ and class c, defined as follows:

$$ F_{i,c}(d, \bar{c}) = \begin{cases} 1 & \text{if } n_i(d) > 0 \text{ and } \bar{c} = c \\ 0 & \text{otherwise} \end{cases} $$

Maximum entropy makes no assumptions about the relationships between features, and so might perform better when conditional independence assumptions are not met.

Sreepathy Journal of Computer Sc. & Engg.

Support vector machines (SVMs) have been shown to be highly effective at traditional text categorization, generally outperforming Naive Bayes. In the two-category case, the basic idea behind the training procedure is to find a hyperplane, represented by a vector $\bar{w}$, that not only separates the document vectors in one class from those in the other, but for which the separation, or margin, is as large as possible. This search corresponds to a constrained optimization problem; letting $c_j \in \{1, -1\}$ (corresponding to positive and negative) be the correct class of document $d_j$, the solution can be written as:

$$ \bar{w} = \sum_{j} \alpha_j c_j \bar{d}_j, \quad \alpha_j \ge 0 $$
where the $\alpha_j$'s are obtained by solving a dual optimization problem. Those $\bar{d}_j$ for which $\alpha_j$ is greater than zero are called support vectors, since they are the only document vectors contributing to $\bar{w}$. Classification of test instances consists simply of determining which side of $\bar{w}$'s hyperplane they fall on.

From these observations, it can be concluded that the results produced via machine learning techniques are quite good in comparison to the human-generated baselines. In terms of relative performance, Naive Bayes tends to do the worst and SVMs tend to do the best, although the differences are not very large. On the other hand, none of these methods achieved accuracies on the sentiment classification problem comparable to those reported for standard topic-based categorization, despite the several different types of features tried.

A. Problem Identification
With the rapid growth of the Web, more and more people write reviews for all types of products and services and place them online. It is becoming common practice for a consumer to learn how others like or dislike a product before buying, or for a manufacturer to keep track of customer opinions on its products in order to improve user satisfaction. However, as the number of reviews available for any given product grows, it becomes harder and harder for people to understand and evaluate what the prevailing opinion about the product is. For example, searching Google for "Nikon D70 user review" yields about 759,000 reviews of the Nikon D70 camera, gathered from several sites. This demonstrates the need for algorithmic sentiment classification in order to digest this huge repository of hidden reviews (example taken from [4]).

B. Technical Background
Cross-domain sentiment classification of reviews is a data mining approach. Data mining (the analysis step of the "Knowledge Discovery in Databases" process, or KDD), an interdisciplinary subfield of computer science, is the computational process of discovering patterns in large data sets, involving methods at the intersection of artificial intelligence, machine learning, statistics, and database systems. The overall goal of the data mining process is to extract information from a data set and transform it into an understandable structure for further use. Aside from the raw analysis step, it involves database and data management aspects, data pre-processing, model and inference considerations, interestingness metrics, complexity considerations, post-processing of discovered structures, visualization, and online updating.

Vol.1, Issue.1, June-2014

Classification is one of the techniques used in data mining, and the proposed system is a classification technique. Classification is a supervised method of learning: a labeled training dataset is provided. This system uses training data in the form of lexical elements labeled as positive or negative. When unlabeled test data is provided, the classification algorithm assigns it a label.

C. Language Tools
1) Language Used: Python is the language used for implementing this system. Python is an object-oriented language and is widely used for machine learning and artificial intelligence work. Programming in Python is simple and easy. We mainly concentrate on string manipulation, since sentiment classification is mainly concerned with text reviews. Python provides data types such as Boolean, string, set, list, and dictionary. As in other programming languages, a Boolean takes one of two values, True or False. Strings in Python can be created using single quotes, double quotes, or triple quotes. With triple quotes, a string can span several lines without using the escape character.

2) Tools Used: The Python tools used for this method are NLTK (Natural Language Toolkit), SciPy, and NumPy. These tools are described here in detail. The Natural Language Toolkit, more commonly NLTK, is a suite of libraries and programs for symbolic and statistical natural language processing (NLP) for the Python programming language. NLTK includes graphical demonstrations and sample data.
NLTK is intended to support research and teaching in NLP and closely related areas, including empirical linguistics, cognitive science, artificial intelligence, information retrieval, and machine learning. NLTK has been used successfully as a teaching tool, as an individual study tool, and as a platform for prototyping and building research systems. The corpus reader functions in NLTK can be used to read documents from a corpus. Corpus reader functions are named after the type of information they return. Some common examples, and their return types, are:
• words(): list of str
• paras(): list of (list of (list of str))
• tagged_words(): list of (str, str) tuples
• tagged_sents(): list of (list of (str, str))

NumPy is an extension to the Python programming language, adding support for large, multi-dimensional arrays and matrices, along with a large library of high-level mathematical functions to operate on these arrays. Because the standard Python implementation is an interpreter, mathematical algorithms written in it often run slower than compiled equivalents. NumPy seeks to address this problem for numerical algorithms by providing multidimensional arrays together with functions and operators that operate efficiently on them. Thus any algorithm that can be expressed primarily as operations on arrays and matrices can run almost as quickly as the equivalent C code.
The core functionality of NumPy is its ndarray (n-dimensional array) data structure. In contrast to Python's built-in list data structure, these arrays are homogeneously typed: all elements of a single array must be of the same type. The basic data structure in SciPy is a multidimensional array provided by the NumPy module. NumPy provides some functions for linear algebra, Fourier transforms, and random number generation, but not with the generality of the equivalent functions in SciPy. NumPy can also be used as an efficient multi-dimensional container of data with arbitrary data types, which allows it to integrate seamlessly and speedily with a wide variety of databases. Older versions of SciPy used Numeric as an array type, which is now deprecated in favor of the newer NumPy array code.

III. DESIGN AND IMPLEMENTATION
Cross-domain sentiment classification is a method of classification applied when we do not have any labeled data for a target domain but have some labeled data for multiple other domains, designated as the source domains. It focuses on the challenge of training a classifier on one or more domains (source domains) and applying the trained classifier in a different domain (target domain). A cross-domain sentiment classification system must overcome two main challenges. First, it must identify which source domain features are related to which target domain features. Second, it requires a learning framework to incorporate the information regarding the relatedness of source and target domain features.

A. Input Design
We use labeled data from multiple source domains and unlabeled data from the source and target domains to represent the distribution of features. Given a labeled or an unlabeled review, the first step is to split the review into individual sentences. This is done in the preprocessing stage, so the review itself is the input to the first stage.
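The sentence-splitting step that opens the preprocessing stage can be sketched as follows. This is a minimal illustration rather than the system's actual code: the function name split_sentences and the rule that a sentence ends at '.', '!' or '?' followed by whitespace are assumptions, and a trained sentence tokenizer, such as the one provided by NLTK, would handle abbreviations and similar cases more reliably.

```python
import re

# Assumption: a sentence ends at '.', '!' or '?' followed by whitespace.
_SENTENCE_END = re.compile(r"(?<=[.!?])\s+")


def split_sentences(review):
    """Split a review into sentences at terminal punctuation followed by whitespace."""
    return [s for s in _SENTENCE_END.split(review.strip()) if s]


review = "Excellent and broad survey. The print is small! Would I buy it again? Yes."
sentences = split_sentences(review)
for sentence in sentences:
    print(sentence)
```

Each sentence returned here would then be POS-tagged before being passed to the later stages.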
In the next stage, the POS-tagged words produced by preprocessing are passed on, so the input to SentiWordNet is the set of POS-tagged sentences. The score obtained from this stage is the input to the following stage; that is, the input to logistic regression is the score calculated using SentiWordNet.

B. Output Design
The output of the preprocessing stage is the set of POS-tagged sentences. This output is the input to the next stage, SentiWordNet, where the score is calculated and returned as output. The score is then passed to logistic regression, whose output is the prediction of whether the given sentence or review is positive or not.

C. Module Description
We describe a sentiment classification method that is applicable when we do not have any labeled data for a target domain but have some labeled data for multiple other domains, designated as the source domains. The proposed system has four main modules:
• Preprocessing: In this module, we first select the lexical elements that co-occur within a review sentence as features. Second, from each source-domain labeled review sentence in which a lexical element occurs, we create sentiment features by appending the label of the review to each lexical element generated from that review. We use the notation *P to indicate positive sentiment features and *N to indicate negative sentiment features. In addition to word-level sentiment features, we replace words with their POS tags to create POS-level sentiment features. POS tags generalize the word-level sentiment features, thereby reducing feature sparseness. We then apply a simple word filter based on POS tags to select content words (nouns, verbs, adjectives, and adverbs). The preprocessing of a sentence is illustrated in Table I.

TABLE I: Generating lexical elements and sentiment features.
sentence: Excellent and broad survey of the development of civilization.
POS tags: Excellent/JJ and/CC broad/JJ survey/NN1 of/IO the/AT development/NN1 of/IO civilization/NN1
lexical elements (unigrams): excellent, broad, survey, development, civilization
lexical elements (bigrams): excellent+broad, broad+survey, survey+development, development+civilization
sentiment features (lemma): excellent*P, broad*P, survey*P, excellent+broad*P, broad+survey*P
sentiment features (POS): JJ*P, NN1*P, JJ+NN1*P
• SentiWordNet: SentiWordNet is a lexical resource for opinion mining [1]. It assigns to each synset of WordNet two sentiment scores, positivity and negativity. SentiWordNet is available as an online dictionary and provides a positive and a negative score for each lexical element.
• Logistic Regression: The input to this module is the output of the previous module, that is, the scores obtained from SentiWordNet together with the labeled elements. This input forms the training set. When test data is then given, the module predicts whether it is positive or not.
• Cross Domain: Up to this point, reviews from a single domain are classified. The classification is now extended to multiple domains.

D. Implementation
1) System Architecture: We use labeled data from multiple source domains and unlabeled data from the source and target domains to represent the distribution of features. Given a labeled or an unlabeled review, the first step is to split the review into individual sentences. This is done in the preprocessing stage, so the review itself is the input to the first stage. We first select the lexical elements that co-occur within a review sentence as features. Second, from each source-domain labeled review sentence in which a lexical element occurs, we create sentiment features by appending the label of the review to each lexical element generated from that review. We use the notation *P to indicate positive sentiment features and *N to indicate negative sentiment features. In addition to word-level sentiment features, we replace words with their POS tags to create POS-level sentiment features. POS tags generalize the word-level sentiment features, thereby reducing feature sparseness. We then apply a simple word filter based on POS tags to select content words (nouns, verbs, adjectives, and adverbs).

The output of the preprocessing stage is the set of POS-tagged sentences, which is passed as input to the next stage, SentiWordNet. SentiWordNet is a lexical resource for opinion mining. It assigns to each synset of WordNet two sentiment scores, positivity and negativity, and so provides a positive and a negative score for each lexical element. At this stage the score is calculated and returned as output. The score is the input to the next stage; that is, the input to logistic regression is the score calculated using SentiWordNet.
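The stages described so far, together with the logistic regression stage that consumes the scores, can be sketched end to end as follows. This is a toy illustration under stated assumptions, not the authors' implementation: TOY_LEXICON is a hypothetical stand-in for SentiWordNet with made-up scores, content words are selected by POS-tag prefix, each sentence is reduced to two features (summed positive score and summed negative score), and the classifier is plain unregularized logistic regression trained by stochastic gradient descent. All function names are illustrative.

```python
import math

# Assumption: content words are picked by POS-tag prefix (adjective, noun,
# verb, adverb families); the exact tag set is not specified in the text.
CONTENT_PREFIXES = ("JJ", "NN", "VB", "VV", "RB", "RR")

# Hypothetical stand-in for SentiWordNet: word -> (positive, negative) score.
TOY_LEXICON = {
    "excellent": (0.9, 0.0), "broad": (0.3, 0.0), "good": (0.7, 0.0),
    "poor": (0.0, 0.8), "broken": (0.0, 0.7), "slow": (0.0, 0.5),
}


def lexical_elements(tagged):
    """Return (unigrams, bigrams) over the content words of a tagged sentence."""
    words = [w.lower() for w, tag in tagged if tag.startswith(CONTENT_PREFIXES)]
    bigrams = [a + "+" + b for a, b in zip(words, words[1:])]
    return words, bigrams


def sentiment_features(elements, label):
    """Append *P (positive) or *N (negative) to each lexical element."""
    suffix = "*P" if label == "positive" else "*N"
    return [e + suffix for e in elements]


def score(words, lexicon=TOY_LEXICON):
    """Sum the positive and negative scores of the words found in the lexicon."""
    pos = sum(lexicon[w][0] for w in words if w in lexicon)
    neg = sum(lexicon[w][1] for w in words if w in lexicon)
    return [pos, neg]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train(rows, labels, rate=0.5, epochs=300):
    """Fit weights and bias by stochastic gradient descent on the log loss."""
    weights, bias = [0.0] * len(rows[0]), 0.0
    for _ in range(epochs):
        for x, y in zip(rows, labels):
            error = sigmoid(bias + sum(w * v for w, v in zip(weights, x))) - y
            weights = [w - rate * error * v for w, v in zip(weights, x)]
            bias -= rate * error
    return weights, bias


def predict(model, x):
    """Label a feature vector as positive when the predicted probability is >= 0.5."""
    weights, bias = model
    p = sigmoid(bias + sum(w * v for w, v in zip(weights, x)))
    return "positive" if p >= 0.5 else "negative"


# The tagged sentence from Table I.
tagged = [("Excellent", "JJ"), ("and", "CC"), ("broad", "JJ"), ("survey", "NN1"),
          ("of", "IO"), ("the", "AT"), ("development", "NN1"), ("of", "IO"),
          ("civilization", "NN1")]
unigrams, bigrams = lexical_elements(tagged)
features = sentiment_features(unigrams + bigrams, "positive")

# Tiny made-up training set: two positive and two negative one-word "reviews".
train_words = [["excellent"], ["good"], ["poor"], ["broken"]]
model = train([score(w) for w in train_words], [1, 1, 0, 0])
prediction = predict(model, score(unigrams))
print(unigrams, bigrams, prediction)
```

With the tagged sentence from Table I, lexical_elements reproduces the unigram and bigram rows of the table. Using only two summed scores as features keeps the sketch small; a real system would likely use a richer feature vector.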
The input to the logistic regression stage is the output of the previous module, that is, the scores obtained from SentiWordNet together with the labelled elements. This input forms the training set. When test data is then given, the classifier predicts whether it is positive or not. So far the implementation covers a single domain; it is now to be extended to the cross-domain setting.

IV. EXPERIMENTS AND RESULTS
To evaluate our method, we use the cross-domain sentiment classification dataset. This dataset consists of Amazon product reviews for three different product types: books, electronics, and movies. To test the classifier across domains, we provide another domain, phones. This benchmark dataset has been used in much previous work on cross-domain sentiment classification, and by evaluating on it we can directly compare our method against existing approaches.

The accuracy of sentiment classification in the single movie domain was calculated by taking a maximum of 1000 documents for training and an average of 100 documents for testing. Each time, the number of documents in the training set is varied in steps of 100 while the number of test documents is kept fixed, and a graph is plotted. Figure 1 shows the accuracy in the movie domain. The graph shows that accuracy tends to increase as the amount of training data increases. The average accuracy was found to be 80 percent. A similar experiment was repeated in the other two domains, which also show an increasing trend in accuracy with more training data. The accuracy is computed as the percentage of correctly classified target-domain reviews out of the total number of reviews in the target domain.

Fig. 1: Accuracy in movie domain

To test the classifier across domains, we provide a set of test reviews from the phones domain; the average accuracy in classifying these test data is found to be 75 percent. The experiment is conducted with 1000 documents taken from all three domains. The graph in Figure 2 shows the result.

Fig. 2: Accuracy in cross domain

Sometimes reviews are misclassified, for reasons such as the failure of the previously trained logistic regression classifier to adapt to new test data, whether from the same domain or from another domain, or the absence of the relevant synsets from SentiWordNet.

V. CONCLUSION AND FUTURE WORK
Sentiment analysis is a method of classifying large numbers of reviews by analysing their opinion strength. We have implemented a system that classifies the reviews from a single domain. The L1 logistic regression method is used for this classification.

REFERENCES
[1] Andrea Esuli and Fabrizio Sebastiani, "SENTIWORDNET: A Publicly Available Lexical Resource for Opinion Mining", in Proc. of the 5th Conf. on Language Resources and Evaluation (LREC06), 2006.
[2] Bo Pang, Lillian Lee, and Shivakumar Vaithyanathan, "Thumbs up? Sentiment Classification using Machine Learning Techniques", in EMNLP, pp. 79-86, 2002.
[3] Bo Pang and Lillian Lee, "Opinion mining and sentiment analysis", Foundations and Trends in Information Retrieval, 2(1-2):1-135, 2008.
[4] Danushka Bollegala, David Weir, and John Carroll, "Using Multiple Sources to Construct a Sentiment Sensitive Thesaurus for Cross-Domain Sentiment Classification", in Proc. of the 49th Annual Meeting of the ACL: Human Language Technologies, Vol. 1, pp. 132-14, 2011.
[5] Farah Benamara, Sabatier Irit, Carmine Cesarano, Napoli Federico, Diego Reforgiato, "Sentiment Analysis: Adjectives and Adverbs are better than Adjectives Alone", in Proc. of Int. Conf. on Weblogs and Social Media, 2007.
[6] John Blitzer, Ryan McDonald, and Fernando Pereira, "Domain Adaptation with Structural Correspondence Learning", in EMNLP, 2006.
[7] Sinno Jialin Pan, Xiaochuan Ni, Jian-Tao Sun, Qiang Yang, and Zheng Chen, "Cross-Domain Sentiment Classification via Spectral Feature Alignment", in WWW, 2010.
[8] Steven Bird, et al., Natural Language Processing with Python, O'Reilly Media Inc., 2009.