ISSN (ONLINE) : 2045-8711 ISSN (PRINT) : 2045-869X
INTERNATIONAL JOURNAL OF INNOVATIVE TECHNOLOGY & CREATIVE ENGINEERING
AUGUST 2017 VOL-7 NO-08
@IJITCE Publication
INTERNATIONAL JOURNAL OF INNOVATIVE TECHNOLOGY AND CREATIVE ENGINEERING (ISSN:2045-8711) VOL.7 NO.08 AUGUST 2017, IMPACT FACTOR: 1.04
UK: Managing Editor International Journal of Innovative Technology and Creative Engineering 1a park lane, Cranford London TW59WA UK E-Mail: editor@ijitce.co.uk Phone: +44-773-043-0249 USA: Editor International Journal of Innovative Technology and Creative Engineering Dr. Arumugam Department of Chemistry University of Georgia GA-30602, USA. Phone: 001-706-206-0812 Fax:001-706-542-2626 India: Editor International Journal of Innovative Technology & Creative Engineering Dr. Arthanariee. A. M Finance Tracking Center India 66/2 East mada st, Thiruvanmiyur, Chennai -600041 Mobile: 91-7598208700
www.ijitce.co.uk
From Editor's Desk
Dear Researcher,
Greetings! The research article in this issue discusses motivational factor analysis. Let us review research around the world this month.
The first evidence for an exomoon, a moon orbiting a planet that orbits a distant star, may have been spotted in data from the Kepler space telescope. But exomoons in general may be rare, at least around planets close to their stars. Kepler detects a planet by the dip in starlight as the planet passes in front of its star; a second, smaller dip that appears ahead of or behind the planet could reveal a moon. Such exomoons, researchers have speculated, may be among the best places in the universe to look for extraterrestrial life. But because those signals are faint and inconsistent, they take a lot of computing power to find. Kipping has been searching for such signals for years in a project called the Hunt for Exomoons with Kepler. The object, if it exists, orbits a planet slightly larger than Jupiter around a star about 4,000 light-years away. Because the potential moon is probably about the size of Neptune, the team nicknamed it “Neptmoon.” If confirmed, this moon would be almost in a class of its own. The team calculated that, statistically speaking, only 38 percent of Jupiter-like planets close to their stars are likely to host moons like Jupiter’s. That’s surprising, but given that there are thousands of exoplanets still to check, more moons may still be out there. The hunt continues.
It has been an absolute pleasure to present you with articles that you wish to read. We look forward to many more research articles on new technologies from you and your friends. We eagerly await the rich and thorough research papers that our authors have prepared for the next issue.
Thanks, Editorial Team IJITCE
Editorial Members Dr. Chee Kyun Ng Ph.D Department of Computer and Communication Systems, Faculty of Engineering,Universiti Putra Malaysia,UPMSerdang, 43400 Selangor,Malaysia. Dr. Simon SEE Ph.D Chief Technologist and Technical Director at Oracle Corporation, Associate Professor (Adjunct) at Nanyang Technological University Professor (Adjunct) at ShangaiJiaotong University, 27 West Coast Rise #08-12,Singapore 127470 Dr. sc.agr. Horst Juergen SCHWARTZ Ph.D, Humboldt-University of Berlin,Faculty of Agriculture and Horticulture,Asternplatz 2a, D-12203 Berlin,Germany Dr. Marco L. BianchiniPh.D Italian National Research Council; IBAF-CNR,Via Salaria km 29.300, 00015 MonterotondoScalo (RM),Italy Dr. NijadKabbaraPh.D Marine Research Centre / Remote Sensing Centre/ National Council for Scientific Research, P. O. Box: 189 Jounieh,Lebanon Dr. Aaron Solomon Ph.D Department of Computer Science, National Chi Nan University,No. 303, University Road,Puli Town, Nantou County 54561,Taiwan Dr. Arthanariee. A. M M.Sc.,M.Phil.,M.S.,Ph.D Director - Bharathidasan School of Computer Applications, Ellispettai, Erode, Tamil Nadu,India Dr. Takaharu KAMEOKA, Ph.D Professor, Laboratory of Food, Environmental & Cultural Informatics Division of Sustainable Resource Sciences, Graduate School of Bioresources,Mie University, 1577 Kurimamachiya-cho, Tsu, Mie, 514-8507, Japan Dr. M. Sivakumar M.C.A.,ITIL.,PRINCE2.,ISTQB.,OCP.,ICP. Ph.D. Project Manager - Software,Applied Materials,1a park lane,cranford,UK Dr. Bulent AcmaPh.D Anadolu University, Department of Economics,Unit of Southeastern Anatolia Project(GAP),26470 Eskisehir,TURKEY Dr. SelvanathanArumugamPh.D Research Scientist, Department of Chemistry, University of Georgia, GA-30602,USA.
Review Board Members Dr. Paul Koltun Senior Research ScientistLCA and Industrial Ecology Group,Metallic& Ceramic Materials,CSIRO Process Science & Engineering Private Bag 33, Clayton South MDC 3169,Gate 5 Normanby Rd., Clayton Vic. 3168, Australia Dr. Zhiming Yang MD., Ph. D. Department of Radiation Oncology and Molecular Radiation Science,1550 Orleans Street Rm 441, Baltimore MD, 21231,USA Dr. Jifeng Wang Department of Mechanical Science and Engineering, University of Illinois at Urbana-Champaign Urbana, Illinois, 61801, USA Dr. Giuseppe Baldacchini ENEA - Frascati Research Center, Via Enrico Fermi 45 - P.O. Box 65,00044 Frascati, Roma, ITALY.
Dr. MutamedTurkiNayefKhatib Assistant Professor of Telecommunication Engineering,Head of Telecommunication Engineering Department,Palestine Technical University (Kadoorie), TulKarm, PALESTINE. Dr.P.UmaMaheswari Prof &Head,Depaartment of CSE/IT, INFO Institute of Engineering,Coimbatore. Dr. T. Christopher, Ph.D., Assistant Professor &Head,Department of Computer Science,Government Arts College(Autonomous),Udumalpet, India. Dr. T. DEVI Ph.D. Engg. (Warwick, UK), Head,Department of Computer Applications,Bharathiar University,Coimbatore-641 046, India. Dr. Renato J. orsato Professor at FGV-EAESP,Getulio Vargas Foundation,São Paulo Business School,RuaItapeva, 474 (8° andar),01332-000, São Paulo (SP), Brazil Visiting Scholar at INSEAD,INSEAD Social Innovation Centre,Boulevard de Constance,77305 Fontainebleau - France Y. BenalYurtlu Assist. Prof. OndokuzMayis University Dr.Sumeer Gul Assistant Professor,Department of Library and Information Science,University of Kashmir,India Dr. ChutimaBoonthum-Denecke, Ph.D Department of Computer Science,Science& Technology Bldg., Rm 120,Hampton University,Hampton, VA 23688 Dr. Renato J. Orsato Professor at FGV-EAESP,Getulio Vargas Foundation,São Paulo Business SchoolRuaItapeva, 474 (8° andar),01332-000, São Paulo (SP), Brazil Dr. Lucy M. Brown, Ph.D. Texas State University,601 University Drive,School of Journalism and Mass Communication,OM330B,San Marcos, TX 78666 JavadRobati Crop Production Departement,University of Maragheh,Golshahr,Maragheh,Iran VineshSukumar (PhD, MBA) Product Engineering Segment Manager, Imaging Products, Aptina Imaging Inc. Dr. Binod Kumar PhD(CS), M.Phil.(CS), MIAENG,MIEEE HOD & Associate Professor, IT Dept, Medi-Caps Inst. of Science & Tech.(MIST),Indore, India Dr. S. B. Warkad Associate Professor, Department of Electrical Engineering, Priyadarshini College of Engineering, Nagpur, India Dr. doc. Ing. RostislavChoteborský, Ph.D. 
Katedramateriálu a strojírenskétechnologieTechnickáfakulta,Ceskázemedelskáuniverzita v Praze,Kamýcká 129, Praha 6, 165 21 Dr. Paul Koltun Senior Research ScientistLCA and Industrial Ecology Group,Metallic& Ceramic Materials,CSIRO Process Science & Engineering Private Bag 33, Clayton South MDC 3169,Gate 5 Normanby Rd., Clayton Vic. 3168 DR.ChutimaBoonthum-Denecke, Ph.D Department of Computer Science,Science& Technology Bldg.,HamptonUniversity,Hampton, VA 23688 Mr. Abhishek Taneja B.sc(Electronics),M.B.E,M.C.A.,M.Phil., Assistant Professor in the Department of Computer Science & Applications, at Dronacharya Institute of Management and Technology, Kurukshetra. (India).
Dr. Ing. RostislavChotěborský,ph.d, Katedramateriálu a strojírenskétechnologie, Technickáfakulta,Českázemědělskáuniverzita v Praze,Kamýcká 129, Praha 6, 165 21
Dr. AmalaVijayaSelvi Rajan, B.sc,Ph.d, Faculty – Information Technology Dubai Women’s College – Higher Colleges of Technology,P.O. Box – 16062, Dubai, UAE Naik Nitin AshokraoB.sc,M.Sc Lecturer in YeshwantMahavidyalayaNanded University Dr.A.Kathirvell, B.E, M.E, Ph.D,MISTE, MIACSIT, MENGG Professor - Department of Computer Science and Engineering,Tagore Engineering College, Chennai Dr. H. S. Fadewar B.sc,M.sc,M.Phil.,ph.d,PGDBM,B.Ed. Associate Professor - Sinhgad Institute of Management & Computer Application, Mumbai-BangloreWesternly Express Way Narhe, Pune - 41 Dr. David Batten Leader, Algal Pre-Feasibility Study,Transport Technologies and Sustainable Fuels,CSIRO Energy Transformed Flagship Private Bag 1,Aspendale, Vic. 3195,AUSTRALIA Dr R C Panda (MTech& PhD(IITM);Ex-Faculty (Curtin Univ Tech, Perth, Australia))Scientist CLRI (CSIR), Adyar, Chennai - 600 020,India Miss Jing He PH.D. Candidate of Georgia State University,1450 Willow Lake Dr. NE,Atlanta, GA, 30329 Jeremiah Neubert Assistant Professor,MechanicalEngineering,University of North Dakota Hui Shen Mechanical Engineering Dept,Ohio Northern Univ. Dr. Xiangfa Wu, Ph.D. Assistant Professor / Mechanical Engineering,NORTH DAKOTA STATE UNIVERSITY SeraphinChallyAbou Professor,Mechanical& Industrial Engineering Depart,MEHS Program, 235 Voss-Kovach Hall,1305 OrdeanCourt,Duluth, Minnesota 55812-3042 Dr. Qiang Cheng, Ph.D. Assistant Professor,Computer Science Department Southern Illinois University CarbondaleFaner Hall, Room 2140-Mail Code 45111000 Faner Drive, Carbondale, IL 62901 Dr. Carlos Barrios, PhD Assistant Professor of Architecture,School of Architecture and Planning,The Catholic University of America Y. BenalYurtlu Assist. Prof. OndokuzMayis University Dr. Lucy M. Brown, Ph.D. Texas State University,601 University Drive,School of Journalism and Mass Communication,OM330B,San Marcos, TX 78666 Dr. 
Paul Koltun Senior Research ScientistLCA and Industrial Ecology Group,Metallic& Ceramic Materials CSIRO Process Science & Engineering Dr.Sumeer Gul Assistant Professor,Department of Library and Information Science,University of Kashmir,India
Dr. Chutima Boonthum-Denecke, Ph.D Department of Computer Science, Science & Technology Bldg., Rm 120, Hampton University, Hampton, VA 23688
Dr. Renato J. Orsato Professor at FGV-EAESP,Getulio Vargas Foundation,São Paulo Business School,RuaItapeva, 474 (8° andar)01332-000, São Paulo (SP), Brazil Dr. Wael M. G. Ibrahim Department Head-Electronics Engineering Technology Dept.School of Engineering Technology ECPI College of Technology 5501 Greenwich Road Suite 100,Virginia Beach, VA 23462 Dr. Messaoud Jake Bahoura Associate Professor-Engineering Department and Center for Materials Research Norfolk State University,700 Park avenue,Norfolk, VA 23504 Dr. V. P. Eswaramurthy M.C.A., M.Phil., Ph.D., Assistant Professor of Computer Science, Government Arts College(Autonomous), Salem-636 007, India. Dr. P. Kamakkannan,M.C.A., Ph.D ., Assistant Professor of Computer Science, Government Arts College(Autonomous), Salem-636 007, India. Dr. V. Karthikeyani Ph.D., Assistant Professor of Computer Science, Government Arts College(Autonomous), Salem-636 008, India. Dr. K. Thangadurai Ph.D., Assistant Professor, Department of Computer Science, Government Arts College ( Autonomous ), Karur - 639 005,India. Dr. N. Maheswari Ph.D., Assistant Professor, Department of MCA, Faculty of Engineering and Technology, SRM University, Kattangulathur, Kanchipiram Dt - 603 203, India. Mr. Md. Musfique Anwar B.Sc(Engg.) Lecturer, Computer Science & Engineering Department, Jahangirnagar University, Savar, Dhaka, Bangladesh. Mrs. Smitha Ramachandran M.Sc(CS)., SAP Analyst, Akzonobel, Slough, United Kingdom. Dr. V. Vallimayil Ph.D., Director, Department of MCA, Vivekanandha Business School For Women, Elayampalayam, Tiruchengode - 637 205, India. Mr. M. Moorthi M.C.A., M.Phil., Assistant Professor, Department of computer Applications, Kongu Arts and Science College, India PremaSelvarajBsc,M.C.A,M.Phil Assistant Professor,Department of Computer Science,KSR College of Arts and Science, Tiruchengode Mr. G. Rajendran M.C.A., M.Phil., N.E.T., PGDBM., PGDBF., Assistant Professor, Department of Computer Science, Government Arts College, Salem, India. 
Dr. Pradeep H Pendse B.E.,M.M.S.,Ph.d Dean - IT,Welingkar Institute of Management Development and Research, Mumbai, India Muhammad Javed Centre for Next Generation Localisation, School of Computing, Dublin City University, Dublin 9, Ireland Dr. G. GOBI Assistant Professor-Department of Physics,Government Arts College,Salem - 636 007 Dr.S.Senthilkumar Post Doctoral Research Fellow, (Mathematics and Computer Science & Applications),UniversitiSainsMalaysia,School of Mathematical Sciences, Pulau Pinang-11800,[PENANG],MALAYSIA. Manoj Sharma Associate Professor Deptt. of ECE, PrannathParnami Institute of Management & Technology, Hissar, Haryana, India
RAMKUMAR JAGANATHAN Asst-Professor,Dept of Computer Science, V.L.B Janakiammal college of Arts & Science, Coimbatore,Tamilnadu, India Dr. S. B. Warkad Assoc. Professor, Priyadarshini College of Engineering, Nagpur, Maharashtra State, India Dr. Saurabh Pal Associate Professor, UNS Institute of Engg. & Tech., VBS Purvanchal University, Jaunpur, India Manimala Assistant Professor, Department of Applied Electronics and Instrumentation, St Joseph’s College of Engineering & Technology, Choondacherry Post, Kottayam Dt. Kerala -686579 Dr. Qazi S. M. Zia-ul-Haque Control Engineer Synchrotron-light for Experimental Sciences and Applications in the Middle East (SESAME),P. O. Box 7, Allan 19252, Jordan Dr. A. Subramani, M.C.A.,M.Phil.,Ph.D. Professor,Department of Computer Applications, K.S.R. College of Engineering, Tiruchengode - 637215 Dr. SeraphinChallyAbou Professor, Mechanical & Industrial Engineering Depart. MEHS Program, 235 Voss-Kovach Hall, 1305 Ordean Court Duluth, Minnesota 55812-3042 Dr. K. Kousalya Professor, Department of CSE,Kongu Engineering College,Perundurai-638 052 Dr. (Mrs.) R. Uma Rani Asso.Prof., Department of Computer Science, Sri Sarada College For Women, Salem-16, Tamil Nadu, India. MOHAMMAD YAZDANI-ASRAMI Electrical and Computer Engineering Department, Babol"Noshirvani" University of Technology, Iran. Dr. Kulasekharan, N, Ph.D Technical Lead - CFD,GE Appliances and Lighting, GE India,John F Welch Technology Center,Plot # 122, EPIP, Phase 2,Whitefield Road,Bangalore – 560066, India. Dr. Manjeet Bansal Dean (Post Graduate),Department of Civil Engineering,Punjab Technical University,GianiZail Singh Campus,Bathinda -151001 (Punjab),INDIA Dr. Oliver Jukić Vice Dean for education,Virovitica College,MatijeGupca 78,33000 Virovitica, Croatia Dr. Lori A. Wolff, Ph.D., J.D. Professor of Leadership and Counselor Education,The University of Mississippi,Department of Leadership and Counselor Education, 139 Guyton University, MS 38677
Contents
A Survey on Document Clustering Using WordNet. M.Sangeetha, S.Subasri & T.Priyanka
Wireless Sensor Network Using Monitoring the Environmental Activities. S.Kalaivani & Dr. P.Radha
A Survey on Document Clustering Using WordNet
M.Sangeetha
Assistant Professor, PG and Research Department of Computer Science, Kaamadhenu Arts and Science College, Sathyamangalam, Tamil Nadu, India.
S.Subasri & T.Priyanka
M.Phil Research Scholar, PG and Research Department of Computer Science, Kaamadhenu Arts and Science College, Sathyamangalam, Tamil Nadu, India.
Abstract- WordNet is connected to several databases of the semantic web. It is also commonly reused via mappings between WordNet synsets and the categories of ontologies; most often, only the top-level categories of WordNet are mapped. WordNet is used for a number of different purposes in information systems, including word-sense disambiguation, information retrieval, automatic text classification, automatic text summarization, machine translation and even automatic crossword puzzle generation. Most of this information is stored as unstructured text. The large volume of such data has led to the need for systematic clustering that supports easy retrieval, organization and summarization, a task typically treated as part of data mining. In this paper we present a survey of document clustering using WordNet, covering different attributes and algorithms. WordNet-based algorithms use a semantic similarity measure and are designed to solve problems in text clustering. The semantic algorithm is compared with the other algorithms and proved to be more efficient, producing purer clusters.
Keywords- Suffix Tree, Lingo, Suffix Array, Information Retrieval, Search Engine, Semantic, Tree Clustering, Document Clustering.
1. INTRODUCTION
Information retrieval plays an important role in our daily life, and its largest role is observed in search engines. Most users rely on Web search engines to look for specific information on the Web. These search engines often return a long list of search results ranked by their relevance to the given query. Web users have to go through the long list and inspect the titles and snippets sequentially to recognize the required results. Filtering the search engine's results consumes the user's effort and time, especially when multiple sub-topics of the given query are mixed together[1][2]. This paper describes how to overcome some of the major limitations of current search engines. We proposed a multi-agent based information retrieval system to enhance the search process. We used different types of agents, each with its own responsibility. We organize the results of a Web search engine by clustering them into different categories for a given query. We utilized the WordNet ontology and several approaches to cluster the results into the appropriate category according to WordNet synsets.
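The idea of assigning search results to categories derived from WordNet synsets can be illustrated with a minimal Python sketch. It is not the system described in the surveyed work: the `SENSES` dictionary is a small hand-written stand-in for WordNet synsets of the ambiguous query "jaguar", and the function names and sample snippets are hypothetical.

```python
import re
from collections import defaultdict

# Hypothetical sense inventory standing in for WordNet synsets of "jaguar".
# Each sense maps to a few related words (synonyms, hypernyms, gloss terms).
SENSES = {
    "jaguar (animal)": {"cat", "feline", "panthera", "predator", "wild", "jungle"},
    "jaguar (car)": {"car", "automobile", "vehicle", "sedan", "engine", "dealer"},
}

def tokenize(text):
    """Lowercase a snippet and return its set of alphabetic word tokens."""
    return set(re.findall(r"[a-z]+", text.lower()))

def cluster_by_sense(snippets, senses=SENSES, other_label="other"):
    """Assign each snippet to the sense whose related words it overlaps most."""
    clusters = defaultdict(list)
    for snippet in snippets:
        words = tokenize(snippet)
        # Score every sense by the number of words it shares with the snippet.
        scores = {label: len(words & related) for label, related in senses.items()}
        best = max(scores, key=scores.get)
        # Snippets with no overlap at all go to a catch-all cluster.
        clusters[best if scores[best] > 0 else other_label].append(snippet)
    return dict(clusters)

snippets = [
    "The jaguar is a wild cat native to the Americas",
    "Jaguar unveils a new luxury sedan with a hybrid engine",
    "Find a Jaguar dealer and book a car test drive",
    "Jaguar habitat: jungle predator at risk",
    "Jaguar, the band, announces tour dates",
]
clusters = cluster_by_sense(snippets)
for label, members in clusters.items():
    print(label, len(members))
```

A real system would presumably obtain the related words from WordNet itself and use a stronger similarity measure than raw word overlap; the sketch only shows how sense-based categories separate the sub-topics of an ambiguous query.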
Search engines are an invaluable tool for retrieving information from the internet. On the other hand, they tend to return an enormous number of search results, which makes finding the relevant ones a time-consuming task. Moreover, if the relevant results do not occur in the first part of the returned results, the user may fail to find them[4][5]. Nowadays, the development of a search results clustering system involves semantic search results clustering, which uses the semantic meaning of words to cluster. This idea considers semantically related words such as synonyms or hyponyms to increase the quality of the clusters. In 2010, Ahmed Sameh and Amar Kadray proposed the Semantic Lingo algorithm, which uses synonyms to extract phrase terms for use as discovered cluster labels. Semantic Lingo extends the Lingo technique by adding semantic recognition, in particular by using the WordNet database, and this semantic recognition can improve the quality of the clusters generated.
Suffix tree clustering, a fast, incremental, linear-time clustering algorithm, has attracted wide attention. Carrot2, a well-known open source clustering search engine, includes an implementation based on the suffix tree clustering algorithm[6][7][8]. However, label containment and duplication exist in the clustering results, which places an added burden on users trying to find the information that interests them. This paper aims to solve this problem.
Existing search engines always return a long list of results for a given query, ranked by their relevance to that query. Information retrieval and ranking functions are vital to search engines[8][9]. To address the above challenges, some effective approaches such as web page categorization (Yahoo!), query classification and search results clustering have been used or proposed.
Search results clustering has proved to be the more effective way: it automatically groups similar documents in the search results returned by a search engine, online, into a hierarchy of labeled clusters. Based on the model described by the third approach, we argue that the key factors for search results clustering are the following. (1) The quality of cluster labels: having a meaningful, unambiguous label for each cluster is very important[11][12]. (2) The accuracy of clustering results: documents in the same cluster should share a consistent theme. (3) A relatively short response time. As noted, if the relevant results do not occur in the first part of the returned results, the user may fail to find them. A possible solution to this problem is the use of search results clustering
techniques that work on snippets, short texts summarizing the context of each search result. The main role of a search results clustering engine is to cluster search results into groups of related data and to create a navigator that gives users easy access to the relevant results. The development of such a system involves semantic search results clustering, which uses the semantic meaning of words to cluster and considers semantically related words such as synonyms or hyponyms to increase the quality of the clusters[13][14][15].
2. RELATED WORKS
A) Semantic Clustering Approach Based Multi-agent System for Information Retrieval on Web
Document clustering is an important technology that helps users organize the large amount of online information, especially after the rapid growth of the Web. The paper focuses on a semantic document clustering method and its application in search engines. The authors propose a multi-agent based information retrieval system to enhance the search process. The agents retrieve the results of a Web search engine and organize them by clustering them into different categories for a given query. The WordNet ontology and several approaches are used to cluster the results into the appropriate category according to WordNet synsets. The experiments show that semantic clustering works better than the original clustering. The paper investigates the problem of how to cluster the results returned by search engines. Queries are often ambiguous because many words have multiple meanings[1][2]. Clustering the search results based on the semantics of the query term makes it easier for users to identify relevant results among those retrieved. The authors propose a modified version of the Lingo algorithm that combines the WordNet ontology with clustering techniques. Their preliminary experimental results indicated that the semantic clustering algorithm is effective, achieving an accuracy of about 90%.
The authors also showed that this algorithm outperforms the original Lingo clustering by about 6.39%. They plan to continue this research in the following directions[3]. First, they will work on criteria to avoid overlapping clusters, so that a document cannot be assigned to more than one cluster. Second, they will try to remove near-duplicate cluster labels.
B) Semantic Suffix Tree Clustering
The paper proposes a new algorithm, called Semantic Suffix Tree Clustering (SSTC), to cluster web search results containing semantic similarities. The distinctive feature of SSTC is that it constructs the semantic suffix tree through simultaneous on-depth and on-breadth passes, using semantic similarity and string matching. The semantic similarity is derived from the WordNet lexical database for the English language[4][5]. SSTC uses only subject-verb-object classification to generate clusters and readable labels. The algorithm also implements directed pruning to reduce the subtree sizes and to separate semantic clusters. Experimental results show that the proposed algorithm performs better than conventional Suffix Tree Clustering (STC). In summary, SSTC is an algorithm that uses the meaning of the
words to cluster. SSTC can cluster documents that share a semantic similarity, and specific clusters are returned in a readable form. Additionally, SSTC can improve on approaches that use the original STC algorithm because it clusters semantically similar documents, reduces the number of nodes and reaches higher precision. As future work, the authors plan to extend the SSTC algorithm to hierarchical clustering.
C) Semantic-based Hierarchicalize the Result of Suffix Tree Clustering
Suffix tree clustering is a fast, incremental, linear-time clustering algorithm, but synonymous and label-contained relations exist among the resulting clusters, so returning these results to users directly places an added burden on them. In response to this problem, the paper presents a method that merges semantically duplicate clusters and hierarchicalizes label-contained clusters. The experimental results show that this method effectively removes semantic duplication and clearly hierarchicalizes label-contained clusters[6][7], which improves the organization of the clustering results. For an STC search engine, this provides users with better results and better classification. It can help users both in locating interesting documents more easily and in getting a clearer overview of the retrieved document set[8]. The semantic-based hierarchicalizing process uses only the synonymy relationship; how to make use of antonymy and other important semantic relations to further improve the organization of the clustering results is left as future work.
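The two post-processing steps described in C), merging clusters whose labels are synonymous and nesting clusters whose labels contain another cluster's label, can be sketched in Python as follows. This is one plausible reading of the description rather than the surveyed paper's implementation: the `SYNONYMS` table is a hypothetical stand-in for WordNet synonymy lookups, and the cluster labels and document ids are made up for illustration.

```python
# Hypothetical synonym table standing in for WordNet synonymy lookups.
SYNONYMS = {"automobile": "car", "auto": "car", "picture": "image", "photo": "image"}

def canonical(label):
    """Map every word of a label to a canonical synonym and return a tuple key."""
    return tuple(SYNONYMS.get(w, w) for w in label.lower().split())

def merge_duplicates(clusters):
    """Merge clusters whose labels are equal after synonym normalization."""
    merged = {}
    for label, docs in clusters.items():
        merged.setdefault(canonical(label), set()).update(docs)
    return merged

def contains(long_key, short_key):
    """True if short_key occurs as a contiguous word sequence inside long_key."""
    n = len(short_key)
    return len(long_key) > n and any(
        long_key[i:i + n] == short_key for i in range(len(long_key) - n + 1)
    )

def build_hierarchy(merged):
    """Give each cluster, as parent, the longest other label contained in its label."""
    parents = {}
    for key in merged:
        candidates = [k for k in merged if contains(key, k)]
        parents[key] = max(candidates, key=len) if candidates else None
    return parents

# Illustrative clusters: label -> set of document ids.
clusters = {
    "car": {1, 2},
    "automobile": {2, 3},
    "car rental": {4},
    "auto rental prices": {5},
    "image search": {6},
}
merged = merge_duplicates(clusters)
parents = build_hierarchy(merged)
for key, parent in parents.items():
    print(" ".join(key), "->", " ".join(parent) if parent else None)
```

In this toy run "car" and "automobile" collapse into one cluster, "car rental" becomes a child of "car", and "auto rental prices" becomes a child of "car rental", which mirrors the intended effect of removing semantic duplication and arranging label-contained clusters into a hierarchy.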
D) A Relative Study on Search Results Clustering Algorithms - K-means, Suffix Tree and LINGO
The performance of web search engines could be improved by properly clustering the search result documents. Most users are unable to formulate a query that retrieves exactly what they want, so the search engine retrieves a massive list of results ranked by a page rank, relevancy or human judgment algorithm. Because of the ambiguity in the query, the user is left with information unrelated to the intended search[9][10]. Evaluating the performance of a clustering algorithm is not as trivial as counting the number of errors or computing the precision and recall of a supervised classification algorithm. In the paper, a comparative analysis of three common search results clustering algorithms is carried out to study the performance enhancement in the web search engine. If web documents are organized effectively through proper clustering techniques, the performance of search engines can be increased. The paper gives a systematic evaluation of the three clustering algorithms, viz. Suffix Tree Clustering, Lingo and K-Means, using multiple test collections and evaluation measures. It turns out that STC works well when one wants a quick overview of documents relevant to distinct subtopics, whereas clustering is more useful when one is interested in retrieving multiple documents relevant to each subtopic. Each algorithm has its
own merits and demerits. Lingo produces high cluster diversity and highlights small outliers well, whereas in STC and K-Means small outliers are rarely highlighted[11]. Lingo produces more clusters than the other two algorithms. With respect to cluster labels, those of Lingo are descriptive but lengthy, those of K-Means are not very descriptive, and those of STC are short but very appropriate. Scalability is high in STC compared to Lingo and K-Means. Other features of K-Means clustering are a running time of O(KN) (K = number of clusters), a fixed threshold and order dependence. Features of STC are overlapping clusters, non-exhaustive clustering, linear time and high precision.
E) Search Results Clustering Based on Suffix Array and VSM
With the rapid growth of web pages, search engines usually present a long ranked list of documents. Users must sift through the list using the “title” and “snippet” (a short description of the document) to find the desired document. This method may be good for some simple and specific tasks but is less effective and efficient for ambiguous queries such as “apple” or “jaguar”. To improve the effectiveness and efficiency of information retrieval, an alternative method is to automatically organize retrieval results into clusters[12][13]. The paper presents an improved Lingo algorithm named Suffix Array Similarity Clustering (SASC) for clustering web search results. The method creates the clusters by adopting an improved suffix array, which ignores redundant suffixes, and by computing document similarity based on the titles and short document snippets returned by Web search engines. Experiments show that SASC not only takes less time than Lingo but also achieves better cluster description quality and precision than Suffix Tree Clustering. To summarize, the paper proposes a search results clustering algorithm named SASC.
Its main contributions are as follows. The efficiency of the suffix array is improved by ignoring redundant suffixes. The authors also prove that equivalent cluster results can be obtained by analyzing the matrix of consequence rather than by computing the SVD[14][15], so the method takes far less time than Lingo. Furthermore, SASC supports hierarchical structure. In the future, the authors intend to further improve the time efficiency and the accuracy, and to consider other information, such as the user’s interaction with the clustering results, for adaptive clustering.
F) Clustering of Web Search Results Using Semantic
Clustering is related to data mining for information retrieval: clustering documents allows relevant information to be retrieved quickly. It organizes the documents into groups, each containing documents with similar content. Different algorithms are used for clustering documents, such as partitioned clustering (K-means clustering) and hierarchical clustering (Agglomerative Hierarchical Clustering, AHC). The paper presents an analysis of the Semantic Suffix Tree Clustering (SSTC) algorithm and other clustering techniques (K-means, AHC and Lingo). SSTC forms clusters based on synonyms shared between the documents. Because it is incremental, SSTC is a fast algorithm for document clustering.
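To make the notion of clusters formed from synonyms shared between documents more concrete, the following Python sketch builds phrase-based base clusters after synonym normalization and then merges heavily overlapping ones. Several points are assumptions for illustration only: the `SYNONYMS` table is a hypothetical stand-in for WordNet, shared phrases are found by brute-force n-gram enumeration instead of a suffix tree (so the sketch is not linear time), and the 0.5 overlap threshold follows the merge rule commonly associated with suffix tree clustering rather than anything stated in the surveyed text.

```python
from collections import defaultdict

# Hypothetical synonym table standing in for WordNet lookups.
SYNONYMS = {"automobile": "car", "auto": "car", "hire": "rental"}

def normalize(text):
    """Lowercase, split on whitespace, and map each word to a canonical synonym."""
    return [SYNONYMS.get(w, w) for w in text.lower().split()]

def base_clusters(docs, max_len=3, min_docs=2):
    """Map each phrase (word n-gram) to the set of documents that contain it.

    A real STC/SSTC implementation finds shared phrases with a suffix tree;
    this brute-force enumeration is for illustration only.
    """
    phrase_docs = defaultdict(set)
    for doc_id, text in enumerate(docs):
        words = normalize(text)
        for n in range(1, max_len + 1):
            for i in range(len(words) - n + 1):
                phrase_docs[tuple(words[i:i + n])].add(doc_id)
    return {p: ids for p, ids in phrase_docs.items() if len(ids) >= min_docs}

def merge_base_clusters(clusters, threshold=0.5):
    """Union base clusters whose document sets overlap by more than threshold
    relative to both clusters, then return the connected groups."""
    phrases = list(clusters)
    parent = {p: p for p in phrases}

    def find(p):
        # Follow parent links to the representative of p's group.
        while parent[p] != p:
            p = parent[p]
        return p

    for i, a in enumerate(phrases):
        for b in phrases[i + 1:]:
            common = len(clusters[a] & clusters[b])
            if (common / len(clusters[a]) > threshold
                    and common / len(clusters[b]) > threshold):
                parent[find(a)] = find(b)

    groups = defaultdict(set)
    for p in phrases:
        groups[find(p)] |= clusters[p]
    return sorted(sorted(ids) for ids in groups.values())

docs = [
    "cheap car rental",
    "automobile hire deals",
    "auto rental cheap",
    "wild cat photos",
    "wild cat habitat",
]
final_clusters = merge_base_clusters(base_clusters(docs))
print(final_clusters)
```

The second document shares no literal word with the first, yet both end up in the same cluster because "automobile" and "hire" are normalized to "car" and "rental", which is the kind of effect that plain string matching cannot produce.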
The paper presents an analysis of different clustering techniques such as partitioned clustering and hierarchical clustering. K-means represents partitioned clustering, and Agglomerative Hierarchical Clustering represents hierarchical clustering. It also analyses the Semantic Lingo algorithm. It introduces an algorithm for web search result clustering known as the Semantic Suffix Tree Clustering algorithm [16][17][18]. The main steps proposed in the paper are identifying the base clusters and merging the base clusters. SSTC can improve the performance of approaches that use the original STC algorithm because it can cluster semantically similar documents, reduce the number of nodes and reach higher precision.
3. ANALYSIS AND DISCUSSION
In this section we conduct several experiments to validate the effectiveness of the proposed approach. The process applied various clustering algorithms to the document collections and compared their precision. SASC takes far less time than Lingo and, furthermore, supports a hierarchical structure. We intend to further improve the time efficiency and the accuracy, and to consider other information, such as the user's interaction with the clustering results, for adaptive clustering. The Semantic algorithm was compared with Lingo and proved to be more efficient, providing purer clusters. Semantic Lingo increases efficiency and provides more relevant results. The higher number of matrix transformations leads to demanding memory requirements, so the Semantic algorithm is designed for specific applications such as web search result clustering.

| Author | Algorithm | Attributes | Results |
|---|---|---|---|
| Bassma S. Alsulami, Maysoon F. Abulkhair, Fathy A. Essa | Clustering Algorithm | WordNet, Semantic clustering | The preliminary experimental results indicated that our semantic clustering algorithm is effective, achieving an accuracy of about 90%. We also showed that this algorithm is significantly better than the original Lingo clustering, by about 6.39%. |
| Jongkol Janruang, Sumanta Guha | SSTC, STC | Semantic Search Result Clustering, Text Clustering | SSTC can improve the performance of approaches that use the original STC algorithm because it can cluster semantically similar documents, reduce the number of nodes and reach higher precision. |
| Guodong Hu, Wanli Zuo, Fengling He, Ying Wang | Suffix Tree Clustering, Grouper | Semantic Hierarchical Suffix Tree Clustering | In the STC search engine, this will provide users with better results and better classification. This can help users both in locating interesting documents more easily and in getting a clearer overview of the retrieved document set. In this paper, the semantic-based hierarchicalizing process considers only the synonymy relationship. |
| Mahalakshmi R, Lakshmi Prabha V | Suffix Tree Clustering, K-Means Clustering Algorithm, LINGO | Information Retrieval, Search Engines | Scalability is high in STC compared to Lingo and K-Means. Other features of K-Means clustering are: running time O(KN) (K = number of clusters), fixed threshold, order dependent. Features of STC are: overlapping clusters, non-exhaustive, linear time and high precision. |
| Shunlai Bai, Wenhao Zhu, Bofeng Zhang | Search result clustering algorithm named Suffix Array Similarity Clustering | Suffix Array, Suffix Tree, Lingo | We propose a search results clustering algorithm named SASC. The efficiency of the suffix array is improved by ignoring redundant suffixes. We also proved that equivalent cluster results can be obtained by analyzing the matrix of consequence rather than by computing the SVD. |
| Shelke P P, Dhopte S V, Alvin A S | SSTC, STC | Document Clustering, Partitioned Clustering | It introduces an algorithm for web search result clustering known as the Semantic Suffix Tree Clustering algorithm. The main steps proposed in the paper are identifying the base clusters and merging the base clusters. |
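The two steps named above for suffix-tree-based clustering (identify base clusters, then merge them) can be illustrated with a small Python sketch. It is only a sketch under stated assumptions: base clusters are found here with plain word bigrams instead of a suffix tree, no WordNet synonym matching is performed, and the function names and toy snippets are hypothetical. The merge rule (two base clusters are linked when their shared documents exceed half of each cluster) is the overlap criterion of the original STC algorithm.

```python
from itertools import combinations


def base_clusters(docs, n=2, min_docs=2):
    """Map each word n-gram to the set of ids of the documents containing it."""
    clusters = {}
    for doc_id, text in enumerate(docs):
        words = text.lower().split()
        for i in range(len(words) - n + 1):
            phrase = " ".join(words[i:i + n])
            clusters.setdefault(phrase, set()).add(doc_id)
    # A base cluster needs a phrase shared by at least min_docs documents.
    return {p: ids for p, ids in clusters.items() if len(ids) >= min_docs}


def merge_base_clusters(clusters, threshold=0.5):
    """Union base clusters whose document sets overlap strongly."""
    phrases = sorted(clusters)
    parent = {p: p for p in phrases}

    def find(p):
        # Follow parent links to the representative of p's group.
        while parent[p] != p:
            parent[p] = parent[parent[p]]
            p = parent[p]
        return p

    for a, b in combinations(phrases, 2):
        common = len(clusters[a] & clusters[b])
        if (common / len(clusters[a]) > threshold
                and common / len(clusters[b]) > threshold):
            parent[find(a)] = find(b)  # link the two groups

    merged = {}
    for p in phrases:
        merged.setdefault(find(p), set()).update(clusters[p])
    return sorted(sorted(ids) for ids in merged.values())


# Hypothetical search snippets for the ambiguous query "jaguar".
snippets = [
    "jaguar car dealer",
    "jaguar car price",
    "jaguar big cat",
    "big cat habitat",
]
final_clusters = merge_base_clusters(base_clusters(snippets))
print(final_clusters)
```

A semantic variant in the spirit of SSTC would map each word to a synonym-set identifier before building the phrases, so that snippets using different but synonymous words fall into the same base cluster.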
6. CONCLUSION
This paper proposed a novel method, a document clustering algorithm that identifies key concepts and automatically generates ontologies so that users can conceptualize document corpora. It presents an approach to document clustering that uses algorithms based on the concepts of the text data. This concept-driven approach is executed on WordNet. It introduces an algorithm for web search result clustering known as the Semantic Suffix Tree Clustering algorithm. Clustering search results on the internet is a significant and challenging problem. Several new concepts and the mining problem are formally defined, and a group of algorithms is designed and combined to solve this problem systematically. In comparison, web search clustering algorithms such as STC and SHOC do not reduce the high dimensionality of the text documents, so their complexity is quite high for large text data, and they ignore the semantic and lexical relationships between words. A new algorithm called Lingo Semantic is therefore proposed for clustering.
REFERENCES
[1] Zeng, H.J., He, Q.C., Chen, Z., Ma, W.Y., Ma, J.: "Learning to cluster Web search results". In: SIGIR '04: Proceedings of the 27th annual international ACM SIGIR conference on Research and development in information retrieval, New York, NY, USA, ACM Press (2004) 210–217
[2] T. de Simone and D. Kazakov. "Using WordNet Similarity and Antonymy Relations to Aid Document Retrieval". Recent Advances in Natural Language Processing (RANLP), 2005.
[3] M. A. Hearst, J. O. Pedersen. "Re-examining the Cluster Hypothesis: Scatter/Gather on Retrieval Results". In Proceedings of the ACM SIGIR Conference, 1996.
[4] Ahmed, M.S., Amar, M.K.: Semantic Web Search Results Clustering Using Lingo and Wordnet. In: IJRRCS: Kohat University of Science and Technology (KUST), Pakistan, Vol. 1, No. 2, pp. 71–76 (2010)
[5] Stanislaw, O., Jerzy, S.: An algorithm for clustering of web search results. Master Thesis, Poznan University of Technology, Poland, June 2003
[6] Carpineto, C., Osinski, S., Romano, G., and Weiss, D.: A survey of Web clustering engines. In: ACM Computing Surveys, Volume 41, Issue 3, pp. 1–38. ACM, New York, USA (2009)
[7] E. Ukkonen, "On-line construction of suffix trees," Algorithmica, vol. 14, 1995, pp. 249-260.
[8] Oren Zamir, Oren Etzioni. "Grouper: A Dynamic Clustering Interface to Web Search Results," University of Washington, Department of Computer Science and Engineering, 1999.
[9] O. Zamir and O. Etzioni, "Web document clustering: a feasibility demonstration," in: Proceedings of the 19th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR'98), 1998, pp. 46-54.
[10] C. Carpineto, S. Osiński, G. Romano, and D. Weiss, "A survey of web clustering engines", ACM Computing Surveys (CSUR), ACM, 2009.
[11] H. Cao, D.H. Hu, D. Shen, D. Jiang, J.T. Sun, E. Chen, and Q. Yang, "Context-Aware query classification", Proceedings of the 32nd international ACM SIGIR conference on Research and development in information retrieval, ACM, Boston, USA, 2009, pp. 3-10.
[12] Oren Zamir and Oren Etzioni, "Web Document Clustering: A Feasibility Demonstration", Proceedings of the 21st annual international ACM SIGIR conference on Research and development in information retrieval, ACM, Melbourne, Australia, 1998, pp. 46-54.
[13] Pushpalatha, Ram Chatterjee: An Analytical Assessment on Document Clustering. In: I. J. Computer Network and Information Security, 2012, 5, 63-71, published online June 2012
[14] Ahmed, M.S., Amar, M.K.: Semantic Web Search Results Clustering Using Lingo and Wordnet. In: IJRRCS: Kohat University of Science and Technology (KUST), Pakistan, Vol. 1, No. 2, pp. 71–76 (2010)
[15] Stanislaw, O., Jerzy, S.: An algorithm for clustering of web search results. Master Thesis, Poznan University of Technology, Poland, June 2003
[16] B. R. Prakash and M. Hanumanthappa, "Web Snippet Clustering and Labeling using Lingo Algorithm", International Journal of Advanced Research in Computer Science, vol. 3, no. 2, pp. 262-265, 2012
[17] C. Carpineto, S. Osinski, G. Romano, D. Weiss. A Survey of Web Clustering Engines. ACM Computing Surveys (CSUR), 41(3): Article 17, 2009
[18] Tingting Wei, Yonghe Lu, Huiyou Chang, Qiang Zhou, Xianyu Bao, "Semantic approach for text clustering using WordNet and lexical chains", Expert Systems with Applications, Volume 42, Issue 4, March 2015, pp. 2264–2275.
Wireless Sensor Network using Monitoring the Environmental Activities
S. Kalaivani
M.Phil Research Scholar, Department of Computer Science, Vellalar College for Women (Autonomous), Erode, Tamil Nadu, India. Email: kalaivkl26@gmail.com
Dr. P. Radha
Assistant Professor, Department of Computer Science, Vellalar College for Women (Autonomous), Erode, Tamil Nadu, India. Email: radhasakthivel68@gmail.com
Abstract- Wireless sensor networks play an important role in monitoring environmental activities. Many sensor devices are used to collect spatial or temporal data. The collected data sets may contain irregularities, missing values and inconsistent data. To handle these data, data preprocessing is performed to remove unwanted data and to fill in the missing values. Various clustering algorithms are then applied to the data for cluster formation. This project analyses two major clustering algorithms: K-means clustering and Fuzzy C-means clustering. Clusters are formed using both algorithms, and their performance is analyzed based on the inter- and intra-cluster distance. Based on the result, the Fuzzy C-means algorithm is shown to be more efficient than the K-means algorithm.
Keywords- Wireless sensor, clustering, preprocessing, Fuzzy C, K-means algorithm.
1. INTRODUCTION
Wireless sensor networks are growing and improving rapidly, which enables new communication services. Sensor networks are a very useful way to collect various parameters and information. A wireless sensor network is a collection of nodes organized into a cooperative network. Each node consists of one or more microcontrollers, CPUs or DSP chips, and the nodes communicate with each other. Most wireless sensor networks are bi-directional, so they can also control the sensor activity. The development of wireless sensor networks was motivated by military applications such as battlefield surveillance; the networks are also used for industrial process monitoring and control, machine health monitoring, and so on. Sensor nodes may vary in size and cost. A sensor node consists of a processing unit with limited computational power and limited memory, sensors, a communication device, and a power source in the form of a battery. The base stations are the main components of a wireless sensor network, with more computational power, energy and resources. They act as a gateway between the sensor nodes and the end user, and they forward data from the wireless sensor network to a server.
A sensor network basically consists of a large number of sensor nodes that are deployed over a large physical area to monitor and detect real-time environmental activities. These sensor nodes work together to collect data such as temperature, humidity and acceleration from their surroundings. Sensor networks are useful in applications such as habitat monitoring, health monitoring, and traffic, weather and pollution monitoring. In all such real-life applications the sensor nodes generate a large amount of data, so mining these data is a fruitful task. Owing to advances in wireless sensor networks, the networks can generate a large amount of data, and data mining techniques are applied to extract useful knowledge about the sensor network. Here the wireless sensor network is linked with environment monitoring, and this link helps in various areas such as fire detection in forest areas, saving wildlife, and other tropical conditions, by analyzing temperature, humidity, etc. In this paper, data mining techniques such as data preprocessing and cluster analysis are applied and analyzed.
2. RELATED WORKS
Data in the real world is dirty. Real-world data is often incomplete and noisy, for example containing wrong values or duplicate records. This results in poor-quality data, which in turn results in poor-quality mining results. Quality decisions are based on quality data, and data warehouses need consistent integration of quality data with no missing or noisy values. In order to get quality data, the data in the database need to be checked for accuracy, completeness, consistency, timeliness, believability, interpretability and accessibility. The data preprocessing tasks are as follows:
Data Cleaning: Fill in missing values, smooth noisy data, identify or remove outliers, and resolve inconsistencies.
Data Integration: Integration of multiple databases or files.
Data Transformation: Normalization and aggregation.
Data Reduction (Feature Selection): Obtains a representation that is reduced in volume but produces the same or similar analytical results.
Data Discretization: Part of data reduction but of particular importance, especially for numerical data.
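As a small illustration of two of the tasks listed above (data cleaning and data transformation), the following Python sketch fills missing sensor readings and then normalizes them. It is a sketch under stated assumptions: the paper does not say which imputation or normalization method is used, so mean imputation and min-max scaling are assumed here, and the function names and readings are hypothetical.

```python
def fill_missing(values):
    """Data cleaning: replace None entries with the mean of the observed values."""
    observed = [v for v in values if v is not None]
    mean = sum(observed) / len(observed)
    return [mean if v is None else v for v in values]


def min_max_normalize(values):
    """Data transformation: rescale the values linearly to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All readings are identical, so there is no spread to rescale.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical temperature readings with one missing value.
temperature = [20.0, None, 30.0, 25.0]
cleaned = fill_missing(temperature)
normalized = min_max_normalize(cleaned)
print(cleaned)
print(normalized)
```

Normalizing each variable before clustering keeps a variable with a large numeric range from dominating the distance computations.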
The clustering problem is defined as follows: given a set of data points, the goal is to partition them into one or more groups of similar objects. The similarity of the objects to one another is typically defined by a distance measure or an objective function. The clustering problem has been widely researched in the database, data mining and statistics communities. The nature of the clusters may vary with both the moment at which they are computed and the time horizon over which they are measured. For example, a user may wish to examine clusters occurring in the last month, last year, or last decade. Such clusters may be considerably different.
Therefore, a data stream clustering algorithm must provide the flexibility to compute clusters over user-defined time periods in an interactive fashion.
3. DATASET
The dataset for the proposed work was downloaded from http://daac.ornl.gov/LBA/guides/CD04_Meteorology_Fluxes.html. The data are presented as values measured at 30-minute intervals over 3.5 years and compiled at the km 83 tower site. The data include variables related to meteorology, soil moisture, and fluxes of momentum, heat, water vapor and carbon dioxide beneath the flux sensors.
4. IMPLEMENTATION
4.1 Data Preprocessing
Data pre-processing involves cleaning the data by filling in missing values and removing uninteresting data. It may also include summarization and aggregation of the data. This step basically prepares the data for analysis. Hence, the first step is to detect irregularities in the sensor data and apply the pre-processing techniques.
4.2 K-means clustering algorithm
The k-means clustering algorithm consists of two separate phases. The first phase is to define k centroids, one for each cluster. The next phase is to take each point belonging to the given data set and associate it with the nearest center. When all the points are included in some cluster, the first step is completed and an early grouping is done. At this point it is necessary to recalculate the centroids, as the inclusion of new points may change the cluster centroids. Once the k new centroids are found, a new binding is created between the same data points and the nearest new center, generating a loop. As a result of this loop, the k centroids change their position step by step. Eventually, a situation is reached where the centroids do not move any more.
Algorithm 1: The k-means clustering algorithm
Input: D = {d1, d2, ..., dn} // set of n data items
k // number of desired clusters
Output: a set of k clusters
Steps:
1. Arbitrarily choose k data items from D as initial centroids;
2. Repeat
2.1 Assign each data item di to the cluster which has the closest centroid;
2.2 Calculate the new mean of each cluster;
Until convergence criterion is met
4.3 Fuzzy C-means clustering algorithm
Fuzzy C-means (FCM) is a method of clustering which allows one piece of data to belong to two or more clusters. Here, this method is used to cluster the network data. It is based on minimization of the following objective function:
$$J_m = \sum_{i=1}^{N} \sum_{j=1}^{C} u_{ij}^{m} \, \lVert x_i - c_j \rVert^{2}, \qquad 1 \le m < \infty$$
where $m$ is any real number greater than 1, $u_{ij}$ is the degree of membership of $x_i$ in cluster $j$, $x_i$ is the $i$-th $d$-dimensional measured data point, $c_j$ is the $d$-dimensional center of cluster $j$, $N$ is the number of data points, $C$ is the number of clusters, and $\lVert \cdot \rVert$ is a norm measuring the distance between a data point and a center.
Fuzzy partitioning is carried out through an iterative optimization of the objective function shown above, with the update of the memberships $u_{ij}$ and the cluster centers $c_j$ by:
$$u_{ij} = \frac{1}{\sum_{l=1}^{C} \left( \dfrac{\lVert x_i - c_j \rVert}{\lVert x_i - c_l \rVert} \right)^{\frac{2}{m-1}}}, \qquad c_j = \frac{\sum_{i=1}^{N} u_{ij}^{m} \, x_i}{\sum_{i=1}^{N} u_{ij}^{m}}.$$
This iteration stops when $\max_{ij} \left| u_{ij}^{(k+1)} - u_{ij}^{(k)} \right| < \varepsilon$, where $\varepsilon$ is a termination criterion between 0 and 1 and $k$ is the iteration step. This procedure converges to a local minimum or a saddle point of $J_m$. The algorithm is composed of the following steps:
1. Initialize the membership matrix $U = [u_{ij}]$ as $U^{(0)}$.
2. At step $k$: calculate the center vectors $C^{(k)} = [c_j]$ with $U^{(k)}$.
3. Update $U^{(k)}$ to $U^{(k+1)}$.
4. If $\lVert U^{(k+1)} - U^{(k)} \rVert < \varepsilon$ then STOP; otherwise return to step 2.
5. PERFORMANCE EVALUATION
In this module the results of the Fuzzy C-means and K-means algorithms are compared to analyse their performance. The inter-cluster distance measures the separation between different clusters, for example the distance between their centroids. The intra-cluster distance is the distance between all pairs of points in a cluster, or between the centroid and all points in the cluster; the within-cluster sum of squares is one such measure. The performance has been analysed based on the inter- and intra-cluster distance. In K-means clustering, the intra-cluster distance is greater than in Fuzzy C-means. On this basis, the K-means clustering algorithm is expected to take more time to compute than the Fuzzy C-means clustering algorithm.
6. CONCLUSION
In this paper, the data set is clustered with both the Fuzzy C-means and the K-means algorithms, and the clusters generated by the two algorithms are compared in terms of performance. The results show that the Fuzzy C-means clustering algorithm is better suited to monitoring environmental activities than the K-means clustering algorithm.
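The k-means procedure of Algorithm 1 (Section 4.2) and the two distance measures used in the performance evaluation (Section 5) can be sketched in a few lines of Python. This is only a sketch under stated assumptions, not the implementation used in the paper: the function names, the optional `init` argument, the toy points, and the choice of the smallest centroid-to-centroid distance as the inter-cluster measure are all hypothetical.

```python
import math
import random


def dist(a, b):
    """Euclidean distance between two points given as tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def mean_point(points):
    """Coordinate-wise mean of a non-empty list of points."""
    return tuple(sum(coord) / len(points) for coord in zip(*points))


def kmeans(data, k, init=None, max_iter=100, seed=0):
    # Step 1: arbitrary initial centroids (random unless `init` is given).
    centroids = list(init) if init else random.Random(seed).sample(data, k)
    for _ in range(max_iter):
        # Step 2.1: assign each item to the cluster with the closest centroid.
        clusters = [[] for _ in range(k)]
        for p in data:
            nearest = min(range(k), key=lambda j: dist(p, centroids[j]))
            clusters[nearest].append(p)
        # Step 2.2: recompute each mean; an empty cluster keeps its centroid.
        new = [mean_point(c) if c else centroids[j]
               for j, c in enumerate(clusters)]
        if new == centroids:  # convergence: the centroids no longer move
            break
        centroids = new
    return centroids, clusters


def intra_cluster_distance(centroids, clusters):
    """Mean distance between each point and the centroid of its cluster."""
    total = sum(dist(p, c) for c, pts in zip(centroids, clusters) for p in pts)
    return total / sum(len(pts) for pts in clusters)


def inter_cluster_distance(centroids):
    """Smallest distance between two different centroids."""
    return min(dist(a, b)
               for i, a in enumerate(centroids) for b in centroids[i + 1:])


# Toy data: two well-separated groups of readings (hypothetical values).
points = [(1.0, 1.0), (1.0, 2.0), (2.0, 1.0), (2.0, 2.0),
          (8.0, 8.0), (8.0, 9.0), (9.0, 8.0), (9.0, 9.0)]
centroids, clusters = kmeans(points, k=2, init=points[:2])
print(centroids)
print(intra_cluster_distance(centroids, clusters),
      inter_cluster_distance(centroids))
```

A small intra-cluster distance together with a large inter-cluster distance indicates compact, well-separated clusters. Because k-means can stop in a poor local optimum for some initial centroids, it is common to repeat the run with several initializations and keep the result with the smallest intra-cluster distance.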
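The four Fuzzy C-means steps of Section 4.3 can likewise be sketched in plain Python. The sketch assumes the standard FCM updates (centers as membership-weighted means, memberships from ratios of distances to the centers) with the Euclidean norm; the function name `fuzzy_c_means`, the default values of `m` and `eps`, the fixed seed and the toy points are hypothetical and are not taken from the paper.

```python
import math
import random


def dist(a, b):
    """Euclidean distance between two points given as tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def fuzzy_c_means(data, c, m=2.0, eps=1e-5, max_iter=300, seed=0):
    rng = random.Random(seed)
    n, dim = len(data), len(data[0])
    # Step 1: random membership matrix U(0); every row sums to 1.
    u = []
    for _ in range(n):
        row = [rng.random() + 1e-9 for _ in range(c)]
        total = sum(row)
        u.append([value / total for value in row])
    for _ in range(max_iter):
        # Step 2: each center is the mean of the data weighted by u_ij^m.
        centers = []
        for j in range(c):
            w = [u[i][j] ** m for i in range(n)]
            centers.append(tuple(
                sum(w[i] * data[i][axis] for i in range(n)) / sum(w)
                for axis in range(dim)))
        # Step 3: update the memberships from the distances to the centers.
        new_u = []
        for x in data:
            # Clamp distances so a point lying on a center cannot divide by 0.
            d = [max(dist(x, center), 1e-12) for center in centers]
            new_u.append([
                1.0 / sum((d[j] / d[l]) ** (2.0 / (m - 1.0)) for l in range(c))
                for j in range(c)])
        # Step 4: stop when the largest membership change falls below eps.
        change = max(abs(new_u[i][j] - u[i][j])
                     for i in range(n) for j in range(c))
        u = new_u
        if change < eps:
            break
    return centers, u


# Toy data: two well-separated groups of readings (hypothetical values).
points = [(1.0, 1.0), (1.0, 2.0), (2.0, 1.0), (2.0, 2.0),
          (8.0, 8.0), (8.0, 9.0), (9.0, 8.0), (9.0, 9.0)]
centers, memberships = fuzzy_c_means(points, c=2)
# Harden the fuzzy result: each point goes to its highest-membership cluster.
labels = [max(range(2), key=lambda j: row[j]) for row in memberships]
print(centers)
print(labels)
```

Unlike k-means, every point keeps a degree of membership in every cluster, so a reading that lies between two groups is not forced into one of them; a hard assignment is obtained only in the last step by taking the largest membership.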