IJSTE - International Journal of Science Technology & Engineering | Volume 3 | Issue 09 | March 2017 ISSN (online): 2349-784X
Trace Event – A Web Mining based Application for the Collection of Online Web Events Divya K B. Tech Student KCG College of Technology, India
Kavitha J B. Tech Student KCG College of Technology, India
Rajkumar Rajavel Associate Professor KCG College of Technology, India
Abstract Web mining is one of the applications of data mining which is used to retrieve information from large amount of data available on web as per user queries. Web mining techniques are used to find and extract required information from the web pages. Content mining is used to extract data from the web page contents. Usage mining is used to fetching information from user logs. Structure mining is used to mine the structure of the websites. Retrieving relevant content from web pages is tedious process. In this paper, we are going to use content mining and usage mining techniques for collecting technical and non-technical events published on various sites. The user can get required details about the events at a single place instead of visiting all other websites. Existences of data available on internet are in different formats such as structured, unstructured and semi structured. Most of the information available on internet is in unstructured format. To discover exact data which is requested in user query can be done by matching keywords with content of the web pages. Giving text preferences is the simple way for searching information from web documents. Keywords: web mining, content mining, structure mining, usage mining, server logs, text preferences ________________________________________________________________________________________________________ I.
INTRODUCTION
In this technological era, we can get all information from web through many web browsers. Size of the information available on internet also increases rapidly. In this case, retrieval of required content from large amount of data is difficult. To get relevant content from web takes more time for the users. This may reduce their interest towards browsing. Especially students have no patience to search information which is useful to enhance their knowledge and skills. Some students are searching for the right platform to showcase their talent. If they get irrelevant contents while searching they may quit their search. There are lots of chances to miss the opportunities due to huge amount of data available on the web and students may not be aware of so many events related to their interest happening in various colleges. For this reason Trace Event application is developed to make all the technical and non-technical events are available for the students at single place. This application can reduce the searching time of the students. They can get all the information about the events which are going to happen in nearby colleges. Instead of visiting websites of every college, simply they can visit trace event web application. Web is constructed by different forms of data such as structured, semi structured and unstructured data. Web mining concepts are used to retrieve data from web. We can extract only required details from plenty of data. As per user’s text preferences, matching data is extracted from the web documents. II. RELATED WORK For mining different forms of existence of data on web, mining strategies are used [1]. Identification of keywords in the web site content is the major issue in web mining [2]. A search engine uses keywords to find a specific topic in the web. In this paper a combination of information of retrieval techniques with web usage mining is proposed in order to extract user text preferences in a website. Web mining can be categorized into three parts: 1.Web content mining 2.Web structure mining 3.Web usage mining. Web content mining is the process of extracting useful information from the contents of web documents. It includes extraction of structured data from web pages and open search pages. Web Structure mining focuses on analysis of link structure and the purpose of the structure mining is to identify more preferable documents. The web pages are representing as nodes and hyperlinks representing as edges. Web usage mining is the technique analyzes the transaction data which is logged, when the users interact with the web. Web usage mining sometimes referred as ‘log mining’, because it involves mining the web server logs. A web server log file contains request made to the web server [3]. Various pattern mining algorithms are compared for performance analysis [9]. Tools are used in web mining. Bixo is an open source web mining toolkit that starts with a set of URLs to be fetched, and ends with some results extracted from parsed HTML pages [5]. Ruby on Rails, enables the programmer to easily create progressed database-driven websites using scaffolding and code generation, taking convention over configuration [4]. Ruby on Rails is the only framework which enforces security in web applications while developing [6]. Ranking algorithms
All rights reserved by www.ijste.org
334
Trace Event – A Web Mining based Application for the Collection of Online Web Events (IJSTE/ Volume 3 / Issue 09 / 070)
can be used to reduce the searching time of the users. Types of ranking algorithms are explained in detailed [7]. Enhancement of page rank algorithm for unstructured data retrieval is explained [10]. To resolve issues in web mining, information retrieval view, information extraction and database view are given as solutions [8]. III. TRACE EVENT SYSTEM ARCHITECTURE This research work is supposed to be used by all the members who wish to participate in events. The above architecture diagrams explains how the request send by the client reaches server and how the server response for the query. Websites links of various colleges and institutes are given as input to the web crawler. Users can give their search key words in the search engine box. Web crawler processes the query and does filtrations as per the query and displays the search results. Search and filter plugin is used for filtering. Users can login to see the events available for them. They have options to accept and reject the invitation for the event.
Fig. 1: System Architecture
IV. EXPERIMENTAL SETUP Ruby on Rails is a framework for back-end web applications. RoR is mainly focus on (MVC)Model, View and Controller to improve the applications. In ruby on rails, models are used to interact with the database element. View is the front-end application representing by user interface. Finally controller is used for interaction between the models and views, the request comes from browsers that are processed by controllers. Web log data is automatically generated by web server.Web server can maintain the log files after every request made through the server.Client can communicate to the server via request. The client send a query to server on a GET request method,afterwards server will be response the query in json format. Request contains client IP address,date and time.This system enables to collect the events from web with details and also improve the effictiveness of website.
All rights reserved by www.ijste.org
335
Trace Event – A Web Mining based Application for the Collection of Online Web Events (IJSTE/ Volume 3 / Issue 09 / 070)
Fig. 2: Pie chart
Performance of this research work is compared with existing system and the result is displayed in the line chart format for ease of understanding. This chart shows the percentage of extracted content matched with user’s query.
Fig. 3: Line chart for Performance
Parameters of this research work are estimated based on prediction. Retrieval time of query and relevance of content from web documents are the two main factors values are tabulated. This result analysis shows that proposed system have better performance than existing system. Table – 1 Result Analysis Performance Factors Query retrieval time Content Relevancy
Existing System 85% 82%
Trace Event System >90% >85%
V. CONCLUSION AND FUTURE WORK This drives to a conclusion that helping the users and posting the events in advance, instead of visiting all other website. This application is very effective for all the users, so accessing the web application is very easy through the internet. In future work, various types of filtering mechanisms can be added. Page rank algorithm can be implemented for ranking the pages while using search engine based on most frequently used search. REFERENCES [1] [2] [3] [4]
Asha Joy, Remya R, “Techniques for Web Mining of Various Forms of Existence of Data on Web: A Review”, (Jan 2015) International Journal of Advance Research in Computer Science and Management Studies. Juan D. Velasquez, Sebastian Rios, Alejandro Bassi, Hiroshi Yasuda and Turumasa Aoki, “Towards the Identification of Keywords in the Web Site Text Content: A Methodological Approach” (March 2005) presented at Research Center for Advanced Science and Technology, University of Tokyo, Tokyo. P. Menaka, A. Prathimadevi, “ A Survey on Web Mining and Its Techniques ”, (September 2015) International Journal of Advanced Research in Computer Science and Software Engineering. S. Meenakshi, “Ruby on Rails – An Agile Developer’s Framework”, (February 2015) presented at International Journal of Computer Applications (0975 – 8887) Volume 112 – No.1.
All rights reserved by www.ijste.org
336
Trace Event – A Web Mining based Application for the Collection of Online Web Events (IJSTE/ Volume 3 / Issue 09 / 070) M. Vengateshwaran, E.V.R.M. Kalaimani, “Web Mining Research Direction and Open Source Tools” (July 2014) International Journal of Advance Research in Computer Science and Software Engineering. Volume 4, Issue 7. [6] Stephen J. Tipton, Young Choi, “Toward Secure Web Application Design: Comparative Analysis of Major Languages and Framework Choices”, (2016) (IJACSA) International Journal of Advanced Computer Science and Applications, Vol. 7, No. 2. [7] Manoj Kumar Vemula, “An overview on Information retrieval from web using clustering process”, (July 2014), International Journal of Engineering and Innovative Technology (IJEIT) Vol. 4, Issue 1. [8] R. Munilatha, K.Venkataramana, “A study on issues and techniques of web mining”, (May 2014), IJCSMC (International Journal of Computer Science and Mobile Computing), Vol. 3, Issue 5, pg. 331-341. [9] Sheng-Tang Wu and Yuefeng Li, “Pattern-Based Web Mining Using Data Mining Techniques”, (April 2013), International Journal Education, e-Business, e-Management and e-Learning, Vol. 3, No. 2. [10] Ekta Bhardwaj, Shiv Kumar, Kuldeep Tomar, “Enhancing Page Rank Algorithm”, (May 2015), International Journal on Recent and Innovation Trends in Computing and Communication. volume: 3 Issue: 5 [5]
All rights reserved by www.ijste.org
337