A Comparative Analysis of K-Means and K-Medoids Algorithm for Educational Data

IJSRD - International Journal for Scientific Research & Development | Vol. 4, Issue 05, 2016 | ISSN (online): 2321-0613

Dr. (Mrs.) Ananthi Sheshasaayee¹, C. Kabila²
¹Research Supervisor ²Research Scholar
¹,²Department of Computer Science & Engineering
¹,²Quaid E Millath Govt. College for Women, Chennai

Abstract: Data mining extracts specific information from large databases. It is useful in every field; applied to education, it is known as Educational Data Mining (EDM). EDM works with large volumes of education-related data. Using these data to predict student performance has become a challenging task. Predicting performance lets the trainer monitor each student closely, and it also helps in keeping track of curriculum patterns. Many clustering algorithms can be used to assess performance. In this work two of them, k-means and k-medoids, are used to calculate the students' performance and the difficulty they have with individual questions. Performance is calculated from the marks each student secured in each question, and the two algorithms are then compared to determine which is better for predicting student performance. The RapidMiner tool is used.

Key words: Educational Data Mining, K-Means, K-Medoids, RapidMiner

I. INTRODUCTION

Data mining attracts interest from every field. It helps both industry and researchers find solutions to their problems. In education, data mining plays a major role in predicting performance, where it is called Educational Data Mining (EDM). Education is essential in every person's life and has to be provided properly and effectively. In higher education, student performance is especially important, because the quality of higher education is based on the performance of its students. Knowing the students' performance can motivate both educators and learners [1].
Data mining is a powerful methodology for finding hidden information about students in a large database. Students' grades are determined for the recruitment process [2]. Data mining is also known as Knowledge Discovery in Databases (KDD): useful information is extracted from a large database, and the field continues to develop new techniques. For a given dataset, the mean values of each cluster are computed and can be viewed as a centroid table [3]. Interest in applying data mining to education is increasing. This emerging field, called Educational Data Mining (EDM), is concerned with discovering knowledge from data that originate in the education field [4]. EDM methods draw on a variety of literatures, including data mining, machine learning, information visualization, and computational modelling [5]. Clustering can be used in EDM with techniques such as k-means, k-medoids, agglomerative, and divisive clustering. Using these techniques, students' performance can be predicted. To improve on current trends in higher education, students can be motivated by managing and processing their data, and data mining is used to manage these data [6]. The main objective of data mining in higher education is to provide quality education to students and to improve managerial decision-making [7].

The main objective of this research is to use clustering to predict the performance of the students and to find the difficulty they have in answering the questions. Clustering is the assignment of a set of objects to specific groups [12]. The k-means and k-medoids algorithms are applied and then compared; the k-means algorithm works better for the student data. The next section describes the methodology, and Section 3 covers the tools and techniques used. Section 4 presents the algorithms, the tool, the performance of the algorithms, and the results. Section 5 concludes and outlines future work.

II. METHODOLOGY

The research began after various studies and discussions. The aim of this study is to find the algorithm best suited to the student data. For this work, student data were collected and converted into the standard format required by the RapidMiner tool. These data were then given as input to the tool.
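
The paper does not say which file format was used for the conversion step. As a minimal sketch, assuming a CSV layout with one row per student and one column per question (CSV being a format RapidMiner can import), the preparation might look like the following. The student identifiers, the marks, and the column names `student_id`, `q1`, `q2`, `q3` are hypothetical and do not come from the study.

```python
import csv
import io

# Hypothetical records: (student id, mark for question 1, 2, 3).
# These values are illustrative only; they are not the study's data.
records = [("S01", 4, 7, 2), ("S02", 9, 8, 6), ("S03", 5, 3, 1)]


def marks_to_csv(rows, n_questions):
    """Serialize per-question marks as CSV text, one row per student."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    # Header row: an id column followed by one column per question.
    writer.writerow(["student_id"] + [f"q{i}" for i in range(1, n_questions + 1)])
    for student_id, *marks in rows:
        if len(marks) != n_questions:
            raise ValueError(f"{student_id}: expected {n_questions} marks")
        writer.writerow([student_id, *marks])
    return buffer.getvalue()


csv_text = marks_to_csv(records, 3)
print(csv_text)
```

Keeping every question as its own numeric column means each student becomes one point in a space with as many dimensions as there are questions, which is the form a clustering algorithm expects.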

Fig. 1: Methodology

III. TOOLS AND TECHNIQUES USED

Data mining techniques are applied to educational data. Clustering is the technique used to assess the performance of the students, and k-means and k-medoids are the two clustering algorithms chosen for this study. The RapidMiner tool is used for the implementation.

IV. K-MEANS AND K-MEDOIDS ALGORITHM IN RAPIDMINER

Clustering is used to identify similar classes of objects. With clustering, dense and sparse regions can be identified and correlations among data attributes can be discovered [8]. K-means is a centroid-based algorithm used to cluster similar data into the same group. Here, the k-means algorithm is applied to group the students' data and to predict their performance.
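
To make the difference between the two algorithms concrete, the sketch below implements both in plain Python. It is not RapidMiner's implementation: the k-medoids variant is a simple alternating scheme (assign points, then pick the best medoid in each cluster) and not the full PAM swap procedure. The marks are hypothetical, and the starting centers are passed in explicitly (an assumption made here so that the run is repeatable).

```python
from math import dist  # Euclidean distance, Python 3.8+


def assign(points, centers):
    """Label each point with the index of its nearest center."""
    return [min(range(len(centers)), key=lambda j: dist(p, centers[j]))
            for p in points]


def members_of(points, labels, j):
    """Points currently assigned to cluster j."""
    return [p for p, label in zip(points, labels) if label == j]


def k_means(points, initial_centers, max_iter=100):
    """Centroid-based clustering: each center is the mean of its cluster."""
    centers = list(initial_centers)
    for _ in range(max_iter):
        labels = assign(points, centers)
        new_centers = []
        for j in range(len(centers)):
            group = members_of(points, labels, j)
            # Per-attribute mean; keep the old center if the cluster is empty.
            mean = tuple(sum(col) / len(group) for col in zip(*group)) if group else centers[j]
            new_centers.append(mean)
        if new_centers == centers:  # no center moved: converged
            break
        centers = new_centers
    return assign(points, centers), centers


def k_medoids(points, initial_medoids, max_iter=100):
    """Medoid-based clustering: each center is an actual data point."""
    medoids = list(initial_medoids)
    for _ in range(max_iter):
        labels = assign(points, medoids)
        new_medoids = []
        for j in range(len(medoids)):
            group = members_of(points, labels, j)
            # Medoid = member with the smallest total distance to the others.
            best = min(group, key=lambda c: sum(dist(c, m) for m in group)) if group else medoids[j]
            new_medoids.append(best)
        if new_medoids == medoids:
            break
        medoids = new_medoids
    return assign(points, medoids), medoids


# Hypothetical marks (question 1, question 2) for six students.
marks = [(2, 3), (3, 2), (2, 2), (8, 9), (9, 8), (9, 9)]
start = [marks[0], marks[3]]  # assumed starting centers

km_labels, centroids = k_means(marks, start)
kmed_labels, medoids = k_medoids(marks, start)
print("k-means  :", km_labels, centroids)
print("k-medoids:", kmed_labels, medoids)
```

On this toy data both algorithms produce the same two groups. The difference lies in the representatives: a centroid is an average and need not match any student's actual marks, whereas each medoid is one of the observed rows.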

All rights reserved by www.ijsrd.com
