International Journal of Engineering, Management & Sciences (IJEMS) ISSN-2348 –3733, Volume-2, Issue-5, May 2015
Mining Student Learning Behavior in Library Usage Using K-Means Algorithm (Classification Algorithm)

Krutibash Nayak

Abstract— In this paper we use the K-means data mining algorithm to classify students based on their library usage data and the marks obtained in their respective courses. We used a specific mining tool to make the configuration and execution of data mining techniques easier for instructors. We used real data from institutions and courses with university students. We also applied pre-processing techniques to the original numerical data to check whether better classifier models are obtained. Finally, we claim that a classifier model appropriate for educational use has to be both accurate and comprehensible to instructors and management in order to be of use for decision making.

Index Terms— K-means, Data Mining, classification, web mining
I. INTRODUCTION
Data mining is an information processing technology that extracts interesting patterns or knowledge hidden in large volumes of incomplete, noisy, ambiguous and random practical application data: knowledge that people do not know in advance but that is potentially useful in applications.

Many methods are being developed to study student behaviour in e-learning and virtual classrooms, but few are being applied to classroom teaching and library usage. Most institutions spend a large share of their financial resources on the library, but it is difficult to measure how much this affects student learning and the growth of the institution. Faculty members find it difficult to teach if they do not know how students participate during lectures.

Learning portfolios generally refer to any records created in the learning process, such as notes, assignments, test papers, and reports. Through computer techniques, the students’ behaviour, such as the time taken to read learning materials (like books and journals), the time spent in the online library, logon frequency, assignments, and records of online conversations with others on the learning platform, can be recorded in a database. Thus, the learning portfolios of students participating in online learning include detailed raw data. If we could analyse the correlation between the students’ learning behaviour and learning achievements, teachers would be able to follow the students’ overall and personal learning situations to a greater extent. The teachers could also provide immediate guidance in order to improve the students’ learning outcomes.

Data mining is also known as knowledge discovery in databases (KDD; Baker & Yacef, 2009; Han & Kamber, 2006), and follows the standard KDD process:
1. data cleaning and integration
2. selection and transformation
3. applying data mining algorithms, followed by evaluation and presentation

Manuscript received May 14, 2015.
Krutibash Nayak, Student, M.Tech, Suresh Gyan Vihar University, Jaipur, Rajasthan, India
Figure 1 shows the overall architecture, which integrates all the phases of software development. All the users mine information, and various data mining algorithms are used to extract data from the databases listed in the figure. Information migration takes place as the users extract the data when required [Namo Narayan, 2008].
II. K-MEANS ALGORITHM
The k-means algorithm is an iterative algorithm that takes its name from its method of operation. The algorithm clusters observations into k groups, where k is provided as an input parameter. It assigns each observation to a cluster based on the observation’s proximity to the mean of the cluster. Each cluster’s mean is then recomputed and the process begins again. Here’s how the algorithm works:
1. The algorithm arbitrarily selects k points as the initial cluster centers (“means”).
2. Each point in the dataset is assigned to the closest cluster, based on the Euclidean distance between the point and each cluster center.
3. Each cluster center is recomputed as the average of the points in that cluster.
4. Steps 2 and 3 repeat until the clusters converge.
Convergence may be defined differently depending on the implementation, but it normally means that either no observations change clusters when steps 2 and 3 are repeated or that the changes do not make a material difference in the definition of the clusters.

www.alliedjournals.com

III. WORKING METHODOLOGY
1. Students are given a question paper (in the tutorial) together with books for reference (a minimum of 2 books for each question).
2. We measure how much time each student spends solving the paper and how many books he/she actually borrows to solve it.
3. We measure how frequently the same book is used by the students.
4. We record whether any journals or other books beyond the reference list are used to solve the problems.
5. The above 4 steps are repeated for every subject taught to the students.
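The measurements in steps 2 to 4 can be collected into one feature vector per student and subject. The following is a minimal sketch that assumes a hypothetical log format; the `UsageRecord` fields and the `build_features` helper are illustrative names and not part of the mining tool used in this study.

```python
from collections import Counter, defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class UsageRecord:
    """One library event for a tutorial paper (hypothetical log format)."""
    student_id: str
    subject: str
    item_id: str         # book or journal identifier
    minutes: float       # time spent with this item while solving the paper
    is_reference: bool   # True if the item was on the tutorial's reference list


def build_features(records):
    """Aggregate raw records into one feature dict per (student, subject)."""
    grouped = defaultdict(list)
    for rec in records:
        # Grouping by subject covers step 5 (repeat for every subject).
        grouped[(rec.student_id, rec.subject)].append(rec)

    features = {}
    for key, recs in grouped.items():
        uses_per_item = Counter(r.item_id for r in recs)
        extra = {r.item_id for r in recs if not r.is_reference}
        features[key] = {
            "total_minutes": sum(r.minutes for r in recs),         # step 2
            "items_borrowed": len(uses_per_item),                  # step 2
            "max_reuse_of_one_item": max(uses_per_item.values()),  # step 3
            "extra_items": len(extra),                             # step 4
        }
    return features


# Made-up records, for illustration only.
sample = [
    UsageRecord("s1", "Subject 1", "b1", 30.0, True),
    UsageRecord("s1", "Subject 1", "b1", 15.0, True),
    UsageRecord("s1", "Subject 1", "j7", 20.0, False),
    UsageRecord("s2", "Subject 1", "b2", 10.0, True),
]
features = build_features(sample)
```

Each resulting dictionary can be turned into a numeric tuple and passed to a clustering routine. Which of these features to keep, and how to scale them, is a design choice this sketch leaves open.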
Subject 1 Students Using library books & Journal for solving the problem
IV. RESULT ANALYSIS
The result below is taken from the textbook and illustrates this scheme. There are three classes: green, red, and blue. The authors applied k-means using 5 prototypes for each class. In the figure, the 5 prototypes chosen for each class are shown by filled circles.
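The scheme described above, k-means run separately within each class followed by nearest-prototype labelling, can be sketched as follows. This is a minimal illustration under stated assumptions, not the tool used in this study: the function names `kmeans`, `fit_prototypes`, and `classify`, the fixed random seed, and the toy points are all hypothetical, and the toy run uses 2 prototypes per class rather than 5 to stay small.

```python
import math
import random


def kmeans(points, k, seed=0, max_iter=100):
    """Plain k-means: return k cluster centers for a list of numeric tuples."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # step 1: arbitrary initial centers
    for _ in range(max_iter):
        # Step 2: assign each point to the closest center (Euclidean distance).
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: math.dist(p, centers[i]))
            clusters[nearest].append(p)
        # Step 3: recompute each center as the mean of its points.
        # An empty cluster keeps its previous center.
        new_centers = [
            tuple(sum(col) / len(c) for col in zip(*c)) if c else centers[i]
            for i, c in enumerate(clusters)
        ]
        if new_centers == centers:  # converged: no center moved
            break
        centers = new_centers
    return centers


def fit_prototypes(points_by_class, k):
    """Run k-means inside each class to obtain k prototypes per class."""
    return {label: kmeans(pts, k) for label, pts in points_by_class.items()}


def classify(point, prototypes):
    """Label a point with the class of its nearest prototype."""
    return min(
        (math.dist(point, proto), label)
        for label, protos in prototypes.items()
        for proto in protos
    )[1]


# Made-up 2-D points, for illustration only.
toy = {
    "green": [(1.0, 1.0), (1.2, 0.8), (0.8, 1.1), (1.1, 1.3)],
    "red": [(5.0, 5.0), (5.2, 4.9), (4.8, 5.1), (5.1, 5.3)],
    "blue": [(9.0, 1.0), (9.2, 0.8), (8.8, 1.1), (9.1, 1.3)],
}
prototypes = fit_prototypes(toy, k=2)
label_a = classify((1.0, 0.9), prototypes)
label_b = classify((5.1, 5.0), prototypes)
```

Because the initial centers are chosen arbitrarily, different seeds can give different prototypes; fixing the seed only makes the sketch repeatable.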
Subject 2 Cluster wise Student Library Usage for Solving Tutorial Problems
We analysed a database of 2000 students.
Subject 1 Cluster wise Student Library Usage for Solving Tutorial Problems
Subject 2 Students Using library books & Journal for solving the problem
V. CONCLUSION
From the above results we conclude that students’ library usage and their results are correlated. When students use additional library materials such as journals to solve the problems set during the tutorial, their results change dramatically. This shows that the library has a significant role in improving students’ results. These findings can also help library management decide which books to buy and which journals to license, so that the purchases benefit students and remain cost-effective. In future, other parameters, such as the quality time a student spends in the library, may be considered so that the results improve with more parameters. The approach may also be extended to software purchased by the management, so that it benefits students in a cost-effective manner.

REFERENCES
[1] F. Castro, A. Vellido, À. Nebot, and F. Mugica, “Applying data mining techniques to e-learning problems,” Evolution of Teaching and Learning Paradigms in Intelligent Environment, 2007, pp. 183–221.
[2] C. Romero, S. Ventura, P. G. Espejo, and C. Hervás, “Data mining algorithms to classify students,” Proceedings of Educational Data Mining, 2008, pp. 20–21.
[3] J. Han, M. Kamber, and J. Pei, Data Mining: Concepts and Techniques, 3rd ed. Morgan Kaufmann, 2011.
[4] R. Hershkovitz and R. Nachmias, “Learning about online learning processes and students’ motivation through Web usage mining,” Interdisciplinary Journal of Knowledge and Learning Objects, vol. 5, 2009, pp. 197–214.
[5] E. J. M. Lauría and J. Baron, “Mining Sakai to Measure Student Performance: Opportunities and Challenges in Academic Analytics,” 2011.