International Journal of Research in Advent Technology, Vol.2, No.6, June 2014 E-ISSN: 2321-9637
An Improved Approach to Enhance the Performance of Classical Apriori Algorithm to Mine Frequent Itemsets
Shivani Kwatra1, Ravneet Kaur2
1Student of Master of Technology and 2Assistant Professor
Department of Computer Science and Engineering
Sri Guru Granth Sahib World University, Fatehgarh Sahib, Punjab, India
Email: shivanikwatra4@gmail.com1 and ravneetin2002@gmail.com2
Abstract: Data mining is a process that uses a variety of data analysis tools to discover patterns and relationships in data that may be used to make valid predictions. It is used to obtain information from data sets. Association rule mining is one of the techniques used for obtaining frequent patterns. Frequent itemset mining is an important and difficult task in association rule mining. For large databases, research to improve mining performance is necessary. Researchers have proposed various ideas to improve the performance of mining frequent itemsets, and most of the resulting algorithms focus on the time factor. In this paper, a new algorithm is proposed that uses binary search to generate frequent itemsets.
Keywords: Data mining; Association rule mining; Frequent itemsets
1. INTRODUCTION
Data mining is the process of discovering knowledge in a database, that is, patterns and relationships in data that may be used to make valid predictions. It is used to obtain information from data sets. The growth in database size has led to the development of tools to mine frequent itemsets, and the need for automatic extraction of knowledge from data is increasing. The information hidden within databases has led to the discovery of association rules, which uncover useful patterns for decision support, marketing strategies, financial forecasting, and other applications. Different techniques are used to find frequent itemsets, such as association rules, correlations, clustering, classifiers, and many more, among which association rules are the most popular in the field of frequent itemset mining. The motivation behind frequent itemset mining is to examine which items are purchased together in a supermarket. This paper presents a survey on frequent itemsets.
2. LITERATURE REVIEW
Wei Zhang et al. [4] introduce the FP-growth algorithm as an improvement over Apriori that helps resolve two bottleneck problems of the traditional Apriori algorithm and is more efficient than the original. They present a method for constructing the FP-tree structure, and their experimental results show that the algorithm achieves higher mining efficiency in execution time, memory usage, and CPU utilization than most current algorithms such as Apriori.
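To make the FP-tree construction step concrete, the following is a minimal sketch in Python of the commonly described two-scan build: count item supports, then insert each transaction with its frequent items sorted by descending support. It is not the implementation from [4]; the names `FPNode`, `build_fp_tree`, and `tree_paths`, the toy transactions, and the alphabetical tie-breaking rule are all assumptions made for illustration. The header table and the mining phase of FP-growth are omitted.

```python
from collections import Counter


class FPNode:
    """One node of an FP-tree: an item, its count, and its children."""

    def __init__(self, item=None, parent=None):
        self.item = item
        self.count = 0
        self.parent = parent
        self.children = {}


def build_fp_tree(transactions, min_support):
    """Build an FP-tree in two scans of the transactions (illustrative only)."""
    # Scan 1: count each item once per transaction and keep the frequent ones.
    counts = Counter(item for t in transactions for item in set(t))
    frequent = {i: c for i, c in counts.items() if c >= min_support}

    root = FPNode()
    # Scan 2: drop infrequent items, sort the rest by descending support
    # (ties broken alphabetically, an assumption), and insert the path.
    for t in transactions:
        items = sorted((i for i in set(t) if i in frequent),
                       key=lambda i: (-frequent[i], i))
        node = root
        for item in items:
            child = node.children.get(item)
            if child is None:
                child = FPNode(item, parent=node)
                node.children[item] = child
            child.count += 1  # shared prefixes accumulate counts
            node = child
    return root, frequent


def tree_paths(node, prefix=()):
    """List every root-to-node path with that node's count, for inspection."""
    paths = []
    for item in sorted(node.children):
        child = node.children[item]
        path = prefix + (item,)
        paths.append((path, child.count))
        paths.extend(tree_paths(child, path))
    return paths


# Hypothetical market-basket data, not taken from the paper.
transactions = [
    ["bread", "milk"],
    ["bread", "butter", "milk"],
    ["bread", "butter"],
    ["milk", "jam"],
]
root, frequent = build_fp_tree(transactions, min_support=2)
for path, count in tree_paths(root):
    print(" > ".join(path), count)
```

Because transactions that share a prefix share nodes, the tree is usually much smaller than the raw database, which is the property FP-growth relies on to avoid repeated candidate generation.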
Goswami D.N. et al. [5] describe three frequent pattern mining approaches based on the classical Apriori algorithm: Record filter, Intersection, and a Proposed Algorithm. The Record filter approach proved better than the classical Apriori algorithm, the Intersection approach proved better than the Record filter approach, and the proposed algorithm proved much better than the other frequent pattern mining algorithms. Finally, a comparative study of all approaches is performed on a dataset of 2000 transactions.
Basheer Mohamad Al-Maqaleh and Saleem Khalid Shaab [6] proposed an efficient algorithm that integrates the confidence measure into the process of mining frequent itemsets, generating confident frequent itemsets. The suggested algorithm then generates strong association rules from these confident frequent itemsets. The technique has been implemented, and the experimental results show the usefulness and effectiveness of the proposed algorithm.
Saurabh Malgaonkar et al. [7] describe a system designed to find the most frequent combinations of items. It is based on developing an efficient algorithm that outperforms the best available frequent pattern algorithms on a number of typical data sets, which will help in marketing and sales. The technique can be used to uncover interesting cross-sells and related products. Three different association mining algorithms have been implemented, and the best combination method is then used to find more interesting results. The analyst can then perform the data mining and extraction, draw conclusions from the result, and make an appropriate decision.
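As a point of reference for the Apriori-based approaches reviewed above, the following is a minimal sketch in Python of classical level-wise Apriori, together with the standard confidence measure of a rule. It does not reproduce the algorithms of [5], [6], or [7]. In particular, confidence is computed here after mining rather than integrated into it as in [6]. The function names `apriori` and `confidence` and the toy transactions are assumptions made for illustration.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {frozenset: support count} for all frequent itemsets."""
    transactions = [frozenset(t) for t in transactions]
    # Level 1: the candidates are the single items.
    candidates = {frozenset([i]) for t in transactions for i in t}
    frequent = {}
    k = 1
    while candidates:
        # Count how many transactions contain each candidate.
        counts = {c: sum(1 for t in transactions if c <= t)
                  for c in candidates}
        level = {c: n for c, n in counts.items() if n >= min_support}
        frequent.update(level)
        # Join frequent k-itemsets into (k+1)-candidates, then prune any
        # candidate with an infrequent k-subset (the Apriori property).
        k += 1
        joined = {a | b for a in level for b in level if len(a | b) == k}
        candidates = {
            c for c in joined
            if all(frozenset(s) in level for s in combinations(c, k - 1))
        }
    return frequent


def confidence(frequent, antecedent, consequent):
    """Confidence of antecedent -> consequent: supp(A and C) / supp(A)."""
    a = frozenset(antecedent)
    return frequent[a | frozenset(consequent)] / frequent[a]


# Hypothetical market-basket data, not taken from the paper.
transactions = [
    ["bread", "milk"],
    ["bread", "butter", "milk"],
    ["bread", "butter"],
    ["milk", "jam"],
]
frequent = apriori(transactions, min_support=2)
rule_conf = confidence(frequent, ["butter"], ["bread"])
```

In this toy data, every basket containing butter also contains bread, so the rule butter -> bread has confidence 1.0, while {milk, butter} appears only once and is discarded as infrequent.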