IJIRST – International Journal for Innovative Research in Science & Technology | Volume 1 | Issue 7 | December 2014 ISSN (online): 2349-6010
Mining Sequences – Approaches and Analysis
Manika Verma Assistant Professor Department of Computer Science
Dr. Devarshi Mehta Associate Professor
Kadi Sarva Vishwavidyalaya, Gandhinagar, Gujarat, India
GLS Institute of Computer Technology, Ahmedabad, India
Dr. Vishal Dahiya Associate Professor Indus University, Ahmedabad, India
Krupa Mehta Assistant Professor GLS Institute of Computer Technology, Ahmedabad, India
Abstract
Sequential pattern mining discovers, from a database of sequences, the sequential patterns whose support meets a user-specified minimum, where the support of a pattern is the number of sequences that contain it. Each sequence in the database is a list of transactions ordered by transaction time, and each transaction is a set of items. Closed sequential pattern mining has the same capability as sequential pattern mining, but it reduces the number of redundant patterns that are generated and stored, which makes it more economical. This paper presents the approaches and key features of the algorithms ClaSP, CM-ClaSP, CloSpan and BIDE, which mine closed sequential patterns, and of the algorithms GSP, SPADE, PrefixSpan, SPAM and LAPIN, which mine sequential patterns. It shows that the number of sequences generated by closed sequential pattern mining is much smaller than the number generated by sequential pattern mining, which is what makes closed sequential pattern mining economical. The algorithms are compared on three attributes: the total time required to find frequent sequences, the number of frequent sequences generated, and the maximum memory required.
Keywords: Sequential Pattern Mining, Closed Sequential Pattern Mining
I. INTRODUCTION
Sequential pattern mining is applied in areas such as market and customer behavior analysis, web log analysis, pattern discovery in protein sequences and tandem repeats in DNA sequences, and mining XML query access patterns for caching [3][2]. Various mining methods have been studied, including general sequential pattern mining, closed sequential pattern mining and constraint-based sequential pattern mining [8][9][10][13]. In recent years, many studies have argued that, when identifying frequent patterns, mining only the closed patterns rather than all frequent patterns leads to better efficiency [3].
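To make the notions of support and closed patterns concrete, the following minimal Python sketch may help. It is an illustration only and is not one of the algorithms surveyed in this paper: it checks a hand-written candidate list by brute force, and it assumes the standard definition that a frequent pattern is closed when no super-pattern has the same support. The toy database and the names `seq`, `contains`, `support`, `frequent_patterns` and `closed_only` are hypothetical.

```python
from typing import Dict, FrozenSet, List, Tuple

Itemset = FrozenSet[str]
Seq = Tuple[Itemset, ...]  # a sequence is an ordered tuple of itemsets


def seq(*elements: str) -> Seq:
    """Hypothetical helper: each string is one itemset, each character one item."""
    return tuple(frozenset(e) for e in elements)


def contains(sequence: Seq, pattern: Seq) -> bool:
    """True if every pattern itemset is a subset of a distinct itemset of
    `sequence`, with the order of itemsets preserved (greedy left-to-right match)."""
    pos = 0
    for element in pattern:
        while pos < len(sequence) and not element <= sequence[pos]:
            pos += 1
        if pos == len(sequence):
            return False
        pos += 1
    return True


def support(database: List[Seq], pattern: Seq) -> int:
    """Support = number of database sequences that contain the pattern."""
    return sum(contains(s, pattern) for s in database)


def frequent_patterns(database: List[Seq], candidates: List[Seq],
                      min_support: int) -> Dict[Seq, int]:
    """Keep the candidates whose support reaches the user-specified minimum."""
    counted = {p: support(database, p) for p in candidates}
    return {p: s for p, s in counted.items() if s >= min_support}


def closed_only(frequent: Dict[Seq, int]) -> Dict[Seq, int]:
    """Drop every pattern that has a super-pattern with the same support.
    This is only meaningful if `frequent` holds all frequent patterns."""
    return {
        p: s for p, s in frequent.items()
        if not any(q != p and t == s and contains(q, p)
                   for q, t in frequent.items())
    }


# Toy database (assumed data): three sequences of single-item transactions.
database = [seq("a", "b", "c"), seq("a", "b"), seq("a", "c")]
candidates = [seq("a"), seq("b"), seq("c"), seq("a", "b"),
              seq("a", "c"), seq("b", "c"), seq("a", "b", "c")]

frequent = frequent_patterns(database, candidates, min_support=2)
closed = closed_only(frequent)
print(len(frequent), len(closed))  # 5 3
```

In this toy case five patterns are frequent but only three are closed: <b> and <c> are dropped because <a b> and <a c> have the same support of 2, so the closed set is smaller while the supports of the dropped patterns remain recoverable from it.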
Fig. 1: Approaches and Algorithm Used For Sequential Pattern Mining
All rights reserved by www.ijirst.org