International Journal of Computer & Organization Trends – Volume 5 – February 2014
Activity Based Data Management in Mobile Environment Using Data Mining Techniques
Satwant Kaur#1, Er. Rishma Chawla*2, Er. Varinderjit Kaur@3
1 M. Tech, Department of CSE, RIET, Phagwara, Punjab, India
2 HOD, Department of CSE, RIET, Phagwara, Punjab, India
3 AP, Department of CSE, RIET, Phagwara, Punjab, India
Abstract - Wireless communications and mobile device technologies not only accentuate various wireless communications applications but also enable the provision of plentiful kinds of mobile services. In mobile service environments, mobile users may request various kinds of services and applications through mobile devices or laptop computers from arbitrary locations at any time via 3G, WiMax, wireless LAN, or other wireless networking technologies. Most users in a mobile environment are moving and accessing wireless services for the activities they are currently engaged in. We propose the idea of a complex activity for characterizing the continuously changing complex behavior patterns of mobile users. An activity may be composed of sub-activities. For the purpose of data management, a complex activity is modeled as a sequence of location movements, service requests, and the co-occurrence of location and service. We therefore propose a new method that uses complex activities to manage the database. In particular, we apply ID3 and CART and devise a new algorithm, the Enhanced algorithm. Preliminary implementation and simulated results show that the proposed framework and techniques can significantly reduce the response time, increase the performance, and increase the accuracy rate. We implement these algorithms on the WEKA tool and compare them on different parameters.

Keywords- Data Mining, ID3 Algorithm, CART Algorithm, Enhanced Algorithm.

I. INTRODUCTION
With the rapid advancement of the mobile user environment, it is important to manage the data of mobile users [1]. Data mining plays an important role in managing the database. Data mining is an emerging research area whose goal is to discover potentially useful information in databases [2]. Hence data mining has attracted a great deal of attention in recent years. Several data mining techniques are available to analyze data [3]. The decision tree is one of the most important techniques of data mining. Based on the decision tree, we implement three algorithms, ID3, CART and the Enhanced algorithm, on the activity based data in a mobile environment. The data mining algorithm involves incremental mining for user moving patterns in a mobile computing environment and exploits the mining results to develop data allocation schemes so as to improve the overall performance of a mobile system, reduce the execution time and improve the accuracy rate. This paper describes data mining, ID3, CART and the Enhanced algorithm, implements these algorithms on the WEKA tool, and compares them on different parameters.

II. DATA MINING
Data mining is knowledge discovery: the computer-assisted process of digging through and analyzing enormous sets of data and then extracting the meaning of the data. Data mining software is one of a number of analytical tools for analyzing data [4]. It allows users to analyze data from many different dimensions or angles, categorize it, and summarize the relationships identified. Data mining consists of five major elements:
ISSN: 2249-2593
http://www.ijcotjournal.org
Page 57
- Extract, transform, and load transaction data onto the data warehouse system.
- Store and manage the data in a multidimensional database system.
- Provide data access to business analysts and information technology professionals.
- Analyze the data by application software.
- Present the data in a useful format, such as a graph or table.

III. ID3 ALGORITHM
The ID3 algorithm is a decision tree classification method based on information entropy [5]. It classifies cases according to the values of their attributes and uses information entropy as its evaluation function. It applies a top-down, non-backtracking strategy that searches only part of the space of possible trees, aiming for the simplest decision tree and the fewest tests per datum. ID3 uses information entropy theory to select the attribute with the maximum information gain in the current sample set as the test attribute [6]. The sample set is then divided according to the values of the test attribute: the number of distinct values of the test attribute decides the number of sub-sample sets, and new leaf nodes grow out of the corresponding node of the sample set in the decision tree. Because a simpler decision tree makes it easier to summarize the underlying regularities, the average path from non-leaf nodes to their descendant nodes is expected to be the shortest. Minimizing the average depth of the decision tree requires choosing a good division at each node.

IV. CART ALGORITHM
CART [7] is an acronym for Classification and Regression Trees, a decision-tree procedure introduced in 1984 by the world-renowned UC Berkeley and Stanford statisticians Leo Breiman, Jerome Friedman, Richard Olshen and Charles Stone [8]. The CART methodology solves a number of performance, accuracy, and operational problems that still plague many current decision-tree methods. CART innovations include:
- solving the "how big to grow the tree" problem,
- using strictly two-way splitting,
- incorporating automatic testing and tree validation,
- providing a completely new method for handling missing values.

Features of CART Algorithm
1. The visual display enables users to see the hierarchical interaction of the variables.
2. Further, because simple if-then rules can be read right off the tree, models are easy to apply to new data.
3. CART uses strictly binary, or two-way, splits that divide each parent node into exactly two child nodes by posing questions with yes/no answers at each decision node.
4. CART is unique among decision-tree tools. The proven CART methodology is characterized by [9]:
(a) Reliable pruning strategy.
(b) CART's developers determined definitively that no stopping rule could be relied on to discover the optimal tree.
(c) Powerful binary-split search approach: CART binary decision trees are more sparing with data, so that more structure can be detected before too little data is left for learning.
(d) Automatic self-validation procedures: in the search for patterns in databases it is essential to avoid the trap of overfitting.
(e) Further, the testing and selection of the optimal tree are an integral part of the CART algorithm.
(f) Automated solutions in which surrogate splitters intelligently handle missing values.
(g) Multiple-tree, committee-of-experts methods increase the precision of results.

Table 1: Attributes of dataset
Sr. no.   Attributes
1.        SSID
2.        Strength
3.        Authenticate type
4.        Frequency
5.        Channel
6.        Speed
7.        Bandwidth
8.        Transmitted frame count
9.        Service

V. ENHANCED ALGORITHM
The tree starts as a single node representing the training samples. If the samples are all of the same class, then the node becomes a leaf and is labeled with that class. Otherwise, the algorithm uses an entropy-based measure known as information gain as a heuristic for selecting the attribute that will best separate the samples into individual classes. This attribute becomes the "test" or "decision" attribute at the node. (All of the attributes are categorical or discrete-valued; continuous-valued attributes must be discretized.) A branch is created for each known value of the test attribute, and the samples are partitioned accordingly. The algorithm uses the same process recursively to form a decision tree for the samples at each partition. Once an attribute has occurred at a node, it need not be considered in any of the node's descendants. The recursive partitioning stops only when any one of the following conditions is true:
1. All the samples for a given node belong to the same class.
2. There are no remaining attributes on which the samples may be further partitioned. In this case, majority voting is employed. This involves converting the given node into a leaf and labeling it with the class in the majority among the samples. Alternatively, the class distribution of the node samples may be stored.
3. There are no samples for the branch test-attribute. In this case, a leaf is created with the majority class in the samples.
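The induction procedure described in Section V can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation (which the paper runs inside WEKA). The function names, the dictionary-based tree representation, and the four toy records are invented for the example; only the attribute names Strength and Speed are borrowed from Table 1. Information gain is taken here as the entropy of the class labels minus the weighted entropy that remains after partitioning on an attribute.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())


def information_gain(rows, labels, attr):
    """Entropy reduction obtained by partitioning the samples on `attr`."""
    partitions = {}
    for row, label in zip(rows, labels):
        partitions.setdefault(row[attr], []).append(label)
    remainder = sum(len(part) / len(labels) * entropy(part)
                    for part in partitions.values())
    return entropy(labels) - remainder


def majority_class(labels):
    """Most frequent class label among the samples (majority voting)."""
    return Counter(labels).most_common(1)[0][0]


def build_tree(rows, labels, attributes):
    # Stopping condition 1: all samples share one class, so make a leaf.
    if len(set(labels)) == 1:
        return labels[0]
    # Stopping condition 2: no attributes left, so label by majority vote.
    if not attributes:
        return majority_class(labels)
    # Choose the test attribute with the highest information gain.
    best = max(attributes, key=lambda a: information_gain(rows, labels, a))
    # An attribute used at a node is not reconsidered in its descendants.
    remaining = [a for a in attributes if a != best]
    node = {"attribute": best,
            "default": majority_class(labels),
            "branches": {}}
    # One branch per value of the test attribute seen in the samples.
    for value in sorted({row[best] for row in rows}):
        pairs = [(r, l) for r, l in zip(rows, labels) if r[best] == value]
        node["branches"][value] = build_tree(
            [r for r, _ in pairs], [l for _, l in pairs], remaining)
    return node


def classify(tree, row):
    """Walk the tree; unseen attribute values fall back to the majority class."""
    while isinstance(tree, dict):
        tree = tree["branches"].get(row[tree["attribute"]], tree["default"])
    return tree


# Hypothetical toy records (invented values, not the paper's dataset).
rows = [
    {"Strength": "strong", "Speed": "fast"},
    {"Strength": "strong", "Speed": "slow"},
    {"Strength": "weak", "Speed": "fast"},
    {"Strength": "weak", "Speed": "slow"},
]
labels = ["video", "video", "text", "text"]

tree = build_tree(rows, labels, ["Strength", "Speed"])
print(tree["attribute"])
print(classify(tree, {"Strength": "weak", "Speed": "fast"}))
```

On these toy records, Strength separates the two classes perfectly (a gain of 1 bit) while Speed gives no gain, so the root tests Strength. Because branches are only created for values that occur in the training samples, the third stopping condition is approximated at classification time by falling back to the node's majority class.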
VI. SIMULATION RESULT
1. IMPLEMENTATION OF ID3, CART AND ENHANCED ALGORITHM
Table 2: Collected Dataset
1.2 IMPLEMENTATION OF THE ENHANCED ALGORITHM ON WEKA
Figure 1: Enhanced Algorithm Implementation on Weka
1.3 IMPLEMENTATION OF THE CART ALGORITHM ON WEKA
Figure 2: CART Algorithm Implementation on Weka
1.4 IMPLEMENTATION OF THE ID3 ALGORITHM ON WEKA
Figure 3: ID3 Algorithm on Weka
Comparison of the ID3, CART and Enhanced Algorithm with Different Parameters
Table 3: Comparison of the ID3, CART and Enhanced Algorithm (EA) on Different Parameters
Parameter                          ID3      CART     EA
Execution Time                     0.11     0.13     0.02
Correctly classified instance      87.2%    51.6%    34%
Incorrectly classified instance    4.6%     48.3%    65.3%
Accuracy                           37.3%    94.3%    96.3%
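The correctly and incorrectly classified percentages reported in Table 3 are the kind of summary a classifier evaluation produces. The sketch below shows one common way to compute them from actual and predicted labels. It is an assumption-laden illustration only: the function name, the use of None to mark an instance the classifier left unclassified, and the sample labels are all hypothetical, and the paper does not state how its own figures (including the Accuracy row) were derived.

```python
def classification_summary(actual, predicted):
    """Percent of instances classified correctly, incorrectly, or not at all.

    `predicted` may contain None for an instance the classifier could not
    label (a hypothetical convention chosen for this sketch).
    """
    total = len(actual)
    correct = sum(1 for a, p in zip(actual, predicted)
                  if p is not None and p == a)
    unclassified = sum(1 for p in predicted if p is None)
    incorrect = total - correct - unclassified
    return {
        "correct_pct": 100.0 * correct / total,
        "incorrect_pct": 100.0 * incorrect / total,
        "unclassified_pct": 100.0 * unclassified / total,
    }


# Invented labels for illustration only.
actual = ["a", "a", "b", "b", "b"]
predicted = ["a", "b", "b", None, "b"]
summary = classification_summary(actual, predicted)
print(summary)
```

Under this convention the correct and incorrect percentages sum to 100 only when no instance is left unclassified.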
Figure 4: Analysis of Execution Time (ID3, CART, EA)
Figure 5: Analysis of Correctly Classified Instances (ID3, CART, EA)
Figure 6: Analysis of Incorrectly Classified Instances (ID3, CART, EA)
Figure 7: Analysis of Accuracy in Data (ID3, CART, EA)

VII. CONCLUSIONS
This research highlights approaches for creating a decision tree. These approaches are mainly available in academic tools from the machine learning community. We note that they are a credible alternative to decision trees and classification rules, in terms of both accuracy and processing time. After analysing ID3, CART and the Enhanced algorithm, we find the Enhanced algorithm the most suitable: it mines the data accurately, with less access time and a minimum error rate. The Enhanced Algorithm is therefore the best of the three algorithms for mining a mobile environment data set.

REFERENCES
[1] W. Ma, Y. Fang, and P. Lin, "Mobility Management Strategy Based on User Mobility Patterns in Wireless Networks," IEEE Trans. Vehicular Technology, vol. 56, no. 1, pp. 322-330, Jan. 2007.
[2] B. N. Lakshmi and G. H. Raghunandhan, "A conceptual overview of data mining," IEEE, Proceedings of the National Conference on Innovations in Emerging Technology, pp. 27-32, 17-18 Feb. 2011.
[3] Omer Adel Nassar and Dr. Nedhal A. Saiyd, "The integrating between web usage mining and data mining techniques," 5th International Conference on Computer Science and Information Technology, 2013.
[4] Margaret H. Dunham, "Data Mining: Introductory and Advanced Topics," Pearson Education, Delhi, India, 2004.
[5] J. R. Quinlan, "Induction of decision trees," Machine Learning, vol. 1, no. 1, pp. 81-106, 1986.
[6] Ren Yanna, "The Design of Algorithm for Data Mining System Used for Web Service," IEEE, 2011.
[7] G. Sathyadevi, "Application of CART algorithm in hepatitis disease diagnosis," IEEE International Conference on Recent Trends in Information Technology, ICRTIT 2011, June 3-5, 2011.
[8] K. Cios, W. Pedrycz, and R. Swiniarski, Data Mining Methods for Knowledge Discovery. Boston: Kluwer Academic Publishers, 1998.
[9] P. Cheeseman and J. Stutz, "Bayesian Classification (AUTOCLASS): Theory and Results," in Advances in Knowledge Discovery and Data Mining, eds., pp. 51, 1996.