
IDL - International Digital Library Of Technology & Research Volume 1, Issue 5, May 2017

Available at: www.dbpublications.org

International e-Journal For Technology And Research-2017

Web Oriented FIM for Large Scale Datasets Using Hadoop

Mrs. Supriya C
PG Scholar, Department of Computer Science and Engineering
C.M.R.I.T, Bangalore, Karnataka, India
supriyakuppur@gmail.com

Abstract: In large scale datasets, existing parallel algorithms for mining frequent itemsets balance the load by distributing the enormous data across a collection of computers. However, we identify a performance issue in these existing mining algorithms [1]. To handle this problem, we introduce a data partitioning approach using the MapReduce programming model. In our proposed system, we also introduce a new structure, the frequent itemset ultrametric tree (FIU-tree), in place of conventional FP-trees. Experimental results show that eliminating redundant transactions improves performance by reducing computing loads.

Keywords: frequent itemset, MapReduce, data partitioning, parallel computing, load balancing

1 INTRODUCTION

Big data is an emerging technology in the modern world. It refers to amounts of data so large that they are hard to process using traditional data processing techniques or software. The major challenges in big data are storing, distributing, searching, visualizing, querying, and updating such data. Data analysis is another big concern that needs attention when dealing with big data. Big data involves data produced by many different types of sources and applications, such as social media and online auctions. Data is differentiated into 3 major types: structured, unstructured, and semi-structured. Big data is also characterized by 3 major V's, Volume, Velocity, and Variety, which give a clear notion of what big data is.

Nowadays data is growing very fast. For example, many hospitals store trillions of data points of ECG data, and Twitter alone collects around 170 million items of temporal data and, every now and then, serves as many as 200 million queries per day. The most important limitations of existing systems concern handling larger datasets: our databases can handle only structured data, not varieties of data, and they lack fault tolerance and scalability. That is why big data plays an important role these days.


Compared with traditional systems, modern distributed systems try to achieve high efficiency and scalability when distributed data is executed on large-scale clusters. Many algorithms built on Hadoop have been defined to process FIM, aiming at balancing the load by distributing it equally [4] among nodes. When such data is divided into different parts, the connections between the data must be maintained; otherwise it leads to poor data locality, and in parallel it increases data-shuffling costs and network overhead. In order to improve data locality, we introduce a parallel FIM technique in which the bulk of the data is distributed across Hadoop clusters.

In this paper we implement FIM on Hadoop [10] clusters using the MapReduce framework. The project aims to boost the performance of parallel FIM on Hadoop clusters, and this is achieved with the help of Map and Reduce jobs.

2 OBJECTIVES

The main goal of the project is to eliminate redundant transactions on Hadoop nodes to improve performance, which is achieved by reducing the computing and networking load. It mainly gives attention to grouping highly significant transactions into one data partition. In the area of big data processing, the MapReduce framework has been used to develop parallel data mining algorithms, some FIM and FP-growth [3] based, some ARM based. Considering bulky datasets, a single machine is not able to handle everything, so the data needs to be distributed and processed in parallel among clusters of nodes, which is a foremost challenge. To handle this scenario we need to design a distributed storage system. In big data, this is provided by Hadoop, a system that stores and processes big data. It includes two important techniques: HDFS (for storing big data) and the MapReduce framework (for processing big data). Big data processing deals with three different stages: data ingestion, data storage, and data analysis.

If data is distributed, it is tough to find the locality of files for bigger datasets. A better solution to this problem is to follow a Master-Slave architecture, in which a single machine acts as the 'Master' and the remaining machines are treated as 'Slaves'. The Master knows the locations of the files stored on the different Slave machines, so whenever a client sends a request, the Master machine processes it by finding the requested file on one of the underlying Slave machines. Hadoop follows the same architecture.
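The paper describes this lookup only architecturally, but it is directly visible through Hadoop's Java client API. As a minimal sketch (our illustration, not part of the proposed system; the dataset path is hypothetical), the following program asks the Master (the HDFS NameNode) which Slave machines hold the blocks of a file:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.BlockLocation;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

// Ask the Master (NameNode) which Slave (DataNode) machines hold
// each block of a file. The path below is a hypothetical example.
public class BlockLocator {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();       // reads core-site.xml / hdfs-site.xml
        FileSystem fs = FileSystem.get(conf);           // client handle to HDFS
        Path file = new Path("/data/transactions.dat"); // hypothetical dataset path
        FileStatus status = fs.getFileStatus(file);
        BlockLocation[] blocks =
                fs.getFileBlockLocations(status, 0, status.getLen());
        for (BlockLocation b : blocks) {
            // every block reports the Slave hosts storing a replica
            System.out.println(b.getOffset() + " -> "
                    + String.join(",", b.getHosts()));
        }
        fs.close();
    }
}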

3 METHODOLOGY

Traditional mining algorithms [2] are not enough to handle large datasets, so we introduce a new data partitioning technique. Parallel computing [7] is one more method we introduce here, to process the redundant transactions in parallel, so that we can achieve better performance compared with the traditional mining algorithms.


In the proposed system we consider both the old parallel mining algorithm and the new mining algorithm on Hadoop, and measure how much processing time each system takes. Hadoop gives us good modules to achieve this, and the whole system is depicted briefly in Fig 3.1. The methodology proceeds in four steps.

Fig 3.1 System Architecture: High Level View

Step 1, scan the transaction DB: First we scan the transaction database to count each item; the items that survive form the frequent 1-itemsets. Each result is a <key, value> pair.

Step 2, organize the frequent 1-itemsets into Flist: The frequent 1-itemsets are sorted in decreasing order of frequency; the sorted list is called Flist.
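The paper gives no code for Steps 1 and 2, so the following is a sketch of how the Step 1 counting pass could look as a Hadoop MapReduce job in Java, assuming one transaction per input line with space-separated items; the class names and the minimum-support threshold are our assumptions:

import java.io.IOException;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;

// Step 1 sketch: mappers emit <item, 1> for every item occurrence;
// the reducer sums the counts and keeps items whose support reaches
// MIN_SUPPORT, yielding the frequent 1-itemsets as <item, count> pairs.
public class OneItemsetCount {

    public static class ItemMapper
            extends Mapper<LongWritable, Text, Text, IntWritable> {
        private static final IntWritable ONE = new IntWritable(1);
        private final Text item = new Text();

        @Override
        protected void map(LongWritable offset, Text transaction, Context ctx)
                throws IOException, InterruptedException {
            for (String tok : transaction.toString().split("\\s+")) {
                if (tok.isEmpty()) continue;
                item.set(tok);
                ctx.write(item, ONE);
            }
        }
    }

    public static class SupportReducer
            extends Reducer<Text, IntWritable, Text, IntWritable> {
        private static final int MIN_SUPPORT = 2;  // hypothetical threshold

        @Override
        protected void reduce(Text item, Iterable<IntWritable> counts, Context ctx)
                throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable c : counts) sum += c.get();
            if (sum >= MIN_SUPPORT) {
                ctx.write(item, new IntWritable(sum)); // frequent 1-itemset
            }
        }
    }
}

Because only the frequent 1-itemsets survive this job, its output is far smaller than the raw transaction database, so the Step 2 sort into Flist can be done in memory on the driver.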

Step 3, build the FIU-tree: This step is performed with two Map and Reduce phases.

• Mapper: Using the Flist obtained in Step 2, the mappers process each transaction against the Flist and produce output as a set of <key, value> pairs, which the framework groups into group-dependent sub-datasets (see the sketch after this list).

• Reducer: Each reducer instance is assigned one or more group-dependent sub-datasets and processes them one by one. For each sub-dataset, the reducer instance builds a local FP-tree; during the recursive mining process it outputs the discovered patterns (a sketch of such a local tree follows Step 4).
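The map phase of Step 3 is described only at a high level. One common way to form the group-dependent sub-datasets, borrowed from parallelized FP-growth [7] rather than taken from the paper, is sketched below; the group count, the hard-coded Flist, and all names are illustrative assumptions:

import java.io.IOException;
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

// Step 3 map-phase sketch: each transaction is pruned to its frequent
// items, rewritten as Flist ranks, and emitted at most once per group,
// so each reducer receives a self-contained group-dependent sub-dataset.
public class GroupMapper
        extends Mapper<LongWritable, Text, IntWritable, Text> {

    private static final int NUM_GROUPS = 4;            // hypothetical group count
    private final Map<String, Integer> rank = new HashMap<>();

    @Override
    protected void setup(Context ctx) {
        // A real job would load the Flist from Step 2's output (for
        // example via the distributed cache); a tiny hard-coded ranking
        // keeps this sketch self-contained.
        String[] flist = {"a", "b", "c", "d", "e"};
        for (int i = 0; i < flist.length; i++) rank.put(flist[i], i);
    }

    @Override
    protected void map(LongWritable offset, Text transaction, Context ctx)
            throws IOException, InterruptedException {
        // keep only frequent items, ordered by Flist rank
        List<Integer> ranks = new ArrayList<>();
        for (String tok : transaction.toString().split("\\s+")) {
            Integer r = rank.get(tok);
            if (r != null) ranks.add(r);
        }
        Collections.sort(ranks);

        // walk from the least frequent item backwards, emitting each
        // prefix to that item's group exactly once per transaction
        boolean[] emitted = new boolean[NUM_GROUPS];
        for (int i = ranks.size() - 1; i >= 0; i--) {
            int group = ranks.get(i) % NUM_GROUPS;
            if (emitted[group]) continue;
            emitted[group] = true;
            StringBuilder prefix = new StringBuilder();
            for (int j = 0; j <= i; j++) {
                if (j > 0) prefix.append(' ');
                prefix.append(ranks.get(j));
            }
            ctx.write(new IntWritable(group), new Text(prefix.toString()));
        }
    }
}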

Step 4, accumulate: The outputs generated in Step 3 are combined to produce the final result.
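To make the reducer side of Step 3 concrete before its outputs are combined, here is a minimal sketch of the local prefix-sharing tree each reducer could build. The paper's FIU-tree differs in its details, so this stand-in is a plain FP-tree with illustrative names:

import java.util.HashMap;
import java.util.Map;

// Minimal per-group structure a reducer could build: transactions in
// Flist order share prefixes, and each node counts how many
// transactions pass through it. Mining then walks this tree recursively.
public class LocalFpTree {

    static final class Node {
        final Map<String, Node> children = new HashMap<>();
        int count = 0;
    }

    private final Node root = new Node();

    // Insert one transaction whose items are already in Flist order.
    public void insert(String[] itemsInFlistOrder) {
        Node cur = root;
        for (String item : itemsInFlistOrder) {
            cur = cur.children.computeIfAbsent(item, k -> new Node());
            cur.count++;               // path count accumulates support
        }
    }

    public static void main(String[] args) {
        LocalFpTree tree = new LocalFpTree();
        tree.insert(new String[] {"a", "b", "c"});
        tree.insert(new String[] {"a", "b", "d"});  // shares the a-b prefix
    }
}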

4 IMPLEMENTATION

In this project, we try to show how to achieve a better performance measure by comparing the existing parallel mining algorithm with the data partitioning system, using some clustering algorithms. First we load the large dataset into HDFS [6]; once it is uploaded, it goes to the main web server where the parallel FIM [5] application is running. Based on the minimum support, the application partitions the data between two different servers and runs two MapReduce jobs. Finally, the results are sent back to the main server, which conducts another Map and Reduce job to mine further frequent itemsets. Thus, in total, we run three MapReduce jobs.
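A driver for this three-job pipeline could be structured as follows. This sketch chains the earlier example classes under hypothetical HDFS paths; it is not the project's actual driver, and the second and third jobs are only indicated:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

// Hypothetical driver: each job's output directory feeds the next job.
public class FimDriver {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();

        // Job 1: frequent 1-itemset counting (Steps 1 and 2)
        Job count = Job.getInstance(conf, "1-itemset-count");
        count.setJarByClass(FimDriver.class);
        count.setMapperClass(OneItemsetCount.ItemMapper.class);
        count.setReducerClass(OneItemsetCount.SupportReducer.class);
        count.setOutputKeyClass(Text.class);
        count.setOutputValueClass(IntWritable.class);
        FileInputFormat.addInputPath(count, new Path("/data/transactions.dat"));
        FileOutputFormat.setOutputPath(count, new Path("/out/flist"));
        if (!count.waitForCompletion(true)) System.exit(1);

        // Job 2: group-dependent partitioning and local mining (Step 3)
        // Job 3: accumulating the partial results (Step 4)
        // ...configured the same way, reading /out/flist and /out/groups...
    }
}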

5 OUTCOMES

Bringing together the new parallel mining algorithm and data partitioning yields better performance compared with traditional mining algorithms such as Apriori and MLFPT [9], as showcased in the graphs below (Fig 5.1 and Fig 5.2).



Fig 5.1 Effects of minimum support

Fig 5.2 Speed-up performance

CONCLUSION AND FUTURE SCOPE

In any area we consider, a huge number of records can be generated in a fraction of a second. For processing such information, Apache Hadoop provides different frameworks such as MapReduce. With traditional parallel mining algorithms for frequent itemset mining, processing such data takes more time, and system performance and load balancing were major challenges. This experiment introduces a new parallel mining algorithm, FIUT, using the MapReduce programming paradigm; it divides the input data across multiple Hadoop nodes and performs parallel mining to generate the frequent itemsets. This data partitioning technique not only improves the performance of the system but also balances the load. In future, the approach can be validated with another emerging technology from the Apache ecosystem, Apache Spark [6]. It is a cluster computing technology [8] that is faster than MapReduce. It uses Python as a programming language, whereas MapReduce uses Java, and Python requires fewer lines of code; overall, Spark improves processing speed.

ACKNOWLEDGEMENT

I would also like to thank Mrs. Swathi, Assoc. Professor and HOD, Department of Computer Science and Engineering, CMRIT, Bangalore, who shared her opinions and experiences, through which I received the information crucial for the project.

REFERENCES

[1] Osmar R. Zaïane, Mohammad El-Hajj, Paul Lu. "Fast Parallel Association Rule Mining without Candidacy Generation." IEEE, Canada, 2001. ISBN 0-7695-1119-8.

[2] CH. Sekhar, S. Reshma Anjum. "Cloud Data Mining based on Association Rule." International Journal of Computer Science and Information Technology, Vol. 5(2), pp. 2091-2094, Andhra Pradesh, 2014. ISSN 0975-9646.

[3] Arkan A. G. Alhamodi, Songfeng Lu, Yahya E. A. Alsalhi. "An Enhanced FP-Growth based on MapReduce for Mining Association Rules." IJDKP, Vol. 6, China, 2016.


[4] Vrushali Ubarhande, Alina Madalina Popescu, Horacio González-Vélez. "Novel Data-Distribution Technique for Hadoop in Heterogeneous Cloud Environments." International Conference on Complex, Intelligent and Software Intensive Systems, Ireland, 2015. ISBN 978-1-4799-8870-9.

[5] Yaron Gonen, Ehud Gudes. "An Improved MapReduce Algorithm for Mining Closed Frequent Itemsets." International Conference on Software Science, Technology and Engineering, Israel, 2016. ISBN 978-1-5090-1018-9.

[6] Ankush Verma, Ashik Hussain Mansuri, Dr. Neelesh Jain. "Big Data Management Processing with Hadoop MapReduce and Spark Technology: A Comparison." CDAN, Rajasthan, 2016.

[7] Adetokunbo Makanju, Zahra Farzanyar, Aijun An, Nick Cercone, Zane Zhenhua Hu, Yonggang Hu. "Deep Parallelization of Parallel FP-Growth Using Parent-Child MapReduce." IEEE, Canada, 2016.

[8] Feng Zhang, Yunlong Ma, Min Liu. "A Distributed Frequent Itemset Mining Algorithm Using Spark for Big Data Analytics." Springer, New York, 2015.

[9] Bhagyashri Waghamare, Bharat Tidke. "Review: Association Rule for Distributed Data." ISCSCN, India. ISSN 2249-5789.

[10] Hamoud Alshammari, Jeongkyu Lee, Hassan Bajwa. "H2Hadoop: Improving Hadoop Performance using the Metadata of Related Jobs." IEEE, 2015 (manuscript TCC-2015-11-0399).
