International Journal of Engineering and Applied Sciences (IJEAS) ISSN: 2394-3661, Volume-2, Issue-9, September 2015
Efficient Mining for Hadoop process with big data
Savita Suryavanshi
capability; 2. designing algorithms and business methods; 3. business models. To adapt to multisource, massive, dynamic Big Data, researchers have expanded existing data mining methods in many ways, including improving the efficiency of single-source knowledge discovery methods, designing data mining mechanisms from a multisource perspective, and studying dynamic data mining methods and the analysis of stream data. The main motivation for discovering knowledge from massive data is improving the efficiency of single-source mining methods. As computer hardware gradually improves, researchers continue to explore ways to make knowledge discovery algorithms more efficient so that they are better suited to massive data. Because massive data are typically collected from different data sources, knowledge discovery on such data must be performed using a multisource mining mechanism. As real-world data often arrive as a data stream or a characteristic flow, a well-established mechanism is needed to discover knowledge and track how that knowledge evolves in the dynamic data source. Therefore, the massive, heterogeneous, and real-time characteristics of multisource data mark the essential differences between single-source knowledge discovery and multisource data mining.

The theory of local pattern analysis has been proposed and established, laying a foundation for global knowledge discovery in multisource data mining. This theory provides a solution not only for the problem of full search, but also for finding global models that traditional mining methods cannot find. Local pattern analysis avoids putting different data sources together to carry out centralized computing. Data streams are widely used in financial analysis, online trading, medical testing, and so on. This project aims to build a stream-based Big Data analytic framework for fast response and real-time decision making.
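The idea behind local pattern analysis, mining each source on its own and combining only the resulting patterns, can be illustrated with a minimal Python sketch. It rests on several assumptions that are not part of the paper: patterns are limited to itemsets of size one and two, a simple vote across sources stands in for the synthesis step, and the function names, thresholds, and sample transactions are hypothetical.

```python
from collections import Counter
from itertools import combinations


def mine_local_patterns(transactions, min_support):
    """Return {itemset: support} for 1- and 2-itemsets frequent in ONE source."""
    n = len(transactions)
    counts = Counter()
    for transaction in transactions:
        items = sorted(set(transaction))  # canonical order, no duplicates
        for size in (1, 2):
            for itemset in combinations(items, size):
                counts[itemset] += 1
    return {p: c / n for p, c in counts.items() if c / n >= min_support}


def synthesize_global_patterns(local_results, min_sources):
    """Keep patterns that are locally frequent in at least `min_sources` sources.

    Only the local pattern sets are exchanged; raw transactions never leave
    their source, so no centralized computation over pooled data is needed.
    """
    votes = Counter()
    for patterns in local_results:
        votes.update(patterns.keys())
    return {p: v for p, v in votes.items() if v >= min_sources}


# Hypothetical data: three independent sources, each a list of transactions.
sources = [
    [["milk", "bread"], ["milk", "eggs"], ["milk", "bread", "eggs"]],
    [["bread", "jam"], ["milk", "bread"], ["bread"]],
    [["milk", "bread"], ["milk"], ["tea"]],
]

local_results = [mine_local_patterns(s, min_support=0.6) for s in sources]
global_patterns = synthesize_global_patterns(local_results, min_sources=2)
print(global_patterns)
```

In a Hadoop-style deployment, the local mining step would correspond roughly to work done independently per data source, with the voting step acting as the aggregation; that mapping is an interpretation rather than something the text specifies.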
The key challenges and research issues include:
- designing Big Data sampling mechanisms to reduce Big Data volumes to a manageable size for processing;
- building prediction models from Big Data streams, such that the models can adaptively adjust to dynamic changes in the data;
- developing a knowledge indexing framework to ensure real-time data monitoring and classification for Big Data applications.
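Two of the issues above, sampling a stream down to a manageable size and keeping a prediction that adapts as the data change, can be sketched in Python. This is an illustration under stated assumptions, not the mechanism used in this work: reservoir sampling (Algorithm R) is swapped in as the sampling mechanism, an exponentially weighted moving average stands in for an adaptive prediction model, and the names `reservoir_sample` and `EwmaPredictor` are hypothetical.

```python
import random


def reservoir_sample(stream, k, seed=None):
    """Keep a uniform random sample of k items from a stream of unknown length.

    Memory use is O(k) regardless of how many items the stream produces.
    """
    rng = random.Random(seed)
    reservoir = []
    for i, item in enumerate(stream):
        if i < k:
            reservoir.append(item)      # fill the reservoir first
        else:
            j = rng.randint(0, i)       # uniform over 0..i, both ends inclusive
            if j < k:
                reservoir[j] = item     # replace with probability k / (i + 1)
    return reservoir


class EwmaPredictor:
    """Predict the next value as an exponentially weighted moving average.

    A larger `alpha` makes the estimate follow recent data more quickly.
    """

    def __init__(self, alpha=0.3):
        self.alpha = alpha
        self.estimate = None

    def update(self, value):
        if self.estimate is None:
            self.estimate = value
        else:
            self.estimate = self.alpha * value + (1 - self.alpha) * self.estimate
        return self.estimate


# Sample 100 items from a one-million-item stream without storing the stream.
sample = reservoir_sample(range(1_000_000), k=100, seed=42)

# The stream's level shifts from 1.0 to 5.0; the estimate follows the shift.
model = EwmaPredictor(alpha=0.3)
for value in [1.0] * 10 + [5.0] * 50:
    prediction = model.update(value)
```

The reservoir gives every item the same chance of being kept, which makes it a reasonable default when nothing is known about the stream in advance; a production system would likely replace the moving average with a proper online learner.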
Abstract— Big Data concerns large-volume, complex, growing data sets that are too big to mine with current methodologies or data mining software tools. Such data sets are emerging in many important applications, such as Internet search, business informatics, social networks, social media, genomics, and meteorology. The grand challenge of Big Data mining is to identify the relevant datasets and to extract useful information from large datasets or streams of data. The unification of multiple datasets from disparate sources, in combination with advanced analytics techniques and technologies, will advance problem-solving capabilities and in turn improve the ability of predictive analysts to reveal insights that can effectively support decision making. The analysis of Big Data sources can be used to identify cost savings and opportunities to increase efficiency, which will directly contribute to an improvement in productivity. This can in turn help to encourage further innovation and prediction.
Index Terms— Big Data, data mining, heterogeneity, autonomous sources, complex and evolving associations.
I. INTRODUCTION Nowadays, data have become too big to process with existing tools and software, and the main challenge is Big Data mining: retrieving information from data that vary in format, volume, and velocity. It is difficult to mine Big Data without losing data at the micro level. In recent years, data volumes have increased dramatically because data are collected from various sensors, applications, and devices, in different formats and from various networks. Consider Internet data: the web pages indexed by Google numbered around 2 million in 1998, quickly reached a billion within two years (by 2000), and have already exceeded 20 trillion. Information is expanding daily because of the adoption of social networking applications such as Facebook, Twitter, Google Plus, and LinkedIn; billions of items are uploaded to such sites every day, producing a flood of data. Furthermore, mobile phones have become a means of obtaining data in real time in many different ways. The vast amount of data carried by mobile phones can potentially be processed to improve daily life. Previously, CDR (call data record) based processing was used for billing purposes only; Internet of Things applications will raise the scale of data to an unprecedented level. People are loosely connected everywhere, so millions of connected applications generate large volumes of data, and valuable information must be discovered through improved data mining processes that help improve the quality of life in a short time. Discovering valuable information from such huge data faces many challenges, such as: 1. hardware and software system
II. LITERATURE SURVEY R. Ahmed and G. Karypis, in "Algorithms for Mining the Evolution of Conserved Relational States in Dynamic Networks" [1], observe that dynamic networks have recently been recognized as a powerful abstraction to model and represent the temporal changes and dynamic aspects of the data underlying many complex systems. Analyzing how the relations among entities evolve over time can help identify the transitions from one conserved state to the next and may provide evidence of external factors that are responsible for changing the stable relational patterns in these networks. Their paper presents a new data mining method that analyzes the time-persistent relations or states between the entities of the
Savita Suryavanshi, Department of Computer Engineering DPCOE, Wagholi, Pune, India.
www.ijeas.org