Hadoop MapReduce Tutorial for Beginners



http://crbtech.in/Student-Reviews/Oracle-Reviews



This post is not meant to make you ready for Hadoop development, but to give you a sound understanding from which to take the next steps in mastering the technology.


Hadoop is an Apache Software Foundation project that essentially provides two things:

A distributed file system called HDFS (Hadoop Distributed File System)

A framework and API for building and running MapReduce jobs



HDFS is structured much like a regular file system, except that its storage is distributed across several machines. It is not intended to replace a regular file system, but rather to act as a file-system-like layer for large distributed systems to use. It has built-in mechanisms to handle machine outages, and is optimized for throughput rather than latency.

There are two and a half types of machine in an HDFS cluster; the two main ones are:

Datanode: where HDFS actually stores the data. There are usually quite a few of these.
Namenode: the 'master' machine. It manages all the metadata for the cluster, e.g. which blocks make up a file, and which datanodes those blocks are stored on.
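The split between metadata and data can be pictured with a small in-memory model. This is only an illustrative sketch in plain Python, not HDFS code: the file path, block IDs, and datanode names are made up, and locate_blocks is a hypothetical helper.

```python
# Toy model of the namenode's metadata (all names are hypothetical).
# The namenode stores no file contents, only two lookups:
#   file path -> ordered list of block IDs
#   block ID  -> datanodes holding a copy of that block
file_to_blocks = {
    "/logs/2024-01-01.log": ["blk_1", "blk_2"],
}
block_to_datanodes = {
    "blk_1": ["datanode-a", "datanode-b", "datanode-c"],
    "blk_2": ["datanode-b", "datanode-c", "datanode-d"],
}


def locate_blocks(path):
    """Return (block ID, datanodes) pairs for a file, in block order."""
    return [(blk, block_to_datanodes[blk]) for blk in file_to_blocks[path]]


# A client asks the namenode where the blocks live, then transfers the
# actual bytes directly to or from the datanodes.
for block_id, nodes in locate_blocks("/logs/2024-01-01.log"):
    print(block_id, "->", nodes)
```

Keeping the metadata this small is what allows a single master to coordinate many datanodes in this model: it answers "where" questions and leaves the heavy data traffic to the datanodes.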



HDFS also has a number of features that make it well suited to distributed systems:

Fault tolerant: data can be replicated across several datanodes to guard against machine failures. The industry standard seems to be a replication factor of 3 (everything is stored on three machines).
Scalability: data transfers happen directly with the datanodes, so your read/write capacity scales fairly well with the number of datanodes.
Space: need more disk space? Just add more datanodes and rebalance.
Industry standard: lots of other distributed applications build on top of HDFS (HBase, MapReduce).
Pairs well with MapReduce.
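To see why keeping three copies of every block guards against machine failures, here is a minimal simulation in plain Python. It is a sketch under stated assumptions, not how HDFS is implemented: real HDFS placement is rack-aware, whereas this uses simple round-robin, and place_replicas, readable_blocks, and the node names are hypothetical.

```python
from itertools import cycle, islice

REPLICATION_FACTOR = 3  # the commonly used value of three copies per block


def place_replicas(block_ids, datanodes, factor=REPLICATION_FACTOR):
    """Assign each block to `factor` distinct datanodes, round-robin.

    Assumption: round-robin stands in for HDFS's real placement policy.
    """
    if factor > len(datanodes):
        raise ValueError("need at least as many datanodes as replicas")
    placement = {}
    for i, blk in enumerate(block_ids):
        # Take `factor` consecutive nodes starting at offset i, wrapping around.
        start = i % len(datanodes)
        placement[blk] = list(islice(cycle(datanodes), start, start + factor))
    return placement


def readable_blocks(placement, failed_nodes):
    """Blocks that still have at least one replica on a live datanode."""
    return {blk for blk, nodes in placement.items()
            if any(n not in failed_nodes for n in nodes)}


nodes = ["dn1", "dn2", "dn3", "dn4"]
placement = place_replicas(["blk_1", "blk_2", "blk_3"], nodes)

# With three replicas per block, losing two datanodes leaves every block readable.
print(readable_blocks(placement, failed_nodes={"dn1", "dn2"}))
```

The trade-off is storage: with three copies of everything, usable capacity is roughly the raw disk capacity divided by three.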


MapReduce

The second fundamental part of Hadoop is the MapReduce layer. It is made up of two subcomponents:

An API for writing MapReduce workflows in Java.
A set of services for managing the execution of these workflows.

The Map and Reduce APIs

The basic premise is this:

1) Map tasks perform a transformation.
2) Reduce tasks perform an aggregation.
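The transformation/aggregation split is easiest to see with a word count. The sketch below models the idea in plain Python on a single machine; it is not the Hadoop Java API, and map_task, shuffle, and reduce_task are hypothetical names. The shuffle step, which groups values by key between the two phases, is handled by the framework in a real job and is written out here only to make the data flow visible.

```python
from collections import defaultdict


def map_task(line):
    """Map: transform one input record into (key, value) pairs."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Group values by key, as the framework does between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_task(key, values):
    """Reduce: aggregate all values for one key into a single result."""
    return key, sum(values)


lines = ["the quick brown fox", "the lazy dog", "the fox"]

# Each line can be mapped independently, which is what lets map tasks
# run in parallel on different machines.
mapped = [pair for line in lines for pair in map_task(line)]

# Each key is reduced independently of every other key.
counts = dict(reduce_task(k, v) for k, v in shuffle(mapped).items())
print(counts)
```

Because neither function depends on shared state, the same two functions would work whether the input is three lines or many files spread across a cluster; only the machinery around them changes.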


THANK YOU!!!

