Welcome to the World of Big Data & Hadoop
www.easylearning.guru
Agenda
• What is Big Data?
• Different Kinds of Big Data
• Big Data Global Market
• Hadoop Global Job Trends
• What is Hadoop?
What is Big Data?
Big data is the term for a collection of data sets so large and complex that they become difficult to process using on-hand database management tools or traditional data processing applications.
Types of Big Data
• Structured data
• Semi-structured data
• Unstructured data
A traditional RDBMS deals only with structured data.
What is needed is a technology that handles semi-structured and unstructured data as well as structured data.
The 3 Vs of Big Data: Volume, Velocity, and Variety
Sources of Data
• Social Media & Networks (all of us are generating data)
• Mobile Devices (tracking all the objects all the time)
• Sensor Technology & Networks (measuring all kinds of data)
• Scientific Instruments (collecting all sorts of data)
Where is Big Data used?
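Facebook Scenario
Facebook generates, on average, 70 thousand MB of data per minute. Taking 1 GB = 1024 MB (and likewise for TB and PB):
1 minute = 70,000 MB
1 hour = 70,000 MB * 60 = 4.2 million MB
1 day = 4.2 million MB * 24 = 100.8 million MB = about 98,438 GB
1 week = 98,438 GB * 7 = about 689,063 GB = about 673 TB
4 weeks = 673 TB * 4 = about 2,692 TB = about 2.6 PB
52 weeks = 673 TB * 52 = about 34,991 TB = about 34.2 PB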
And that's a lot of data!
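The per-minute estimate can be scaled to longer periods mechanically. The following is a minimal Python sketch, not part of the original slides: the names `data_volume_mb` and `humanize` are hypothetical, and it assumes a constant rate and binary units (1 GB = 1024 MB), so the figures shift slightly under a decimal (1000-based) convention.

```python
# Sketch of the slide's arithmetic. The 70,000 MB/minute rate is the slide's
# own estimate; the binary unit convention (1 GB = 1024 MB) is an assumption.
MB_PER_MINUTE = 70_000


def data_volume_mb(minutes: int) -> int:
    """Return the data generated in `minutes` at the assumed constant rate."""
    return MB_PER_MINUTE * minutes


def humanize(mb: float) -> str:
    """Format a size given in MB, stepping up a unit while the value is >= 1024."""
    units = ["MB", "GB", "TB", "PB"]
    value = float(mb)
    for unit in units:
        # Stop at the first unit where the value drops below 1024,
        # or at PB, the largest unit handled here.
        if value < 1024 or unit == units[-1]:
            return f"{value:,.1f} {unit}"
        value /= 1024


# Periods from the slide, expressed in minutes.
PERIODS = {
    "1 minute": 1,
    "1 hour": 60,
    "1 day": 60 * 24,
    "1 week": 60 * 24 * 7,
    "4 weeks": 60 * 24 * 7 * 4,
    "52 weeks": 60 * 24 * 7 * 52,
}

if __name__ == "__main__":
    for label, minutes in PERIODS.items():
        print(f"{label:>8}: {humanize(data_volume_mb(minutes))}")
```

Each period is computed directly from the per-minute rate rather than from the previous row, so a rounding choice in one row cannot carry into the next.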
Various Big Data Technologies
Big Data Global Market
[Chart: Big Data Growth (in USD Billions), 2012 to 2017]
[Chart: Big Data Implementation (Implemented Big Data vs. Yet to Implement Big Data)]
[Chart: Big Data Analyst positions, Filled vs. Unfilled, Filled/vacancy (%)]
Sources: Dice, LinkedIn.
Hadoop Global Job Trends
Top Hadoop Technology Companies
More than 17,000 employees across these companies have Hadoop skills.
Sources: Dice, LinkedIn.
Hadoop Global Job Trends
[Chart: Salary (USD p.a. in thousands) by skill, as of February 2014: IBM Mainframe, VMware, MySQL, .NET, VB, C++, JavaScript, SAP, Teradata, Unix]
[Chart: Demand for Big Data in Cities: Bangalore, Hyderabad, Pune, Gurgaon, Mumbai, Chennai, Ahmedabad, Delhi, Noida]
Sources: Dice, LinkedIn.
What is Hadoop?
Hadoop was created by Doug Cutting and Mike Cafarella. It provides a reliable shared storage and analysis system, and it is designed to scale from a single server to thousands of machines with a high degree of fault tolerance.
Hadoop History
Hadoop Core Components
Core Hadoop has two main systems:
• Hadoop Distributed File System (HDFS): a distributed file system that holds large amounts of data across multiple nodes in a cluster.
• MapReduce: a distributed programming paradigm used to analyze the data stored in HDFS.
Hadoop Distributed File System (HDFS)
• A given file is broken down into blocks (default = 64 MB in Hadoop 1.x; 128 MB in Hadoop 2.x and later), and the blocks are then replicated across the cluster (default replication factor = 3).
• Optimized for throughput.
• HDFS allows you to put, get, and delete files.
• Follows the philosophy "Write once, read many times".
• Blocks are replicated for durability, high availability, and throughput.
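To make the block and replication numbers concrete, here is a minimal Python sketch. It is illustrative only and does not call any Hadoop API: the function `hdfs_block_layout` and its return keys are hypothetical names, and the defaults are taken from the slide (64 MB blocks, replication factor 3). It assumes the last block occupies only the space it actually needs.

```python
import math


def hdfs_block_layout(file_size_mb: float,
                      block_size_mb: int = 64,
                      replication: int = 3) -> dict:
    """Estimate how a file is split into blocks and replicated.

    Hypothetical helper for illustration; defaults follow the slide.
    """
    if file_size_mb <= 0 or block_size_mb <= 0 or replication <= 0:
        raise ValueError("sizes and replication factor must be positive")

    # Every block is full-sized except possibly the last one.
    num_blocks = math.ceil(file_size_mb / block_size_mb)
    last_block_mb = file_size_mb - (num_blocks - 1) * block_size_mb

    return {
        "blocks": num_blocks,
        "last_block_mb": last_block_mb,
        # Each block is stored `replication` times across the cluster.
        "block_replicas": num_blocks * replication,
        # Assumed: a partial last block consumes only its real size.
        "raw_storage_mb": file_size_mb * replication,
    }


if __name__ == "__main__":
    print(hdfs_block_layout(200))
```

Under these assumptions a 200 MB file becomes four blocks (three of 64 MB and one of 8 MB), stored as twelve block replicas that take 600 MB of raw cluster storage.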
MapReduce Flow
MapReduce Framework
MapReduce works by breaking the processing into two phases: the Map phase and the Reduce phase.
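The two phases can be illustrated with a word count, the usual introductory example. This is a single-process Python sketch of the idea, not Hadoop's actual API: the function names `map_phase`, `shuffle`, `reduce_phase`, and `word_count` are hypothetical, and the `shuffle` step stands in for the grouping by key that a MapReduce framework performs between the two phases.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple


def map_phase(line: str) -> Iterator[Tuple[str, int]]:
    """Map: emit a (word, 1) pair for every word in one input line."""
    for word in line.lower().split():
        yield word, 1


def shuffle(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Group all emitted values by key, as the framework does between phases."""
    grouped: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key: str, values: List[int]) -> Tuple[str, int]:
    """Reduce: collapse all values for one key into a single total."""
    return key, sum(values)


def word_count(lines: Iterable[str]) -> Dict[str, int]:
    """Run map, shuffle, and reduce in sequence on in-memory lines."""
    pairs = (pair for line in lines for pair in map_phase(line))
    return dict(reduce_phase(key, values)
                for key, values in shuffle(pairs).items())


if __name__ == "__main__":
    print(word_count(["big data big", "data hadoop"]))
```

In a real cluster the map calls run in parallel on the nodes holding the input blocks, and the reduce calls run in parallel per key; the sketch runs them sequentially only to show the data flow.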
What we offer
Syllabus
● Introduction
   a) Big Data
   b) Hadoop
● Hadoop
   HDFS
   MapReduce
● Pig
   Pig 1
   Pig 2
● Hive
   Hive 1
   Hive 2
● HBase
● ZooKeeper
● Sqoop
● YARN
● Project Class
Thank you for watching the Live Demo for Hadoop. You can always contact us on:
Phone: +91 124 4763660 (India)
Email: contact@easylearning.guru
Skype ID: easylearning.guru
Website: www.easylearning.guru
Your queries are always welcome.