Architecting Microsoft Azure Solutions
Big Data Briefing How to implement Big Data on Windows Azure
http://www.pass4sureexam.co/70-534.html
Big Data; running cheaply and efficiently on Windows Azure to discover business value from data you already hold about your business.
Introduction Who
Agenda
Andy Cross, Director of Microsoft Partner Elastacloud, Windows Azure MVP, Big Data specialist and international speaker.
Windows Azure Intro HDInsight Intro Architecture BI and Visualisations Case Studies
Time until next coffee 55:00
30 minutes 30 minutes 30 minutes 15 minutes 15 minutes
Find out more at http://www.Microsoft.com
http://www.pass4sureexam.co/70-534.html
Introduction to
Windows Azure http://www.pass4sureexam.co/70-534.html
Today’s Transformation – Cloud
Pooled Resources
Self-Service
Elastic
Economics ▪ Agility ▪
Usage Based
Focus
http://www.pass4sureexam.co/70-534.html
IT Continuum Evolution toward highly-virtual and beyond to cloud
Physical
Virtual / Private
IaaS
PaaS
http://www.pass4sureexam.co/70-534.html
SaaS
Microsoft On The Internet
1.0
400m active accounts More than 3 billion queries per month
Microsoft account
11.0billion+ authentications per day
Global Foundation Services
750 million unique visitors at any given time
http://www.pass4sureexam.co/70-534.html
One of 3rd world’s largest private networks
The Microsoft Cloud ~Globally Distributed Data Centers
Quincy, WA
Chicago, IL
San Antonio, TX
Dublin, Ireland
Generation 4 DCs
Inside a Microsoft Data Center
Cloud Computing Patterns Inactivity Period
t
On & off workloads (e.g. batch job) Over provisioned capacity is wasted Time to market can be cumbersome
Growing Fast – Research Project Compute
t
Successful services needs to grow/scale Keeping up w/ growth is big IT challenge Cannot provision hardware fast enough
Unpredictable Bursting – Web demand Compute
t Compute
Compute
On and Off – Start/End Semester
Unexpected/unplanned peak in demand Sudden spike impacts performance Can’t over provision for extreme cases
Predictable Burst – Registration
t
Services with micro seasonality trends Peaks due to periodic increased demand IT complexity and wasted capacity
Big data
Database
Application building blocks
Media
Storage
Traffic
Caching
Messaging
CDN
Networking
Identity
Big Data Technologies
HDInsight Summary GA, recent preview of Hadoop 2 Managed Hadoop on Windows Azure Familiar tools such as Hive, Pig, Oozie Additional BoB Microsoft ecosystem tooling with .net SDK Powershell and .net for provision Execution with .net and powershell for Hive
Paired with Hortonworks HDP for on-premises Hadoop; compatible with all major Hadoop implementations Combined with Excel and http://www.pass4sureexam.co/70-534.html
traditional Microsoft
Position in Cloud
PaaS
Centralised Resources State – a Windows Azure SQL DB stores metadata to allow identical clusters to be spawned
Data – all your data resides on
Windows Azure Blob Storage; resilient, secure and cost effective
Logs – machine state and cluster
uptime is stored in Windows Azure Table Storage
Provision hundreds of machines against your data; gain understanding rapidly
Speed to understanding 500 servers. 20 minutes. 40,000,000,000 data points
Drive Investment
Buy the Look
Exploring Sentiment, Social and Metadata Graphs Alike
Owns Person
Product
Alik e?
Product
Owns Person
Social Media
Microsoft Big Data Solution FAMILIAR END USER TOOLS Power View
BI PLATFORM
Excel with PowerPivot
Predictive Analytics
Embedded BI
SSAS
SSRS
Connectors Hadoop On Windows Azure
UNSTRUCTURED & STRUCTURED DATA
Sensors
Hadoop On Windows Server
Devices
Bots
Microsoft EDW
Crawlers
ERP
CRM
LOB
APPs
Microsoft Offerings INSIGHTS
Hive ODBC Driver & Hive Add-in for Excel Integration with Microsoft PowerPivot
ENTERPRISE READY
Hadoop based distribution for Windows Server & Azure
BROADER ACCESS
JavaScript framework for Hadoop
Strategic Partnership with Hortonworks
RTM of Hadoop connectors for SQL Server and PDW
Hadoop on Windows
Insights to all users by activating new types of data
Differentiation
INSIGHTS
Integrate with Microsoft Business Intelligence
ENTERPRISE READY
Choice of deployment on Windows Server + Windows Azure
BROADER ACCESS
Integrate with Windows Components (AD, Systems Center) Easy installation and configuration of Hadoop on Windows Simplified programming with . Net & Javascript integration Integrate with SQL Server Data Warehousing
Contributions proposed back to community distribution
Microsoft Big Data Roadmap Microsoft is extending its leadership in business intelligence and data warehousing to provide insights to all users by activating new types of data of any size Microsoft is announcing an end-to-end roadmap for Big Data that embraces Apache HadoopTM by distributing enterprise class Hadoop based solutions on both Windows Server and Windows Azure
To accelerate the delivery of Microsoft’s Hadoop based solution for Windows Server and service for Windows Azure, Microsoft is announcing a partnership with Hortonworks Microsoft is committed to broadening accessibility and usage of Hadoop to end users, developers and IT professionals in organizations of all sizes
Microsoft’s value adds Why Elastacloud recommend Microsoft’s platform
Skill reuse Your existing development team can immediately realise value The frameworks facilitate deterministic testing for highly reliable queries
Express elegant solutions in C# Familiar Unit Testing patterns Concise programmatic terseness Complex logic is best expressed in programmatic form
Commoditised query
Value
De-provision
Provision
Cost
Time
Action Cost
Value of query
Execute
Hybrid Compatibility
Name=Andy
Microsoft are the only vendor with enterprise on premises and cloud big data offerings
Pnid=123456 123456
Hadoop On Premises
HDInsight in Azure
4712
Just as Big Data is more than Hadoop; Windows Azure is more than HDInsight HDP Spark Storm
Elastacloud provisioned a bespoke 500 core vir tual cluster on Windows Azure, allowing us to cut down a compute intensive task from 20 days over 8 cores to under 12 hours over 500 cores. Jedidiah Francis, Data Scientist, ASOS
Architectural Considerations Solving Big Data challenges cohesively
The first challenge you face with Big Data is somewhat the hardest
Elastacloud’s tooling aims to solve these challenges
Ingress, Compute, Egress automated; allowing focus on providing value
Workflows
Big Data solutions are rarely the result of a single piece of code running; instead jobs transform data into intermediate domains before completing an overall piece of work
The overall architectural challenge of Big Data is that just as Data can vary, so must architectures. Batch
Interactive
Real Time
Hadoop can operate with O(n) over Petabytes of data
Drill, Stinger and Tez bring must quicker querying
Storm allows enormous scalable throughput
Batch
Static Data
Batch Process
Precompute
Serving
Operational Data Store
Speed
Data Stream
Real Time
Increment
User Query
Technologies possible? Any. Batch
Interactive
Real Time
Hadoop
SQL / MySQL
Storm
Spark
Hadoop 2 / Tez / Drill
Spark
Lucene
Mongo Cassandra Windows Azure Table Storage Spark
Batch
Static Data
HDInsight
Precompute
Serving
SQL/NoSQL Database
Speed
Data Stream
Storm
Increment
User Query
Blob Persist
Windows Azure Service Bus
Elastacloud Service Bus Spout
Moving Mediatonic to Windows Azure
AWS didn’t work • Mediatonic are a leading UK games company • They had deployed an analytics platform to AWS • They required quite a bit of manpower to maintain their system operationally • Their biggest cost was an Amazon Redshift data warehouse which was a
bottleneck in their design
• Their system was built so that if a server went down whilst collecting gaming
messages they could lose 15 minutes of data
• They found the “Big Data” part so complex that only highly-paid specialists could
improve the system
How Windows Azure Helped • A joint collaboration between Microsoft, Elastacloud and Mediatonic migrated
their Game Fuel platform to Windows Azure
• With the Windows Azure Service Bus we were able to ensure that their
messaging was continuous and not discrete so no messages were ever lost
• With HDInsight we were able to teach their staff how to build “Big Data”
applications
• We were able to deliver a system with unique components of Windows Azure,
removing the expensive data warehousing bottleneck but delivering the same functionality
“I can’t believe that I could get up and running with HDInsight doing real work in 30 minutes. It took me days to do the same thing with Elastic Map Reduce on Amazon” Adam Fletcher – Gamefuel Project Lead
Initial State
Buy the Look Traditional Analytics
Limited Traditional Analytics allows for the tracing of single actions on a website; who added a product to a bag, where did they come from, where did they go?
Advancement
Cross Sell
Buy the Look
Improvements possible Beyond immediacy is the ongoing workflow of a system. Who added items to a basket after using the Buy the Look functionality. However, this lacks comprehension of customer behaviour.
Cutting edge
Who bought this?
Cutting edge – understand your customer
Purchase propensity
January
Time
February
Proving through Big Data technologies that engaged customers will return through engagement following life events such as paydays to continue the engagement. Building business on this basis.
http://www.pass4sureexam.co/70-534.html
Find out more admin@pass4sureexam.co http://www.pass4sureexam.co/70-534.html