THE LEADING VOICE IN BIG DATA INNOVATION
BIG DATA INNOVATION NOV 2015 | #19
A Potted History Of Big Data
Matthew Reaney talks us through how big data has been used throughout time; it is not just a new thing | 6
What do TalkTalk Hacks Tell Us?
After the recent TalkTalk data hacks all eyes are on the company, but there are lessons for the whole industry | 24
ISSUE 19
EDITOR’S LETTER Welcome to the 19th Edition of the Big Data Innovation Magazine
Hello and welcome to issue 19 of Big Data Innovation. Whenever I look back to issue one of the magazine, one thing is clear: it was written for a niche audience. It was created with the idea that only a select few would understand the content, simply because it was not a well-known subject. At the time, I remember being forced to describe data and analytics in the most basic ways to the uninitiated, the most common example being ‘you know those adverts you see everywhere after you have been on a website?’. It vastly oversimplified the subject, but if I had gone into the full and often scary description, it would have been impossible to convey the possibilities, opportunities and complexities of it.
Now when I look at where we stand, the basic knowledge of data and analytics has increased drastically. I still can’t describe the deep complexities inherent in the practices, but it goes beyond the simple ‘it shows you certain adverts.’ The reasons for this increase in knowledge have not always been positive though. One of the main educators has been the seriousness of having data stolen, something which is happening with alarming regularity and being reported frequently in the media. We have seen with the TalkTalk hack of a few weeks ago that even those who have been hacked previously do not always learn from their mistakes. Hopefully, following this latest hack, more companies will learn.
The prevalence of data in our society today makes these kinds of attacks more common, but as companies have made billions of dollars from data, they need to protect it. If people’s data is consistently stolen, the public will be less willing to share it and will put pressure on governments to take drastic action. This could lead to a limiting of current data collection methods and potentially destroy the progress made. The more data that is stolen and the more people who are affected, the more likely it is that the public reaction will be overwhelmingly negative. The only way to stop this from happening is to improve security and elevate its importance within companies. Until that happens, the hacks will continue, and public distrust will only increase.
George Hill MANAGING EDITOR
CONTENTS
6 | A POTTED HISTORY OF BIG DATA
Matthew Reaney talks us through how big data has been used throughout time; it is not just a new thing
9 | ENHANCED MARKETING THROUGH DATA
Data has many uses in a company and using it effectively in marketing has multiple benefits
12 | BUILD YOUR MARKETING DATA STRATEGY IN 6 STEPS
Marketing data is key to company success and Ian Thomas from Microsoft talks us through doing it right
18 | COMMERCIALIZING BIG DATA: FREQUENT FLYER PROGRAMS
After the success of his first article, Mark Ross-Smith talks data sharing in frequent flyer programmes
22 | SPARK CREATES UBER FIRE
Spark and its real time data capabilities are sitting in the driving seat for Uber’s huge growth
24 | WHAT DO TALKTALK HACKS TELL US?
After the recent TalkTalk data hacks all eyes are on the company, but there are lessons for the whole industry
27 | THE CROWDSOURCED CHIEF DATA OFFICER
Long Beach in California is taking the novel approach of crowdsourcing their CDO role, but will it work?
29 | GOOD DATA GOVERNANCE: THE ULTIMATE RECIPE FOR GROWTH
Data governance is key to company success today; Fergus Kennedy talks us through why this is the case
WRITE FOR US
Do you want to contribute to our next issue? Contact ghill@theiegroup.com for details
ADVERTISING
For advertising opportunities contact achristofi@theiegroup.com for details
MANAGING EDITOR GEORGE HILL | ASSISTANT EDITOR SIMON BARTON | CREATIVE DIRECTOR CHELSEA CARPENTER
CONTRIBUTORS MATTHEW REANEY, IAN THOMAS, MARK ROSS-SMITH, CHRIS TOWERS, GABRIELLE MORSE, LAURA DENHAM, FERGUS KENNEDY
A Potted History of Big Data
Matthew Reaney, Director & Founder, Big Cloud
Quantifying the world around us has been a critical activity from our prehistoric roots. The earliest recorded tools to count inventory have been carbon dated right back to 18,000 BC, and you might imagine hunters counting their kills on their fingers right back in the midst of the early Stone Age.
Somehow we have an innate desire to measure ourselves, to know where we stand and to compare ourselves to others. The abacus was first invented in 2400 BC, and Greek versions have been found dating to 600 BC. The Greek orator Demosthenes (384 BC–322 BC) talked of the need to use pebbles for calculations too difficult to do in your head. Around that time, the great library of Alexandria was founded, to become the greatest ‘data centre’ in the world at the time. The library was in charge of collecting all the world’s knowledge, the staff being occupied with the task of translating works onto papyrus paper. It remained a major centre of learning for 300 years until its destruction by the Romans in 30 BC.
In terms of complex statistical analyses, one notable example was in 1663, when John Graunt used statistics to track the spread of the bubonic plague, thus playing a part in its eradication.
[Timeline graphic marking 2400 BC, 30 BC, 1663, 1822, 1965, 1984, 1991, 1996, 2005 and 2010]
Many understand the first computer to have been invented by Charles Babbage in 1822 (which remained a concept as it was never completed), but actually its ancient predecessor came 1700 years earlier. The Antikythera Mechanism was an ancient analogue computer for use primarily in astronomy. It was composed of at least 30 bronze ‘gears’ and could perform relatively complex calculations. Now moving into the modern era, in 1965 the US government planned the world’s first data centre to store, you guessed it... tax returns. They also put 175 million sets of fingerprints onto magnetic tape. The reality George Orwell described in his novel 1984 (published in 1949) was beginning to come true. ‘Big Brother’ was starting to discover the tools to watch us.
Since the appearance of (arguably) the first modern computer (the Turing machine) in 1936, capacity for storing data has increased exponentially decade-on-decade, and with the birth of the internet in 1991, sharing data became a global possibility. In 1996, digital data storage officially became more cost effective than paper, and in 1997 that repository of everything and anything, Google, came into being. Hadoop was developed in 2005, ten years ago, and now there are ever more complex ways of storing unimaginable amounts of data. In 2010, Google legend Eric Schmidt said that the amount of data being created in two days equated to the amount of data created from the beginning of time to 2003. Today, the amount of data being generated is amazing... Facebook users like 4,166,667 posts every minute of the day for example.
That is what I call a fast moving industry. I personally feel proud to be contributing in the smallest way to one of the most basic of human desires: to quantify the world around us. Big data is here to stay because the thirst for data has been around for millennia.
Enhanced Marketing through Data
Gabrielle Morse, Organizer, Big Data Innovation Summit
Big Data has in some ways been used in marketing since its inception. In fact, historically marketing has been one of the business units most disrupted by the proliferation of new technologies and new ideas.
From the customers who are sent specific emails, through to how people see individual ads on websites, billboards or TV sets, data is the roadmap for every successful marketing campaign today. It is generally thought that data has specific uses in marketing departments. In a survey by 2nd Watch on this subject, the top three responses were that 29% used it to better understand customer insight, 18% to improve the supply chain and 16% to power campaigns and promotions. Therefore, the most common use for data is helping to understand exactly what a customer wants from your company and then either creating a new product to suit them, or marketing an existing product in a way that appeals to a particular demographic. This is a very traditional view of how data is used by marketing teams though, and doesn’t take into account other significant aspects that have impacted marketing in big ways. Below, we look at three of the most overlooked effects of a wider use of data.
Clean Data
One of the clear benefits of using so much more data is that it needs to be cleaner in order to be effective, something that every data driven marketer should have learnt early on. As soon as the first email comes back saying ‘You got my name wrong’, the database should be validated and re-validated as often as possible. Through this necessary validation, the data being used by the marketing team (and wider company) will become more accurate and thus far more effective. As the marketing team are often the ones who directly deal with the people whose data may have been input incorrectly, their involvement is one of the primary reasons for database validation.
Instant Knowledge
In addition to having a more in-depth knowledge of their user base, marketing teams can now find out almost anything about them in a considerably shorter space of time. This is because with thousands of potential actions recorded and retrievable, it is possible to see how many of your customers use a certain system, where they are or almost anything else about them at the click of a button. This means that strategies can be instantly changed through informed decisions at any time, moving marketing into a far more agile and adaptive position.
Global Impact
As well as having an impact on how decisions and strategies are changed, the use of data means that the global decisions about marketing can be made from a single location, with the understanding of local culture being made far clearer through the data presented. This is not to say that there is not a necessity to have a thorough knowledge of local issues and traditions, but one of the most important aspects is knowing exactly what people like, not the perception of it. Through an accurate database and powerful data gathering, the ability to see this kind of information and make global decisions is possible.
Build Your Marketing Data Strategy in 6 Steps
Ian Thomas Principal Group Program Manager, Customer Data & Analytics, Microsoft
Your company has a Marketing Strategy, right? It’s that set of 102 slides presented by the CMO at the offsite last quarter, immediately after lunch on the second day, the session you may have nodded off in (it’s ok, nobody noticed. Probably). It was the one that talked about customer personas and brand positioning and social buzz, and had that video towards the end that made everybody laugh (and made you wake up with a start). Your company may also have a data strategy. At the offsite, it was relegated to the end of the third day, after the diversity session and that presentation about patent law. Unfortunately several people had to leave early to catch their flights, so quite a few people missed it. The guy talked about using big data to drive product innovation through continuous improvement, and he may (at the very end, when your bladder was distracting you) have mentioned using data for marketing. But that was something of an afterthought, and was delivered with almost a sneer of disdain, as if using your company’s precious data
for the slightly grubby purpose of marketing somehow cheapened it. Which is a shame, because marketing is one of the most noble and enlightened ways to use data, delivering a direct kick to the company’s bottom line that is hard to achieve by other means. So when it comes to data, your marketing shouldn’t just grab whatever table scraps it can and be grateful; it should actually drive the data that you produce in the first place. This is why you don’t just need a marketing strategy, or a data strategy: You need a Marketing Data Strategy.
A Marketing Data What?
What even is a Marketing Data Strategy? Is it even a thing? It certainly doesn’t get many hits on Bing, and those hits it does get tend to be about building a data-driven Marketing Strategy (i.e. a marketing strategy that focuses on data-driven activities). But that’s not what a Marketing Data Strategy is, or at least, that’s not my definition, which is:
A Marketing Data Strategy is a strategy for acquiring, managing, enriching and using data for marketing.
The four verbs in that definition (acquiring, managing, enriching and using) are the key here. If you want to make the best use of data for your marketing, you need to be thinking about how you can get hold of the data you need, how you can make it as useful as possible, and how you can use your marketing efforts themselves to generate even more useful data, creating a positive feedback loop and even contributing to the pool of big data that your big data guy is so excited about turning into an asset for the company.
Building your Marketing Data Strategy
So now that you know why it’s important to have a Marketing Data Strategy, how do you put one together? Everyone loves a list, so here are six steps you can take to build and then start executing on your Marketing Data Strategy.
Step 1: Be clear on your marketing goals and approach
This seems obvious, but it’s a frequently missed step. Having a clear understanding of what you’re trying to achieve with your digital marketing will help you to determine what data you need, and what you need to do with/to it to make it work for you. Ideally, you already have a marketing strategy that captures a lot of this, though the connection between the lofty goals of a marketing strategy (sorry, Marketing MBA people) and the practical data needs to execute the strategy is not always clear. Here are a few questions you should be asking:
Get new customers, or nurture existing ones? If your primary goal is to attract new customers, you’ll need to think differently about data (for example relying on third-party sources) than if you are looking to deepen your relationship with your existing customers (about whom you presumably have some data already).
What are your goals & success criteria? If you are aiming to drive sales, are you more interested in revenue, or margin? If you’re looking to drive engagement or loyalty, are you interested in active users/customers, or engagement depth (such as frequency of usage)?
Which communications strategies & channels? The environments in which you want to engage your audience make a big difference to your data needs. For example, you may have more data at your disposal to target people using your website compared to social or mobile channels.
Who’s your target audience? What attributes identify the people you’d most like to reach with your marketing? Are they primarily demographic (e.g. gender, age, locale) or behavioral (e.g. frequent users, new users)?
What is your conversion funnel? Can you convert customers entirely online, or do you need to hand over to humans (e.g. in store) at some point? If the latter, you’ll need a way to integrate offline transaction data with your online data.
These questions will not only help you identify the data you’ll need, but also some of the data that you can expect to generate with your marketing.
Step 2: Identify the most important data for your marketing efforts
Once you’re clear on your goals and success criteria, you need to consider what data is going to be needed to help you achieve them, and to measure your success. The best way to break this down is to consider which events (or activities) you need to capture and then which attributes (or dimensions) you need on those events. But how to pick the events and attributes you need?
Let’s start with the events. If your marketing goals include driving revenue, you will need revenue (sales) events in your data, such as actual purchase amounts. If you are looking to drive adoption, then you might need product activation events. If engagement is your goal, then you will need engagement events: this might be usage of your product, or engagement with your company website or via social channels.
Next up are the attributes. Which data points about your customers do you think would be most useful for targeted marketing? Does your product particularly appeal to men, or women, or people within a certain geography or demographic group? For example, say you’re an online gambling business. You will have identified that geo/location information is very important (because online gambling is banned in some countries, such as the US). Therefore, good quality location information will be an important attribute of your data sources.
At this step in the process, try not to trip yourself up by second-guessing how easy or difficult it will be to capture a particular event or attribute. That’s what the next step (the data audit) is for.
Step 3: Audit your data sources
Now to the exciting part: a data audit! I’m sure the very term sends shivers of anticipation down your spine. But if you skip this step, you’ll be flying blind, or worse, making costly investments in acquiring data that you already have. The principle of the data audit is relatively simple: for every dataset you have which describes your audience/customers and their interaction with you, write down whether (and at what kind of quality) it contains the data you need, as identified in the previous step:
Events (e.g. purchases, engagement)
Attributes (aka dimensions, e.g. geography, demographics)
IDs (e.g. cookies, email addresses, customer IDs)
The key to keeping this process from consuming a ton of time and energy is to make sure you’re focusing on the events, attributes and IDs which are going to be useful for your marketing efforts. Documenting datasets in a structured way is notoriously challenging (some of the datasets we have here at Microsoft have hundreds or even thousands of attributes), so keep it simple, especially the first time around; you can always go back and add to your audit knowledge base later on.
The one type of data you probably do want to be fairly inclusive with is ID data. Unless you already have a good idea which ID (or IDs) you are going to use to stitch together your data, you should capture details of any ID data in your datasets. This will be important for the next step.
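One way to keep the audit simple is to record it as a small structured table and let a few lines of code report what is missing. The sketch below is only an illustration under assumptions: the dataset names, quality notes and the `DatasetAudit`, `find_gaps` and `id_coverage` helpers are all hypothetical and do not come from any particular tool.

```python
from dataclasses import dataclass, field


@dataclass
class DatasetAudit:
    """One audit entry: what a dataset holds, with a free-text quality note."""
    name: str
    events: dict = field(default_factory=dict)      # event name -> quality note
    attributes: dict = field(default_factory=dict)  # attribute name -> quality note
    ids: list = field(default_factory=list)         # every ID seen in the dataset


def find_gaps(audits, needed_events, needed_attributes):
    """Return the needed events and attributes that no audited dataset contains."""
    have_events = {e for a in audits for e in a.events}
    have_attributes = {x for a in audits for x in a.attributes}
    return {
        "events": sorted(set(needed_events) - have_events),
        "attributes": sorted(set(needed_attributes) - have_attributes),
    }


def id_coverage(audits):
    """Map each ID to the datasets it appears in (input for choosing a common ID)."""
    seen = {}
    for audit in audits:
        for id_name in audit.ids:
            seen.setdefault(id_name, []).append(audit.name)
    return seen


# Hypothetical audit of three datasets.
audits = [
    DatasetAudit("web_analytics", events={"engagement": "good"},
                 attributes={"geography": "country only"}, ids=["cookie_id"]),
    DatasetAudit("orders", events={"purchase": "good"},
                 ids=["customer_id", "email"]),
    DatasetAudit("newsletter", events={"engagement": "patchy"},
                 attributes={"gender": "self-reported, sparse"}, ids=["email"]),
]

gaps = find_gaps(audits, ["purchase", "product_activation"], ["geography", "age"])
ids_by_dataset = id_coverage(audits)
print(gaps)
print(ids_by_dataset)
```

Recording every ID, even ones you may not use, is deliberate here: it is the inclusive ID capture described above, and it feeds directly into the choice of a common ID.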
Step 4: Decide on a common ID (or IDs)
This is a crucial step. In order for you to build a rich profile of your users/customers that will enable you to target them effectively with marketing, you need to be able to stitch the various sources of data about them together, and for this you need a common ID. Unless you’re spectacularly lucky, you won’t be issuing (or logging) a single ID consistently across all touchpoints with your users, especially if you have things like retail stores, where IDing your customers reliably is pretty difficult (well, for the time being, at least). So you’ll need to pick an ID and use this as the basis for a strategy to stitch together data. When deciding which ID or IDs to use, take into consideration the following attributes:
The persistence of the ID. You might have a cookie that you set when people come visit your website, but cookie churn ensures that that ID (if it isn’t linked to a login) will change fairly regularly for many of your users, and once it’s gone, it won’t come back.
The coverage of the ID. You might have a great ID that you capture when people make a purchase, or sign up for online support, but if it only covers a small fraction of your users, it will be of limited use as a foundation for targeted marketing unless you can extend its reach.
Where the ID shows up. If your ID is present in the channels that you want to use for marketing (such as your own website), you’re in good shape. More likely, you’ll have an ID which has good representation in some channels, but you want to find those users in another channel, where the ID is not present.
Privacy implications. User email addresses can be a good ID, but if you start transmitting large numbers of email addresses around your organization, you could end up in hot water from a privacy perspective. The same goes for other sensitive data like Social Security Numbers or credit card numbers: do not use these as IDs.
Uniqueness to your organization. If you issue your own ID (e.g. a customer number) that can have benefits in terms of separating your users from lists or extended audiences coming from other providers; though on the other hand, if you use a common ID (like a Facebook login), that can make joining data externally easier later.
Whichever ID you pick, you will need to figure out how you can extend its reach into the datasets where you don’t currently see it. There are a couple of broad strategies for achieving this:
Look for technical strategies to extend the ID’s reach, such as cookie-matching with a third-party provider like a DMP. This can work well if you’re using multiple digital touchpoints like web and mobile (though mobile is still a challenge across multiple platforms).
Look for strategies to increase the number of signed-in or persistently identified users across your touchpoints. This requires you to have a good reason to get people to sign up (or sign in with a third-party service like Facebook) in the first place, which is more of a business challenge than a technical one.
As you work through this, make sure you focus on the touchpoints/channels where you most want to be able to deliver targeted messaging. For example, you might decide that you really want to be able to send targeted emails and complement this with messaging on your website. In that case, finding a way to join ID data between those two specific environments should be your first priority.
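To make the idea of stitching on a common ID concrete, here is a minimal sketch. It assumes a customer ID is the common ID, and that a login table links some website cookies to it; the sample records, field names and the `stitch` function are all invented for illustration. It also reports coverage, i.e. the share of web records that could be tied to the common ID.

```python
from collections import defaultdict

# Hypothetical sample data: web visits keyed by cookie, orders keyed by customer ID.
web_visits = [
    {"cookie_id": "c1", "pages_viewed": 12},
    {"cookie_id": "c2", "pages_viewed": 3},
    {"cookie_id": "c3", "pages_viewed": 7},
]
# Cookies that were seen during a signed-in session, mapped to the common ID.
logins = {"c1": "cust-100", "c3": "cust-200"}
orders = [
    {"customer_id": "cust-100", "amount": 40.0},
    {"customer_id": "cust-100", "amount": 15.0},
    {"customer_id": "cust-300", "amount": 99.0},
]


def stitch(web_visits, logins, orders):
    """Build one profile per customer ID and report how many visits were matched."""
    profiles = defaultdict(lambda: {"pages_viewed": 0, "revenue": 0.0})
    unmatched = 0
    for visit in web_visits:
        customer_id = logins.get(visit["cookie_id"])
        if customer_id is None:
            unmatched += 1  # cookie never linked to a login, so it cannot be stitched
            continue
        profiles[customer_id]["pages_viewed"] += visit["pages_viewed"]
    for order in orders:
        profiles[order["customer_id"]]["revenue"] += order["amount"]
    coverage = 1 - unmatched / len(web_visits) if web_visits else 0.0
    return dict(profiles), coverage


profiles, coverage = stitch(web_visits, logins, orders)
print(profiles)
print(f"web coverage of the common ID: {coverage:.0%}")
```

In this toy data one cookie in three never signs in, which is exactly the coverage problem described above: the fix is either a technical match or a better reason for people to sign in.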
Step 5: Find out what gaps you really need to fill
Your data audit and decisions around IDs will hopefully have given you some fairly good indications of where you’re weak in your data. For example, you may know that you want to target your marketing according to geography, but have very little geographic data for your users. But before you run off to put a bunch of effort into getting hold of this data, you should try to verify whether a particular event or attribute will actually help you deliver more effective marketing. The best way to do this is to run some test marketing with a subset of your audience who has a particular attribute or behavior, and compare the results with similar messaging to a group which does not have this attribute (but are as similar in other regards as you can make them). I could write another whole post on this topic of A/B testing, because there is a myriad of ways that you can mess up a test like this and invalidate your results, or I could just recommend you read the work of my illustrious Microsoft colleague, Ronny Kohavi.
If you are able to run a reasonably unbiased bit of test marketing, you will discover whether the datapoint(s) you were interested in actually make a difference to marketing outcomes, and are therefore worth pursuing more of. You can end up in a bit of a chicken-and-egg situation in this regard, because of course you need data in the first place to test its impact, and even if you do have some data, you need to test over a sufficiently large population to be able to draw reliable conclusions. To address this, you could try working with a third-party data provider over a limited portion of your user base, or over a population the provider provides.
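As a rough illustration of comparing the two groups, the sketch below applies a standard two-proportion z-test to conversion counts. This is one common choice rather than the method the article prescribes, the figures are invented, and the function name is hypothetical. It also says nothing about the harder problem of making the two groups comparable in the first place.

```python
import math


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Compare conversion rates of group A (has the attribute) and group B (does not).

    Returns (difference in rates, z statistic, two-sided p-value).
    """
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    p_pool = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    if se == 0:
        raise ValueError("no variation in outcomes; the test is undefined")
    z = (p_a - p_b) / se
    # Two-sided p-value under the standard normal: 2 * (1 - Phi(|z|)) = erfc(|z| / sqrt(2)).
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return p_a - p_b, z, p_value


# Invented example: 120 of 2,000 targeted users converted versus 90 of 2,000 controls.
lift, z, p_value = two_proportion_z_test(120, 2000, 90, 2000)
print(f"lift={lift:.3f}, z={z:.2f}, p={p_value:.3f}")
```

A small p-value suggests the attribute is associated with a real difference in response; with small groups the same difference in rates would not reach that bar, which is the sample-size concern raised above.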
Step 6: Fix what you can, patch what you can’t, keep feeding the beast
Once you’ve figured out which data you actually need and the gaps you need to fill, the last part of your Marketing Data Strategy is about tactics to actually get this data. Of course the tactics then represent an ongoing (and never-ending) process to get better and better data about your audience. Here are four approaches you can use to get the data you need:
Measure it. Adding instrumentation to your website, your product, your mobile apps, or other digital touchpoints is (in principle) a straightforward way of getting behavioral events and attributes about your users. In practice, of course, a host of challenges exist, such as actually getting the instrumentation done, getting the signals back to your datacenter, and striking a balance between well-intentioned monitoring of your users and appearing to snoop on them (we know a little bit about the challenges of striking this balance).
Gather it. If you are after explicit user attributes such as age or gender, the best way to get this data is to ask your users for it. But of course, people aren’t just going to give you this information for no reason, and an over-nosy registration or checkout form is a sure-fire way to increase drop-out from your site, which can cost you money (just ask Bryan Eisenberg). So you will need to find clever ways of gathering this data which are linked to concrete benefits for your audience.
Model it. A third way to fill in data gaps is to use data modeling to extrapolate attributes that you have on some of your audience to another part of your audience. You can use predictive or affinity modeling to model an existing attribute (e.g. gender) by using the behavioral attributes of existing users whose gender you know to predict the gender of users you don’t know; or you can use similar techniques to model more abstract attributes, such as affinity for a particular product (based on signals you already have for some of your users who have recently purchased that product). In both cases you need some data to base your models on and a large enough group to make your predictions reasonably accurate. I’ll explore these modeling techniques in another post.
Buy it. If you have money to spend, you can often (not always) buy the data you need. The simplest (and crudest) version of this is old-fashioned list-buying: you buy a standalone list of emails (possibly with some other attributes) and get spamming. The advantage of this method is that you don’t need any data of your own to go down this path; the disadvantages are that it’s a horrible way to do marketing, will deliver very poor response rates, and could even damage your brand if you’re seen as spamming people. The (much) better approach is to look for data brokers that can provide data that you can join to your existing user/customer data (e.g. they have a record for user abc@xyz.com and so do you, so you can join the data together using the email address as a key).
Once you’ve determined which data makes the most difference to your marketing, and have hit upon a strategy (or strategies) to get more of this data, you need to keep feeding the beast. You won’t get all the data you need, whether you’re measuring it, asking for it, or modeling it, right away, so you’ll need to keep going, adjusting your approach as you go and learn about the quality of the data you’re collecting. Hopefully you can reduce your dependency on bought data as you go.
Finally, don’t forget: all this marketing you’re doing (or plan to do) is itself a very valuable source of data about your users. You should make sure you have a means to capture data about the marketing you’re exposing your users to, and how they’re responding to it, because this data is useful not just for refining your marketing as you go along, but can actually be useful in other areas of your business, such as product development or support. Perhaps you’ll even get your company’s big data people to have a bit more begrudging respect for marketing.
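The ‘Model it’ approach can be illustrated with a deliberately crude stand-in for a real predictive model: a majority-vote lookup that assigns an unknown attribute from a behavioral signal, based on users for whom the attribute is known. The field names (`top_category`, `age_band`), the sample users and both functions are hypothetical; a production model would use far more signals and would be validated before use.

```python
from collections import Counter, defaultdict


def fit_majority_model(known_users, signal, target):
    """For each value of the behavioral signal, learn the most common target value."""
    counts = defaultdict(Counter)
    for user in known_users:
        counts[user[signal]][user[target]] += 1
    return {value: c.most_common(1)[0][0] for value, c in counts.items()}


def predict(model, users, signal, default=None):
    """Predict the target for users who lack it; unseen signal values get the default."""
    return [model.get(user[signal], default) for user in users]


# Invented users whose age band is known, plus their most-browsed category.
known_users = [
    {"top_category": "golf", "age_band": "45+"},
    {"top_category": "golf", "age_band": "45+"},
    {"top_category": "golf", "age_band": "18-44"},
    {"top_category": "gaming", "age_band": "18-44"},
    {"top_category": "gaming", "age_band": "18-44"},
]
unknown_users = [
    {"top_category": "golf"},
    {"top_category": "gaming"},
    {"top_category": "cooking"},  # no known users share this signal
]

model = fit_majority_model(known_users, "top_category", "age_band")
predictions = predict(model, unknown_users, "top_category")
print(predictions)
```

The third user gets no prediction at all, which reflects the point made above: modeling only works where you already have enough known users to base it on.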
Commercializing Big Data: Frequent Flyer Programs
Mark Ross-Smith, Highly Experienced Hotel, Airline & Telco Product Loyalty, StayAngel
I recently shared an article on how airlines are using information to individually price airfares and extract maximum value from passengers which was thought provoking and insightful for many airline industry and big data professionals. With all the overwhelming feedback and questions I received, I wanted to dig deeper into the big data world and share with you some insights and a greater understanding on one of my favourite subjects—commercializing big data.
Big data and frequent flyer programs have a symbiotic relationship with each other, and when combined they represent a powerful illustration of what could be possible by leveraging commercial models within the big data landscape. So rather than get into the mundane world of data modelling, Hadoop, Machine Learning and coding, I want to explore the more practical side: steps that frequent flyer programs can take right now to increase the value offering to members, increase engagement and help generate new revenue for the business.
There are 4 key areas of big data that I believe are important: Application, Modelling, Insights and Analytics. In this article we’ll explore how to commercialize the application of big data to frequent flyer loyalty programs. That is, how to generate new revenues by commercializing your existing data streams. Without a doubt, frequent flyer programs are collecting information on a massive scale and while your data scientists, analytics and business intelligence folks are crunching numbers, hypothesizing, looking for data trends and burning through your company’s cash in the process to hopefully one day strike gold, I believe there is a faster, easier, more efficient path to new revenues. I call it Data Technology Commercialization.
Data Technology Commercialization: the business of making money
Consider the amount of information you have on your frequent flyer member base. A lot, right? Now think about all the other companies and brands your customers interact with on a daily basis. How much data do you think they are gathering? It’s likely the data you’re both collecting has very little overlap, and this is where you begin to expand your view of the customers’ spending and engagement profile. Once we take a step back and look at the top 10 brands your customer interacts with outside of your FFP, we begin to understand how little information the frequent flyer program holds on that customer, and it’s clear that a 360 degree view is simply not possible. Unless you’re the NSA. But since the NSA hasn’t opened a commercialization department, we need to create our own mini database to understand all the trigger points of every customer, what drives those trigger points and which partner is best leveraged to engage in the call to action.
Turning data into revenues

Many companies make the mistake of thinking they are always best positioned to send marketing messages on their own products. While this may sometimes be true, in the world of big data it’s possible to make an instant judgment call, on an individual basis, as to whether you or one of your commercial partners is best placed to trigger a marketing message. Who is more likely to convert this customer into a buyer of one of your products based on the data available today? Data your partners hold may be the missing link in your marketing chain. Below are some examples of how their data insights can plug into yours for a three-way mutual benefit between you, your partner and the customer.

Your partner knows which credit cards your customers hold, which they have held in the past and which they do not want. The remaining cards are the ones with marketing proposition value. Using your internal data on how receptive this customer is to new card offers, whether they churn cards and whether they are points driven, you’re able to calculate instantaneously which credit card to offer and how likely, on a score of 1–100, this customer is to apply for it. From there, you compare the NPS scores the customer has given your company and the partner company; whichever is higher should be the company that sends the marketing offer, to increase the overall effectiveness of the message.

If you know how many miles your customers are crediting to other programs, you know how much business you’re missing and can adjust individual messages to each customer accordingly. You may be receiving 30% of the BIS miles from this passenger but 100% of their premium class revenue tickets, and this may be the best possible revenue position, so no marketing inducements are triggered. Of course, if you’re receiving 10% of their business, your partners may be able to provide data such as who else is getting their business, which gives you the opportunity to wrap individual pricing and campaigns around their specific engagement with your brand.

Your customer just bought an event badge for a conference they’re planning to attend in a foreign country. Using data from your partners you could instantly figure out if they have booked flights on a competing airline. Armed with this highly lucrative intel, the event organizer could pre-search flights that have upgrades available and present these as options to your customer, with appropriate routing to match their preferences. (Does the customer prefer to fly via a specific city? Are they status driven for the extra miles? Do they avoid 767s?)
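The credit card example above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Brand` class, the function names, the card names and the propensity scores are all hypothetical, and in practice the 1–100 score would come from whatever model you build on your own and your partner’s data.

```python
from dataclasses import dataclass


@dataclass
class Brand:
    name: str
    nps_from_customer: int  # score this customer gave the brand (hypothetical field)


def remaining_cards(catalogue, held_now, held_before, not_wanted):
    """Cards still worth marketing: not held, not previously held, not rejected."""
    excluded = set(held_now) | set(held_before) | set(not_wanted)
    return [card for card in catalogue if card not in excluded]


def pick_offer(remaining, propensity):
    """Return (card, score) with the highest 1-100 propensity score, or None."""
    scored = [(card, propensity[card]) for card in remaining if card in propensity]
    if not scored:
        return None
    return max(scored, key=lambda pair: pair[1])


def pick_sender(program, partner):
    """The brand the customer rated higher sends the offer.

    Ties go to the program itself; that tie-break is an assumption.
    """
    if partner.nps_from_customer > program.nps_from_customer:
        return partner
    return program


catalogue = ["Card A", "Card B", "Card C", "Card D"]
remaining = remaining_cards(catalogue, ["Card A"], ["Card B"], [])
offer = pick_offer(remaining, {"Card C": 72, "Card D": 41})
sender = pick_sender(Brand("FFP", 6), Brand("Partner", 9))
print(offer, sender.name)  # ('Card C', 72) Partner
```

The point of the sketch is the last step: the decision about who sends the message is made per customer, and the program does not assume it should always be the sender.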
The key here is to leverage the partner brand that has the highest possible chance of turning the data into a sale, and to recognize that trying to ‘own’ the customer in every circumstance may be damaging to your relationship. Remember to be open to the idea of letting someone else do the heavy lifting and sharing in the success with them. They bring more than just a customer to the table: they potentially bring the difference between you having something and your competitor having everything. These are only a few examples of how you can leverage the influence of data and commercialize it in a way that makes sense for all parties. Ultimately your key aims will be to drive revenue while adding real value to your customers’ experience. When you find a balance between cross-sharing specific data and marketing, you’ll be in a position where new product revenue opportunities occur naturally and you’re able to extract the maximum value from each customer.
SPARK CREATING UBER FIRE
Laura Denham Organizer, Big Data Innovation Summit
Uber has become a household name and few in the developed world don’t know what it does and the speed at which it has grown. Having launched in 2009, the company is already valued at $18 billion, and that number is increasing daily.
It seems like a very simple model; people want to make extra money by driving people around in their free time and people want to be able to get a taxi wherever they are at whatever time. However, because the process is based on an app and locational data, and the company focuses on the speed at which customers are picked up, the amount of data created and required is considerable. The amount of data within a single transaction is large, but when you consider that they now command a fleet of over 100,000 drivers in around 340 cities across 63
countries, the scale of the data they need to deal with becomes clear. As a foundation for this they have turned to Apache Spark, which allows them not only to process the data quickly but also to scale their operations quickly. To help with this scaling, and to deal with the huge amounts of data, they have turned to a Kafka-based system that pushes the data to local data centers and then on to a central Hadoop cluster. This replaces a system of multiple distributed data centers using a relational model, and makes the process considerably simpler and faster.
In fact, Vinoth Chandar, who is in charge of scaling and creating Uber’s data systems, told Datanami in an interview that Spark has been ‘instrumental in where we’ve gotten to’. With high speed and quick scaling both required, Spark feeding into a central Hadoop cluster has been the ideal setup, given the historically strong scalability that this brings.

So what does this all mean for the consumer? People often miss the importance of a backend data framework because it is not something that is noticed when it is working, but in the case of Uber it is vital to their ongoing success. Surge prices, for instance, are set through an analysis of the data rather than by arbitrary time slots, making them a more accurate representation of supply and demand. It also means that reacting to large events is much easier and need not affect the overall service across an area: if a large music event is happening at one end of a city, the other end will not be devoid of Uber drivers. Drivers are also placed strategically in real time, meaning that you can order a car and have it with you as soon as possible. So although most Uber users do not even know what Spark is, their enjoyment of the service is very much dependent on it.
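To make the idea of data-driven surge pricing concrete, here is a minimal sketch in Python. It is not Uber’s algorithm, which is not described here; the function names, the zone names, the cap of 3.0 and the simple requests-per-driver ratio are all assumptions chosen for illustration. What it shows is a price multiplier computed per zone from current demand and supply, not from a fixed time slot.

```python
def surge_multiplier(open_requests, available_drivers, cap=3.0):
    """Price multiplier for one zone, clamped to the range [1.0, cap].

    Uses the ratio of open ride requests to available drivers.
    Both the ratio and the cap are illustrative assumptions.
    """
    if open_requests <= 0:
        return 1.0
    if available_drivers <= 0:
        return cap
    ratio = open_requests / available_drivers
    return round(min(cap, max(1.0, ratio)), 2)


def surge_by_zone(zones):
    """Map each zone name to its multiplier; zones maps name -> (requests, drivers)."""
    return {name: surge_multiplier(req, drv) for name, (req, drv) in zones.items()}


# A concert at one end of the city raises prices there only.
zones = {"stadium": (90, 40), "suburb": (10, 25)}
print(surge_by_zone(zones))  # {'stadium': 2.25, 'suburb': 1.0}
```

Because each zone is priced from its own figures, a spike at the stadium leaves the suburb at the normal fare, which matches the point about large events not affecting the rest of the city.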
What do TalkTalk Hacks tell us?
George Hill Managing Editor
TalkTalk, the telecommunications company, have come under ‘significant and sustained’ attack on their systems, and hackers have gathered an unverified amount of information on their customers. According to media reports, TalkTalk do not know what has been taken or how many of their 4 million customers are likely to be affected. Even with the details fairly hazy, though, we can learn a considerable amount about the nature of cyber attacks and about some of the companies that fall victim to them.
There Is No Such Thing As Victim Blaming

Cybercrime is not like many other crimes, where we could say that they could happen to anybody. The targets of hackers are invariably larger companies who hold the most data. Throughout all of the media reporting, nobody has absolved the company of blame in the way they would the victim of a burglary. If a company is hacked it is quite simply their own fault, because cybercrime is not a new or mysterious thing: it is scarily common and it can be mitigated. Security experts, firewalls and any one of thousands of other measures can be used to try and stop hacks occurring. It seems that TalkTalk did not heed the warnings, a point we return to below. When a customer hands over their data to a company, the company has a duty of care for that data, and by allowing hackers to access their system, TalkTalk have catastrophically failed in this duty.

Many Hacks Are Not About Data Directly

It was previously thought that the reason for hacks in general was for criminals to get hold of personal information from the data they stole. Credit card details, passwords and so on would be stolen from individuals and then used to steal money and identities from them. However, banks, credit card companies and even email providers have wised up to a lot of these actions, and it seems that the primary motivation of hackers today is simply to steal information and then hold it to ransom. TalkTalk were contacted by the hacker, or hackers, with a demand for money in return for not releasing the data, which appears to back up this thought. With the Ashley Madison hack the ransom demands weren’t even monetary; the hackers were simply trying to force the company to take specific action. In that case, the data was released because the company did not comply and the damage was made considerably worse, which may actually strengthen the position of hackers in future. It showed that these were not empty threats and that hackers were willing to follow through.

Some Companies Do Not Learn

The most painful thing we get from this hack is simply that some companies do not learn. With some companies, although they have not taken the time to fully research and implement security effectively, their lack of preparedness could be put down to them not having been in the firing line in the past. If you are attacked once, you normally take note and beef up your security instantly. TalkTalk have had not one, but two hacks in the last 18 months that are currently being investigated by the police. This means that not being aware that their system had flaws isn’t an excuse, and in the 2.5 months since the last attack, they have not put themselves in a position to effectively protect their customers’ data. Possibly worse than this, they have admitted that some of the data that was stolen wasn’t encrypted, despite the two previous attacks. Unencrypted data is much easier to steal and use, so the question needs to be asked: why did they keep the data in this form? This makes the data loss inexcusable, but we can learn one thing from it, namely that companies who ignore the safety of their customers’ data are likely to be punished. Hopefully many will learn from this hack and from the repercussions that TalkTalk are likely to face.
The Crowdsourced Chief Data Officer
Chris Towers Head of Big Data
Crowdsourcing has had many successes, from the way that companies and products have been funded through sites like Kickstarter or Indiegogo, all the way to the Guardian using it to analyze leaked data files.
It is a common practice, but in Long Beach, California it is being used for something completely different: as a stand-in for a Chief Data Officer. The thinking behind this is that, as a small city, they are sceptical of creating a six-figure-salary job without knowing whether it would make a positive impact. It is a common concern, and one that is especially worrisome in government departments because of the increased pressure that using public money brings. Long Beach has a population of over 470,000, but is not a tech hub like many cities in the state. They still want to investigate the possibilities of the role though. So they have created this experiment with the aim of:

1. Identifying high-value data that benefits citizens;
2. Supporting the cleaning and formatting of open data; and
3. Presenting open data insights to citizens via mobile and Web apps.

With these three aims in mind, they will open up much of the municipal data to allow their engaged, data-driven citizens to try to enact some of the main duties of a Chief Data Officer, but without the financial risk that actively hiring one entails.

It is at present unclear whether or not this will be a success, but if it is, it could lead to a revolution in the way that governments use their data and in how they decide whether to appoint a Chief Data Officer. As this is being done at a fraction of the cost of a dedicated Chief Data Officer, governments can assess whether or not they should employ one full time, or even whether there are elements of the role that can be outsourced on a more permanent basis. It also gives them the chance to see whether the use of data in their work is productive and whether it has resonance with the population.

However, it is not a simple operation to create and implement effectively, and there are likely to be some hurdles to jump before it becomes a success. For instance, there needs to be some kind of reward, whether financial or social, for those who undertake the work. Those in government also need to show appreciation of the work being done. If, after a volunteer has spent hours analyzing something, the findings are then ignored, the chances are that they will be put off doing further work in the future.

Care also needs to be taken in communicating what information is being given to these people. With the media currently fixated on data privacy issues, many people are rightly worried about their data being used without their knowledge. When data is in a more-or-less open format like this, communicating exactly what it is being used for in this work, and why, will be vital to the survival of the programme.

So can this kind of programme work? In essence it can, but it will need significant support from the public and from individuals in the government. If people can get it to work, it may well be the best possible option for smaller governments in the future and could create a genuine opportunity for the further spread of data in the public sector.
Good data governance: the ultimate recipe for growth
Fergus Kennedy Head of Compliance, Pulsant
In the age of big data, organizations are seeking to use analytics to their advantage: to improve efficiencies, better understand their customers, and ultimately grow their business. But as companies mature, the need to fully understand their information assets grows, and knowing what data you have and what you can do with it becomes a challenge. Data governance, or the ability of an organization to manage the availability, usability, integrity and security of its data, has a key role to play. It’s all about understanding the governance requirements of data in the ‘here and now’ as well as into the future, and knowing how changes in the business (no matter how small) or in the greater market will affect things.
Cut the complexity, reap the rewards

But is there a simple way to harness the power of data governance? Simply put, yes there is. Data governance can be a foundation for growth, from organic growth to an increase in capabilities. Regardless of sector, whether public, private or third sector, organizations should seek to simplify their approach to data governance. How? There are a number of steps that you can take, like streamlining and rationalizing your information assets, the number of locations in which they’re kept, and the number of transit points. But that’s not enough; you should also focus on how many people are accessing the data and using it.
Splitting assets for consolidation

The approach to simplification begins with separating information assets, distinguishing between customer information, operations data and financial information, and marketing and advertising data. Once these sets of data are separated, you can properly address the needs of each. Of course that’s not to say each bit of data should have its own section; just look at grouping the data into sets that make sense. This helps make the process more efficient and effective by consolidating the storage and backup requirements for each set. In the same way, when you outsource these elements (storage and backup) to service providers or third parties, you can maintain full control by knowing exactly where your data is, who is accessing it and where the transit points are. This approach is much simpler than managing your data using a distributed tool set, or a range of different data protection rules and contracts across different geographic regions that leave you subject to different data sovereignty rules.

While there is no ‘one size fits all’ set of rules for all organizations, this approach can be applied to any business or public sector organization. From SMEs to full-scale enterprises the approach will work, but it will have to be applied consistently. This is especially true as each industry has its own set of challenges, as do the markets in which public sector organizations operate. For example, the public sector is governed by procurement rules, financial services is driven by compliance and the charity sector faces significant budget challenges.
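As a rough illustration of the grouping described above, the following Python sketch keeps a small register of data sets, each tagged with one of the three groups, and summarizes per group where the data lives, where it transits and who uses it. The class, field and function names, the sample data and the one-location limit are hypothetical; a real register would follow your own asset inventory and policies.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class DataSet:
    name: str
    group: str  # e.g. "customer", "operations_finance", "marketing"
    locations: set = field(default_factory=set)
    transit_points: set = field(default_factory=set)
    users: set = field(default_factory=set)


def summarize(datasets):
    """Per group: the distinct locations, transit points and users of its data sets."""
    summary = defaultdict(
        lambda: {"locations": set(), "transit_points": set(), "users": set()}
    )
    for ds in datasets:
        entry = summary[ds.group]
        entry["locations"] |= ds.locations
        entry["transit_points"] |= ds.transit_points
        entry["users"] |= ds.users
    return dict(summary)


def groups_over_limit(summary, max_locations=1):
    """Groups whose data is spread over more locations than the (assumed) limit."""
    return sorted(
        group for group, entry in summary.items()
        if len(entry["locations"]) > max_locations
    )


register = [
    DataSet("crm", "customer", {"dc-london"}, {"vpn"}, {"sales", "support"}),
    DataSet("tickets", "customer", {"dc-leeds"}, {"vpn"}, {"support"}),
    DataSet("ledger", "operations_finance", {"dc-london"}, {"vpn"}, {"finance"}),
]
print(groups_over_limit(summarize(register)))  # ['customer']
```

Here the customer group is flagged because its data sits in two locations, which is the kind of spread the consolidation step is meant to reduce.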
What are the other benefits?

Apart from spurring organizational growth, reducing the complexity of your data governance strategy can in turn reduce the time and cost of risk assessment and management. With a simplified view of data governance you can focus on your key governance challenges and therefore have a clearer view of the associated risks.

The role of the service provider

When it comes to scoping out the benefits of simplified data governance or splitting out your information assets, the help of an expert service provider can be hugely advantageous. For example, hosting your data in a service provider’s data centre means you don’t need to worry about factors like physical security, because the provider will have the proper controls and expertise in place. In the same way, using a trusted provider to assist with managing access to the data can also be beneficial. A service provider can manage access to your information remotely and manage the connections around it, while you simplify user groups and reduce the number of people within them.

Conclusion

Ultimately, it’s all about understanding your organization’s data, deciding what it is important for your business to do with it, and applying appropriate governance. While the help of a third party is beneficial, it doesn’t replace the need to understand the risk and liability around your data. Through appropriate data governance you can give your business the best foundation for healthy enterprise, regardless of the sector in which you operate.