Big Data Innovation, Issue 4


BUILDING A DATA TEAM

David Barton talks to Pamela Peele, Chief Analytics Officer at UPMC

A DIFFERENT PERSPECTIVE

Sean Patrick Murphy gives us his unique views on Big Data

FINDING THE KEY DATA

Andrew Claster, from Obama for America, discusses the use of Big Data in the re-election campaign


Go full throttle with big data. Want to get to relevant data quicker? Buckle up. With SAS High-Performance Analytics, you can reduce your big data analysis from days and hours to just minutes and seconds. Then, use that extra time to predict and solve the toughest business problems – while your competitors are still spinning their wheels.

Visit sas.com/big-data-info to learn more about SAS big data solutions.

SAS Institute Inc. product or service names are registered trademarks or trademarks of SAS Institute Inc. in the USA and other countries. ® indicates USA registration. Other brand and product names are trademarks of their respective companies. © 2013 SAS Institute Inc. All rights reserved. S112807US.0813


Letter From The Editor

Welcome to this issue of Big Data Innovation. Whether you are reading the printed version or online, we hope you enjoy this edition as much as we have enjoyed putting it together.

With growing appreciation of the benefits of Big Data, its popularity is only going to increase, and we hope the skills gap that organizations suffer from will shrink and eventually close completely.

In this issue we have worked with some of the sharpest minds currently working in Big Data. We wanted to know their thoughts on the industry and how it will look in the future. We speak to Sean Murphy, Pamela Peele, Gregory Piatetsky-Shapiro and Kirk Borne. Each has their own experiences to draw on and gives a unique and interesting perspective on how we can get round the gap currently affecting the industry.

In addition to this, we also talk to Andrew Claster about his experiences using big data in the Obama re-election campaign.

I hope you enjoy this issue. If you want to contribute to future issues, or feel that you have a unique perspective or response to anything you read in here, please email me.

George Hill
Managing Editor
ghill@theiegroup.com

Managing Editor: George Hill
President: Josie King
Art Directors: Gavin Bailey & Joanna Violaris
Assistant Editor: Chloe Thompson
Advertising: Hannah Sturgess
Contributors: Dan Miller, Chris Towers, David Barton, Heather James
All Enquiries: ghill@theiegroup.com

Contents

6   David Barton looks at how Pamela Peele has built a team around her big data needs

10  Chris Towers talks to Gregory Piatetsky-Shapiro about his take on current big data education and how it could be improved

16  Heather James talks data with Kirk Borne, one of the world’s leading big data professors, discussing current issues at college level and before

22  Sean Patrick Murphy, Senior Scientist at Johns Hopkins University, talks to George Hill about his unique perspectives on data and approaches to it in education

27  Daniel Miller talks to Andrew Claster, Deputy Chief Analytics Officer at Obama for America, about his use of analytics in the Obama re-election campaign

Building Teams in Big Data: An interview with Pamela Peele, Chief Analytics Officer at UPMC

David Barton

When we are looking at the big data skills gap and education, one of the most important aspects to look at is how it is affecting individual industries. Pamela Peele is the Chief Analytics Officer at UPMC and I have had discussions with her in the past about how she has managed to build effective big data teams and what is needed to create an effective partnership.


Many companies tend to create teams revolving around technically minded people, which can often create business problems in the future. Having an analytics leader with not only leadership skills but also business and technical awareness allows the team to be effectively steered. Having a team of technically minded people can mean that business problems are often looked at in the wrong ways, for instance through data-driven decisions that would not work in a particular business sector. Pamela believes the most important aspect of her big data team is having an analytics leader who can create the data strategy and implementation within the scope of the company whilst also taking a leading role in the hands-on analysis.

The investment in an analytics leader alone is not enough, however. The company must have the trust and the bravery to make significant investments in the team around the leader; otherwise the skills may be there, but the manpower would not be enough for the task.

Pamela is also interested in the big data skills gap, a current trend that has caused considerable issues in the healthcare industry. Whilst other industries such as finance, insurance and retail are also feeling the pinch in terms of the numbers of qualified and experienced data scientists, healthcare has been hit even harder. The reason for this, according to Pamela, is "in healthcare whilst it is somewhat transactional delivering services, the service isn't exactly the same because the consumption and action of service varies by patient so it’s much harder to deal with health data than transaction pieces which are claims or transaction data."

Of course, the only real way around this is through the ways in which we are educating graduates. Pamela believes that at PhD level the graduates that come through the system are good, but that at bachelor degree level there could be some improvement. However, this may be a changing trend, as in the US especially we are seeing universities making investments in their big data, analytics and statistical courses. This will hopefully see an improvement in the quality of their statistical bachelor degree graduates.

One of the ways in which healthcare companies have tried in the past to make up for the dearth of healthcare-centric analytical talent is by attempting to adapt either technical thinkers to healthcare or healthcare thinkers to become more technical. The issue that this creates is a bias towards one side of a role that should be balanced. However, a strength and unique aspect of Pamela's thinking is that she manages to utilise a plethora of roles within her big data team. Pamela uses the example of a factory in order to explain why this is the case.

"The way to think about this is that you are making knowledge...When you make knowledge it is no different from making widgets, it is a production. You would never staff a factory with everybody who is a die cutter or a machinist. You need to have a whole different diverse set of skills to run the factory, in the same way that you need diverse skills to run an analytics shop."

The data team at UPMC now includes biostatisticians, physicians, lawyers, policy makers and even journalists. Each has an important role in creating, presenting and acting upon the data that is created, and therefore in making a successful team. This kind of integration of different skills is a novel and useful way of addressing the big data skills gap and can often overcome some of the limitations that you find in analytics graduates.



Gregory Piatetsky-Shapiro Talks Big Data Education

Chris Towers

One of the aspects of big data that many in the industry are currently concerned by is the perceived skills gap. The lack of qualified and experienced data scientists has meant that many companies find themselves adrift of where they want to be in the data world.


I thought I would talk to one of the most knowledgeable and influential big data leaders in the world, Gregory Piatetsky-Shapiro. After running the first ever Knowledge Discovery in Databases (KDD) workshop in 1989, he has stayed at the sharp end of analytics and big data for the past 25 years. His website and consultancy, KDnuggets, is one of the most widely read data information sources and he has worked with some of the largest companies in the world.

The first thing that I wanted to discuss with Gregory was his perception of the big data skills gap. Many have claimed that this could just be a flash in the pan and something that has been manipulated, rather than something that actually exists. Gregory references the McKinsey report of May 2011, which states:

“There will be a shortage of talent necessary for organizations to take advantage of big data. By 2018, the United States alone could face a shortage of 140,000 to 190,000 people with deep analytical skills as well as 1.5 million managers and analysts with the know-how to use the analysis of big data to make effective decisions.”

The report predicts that this kind of skills gap will exist in 2018, but Gregory believes that we are already seeing it. Whilst using Indeed.com to look at what expertise companies are looking for, Gregory found that both MongoDB and Hadoop appear in the top 10 job trends.

“Big Data is actually rising faster than any of them. This indicates that demand for Big Data skills exceeds the supply. My experience with KDnuggets jobs board confirms it - many companies are finding it hard to get enough candidates.”

There are people responding to this, however, with many universities and colleges recognising not only the shortages, but also the desire from people to learn. Companies looking to expand their data teams are also looking at both internal and external training. For instance, companies such as EMC and IBM are training their data scientists internally. Not only does this mean that they know they are getting a high quality of training, but also that the data scientists they employ are being educated in ‘their ways’. With companies finding it hard to employ qualified candidates, training programs like this let companies look for great candidates and make sure they are sufficiently qualified afterwards.

The IBMs and EMCs of this world are few and far between. The money that needs to be invested in in-depth internal training is considerable, and so many companies would struggle with this proposition. So what about those other companies? How can they avoid falling through the big data skills gap? Gregory thinks that most companies have three options.

Do you need BIG data?

Most companies confuse big data with basic data analysis. At the moment, with the buzz around big data, many companies are over-investing in technology that realistically isn’t required. A company with 10,000 customers, for instance, does not necessarily need a big data solution with multiple Hadoop clusters. Gregory makes the point that on his standard laptop he would be able to process data for a large software company with 1 million customers. Companies need to ask if they really need the depth of data skills that they think they do.

What if you do need it?

For large companies who may need to manage larger data sets, the reality is that it is not necessary to employ a big data expert straight from university. Gregory makes the point that somebody who is trained in MongoDB can become trained as a data scientist relatively easily. If an internal training programme is not a realistic target in this instance, then external training may become the best option. There are several companies who can offer this, such as Cloudera and many others, who can train data scientists to a relatively high standard.

Gregory also mentions that one way in which several companies are learning about big data and analytics is through attending conferences. There are now hundreds of conferences a year on Big Data and related topics, from leaders in the field such as Innovation Enterprise and other smaller conferences all around the world.

What if these are untenable?

Some Big Data and analytics work can be outsourced or given to consultants. This not only allows companies to free up their existing data team for specific tasks, but also means that they do not have to risk taking on a full-time employee who may not be sufficiently qualified. Here, the leading companies include IBM, Deloitte, Accenture and also pure-play analytics outsourcing providers like Opera Solutions and Mu Sigma.

Having discussed the big data skills gap with several people who have worked in big data for years, I have found that one of their main concerns is the fanfare affecting the long-term viability of the business function. Gregory does not have this concern, but does make it clear that we need to make sure that the buzzword ‘big data’ is separated from the technological trend.

He has written in Harvard Business Review about his belief that the ‘sexy’ big data is being overhyped. The majority of companies who have implemented big data have done so in order to predict human behaviour, but this is not something that can be done consistently with big data. Therefore, Gregory believes that any disillusionment with big data will not come from an inability to find the right talent, but from the reality not living up to the build-up.

On the other hand, Gregory is quick to point out that the amount of data that we are producing will continue its rapid growth for the foreseeable future. This data will still need people to manage and analyze it, and so we are going to continue to see growth even if the initial hype dies down.

We are also seeing an increasing interest in countries outside the US, the current market leader. This global interest is likely to increase the big data talent pool and therefore allow for expansion. Having used Google Trends, Gregory points out that the top 10 searches for ‘Big Data’ are:


Given the interest from elsewhere, we are going to see an increasingly globalized talent pool and potentially the migration of the big data hub from the US to Asia. Gregory also points out that, given that the top five do not have English as a primary language (the trend analysis was purely for English-language searches), the likelihood is that this does not represent every search for big data in those countries. This interest in the subject certainly shows that the appetite for big data education exists globally, and those working in the big data educational sphere are utilising technology to increase effectiveness. Gregory points out that many companies are using analytics within their online education to make the experience more productive for both the students and teachers. Through the use of this technology, big data education is becoming more productive and also more tenable to a truly international audience.

One aspect of big data that is clear is that in order to succeed you need curiosity and passion. The other aspects of the role will always involve training, and the kinds of options and platforms available for this will mean that in the coming years we will see this gap closing. Gregory is a fine example of somebody who has not only managed to innovate within the industry for the past 25 years, but was also one of the first to try to share the practice across many people. If we can find even one person from those working in data with the same passion and curiosity as him, then the quality and breadth of education can continue to grow at the same speed as this exciting industry.



Big Data Superstars: An interview with Kirk Borne

Heather James

Big data has been described as the sexiest job in the world. Those working on it every day may disagree, but we have seen data scientists become superstars in the business world. The use of predictive analytics and effective analysis has seen an upturn in the successes of thousands of companies and spawned new and even more effective ways of communicating with customers.


When we think of big data today, for most people it is something that has been around for a few years and that has been hyped almost beyond perception by the media. The use of big data to do everything from predicting the pregnancy of a woman before she had even told her parents to winning the most powerful job in the world has dominated news stands and thrust data science into the forefront of business thinking.

One man, however, has done everything from briefing the White House on data mining to using systems for discoveries at NASA. He has spoken at TED and been voted the most influential big data influencer on Twitter and number 11 in the Big Data top 100.

If we talk about data scientists being superstars, Kirk Borne is the quarterback. I wanted to speak to Kirk about his experiences in data, how he has seen the industry change and where he sees it going in the future.

Kirk is the Professor of Astrophysics and Computational Science at George Mason University as well as being on the board for multiple data organizations, so he is the perfect person with whom to discuss data education. The first thing I wanted to know from Kirk was his thoughts on the current idea that there is a skills gap in the big data industry.

The answer is an unequivocal yes.

Kirk is now a man in demand, for his opinion, his connections and the people that he teaches. He says that “I now get two or three calls every day from companies trying to find data scientists”. This is at odds with what he was finding at the beginning of his career. Kirk has described this as a flip in the equation from when he first graduated, when there were 100 graduates for every data job going. Given that at this time the role was predominantly utilized by science and government based agencies, the number of positions available was considerably lower.

Today Kirk is finding that for every 100 jobs, there is one applicant. The reason for this shift is that our society is becoming increasingly social and digitally focussed, with incredible amounts of data being created every day. One of the aspects of big data that Kirk also finds interesting is the way in which it is perceived due to this. Many have said to him that ‘big data has always existed’, but Kirk believes that this is a misleading statement. It is almost incomprehensible how much information an individual makes now due to things like smartphones and social media, and dealing with thousands of people who are all creating this much data is not something that we have ever dealt with before.

In his TED talk in April 2013, Kirk discusses how up until 2002 the amount of data that we had created was 5 billion gigabytes. In 2003 alone this was created again. By 2011 this much data was made in 2 days. Today this is made in 10 minutes.

This kind of growth in data, not only in terms of the amounts created but also in terms of the speed at which it is created, means that although we have always had data, we now need the ability not only to deal with what we have, but also to adapt in order to deal with the ever-increasing amounts. Education in dealing with this kind of data therefore needs to be good.

Kirk’s view on this is that there are two perspectives that need to be looked at in order to effectively assess current big data education initiatives.

‘The phrase that I use with people is that it’s an education in data as well as data in education’
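The growth figures quoted from Kirk's TED talk can be put on a common scale with a little arithmetic. The sketch below is illustrative only: it assumes the "5 billion gigabytes" baseline and the time windows exactly as quoted in the article, treats "today" as 2013 (the year of the talk), and uses variable names that are our own.

```python
# Figures as quoted in the article; the conversion to days is our own rough assumption.
BASELINE_GB = 5e9  # "5 billion gigabytes" created up to 2002

# Time taken to create one baseline's worth of data, in days.
days_per_baseline = {
    "2003": 365.0,              # "In 2003 alone this was created again"
    "2011": 2.0,                # "made in 2 days"
    "2013": 10.0 / (60 * 24),   # "made in 10 minutes" (article's "today")
}


def rate_gb_per_day(days):
    """Average creation rate implied by one baseline every `days` days."""
    return BASELINE_GB / days


rates = {year: rate_gb_per_day(d) for year, d in days_per_baseline.items()}

for year, rate in rates.items():
    ratio = rate / rates["2003"]
    print(f"{year}: about {rate:.3g} GB/day ({ratio:,.0f}x the 2003 rate)")
```

Expressed this way, the quoted figures imply a creation rate tens of thousands of times higher in 2013 than in 2003, which is the scale of change behind Kirk's argument about education.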


Kirk believes that data should be included in education from a young age, as regardless of your future profession, it will be used in one way or another. For instance, it can even be done at kindergarten level: the ways in which toys are sorted by colour, type, size or shape are all forms of data siloing. Using this kind of technique early, so that children can identify and explain why certain things are in certain areas, forms a strong foundation on which to add more complex ideas.

Education in Data: This initial education throughout earlier school opportunities will also allow the education in data aspect to be more thorough and successful. What many lecturers currently find is that people come into higher data education with a gap in understanding, with some teachers actually saying that students don’t know what ‘data’ is.

The need to teach people these aspects of data throughout their lives will be vital to improving education and closing the skills gap. Many, when looking to data for business solutions, want to find an all-encompassing data scientist. Kirk believes that this is not always necessary. A business team is like any other team: you have different people in it to do different jobs. Kirk believes that companies who are struggling to find the complete-package data scientist can get around the problem by building their teams in this way. Sure, there are ‘all star data scientists’ around, the ones who know about the algorithms and know about the business, sales, strategy and finance and can run almost as a department in themselves, but they are like “all stars” everywhere else: rare.

The way in which companies are looking for data scientists at the moment could be transformed to make it more of a team effort, as opposed to just looking for an individual who can do it all. This collaborative approach (as discussed with Pamela Peele on page 6) can reap rewards and should be approached like a factory, where you have many different specialists working in their chosen areas. Why should big data be any different? This approach, allowing organizations to utilize the skills needed in data (be this through one all star or a collaborative effort through a team), will drive the industry forward.

A forward thrust and pragmatism are what is going to be needed in the coming years. This is due to Kirk’s prediction that, given the ways in which big data has grown, this growth is hard to see stopping. The growth so far has been exponential, growing year on year not only in scale but in speed. Kirk points out that an exponential growth curve is not only about exponential growth but an exponential growth in the rate of growth. This means that we are going to be seeing considerably more data produced, considerably quicker. The only way that this could stop would be if companies stopped putting sensors in devices or people stopped using social media. As both of these seem unlikely, the amount of data created, and therefore the need for it to be managed, will continue to increase.
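Kirk's remark about the rate of growth can be made concrete with a simple model. As an illustration only, assume the volume of data $D(t)$ follows a pure exponential with a starting volume $D_0$ and a growth constant $k$ (both symbols are our own, not taken from the interview):

```latex
D(t) = D_0\, e^{kt}, \qquad
\frac{dD}{dt} = k\, D_0\, e^{kt} = k\, D(t), \qquad k > 0
```

Under this assumption the rate of growth is proportional to the amount of data itself, so the rate grows exponentially along with the total, which is the point made above.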

One thing that Kirk is sure of is that the hype that we are currently seeing around big data will not destroy the potential that it has. He equated the hype around big data to the Titanic. When the ship set off it was the largest, fastest, most luxurious ship in the world. When it was sinking it was still all of these things, but people weren’t concentrating on that; they were concentrating on swimming. Big data is like this now. There is still all the hype about what it can do and the reasons for doing it, but ultimately, with the amount of data constantly increasing, we will need to start swimming to make the most of it.

One of the things that I was struck by with Kirk was his genuine excitement at what big data has become, and the ways in which it can be used to make breakthrough discoveries and help organizations everywhere make the best decisions. His success has been down to decades of dedication to data and its uses. Through this he has achieved unprecedented success, and although he rightly says that there are few big data superstars, Kirk is undoubtedly one of them.

TURNING BIG DATA INTO INTELLIGENT ACTION

ONLY VITRIA OPERATIONAL INTELLIGENCE (OI) PROVIDES:
- Continuous, Real-time Analytics™ on Big Data in motion & at rest
- Built-in process management capabilities that enable intelligent action
- Elastically scalable Operational Intelligence on premise or in the cloud

LEARN MORE ABOUT VITRIA OI FOR BIG DATA ANALYTICS AT:

www.vitria.com



Looking At Data From A Different Perspective: An Interview With Sean Patrick Murphy

George Hill

his assimilation that ‘A data scientist is a data analyst from San Francisco’. This aptly demonstrates the way that Sean looks at the uses of data across the board. It is something that is important, but the hype that surrounds the subject has warped its true use. One of the clearest indications of Sean’s thinking is his following description of big data at the moment:

“While many have tried, the term “big data” lacks a true consensus definition. At the moment the most popular definitions seem to coalesce around the idea that big data is one or more data sets so large and complex that they are challenging to process using traditional databases and tools. Often associated with this concept are the characteristic “three V’s” of big data: the volume (amount of data), velocity (speed of data in and out) and variety (range of data types and sources). Some enterprising companies and consultants throw in a 4th “V” for veracity or some other “V” word. Regardless, these definitions miss a key aspect of the term. To put it into hyperbolic language, “Big Data” isn’t about the size of data at all. Instead, it is the simple yet seemingly revolutionary belief that data is valuable. While “big data” does often happen to be large in size (although this is always relative to the available tool set), I believe that “big” actually means important (think big deal). Scientists have long known that data could create new knowledge but now the rest of the world, including government and management in particular, has realized that data can create value, principally financial but also environmental and social value. And, if data is valuable, more data is more valuable and who doesn’t want “big” (i.e. large) value.”

Taking this view makes the concept of ‘big’ data more of an abstract term: big data is about the importance of data as opposed to its size. This simplifies the idea whilst also making it clear that the data revolution is about a new perception as opposed to a new size.

Sean is also a believer that there is nothing that big data cannot improve. He sees data not as something that will be of use in itself, but as something that will be of use to improve and focus other areas. In order for anything to be improved there needs to be some kind of measurement, and this measurement needs to be recorded and analysed. This is the whole idea of what data is, and therefore as long as there are elements of anything that can be improved, there will be no limit to what data can achieve within the improvement process.


Another area where Sean takes a unique view is one that those working in big data are currently concentrating on: the big data skills gap.

Although he admits that there may well be a gap in the number of people who can actually analyse and collect the data, what is really missing is the knowledge needed throughout the rest of the company. Without management knowledge and willingness to act on the results of the collected data, it realistically does not matter whether or not there are enough people to analyze the data, as it will make no difference anyway.

He believes that moving away from just using the opinion of the HiPPO (Highest Paid Person’s Opinion) is the only way that we can make data really drive the future of companies, organizations and, even more importantly, governments.

So where does this leave the industry? I was curious about whether Sean saw this change occurring through the increased use of technology or a heavier involvement from people. Sean’s answer was, “People provide the creativity, the drive to explore, and the flashes of insight while the technology enables them to execute”. This shows the balanced approach that needs to be taken within the industry in order to drive it forwards.

I was once told that big data is like cooking a good meal. You can have a great stove, pans and knives, but without the correct chef to put it all together they are pointless. If you have the technology, but not the analytics skills to utilize it properly, then the technology is useless.

Sean has a unique opportunity within big data at the moment. He has worked on several different data projects across business, government and several other spaces. This breadth of usage gives him one of the most innovative and interesting perspectives on big data that I have come across. His work at Johns Hopkins also means that he has the opportunity to use some of the latest technologies years before they are available to businesses. This is perhaps one of the main reasons why he has such high hopes for the future whilst making a concerted effort not to overhype the industry in its current state. One thing is for sure: with people like Sean looking to bring through breakthrough data techniques, the chances of us seeing a more data-driven society are greatly increased.



On-Demand Business Education

www.membership.theiegroup.com



Data In An Election: An Interview With Andrew Claster, Deputy Chief Analytics Officer, Obama for America

Daniel Miller

We were lucky enough to talk to Andrew Claster, Deputy Chief Analytics Officer for President Barack Obama’s 2012 re-election campaign, ahead of his presentation at the Big Data Innovation Summit in Boston, September 12 & 13 2013.

Andrew helped create and lead the largest, most innovative and most successful political analytics operation ever developed. He previously developed microtargeting and communications strategies as Vice President at Penn, Schoen & Berland for clients including Hillary Rodham Clinton, Tony Blair, Gordon Brown, Ehud Barak, Leonel Fernandez, Verizon, Alcatel, Microsoft, BP, KPMG, TXU and the Washington Nationals baseball team. Andrew completed his undergraduate studies in political science at Yale University and his graduate training in economics at the London School of Economics.

spirit of america / Shutterstock.com

What was the biggest challenge for the data team during the Obama re-election campaign?

It is difficult to identify just one. Here are some of the most important:

- Data Integration: We have several major database platforms – the national voter file, our proprietary email list, campaign donation history, volunteer list, field contact history, etc. How do we integrate these and use a unified dataset to inform campaign decisions?
- Online/Offline: How do we encourage online activists to take action offline and vice-versa? How do we facilitate and measure this activity?
- Models: How do we develop and validate our models about what the electorate is going to look like in November 2012?
- Communications: Our opponents and the press are continually discussing areas in which they say we are falling short. When is it in our interest to push back, when is it in our interest to let them believe their own spin, and what information are we willing to share if we do push back?
- Cost: How do we evaluate everything we do in terms of cost per vote, cost per volunteer hour or cost per staff hour?
- Prioritization: We don’t have enough resources to test everything, model everything and do everything. How do we efficiently allocate human and financial resources?
- Internal Communication, Sales and Marketing: How do we support every department within the campaign (communications, field, digital, finance, scheduling, advertising)? How do we demonstrate value? How do we build relationships? How do we ensure that data and analytics are used to inform decision-making across the campaign?
- Hiring and Training: Where and how do we recruit more than 50 highly qualified analysts, statistical modelers and engineers who are committed to re-electing Barack Obama and willing to move to Chicago for a job that ends on Election Day 2012, requires that they work more than 80 hours a week for months with no vacation in a crowded room with no windows (nicknamed ‘The Cave’), and pays less with fewer benefits than they would earn doing a similar job in the private sector?

Many working within political statistics and analytics say that the incumbent candidate always has a significant advantage with their data effectiveness. Do you think this is the case?

The incumbent has many advantages, including the following:

- Incumbent has data, infrastructure and experience from the previous campaign.
- Incumbent is known in advance – no primary – and can start planning and implementing general election strategy earlier.
- Incumbent is known to voters – there is less uncertainty regarding underlying data and models.

However, the incumbent may also have certain disadvantages:

- Strategy is more likely to be known to the other side because it is likely to be similar to the previous campaign.
- With a similar strategy and many of the same strategists and vested interests as the previous campaign, it could be harder to innovate.

On balance, the incumbent has an opportunity to put herself or himself in a superior position regarding data, analytics and technology. However, it is not necessarily the case that s/he will do so – the incumbent must have the will and the ability to develop and invest in this potential advantage.

When there is no incumbent and there is a competitive primary, it is the role of the national party and other affiliated groups to invest in and develop this data, analytics and technology infrastructure.

How much effect do you think data had on the election result?

The most important determinants of the election result were:

- Having a candidate with a record of accomplishment and policy positions that are consistent with the preferences of the majority of the electorate.
- Building a national organization of supporters, volunteers and donors to register likely supporters to vote, persuade likely voters to support our candidate, turn out likely supporters and protect the ballot to ensure their vote is counted.

Data, technology and analytics made us more effective and more efficient with every one of these steps. They helped us target the right people with the right message delivered in the right medium at the right time.

We conducted several tests to measure the impact of our work on the election result, but we will not be sharing those results publicly.

As an example, however, I can point out that there were times during the campaign when the press and our opponent claimed that states such as Michigan and Minnesota were highly competitive, that we were losing in Ohio, Iowa, Colorado, Virginia and Wisconsin, and that Florida was firmly in our opponent’s camp. We had internal data (and there was plenty of public data, for those who are able to analyze it properly) demonstrating that these statements were inaccurate. If we didn’t have accurate internal data, our campaign might have made multi-million dollar mistakes that could have cost us several key states and the election.


Given the reaction of the public to the NSA and PRISM data gathering techniques, what kind of effect is this likely to have on the wider data gathering activities of others working within the data science community?

Consumers are becoming more aware of what data is available and to whom. It is increasingly important for those of us in the data science community to help educate consumers about what information is available, when and how they can opt out of sharing their information and how their information is being used.

Do you think that after the success of the data teams in the previous two elections that it is no longer an advantage, but a necessity for a successful campaign?

Campaigns have always used data to make decisions, but new techniques and technology have made more data accessible and allowed it to be used in innovative ways.

Campaigns that do not invest in data, technology or analytics are missing a huge opportunity that can help them make more intelligent decisions. Furthermore, their supporters, volunteers and donors want to know that the campaign is using their contributions of time and money as efficiently and effectively as possible, and that the campaign is making smart strategic decisions using the latest techniques.

Filip Fuxa / Shutterstock.com

