HTTP://WWW.ANALYTICS-MAGAZINE.ORG
DRIVING BETTER BUSINESS DECISIONS
JANUARY / FEBRUARY 2016 BROUGHT TO YOU BY:
Deep dive into
data lakes The premise, promise, potential for managing big data
ALSO INSIDE: • Get smart: digital business innovation • Customer lifetime value: new insights • Corporate profile: BNSF Railway • What ISIS fears most: stability
Executive Edge Ernst & Young CAO Chris Mazzei on data analytics’ better half: the human element
INSIDE STORY
Only thing we have to fear
If you and your company haven’t yet taken a dive into a data lake, maybe it’s time to test the waters. In this issue’s lead feature, Sean Martin, founder and chief technical officer of Cambridge Semantics, explains what this relatively new method of managing big data is all about and what’s driving all the excitement concerning data lakes. But dive and swim at your own risk; Martin also details the potential risks. For more about the premise, promise and potential, as well as the rewards and risks of the next great “big data” and analytics innovation, see “Deep dive into data lakes.”
When it comes to risk in today’s world, nothing can match the seemingly intractable problem of international terrorism. ISIS and other terrorist organizations have clearly instilled fear and chaos with their murderous and seemingly random worldwide attacks. While the attacks are strategically insignificant on a national level, let alone a global scale – and we can argue whether rhetoric from political leaders from some of those countries attacked has only served to heighten the fear – perhaps a better question to ask in order to best counter terrorism is: What do terrorists fear most? The answer could be distilled down to a single word: “stability.”
Scott Mann, a retired Army lieutenant colonel, Green Beret and longtime Special
Ops officer, was an architect and original implementer of the Village Stability Operations (VSO) program in Afghanistan. In his book, “Game Changer,” drawing on his on-the-ground experiences from missions in Afghanistan, Iraq, Colombia and other conflict zones, Mann makes the case that “going local” – establishing stable communities on a village-by-village basis in conflict areas – is perhaps the best way to thwart terrorism. Doug Samuelson, himself a seasoned defense analyst, interviewed Mann for the article titled, “Changing the game: How analytics can help defeat violent extremism around the world.”
These two articles bookend the feature section of this issue of Analytics. In between, you’ll find offerings on digital business innovation, estimating customer lifetime value and a profile of BNSF Railway and its operations research and advanced analytics team. In addition, regular columnists Vijay Mehrotra, Rajib Ghosh and Harrison Schramm provide commentary on such diverse topics as the good and bad side of Uber, what 2016 holds for healthcare analytics and predicting Navy football games, respectively. ❙
– PETER HORNER, EDITOR peter.horner@mail.informs.org
CONTENTS
FEATURES
DEEP DIVE INTO DATA LAKES By Sean Martin. The premise, the promise and the potential of a method for managing big data that has drawn widespread attention.
GET SMART: DIGITAL BUSINESS INNOVATION By Haluk Demirkan and Bulent Dal. Smart technologies, services, processes and people add up to smart systems for every sector.
CUSTOMER LIFETIME VALUE By Matthew Lulay. Leveraging predictive analytics adds key new insights for estimating a familiar marketing metric.
CORPORATE PROFILE: BNSF RAILWAY By Amy Casas. Operations research and advanced analytics team helps power rail giant’s success now and in the future.
CHANGING THE GAME By Doug Samuelson. How analytics and village stability operations can help defeat violent extremism around the world.
REGISTER FOR A FREE SUBSCRIPTION: http://analytics.informs.org
INFORMS BOARD OF DIRECTORS
DEPARTMENTS
Inside Story
Executive Edge
Analyze This!
Healthcare Analytics
INFORMS Initiatives
News & Notes
Conference Preview
Five-Minute Analyst
Thinking Analytically
Analytics (ISSN 1938-1697) is published six times a year by the Institute for Operations Research and the Management Sciences (INFORMS), the largest membership society in the world dedicated to the analytics profession. For a free subscription, register at http://analytics.informs.org. Address other correspondence to the editor, Peter Horner, peter.horner@mail.informs.org. The opinions expressed in Analytics are those of the authors, and do not necessarily reflect the opinions of INFORMS, its officers, Lionheart Publishing Inc. or the editorial staff of Analytics. Analytics copyright ©2016 by the Institute for Operations Research and the Management Sciences. All rights reserved.
President Edward H. Kaplan, Yale University
President-Elect Brian Denton, University of Michigan
Past President L. Robin Keller, University of California, Irvine
Secretary Pinar Keskinocak, Georgia Tech
Treasurer Sheldon N. Jacobson, University of Illinois
Vice President-Meetings Ronald G. Askin, Arizona State University
Vice President-Publications Jonathan F. Bard, University of Texas at Austin
Vice President-Sections and Societies Esma Gel, Arizona State University
Vice President-Information Technology Marco Lübbecke, RWTH Aachen University
Vice President-Practice Activities Jonathan Owen, CAP, General Motors
Vice President-International Activities Grace Lin, Institute for Information Industry
Vice President-Membership and Professional Recognition Susan E. Martonosi, Harvey Mudd College
Vice President-Education Jill Hardin Wilson, Northwestern University
Vice President-Marketing, Communications and Outreach Laura Albert McLay, University of Wisconsin-Madison
Vice President-Chapters/Fora Michael Johnson, University of Massachusetts-Boston
INFORMS OFFICES
www.informs.org • Tel: 1-800-4INFORMS
Executive Director Melissa Moore
Meetings Director Laura Payne
Director, Public Relations & Marketing Jeffery M. Cohen
Headquarters INFORMS (Maryland) 5521 Research Park Drive, Suite 200 Catonsville, MD 21228 Tel.: 443.757.3500 E-mail: informs@informs.org
ANALYTICS EDITORIAL AND ADVERTISING
Lionheart Publishing Inc., 506 Roswell Street, Suite 220, Marietta, GA 30060 USA Tel.: 770.431.0867 • Fax: 770.432.6969
President & Advertising Sales John Llewellyn john.llewellyn@mail.informs.org Tel.: 770.431.0867, ext. 209
Editor Peter R. Horner peter.horner@mail.informs.org Tel.: 770.587.3172
Assistant Editor Donna Brooks donna.brooks@mail.informs.org
Art Director Alan Brubaker alan.brubaker@mail.informs.org Tel.: 770.431.0867, ext. 218
Advertising Sales Sharon Baker sharon.baker@mail.informs.org Tel.: 813.852.9942
Aileen Kronke aileen@lionhrtpub.com Tel.: 770.431.0867, ext. 212
EXECUTIVE EDGE
Data analytics’ better half Why investing in the human element of analytics pays off.
BY CHRIS MAZZEI
For years, companies have spent millions of dollars on data analytics, but many have not seen a breakthrough return on this investment. The problem? Despite massive spending on technology to produce analytics, these companies have spent relatively little on their ability to consume analytics – what we call the “human element of analytics.”
Business executives acknowledge that this disconnect is at the heart of the data analytics conundrum. The latest EY/Forbes Insight study, “Analytics: Don’t Forget The Human Element” [1], highlights many of the obstacles to making analytics more actionable, and emphasizes what leaders are doing most effectively to achieve analytics excellence.
The study surveyed 564 senior leaders and found that a majority of respondents do not have an effective business strategy for competing in a digital, analytics-enabled world. However, there is a segment of executives, the top 10 percent of survey participants, that is achieving a higher level of maturity and seeing competitive advantage.
The top 10 percent of participants identified in the survey typically meet two criteria:
• They use data analytics in their decision-making “all of the time” or “most of the time.”
• They report a “significant” shift in their company’s ability to meet competitive challenges.
THE HUMAN FACE OF ANALYTICS
Investing in new technology and tools, data quality and advanced analytics skill sets is common to many companies. After all, these elements are critical for the “production” of analytics. But it is only half of the equation. What is often missing is the behavioral alignment required to move from insights to action to value. This includes key components such as culture, organizational processes, skills of the business “users” and individual employees’ incentives. These are the capabilities required to “consume” analytics throughout the organization.
Finding ways to embed analytics into business processes at the point where decisions are made is essential to driving true value in analytics. It is also where organizations find the biggest challenge.
THE ORGANIZATIONAL LEVEL
Success with analytics requires an organizational commitment to make productive use of data that is integral to the business strategy. Companies demonstrate this organizational alignment in three ways:
1. Strategy: Analytics is central to the business strategy of leading enterprises, but that does not mean executives should
be asking, “What is my analytics strategy?” They should be asking, “What is my business strategy to compete in a digital, analytics-enabled world?” A slight majority (54 percent) of executives with leading analytics organizations report that analytics is central to their overall business strategy, versus approximately 1 in 10 of respondents in the remaining 46 percent of enterprises who are “lagging” or “learning.”
2. Leadership and culture: Excellence in big data and analytics requires strong leadership. Close to two-thirds (64 percent) of executives in the top 10 percent of enterprises indicate they “have a dedicated C-level executive – a chief analytics officer (CAO) – overseeing their data and analytics programs and engagements.” In contrast, only two in five (40 percent) of the lagging organizations have a designated CAO.
However, it must be noted that effective analytics leaders are a rare breed. In many ways, they need to be renaissance professionals, with in-depth knowledge of the business, analytics and statistics, while also being innovators, network builders and leaders of teams.
In addition to the analytics leadership role, there are five challenges that the CEO and C-suite executives must address to build an analytics-enabled culture:
• Delegate an influential executive to lead the enterprise-wide analytics program.
• Use analytics to challenge existing mental models in the leadership team.
• Be clear on the critical business objectives and quantifiable measures for success.
• Navigate the inevitable conflicts between established institutions or executives that analytics creates.
• Foster collaboration within the C-suite to set an example for the rest of the organization.
• Tolerate failure as part of using analytics to learn and innovate.
Figure 1: Leading enterprises have aligned their organizations around data and analytics.
3. Organization and processes: Aligning analytics delivery and business requirements is crucial to enabling an organization to consume analytics. The survey found that the top 10 percent of organizations had processes in place to
connect people and analytics within their organizations. More than half (56 percent) of these top companies have already aligned enterprise, department and lines-of-business data and analytics groups, compared with just 13 percent of the rest of the organizations.
THE INDIVIDUAL LEVEL
Strong leadership and the right organizational and business processes increase the likelihood that a company will successfully be able to leverage analytics. But to achieve a positive impact, analytics must be used at the point where decisions are made – by individuals. There are three factors to this:
1. Decision bias: Companies need to provide the training to help individuals recognize decision biases – the psychological assumptions that often lead to poor
decision-making. By being more aware of this subconscious thinking, employees can better interpret and act on the insights from analytics.
2. Capabilities: For analytics to create value, individuals within an organization must be able to understand and use the data and insights. First and foremost, this comes down to training. In the survey, we found that the top 10 percent of firms are more likely than their peers to conduct on-site seminars or workshops, enroll employees in off-site education programs or coaching, and provide mentoring by data and analytics professionals or leaders. But this kind of education is about more than what an individual knows; it also establishes an analytics mindset within the organization. As a result, everyone becomes more comfortable with analytics, which removes the fear factor when switching from judgment-based to analytics-based decision-making.
3. Incentives: Incentives, rewards and measurement need to be aligned with the actions suggested from the analytics-based insights. According to the survey, the top 10 percent understand the importance of motivation, with 40 percent of them having aligned incentives to desired change from analytics, compared with 23 percent of their peers. More than two-fifths (42 percent) of the top 10 percent also offer greater opportunities for promotion and advancement to individuals.
CONCLUSION
All companies will need to have analytics as a core competency in order for business decisions to be informed by data. End users of the analytics, whether they are doctors, marketing professionals, factory workers, customer service representatives or financial professionals, will enhance their decision-making with the help of analytics. But this cannot happen without recognizing that the consumption of analytics is as important as the production.
Now is the time to ask if your investment in producing data-driven insights is delivering a competitive advantage. If not, ask yourself if your organization is effectively consuming analytics. And as you look forward to what analytics will deliver for your organization in 2016, do not forget the human element. ❙
Chris Mazzei is the global chief analytics officer (CAO) and global Analytics Center of Excellence (COE) leader at Ernst & Young, LLP, where he is responsible for the overall development and go-to-market strategy for EY’s various analytics businesses, as well as working with clients to transform core services through the use of analytics.
REFERENCES
1. http://www.forbes.com/forbesinsights/ey_data_analytics_2015/index.html
2. Figure 1 was taken from the EY/Forbes Insight study, “Analytics: Don’t Forget The Human Element.”
ANALYZE THIS
Uber: good, bad side of automated free markets
BY VIJAY MEHROTRA
Time to roll. I’ve got to get to the other side of town, quickly, for a meeting. I pull the phone out of my pocket, click a single icon and the dot starts to flash: That’s me! They’re looking for me! Soon thereafter a detailed map appears with my location clearly indicated: They found me! With another click, a message goes out across the network, and within seconds information about my ride – the driver’s name, cell phone number, car make and model, license plate and estimated time of arrival – appears on my screen: They are coming to get me! While I wait, I watch the driver’s progress on my map, and if I need to clarify the pick-up details, I just hit another button to call the driver to sort things out. Within minutes, I’m picked up in a clean and comfortable vehicle, driven to my destination via a smart GPS-identified optimal route, and released as soon as I arrive (payment is handled automatically via credit card).
That’s Uber in action. Feels like magic, especially compared to the faith-based and stressful exercise of calling a dispatcher or trying to hail a cab (especially here in San Francisco, where there has always been a terrible shortage of traditional taxis [1]), then wondering whether the driver is giving me the runaround in order to jack up my fare, and finally fumbling around in my wallet looking for cash and hoping the driver has the requisite change.
Beyond the convenience, I’m impressed and inspired by the way that several sophisticated technologies have been seamlessly stitched together by Uber. Among other things, the Uber experience depends on smartphone hardware and software, 21st century telecommunications infrastructure, increasingly sophisticated GPS systems, payment processing platforms and good, old-fashioned e-mail. The Uber platform – elegantly designed, smartly integrated – indeed makes the user feel empowered, lending some emotional truth to the company’s “everyone’s private driver” tagline [2]. So I am both joyful and amazed every time my Uber car pulls up.
At the same time, there is so much about Uber that I intensely dislike. For starters, the company’s founder and CEO Travis Kalanick has a well-chronicled reputation for arrogance and misogyny [3]. The company is known for its
long hours, high pressure, lack of work/life balance and utmost secrecy. None of this is unique to Uber, but there’s something about this particular San Francisco-based company that embodies the way that the tech industry and culture seem to have swallowed much of San Francisco almost overnight, with many of the diverse and creative people that inspired me to move here in the first place now priced out of an overheated real estate market that seems to be dominated by youngsters flush with tech dollars – all of whom seem to be constantly riding around in Uber cars.
But Uber’s reach extends far beyond its San Francisco Bay Area home base, as the company is constantly expanding. Its basic approach is to thumb its nose at local laws until eventually managing to get them changed in an Uber-friendly direction. As Tracey Lien wrote in a recent Los Angeles Times article, “It [Uber] punches itself into markets and spends big on advance teams, lawyers and lobbyists to fight opposition and gain a foothold in markets around the world” [4]. Uber’s ambitions are vast, and its hiring of former Obama campaign strategist David Plouffe reflects the business importance of its constant combative campaigning.
Meanwhile, Uber drivers – the people who not only do the actual transporting of passengers but also are required to invest their own capital to purchase and
operate the individually owned vehicles that collectively comprise Uber’s fleet – are seeking to be treated as employees in California [5] (rather than independent contractors) and have been granted the right to unionize in Seattle [6]. Recently, Uber’s unilateral decisions to decrease its prices while also increasing its share of total revenues have led to sharp drops in income for its drivers. Its practices for screening the drivers in its network have also been under scrutiny [7].
Uber’s growth has been phenomenal. Though the company is less than six years old, it is now possible to hail a ride in more than 150 cities around the United States and 68 countries around the world [8]. Nor are the company’s ambitions limited to moving passengers. To date, Uber has experimented with a variety of new pilot projects that leverage its platform and driver network to provide drugstore items (UberESSENTIALS), restaurant meals (UberEATS), urgent package deliveries (UberRUSH) and even flu shots (UberHEALTH). The company, it appears, wants to be the Amazon.com of in-person service delivery. Not yet six years old and still privately held, Uber was recently valued at somewhere north of $50 billion.
Along with Uber, a number of other companies are developing specialized software platforms for matching buyers to sellers in many different industries,
including food delivery, in-home services, package shipment, elder care, overnight lodging, shopping and administrative work. From my perspective, these companies are market makers seeking to optimize the market dynamics in their own favor and service delivery networks seeking to operate cost effectively on a large-scale basis to capture customers, generate profits and crush potential competitors. Generating an expanding and relentless stream of proprietary operational data, these young firms provide analytics professionals with tremendous opportunities to put our talents to use. Indeed, in addition to the army of data scientists that it employs, Uber’s recent wholesale hiring of 40+ researchers from Carnegie Mellon’s famed Robotics Institute [9] is a vivid illustration of the value of specialized technical skills in this growing slice of the business world. But be aware: This so-called “gig economy” in which smart software platforms efficiently match workers with tasks represents a major disruption at many different companies. As tech heavyweight Tim O’Reilly wrote prior to his recent “What’s the Future of Work?” Conference [10], “every industry and every organization will have to transform itself in the next few years” as a result of the increasing number of jobs that can be defined, transmitted and/or delivered via integrated platforms
like Uber’s. We now have an estimated 53.7 million freelance workers in the United States [11].
Analytics professionals will continue to play a big role in this revolution, so it is important for us to consider not just its technical challenges but also its social consequences. Marina Gorbis, executive director of the not-for-profit think tank The Institute for the Future, calls these platforms “new operating systems” for getting work done that are “based on always-on Internet, mobile devices, social media, sensors and geolocation technologies.” She also warns that these economic platforms “could also be riddled with catastrophic bugs, pushing large swaths of the population to labor at subsistence levels, with no benefits and little predictability over their earning streams” [12].
Personally, I’m still haunted by Jaron Lanier’s ominous warnings about Siren Servers [13]. Like Lanier, I don’t believe that highly automated and unfettered free markets for all kinds of services are inherently optimal. As freelance business writer Erik Sherman recently pointed out, there is “a systemic imbalance in favor of the company that can ignore or avoid regular conditions of doing business” [14], which sounds a lot like Uber when it enters a new market. I talk frequently with my MBA students and alums about the potential downside of concentrating
too much power in too few online procurement and delivery channels.
Yet there’s no real case for defending the traditional taxi industry either, certainly not here in San Francisco [15] and probably not in many other places. As Uber’s relentless expansion into new markets continues, expect to see more battles with local taxi companies and drivers [16] – and more passengers getting on the Uber app.
Sorry, gotta go. My Uber just pulled up. ❙
Vijay Mehrotra (vmehrotra@usfca.edu) is a professor in the Department of Business Analytics and Information Systems at the University of San Francisco’s School of Management and a longtime member of INFORMS.
NOTES & REFERENCES
1. My friend Brad Newsham, a former San Francisco taxi driver, provides a nice description of this situation at http://www.bradnewsham.com/articles/why_so_hard.shtml
2. https://vimeo.com/58800109
3. See for example http://www.modernluxury.com/san-francisco/story/the-smartest-bro-the-room
4. http://www.latimes.com/business/la-fi-0822-uber-revenue-20150822-story.html
5. http://recode.net/2015/06/17/uber-drivers-are-employees-not-contractors-california-labor-commission/
6. http://www.nytimes.com/2015/12/15/technology/seattle-clears-the-way-for-uber-drivers-to-form-a-union.html
7. http://www.fastcompany.com/3050172/tech-forecast/the-truth-about-ubers-background-checks
8. https://www.uber.com/cities
9. http://www.nytimes.com/2015/09/13/magazine/uber-would-like-to-buy-your-robotics-department.html
10. http://conferences.oreilly.com/nextcon/economy-us-2015
11. “Freelancing in America: 2015,” accessible online at https://www.upwork.com/i/freelancinginamerica2015/
12. https://medium.com/the-wtf-economy/designing-a-new-operating-system-for-work-28d1dc3e0f64?imm_mid=0dde51&cmp=em-na-na-na-newsltr_econ_20151218#.vtbs6vot4
13. http://www.analytics-magazine.org/july-august-2014/1069-analyze-this-dark-side-of-the-digital-world
14. http://www.forbes.com/sites/eriksherman/2015/12/10/the-gig-economy-depends-on-unequal-treatment-of-businesses
15. Even before Uber’s ascent, the San Francisco taxi driver community had been hit by “friendly fire” from City Hall. To learn more, see http://ww2.kqed.org/news/wp-content/uploads/sites/10/2013/01/NewshamArticle.pdf
16. For some recent highlights, see https://www.popularresistance.org/anti-uber-protests-around-the-world/
HEALTHCARE ANALYTICS
Four mega trends to watch in 2016
BY RAJIB GHOSH
It’s hard to believe that 2015 and half of the second decade of the new century are over. Many industries have changed or were disrupted during this time. Many more will share the same fate as we move through the decade. We have seen many changes in healthcare too, albeit at a slower pace than other industries such as mobile or transportation. Nonetheless, changes in 2015 caused the otherwise conservative and closed healthcare industry to change direction. Healthcare has become data- and analytics-driven in almost all parts of the value chain. As a direct consequence of the Affordable Care Act (ACA), traditional business models have changed. In the coming years those changes are expected to continue. In this article I focus on four trends that will drive healthcare analytics in 2016 and beyond.
NO. 1: CONSUMERISM IN HEALTHCARE HAS BEGUN
Since the implementation of ACA in 2010, pundits have predicted that consumers would have bigger voices in the healthcare industry. We didn’t see much progress in the initial years of ACA. That is changing. More and more Americans are now buying high-deductible health plans. Enrollment in such plans doubled since
2010 to about a quarter of all American workers with health plan benefits in 2014. This forces consumers to pay more for healthcare as out-of-pocket expenses. Data from the Commonwealth Fund show that out-of-pocket household expenses for healthcare, including premiums and deductibles, doubled to 9.6 percent of household income between 2003 and 2013. This is driving consumer demand for the ability to compare gross and net prices for healthcare services. In theory, price transparency may allow consumers to make better decisions about their healthcare, and price competition should drive costs down, as it has in other industries. Care delivery organizations should scrutinize their costs, rethink their delivery workflow and manage their revenue cycle well to keep costs down and attract more clients. Whether that will happen remains to be seen.
At the same time, consumers are increasingly gravitating toward wearables to make self-care easier. A recent IDC report shows that worldwide wearable shipments have grown 163 percent since 2014. Both trends have increased the need for better data analytics.
NO. 2: PROVIDERS ARE TAKING MORE RISK FOR OUTCOMES AND CONSOLIDATING
Results from the initial accountable care organizations were quite mixed. The
Centers for Medicare & Medicaid Services (CMS) and some private health plans have pushed delivery organizations to accept more risk for population health management. Provider organizations, feeling this price pressure from public and private plans, are trying to consolidate in many markets to retain pricing power. This trend became quite pervasive in 2015. Combinations of hospitals with physician groups are growing. Kaiser is leading the way; its CEO, Bernard Tyson, said in a recent interview that Kaiser’s model is the best way to deliver care for patients and populations. To steer power away from payer organizations, providers are also offering their own plans and trying to adopt Kaiser Permanente-like integrated delivery network (IDN) models.
To counter that strategy, we saw a mega-merger trend among payers in 2015 as well. Anthem Blue Cross and Cigna, Humana and Aetna, United Healthcare and Catamaran are just a few examples. The business drivers for most mergers are cost containment and defending pricing power. Mega mergers create opportunities to combine large data sets with analytics to have a bigger impact on delivering better population health management.
NO. 3: PREDICTIVE ANALYTICS IN HEALTHCARE FINALLY ARRIVED
Some 40 percent of healthcare executives reported more than 50 percent
data volume increase in 2014 according to a report by Manatt, Phelps and Phillips, a prominent U.S. law and consulting firm. As the data sets become bigger, health systems and payers are taking advantage of predictive analytics. In 2014, 47 percent of managed care organizations (MCOs) possessed predictive analytics tools. By 2016 the number is expected to rise to 80 percent. That’s a significant jump. Healthcare organizations are also adopting the insight that social determinants of health contribute more to the wellbeing of a patient than medical issues do. In 2016, social determinants of health, along with usual suspects such as drug use and emergency room admissions data, will drive predictive models for identifying cost risks of population cohorts.
NO. 4: CAPITATED PAYMENT WILL DRIVE STAKEHOLDERS TOWARD ANALYTICS-DRIVEN POPULATION HEALTH MANAGEMENT
One delivery organization can’t undertake population health management unless it is an integrated delivery network. A patient seldom visits just one care delivery organization during a disease life cycle. Access issues and the insurance exchange marketplace will support patient mobility in 2016. As a result, we can expect non-competing healthcare organizations to partner with each other
to manage the health of a population. Pharmaceutical companies may follow suit and become partners in care with healthcare organizations. Government payers, i.e., Medicare and Medicaid, are fast moving toward capitated payment and value-based purchasing models in which outcomes will be measured and rewarded. To be successful in this new model, data and analytics will become as important as providers, and soon a data analyst will figure in the care teams within provider organizations alongside physicians, nurses and case managers.
2016 marks the beginning of the second half of this decade, and it is expected to be transformative for the healthcare industry overall. It is also a presidential election year. If politics do not get in the way of this fast-moving train of “transformation,” we should buckle up for more disruptive changes. ❙
Rajib Ghosh (rghosh@hotmail.com) is an independent consultant and business advisor with 20 years of technology experience in various industry verticals where he had senior-level management roles in software engineering, program management, product management and business and strategy development. Ghosh spent a decade in the U.S. healthcare industry as part of a global ecosystem of medical device manufacturers, medical software companies and telehealth and telemedicine solution providers. He’s held senior positions at Hill-Rom, Solta Medical and Bosch Healthcare. His recent work interests include public health and the field of IT-enabled sustainable healthcare delivery in the United States as well as emerging nations. Follow Ghosh on Twitter @ghosh_r.
INFORMS INITIATIVES
aCAP, pro bono & Data Science Bowl
CAP NEWS: INFORMS TO LAUNCH ASSOCIATE PROGRAM
INFORMS will launch an Associate Certified Analytics Professional (aCAP) program in 2016. Aimed at young professionals and career changers, the aCAP program allows individuals to apply for and take the CAP® exam and hold the aCAP designation until they’ve earned the requisite work experience to apply for the CAP credential.
If you’ve already earned CAP certification, you may be interested in serving as a CAP ambassador. INFORMS will soon provide CAP holders with information regarding the ambassador program and how you can help INFORMS increase the value and visibility of CAP certification.
For those interested in taking the CAP exam, INFORMS offers online, computer-based testing so you can test on your schedule, as well as paper-and-pencil exams at selected sites. To access any exam, you must first apply and be approved for the CAP examination. Eligible veterans can use their GI Bill to reimburse the exam fee.
For more information, visit: www.certifiedanalytics.org.
Who is a CAP?
INFORMS recently queried its applicant pool (includes both CAP holders and those who have applied for certification) and came up with a snapshot illustrated by the following graphs:
INFORMS SUPPORTS DATA SCIENCE BOWL
INFORMS is once again a partner in the National Data Science Bowl, an online, three-month-long (ending March 14, 2016) competitive event sponsored by Booz Allen Hamilton and Kaggle. Held in conjunction with the National Heart, Lung and Blood Institute (part of the National Institutes of Health), this year’s challenge is to develop an algorithm to empower doctors to more easily diagnose dangerous heart
conditions and help advance the science of heart disease treatment. Declining cardiac function is a key indicator of heart disease. Doctors determine cardiac function by measuring end-systolic and end-diastolic volumes (i.e., the size of one chamber of the heart at the beginning and middle of each heartbeat), which are then used to derive the ejection fraction (EF). EF is the percentage of blood ejected from the left ventricle with each heartbeat. Both the
volumes and the ejection fraction are predictive of heart disease. While a number of technologies can measure volumes or EF, magnetic resonance imaging (MRI) is considered the gold standard test to accurately assess the heart’s squeezing ability. The challenge with using MRI to measure cardiac volumes and derive ejection fraction, however, is that the process is manual and slow. A skilled cardiologist must analyze MRI scans to determine EF. The process can take up to 20 minutes to complete – time the cardiologist could be spending with his or her patients. Making this measurement process more efficient will enhance doctors’ ability to diagnose heart conditions early, and carries broad implications for advancing the science of heart disease treatment.
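To make the arithmetic concrete: ejection fraction is conventionally computed as the stroke volume (end-diastolic volume minus end-systolic volume) divided by the end-diastolic volume, expressed as a percentage. The short Python sketch below illustrates that calculation only; the function name and the sample volumes are hypothetical and are not taken from the competition data.

```python
def ejection_fraction(edv_ml: float, esv_ml: float) -> float:
    """Return the ejection fraction, in percent.

    edv_ml: end-diastolic volume (chamber at its fullest), in milliliters.
    esv_ml: end-systolic volume (chamber at its emptiest), in milliliters.
    """
    if edv_ml <= 0:
        raise ValueError("end-diastolic volume must be positive")
    if not 0 <= esv_ml <= edv_ml:
        raise ValueError("end-systolic volume must lie between 0 and the end-diastolic volume")
    stroke_volume = edv_ml - esv_ml  # blood ejected in one heartbeat
    return 100.0 * stroke_volume / edv_ml


if __name__ == "__main__":
    # Illustrative volumes only (hypothetical patient).
    ef = ejection_fraction(edv_ml=120.0, esv_ml=50.0)
    print(f"Ejection fraction: {ef:.1f}%")
```

In the competition setting the difficult part is estimating the two volumes from the MRI images; once they are known, deriving EF is a single division.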
This year’s Data Science Bowl challenges individuals and teams to create an algorithm to automatically measure end-systolic and end-diastolic volumes in cardiac MRIs after examining MRI images from more than 1,000 patients. The data set was compiled by the National Institutes of Health and Children’s National Medical Center and is an order of magnitude larger than any cardiac MRI data set released previously. With it comes the opportunity for the data science community to help transform how heart disease is diagnosed. The competition offers an award of $200,000 to the winner.
For more information, visit www.datasciencebowl.com/ and watch the tutorial video (https://youtu.be/dFu_5T0ODrM).
INFORMS TO LAUNCH ‘PRO BONO ANALYTICS’ PROGRAM
INFORMS, the leading professional association in analytics and operations research, recently announced it is launching a new initiative – “Pro Bono Analytics” – in an effort to connect analytics experts with non-profit organizations seeking to improve how they achieve greater results by leveraging data and information. With the Pro Bono Analytics initiative, non-profit organizations have the opportunity to work with analytics professionals on a volunteer basis to help
solve challenges and create new opportunities for success with the scientific process of transforming data into insight. The initiative matches INFORMS’ analytics professional volunteers with non-profit organizations that would benefit from advanced analytics and operations research training and techniques. By focusing on current analytics issues as they relate to non-profit organizations, the Pro Bono Analytics team will be able to help solve even the most complex issues. ❙
NEWS & NOTES
Edelman, queues, STEM & survey
INFORMS ANNOUNCES 2016 EDELMAN AWARD FINALISTS
INFORMS has named six organizations, representing real-world applications of operations research and advanced analytics, as finalists for the 2016 Franz Edelman Award competition. The winner will be announced at the INFORMS Conference on Business Analytics & Operations Research in Orlando, Fla., in April following a daylong series of presentations before a panel of judges.
The finalists include:
• 360i for “360i’s Digital Nervous System”
• BNY Mellon for “Transition State and End State Optimization Used in the BNY Mellon U.S. Tri-Party Repo Infrastructure Reform Program”
• Chilean Professional Soccer Association (ANFP) for “Operations Research Transforms Scheduling of Chilean Soccer Leagues and South American World Cup Qualifiers”
• The New York City Police Department (NYPD) for “Domain Awareness System (DAS)”
• UPS for “UPS On Road Integrated Optimization and Navigation (Orion) Project”
• US Army Communications Electronics Command (CECOM) for “Bayesian Networks for US Army Electronics Equipment Diagnostic Applications: CECOM Equipment Diagnostic Analysis Tool, Virtual Logistics Assistance Representative”
The finalists were chosen after a rigorous review by verifiers, all of whom have led successful analytics projects. The verifiers come from organizations such as Verizon Wireless, HP, Turner Broadcasting, Carnegie Mellon University, PricewaterhouseCoopers, SAITECH, Princeton Consultants, University of Chicago and University of Maryland.
Now in its 45th year, the Franz Edelman Award is the world’s most prestigious recognition for excellence in developing and applying advanced analytical methods to help organizations solve complex problems or create new opportunities that result in highly impactful outcomes for the economy and society.
ART, SCIENCE AND PSYCHOLOGY OF MANAGING LONG QUEUES
As a world-renowned expert in queueing theory, MIT professor Richard Larson, aka “Dr. Queue,” knows all about waiting in lines. So it’s no surprise that when the Washington Post’s Wonkblog reporter Ana Swanson needed an expert source for her story on the art and science of managing long queues, she called on Dr. Queue.
According to Larson, people can expect to spend one to two years of their lives waiting in line, most of it stuck in traffic. But those five-minute waits in the
checkout line at the supermarket, stuck behind someone talking on their smartphone while fumbling with a pile of coupons and dollar bills to give to the checker, can be just as annoying.
As Swanson notes in the article, waiting in line not only irritates the customer, it’s bad for business. “A long and unpleasant wait can damage a customer’s view of a brand, cause people to leave a line or not enter it in the first place (what researchers respectively call ‘reneging’ and ‘balking’), or discourage them from coming back to the store entirely,” she writes.
Businesses, of course, realize this and come up with various ways to solve the problem, starting with good, old-fashioned distraction such as magazines in the doctor’s waiting room and near the supermarket checkout lines.
Larson, a past president of INFORMS, considers Disney the “undisputed master” of designing queues that are entertaining and that create anticipation for the ride. “In my book, [Disney is] number one in the psychology and in the physics of queues,” Larson tells the Post.
Writes Swanson: “The design is so successful that parents with young children can happily stand in line for an hour for a four-minute ride – a pretty remarkable feat, [Larson] points out. And of course, the capacity of the line and the ride are carefully calculated to balance customer satisfaction with profits.”
The complete article, “What really drives you crazy about waiting in line (it actually isn’t the wait at all),” appears on the Washington Post’s Wonkblog.
STEM MAJORS WITH THE BEST VALUE
Not surprisingly, WorldWideLearn.com’s updated list of the “STEM Majors With the Best Value for 2015” is loaded with majors common among members of the analytics community. The list includes information technology (No. 1), computer programming (No. 3), computer and information science (No. 5), engineering (No. 7), data modeling (No. 9), computer systems analysis (No. 11), mathematics (No. 18), management science (No. 21), informatics (No. 22), petroleum engineering (No. 23) and physics (No. 25).
WorldWideLearn.com analyzed 122 majors belonging to the STEM disciplines. To be included in the rankings, each major had to meet at least one of the following criteria:
• Be on the 2012 STEM-Designated Degree Program List from the Department of Homeland Security
• Be matched by the National Center for Education Statistics to a job on the Bureau of Labor Statistics’ list of STEM occupations
Ranking criteria included educational availability, educational affordability, earnings and employment opportunity.
GAPS BETWEEN TEACHING, PRACTICE OF ADVANCED ANALYTICS
Students of advanced analytics who aspire to leave academia and succeed quickly in business and government arenas should assess their approaches and tools in the classroom and their research, according to an informal Princeton Consultants survey conducted at the 2015 INFORMS Annual Meeting in Philadelphia. The survey revealed notable gaps between what students learn, what professors teach and what practitioners need.
Irv Lustig of Princeton Consultants, a longtime INFORMS member and a former employee of CPLEX, ILOG and IBM, reported the following findings:
• Students must learn more about building applications with modern technologies so they have the skills needed by the practice community.
• Professors are, for the most part, not teaching the programming languages used by students or in practice. Students and practitioners are using
both Python and R, both of which are used heavily in the data science community, but faculty members are not adapting their courses to teach these new languages.
• With few exceptions, there seems to be misalignment between the use of modeling languages in academia and the use of modeling languages in practice.
The survey’s 72 self-selected participants, all of whom were onsite at the INFORMS Annual Meeting, comprised college professors (44 percent), students (32 percent) and practitioners (24 percent). The non-scientific “snapshot” survey was designed to compare the responses of these three groups about solvers, programming languages, modeling languages and software development based on the participants’ last two years of experience. ❙
DEALING WITH BIG DATA
The ascendency of data lakes
The premise, the promise, the potential of a new method for managing big data.
BY SEAN MARTIN
The data lake concept occupies a central place of prominence in contemporary big data initiatives. The past two years have produced numerous headlines, vendor solutions (including repackaging of former solutions) and enterprise use cases for the utility of this centralized approach for accumulating, analyzing and actuating big data. The fervor for this method of managing big data is based on a simple premise that promises value for organizations
regardless of size or vertical industry. Data lakes provide a singular repository for storing all data – unstructured, semi-structured and structured – in their native formats, granting access and insight to all without lengthy IT preparation.
Moreover, the data lake movement is largely spurred by adoption rates for Hadoop. As Hadoop’s presence increases, its function as an integration hub for all data delivers more credence and traction to the notion of data lakes. The data lake concept may be relatively new, but the association
of Hadoop and big data is nearly as ubiquitous as big data itself. The combination of these two factors, Hadoop’s deployment as a data lake and the storage and access benefits this method produces, is largely responsible for the widespread attention data lakes have garnered. A recent post from Gartner reveals that data lake interest is “becoming quite widespread.” Forbes indicates that “one phrase in particular has become popular for the massing of data into Hadoop, the ‘Data Lake.’”
Most of all, the intrigue behind the data lake phenomenon pertains to the potential of these centralized repositories. In
a world in which organizations are confronted with new and differing technologies, tools and platforms daily, data lakes offer something of an oasis: a one-stop hub for all aspects of big data, from initial ingestion to analytics-based action, that makes big data more manageable and its value easier to demonstrate.
DATA LAKE DRIVERS
Big data is the principal driver of data lakes. Organizations realize the business value that collecting large quantities of data engenders; they understand that exploiting this opportunity will give them an advantage over competitors who do
not. The most immediate advantages of this architecture involve costs for storage and physical infrastructure. Data lakes enable organizations to store massive amounts of data at lower costs than were previously available. Additionally, this architecture is extremely scalable and suited for daily ingestion of petabytes. Alternative methods of storing such data present greater upfront costs than open source Hadoop does. Data lakes also enable organizations to simplify their infrastructure; their comprehensive nature decreases the need for silos and data marts. Consequently, there is less physical infrastructure, which translates to cost benefits associated with managing and maintaining a single repository instead of multiple ones.
Another driver for data lakes is the increased availability and accessibility they deliver. This advantage is best measured in temporal terms. Data lakes dispel the lengthy data preparation processes that typify the involvement of IT departments with other options for managing big data. Instead, users across the enterprise can access data from the same place with a degree of immediacy that is vital to the speed at which big data is absorbed. That accessibility correlates to an availability of data that is unparalleled with traditional database life cycles.
Organizations can encompass data from different sources (with varying schema and structure, or lack thereof) that utilize multiple technologies (cloud, social, mobile, etc.). Additionally, they can do so to suit the needs of individual business units and across vertical industries, if need be.
Nonetheless, the driver that is likely to make data lakes mainstream is the perception of open source technologies. Hadoop’s salience is directly related to the burgeoning familiarity, acceptance, and penetration of open source technologies. Granted, adoption rates for Hadoop reflect many of the foregoing drivers for data lakes. However, its ubiquity also makes it easier to attain upper-level management support for the data lake concept, since many executives already associate big data with Hadoop.
The notion of dark data, and the realization that elucidating such data improves big data’s ROI, also contributes to the ascendency of data lakes. Positioning an organization’s entire data assets into a single place provides the first step in attaining insight, and then value, from them comprehensively. With the majority of the world’s newly generated data involving unstructured and semi-structured forms, data lakes are poised as the optimal environment to parse and utilize such data in accordance with structured data for a holistic overview of data assets.
COMPARATIVE ADVANTAGES
A comparison between data lakes and traditional repository methods for big data illustrates a number of pivotal advantages and disadvantages – for both. Data lakes are arguably displacing data warehousing as the de facto means of storing data and facilitating analytics. Multiple facets of data warehouses render them unsuitable for the quantities and varieties of big data that are required to truly profit from this technology. The most readily apparent are storage costs, which are exorbitant compared to those for Hadoop.
The increase in sources and types of big data merely exacerbates the storage issue, and makes the warehouse approach particularly unwieldy. This fact is compounded by the time consumption of warehousing and the traditional BI it was designed to support. The business is constantly waiting for IT to model, prepare and transform data before any analysis and reporting is performed, which decreases the value of the velocity at which big data is ingested and consumed. Therefore, warehousing is incongruent with the current self-service movement within data management, which seeks to empower the business and give it more control over its data.
COMPARATIVE DISADVANTAGES
Data lakes rectify the cost concerns for storage and the rapidity of access associated with warehousing time-sensitive big data. However, these benefits become disadvantageous without critical aspects of data management that require
more than just depositing data into Hadoop or NoSQL stores; failing to implement them frequently results in these points of chaos:
Lack of context and meaning: Large data volumes, disparate data types and big data sources are collected in data lakes without any sort of context or readily discernible meaning. Without those conventional, lengthy preparation processes facilitated by IT, end users (or data scientists) are left to implement them as best they can, oftentimes without formal training in this critical prerequisite. The result obscures the data’s meaning and makes data discovery extremely difficult.
Inconsistent data: The jumbled data in data lakes lack semantic and metadata consistency, creating further ambiguity about data’s meaning, purpose and relation to other data. Subsequently, there are considerable deleterious effects for …
Data governance: The unrestrained approach of unmanaged data lakes considerably worsens some of the hallmarks of data governance including role-based access to data, security concerns, and transparent data lineage and traceability.
Another serious problem that implementers of early data lakes struggle to address is the scarcity of the data scientist and big data manipulation or even big data programming skills that are usually
needed to extract value from or even obtain clean access to the data residing in the data lake. As inflexible and cumbersome as they are, data warehouses can draw on an army of DBAs armed with a host of mature data wrangling technologies, and will generally produce reliable reports on a regular schedule. In many cases data lakes can rapidly resemble a “Wild West” for data.
MAXIMIZING DATA LAKE UTILITY
The data lake concept fulfills its promise via smart data lakes that leverage semantic models and graphs to eliminate the aforementioned points of disorder while adding advantages such as drastically improved business end-user self-service capability. Semantic models (based on ontologies) provide concise descriptions of data and are visually represented in a semantic graph. These ontologies clarify data and enhance context by denoting just what the data mean, regardless of source, structure, type or schema. The visual representation of data in a graph illustrates their relationships to one another, providing further context and the foundation for application and analytics usage. These definitions and relationships are digestible for the business and other end users, which expedites their access to and deployment of big data.
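As a rough illustration of how a semantic graph describes data, the sketch below stores a few subject-predicate-object statements (triples) as plain Python tuples and queries their relationships. It is a minimal sketch only: the identifiers (`trade:1001`, `ex:executedBy` and so on) and the `match` helper are invented for illustration, and a production smart data lake would rely on standards-based semantic tooling rather than an in-memory list.

```python
from typing import List, NamedTuple, Optional


class Triple(NamedTuple):
    """One statement in the graph: subject --predicate--> obj."""
    subject: str
    predicate: str
    obj: str


# Hypothetical facts drawn from different kinds of sources (a trade record, an email, HR data).
GRAPH: List[Triple] = [
    Triple("trade:1001", "rdf:type", "ex:Trade"),
    Triple("trade:1001", "ex:executedBy", "person:alice"),
    Triple("email:77", "rdf:type", "ex:Email"),
    Triple("email:77", "ex:mentions", "trade:1001"),
    Triple("person:alice", "ex:worksIn", "dept:equities"),
]


def match(graph: List[Triple],
          subject: Optional[str] = None,
          predicate: Optional[str] = None,
          obj: Optional[str] = None) -> List[Triple]:
    """Return every triple that agrees with the fields supplied (None acts as a wildcard)."""
    return [
        t for t in graph
        if (subject is None or t.subject == subject)
        and (predicate is None or t.predicate == predicate)
        and (obj is None or t.obj == obj)
    ]


if __name__ == "__main__":
    # What does the graph say about this trade, and what else points at it?
    for t in match(GRAPH, subject="trade:1001") + match(GRAPH, obj="trade:1001"):
        print(t.subject, t.predicate, t.obj)
```

Because each statement carries its own meaning in the predicate, records that originated in unrelated systems can be linked and queried together without first being forced into a shared table schema.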
Utility solutions:
Role-based access: Semantic technologies also maintain the necessary governance and security policies for long-term sustainability of data lakes. Organizations can implement role-based access to data in accordance with governance protocols by specifying who can and cannot view data elements as expressed by triples. Such access is one of the primary means of bringing order and structure to data lakes based on enterprise-wide policies. Thus, even though the data is in one place, restrictions on and permissions for its use are as enforceable as if the data were siloed according to governance mandates, providing internal security for disparate use cases of the same repository.
Provenance and regulatory compliance: Provenance issues are addressed due to the inherent consistency of semantic models and the ease with which it is possible to augment data sets with metadata capturing the originating context and full data lineage; the ensuing
traceability and lineage are critical for determining regulatory compliance. This method allows organizations to analyze any variety of data sources and applications – emails, online account activity, trades, etc. – to see just where and how data was used, and whether it was used, or should be used, in accordance with regulations. The degree of meta-tagging and metadata consistency that such models provide also improves regulatory compliance by enabling semantic models to be mapped to compliance protocols in conjunction with relevant metadata attributes.
Data discovery: The combination of open data standards-based semantic models and their graphic representation also enhances the data discovery process, as end users can query the relationships and meaning of data associated with data sets to see which are appropriate for specific use cases. The application of semantic standards ensures both that the data is immediately available for reuse and that it is self-describing, through the use of standards-based tags that tie it to the associated business concept. This application of semantic technologies may provide the greatest utility to organizations via the rapid integration of complex unstructured, semi-structured and structured data sets – of any magnitude and type
– according to the highly specific needs of end users, depending on the discernible attributes and context of data elements. In life science organizations, for example, clinicians and data scientists have found significant value in quickly juxtaposing the results of multiple clinical trials through ad hoc queries that navigate across multiple data sets. In financial services, identifying the potential for misuse of material nonpublic information can be extremely arduous. Compliance officers need to examine links and relationships to understand what, how, why and when information is shared and whether the sharing is compliant. Similarly difficult is tying together information that builds a comprehensive picture of counterparty risks.
2016 PREDICTIONS
Analytic expansion: Of all the ways that semantically enhanced data lakes will influence the data landscape in 2016, their impact on analytics will be the most profound. The numerous aforementioned possibilities of such data lakes coalesce into the fact that by deploying them, it is possible to place an organization’s entire data assets on an RDF graph, elucidating the relationships between elements in a way that effectively overcomes the dark data phenomenon. Innately understanding the context and meaning of data
prior to analysis profoundly affects the type, degree and nature of analytics performed, which considerably refines their results and use.
Semantics at scale: The ultimate expression of what is actually an expansion of analytical prowess is the concept of semantics at scale, in which the smart data lake graph an organization uses is optimized for analytics with in-memory, massively parallel processing of semantically tagged data. Such an engine, when combined with a smart data lake’s RDF graph and ontological models of business meaning, incorporates all relevant enterprise data for comprehensive results at a speed that semantic technology advancements have only recently been able to produce.
Democratization of stewardship: The expedience of access and availability of data provided by data lakes is aligned with the self-service movement and the notion of the democratization of big data that in turn supports it. Data lakes will contribute to the solidification of these trends by facilitating the democratization of data stewardship. Semantic models and semantic graphs will help end users discern data and their relations to other data elements, which will enable a more pervasive form of governance than that conventionally reinforced by only a few dedicated data stewards. With increasing regulatory mandates, this
enterprise-wide ubiquity of data stewardship will prove vital to organizations.
Automating IT and data science: Additionally, the alignment of smart data lakes with the self-service movement will result in automation of some of the more mundane but highly necessary aspects of data science and the work of IT departments. Facets of integration, data discovery and data preparation that consume so much of the time of those working in these two departments are either expedited or unnecessary with smart data lakes, enabling these professionals to concentrate on more substantial ways to improve data-driven processes and drive more quickly to value.
Finally, the preeminence of smart data lakes themselves will be another trend that should gain momentum in the new year. The interest in this method for managing big data deployments will continue to multiply as organizations realize that they can obtain all of its benefits, while negating its detriments, by using user-friendly semantic technologies that belong in front offices as much as, if not more than, in back ones. ❙
Sean Martin is the founder and chief technical officer of Cambridge Semantics, a provider of smart data solutions driven by semantic web technology. Prior to Cambridge Semantics, he spent 15 years with IBM Corporation where he was a founder and the technology visionary for the IBM Advanced Internet Technology group.
PRODUCT OR SERVICE?
Get smart: Digital business innovation
Smart technologies, services, processes and people add up to smart systems for every sector.
BY HALUK DEMIRKAN AND BULENT DAL
To compete in the marketplace and maintain relevancy, companies need to constantly innovate. Just as important, today’s economic environment demands that innovation also consider how to design and transform delivery processes to improve productivity and performance. While there is
a desire to be more global, integrated and customer-centric, actually getting new products and services to market is rare, and what we call frequent and radical innovations – new services and products that dramatically change the marketplace – are even rarer. For the past decade, many organizations have focused on traditional product innovation
IS THIS A PRODUCT OR SERVICE?
“An automobile is actually art, entertainment and mobile sculpture, which, coincidently, also happens to provide transportation.” – Robert Lutz, chairman, GM
IS THIS A PRODUCT OR SERVICE?
The Kindle’s real breakthrough springs from a feature that its predecessors never offered: wireless connectivity. As a result, says Amazon founder Jeff Bezos: “This isn’t a device, it’s a service.”
ARE THESE PRODUCTS OR SERVICES?
to address the challenges of globalization and economic transformation. Most of these companies are still clinging to what we call the invention model, centered on structured, bricks-and-mortar product development processes and platforms. If everybody is doing innovation, what are you doing differently?
TODAY, WHEN A CUSTOMER BUYS A DRILL, DOES HE/SHE WANT A DRILL OR A HOLE?
According to research, people don’t want to buy a quarter-inch drill. They want a quarter-inch hole. Cars are another example. Robert Lutz, chairman of GM,
once said, “An automobile is actually art, entertainment and mobile sculpture, which, coincidently, also happens to provide transportation.” Other examples are service platforms such as Uber, the world’s largest taxi company, which owns no taxis; Airbnb, the largest accommodation provider, which owns no real estate; Skype, one of the largest phone companies, which owns no telco infrastructure; Alibaba, the world’s most valuable retailer, which has no inventory; Facebook, the world’s most popular media owner, which creates no content; and Netflix, the largest movie house, which owns no cinemas.
Today …
• Customers want to “hire” a product to do a job.
• Commoditization of products results in price and margin pressures.
• Customers are demanding services and solutions.
• Services can provide platforms for profitability.
• Loyalty and customer satisfaction are often driven by services.
• Service offerings can differentiate firms in highly competitive industries.
• The “ICT-enabled services-based economy” is growing exponentially.
As a result, the flexibility and agility needed to respond to changing business needs and to harness resources across global value chain partners create many challenges for companies. Many organizations attempt to overcome these challenges by improving the efficiency, quality and speed of their operations, and through mergers and networks [1]. But unanticipated consequences result in unnecessary costs, lack of responsiveness to customers and missed opportunities for innovation. Organizations also often find that traditional innovation methods are inadequate and create negative externalities because they have insufficient scope in relation to the complexity and dynamics of their internal
organizational systems and their external resource-network and market systems. If that is the case, we need to look at things differently. The convergence of information communication technologies (ICT) and service thinking has changed the nature of businesses, services and products by delivering them through digital solutions. This revolution created an emerging field called “digital business innovation,” “digitization” or “digital service innovation.”
TAKING THE PATH TO SERVICE TRANSFORMATION, ORIENTATION AND DIGITAL BUSINESS INNOVATION
Influenced by the emerging field of service science and systems (e.g., service-oriented technologies and management), several companies have gained attention in the past few years by developing more flexible business processes that co-create value with customers [2]. For example, Rolls-Royce leveraged its expertise in aircraft engine manufacturing to implement a service-oriented power-by-the-hour offering for customers. This new business model better met customer needs and gave Rolls-Royce more information about the way its customers use resources to create value. Apple and Google became the world’s largest software platforms without writing apps. Amazon became the world’s largest
virtual computing service provider with its cloud platform. Service thinking has transformed traditional products and services by adopting manufacturing concepts such as division of labor and knowledge, standardization and coordination of production and delivery to enable new forms of value creation and consumption. Industries such as retail, hospitality, restaurants, telecommunications, healthcare, transportation, finance and education are undergoing this type of transformation. ICT has enabled traditional manufacturers to become providers of services [3]. At the same time, ICT is moving off the desktop and out of offices and homes and into buildings, infrastructure and objects. We are getting much more efficient and effective at collecting and analyzing a flood of data from mobile solutions, sensors, cameras, etc. Cisco predicts that the Internet of Things (IoT) will generate $14 trillion in revenue in the next decade by connecting more than 200 billion devices [4]. Internet speed may double by next year. Smarter cities, retail, manufacturing, healthcare,
transportation, telecommunication, logistics, supply chain, etc., will spread rapidly. We will increasingly utilize intelligent robotics, additive manufacturing (e.g., 3-D printers), self-driving cars and augmented reality. This will result in more data generation, collection and storage, as well as a greater need for analysis and cognitive business (e.g., IBM Watson, Apple Siri, Microsoft Cortana, Google Now, Amazon Echo and Facebook AI). Digital innovations have great potential to reduce costs, increase efficiency and improve outcomes.
DIGITAL BUSINESS REVOLUTION WITH CONVERGENCE OF ICTS AND SERVICES
In today’s globally competitive business environment, innovation is not a strategic option; it is a fundamental prerequisite for a company’s survival, organizational renewal and national economic wealth. Firms are now establishing market leadership and growing their revenues by mastering digital service innovations. For example, traditional advertising agencies now have to be able to blend digital products and
services with creative strategy. Amazon is as much a retailer and supply chain leader as it is a digital service innovator. Similarly, the Netflix business model is heavily reliant on continuously building and enhancing digital products and services to compete in the entertainment industry. Ford is realizing that its future competitors are likely to be Facebook and Google and not BMW and Toyota. With iTunes, apps, cell phones, tablets, etc., Apple is more than a computer manufacturer. Smart retail platforms (e.g., Obase Detailer, Intel’s AIM suite) that collect and analyze data from transactional systems, data warehouses, customer relationship management systems and location-based analytics are another good example.
SO, ARE YOU LOOKING TO BE AN INNOVATIVE SERVICE PROVIDER WITH DIGITAL BUSINESS? HOW TO START SUCH A JOURNEY, AND HOW TO STAY THE COURSE.
Digital innovation is a new way of thinking and doing things. A key characteristic of digital innovation is that it
often changes the roles of providers, co-producers and customers of services and alters their patterns of interaction. Different organizations have different perspectives on the opportunities created by ICTs, but all are looking to improve efficiency and outcomes. One of the biggest challenges is deciding where and how to start this journey, and how to stay on course.
Culture change/mindset. Understand the service (value co-creation). Service, which can be defined as the application of competence, knowledge and capability to create benefit (or value) for another, derives from the interactions of entities known as service systems. Services are intangible, perishable and impossible to inventory; they are inseparable (produced and consumed simultaneously); and they involve a value co-creation process, collaboration with many stakeholders (e.g., B2B2C), high involvement of people in delivery (or the service system) and considerable complexity. Simultaneity of production and consumption of services occurs in complex service environments due to the interaction of people, processes, technology and shared information.
Customer experience. Think about how to measure and improve customer experience. The customer experience embodies what it’s like to be a digital service customer of your organization,
whether buying digital or physical products. Amazon’s customer experience includes the website and the digitized business processes touching the customer, like the shopping cart and payment options, as well as messaging, such as delivery alerts and email acknowledgments with design thinking. The experience also includes Amazon’s well-developed customer-created content: customer product ratings and reviews, as well as sophisticated tools like search, a detailed history of purchases and tailored recommendations [5].
Strategy re-mapping. Redefine your market space for future growth by assessing your current market space. This would include: current markets/current offerings (market penetration strategy), new markets/current offerings (market development strategy), current markets/new products-services (product/service development strategy) and new markets/new products and services (maximum opportunity strategy).
Concept/idea. A “new value proposition” targeted at a particular market. One way to reduce that risk is to make changes to your company’s mix of products or services [6]. Focus narrowly, search for commonalities across products and services or create a portfolio of offerings. If your business currently serves multiple
segments, it may be best to subdivide into focused units rather than try to apply one model.
Platform economics. Driving digital business innovation with analytics, smart services, social, cloud, Internet of Everything (IoE), service-orientation and cognition for value co-creation and outcome [7]:
• Achieving economies of scale with digital business models requires the development and reuse of service-based digitized platforms across the enterprise [8].
• Review the business processes, applications, data networks and connections, databases, servers, etc., to identify which applications need to remain in their present form, and which can be moved to the new framework. Also identify which IT platforms (servers, PCs, workstations, operating systems and software) need to be upgraded or replaced.
• Global platform but customizable locally. This means an enterprise with a federated business architecture with a global content repository, expanded taxonomies, modular design and global and local innovation.
• Enabled with IoT, smart services gather and share information directly
with each other through onsite and virtual cloud solutions, making it possible to collect, record and analyze new data streams faster and more accurately. The ability to collect and analyze a flood of data from mobile solutions, sensors, cameras, etc., with smart automation is getting much more efficient and effective. These IT-enabled solutions should have integration capability that helps implement the new configurations of operational competencies by developing the required patterns of interactions with each other.
• Every “thing” should be able to reconfigure itself – the ability to rearrange existing resources and services into new configurations of operational competencies that better match the environment.
• Every “thing” should be able to sense the environment, identify needs and spot new opportunities. It requires tracking and monitoring service providers’ and receivers’ activities, as well as technology performance to understand usage trends, navigation trends, etc.
• Every “thing” must have coordination capability – the ability to manage dependencies among resources and tasks to create new ways of performing a set of activities.
• A significant amount of data is collected with IoE and smart services. New models, methods and algorithms are needed to analyze this data effectively and efficiently.
• The next generation of things should have cognitive capabilities. They should be able to learn by driving innovative thinking and new knowledge generation to enhance existing services. This involves incorporating user community feedback and modifying, adding, deleting and synthesizing content and software services as indicated, thus capturing industry trends and needed software service categories for adding, updating or deleting skills, knowledge and experience categories and content.
• Data collected is useful, relevant and actionable. In the 21st century, everybody and everything become data creators and data consumers.
• After use, every “thing” should have a plan for disabling, destroying and disposing of itself when it is no longer needed. Apply correct privacy and security procedures.
Companies need to get value from product complexity without confusing customers or making it too difficult for employees to get things done [9].
DIGITAL BUSINESS INNOVATION: THE TIME IS NOW.
There is a big move toward digitization of business: incorporating more of customers’ experience; executing more processes and working together with partners in the value chain; increasing the number of “digital natives” (young current and future customers and employees who expect a brilliant digital experience in all of their interactions); and embracing the dawning of the age of the customer voice, in which customers have a much stronger impact on enterprises via ratings of their services and via online comments through Twitter and other social media. Before the Internet, business operated primarily in a physical world of “place”: It was a world that was tangible, product-based and oriented toward customer transactions. Today, many industries – all moving at different rates – are shifting toward a digital world of “space”: more intangible, more service-based and oriented toward customer experience. Technology allows customers to produce service entirely on their own (“self-service”), employees to provide services from anywhere in the world (remote, outsourced services), and companies to integrate technology into a total mix of service offerings (smart services). To be truly successful, such a move will require a new kind of talent – T-shaped
people – supported by a new kind of organization. In other words, companies need to retune their talent engines to support a new generation of innovation [10]. Organizations need to find new or improved ways of generating, prioritizing and managing digital innovation from idea generation through the end of the development lifecycle when the innovation becomes a new service platform or a complementary value-added service. These new ways of managing innovation need to consider the differences between incremental and radical innovation and recognize the leverage that can be gained from co-creation of value with the customer and customer experience. ❙
Haluk Demirkan (haluk@uw.edu) is a professor of Service Innovation and Business Analytics at the Milgard School of Business, University of Washington-Tacoma. He has a Ph.D. in information systems and operations management from the University of Florida. He is a longtime member of INFORMS.
Bulent Dal (bulent.dal@obase.com) is a co-founder and general manager of Obase Analytical Solutions (http://www.obase.com/index.php/en/obase), Istanbul, Turkey. His expertise is in scientific retail analytical solutions. He has a Ph.D. in computer sciences engineering from Istanbul University.
ACKNOWLEDGEMENT: Part of this article is excerpted with permission of the publisher, HBR Turkey, from Demirkan, H. and Dal, B. “Digital Innovation and Strategic Transformation,” Harvard Business Review (Turkish Edition; published in Turkish), April 2015.
REFERENCES
For references, click here.
MARKETING METRIC
Leveraging predictive analytics to estimate customer lifetime value
BY MATTHEW LULAY
Customer lifetime value (CLV) is not a new tool for marketers. It has been used for decades to understand a customer’s financial value. It comes in many shapes and sizes, ranging from historical CLV, which calculates a CLV based only on what a customer has previously spent with a business, to predictive CLV, which leverages both observed historical behavior and predicted retention to estimate a discounted stream of future (lifetime) revenue. Historical CLV has several drawbacks, the most important of which is that,
since it is the sum of past revenue or profit for a particular customer or group, it provides insight only into what has already occurred and thus sheds little light on the value of new subscribers. Predictive CLV, however, with its ability to incorporate expected retention, allows marketers to obtain several key insights, including what types of subscribers will be the most profitable over a specific time period, where acquisition dollars earn the highest return on investment and what customer attributes are drivers of retention. These types of actionable insights can help marketers make more well-informed,
data-driven decisions that promote efficiency, savings and revenue growth. This article explores the basic tenets of predictive CLV, illustrated by examples from the newspaper industry.
MAJOR COMPONENTS OF CLV
The calculation below shows the three major components of predictive CLV: profitability, predicted retention and discounting.
Profitability: Profitability is the simplest component of the CLV metric, as it is a straightforward calculation of revenues
minus costs. In the newspaper industry, revenue for a particular subscriber includes the subscription rate and the subscriber’s share of the market’s advertising revenue, which comes in the form of preprint advertising inserted into each day’s paper, as well as digital advertising revenue via impressions on the market’s website. The subscription rate can vary based on a variety of factors, including the number of delivery days (e.g., Sunday only vs. seven-day), the period length (e.g., 13-week vs. 52-week), acquisition source (e.g., direct mail vs. telemarketing) and
$$\mathrm{CLV} = \mathrm{NPV}\left[(\text{Revenues} - \text{Costs}) \times (\text{Predicted Retention Probability})\right]$$
payment method (e.g., check vs. credit card). Preprint advertising value is dependent upon the subscriber’s demographic profile, which is normally measured at the zip code or zip+4 level. Costs at the subscriber level for newspapers include print and ink, delivery and acquisition.
Predicted retention: Once revenues and costs are calculated and we arrive at a profit level, the next component of predictive CLV is estimating retention probability, which provides us with the risk-adjusted profit. By “risk-adjusted,” we simply mean profit that has been adjusted to account for the risk of subscriber churn, using the probability that a particular customer will be retained over a certain time period. In the newspaper industry, while all subscribers come up for renewal at different points throughout the year based on the term length of the subscription, not all subscribers exhibit the same propensity to renew. In fact, subscribers with different characteristics can retain at drastically different rates. While an average newspaper may experience overall annual retention of 75 percent, pockets of subscribers within the market may be retaining at 90+ percent, while others retain at less than 40 percent. Mather Economics uses an econometric method called “survival analysis” to estimate the retention probabilities among different subscriber groups. Survival analysis, originally developed for application in the biosciences,
is a method of estimating the probability of an event occurring at a particular time interval. Examples include the probability of survival for a heart transplant patient, the probability of transmission failure on new cars or the probability of divorce after marriage. The probability of these events can be estimated over time using survival analysis. In the newspaper industry, we use survival analysis to calculate the probability of subscriber retention at different intervals of time. More specifically, we leverage historical transaction information to fit a parametric survival model with a log-logistic distribution. We use a parametric model because we understand the underlying distribution of our dependent variable, which is retention probability. The distribution of that variable is log-logistic in nature, where the rate of decline in the probability of retention increases in the early stages and decreases later. This creates a curve that is downward sloping with a slope that decreases in severity over time. An example of this is shown in Figure 1, where we estimate survival probability for subscribers in different income groups, revealing that the most affluent subscribers in this particular market had a retention probability approximately three times higher than those in the lowest income level after 365 days. Figure 1 shows only the expected retention probabilities for subscribers grouped
by one variable.
Figure 1: Estimated survival probability for subscribers in different income groups.
But when we combine all of the information we have on a particular subscriber, we can estimate a unique survival curve for every single subscriber in a database. In Figure 2, predicted retention is plotted for a new subscriber by day from the point of acquisition to a point two years out from acquisition. The area under the curve gives us the second component of predictive CLV – estimated retention (expected lifetime).
Discounting: Predictive CLV attempts to capture the present value of a customer’s stream of lifetime revenue. Since we’re trying to capture the present value of future revenue, we need to incorporate a discount rate to account for the positive time value, or positive time preference, of money, which essentially states that money today is worth more than the same amount at some point in the future. This concept is why interest rates tend to be positive and why the need for a discount rate exists
for valuing future dollars in present value terms. The selection of a discount rate is an important decision, as values are highly sensitive to this rate, especially in estimations in which predictions are made over longer periods of time. A variety of factors are taken into account when choosing a discount rate, including the length of time of the estimation, costs of capital, rate of return on private investment, interest rates on government and corporate bonds and output growth. With this in mind, government agencies in the United States tend to leverage discount rates of 2 percent to 3 percent on intra-generational projects. At Mather Economics, we normally estimate CLV as the risk-adjusted present value of five years of expected earnings for an individual subscriber and use a discount rate of two percent.
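The two ingredients described above, a log-logistic retention curve and a discount rate, can be combined in a short sketch. The Python below is illustrative only: the function names and the parameter values (`alpha`, `beta`, `daily_margin`) are hypothetical and are not taken from the authors' model, which is fitted to historical transaction data for each subscriber. It assumes the standard log-logistic survival function S(t) = 1 / (1 + (t / alpha)^beta), an annual discount rate compounded by day, and the five-year horizon and 2 percent rate mentioned in the text.

```python
def log_logistic_survival(t_days, alpha, beta):
    """Retention probability at day t under a log-logistic model.

    alpha is the scale (the day at which retention falls to 50 percent),
    beta is the shape; both are hypothetical inputs here.
    """
    if t_days <= 0:
        return 1.0
    return 1.0 / (1.0 + (t_days / alpha) ** beta)


def expected_retention_days(alpha, beta, horizon_days):
    """Area under the retention curve over the horizon (trapezoid rule)."""
    area = 0.0
    for day in range(horizon_days):
        left = log_logistic_survival(day, alpha, beta)
        right = log_logistic_survival(day + 1, alpha, beta)
        area += 0.5 * (left + right)
    return area


def predictive_clv(daily_margin, alpha, beta, annual_rate, horizon_days):
    """Present value of expected daily margin, weighted by retention."""
    clv = 0.0
    for day in range(1, horizon_days + 1):
        retained = log_logistic_survival(day, alpha, beta)
        discount = (1.0 + annual_rate) ** (day / 365.0)
        clv += daily_margin * retained / discount
    return clv


if __name__ == "__main__":
    horizon = 5 * 365  # five years of expected earnings
    # Hypothetical subscriber: 50 percent retention at day 900.
    lifetime = expected_retention_days(alpha=900, beta=1.3, horizon_days=horizon)
    value = predictive_clv(daily_margin=0.50, alpha=900, beta=1.3,
                           annual_rate=0.02, horizon_days=horizon)
    print(f"expected retained days over horizon: {lifetime:.0f}")
    print(f"predictive CLV at a 2 percent discount rate: {value:.2f}")
```

Rerunning `predictive_clv` with a higher `annual_rate` or a longer `horizon_days` is a quick way to see how sensitive the estimate is to the discount rate over long horizons.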
Figure 2: Day-to-day predicted retention of a new subscriber over a two-year period.
APPLICATIONS
Predictive CLV has a variety of useful applications, ranging from acquisition optimization to upgrade campaign targeting to customer service prioritization. Consider the simplified example in Figure 3. While the telemarketing source delivers a cost per order one-tenth that of direct mail, the customers acquired from the telemarketing source are much less valuable to the firm and thus, in the aggregate, are less profitable, even after accounting for the much lower cost to acquire. In this example, CLV helps marketers prioritize their acquisition efforts toward subscribers with the most lifetime value to the firm. Additionally, CLV can help inform upgrade campaigns. Since the value of customers is known with CLV, steps can be taken to increase engagement and increase the value of existing subscribers. In the newspaper industry, for example, publishers use CLV to target lower-value Sunday-only subscribers with paid upgrade offers to weekend subscriptions. This promotes a higher level of engagement with the product from the subscriber, which may lead to better retention in the long term and also provides value to advertisers, as
the number of circulation units increases, improving advertising frequency. Another application of CLV is prioritizing the customer service experience. Once a current subscriber base is scored, meaning their lifetime value has been estimated, customer service teams can leverage that data to improve the customer experience and make it more efficient. One example of this in the newspaper industry is having dedicated customer service representatives for high-value subscribers, where calls from these customers are prioritized to minimize wait times and provide the highest level of service possible. CLV data can also be used in the customer service department to create customized retention (stop save) scripting based on estimated subscriber value, where representatives are given the ability to offer more aggressive save offers to keep high-value subscribers on the books. The three application examples above are just a small sample of the multitude of positive impacts CLV can have on businesses in a variety of industries.
CONCLUSION
We’ve seen that predictive CLV has several advantages over historical CLV,
Figure 3: Simplified example of CLV in telemarketing vs. direct mail.
the most important of which is that it is a dynamic, forward-looking tool, allowing users to calculate the risk-adjusted value of a customer’s lifetime. Leveraging some of the econometric tools at our disposal, including survival analysis, allows us to add the predictive portion to CLV by estimating retention probabilities for individual customers. Once calculated, CLV provides the analytical foundation for many applications, including those in acquisition, retention and customer service. As such, predictive CLV can serve as a valuable asset in a firm’s analytical
toolset to help inform strategic, data-driven and profit-maximizing decisions. ❙ As a director for Mather Economics, Matthew Lulay helps media companies optimize pricing through data-driven analysis. In addition, he has authored several white papers on environmental economics, including a model forecasting the employment effects of increased funding along the Gulf Coast region following the Deepwater Horizon oil spill. He has extensive experience collecting, formatting and analyzing data from various state, national and international sources including the Bureau of Labor Statistics, the Federal Reserve and the World Bank. Lulay has a degree in economics from the University of Minnesota and a master’s degree in economics from Florida Atlantic University, where his research focused on macroeconomics and U.S. tax policy.
CORPORATE PROFILE
BNSF Railway
Operations research and advanced analytics team helps power rail giant’s success now and in the future.
BY AMY CASAS
When people think of railroads, what often comes to mind are thoughts of the American West with steaming locomotives chugging along endless miles of track interrupted only by the occasional train station. While railroads are not settling the Wild West anymore, they are continuing to break into new frontiers of trade, efficiency and innovation that find them pioneering and on the cutting edge of research and technology. Such is the case for today’s BNSF Railway, which
is the product of nearly 400 different railroad lines that merged or were acquired during more than 160 years. The Fort Worth, Texas-based BNSF has more than 47,000 employees spread across its 32,500 miles of rail network operating in 28 states and three Canadian provinces. As one of the largest freight railroads in North America, BNSF serves more than 10,000 customers by transporting goods and commodities that make people’s lives better. The coal hauled by BNSF powers one out of every 10 homes in the nation. BNSF also transports enough grain to supply 900 million people with a year’s supply of bread, enough asphalt in a year to lay a single lane road four times around the equator, and last year it moved more than 5 million containers of consumer products that were sold by big box retailers and specialty shops alike all across America. According to the most recent Commodity Flow Survey, railroads carry more than 40 percent of freight volume in the United States – more than any other mode of transportation – and provide the most fuel- and cost-efficient means for moving freight over land.
BNSF moves a ton of freight almost 500 miles on a single gallon of diesel fuel thanks to technological advancements.
In fact, BNSF is able to move a ton of freight almost 500 miles on a single gallon of diesel fuel thanks to the technological advancements achieved in operating today’s locomotives. Unlike other forms of freight transportation, BNSF trains operate on an infrastructure built and financed mostly by the railroad. BNSF has created one of the most advanced rail networks in the world by better utilizing existing rail capacity and making record investments in new infrastructure and equipment directly connected to its operations. By the end of this year, BNSF will have reinvested more than $50 billion since 2000 to improve the safety
and reliability of its rail network and accommodate expected growth.
DRIVING SOLUTIONS WITH O.R. & ANALYTICS
Behind the scenes helping equip BNSF leaders with the research necessary to make decisions on everything from capital investments to improving train efficiency is the company’s Operations Research and Advanced Analytics team. The team is led by Pooja Dewan, BNSF’s general director of decision support systems, who provides analytical consulting and decision support throughout the company. Dewan has led the team for the past 12 years and has been with BNSF for 17 years. Through the years the team has grown to include more than 20 full-time team members with advanced degrees in a variety of fields including operations research, industrial engineering and applied math and statistics.
“We use advanced analytics to help drive and enhance the decisions made by BNSF’s leaders,” Dewan says. “With a network of thousands of miles of tracks and tens of thousands of train movements to move millions of containers and railcars, we have vast amounts of data. Our team works to make sense of all that data. That analysis then gives business leaders insight into how we can increase efficiency and grow our business.”
Every day Dewan and the Operations Research and Advanced Analytics team help the company bridge the gap between practice and academia. Over the years the team has grown to play an integral part in the decision-making process at BNSF, in projects such as creating the most efficient and safest train routes, managing the availability of train crews, building trains at rail terminals, equipment ordering, train dispatching, fueling and inspecting trains, and so much more. “Considering the complexity of railroad operations and the speed with which business, technology and government regulations can change, it is important for us to be able to use scientific thinking to help us solve problems,” says Rollin Bredenberg, BNSF’s vice president for capacity planning and operations research. The team always looks to spend quality time with its internal clients and, whenever possible, even shadows them to gain additional insights that help frame and solve problems. Team members often put together cross-functional teams to develop solutions that their internal clients can understand and find useful. When certain tools aren’t being used, they reach out to their stakeholders to learn why. This helps ensure that the tools are still relevant and, when necessary, that the models and insights are improved.
“I believe our recipe for success has been our ability to hire technically strong team members, who help us deliver quality solutions that solve the right problem,” Dewan says. “To be effective we have to take these technically strong individuals and educate them about the railroad industry and help each person cultivate strong relationships within the organization.”
INFORMS member Pooja Dewan (seated, third from left) leads BNSF’s Operations Research and Advanced Analytics team.
DESIGNING A TRAIN TRANSPORTATION PLAN
In 2014, BNSF handled more than 10 million shipments. Roughly 19 percent of those shipments were moved in what are known as “merchandise” trains. Those are trains made up of rail cars that contain a variety of mixed freight. They include lumber, paper, machinery and various industrial parts, as well as tank cars containing various types of chemicals used in manufacturing. The merchandise train business is complex. Those rail shipments move between 2,300 railway stations and consist of small groups of railcars ranging from 1 to 40 railcars, which are added together to make full-length trains. To facilitate efficient movement of these shipments, railcars with similar destinations are grouped at freight yards for movement on trains. Each group of railcars, known in the rail industry as a block, moves together from one rail yard to the next, where the cars are re-sorted into new groups with railcars more closely matching their destination. The objective of grouping together railcars with similar destinations is to minimize the miles and amount of sorting that has to happen before the shipment reaches its final destination, all while taking into consideration each rail yard’s capacity. Working closely with their BNSF colleagues, the Operations Research and Advanced Analytics team developed a series of tools to help improve the flow of merchandise trains across BNSF’s network by assigning the right railcar groups to the
right trains. Those tools allow service design planners to determine the most efficient frequency and timing of trains, which has reduced the amount of time it takes for each railcar to reach its final destination. The team’s efforts to develop these tools have been recognized with the Daniel H. Wagner Prize for Excellence in Operations Research Practice and with being named a semifinalist for the Franz Edelman Award from INFORMS.
One of the largest freight railroads in North America, BNSF’s 32,500-mile rail network spans 28 states and three Canadian provinces.
CREW PLANNING & TIME PREDICTION TOOLS
In the railroad industry, like many industries with large workforces, labor costs are the business’s biggest expense. For decades, experienced crew planners have, for the most part, manually assigned train crews. Planning train crew assignments is a complex process that involves the consideration of a large number of factors in a very limited amount of time. The train crew planner is tasked with assigning crews located across a vast geography in the most efficient and cost-effective manner possible so trains are not delayed or canceled. The task is made more complicated by the large number of rules the crew planners must observe to maintain safe rail operations.
Partnering with operations leaders, BNSF’s Operations Research and Advanced Analytics team developed an application to assist in the crew planning process. Launched in 2013, the Crew Decision Assist (CDA) application uses a formal optimization algorithm that suggests, in real time, effective crew plans to dispatchers. The implementation of this application has helped reduce the overall cost of positioning crews to operate trains within a territory. The CDA application is currently being used in approximately 60 percent of BNSF’s train crew districts with plans to expand it even more.
Timing is everything, especially when customers are relying on the railroad to get their shipments where they need to be on time.
It is especially critical for BNSF’s
intermodal business, which moves containers and trailers carrying the consumer products people use every day. Working with BNSF’s intermodal group, the Operations Research and Advanced Analytics team is developing a model that more accurately predicts train arrival times. The Train Travel Time Prediction tool will help BNSF further enhance the information that it provides its customers. The development of this tool will enable BNSF to provide customers with information that more accurately tracks their shipments in real time. There are also plans to expand
the use of this tool to other parts of the business to assist with planning for train crew operations and inspections, as well as equipment and track maintenance.
PROJECTS CURRENTLY UNDERWAY
In today’s data-rich environment everything creates an electronic trail, and the Operations Research and Advanced Analytics team can use that information to generate better inputs for the models they develop. Whether it’s GPS tools that help track the exact location of BNSF’s assets or sensors and gauges that
monitor and help determine the best time to schedule railcars and locomotives for inspections and maintenance, these tools analyze data that enable the team to delve deeper and create effective solutions. With thousands of sensors across its network, BNSF produces large amounts of data daily. These data sets are so large and complex that traditional data processing applications are inadequate. The Operations Research and Advanced Analytics team is currently looking into how to best process and synthesize all that information more effectively for use throughout the organization. Another tool currently on the horizon is Movement Planner. When complete, this tool will improve the movement of trains across BNSF’s entire network instead of just certain segments. Having this tool will enable BNSF to better dispatch trains and improve the utilization of its tracks.
WHAT THE FUTURE HOLDS
Over the years the Operations Research and Advanced Analytics
team has helped BNSF save millions of dollars with its work. As computing power continues to advance and the data that is generated and gathered increases exponentially, the team will continue to play an increasingly valuable role in developing decision tools that help BNSF leaders manage and operate trains in the most effective and efficient way possible. “To continuously improve the way your rail operations work, you have to evolve the way you make operational decisions,” says David Freeman, BNSF’s senior vice president for transportation. “The recommendations and programs that our Operations Research and Advanced Analytics team has developed have resulted in solutions that have positively impacted our bottom line and the way we run trains today and into the future.” ❙
Amy Casas (Amy.Casas@BNSF.com) is the director of corporate communications for BNSF Railway. During her career in communications she has led the creation and successful implementation of comprehensive communications plans, led media relations efforts and provided strategic communications and public relations counsel to leadership.
VILLAGE STABILITY OPERATIONS
Changing the game
How analytics can help defeat violent extremism.
BY DOUGLAS A. SAMUELSON
A thoughtful, deeply experience-based analysis by a retired U.S. Army Special Forces officer offers a new perspective on the political debate raging over how to defeat terrorism: “They’re all wrong.” What we need, he says, is a change in focus from conventional force, applied in support of national governments, to village stability operations (VSO), emphasizing developing local leadership and building from there. Lt. Col. (ret.) Scott Mann should know. In his 22 years of service, he led VSOs in Iraq, Afghanistan, Colombia, Peru and Ecuador. Leading U.S. experts labeled his accomplishments, and those by his Special Forces colleagues, as the “game changer” in Afghanistan in 2010-2012. In his new book, “Game Changers: Going Local to
Defeat Violent Extremists,” he outlines the VSO approach in four steps:
1. Get yourself surrounded. Move in with the locals, get to know their concerns first-hand and on a day-to-day basis, let them get to know you.
2. Meet them where they are. As he explains in his book, “For VSO to work, we must embrace local realities. That means working with what is already there, and not with what we want to be there. In Nangahar Province (Afghanistan), it took two successive teams of Special Forces to identify all of the local grievances standing in the way of village autonomy and connection to the Afghan government. Only when these Green Berets started to help locals address their own local problems did they persuade them to stand up for themselves.”
3. Connect through extreme collaboration. “It takes more than a village. It takes a network to empower that village,” he writes. “Let’s not forget our own organizational complexity, tensions and self-induced feuding as our second enemy in defeating violent extremism.”
4. Tell a story that sticks. “The side that tells the most compelling story, and backs it up with meaningful action, is the side that wins,” he asserts. In Nangahar, he adds, it took about two years for the “master narrative of local clans standing up for themselves, supported by their government, against an oppressive and unwanted group of violent extremists” to take hold. The narrative spread to other communities in the area. “This expansion was possible largely because of a compelling narrative and well-told stories,” he adds.
Lt. Col. (ret.) Scott Mann (right), shown with an unidentified Afghan National Army fighter, has led village stability operations in Iraq, Afghanistan, Colombia, Peru and Ecuador.
NEED TO ADAPT
“We’re losing now,” Lt. Col. Mann told Analytics magazine in a recent interview, “because we keep trying to find a national
government we can work with, top-down, applying massive conventional force. We’ve been in this conflict for 15 years; we ought to be adapting, and we’re not.
“There’s nothing new in this approach; it’s just that nobody’s listening,” he continues. “We had a coordinated program called FID, foreign internal defense, to apply all our instruments of power in fragile countries. It’s run by the embassy country team, coordinating defense, law enforcement and economics. We’ve been doing it for decades, and we’re good at it. We did it in Colombia, El Salvador and the Philippines. But after 9/11, we abandoned it and went back to COIN, counterinsurgency, and it was as much a failure in Iraq and Afghanistan as it had been in Vietnam.”
Mann cautions, “It takes a long time. At this point we’ll have to punch our way back in, but then we need to conduct long-term FID, position our talent at both bottom-up and top-down, looking for stability inhospitable to violent extremists at the local and national levels.
“Everything was in place after the surge [in Iraq],” he said, “but then we needed to implement FID. In these places where violent extremists set up shop, 80 percent of the land is tribal, outside national control.” Hence, he adds, “we have to learn to exploit an honor-based society to create the opportunities we want.”
This means working with local and tribal leaders, participating in their governmental and dispute resolution processes, and encouraging them in community development and security initiatives without imposing what we think would work. He describes a number of such efforts in detail in the book. According to Mann, the current political rhetoric, from all sides, is “ridiculous,” and it’s reinforcing the narrative our enemies want to spread. “What ISIS wants to do is divide us against Muslims and exploit gaps within our communities about trust.” The success of extreme political rhetoric in the United States, he hypothesizes, is a response to a general vague sense of unease about the current situation. “Americans see that it’s not working, and they’re primed for something new, ready for some leadership,” he states. “I’ll say this in a very apolitical way: We just need to understand what works and do it. And we’re not.” What we do need, he offers, is a reexamination of the authorizations Congress gives to Special Forces to “get outside the wire” and work with local leaders, particularly to work with irregular forces along with the regulars. In Syria, he asserts, the authorizations were so narrow and restrictive that they impeded what might have been effective. “If we had had persistent presence out there, we would have
teams building community presence and maintaining fire bases. But it might take more than a decade; Colombia took 40 years.”
Many of the same principles apply to what is known within the United States as “community policing” – building community relationships and networks of people who will help rather than relying mostly on the use of massive force when crises erupt. “I’d say pretty much the same approach would work in Ferguson, Mo., and Baltimore, too,” Mann says.
This also means that American national leaders need to have a much better understanding of the problems. “We Green Berets took nine years to arrive at the VSO program in Afghanistan,” Mann says. “Jeffersonian democracy, targeted aid and targeted killings all failed. We need to step back and reframe.” And that in turn, he emphasizes, means “the days of outsourcing local leadership are over. You have to dive in. You don’t just get handed some talking points right before walking into some place to give a speech.”
One of the keys to VSO is getting to know locals and their concerns first-hand and on a day-to-day basis, and to let them get to know you.
ANALYTICS CONTRIBUTIONS
Analytics has much to contribute to the reframing Lt. Col. Mann advocates. It is hard to document the benefits and problems of VSO succinctly and convincingly, especially to senior leaders caught up in a perpetual cycle of PowerPoint briefings. There is some movement at the policy-making levels of DoD away from operations research back toward wargaming, based on the recognition that more flexibility is needed in depicting and evaluating possible courses of action. However, according to experienced wargamers, stability operations actions in wargames tend to be portrayed in ways that require numerous assumptions about feasibility and effects – better than ignoring such actions entirely as O.R. models too often do, but not convincing to anyone who has strongly held differing views.
Nevertheless, creative analytics can generate substantial benefits. Nearly six years ago, a small team of O.R.-trained but
creatively oriented analysts in the Joint Improvised Explosive Device Defeat Organization (JIEDDO) came up with a clever way to depict which tribes were where, based on contact reports. Gen. David Petraeus complimented these (unnamed) analysts at the time for the useful support they had provided. Demonstrating that a hostile tribe’s stronghold was also heavily intermingled with a friendly tribe could convince a commander not to apply massive force. The problem with wiping out much of the hostile tribe is that it also removes the friendly source of information – and turns the survivors unfriendly. According to Mann, the tribe-depicting analytical products that originated at JIEDDO remain among the more useful tools available in theater, and they are still being generated and used.
As in any other problem of measurement and analysis, it is vital to start with a clear objective and good metrics of whether that objective is being met. “We need a point of departure of what relative stability looks like [in the locale of interest], then we need to shape goals by what is possible,” Mann says. “We need a definition of relative stability that is agreed upon by all levels of power, and we shouldn’t go to war without it.” Metrics of such stability might include, for example, observations of:
• whether locals tell the police where the criminals are or tell the criminals where the police are;
• what people do when they have a grievance (do they trust local government and law enforcement enough to work through those channels?); or
• what happens to new economic development (in unstable regions, promising new facilities get blown up).
Creative, sensible analytical approaches – not unduly restricted by prior assumptions and backed and informed by close contact with the people with deep experience – can greatly advance getting the policies right. VSO, Lt. Col. Mann concludes, “is not a silver bullet, but it’s the best approach we’ve got.” The challenge for analytics is to assess, in terms senior decision-makers will readily grasp, how right or wrong this conclusion is, and how it might be shaped to be more effective. ❙
Douglas A. Samuelson (samuelsondoug@yahoo.com), D.Sc. in operations research, is president and chief scientist of InfoLogix, Inc., a small R&D and consulting company in Annandale, Va. He is a contributing editor of OR/MS Today and Analytics. He is a longtime member of INFORMS.
REFERENCES
1. D. Scott Mann, “Game Changers: Going Local to Defeat Violent Extremists,” Tribal Analysis Publishing, Leesburg, Va., 2015.
2. Telephone interview with Lt. Col. Scott Mann, Dec. 17, 2015.
CONFERENCE PREVIEW
2016 Conference on Business Analytics and O.R. set for Orlando
Do the twists and turns of making smart business decisions by maximizing the value of your data have you feeling like you’re on an analytical rollercoaster? If so, the 2016 INFORMS Conference on Business Analytics and Operations Research, set for the Hyatt Regency Grand Cypress in Orlando, Fla., on April 10-12, is the one analytics conference you should attend. This conference consistently delivers proven applications and processes that will improve your business’s bottom line. Nearly 1,000 leading analytics professionals and industry experts are expected to come together to share ideas, network and learn about a wide range of problem-solving techniques and methods. With presentations that revolve around real-world solutions, attendees will hear the full story behind successful analytics projects. This, in turn, will help you gain insight and drive your business planning. This year’s world-class line-up of speakers and strategic thought leaders, including keynote speakers Paul Ballew of Ford and Samuel Eldersveld of Uber (see
sidebar), will present talks on a wide range of problem-solving techniques and methods. Invited tracks on Analytics Leadership & Soft Skills, Decisions & Risk Analysis, Fraud Detection & Life Sciences, Internet of Things, Marketing Analytics, Revenue Management & Pricing, Sports & Entertainment and Supply Chain Analytics will be of great interest to anyone looking to make analytics work in real organizations. About 30 handpicked, member-contributed talks, in-depth technology workshops, poster presentations, panels and several opportunities for networking will round out the program. The Edelman Awards Gala, an Oscar-like dinner and awards celebration and a traditional conference highlight, will be held April 11. Considered the “Super Bowl of O.R.,” the Edelman recognizes the world’s best applications of analytics, operations research and management science. The Edelman Gala is included in the conference registration fee.
Ford’s Ballew, Uber’s Eldersveld to deliver keynotes

Paul Ballew, global chief data and analytics officer, Ford Analytics, and Samuel K. Eldersveld, director of operations research, Uber Technologies, will deliver the keynote talks at the 2016 INFORMS Conference on Business Analytics & Operations Research. The conference will be held April 10-12 at the Hyatt Regency Grand Cypress in Orlando, Fla.

Ballew leads Ford’s global data and analytics teams, including development of new capabilities supporting connectivity and smart mobility. Prior to joining Ford, he was chief data, insight & analytics officer at Dun & Bradstreet, where he was responsible for the company’s global data and analytic activities along with the company’s strategic consulting practice. Ballew’s impressive career includes stints as Nationwide’s senior vice president for customer insight and analytics (where he directed customer analytics, market research and information and data management functions), as General Motors Corporation’s executive director for global market and industry analysis and as a partner for J.D. Power and Associates. Ballew’s keynote presentation is scheduled for 8-9 a.m. on April 11.

Eldersveld, Ph.D., joined Uber Technologies Inc. in 2015 to work with the data sciences teams in San Francisco. He received his Ph.D. in operations research from Stanford University in 1992 and worked as a mathematician for manufacturing and planning at Boeing from 1991-2000 in the Phantom Works division. He joined Expedia as a startup and spent time at Starbucks before moving to Amazon as a principal research scientist in Seattle. He was also affiliated with the University of Washington Foster School of Business as a non-tenure-track auxiliary faculty member and lecturer in the department of Information Systems and Operations Management from 2005-2008. Eldersveld’s keynote presentation, “O.R. in Today’s Dynamic Business Environment,” is scheduled for 10:30-11:20 a.m. on April 12. ❙

Photo: Paul Ballew of Ford.
Photo: Samuel Eldersveld of Uber.
Special programs within the conference are designed for future analytics leaders. The Early Career Connection provides early-career professionals with new perspectives into some of the most critical problems facing industry today, enabling them to broaden their research agendas. The INFORMS Professional Colloquium is designed to help practice-oriented master’s and Ph.D. students transition into successful careers. Participants in both programs can register for the full conference at a discounted rate but must be nominated and selected to attend.

The Analytics Career Fair is INFORMS’ premier professional career event, giving top analytics employers and seasoned professionals the ability to connect in a casual atmosphere. The career fair is included in the registration for this conference.

The conference site, the Hyatt Regency Grand Cypress, is a world-class resort that offers an impressive collection of on-site activities, in addition to being just one mile from Walt Disney World and close to everything else Orlando has to offer.

Registration rates for this conference start at $1,065 for INFORMS members. Special rates are available for students, retired attendees, conference newcomers and selected speakers. A team discount is also available. ❙

For information regarding conference registration or submitting a presentation, visit meetings.informs.org/analytics2016.
The Hyatt Regency Grand Cypress in Orlando, Fla., will host the conference.
THE UNIVERSITY OF MICHIGAN
Department of Industrial and Operations Engineering
Faculty Positions

The Department of Industrial and Operations Engineering at the University of Michigan invites applications and nominations for faculty positions beginning September, 2016. We seek outstanding candidates for faculty positions in all areas of industrial engineering -- methodological and applied -- with a particular focus on (1) Manufacturing Systems, (2) Transportation Systems, (3) Big Data/Analytics, (4) Cognitive Ergonomics and (5) Engineering Education. Individuals at all ranks are encouraged to apply.

Candidates must have a Ph.D. and must demonstrate a strong commitment to high-quality research and evidence of teaching potential. Candidates for Associate or Full Professor should have a commensurate record of research publications and are expected to provide organizational and research leadership, develop sources of external funding, build relationships with industry, and interact with faculty colleagues.

Candidates should provide (i) a current C.V., (ii) a list of references, and one-page summary statements describing: (iii) career teaching plans; (iv) research plans, and (v) course (teaching) evaluations for candidates with prior teaching experience. Candidates should have their references send recommendations to us directly at IOEFacultySearch@umich.edu. The deadline for ensuring full consideration of an application is October 25, 2015, but the positions will remain open and applications may still be considered, at the discretion of the hiring committee, until appointments are made.

We seek candidates who will provide inspiration and leadership in research and actively contribute to teaching. We are especially interested in candidates who can contribute, through their research, teaching and/or service, to the diversity and excellence of the academic community. The University of Michigan is responsive to the needs of dual career families.
Please submit your application to the following: Web: http://ioe.engin.umich.edu/people/fac/fac_search/ If you have any questions regarding the web application submittal process or other inquiries, please contact Gwendolyn Brown at gjbrown@umich.edu or (734) 763-1332.
QUEEN’S SCHOOL OF BUSINESS (Queen’s University, Kingston, Ontario CANADA) invites applications for two tenure-stream positions in Management Science/Operations Management. Candidates at all levels of experience will be considered. Entry-level candidates must have a PhD or be near completion. Applicants with research interests in all areas of operations management and management science will be welcomed, but we are particularly interested in at least one new faculty member with research interests in applications of online, real time, optimization and analytics. The successful candidate will exhibit potential for outstanding scholarly research and excellent teaching in support of the School’s public and private programs and will be expected to make contributions in service to the School, to the University, or the broader community. The MSOM group is a strong research group with particular expertise in revenue management and pricing, sustainability, energy markets, supply chain management and the interface of operations and marketing. For more information about our faculty see http://business.queensu.ca/faculty_and_research/index.php . The University invites applications from all qualified individuals. Queen’s is committed to employment equity and diversity in the workplace. All qualified candidates are encouraged to apply; however, in accordance with Canadian Immigration requirements, Canadian citizens and Permanent Residents of Canada will be given priority. The University will provide support in its recruitment processes to applicants with disabilities, including accommodation that takes into account an applicant’s accessibility needs - klewis@business.queensu.ca . The effective date of the appointment will be July 1st, 2016, but is flexible. Please submit a cover letter, current CV, references and a research sample, electronically to: omrecruiting@business.queensu.ca
The University of Michigan is a non-discriminatory, affirmative action employer.
Ingenuity meets passion. Big data meets the human heart. Data science is more than a skill – it’s a way of life. It rewards grit as much as talent. Failure, curiosity, and small successes lead to discovery.
DataScienceBowl.com
This year’s Data Science Bowl will empower data scientists from across the world to improve both the ability to diagnose and the capacity to care for those with heart disease, enabling people to live longer, healthier lives. At stake is a $200,000 prize to those able to observe the right patterns, ask the right questions, and create an algorithm to cut the time and cost of the traditional diagnostic process.
FIVE-MINUTE ANALYST
Predicting Navy football games
My longtime followers will know that as a graduate of the U.S. Naval Academy, I have a soft spot for Navy football. I’ve written about it before (Analytics magazine, November/December 2013 issue). I also have been very excited about the re-launch of the INFORMS Section on O.R. in Sports (SpORts) led by Walt DeGrange and Scott Nestler. With the epic season that Navy just had combined with new directions in INFORMS, I decided to write about predicting football games. Now, there are people who make a living at this, and what I propose here is just to spark discussion and pique interest. Taking the game history to date, we can use the (observed) expected points scored and variance to compute a probability of win by imposing normality. (I say “impose” normality because I have no basis on which to assume it.) Letting X denote the differential score (to date), N denote Navy and O denote opponent, we have:
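A minimal sketch of this kind of calculation, assuming the differential X is Navy's score minus the opponent's score and that X is treated as normal with the sample mean and standard deviation of the games played so far. The function name `win_probability` and the sample differentials are hypothetical and are not taken from the column; this is one plausible reading of the setup, not necessarily the author's exact formula.

```python
from statistics import NormalDist, mean, stdev


def win_probability(differentials):
    """Estimate P(X > 0), where X is the score differential (Navy minus opponent).

    Assumption: X is modeled as Normal(sample mean, sample standard deviation)
    of the differentials observed to date. Normality is imposed, not verified.
    """
    if len(differentials) < 2:
        raise ValueError("need at least two past games to estimate a variance")
    mu = mean(differentials)
    sigma = stdev(differentials)
    if sigma == 0:
        # Degenerate case: every past game ended with the same margin.
        if mu > 0:
            return 1.0
        if mu < 0:
            return 0.0
        return 0.5
    # P(X > 0) = 1 - P(X <= 0) under the fitted normal distribution.
    return 1.0 - NormalDist(mu, sigma).cdf(0.0)


if __name__ == "__main__":
    # Hypothetical differentials for illustration only (not real game data).
    past_games = [10, 20, 30]
    print(f"Estimated win probability: {win_probability(past_games):.3f}")
```

To respect the "history to date" constraint, the function would be called before each game with only the differentials from games already played.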
BY HARRISON SCHRAMM
ASSESSING THE PREDICTION

This turned out to be more difficult than expected. Instead of constructing a rigorous hypothesis test, we will compare our differential method with some other candidate metrics:

1. The “Null” method, which assumes a probability of .5 for each game
2. The “Plebe” method, which chooses Navy as winner every game
3. The “Streak” method, which takes the previous game as a predictor for success in the next game

We use a “Complementary Brier Score.” We chose this somewhat arbitrarily; making arbitrary choices is a benefit of authorship. For n games, the Brier Score is defined as:

\[ B = \frac{1}{n} \sum_{i=1}^{n} (p_i - O_i)^2 \]
Where p is the predicted probability and O is the outcome (1 = win, 0 = loss). In this application, the Brier Score is the same as Mean Square Error. We use the complementary score, i.e., 1-B, to rescale. While this is trivial mathematically, it has the desirable property of giving better predictions a higher score (see Figure 1). Now, we show in Figure 2 our predicted game outcomes (in terms of
probability of Navy win) vs. the actual data. Note that care was taken to ensure that only the game history to date was used in the prediction.

If we are going to predict the outcome of a sporting event, we should see how well our predictions performed. To this end, we compare them with three other potential scoring methods: the “Plebe” method, which always picks a Navy win; the “Streak” method, which uses the previous game’s outcome as the next game’s prediction; and the “naïve” method, which simply gives Navy a 50 percent chance of winning each game. The results of these predictions are presented in Figure 3.

Conclusion: While I don’t plan to give up my daytime operations research job and go into sports analysis, this was a fun and interesting case study. I’d be curious to think about what other methods could be applied with a minimum of data and computing power. ❙

Harrison Schramm (harrison.schramm@gmail.com), CAP, is an operations research professional in the Washington, D.C., area. He is a member of INFORMS and a Certified Analytics Professional.
Figure 1: Differential scores from Navy football vs. opponents; lines above the x-axis indicate wins, while lines below indicate inverse wins, sometimes known as losses.
Figure 2: Results of differential score prediction. This method had two “missed classifications” – it picked Notre Dame as a win and Memphis as a loss.
Figure 3: Comparison of prediction methods as measured by Complementary Brier Score. The “Plebe” model and our proposed differential scoring performed approximately equally by the end of the season, while the naïve and streak models clearly performed worse.
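The scoring comparison described in this column can be sketched in a few lines. The sketch below assumes the Brier Score is the mean squared difference between predicted probability and outcome, and scores the three baseline methods on a made-up sequence of results. All function names and the sample outcomes are hypothetical, and giving the “Streak” method a 0.5 guess for the first game (which has no predecessor) is an assumption, not something the column specifies.

```python
def brier_score(probs, outcomes):
    """Mean squared difference between predicted probability and outcome (1 = win, 0 = loss)."""
    if not probs or len(probs) != len(outcomes):
        raise ValueError("probs and outcomes must be non-empty and the same length")
    return sum((p - o) ** 2 for p, o in zip(probs, outcomes)) / len(probs)


def complementary_brier(probs, outcomes):
    """1 - B, so that better predictions receive a higher score."""
    return 1.0 - brier_score(probs, outcomes)


def null_method(outcomes):
    """The naive baseline: a 50 percent chance for every game."""
    return [0.5] * len(outcomes)


def plebe_method(outcomes):
    """Always predict a Navy win."""
    return [1.0] * len(outcomes)


def streak_method(outcomes, first_guess=0.5):
    """Predict each game from the previous game's result.

    Assumption: the first game has no history, so it gets `first_guess`.
    """
    return [first_guess] + [float(o) for o in outcomes[:-1]]


if __name__ == "__main__":
    # Hypothetical season for illustration only (not real game data).
    season = [1, 1, 0, 1]
    for name, method in [("Null", null_method),
                         ("Plebe", plebe_method),
                         ("Streak", streak_method)]:
        score = complementary_brier(method(season), season)
        print(f"{name}: {score:.4f}")
```

Each baseline only ever looks at games that have already been played, which keeps the comparison consistent with the “history to date” rule used for the differential method.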
To-Do List
Go to the gym
Begin to eat healthy
EXPAND YOUR NETWORK!

JOIN INFORMS

Is the largest association for analytics in the center of your professional network? It should be.
• INFORMS Allows You to Network With Your Professional Peers and Others Who Share Your Interests
• INFORMS Connect, the New Member-only, Online Community Lets You Network with your colleagues
• INFORMS Provides Unsurpassed Networking Opportunities Available in INFORMS Communities and at Meetings
• INFORMS Offers Certification for Analytics Professionals
• INFORMS Helps You Take Leadership Roles to Help Build your Professional Profile
• INFORMS Career Center Provides You with the Industry's Leading Job Board
Join Online Today! http://join.informs.org
THINKING ANALYTICALLY
Crazy cake
Figure 1: Cut the cake: Make sure everyone gets an equal share.

It’s your birthday. Four of your closest friends are with you to help you celebrate. Unfortunately, as a joke, one of your friends has cut the cake into 15 different pieces, each of varying size. The weights of the slices are as follows (in grams): 18, 48, 19, 59, 46, 72, 67, 57, 49, 80, 50, 69, 10, 48 and 83.

Question: How can you pass out all of the slices of cake so that everyone gets an equal amount among the five of you?

Send your answer to puzzlor@gmail.com by March 15, 2016. The winner, chosen randomly from correct answers, will receive a $25 Amazon Gift Card. Past questions and answers can be found at puzzlor.com.
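For readers who would rather search than guess, one possible approach is a small backtracking search that hands out slices one at a time and never lets any share exceed the target weight (the total divided by the number of people). The sketch below is only an illustration; the function name `equal_shares` is hypothetical, and it returns the first valid split it finds rather than every possible one.

```python
def equal_shares(weights, people):
    """Split `weights` into `people` groups of equal total weight.

    Returns a list of groups, or None if no equal split exists.
    Assumes all weights are positive integers.
    """
    total = sum(weights)
    if total % people != 0:
        return None  # the total cannot be divided evenly
    target = total // people

    # Placing the heaviest slices first tends to prune dead ends sooner.
    slices = sorted(weights, reverse=True)
    shares = [[] for _ in range(people)]
    sums = [0] * people

    def place(i):
        if i == len(slices):
            # No share exceeds the target and the totals add up,
            # so every share must equal the target exactly.
            return True
        tried = set()
        for k in range(people):
            # Skip shares that would overflow, and skip shares whose current
            # total was already tried for this slice (symmetric branches).
            if sums[k] + slices[i] > target or sums[k] in tried:
                continue
            tried.add(sums[k])
            shares[k].append(slices[i])
            sums[k] += slices[i]
            if place(i + 1):
                return True
            sums[k] -= slices[i]
            shares[k].pop()
        return False

    return shares if place(0) else None


if __name__ == "__main__":
    cake = [18, 48, 19, 59, 46, 72, 67, 57, 49, 80, 50, 69, 10, 48, 83]
    for person, share in enumerate(equal_shares(cake, 5), start=1):
        print(f"Person {person}: {share} (total {sum(share)} g)")
```

With 15 slices and five people the search space is small, so plain backtracking is enough; larger instances of this kind of partition problem would likely call for an integer-programming formulation instead.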
BY JOHN TOCZEK
John Toczek is the assistant vice president of predictive modeling at Ace Group in the Decision Analytics and Predictive Modeling department. He earned his BSc. in chemical engineering at Drexel University (1996) and his MSc. in operations research from Virginia Commonwealth University (2005). He is a member of INFORMS.
OPTIMIZATION
GENERAL ALGEBRAIC MODELING SYSTEM

High-Level Modeling
The General Algebraic Modeling System (GAMS) is a high-level modeling system for mathematical programming problems. GAMS is tailored for complex, large-scale modeling applications, and allows you to build large maintainable models that can be adapted quickly to new situations. Models are fully portable from one computer platform to another.

State-of-the-Art Solvers
GAMS incorporates all major commercial and academic state-of-the-art solution technologies for a broad range of problem types.
GAMS Integrated Developer Environment for editing, debugging, solving models, and viewing data.
GGIG - A GAMS Graphical Interface Generator
Complex economic models offer a wide range of options for simulation runs and return a vast amount of data, which can be explored and exploited in various ways. The GAMS Graphical Interface Generator (GGIG) enables interaction with such economic models through a Java-based user interface. Originally developed for the Common Agricultural Policy Regionalised Impact (CAPRI) modeling system, GGIG generates a basic graphical user interface (GUI) for GAMS projects based on XML files. These files define the controls for simulation runs and subsequent processing of results. GGIG is currently used in a number of economic and agricultural modeling projects around the world, including the Policy Evaluation Model of the OECD and the Global Trade Analysis Project (GTAP).

GGIG's strengths:
• Handling of complex economic applications through an efficient and portable user interface
• Visualization of results in various ways
• Support of distributed teams through SVN
For further information please visit www.gams.com/ads/ggig.htm
www.gams.com