Analytics March/April 2014


http://www.analytics-magazine.org

DRIVING BETTER BUSINESS DECISIONS

MARCH/APRIL 2014 BROUGHT TO YOU BY:

ANALYTICS IN THE OILFIELD • Exploration & production insights • Hey, what’s the fracking problem?

ALSO INSIDE: • Hottest analytics job markets in the U.S. • Key attributes of analytics professionals • Information decay: What to do about it

Executive Edge: Oversight Systems CEO Patrick Taylor on using operational & strategic analytics to achieve competitive business advantage


INSIDE STORY

Oil & analytics do mix

From the boardroom to the oil field, from healthcare to manufacturing, analytics continues to expand its reach and its impact throughout the business world, but as many of the articles in this issue point out, the analytics profession and those who practice it are only just now scratching the surface of their true potential.

Take the oil and gas industry, the subject of this month’s cover, for example. The industry in general, and exploration and production (E&P) companies in particular, faces many complex problems that could benefit big time from data-driven analysis and solutions, yet the adoption and application of analytics remains sketchy at best.

In “Analytics in the oilfield,” Warren Wilson notes that while many E&P companies have embraced business intelligence and other analytics tools in their back offices, they are way behind the curve in terms of operations technology. Wilson, a long-time IT analyst in the E&P sector and an oil field roughneck in a previous life, says drilling data is routinely gathered in real time so that a rig can be shut down if problems arise, yet the data is then discarded, “foreclosing any opportunity to look for patterns that could enable earlier problem detection and point the way toward better practices.” Along with discarding potentially valuable data, Wilson says the E&P sector also suffers from data fragmentation, further hampering analytical efforts.

Atanu Basu follows Wilson’s article with a look at how prescriptive analytics can reshape fracking in oil and gas fields. Basu, CEO of a company whose prescriptive analytics software focuses on improving oil and gas exploration and production, notes that current fracking practices are quite inefficient: Horizontal drilling and hydraulic fracturing recover 20 percent or less of the oil in the shale rocks. That means drillers spent $31 billion in 2013 on suboptimal frack stages across 26,100 wells in the United States.

Given the sky-high cost of oil exploration and production, and the potential for analytics to boost operational efficiency, E&P companies are leaving a lot of money on the table. Sure, oil companies make a ton of money, but why would they spend it on inefficient operations when they don’t have to?

– PETER HORNER, EDITOR
peter.horner@mail.informs.org
www.informs.org


OPTIMIZE YOUR BUSINESS WITH UNPRECEDENTED SPEED

IDEA: IN A FEW HOURS
PROOF OF CONCEPT: IN A FEW DAYS
OPTIMIZATION APP: IN A FEW WEEKS
MISSION CRITICAL ENTERPRISE APP: IN A FEW MONTHS
PUBLISHED INSTANTLY TO YOUR ENTERPRISE OPTIMIZATION APP STORE

To learn more about AIMMS Optimization Apps, visit aimms.com. info@aimms.com | +1 425 458 4024


CONTENTS


FEATURES

EXECUTIVE EDGE: DATA POWER
By Patrick Taylor
Savvy execs make the most of data analytics by using insights on both a strategic and operational level.

INFORMATION DECAY
By Dhiraj Rajaram, Krishna Rupanagunta and Aditya Kumbakonam
The value of information diminishes over time. Here’s what enterprises need to do in response.

ANALYTICS IN THE OILFIELD
By Warren Wilson
Predictive and other forms of advanced analytics can yield crucial insights for exploration and production companies.

WHAT’S THE FRACKING PROBLEM?
By Atanu Basu
Fracking is an inefficient means to capture oil and gas, but hybrid data, big data and analytics can turn it around.

WHERE THE ANALYTICS JOBS ARE
By Scott Nestler, CAP
An analysis of data from LinkedIn and other sources reveals the best job-hunting grounds in the U.S.

CORPORATE PROFILE: GENERAL MOTORS
By Jonathan H. Owen, David J. VanderVeen and Lerinda L. Frost
GM uses advanced analytics to meet auto industry challenges, provide value to customers and company.


New ASP V2014 with Dimensional Modeling

ANALYTIC SOLVER PLATFORM
Easy to Use Predictive and Prescriptive Analytics in Excel

How can you get results quickly for business decisions, without a huge budget for “enterprise analytics” software, and months of learning time? Here’s how: Analytic Solver Platform does it all in Microsoft Excel, accessing data from PowerPivot and SQL databases. Sophisticated Data Mining and Predictive Analytics Go far beyond other statistics and forecasting add-ins for Excel. Use classical multiple regression, exponential smoothing, and ARIMA models, then go further with regression trees, k-nearest neighbors, and neural networks for prediction, discriminant analysis, logistic regression, k-nearest neighbors, classification trees, naïve Bayes and neural nets for classification, and association rules for affinity (“market basket”) analysis. Use principal components, k-means clustering, and hierarchical clustering to simplify and cluster your data.

Help and Support to Get You Started Analytic Solver Platform can help you learn while getting results in business analytics, with its Guided Mode and Constraint Wizard for optimization, and Distribution Wizard for simulation. You’ll benefit from User Guides, Help, 30 datasets, 90 sample models, and new textbooks supporting Analytic Solver Platform. Analytic Solver Platform goes further than any other software with Active Support that alerts us when you’re having a problem, and brings live assistance to you right where you need it – inside Microsoft Excel. Find Out More, Download Your Free Trial Now Visit www.solver.com to learn more, register and download a free trial – or email or call us today.

Simulation, Optimization and Prescriptive Analytics Analytic Solver Platform also includes decision trees, Monte Carlo simulation, and powerful conventional and stochastic optimization for prescriptive analytics.

Tel 775 831 0300 • Fax 775 831 0314 • info@solver.com



REGISTER FOR A FREE SUBSCRIPTION: http://analytics.informs.org


DEPARTMENTS
Inside Story
Analyze This!
Forum
Healthcare Analytics
INFORMS Initiatives
Conference Previews
Five-Minute Analyst
Thinking Analytically

Analytics (ISSN 1938-1697) is published six times a year by the Institute for Operations Research and the Management Sciences (INFORMS), the largest membership society in the world dedicated to the analytics profession. For a free subscription, register at http://analytics.informs.org. Address other correspondence to the editor, Peter Horner, peter.horner@mail.informs.org. The opinions expressed in Analytics are those of the authors, and do not necessarily reflect the opinions of INFORMS, its officers, Lionheart Publishing Inc. or the editorial staff of Analytics. Analytics copyright ©2014 by the Institute for Operations Research and the Management Sciences. All rights reserved.


INFORMS BOARD OF DIRECTORS
President: Stephen M. Robinson, University of Wisconsin-Madison
President-Elect: L. Robin Keller, University of California, Irvine
Past President: Anne G. Robinson, Verizon Wireless
Secretary: Brian Denton, University of Michigan
Treasurer: Nicholas G. Hall, Ohio State University
Vice President-Meetings: William “Bill” Klimack, Chevron
Vice President-Publications: Eric Johnson, Dartmouth College
Vice President-Sections and Societies: Paul Messinger, CAP, University of Alberta
Vice President-Information Technology: Bjarni Kristjansson, Maximal Software
Vice President-Practice Activities: Jonathan Owen, General Motors
Vice President-International Activities: Grace Lin, Institute for Information Industry
Vice President-Membership and Professional Recognition: Ozlem Ergun, Georgia Tech
Vice President-Education: Joel Sokol, Georgia Tech
Vice President-Marketing, Communications and Outreach: E. Andrew “Andy” Boyd, University of Houston
Vice President-Chapters/Fora: David Hunt, Oliver Wyman

INFORMS OFFICES
www.informs.org • Tel: 1-800-4INFORMS
Executive Director: Melissa Moore
Meetings Director: Laura Payne
Marketing Director: Gary Bennett
Communications Director: Barry List
Headquarters: INFORMS (Maryland), 5521 Research Park Drive, Suite 200, Catonsville, MD 21228
Tel.: 443.757.3500
E-mail: informs@informs.org

ANALYTICS EDITORIAL AND ADVERTISING
Lionheart Publishing Inc., 506 Roswell Street, Suite 220, Marietta, GA 30060 USA
Tel.: 770.431.0867 • Fax: 770.432.6969
President & Advertising Sales: John Llewellyn, john.llewellyn@mail.informs.org, Tel.: 770.431.0867, ext. 209
Editor: Peter R. Horner, peter.horner@mail.informs.org, Tel.: 770.587.3172
Assistant Editor: Donna Brooks, donna.brooks@mail.informs.org
Art Director: Jim McDonald, jim.mcdonald@mail.informs.org, Tel.: 770.431.0867, ext. 223
Advertising Sales: Sharon Baker, sharon.baker@mail.informs.org, Tel.: 813.852.9942


New ASP V2014 with Dimensional Modeling

ANALYTIC SOLVER PLATFORM
From Solver to Full-Power Business Analytics in Excel

The Excel Solver’s Big Brother Has Everything You Need for Predictive and Prescriptive Analytics From the developers of the Excel Solver, Analytic Solver Platform makes the world’s best optimization software accessible in Excel. Solve your existing models faster, scale up to large size, and solve new kinds of problems. From Linear Programming to Stochastic Optimization Fast linear, quadratic and mixed-integer programming is just the starting point in Analytic Solver Platform. Conic, nonlinear, non-smooth and global optimization are just the next step. Easily incorporate uncertainty and solve with simulation optimization, stochastic programming, and robust optimization – all at your fingertips.

Comprehensive Forecasting and Data Mining Analytic Solver Platform samples data from Excel, PowerPivot, and SQL databases for forecasting and data mining, from time series methods to classification and regression trees, neural networks and association rules. And you can use visual data exploration, cluster analysis and mining on your Monte Carlo simulation results. Find Out More, Download Your Free Trial Now Analytic Solver Platform comes with Wizards, Help, User Guides, 90 examples, and unique Active Support that brings live assistance to you right inside Microsoft Excel. Visit www.solver.com to learn more, register and download a free trial – or email or call us today.

Ultra-Fast Monte Carlo Simulation and Decision Trees Analytic Solver Platform is also a full-power tool for Monte Carlo simulation and decision analysis, with a Distribution Wizard, 50 distributions, 30 statistics and risk measures, and a wide array of charts and graphs.

Tel 775 831 0300 • Fax 775 831 0314 • info@solver.com


ANALYZE THIS!

Key attributes for analytics professionals

The complexity of both the data sources being integrated and the business problems being addressed under the banner of analytics is continuing to grow, and the breadth of capabilities needed to implement effective solutions is often a very real challenge.

BY VIJAY MEHROTRA


In a recent column, Nicholas Kristof of the New York Times decries the growing isolation of college and university faculty members [1]. Notably, he quotes Will McCants, a Middle East specialist at the Brookings Institution, as saying “Many academics frown on public pontificating as a frivolous distraction from real research.”

Well, I have a long track record of public pontificating, and I’m a big fan of both real research and frivolous distraction. Indeed, this column has now been in every issue of this magazine for the last four years. In addition, I will be speaking at the upcoming Predictive Analytics World 2014 Conference, which will be held on March 17-18 at the Marriott Marquis Hotel here in San Francisco [2] (and I’d love to see you there!).

This public pontificating is particularly satisfying when people respond to your ramblings (hint, hint). Last month’s column was about a few odd interactions with some technically oriented colleagues about what “real” analytics actually was. In response, I received a very thoughtful e-mail from Fredrick Odegaard, a former supply chain analyst and consultant who is now on the faculty at the Ivey School of Business. Fred first proposed his own definition of analytics (“combining sources of information to create valuable insight that is not readily apparent from the data alone”) and then added, “for me, ‘descriptive statistics’ is NOT analytics.



New Solver for Office 365 Excel: Free for Your Tablet or Phone

With our Solver App for Office 365, SharePoint 2013, and Excel 2013, you have all the capabilities of the Excel Solver on your desktop, laptop, tablet or phone. It works in the Excel Web App with all popular Web browsers, and solves your model “in the cloud” on Windows Azure. And it’s free and available now! Just visit solver.com/app or the Office Store online, or use Insert Apps for Office in Excel 2013.

New: Analytic Solver Platform: Multidimensional Models with PivotTables

Analytic Solver Platform 2014 brings multi-dimensional optimization modeling to Excel. Easily create dimensions or index sets and cubes of computed values, using regular Excel formulas – extended to operate over multiple dimensions. Use PivotTables, created in Excel or from databases with PowerPivot, to populate your model with data. Easily create new PivotTables of optimization results.

Find Out More, Download Your Free Trial Now. Visit www.solver.com to learn more, register and download a free trial – or email or call us today. Frontline Solvers – The Leader in Spreadsheet Analytics

Tel 775 831 0300 • Fax 775 831 0314 • info@solver.com



Yet looking objectively at all the advertisement and public manifestations of ‘analytics’ it is 99.9 percent descriptive stats with either (A) fancy charts or (B) tables with a gazillion descriptive statistics.”

Later in his e-mail, he made another very interesting observation: “The hardest part about analytics is not, as most people think, the math. In fact, the math might actually be the easiest part. Analytics require a lot of thinking and a lot of creativity, ingredients that require time and persistence, both of which are in short supply in today’s world. Most managers (and definitely students) cannot and do not want to spend more than a few minutes (or is it a few seconds?!) before receiving gratification. Which means that too often they will take a pie chart or a summary table and rave about their analytics!”

I often hear this kind of thing from analytics managers. Patience, persistence and the ability to function effectively even under a wide variety of pressures (including a shortage of time) just might be the most important attributes for successful analytics professionals. Given some foundational programming and mathematical capability, the knowledge of a particular coding language or a specific statistical technique can be acquired more quickly and more cheaply than ever before; however, there are as yet no effective massive open online courses for the business effectiveness skills


(including problem framing, relationship management, effective communication with non- and less-technical stakeholders) that often determine how big an impact is made.

But don’t get me wrong; I’m not trying to minimize the importance of what some of my colleagues call “technical chops.” The complexity of both the data sources being integrated and the business problems being addressed under the banner of analytics is continuing to grow, and the breadth of capabilities needed to implement effective solutions is often a very real challenge. With most of my MBA students, I feel like there is a clear ceiling on how much of the “solution stack” they will ever truly be able to understand, and I am frankly unclear on what career limitations they may face as a result.

On this note, a company recently contacted me because the number of data scientists on staff had grown substantially since we had last spoken and these people had been identified as key corporate assets to be developed and retained. As part of this initiative, a few analytics leaders within the organization had sketched out competencies and job titles for two distinct career paths – one that led to senior analytics management roles and the other culminating in a highly esteemed (and very well compensated) senior data scientist title.

When asked for my feedback, I had two immediate responses. First of all, the very


existence of such imperfect but constructive proposals for data science careers was itself a huge, positive signal. Too often, business organizations view analytics people as high-priced commodities to be acquired when clearly needed and discharged casually when not. Knowing this, the skilled professional is compelled to make sure that their own financial and intellectual needs are taken care of, even when that means leaving the company for better opportunities (and there are typically many opportunities available to skilled data scientists). In such cases, a data scientist ends up leaving a relatively good situation largely in order to feel appreciated, while the company finds that a unique collection of broad analytics skills and hard-earned domain knowledge has just walked out the door.

Secondly, I was struck by just how many different technical competencies their proposed plan required, even for people who wanted to pursue managerial and leadership roles in data science. When we discussed this, they were adamant about the need for this broad and deep set of capabilities, both in order to be skilled in creating and assessing sources and to be credible within the data scientist community.

Not long after this discussion, a former MBA student of mine came to visit me. “Richie” had taken several courses with

me and had landed an interesting job as a data analyst for a large global organization. After a year and a half on the job, he turned down a good opportunity to move into a line management position. Instead, Richie had decided to go back to graduate school again, this time to get a master’s degree in analytics. “My company doesn’t know how much it is leaving on the table,” he told me, “but I do. I just need more technical capabilities to be a real hero in this kind of environment.” His five-year goal, however, was a senior analytics leadership role, and both he and I were confident that he would get there, because of the broad background from his MBA and also because of his strong commitment to learning and growing on all fronts. When someone with that sort of attitude gets enough technical chops, look out!

OK, I’m done pontificating for now. More next time.

Vijay Mehrotra (vmehrotra@usfca.edu) is an associate professor in the Department of Analytics and Technology at the University of San Francisco’s School of Management. He is also an experienced analytics consultant and entrepreneur, an angel investor in several successful analytics companies and a longtime member of INFORMS.

NOTES & REFERENCES
1. Kristof, Nicholas D., “Professors, We Need You,” New York Times, Feb. 15, 2014.
2. See http://www.predictiveanalyticsworld.com/sanfrancisco/2014/agenda_overview.php for the complete agenda for PAW 2014 SF.



FORUM

Provocative questions for analytics to answer

Gen Y/Echo boomers are accustomed to having easy access to information and are highly self-sufficient in understanding its utility. The next generation after them will not have any fear of analytics or looking toward an “expert” to do the math.

BY GARY COKINS


Consider what young people are learning in school today. They are taught mean, mode, range and probability theory in their freshman university statistics course. Today’s children have already learned some of this math in the third grade! They are taught these methods in a very practical way. If you have x dimes, y quarters and z nickels in your pocket, what is the chance of you pulling a dime from your pocket? Learning about range, mode, median, interpolation and extrapolation follows in short succession.

We are already seeing the impact of this learning with Gen Y/Echo boomers who are getting ready to enter the work force. They are accustomed to having easy access to information and are highly self-sufficient in understanding its utility. The next generation after them will not have any fear of analytics or looking toward an “expert” to do the math.

Given that these analytical capabilities are becoming commonplace, a broad range of problems and opportunities can now be addressed that would have been unimaginable to tackle only a few years ago. I am interested in when the questions listed below might be routinely answered with business analytics, big data, and enterprise and corporate performance management (EPM/CPM) software:



• Why can’t traffic intersection stoplights be more variable based on street sensors that monitor the presence, location and speed of approaching vehicles? Then you would not have to impatiently wait at a red light when there is no cross-traffic.
• Why can’t a call center route your inbound phone call to a more specialized call center representative based on your phone number and your previous call topics or transactions? And once connected, why can’t that call rep offer you rule-based offers, deals or suggestions most likely to maximize your customer experience? Then you might get a quicker and better solution to your call.
• Why can’t dentists and doctors synchronize patient appointment arrival times to reduce the amount of time that so many people collectively waste sitting idly in the waiting room? Then you could show up just before your appointment.
• Why can’t airlines better alert their ground crews for plane gate arrivals? Then passengers don’t have to wait, sometimes endlessly, for the jet bridge crew to show up and open the airplane’s door.
• Why can’t hotel elevators better position the floors the elevators arrive at to pick up passengers based on when hotel guests depart their rooms? Then you don’t have to get stuck on a slow “milk-run” elevator stopping at so many floors while an “express” elevator subsequently arrived and could have quickly taken you to your selected floor.
• Why can’t airport passport control managers regulate the number of agents in synchronization with the arrivals of international flights? Then you don’t have to wait in long queues only to have the extra staff show up (sometimes) much later.
• Why can’t retail stores partner with credit card companies and their transaction histories and use algorithms like Amazon.com and Netflix do to suggest what a customer might want to purchase? Then you might more quickly find what you are shopping for.
• Why can’t water, gas and electrical utility suppliers to home residences provide instant monitoring and feedback so that households can determine which appliances or events (e.g., taking showers) consume relatively more or less? Then households could adjust their usage behavior to lower their utility bills.
• Why can’t personnel and human resource departments do better workforce planning on both the demand and supply side? That is, for the supply side, why can’t they predict in rank order the most likely next employee to voluntarily resign based on statistical data (e.g., their age, pay raise amount or frequency) of employees who have previously resigned? For those who will retire, isn’t this predictable? For the demand side, why can’t improved forecasting of sales volume and mix be translated into headcount capacity planning by type of skill or job group? Then the workforce on hand will match the needs without scrambling when mismatches occur.
• Why can’t magazines you subscribe to print at the time of production a customized issue for you that has advertisements (and maybe even articles) tailored to what you likely care more about based on the profile they may have about you? Then the magazine’s content may be more relevant to you.
• Why can’t your home’s refrigerator and food pantry keep track using microchips and barcode scanners of what you purchased and the rate of usage? Then you could better replenish those items when out shopping.

Are these a vision of the future? Not in all cases. With business analytics software and communication technology some, if not all, of these questions are already solvable.

Analytics not only proves or disproves an analyst’s hypothesis, but its truth-seeking tests also reveal cause-and-effect relationships. Understanding causality supports better decisions by reducing uncertainty. It is a complex world that we live in. It is now time to replace gut feel, intuition and guessing with analytics to better manage organizations and better serve their customers.

Gary Cokins (gcokins@garycokins.com), CPIM, is the founder of Analytics-Based Performance Management LLC, an advisory firm. He is an internationally recognized expert, speaker and author in advanced cost management and performance improvement systems. He previously served as a principal consultant with SAS. For more of Cokins’ unique look at the world, visit his website at www.garycokins.com. He is a member of INFORMS. A version of this article appeared in Information Management.
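As a worked illustration of the pocket-change question posed at the start of this column (x dimes, y quarters and z nickels), the chance of pulling a dime is x / (x + y + z), assuming every coin is equally likely to be drawn. The short Python sketch below computes that fraction exactly; the function name and the sample coin counts are hypothetical, chosen only for illustration.

```python
from fractions import Fraction


def chance_of_dime(dimes: int, quarters: int, nickels: int) -> Fraction:
    """Probability of pulling a dime, assuming every coin is equally likely."""
    total = dimes + quarters + nickels
    if total == 0:
        raise ValueError("the pocket is empty")
    return Fraction(dimes, total)


# Hypothetical pocket: 3 dimes, 2 quarters, 5 nickels.
p = chance_of_dime(3, 2, 5)
print(f"{p} = {float(p):.0%}")  # prints "3/10 = 30%"
```

Using exact fractions rather than floating point keeps the answer in the same form a third-grader would write it.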

Subscribe to Analytics It’s fast, it’s easy and it’s FREE! Just visit: http://analytics.informs.org/



Explore and solve complex problems for FREE FICO® Xpress Optimization Suite… FREE to Academic Institutions and Their Students Solve enormously complex problems, faster Xpress can handle complex problems with millions of variables and constraints. You’ll quickly narrow down choices to the very best (or the N-best) solution when there are innumerable possible solutions that would otherwise be difficult to compare. Explore “What If” Scenarios Xpress offers improved solution sensitivity analysis, making it possible to efficiently explore larger quantities of “what if?” scenarios. It finds the strongest levers by adjusting variables and constraints to measure how far they move optimal operating points. Take advantage of unique features for large scale optimization Xpress offers true 64-bit support for optimization and modeling, allowing users to solve ultra-large-scale models with more than 2 billion coefficients. Learn more about the Xpress Academic Partner Program: fico.com/app © 2014 Fair Isaac Corporation. All rights reserved.


HEALTHCARE ANALYTICS

Algorithm is the new doctor and data is the new drug

Vinod Khosla argued that given the level of service that we seek and eventually receive from 80 percent of physicians, we might be better off receiving that care from a computer with sophisticated algorithms. Khosla fondly named that system “Dr. Algorithm.”

BY RAJIB GHOSH


Two years ago, Vinod Khosla, the luminary venture capitalist and the co-founder of Sun Microsystems, shook the technology and the medical communities with his much-discussed article, “Do We Need Doctors Or Algorithms?” In the article Khosla argued that given the level of service that we seek and eventually receive from 80 percent of physicians, we might be better off receiving that care from a computer with sophisticated algorithms. Khosla fondly named that system “Dr. Algorithm” or “Dr. A,” for short. Later in his follow-up talks, including the recently concluded Rock Health CEO Summit in San Francisco, he ignited the debate further by saying that 80 percent of physicians in the United States can be replaced with machines, and that day is not very far away.

The medical community responded with the argument that healthcare is not about technology – it is about the intersection of technology, science and human emotions, along with the therapeutic touches and listening abilities of a doctor. David Liu, M.D., offered a balanced rebuttal in The Healthcare Blog.

As healthcare analytics continues to evolve in 2014, let’s pause for a few moments and think about the debate at hand. There are some big ideas embedded in it that we as data scientists and big data technologists need to consider seriously. If Khosla is right in his prediction that clinical data analytics will usher in a new era in U.S. healthcare – a sea change that will transform



Will data, analytics and computers replace physicians? Probably not, but they can help improve healthcare by augmenting human capabilities.

healthcare like never before – what Khosla is actually predicting is that healthcare in the future will essentially become a data game! Data is the new drug! In theory, the more data available, the more precise the diagnosis and the more effective the treatment, a problem that, in Khosla’s words, is far less complex than autonomous driving.

With the deluge of data coming from multiple sources, such as wearable and ambient sensors, gene sequencing and digitized encounters, diagnosing a problem in the human body will become a matter of pattern recognition. There could be billions of possibilities, but searching a large set of possibilities with sophisticated algorithms, image processing, machine learning and artificial intelligence is what machines do well. Machines are doing it now and in real time!

Humans create algorithms tapping into their own “knowledge” of today. Machines take that knowledge and develop new knowledge based on the emerging patterns in the data. That’s the whole premise of IBM’s (Dr.) Watson, which is using cancer knowledge created by oncologists plus related data to fine-tune cancer treatments for patients [1]. So why do we need doctors to tell us what ails us when machines are capable of doing this?

CAN MACHINES REALLY REPLACE PHYSICIANS?

The 2010 National Ambulatory Medical Care Survey reveals that, across 1 billion physician office visits, the average number of visits per person is approximately four per year. The most common reason for a visit is a cough, and the most commonly diagnosed condition is “essential hypertension” [2]. Algorithms are available today for diagnosing hypertension. But legal liability and regulatory hurdles play a big role in preventing software developers from declaring a confirmed diagnosis. Decision support software, therefore, seeks confirmation from the clinicians.

Machines also don’t have access to the huge volume of healthcare data that is needed to generate the desired precision in diagnosis. Genomic data is sporadic, and the majority of the clinical encounter data is still not digitized. Further complicating matters, when


electronic data is available, the absence of data liquidity and interoperability within and among healthcare organizations makes it harder to get a holistic view of any patient. IBM’s Watson, therefore, is not just an “advisor”; it is an incredibly expensive “advisor” that takes too long (18 to 24 months) to understand how care pathways work [3].

The key here is that physicians have to let the machines learn from their decisions or mistakes, and as IBM is finding out, that is non-trivial. How do you scale when every project is custom built, takes a long time to complete and yet you are at the mercy of the physicians who fear that they are training their replacement? Moreover, even when personal medical data is available, patients are concerned that seamless data flow among healthcare stakeholders will destroy their privacy and make them more vulnerable to insurance payers and employers. Not an easy problem – is it?

Developing algorithms and technology for the purpose of replacing physicians is the wrong premise to begin with. Having said that, I have to admit that the future of medicine will no doubt embrace a larger role for data and analytics. The barriers that face Dr. Watson today will eventually come down. Business models will emerge. Privacy will be addressed through legislation. Treatments will be personalized in real time. But human beings are social animals – we want to hear from other humans that


no matter what the current situation is, we will be OK! A sick patient wants to go back home with assurance from a human minus the “confidence levels of 90 percent.” Armed with the data and algorithms, doctors of the future will be able to triage patients far more effectively and preemptively, spend more time with those that they need to see, and be the listener, healer and collaborator that a patient expects. This is how Dr. A will help to augment, not replace, the human capabilities to take care of an increasingly aging population that will continue to live longer.

Rajib Ghosh (rghosh@hotmail.com) is an independent consultant and business advisor with 20 years of technology experience in various industry verticals where he held senior-level management roles in software engineering, program management, product management and business and strategy development. Ghosh spent a decade in the U.S. healthcare industry as part of a global ecosystem of medical device manufacturers, medical software companies and telehealth and telemedicine solution providers. He’s held senior positions at Hill-Rom, Solta Medical and Bosch Healthcare. His recent work interests include public health and the field of IT-enabled sustainable healthcare delivery in the United States as well as emerging nations. Follow Ghosh on Twitter @ghosh_r.

NOTES & REFERENCES
1. Laura Nathan-Garner, “The future of cancer treatment and research: What IBM Watson means for our patients,” MDAnderson.org.
2. CDC.gov, “Ambulatory Care Use and Physician Visits,” http://www.cdc.gov/nchs/fastats/docvisit.htm.
3. Spencer E. Ante, “IBM Struggles to Turn Watson Computer Into Big Business,” http://online.wsj.com/news/articles/SB10001424052702304887104579306881917668654





INFORMS INITIATIVES

Continuing education for analytics professionals


INFORMS’ popular continuing education courses, “Essential Practice Skills for Analytics Professionals” and “Data Exploration & Visualization,” will both be held March 28-29 prior to the 2014 INFORMS Conference on Business Analytics & Operations Research in Boston. The Essential Practice Skills course will also be offered June 20-21 (before the INFORMS Conference on Big Data in San Jose, Calif.) and the Data Exploration & Visualization course will be offered June 25-26 (after the Big Data conference). Register early to save on registration.

Patrick Noonan of Emory University will teach the “Essential Skills” course in Boston, and E. Andrew (Andy) Boyd, Texas A&M and the University of Houston, will teach the course in San Jose, while Stephen McDaniel and Eileen McDaniel of Freakalytics, LLC will teach the “Data Exploration & Visualization” course.

INFORMS serves the scientific and professional needs of analytics professionals and operations researchers including educators, scientists, students, managers, analysts and consultants. The Institute is a focal point for analytics and O.R. professionals with its mission to advance the practice, research, methods and applications of analytics and OR/MS.

Be a part of these intensive two-day, in-person courses that deliver real take-away value to implement immediately at work. You’ll leave the classroom ready to apply the real skills, tools and methods of analytics. For more information on the courses and available discounts, visit www.informs.org/continuinged.


Certified Analytics Professional exam schedule

Following is a schedule of upcoming 2014 exam dates and locations for the INFORMS Certified Analytics Professional program:

March 6: Drexel University James E. Marks Intercultural Center, Philadelphia
March 29: INFORMS Conference on Business Analytics and O.R., Westin Boston Waterfront, Boston
March 30: Gartner BI & Analytics Summit, Venetian Resort Hotel & Casino, Las Vegas
April 15: Queens University School of Business, Toronto, Ontario, Canada
May 22: University of Cincinnati, Lindner College of Business, Cincinnati
June 21: INFORMS Conference on The Business of Big Data, San Jose Marriott, San Jose, Calif.





EXECUTIVE EDGE

How savvy execs make the most of data analytics

BY PATRICK TAYLOR

When it comes to analytics, organizations can use data insights on a strategic as well as an operational level. The 2011 film “Moneyball,” based on the bestselling book by Michael Lewis, tells the story of how Major League Baseball’s Oakland Athletics, on a limited budget, compiled an outstanding team of players using deep data analysis to drive their team-building strategy. The team then furthered their use of data to make day-to-day playing decisions based on analytic insights. This combination of strategic and operational analysis led the A’s to an outstanding performance


– making the MLB playoffs and nearly beating the Yankees – with one of the lowest payrolls in baseball.

Many businesses face the daunting challenge of building analytics programs within their organizations, yet become so wrapped up with the system and technology that they fail to realize the full value of the insights. Some kick-start the process focused entirely on the strategic value of the data. Others implement analytics on the operational side, using it to flag exceptions or identify anomalies in the way processes are followed, but never use the data to reveal massive, game-changing findings. The



most successful businesses do both, just like the Oakland A’s. Whether knee-deep in a big data implementation or just starting to explore the options, companies should consider some tips, pitfalls and best practices for getting the maximum value from their data. A good way to start is to make data analytics decisions with eyes wide open about what is truly required for setup, which tools are most effective for the organization, and how to maximize always-limited resources.

NOTE TO SELF: DATA IS NOT PERFECT

Anyone who has ever worked with data understands that no data set is ever “clean.” The situation becomes even more complicated when organizations are pulling data from multiple production applications.

A few examples highlight the enormous, unavoidable challenges associated with data inconsistencies. Consider an international company looking to identify fraud in offices worldwide. The company may start with a database of countries with the highest risk of corruption, and then evaluate transactions for those countries. In different production applications, countries may be recorded in different ways depending on the system, the purpose for which the information was captured, and the individual who entered the data. For example, South Korea may be entered as a standard two-letter abbreviation such as “KR” in one system, and specified in various other standard text formats such as “South Korea,” “Korea, South” or “Republic of Korea” in others.
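The South Korea case can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the alias table `COUNTRY_ALIASES` and the helper names `tokens` and `canonical_country` are hypothetical, and the table holds just the spellings quoted above. Each spelling is reduced to an unordered set of lowercase words, so “South Korea” and “Korea, South” resolve to the same entry.

```python
import re

# Hypothetical alias table (an assumption for illustration): each ISO 3166-1
# alpha-2 code maps to spellings that different source systems might use.
COUNTRY_ALIASES = {
    "KR": ["KR", "South Korea", "Korea, South", "Republic of Korea"],
}


def tokens(text):
    """Lowercase the text, ignore punctuation, and return its words as a frozenset."""
    return frozenset(re.findall(r"[a-z]+", text.lower()))


# Index every alias by its word set so word order and punctuation do not matter.
ALIAS_INDEX = {
    tokens(alias): code
    for code, aliases in COUNTRY_ALIASES.items()
    for alias in aliases
}


def canonical_country(raw):
    """Return the ISO code for a raw country field, or None if it is not recognized."""
    return ALIAS_INDEX.get(tokens(raw))


if __name__ == "__main__":
    for raw in ["KR", "South Korea", "Korea, South", "Republic of Korea", "Atlantis"]:
        print(repr(raw), "->", canonical_country(raw))
```

Unrecognized values return `None` rather than a guess, so they can be routed to a person for review instead of being silently dropped from a risk screen.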


Similar issues exist for person names. Taking only the United States as a simple case, names are generally straightforward, with a first and last name such as “John Smith.” However, sometimes middle names are captured, such as “John James Smith,” or the names are entered in an alternate format such as “Smith, John.” In a simple text comparison, “John Smith,” “John James Smith” and “Smith, John” do not match; however, they could be the same person. It gets more complex internationally, where people may use up to five or six name components. To accurately identify activities associated with a particular person, the analytics tool must be flexible and intelligent enough to allow for various name formats. There are many possible solutions, such as normalizing names to remove special characters and standardize formats; breaking the names down into components and matching on various combinations of the name components (tokenizing); and cross-referencing known alternate spellings to standardized names such as ISO country names. The important thing is to ensure that the analytics solution being used is capable of effectively handling variances. A brittle solution that only accommodates a single naming convention will likely have issues.

CLOUD-BASED ANALYSIS VS. ANALYTICS

The benefits of moving to the cloud are widely recognized. Scalability, accessibility and expandable horsepower and storage provide resources precisely when and where they are needed. As a result, many companies are turning to cloud-based analytics: analytics tools available in the cloud. While cloud-based analytics solutions present all of the familiar benefits



associated with the cloud, they still require the same data scientist prowess needed to power in-house analytics solutions. Statistical knowledge, business understanding and analytical savvy are all required to use cloud-based analytics programs to effectively bridge the gap between business questions and meaningful data insights.

Cloud-based analysis is a new breed of solution that takes on much of the “heavy lifting” when it comes to analysis. With cloud-based analysis, domain expertise is resident in the solution. Rather than


exclusively serving as an analytics tool in the cloud, cloud-based analysis also offers pre-configured analytic queries to apply to the data sets found in a given industry. Companies can upload their data, which is then analyzed using a series of vetted, tried and true statistical analyses and algorithms that instantly reveal actionable insights for that particular industry. These cloud-based analysis solutions are best suited for operational analysis where best practices and industry norms are the most relevant.


For example, in travel and expense management, companies rely on categorization of expenses to help classify and report for trending purposes, but also to prepare tax filings, which may include different deduction rates depending on the expenditure category. Employees may either inadvertently or purposefully misclassify expenses. Cloud-based analysis can analyze T&E expenses for miscategorization, frequent offenders and merchants associated with multiple misclassified expenses. With this insight, the company can investigate to determine if there is fraudulent activity taking place, if certain inappropriate merchants (e.g., dating services expensed as “meals”) should be blocked, or if a process or policy change needs to be implemented to guard against problems.

Cloud-based analysis puts available data to work immediately, asking key questions and delivering business-critical insights on day one. Minimal ramp-up time is required and enterprises can start seeing trends immediately. These new solutions enable companies to see immediate benefit from analytics and also avoid the lead time and resources required to progress through the learning curve of which questions to ask, which queries to configure and how to deliver meaningful reports. Companies should consider if there are areas where cloud-based analysis can deliver immediate operational value,


allowing analytics gurus to focus on “deep dive” strategic issues.

ACCOUNT FOR NUANCES OF THE BUSINESS

While some expenses may be a red flag for almost any business (e.g., dating services), beyond the most obvious examples, determining what kinds of transactions represent a possible risk for a particular company is a critical first step to ensuring the analytic reports delivered are valuable. For every industry and every business, there are differences in what qualifies as “typical” or “atypical.” For example, a large invoice to a plumbing vendor may represent a red flag to a pharmaceutical company, but be quite typical for a construction company. Likewise, a $500 dinner expense at the Ritz in New York may not be uncommon for a company with all East Coast clients, but the same dinner expense at Robert’s Restaurant (part of the Scores strip club) may be a red flag. The type of business, number of daily transactions and specific situations combine to make each company different.

Managers typically understand these exceptions and anomalies, but they may not come to mind when initiating an analytics program. There are a couple of ways to capture and integrate this information. One is to start off with a questionnaire prior to implementing an analytics solution. A survey may



cue managers to think of anomalies about their business. The following questions may encourage managers to think along the right lines:
• What types of vendors indicate possible risk for your business?
• Is there a size/number of transactions per week/month that is typical of your business?
• What policies/guidelines are in place that you typically find employees skirting to avoid hassle or make transactions easier? (For example, breaking expenses in half to avoid expense limits that require a long pre-approval process.)

With the answers to these questions in mind, the analyst can gain a better understanding of what to be looking for, and perhaps more importantly, what not to be looking for in results. Sometimes an even easier way to get to this information is for the analyst to deliver the first set of reports and then collect feedback in real time. Managers typically don’t understand statistical calculations, but they do understand well-delivered results and have a keen eye for identifying when something is amiss. Based on reactions to initial reports, the analyst can adjust the queries/algorithms to take into account the newly shared insights. For example, a retail analyst may identify that 75 percent


more cash refunds for product were issued at register No. 4 than at any other register. This is a potential red flag for fraudulent returns perpetrated by the cash register operator. However, in looking at the report, the manager may know that to keep customers moving quickly through check-out, refunds are directed to the customer service desk (home to register No. 4), where these transactions are handled whenever possible to prevent delaying other customers. This operational policy needs to be taken into account in the analysis so that the “normal” volume for register No. 4 refunds is appropriately adjusted. By spending some time up-front and in the first few cycles of analysis to account for nuances in the business, analysts can set up much more valuable reports and avoid time and energy spent on mislabeled red flags.

LOOK BEYOND OPERATIONAL ANALYSIS

Leveraging analytics for operational analysis is a great place to start due to the quick ROI and powerful insights yielded in a short time. However, as in the example of the Oakland A’s, the savviest organizations should use analytics for both operational and strategic insights. Once organizations become comfortable with operational analysis to deliver insights for better day-to-day decision-making, it is easy to fall into


a pattern of contentment. However, once cloud-based analysis gets rolling, it should leave in-house talent with the bandwidth to explore strategic-level queries that could lead to the next “ah-ha” discovery that will reshape the business. If cloud-based solutions can be leveraged for some day-to-day analysis, then analysts with true domain expertise can focus their energies on coming up with the next big discovery.

Companies often know the questions they would like to have answered. Big, game-changer questions like: How can we know which past customers of one product are the most likely customers of a new product? Or, which new markets are potentially the most lucrative? Data analytics holds the answers to these questions, but arriving at the ultimate answer often requires some lead time and many interim answers. It can take months or even years to investigate these questions. Therefore, companies should begin applying their analytic manpower to those big questions as soon and as efficiently as possible.

Strategic-level analysis may also be conducted at different times of year and at different intervals than continuous monitoring. For example, at the end of the year, managers may be making strategic sourcing plans and may wish to identify vendors that caused the most problems over the last year, requiring a different kind of analysis of

the data with comparison against a different baseline. Likewise, larger trends may require a comparison of a full year’s results over those of the last several years to identify operational challenges or sales trends.

Analysts also need to focus on delivering information in a consumable format that is understandable and usable by their “customers.” Sets of data in tables may not be as understandable to the typical business user as a chart or graph. A chart may also be enhanced with accompanying explanatory text. Delivering the information in a way that is too challenging may leave critical insights unaddressed.

It is when continuous monitoring is combined with strategic data analysis that companies fully realize the value of analytics. The Oakland A’s went beyond conventional baseball statistics like batting averages and stolen bases to perform much deeper, more rigorous statistical analysis to understand and select players. Then they sought to outperform their opponents at each and every game by using more tactical insights such as having batters take more pitches to tire the opposing pitcher. The combination led to astounding success. In the same way, savvy executives can outperform their competitors with a combination of strategic and continuous analytics.

Patrick Taylor is CEO and founder of Oversight Systems, a provider of business analytic software. He is a member of INFORMS.



DATA ASSETS

Information decay

How the value of information diminishes over time.

BY DHIRAJ RAJARAM, KRISHNA RUPANAGUNTA AND ADITYA KUMBAKONAM

For those of you who still remember high school chemistry, you may recall that radioactive decay is an inherent property of all matter. And as a quantum physicist would tell you, while it is impossible to predict when a particular atom will decay, the chance that a given atom will decay is constant over time. We believe that the same principle holds true for information within an organization as well: While it is difficult to predict when a particular information entity (e.g., a set of data records) will lose its relevance for a decision-maker, it is certain that all


information loses value over time. The parallels are striking – so much so that we believe that every information entity should have an attribute called “information decay” that describes how the value of this information decreases over time, much like the half-life captures the rate of decay for all matter. As far as the information entity is concerned, this phenomenon has accelerated in recent times as advances in analytics, data and technology have transformed the way organizations leverage information to drive decisions. Thanks to an explosion in data, there is not only a lot of information, but the rate of information accumulation is accelerating as well. Combine that with a highly dynamic business environment, and it starts becoming clear that the value of each information entity is decreasing at an ever-faster rate.

There is no better example of information decay than that of the Oakland A’s from 1996-2004, famously storied in the book “Moneyball.” Starting in 1996, the team adopted a novel approach to scouting, driven by using an analytical, evidence-based (“sabermetric”) approach. The results were dramatic, as the A’s made it to the playoffs four straight years starting in 2000. Other teams then caught on, and the A’s lost their advantage rapidly – a case of rapid information decay.

An example of information decay that marketers would recognize is the half-life associated with GRPs/TRPs or impressions to create a derived metric – ad-stock, used to quantify the impact of marketing. The delayed effects of marketing campaigns have been well understood and have been successfully leveraged to measure short- and long-term effects on revenue and brand equity.

WHY DOES INFORMATION DECAY?

Information decays for several reasons, and, as is usually the case, more than one of the following reasons is typically at play:

Information becomes outdated: In many situations, information has a temporal value that decays unless refreshed on a regular basis. Consumer credit scores, for instance, need to be continuously refreshed in order to retain their value, which is directly impacted by the refresh frequency. In a world where a combination of data and technologies make near real-time refreshes possible, the information decay of consumer credit scores is increasing.

Natural decay of information: As technologies evolve, some information elements begin to lose relevance. Traditionally, surveys were the preferred (often, the only) method for companies to get a pulse of their customer base. However, with the explosion of e-commerce and social


media, companies are increasingly tapping multi-channel data sources to better understand the moments of truth in the customer lifecycle. They are tapping into these newer, richer sources for better customer insights, and in the process the information value of surveys is diminishing.

The “efficient information hypothesis”: In finance, the efficient market hypothesis posits that the prices of traded assets reflect all the available information. As the access to information increases, its value decays. This information decay is accelerating at an unprecedented rate, thanks to technology. A case in point is competitive pricing. Once upon a time, not very long ago, competitive pricing strategies kept scores of managers busy in organizations. And then came the Internet and with it, website scraping, which has given organizations the ability to track real-time changes to competitor prices. Big data technologies allow multiple retailers to dissect every price change in the ecosystem in near real time, sucking away any possible arbitrage opportunity. In other words, the Internet has accelerated the information decay rate of competitive pricing.

Another situation that is all too familiar for city dwellers is traffic information. As real-time information about traffic flows (or more likely snarls) becomes available, this triggers a bandwagon effect


of redirecting the traffic to the hitherto unclogged routes, pulling them into gridlock as well. The value of the traffic information comes down, and the speed with which this information is distributed determines its rate of information decay.

The “observer effect”: One of the more esoteric concepts in quantum physics, this refers to the changes that the very act of observation causes in the phenomenon being observed. This is well known in stock markets – often, the very act of an analyst initiating coverage of a relatively unknown stock brings attention to the stock. And more eyes on the stock can change the dynamics of the stock, altering the information decay of the stock price. Until, of course, the efficient market hypothesis kicks in and brings the stock back to its natural levels.

WHY DOES INFORMATION DECAY MATTER?

Data is the “new oil” of the 21st century, and companies are fast accumulating data assets. Organizations need to invest in extracting the true value from data by institutionalizing a culture of data-driven decision-making. As they embark on this journey, managers would do well to recognize that the value of data, like any asset, depreciates over time. To begin with, any data governance process should have a strong data value


audit process. This should be a structured process that, on a pre-defined frequency, takes a critical look at every data element (from raw data to derived metrics) and evaluates its value. If it turns out that the value derived by business from that data element is decaying, follow up with corrective action – either refresh the computation method to revise the data element or replace the data element completely. This alone should set companies well along the way to incorporating the concept of information decay into their data DNA. Over time, we expect information decay – which we now refer to as “µDecay” – to be formalized as an attribute of every information entity. And once organizations begin to measure information decay and drive corrective actions, the value they can extract out of data assets will grow.

Dhiraj Rajaram is the CEO of Mu Sigma, an analytics services provider that provides services to more than 75 Fortune 500 companies. Krishna Rupanagunta is the “geography head” and Aditya Kumbakonam is the “delivery head” at Mu Sigma.
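The half-life analogy that runs through this article can be made concrete with a short Python sketch. It rests on assumptions the article does not state: that value decays exponentially, and that ad-stock follows the common geometric carry-over form in which each period keeps a fixed fraction of the previous period's effect. The function names and the figures in the usage lines are hypothetical.

```python
def remaining_value(initial_value, age, half_life):
    """Value left after `age` time units, assuming exponential decay (an assumption)."""
    return initial_value * 0.5 ** (age / half_life)


def decay_factor(half_life):
    """Per-period carry-over rate that halves an effect every `half_life` periods."""
    return 0.5 ** (1.0 / half_life)


def adstock(impressions, half_life):
    """Geometric ad-stock: each period's impressions plus the decayed carry-over."""
    rate = decay_factor(half_life)
    carried = 0.0
    series = []
    for x in impressions:
        carried = x + rate * carried  # new exposure plus what is left of the old
        series.append(carried)
    return series


if __name__ == "__main__":
    # A data element worth 100 units today, with an assumed half-life of 2 periods.
    print(remaining_value(100.0, age=2.0, half_life=2.0))
    # One burst of 100 impressions followed by silence, half-life of 1 period.
    print(adstock([100.0, 0.0, 0.0], half_life=1.0))
```

In a data value audit of the kind described above, a shorter estimated half-life would argue for refreshing or replacing a data element sooner; how to estimate that half-life from business outcomes is left open here.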



DATA INTEGRATION

Analytics in the oilfield

Properly deployed, predictive and other forms of advanced analytics can yield crucial insights for exploration and production companies.

BY WARREN WILSON

Predictive analytics software has become increasingly attractive as it has increased in capability and fallen in price. It is becoming so powerful that many enterprises consider it a must-have technology. As with any investment decision, however, mounting enthusiasm tends to mute the critics and skeptics who may have valid questions about the rationale. The importance of predictive and other forms of advanced analytics is self-evident. Properly deployed, they can yield crucial insights obtainable


in no other way. Accordingly, exploration and production (E&P) companies should develop an analytics strategy as quickly as possible, if they have not already done so. E&P companies should also be mindful that investments made without clear goals and implementation plans risk wasting critical resources and not achieving the desired results. As E&P companies formulate their plans, they can boost the likelihood of success by taking care to address three basic questions: What business value do you expect to gain,



what data is required to realize that value, and which analytics tools are best suited to your goals? Only when those questions have satisfactory answers can an enterprise move forward with confidence.

Many E&P companies have already embraced business intelligence and other analytics tools in their back offices, particularly for financial management and enterprise resource planning. But they are significantly further behind in operations technology. Many are using only rudimentary IT in the operations that define their industry: exploring for oil and gas, developing reserves and managing production for maximum lifetime value. Drilling data, for example, is routinely gathered in real time so that the rig can be shut down if key measurements such as torque on the drill pipe and bit, or pressure in the mud circulation system, move outside of established limits. But this drilling data often isn’t saved. It is simply discarded, foreclosing any opportunity to look for patterns that could enable earlier problem detection and point the way toward better practices.

So the starting point must be to identify gaps in data capture and plug as many of them as possible. The next challenge is to minimize, or at least significantly reduce, data fragmentation. Typically, exploration, development and production departments have maintained separate data repositories, each with its own data types. These repositories may be further fragmented geographically, for example, if companies organize and store data on the basis of



OILFIELD ANALYTICS


national boundaries or oil-producing regions. Such fragmentation reflects the way IT has typically been adopted: in piecemeal fashion, by local or departmental managers, to address a specific problem. Today, however, the E&P industry (like many others) has become so data-driven that the limitations of piecemeal adoption are all too evident. For one thing, the resulting data fragmentation makes data management less efficient and more expensive than necessary. In addition, it prevents the company from analyzing its data in a comprehensive, unified and forward-looking manner. This, in turn, poses two main problems. One is that fragmentation reduces the value of each type of data – exploration, development and production – individually. The other is that fragmentation makes it impossible to analyze the three types holistically, denying the company the insights that can come only from an integrated approach. The best path forward can vary considerably from one enterprise to another, because piecemeal adoption means that no two companies start from the same place. One E&P company might be getting suboptimal results from its seismic tests and exploration drilling, but not realize it because it lacks the tools to analyze historical data and identify the factors degrading its results (to say nothing of predictive tools). Another company might be drilling too many or too few development wells, or not siting them correctly – again, due to lack of analytical insight. Yet another company might be deferring too much or too little production because it lacks the predictive analytics


capabilities needed to make better deferment decisions. Or it might not have adequate insight into why production in a given well, field or region is declining, and how future production might be optimized using various intervention methods. In addition to different starting points, E&P companies have unique assets. As a result, the key problem for one company may lie in its seismic exploration methods, while for another the main challenge might be production management. Such differences dictate different strategies with regard to data integration and analytics, leading individual E&P companies toward different vendors and applications. Furthermore, E&P companies’ exploration, development and production operations historically have operated as separate departments. They support each other, of course, and information is routinely shared among departments. But integrated approaches have been difficult to impossible because the necessary information has typically been fragmented, housed in separate databases that are isolated from one another. This isolation takes two basic forms. Similar types of data may be isolated geographically. For example, production data may be stored in regional databases that cannot talk to each other. Production data also may be stored in different formats, and/or with different technologies, that prevent holistic analysis. Still, regardless of different starting points and unique challenges, E&P companies share common goals – reducing the drag on business performance that stems from the lack of unified


data and analytics capabilities. The solution starts with creating a platform for analytics by consolidating and integrating data by type (exploration, development or production) over as broad a geographic area or portion of the company as possible. Next is to deploy analytics tools that can optimize the value of historical data in each of the three main categories, while laying groundwork for real-time and predictive tools that can holistically analyze the three main categories of data. Ovum primary research shows that E&P and oilfield services companies are taking up this challenge. In its 2013 ICT Enterprise Insights survey, Ovum interviewed more than 400 IT decisionmakers in E&P (among more than 6,500 across 17 industries). Asked about their priorities in information management, the respondents said data warehousing and data management/integration technologies are among their top priorities for investment this year. Both technologies are important steps in laying the groundwork for broader use of analytics tools. Important though it is, data integration should not be undertaken all at once. Depending on the degree of fragmentation of its existing data, an E&P company may face a complex challenge extracting all of this data, transforming it into a consistent format, and loading it into a new, unified database. Most enterprises will want to rely on an outside company – the analytics software vendor, a systems integrator or both – to do that, rather than build or hire for such skills internally. Pragmatism dictates tackling the problem in



smaller bites – for example, focusing on just one data type across a small to medium-sized geographic area. The approach will depend on what challenges the company is trying to address. If offshore seismic data is a key problem, the company might focus first on analytics that allow it to monitor the data coming back from the seismic vessel in real time. That will allow it to identify poor-quality data immediately and have the operator correct it under its current contract, rather than waiting weeks to


discover the problem and having to engage the contractor again under a new contract. If the company’s main challenges involve development drilling, analytics tools can help determine the optimal number and spacing of wells to optimize yield and production costs. Similarly, predictive and prescriptive analytics tools can help an E&P company maximize the value of a well field’s lifetime production. Such tools also can help to minimize the cost of replacing submersible pumps


(which fail with some regularity, bringing production to a halt), or to choose the best procedures to “work over” a well whose production has fallen due to causes such as sand accumulation or casing deterioration.

Still, while analytics software can deliver significant value in each of these cases, it is important to keep in mind that these examples address the three domains – exploration, development and production – separately. Companies that unify their data to enable holistic analysis of all three domains will understand each of the three much more deeply than they do today. In addition, they will likely find hidden interrelationships that can only be guessed at today. Ultimately, these new and deeper understandings will improve exploration success, bring new efficiencies to the development phase, and increase the lifetime return on their assets and investments.

Warren Wilson (warren.wilson@ovum.com) leads Ovum’s energy team, focusing primarily on IT for upstream oil & gas. His research focuses on the ways in which leading-edge IT such as analytics, information management and mobile/wireless technologies can enable better practices and results throughout the upstream industry. Wilson brings a unique combination of skills to his oil & gas research. He holds a degree in geology, has direct experience working in the oilfield, and spent several years as a journalist covering the exploration and production industry. An IT analyst for the past 15 years, his research has focused on mobile business applications and enterprise applications including ERP, CRM, supply chain management and analytics. Wilson joined Ovum in 2006 when Ovum acquired his former employer, Summit Strategies, where he had worked for the previous eight years. Before becoming an IT analyst, he was a reporter and editor for U.S. newspapers including the Seattle Post-Intelligencer and The Miami Herald. He majored in geology at Carleton College in Northfield, Minn., and later worked in the oilfield as a roughneck and in well logging.



HYBRID DATA

How prescriptive analytics can reshape fracking in oil and gas fields

BY ATANU BASU

The United States is reemerging as an energy superpower. According to the International Energy Agency, by 2016 the U.S. will surpass Saudi Arabia and become the world’s largest oil producer. The domestic energy industry’s recent rise is the result of lower demand through energy efficiency and the rise in production of unconventional oil and gas discovered in underground shale formations. Horizontal drilling and hydraulic fracturing have made it possible to economically produce oil and gas from tight rocks. In October 2013, U.S. oil production reached its highest monthly total in the last 25 years. In Texas, with crude oil production of more than 2.7 million barrels per day, two shale oil fields alone – Eagle Ford and Permian Basin – are on target to produce nearly



2 million barrels of oil equivalent a day in 2013. However, while abundant, shale oil and gas can be difficult to locate and extract. Horizontal drilling and hydraulic fracturing processes are expensive and, some say, potentially harmful to the environment. Another relatively unknown fact – especially to industry outsiders – is that fracking is quite inefficient today: 80 percent of the production comes from 20 percent of the fracking stages. Today, horizontal drilling and hydraulic fracturing recover about 20 percent, probably less, of the oil in the shale rocks. According to PacWest, drillers

will spend $31 billion in 2013 on suboptimal frack stages across 26,100 wells in the United States. In response, some of the largest oil and gas companies are using big data analytics technologies to improve their exploration and production. Big data analytics includes three categories: descriptive analytics, which tells you what happened and why; predictive analytics, which tells you what will happen; and prescriptive analytics, which tells you what will happen, when, why and how to improve this predicted future. Marketers, operations experts, financial officers and other business leaders



RESHAPE FRACKING


have already used prescriptive analytics to improve customer experience, reduce churn, increase up-selling and cross-selling revenue, streamline logistics and enhance other important applications. For the oil and gas industry, prescriptive analytics can help locate fields with the richest concentrations of oil and gas, make pipelines safer, and improve the fracking process for greater output and fewer threats to the environment.

About 80 percent of the world’s data today is unstructured – videos, images, sounds and texts. Until recently, most big data analytics technologies looked only at numbers. The oil and gas industry looked at images and numbers, but in separate silos. However, the ability to analyze hybrid data – a combination of unstructured and structured data – provides a much clearer and more complete picture of the current and future problems and opportunities, along with the best actions to achieve the desired outcomes. For example, to improve hydraulic fracturing performance, the following datasets must be analyzed together:

• images from well logs, mud logs, seismic reports,
• videos from down-hole cameras of fluid flow,
• sounds from fracking recorded by fiber optic sensors,
• texts from drillers’ and frack pumpers’ notes, and
• numbers from production and artificial lift data.

Taking hybrid data into account is critical



because of the multi-billion dollar investment and drilling decisions that are being made by the energy companies regarding where to drill, where to frack and how to frack. It calls for combining disparate computational and scientific disciplines to be able to interpret different types of data together. For example, to algorithmically interpret images (such as well logs), machine learning needs to be combined with pattern recognition, computer vision and image processing. Mixing these different disciplines provides more holistic recommendations regarding where and how to drill and frack, while reducing the chances of problems that could emerge along the way.

For example, by developing detailed analytical signatures – using data from production, subsurface, completion and other sources – one can better predict performing and non-performing wells in a field. This process is supported by the prescriptive analytics technology’s ability to automatically digitize and interpret well logs to create depositional maps of the subsurface. With a better idea of where to drill, companies save invaluable resources by skipping wells that shouldn’t be drilled in the first place. At the same time, they minimize damage to that particular landscape. Prescriptive analytics can be used in other areas of oil and gas production.

In both traditional and unconventional wells, by using data from pumps, production, completion and subsurface characteristics, one can predict failures of electric submersible pumps and prescribe actions to mitigate production loss. Apache Corp., for example, is using analytics to predict failures in pumps that pull oil from the subsurface and to preempt the production loss associated with these pump failures. Another potential application of prescriptive analytics is predicting corrosion or cracks in pipelines, and prescribing preventive and preemptive actions, by analyzing video data from cameras along with other data from robotic devices called “smart pigs” inside these pipelines.

Smarter decisions mean fewer resources consumed, lower environmental impact and greater yields. Successful companies will be the ones that know how to prioritize resources to extract, produce and transport oil and gas in the most efficient and safest manner. Look for big data and prescriptive analytics to play a much bigger role in this space over the coming years.

Atanu Basu is CEO of AYATA, a software company headquartered in Austin, Texas. AYATA’s prescriptive analytics software focuses on improving oil and gas exploration and production. Basu is a member of INFORMS. A version of this article appeared in DataInformed.



CAREER ADVANCEMENT

Where the analytics jobs are

BY SCOTT NESTLER, CAP

“Because that’s where the money is.” – Willie Sutton (on why he robbed banks)

CONSIDER THE OBVIOUS

In pondering an upcoming change of career (or at least employer), a good start is to follow Sutton’s Law, paraphrased in the apocryphal quote above and taught to medical students learning about diagnosis. In a more general form, it states something like, “first consider the obvious.” With regard to where jobs in analytics can be found, the obvious might include considering a mix of large, metropolitan cities and smaller, but technology-centric areas. While this proves to be a good approach, additional research yields results that are interesting, informative and potentially useful to anyone looking for a job in the analytics field.

THE REALM OF POSSIBILITIES

One line of investigation is to use LinkedIn to see where jobs related to analytics can be found. While there are many other job boards available (e.g., general sites like Monster, SimplyHired and Indeed, as well as more focused options like the INFORMS Career Center), this study used LinkedIn as a proxy for the universe of analytics job postings. Using the job search capability, it is possible to do a keyword search for all currently listed positions within a distance of a zip code. While poking around using the Sutton’s Law approach might be a useful start, a more systematic approach seems appropriate.



WHERE THE JOBS ARE


The idea of a metropolitan area seemed to be a good place to start, but what does that include (or leave out)? The U.S. government’s Office of Management and Budget (OMB) defines a number of statistical areas that might provide a useful framework. There are 388 metropolitan statistical areas (MSAs) with population greater than 50,000, and 541 micropolitan statistical areas (mSAs), with population between 10,000 and 50,000. OMB also groups adjacent MSAs and mSAs on the basis of social and economic ties that incorporate commuting patterns; these 169 combined statistical areas (CSAs) seemed a good place to start, but initial exploration revealed that this list does not include MSAs that have only one urban core and therefore omits places like San Diego, Calif., Phoenix, Ariz., Tampa, Fla., and San Antonio, Texas. As these locations may be of interest to jobseekers in the analytics field, another approach is warranted. Further searching revealed a list of 574 (unofficial but commonly used) groupings called primary statistical areas (PSAs), which include all 169 CSAs, 122 (of the 388) MSAs and 283 (of the 541) mSAs. As this assemblage seems to have been developed for studies like this one, the 569 PSAs in the United States (but not Puerto Rico) were considered in this analysis.

LET’S COLLECT SOME DATA

While LinkedIn provides a straightforward search capability (for people, groups and jobs), there is also an advanced search capability. Exploring the advanced query indicates that a


customized query can be produced by editing the uniform resource locator (URL) entered into a Web browser. The simplest representation of a query for searching within 25 miles (a reasonable commuting distance that incorporates a city and surrounding suburbs without overlapping nearby areas) of Washington, D.C., (zip code 20005) looks like this:

http://www.linkedin.com/vsearch/j?keywords=analytics&postalCode=20005&countryCode=us&distance=25

While the standard user interface only allows limited selections (e.g., distances of five, 10, 25, 50 and 100 miles), it is possible to customize the search (e.g., a distance of 12 miles) if desired. Using a few dozen lines of Python code, this can be automated to repeat an identical query (modifying only the zip code) for all 569 PSAs. The script took about an hour to run, due to the addition of some random delays between queries to emulate human behavior and avoid possible LinkedIn account suspension. While the actual number of job listings changes from one day to the next, and even during the course of a day, the results included here are likely representative of the relative job markets applicable to those looking for jobs in analytics.
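The query-per-zip-code loop described above might be sketched as follows. This is not the author's script, only an illustration under stated assumptions: the function names, the delay bounds and the second zip code are hypothetical, the address is the 2014 one quoted in the article and may no longer resolve, and no request is actually sent.

```python
import random
from urllib.parse import urlencode

# Address quoted in the article (2014); it may no longer resolve.
BASE_URL = "http://www.linkedin.com/vsearch/j"


def build_query_url(postal_code, keywords="analytics", distance=25, country="us"):
    """Return the search URL for one zip code, changing nothing else."""
    params = {
        "keywords": keywords,
        "postalCode": postal_code,
        "countryCode": country,
        "distance": distance,
    }
    return BASE_URL + "?" + urlencode(params)


def planned_queries(postal_codes, min_delay=3.0, max_delay=10.0, seed=None):
    """Yield (url, delay_seconds) pairs; a caller would sleep, then fetch."""
    rng = random.Random(seed)
    for code in postal_codes:
        # Random pause between queries; the bounds are assumed, not from the article.
        yield build_query_url(code), rng.uniform(min_delay, max_delay)


# Hypothetical sample: the article's Washington, D.C., zip code plus one more.
for url, delay in planned_queries(["20005", "10001"], seed=1):
    print(f"wait {delay:4.1f}s, then request {url}")
```

Keeping URL construction separate from the pacing makes it easy to check that only the zip code varies from one query to the next.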


BIGGER IS GENERALLY BETTER

On Feb. 1, 2014, there were a total of 11,584 jobs containing the keyword “analytics” on LinkedIn. A total of 339 of the 569 PSAs had no analytics job listings on this day. Quants looking for work can probably skip Fresno, Calif.; Vernon, Texas; and a few hundred other locations. As might be expected, larger cities in general have more analytics jobs than smaller towns. Table 1 shows the 10 largest cities, their population (in millions) and the numbers of analytics job listings. The correlation between population and jobs is quite high (0.85), but even a cursory look at Table 1 shows that some large cities (e.g., Chicago and Miami) might not be “pulling their weight” in terms of providing jobs in analytics.

Metropolitan Area    Population (M)    Jobs
New York                       23.3    2122
Los Angeles                    18.2     512
Chicago                         9.9     166
Washington, D.C.                9.3     660
San Francisco                   8.4    1330
Boston                          8.0     854
Philadelphia                    7.1     253
Dallas                          7.1     443
Miami                           6.4      93
Houston                         6.4     202

Table 1: Analytics jobs in the 10 largest U.S. metropolitan areas.
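As a rough check on the Table 1 figures, the sketch below computes a Pearson correlation over just these ten rows and ranks the areas by residents per listing. The article does not say which areas its 0.85 covers; assuming it was computed over a broader set of PSAs, the ten-row value need not match it (on these rows alone it comes out a little under 0.7). The helper name `pearson` is illustrative.

```python
import math

# (metropolitan area, population in millions, analytics job listings), from Table 1
TABLE_1 = [
    ("New York", 23.3, 2122),
    ("Los Angeles", 18.2, 512),
    ("Chicago", 9.9, 166),
    ("Washington, D.C.", 9.3, 660),
    ("San Francisco", 8.4, 1330),
    ("Boston", 8.0, 854),
    ("Philadelphia", 7.1, 253),
    ("Dallas", 7.1, 443),
    ("Miami", 6.4, 93),
    ("Houston", 6.4, 202),
]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


populations = [row[1] for row in TABLE_1]
jobs = [row[2] for row in TABLE_1]
r_top10 = pearson(populations, jobs)
print(f"correlation over the 10 largest areas: {r_top10:.2f}")

# Areas with the most residents per listing are the ones lagging their size.
laggards = sorted(TABLE_1, key=lambda row: row[1] * 1e6 / row[2], reverse=True)[:2]
print("most residents per listing:", [name for name, _, _ in laggards])
```

Ranking by residents per listing singles out Miami and Chicago, the two cities the text names as not pulling their weight.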


Rank  Metropolitan Area     Jobs
 1    New York, NY *        2122
 2    San Francisco, CA *   1330
 3    Boston, MA *           854
 4    Washington, DC *       660
 5    Seattle, WA            651
 6    Los Angeles, CA *      512
 7    Dallas, TX *           443
 8    Atlanta, GA            394
 9    Austin, TX             308
10    Philadelphia, PA *     253
11    Raleigh, NC            220
12    San Diego, CA          214
13    Houston, TX *          202
14    Minneapolis, MN        187
15    Columbus, OH           182
16    Chicago, IL *          166
17    Richmond, VA           164
18    Denver, CO             145
19    Detroit, MI            130
20    Phoenix, AZ            127

Table 2: 20 metropolitan areas with the most analytics jobs. (* = one of the 10 largest metropolitan areas in Table 1.)

BIGGER ISN’T ALWAYS BETTER

Table 2 shows the 20 PSAs with the greatest number of jobs. Not surprisingly, nine of the 10 largest metropolitan areas (marked with an asterisk) are also included in this list. By the counts in Table 2, the “Top 10” accounted for 7,527 (or 65 percent of all) jobs, while the “Top 20” covered 9,264 (or 80 percent of all) jobs. As expected, the distribution of analytics jobs is skewed toward a small number of locations. Figure 1 shows a Pareto diagram of the first 50 (sorted by number of jobs) PSAs. Note that these locations account for 94 percent of all listed positions. We compute a “persons per job” metric in an attempt to determine which areas are “punching above their weight,” and identify smaller areas that have an unusually high (relative to their total


population) number of analytics jobs. This is computed by dividing the total population by the number of analytics job listings within the PSA. As shown in Table 3, Platteville, Wis., leads in this category with roughly one analytics job opening for every 800 people; keep in mind that this includes people of all ages, including children too young to work and retirees, not just those eligible for work. This compares to an overall average (for those areas with at least one advertised position) of one opening per 122,000 people. So, some areas are a more “target rich environment” than others for job hunters. While looking at some of these smaller areas that provide more “bang for the buck” (in terms of the number of analytics jobs relative to the population) may seem wise,


Figure 1: Pareto diagram of jobs in 50 PSAs.
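The cumulative shares behind a Pareto diagram like Figure 1 can be sketched in a few lines of Python. This is only an illustration: the job counts are transcribed from Table 2, the total of 11,584 listings is assumed from the figure quoted elsewhere in this article, and the names `TOP_20_JOBS`, `TOTAL_JOBS` and `cumulative_shares` are hypothetical.

```python
from itertools import accumulate

# Job counts transcribed from Table 2, sorted from most to fewest listings.
TOP_20_JOBS = [2122, 1330, 854, 660, 651, 512, 443, 394, 308, 253,
               220, 214, 202, 187, 182, 166, 164, 145, 130, 127]

# Assumption: total number of listings, taken from the 11,584 quoted in the article.
TOTAL_JOBS = 11584


def cumulative_shares(jobs, total):
    """Return the running share of all listings covered by the first k areas."""
    return [running / total for running in accumulate(jobs)]


shares = cumulative_shares(TOP_20_JOBS, TOTAL_JOBS)
for k in (10, 20):
    covered = sum(TOP_20_JOBS[:k])
    print(f"Top {k}: {covered} jobs, {shares[k - 1]:.0%} of all listings")
```

Plotting `shares` against the rank of each area would give the cumulative line of a Pareto chart; with these inputs the first 20 areas cover about 80 percent of the listings.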

Primary Statistical Area | Jobs | Population | Per Job
Platteville, WI | 65 | 51087 | 786
Dubuque, IA | 65 | 95097 | 1463
Plattsburgh, NY | 19 | 81654 | 4298
Austin, TX | 308 | 1834303 | 5956
San Francisco, CA | 1330 | 8370967 | 6294
Winona, MN | 8 | 51629 | 6454
Seattle, WA | 651 | 4399332 | 6758
Richmond, IN | 13 | 92375 | 7106
Richmond, VA | 164 | 1231980 | 7512
Raleigh, NC | 220 | 1998808 | 9085
Bennington, VT | 4 | 36697 | 9174
Huntingdon, PA | 5 | 45943 | 9189
Boston, MA | 854 | 7991371 | 9358
Burlington, VT | 21 | 213701 | 10176
New York, NY | 2122 | 23362099 | 11009
Columbus, OH | 182 | 2348495 | 12904
Washington, DC | 660 | 9331587 | 14139
San Diego, CA | 214 | 3177063 | 14846
Atlanta, GA | 394 | 6092295 | 15463
Lewistown, PA | 3 | 46773 | 15591


Table 3: 20 micropolitan (and metropolitan) areas with greatest “jobs per capita.”
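As a minimal sketch of the “persons per job” calculation, the following Python snippet divides population by the number of listings for a few rows transcribed from Table 3. The names `SAMPLE_AREAS` and `persons_per_job` are hypothetical, and rounding to the nearest whole person is an assumption that happens to reproduce the table's values.

```python
# (area, analytics job listings, population) for a few rows of Table 3.
SAMPLE_AREAS = [
    ("New York, NY", 2122, 23362099),
    ("Platteville, WI", 65, 51087),
    ("Lewistown, PA", 3, 46773),
    ("Austin, TX", 308, 1834303),
]


def persons_per_job(population, jobs):
    """Total population divided by the number of job listings, rounded."""
    return round(population / jobs)


# A lower value means more listings relative to population.
ranked = sorted(
    ((name, persons_per_job(pop, jobs)) for name, jobs, pop in SAMPLE_AREAS),
    key=lambda pair: pair[1],
)
for name, ratio in ranked:
    print(f"{name}: one listing per {ratio} people")
```

Sorting in ascending order puts the areas that are “punching above their weight” first, which is how Table 3 is ordered.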


Figure 2: Infographic view of analytics jobs. Circles are scaled to represent the number of jobs in each location.

some caution may be in order. Taking one of the three available positions in Lewistown, Pa., might be appealing to someone who prefers a more rural setting, but if the situation didn’t work out, there might not be any other opportunities in the area, potentially necessitating an unexpected relocation.

A PICTURE IS WORTH 11,584 JOBS

Figure 2 shows an infographic view of the 230 areas with at least one analytics job. The circles are scaled to represent the number of jobs in each location; however, the largest circle (New York) is only 200 times larger than the smallest, not the roughly 2,000 times larger it would be if drawn true to scale. As you can see, they are spread out across the U.S., with opportunities in many geographic regions and climatic zones.

All in all, it appears to be a good job market for those with skills in analytics. However, the majority of the available positions appear to exist in a relatively small number of metropolitan and micropolitan areas. After all, 90 percent of the listed jobs are in only 35 locations. Happy job hunting to those who are looking!

Scott Nestler, Ph.D., CAP®, PStat®, is an Army operations research analyst, a member of INFORMS and chair of the INFORMS Analytics Certification Board (ACB). Disclaimer: The views expressed in this article are those of the author and do not necessarily reflect the official policy or position of the Army, the Department of Defense or the U.S. Government.



Is the largest association for analytics in the center of your professional network? It should be.

▪ Certification for Analytics Professionals ▪ A FREE Community Membership

to join online visit http://join.informs.org

▪ Continuing Ed courses for Analytics Professionals ▪ Online access to the latest in operations research and advanced analytics techniques ▪ Unsurpassed Networking Opportunities available in INFORMS Communities and at Meetings ▪ Subscriptions to online and print INFORMS Publications ▪ INFORMS Career Center - the industry’s leading job board

JOIN


CORPORATE PROFILE

2014 Chevrolet Corvette Stingray

General Motors Using operations research and other advanced analytics to meet auto industry challenges and provide value to customers and company.

BY JONATHAN H. OWEN (left), DAVID J. VANDERVEEN (right) AND LERINDA L. FROST

In today’s world, where globalization is a fact of business life, where competition is fiercely intense and where concerns such as energy security and climate change are global in scope, a science-based approach to decision-making and problem-solving is essential.

At General Motors, operations research provides that framework, particularly for complex issues and systems that involve multiple objectives, many alternatives, trade-offs between competing effects, large amounts of data and situations involving uncertainty or risk. In truth, for an entity the size of General Motors, these are the only kind of challenges the company faces because GM is huge and no issue is simple!

With products that range from electric and mini-cars to heavy-duty full-size trucks, monocabs and convertibles, GM offers a comprehensive range of vehicles in more than 120 countries around the world. Along with its strategic partners, GM sells and services vehicles under the Chevrolet, Buick, GMC, Cadillac, Opel, Vauxhall, Holden, Baojun, Wuling and Jiefang brand names. GM also has significant equity stakes in major joint ventures in Asia, including SAIC-GM, SAIC-GM-Wuling, FAW-GM and GM Korea.

GM has 212,000 employees located in nearly 400 facilities across six continents. Its employees speak more than 50 languages and touch 23 time zones. The work they do demonstrates the depth and breadth of the auto business – from developing new vehicles and product technologies to designing and engineering state-of-the-art plants, organizing and managing the company’s vast global

Alliance for Paired Donation, for "Kidney Exchange" U.S. Centers for Disease Control and Prevention, for "Using Integrated Analytical Models to Support Global Health Policies to Manage Vaccine Preventable Diseases: Polio Eradication and Beyond" The Energy Authority, for "Hydroelectric Generation and Water Routing Optimizer" Grady Health System, for "Transforming Emergency Department Workflow and Patient Care" NBN Company, for "Fiber Optic Network Optimization at NBN Co." Twitter, for "The ‘Who to Follow’ System at Twitter: Strategy, Algorithms, and Revenue Impact" Be there at the Edelman Gala, March 31 when the 2014 winner is announced! http://meetings.informs.org/analytics2014


GM Chairman and CEO Dan Akerson (fourth row, third from left), GM CTO and Global R&D Vice President Jon Lauckner (fourth row, fourth from left) and Global R&D Executive Director Gary Smyth (top) with the GM R&D Operations Research team.

supply chain and logistics systems, building new markets and creating new business opportunities. The work is multifaceted, but whether in Detroit, Frankfurt, Sao Paulo or Shanghai, the goal is straightforward: offer products and services that establish and maintain a deep connection with customers around the world while simultaneously generating revenue and profit for the company.

Considering the complexity of the challenges in the auto business and the speed at which change is occurring in every arena – technology, business, materials and resources, governmental policies and regulations – it is critical to employ a scientific approach in thinking about and attempting to understand problems and implement viable solutions. Today, no area of GM is untouched by analytical methods.

THE EARLY YEARS

Even before the industry entered the current period of globalization and profound technological change, operations research was valued within GM. As early as the 1960s and 1970s, GM employed analytical techniques for transportation studies and traffic flow analysis. In the 1980s, GM developed analytical principles and used mathematical optimization methods to improve assembly line job sequencing. In the 1990s, it patterned warranty cost reduction analyses after Centers for Disease Control epidemiology studies.


In 2005, GM won the Franz Edelman Award from INFORMS for its work on production throughput analysis and optimization. Even when overall industry production capacity is above demand, it is usually the case that demand for certain “hot” vehicles exceeds planned plant capacities. In such cases, an increase in production capacity will generate larger profits via more sales revenue and/or overtime cost avoidance. GM’s operations research team analyzed production throughput using math models and simulation, identified cost drivers and bottlenecks, and developed a throughput improvement process to increase productivity and reduce costs. The resulting software has been enhanced over a 20-year period to extend GM’s capabilities, enabling it to accommodate product and manufacturing flexibility, variable control policies and more complex routing. The software is used globally in GM plants, as well as to design new production systems and processes.

Apply for 2014 THE DANIEL H. WAGNER PRIZE Excellence in Operations Research Practice

Apply to win this prestigious practice prize that rewards professionals who devise innovative analytical methods, utilize those methods in a verifiably successful O.R./analytics project, and describe their work in a clear, well-written paper. Two-page abstract is due by May 1, 2014.

Daniel H. Wagner

This top INFORMS practice prize spans all O.R. and analytics disciplines and application fields. Any work presented in an INFORMS section or society practice-oriented competition is eligible as long as the work did not result in a published paper.

The Wagner Prize competition is high-profile, with its own track at the INFORMS Annual Meeting. Presentations are widely distributed via streaming video. Finalist papers are published as a special issue in INFORMS’ respected practice journal Interfaces. The 2014 competition will be held at the INFORMS Annual Meeting, November 9-12, in San Francisco, California. First-place prize of $1,000 will be awarded at the Edelman Gala, during the April 2015 Conference on Business Analytics and O.R. in Huntington Beach, California.

www.informs.org/wagnerprize


This long-term effort is just one example that demonstrates how GM has applied operations research (O.R.) methods to change the way it leverages O.R. and advanced analytics on a continuing basis. The importance of activities like this in a company the size of GM cannot be fully measured. For plant throughput alone, the savings are estimated at more than $2 billion over the past two decades. But just as important as the economic benefits is the mindset – the scientific approach to problem-solving, decision-making, scheming the business, and identifying new opportunities.

O.R. AT GM TODAY

Given the success of the work described above, the R&D operations research team broadened its mission about five years ago and today provides a research capability within the company focused on tackling long-term strategic challenges. With the wide-ranging scope of potential assignments, the O.R. team is composed of Ph.D. and master’s-level technical experts, along with subject-matter experts with hands-on and executive leadership experience in key areas of the business, such as manufacturing, supply chain, engineering, quality, planning, marketing, and research and development.

Projects are aligned with top company priorities, which are based on a combination of business performance drivers and senior leadership input. The work may start with targeted questions, e.g., what’s the opportunity of (fill in the blank), or it can focus on improving operational effectiveness through process improvements in areas such as manufacturing productivity, capital or supply chain management, or dealer inventory management. Many opportunities to improve revenue management exist through the application of tools and systems that help decision-makers optimize portfolio planning, reduce complexity, target incentives, or optimize content and packaging. In addition, given the large new data streams coming from the intelligence available in today’s vehicles, new emphasis is being put on improving vehicle efficiency, quality and diagnostics, as well as more deeply understanding customers so GM can provide differentiated value through new automotive products and services.

The team’s implementation model comprises a mix of:
• analysis by internal consultants to understand the issue,
• capability development, including analytical principles, math models and tools, and
• partnering with stakeholders and decision-makers early to scope and maximize the potential impact of implemented solutions.

O.R. DRIVES TRANSFORMATION

The R&D operations research team recently received two “teamGM Transformer” Awards for developing business tools that use “big data” and analytics to improve decision-making. This is an internal award that rewards employees who are leading change across the company by finding significant and innovative ways to drive GM business priorities. One of the O.R. team’s Transformer Awards recognized new analytic tools to support GM’s Product Development activity. These include a range of tools that help guide engineering decisions to reduce complexity in the vehicle and powertrain, apply market research to vehicle attribute balancing and optimize portfolio planning in light of greenhouse gas performance objectives. The other award was for development of a new approach

To find an expert to help you, log onto INFORMS Find An Analytics Consultant Database

informs.org/Find-Analytics-Consultant/Search INFORMS is the foremost association of O.R. and analytics experts. Our members literally wrote the book on how analytics and the principles of operations research are used to improve organizational decision making.


to optimizing inbound logistics. This tool is being expanded to support its use by all vehicle programs early in the vehicle development process.

With the exponential growth in data, the ever-expanding digital connection to customers and the introduction of new vehicle technologies, this is an exciting time for operations research at GM. With so many research-rich opportunities, the team is always mindful of the characteristics that are key to successfully applying O.R. methods and achieving organizational excellence, including the ability to:
• choose the right problem to address;
• see and convince others that a complicated problem is important and solvable;
• work as part of a team toward a common and well-defined goal;
• have tenacity in chasing down details and data, and then equal tenacity in the implementation of a solution;
• get the model to the right level of detail for the purpose at hand so it is not too complex, nor too data intensive, but sufficiently detailed to capture the salient characteristics and trade-offs;
• engage the key stakeholders in the process of development and implementation, in order to gain joint ownership. Technology transfer is something that takes place between consenting adults; and
• deliver an O.R. solution to decision-makers in a form or format that they can understand and act upon.

O.R. practitioners who embody these characteristics can have a profound impact on their organization, help their company rise above the competition, and most importantly provide increased value to customers. As the world goes global – as innovation strives to create more, faster, better and at less cost; as new business and technology paradigms emerge – endless opportunities abound to take advantage of operations research and reap the substantial good that can be realized from its practice.

Jonathan H. Owen is director of Operations Research at GM R&D. David J. VanderVeen, now director of analytics in GM Global Product Development, was formerly director of GM R&D Operations Research. Lerinda L. Frost leads executive communications and business support at GM R&D. Owen and VanderVeen are members of INFORMS.

Request a no-obligation INFORMS Member Benefits Packet For more information, visit: http://www.informs.org/Membership



Wanted: Big Data Experts. Introducing the SMU Cox Master of Science in Business Analytics. Across a variety of industries, there’s a need for big data problem solvers. And the SMU Cox School of Business is filling it. We’re training the best and brightest to translate big data into the big picture. All in less than a year’s time. Whether you’re looking to hire the most wanted or become one of them, look no further. Learn more at coxmsba.com.

SMU is an Affirmative Action/Equal Opportunity Institution.


CONFERENCE PREVIEWS

INFORMS Conference on Business Analytics & O.R. Davenport, Kilmer to keynote March 30-April 1 event in Boston.

Tom Davenport

For the past several years the INFORMS Analytics Conference has been setting records for attendance and presentation submissions. Now is your chance to see what all the buzz is about. The 14th Annual INFORMS Conference on Business Analytics and O.R., set for March 30-April 1 in Boston, will feature presentations by more than 100 speakers representing a broad range of industries and application areas. INFORMS expects more than 900 attendees this year, making it the largest analytics-focused event in the world. With special events ranging from career-building programs to world-class prize presentations, this conference is sure to offer valuable content for practitioners and academics alike.

KEYNOTES: ANALYTICS PIONEER, DISNEY SENIOR EXECUTIVE

Kathy Kilmer


Keynote speakers Tom Davenport and Kathy Kilmer will headline this year’s conference. Davenport will speak on Monday, March 31, while Kilmer will speak on Tuesday, April 1. Davenport is the President’s Distinguished Professor of Information Technology & Management at Babson College, a Fellow of the MIT Center for Digital Business, co-founder of the International Institute for Analytics and senior advisor to Deloitte Analytics.



Davenport’s 2006 Harvard Business Review article and best-selling 2007 book, “Competing on Analytics” (co-authored with Jeanne Harris), launched the revolution that has made analytics the hottest business trend and “data scientist” the sexiest job profile. His most recent book, “Keeping Up with the Quants: Your Guide to Understanding and Using Analytics,” with Jinho Kim, has been called “the quantitative literacy guide” for managers. He has written or edited 16 other books and more than 100 articles for Harvard Business Review, Sloan Management Review, the Financial Times and many other publications.

Kilmer, director of sales planning and analytics at Walt Disney Parks and Resorts, oversees the teams responsible for providing analytical and technology integration support across the sales teams and contact centers. Prior to her current role, she was director of industrial engineering at Disney, overseeing more than 120 industrial engineers who serve as internal business consultants. She is on Purdue University’s Engineering Advisory board and INFORMS Analytics Certification Board, and she has been featured on the “Today in America” show. Kilmer is a recipient of IIE’s Fellows Award, Purdue University’s Outstanding IE Alumni Award and Distinguished Engineering Alumni Award, and is in Purdue University’s Engineering Cooperative Education Hall of Fame.

HAND-PICKED TOPICS AND SPEAKERS

The key to the conference’s ongoing success is its program committee, chaired this year by Freeman Marvin, CAP, vice president and executive principal analyst at Innovative Decisions, Inc. “Attendees will have the opportunity to experience firsthand how the application of analytics is changing the way business is conducted across many industries and transforming everyday life for billions of people around the world,” Marvin says.

The 36 members of the committee include analysts and managers from companies such as Google, Target, IBM, Chevron, Amtrak, SAS, Intel and UPS, as well as leading universities and government agencies. The committee develops the topic tracks, selects speakers and organizes the presentations that comprise the heart of the conference. This year the speakers will present talks that are organized into the following focused tracks: The Analytics Revolution, Healthcare Applications, Big Data, Marketing Analytics, Decision Analysis, Soft Skills and Supply Chain Management. New this year is a track organized by the INFORMS Roundtable that will feature world-class O.R. projects in established and mature


analytics companies. The program will be rounded out by six tracks of contributed talks, plus tracks on software solutions. Sixty-eight poster presentations will augment the oral presentations. These visual presentations include case studies, best practice examples and academic research with a practitioner orientation.

AWARD-WINNING ANALYTICS

On the evening of Monday, March 31, the winner of the 2014 Franz Edelman Award will be announced at the Awards Gala and Banquet. The Edelman Award is the highest international award for achievement in operations research. This year’s finalists are Twitter, The U.S. Centers for Disease Control and Prevention, The Energy Authority, Grady Health System, Alliance for Paired Donation and NBN Company. Other high-impact work will be showcased throughout the meeting in talks by finalists and winners of the INFORMS Prize, Daniel H. Wagner Prize, UPS George D. Smith Prize, Innovative Applications in Analytics Award, Gary L. Lilien Marketing Science Practice Prize and the coveted Spreadsheet “Guru” Prize.

SPECIAL PROGRAMS

The conference also offers the following special programs for graduate students and young researchers, as well as learning and networking events for all attendees:
• INFORMS Professional Colloquium: Intensive career guidance for master’s and Ph.D. students interested in practice, held on March 30.
• Richard E. Rosenthal Early Career Connection: Exclusive networking program for junior faculty and young industry practitioners.
• Soft Skills Workshop: Full-day workshop on the “soft” skills needed to partner with decision-makers and users.
• Technology Workshops: In-depth training from leading solution providers. Free to conference registrants.

MARCH 17 DEADLINE FOR EARLY CONFERENCE RATES

Early rates of $965 for INFORMS members and $1,200 for nonmembers are available until March 17. Organizations can take advantage of the $827 team discount rate when they send three or more attendees. All meals for two days are included in the fees. The conference venue, the new Westin Boston Waterfront, is located in Boston’s exciting waterfront area. This area is abundant with great restaurants, shops and public green space – plus easy access to all the history and attractions that make Boston such a fascinating city. For additional conference information, click here.


For your career & your company The following special events will be held in conjunction with the INFORMS Conference on Analytics & O.R. in Boston:

INFORMS Continuing Education

These two-day, in-person courses presented by INFORMS provide real-world value in skills, tools and methods that can be implemented in your work. Two courses will be offered just before the INFORMS Analytics Conference, both held on March 28-29 in Boston:
• Essential Practice Skills for Analytics Professionals: Participants will learn practical tools for integrating their analytical skills into real-world problem-solving for businesses and other organizations. The course provides approaches that can be applied immediately to a wide variety of settings, whether within a participant’s own organization or for an external client.
• Data Exploration & Visualization: Participants can expect to be re-introduced to approaching data in a powerful, yet playful manner. They will see and experience how exploration and visualization can be used to answer existing questions, thereby corroborating or invalidating hunches and preconceptions.

For more information and to register, click here.

Analytics Certification

Analytics certification is offered by INFORMS to provide analytics professionals with a means to distinguish themselves and demonstrate to employers, colleagues and the public that they are competent analytics professionals. Those attaining certification will be able to list “CAP®” after their names. INFORMS is pleased to offer the CAP certification exam at this conference. The exam is offered on Saturday, March 29, at the conference hotel. Attendees of this conference qualify for a discounted, bundled rate of Conference + Certification. Also, don’t miss a special session on March 31 providing information and guidance on the benefits and requirements of the CAP certification.

For more information, click here.

Analytics Maturity Model How well does your organization use analytics? What can you do to progress from good to great? Attend this debut of the new INFORMS analytics maturity model to see how you can score your organization, set target goals, and attain the enormous advantage of analytics leadership. This special session will be on Tuesday, April 1.


INFORMS Big Data Conference

New event, set for June 22-24, to focus on the business of big data.

How do you get from data discovery to return on investment and real business value? The INFORMS Big Data Conference, set for June 22-24 in San Jose, Calif., aims to help you discover just that. This newly launched topical conference will put the focus squarely on the business of big data. A major component of the conference will be case studies of big data projects that illustrate the complete journey from business problem to analytics solution.

The conference committee is being spearheaded by Margery Connor, CAP, senior operations researcher-Advanced Analytics at Chevron, and Diego Klabjan, CAP, professor and director of the Master of Science in Analytics Program at Northwestern University. Other conference committee members are practitioners in the data arena, hailing from organizations such as Humana, IBM, Booz Allen Hamilton, SAIC, Intel, SAS, Alcatel-Lucent, and UPS.

The talks are being arranged into tracks on Case Studies, Big Data 101, and Emerging Trends. Other talks will address topics such as:
• expediting the journey from business problem to analytics solution;
• bridging the gap between decision-makers, IT managers and analytics professionals; and
• selecting and using the right big data technologies.

Speakers are handpicked by the committee from an impressive list of “who’s who” in the field of big data. Anthony Goldbloom of Kaggle, Ion Stoica of UC Berkeley, Paul Kent of SAS and Simon Zhang of LinkedIn have all confirmed they will share their best practices, success stories and lessons learned on real implementation of big data analytics. The Big Data 101 track will offer tutorials on how to navigate the big data ecosystem, how to select and use the right technologies, as well as the challenges of building data science teams. The speakers will also address critical topics such as ethics and privacy requirements.

Bill Franks will deliver the keynote address, “Putting Big Data to Work.” Franks is the chief analytics officer at Teradata Corporation. At Teradata and throughout his career, Franks has focused on translating complex analytics into terms that business users can understand and then helping organizations implement the results effectively within their processes. He is author of the book, “Taming the Big Data Tidal Wave,” and his work has spanned clients in a variety of industries.

The INFORMS Big Data Conference will be held at the San Jose Convention Center. Rooms are being held for conference attendees at the Marriott San Jose, which is connected to the center. For more information, click here.

Job Seeker Benefits • Access to high quality, relevant job postings. No more wading through postings that aren’t applicable to your expertise.

CAREER CENTER

• Personalized job alerts notify you of relevant job opportunities. • Career management – you have complete control over your passive or active job search. Upload multiple resumes and cover letters, add notes on employers and communicate anonymously with employers. • Anonymous resume bank protects your confidential information. Your resume will be displayed for employers to view EXCEPT your identity and contact information which will remain confidential until you are ready to reveal it. • Value-added benefits of career coaching, resume services, education/training, articles and advice, resume critique, resume writing and career assessment test services.

http://careercenter.informs.org


FIVE-MINUTE ANALYST

Lego Brickbox

BY HARRISON SCHRAMM, CAP

At the holidays, many children received special promotional “Brick Boxes” from Lego, which may be taken back to the store and filled from the brick repositories on the back wall in the store. After Christmas, one child, “Norah,” saw one of her friends, “Tyler,” meticulously build a shape to fit exactly in his box. She asked me, “How much better do you think that Tyler did by building an exact shape than I did just by tossing what I wanted into the box?” Like many things, this turned out to be a much simpler question to ask than to answer!

For the remainder of the article, we will use the natural unit of “Lego cubes” (LC), which are the size of a 1x1 Lego brick, as shown in Figure 1. So, a 2x4 brick has an area of 8LC, and so on.

First, an easy problem: The promotional brickbox is 11x11x9 LC, and has a capacity of 1,089 squares. Packing square bricks into a square box is very easy. This turns out to be the only easy thing about this problem.

Computing the distribution of random bricks tossed into a box is difficult, because each layer is dependent on the one below it. Also, real children do things that real children do, such as shake the box to make the Legos settle. A few minutes with a paper and pencil convinced me that this was not the proper approach. So, I decided to simulate. Now, computer simulation has some of the same difficulties – imagine playing a 3-D version of Tetris – but fortunately, this is not the only way



Figure 1: A Standard Lego brick, measuring 8 mm square (1LC in this article’s measurements), with a standard U.S. quarter and Darth Vader for size comparison.

Figure 2: Lego promotional brickbox (left) and for-purchase brickbox cup (right). Which holds more?

to simulate. It is possible, for small problems, to simulate the system itself, which entails actual Legos, actual children and an actual box. And here’s where I stopped being in control and the problem took over.

I went to the store and purchased a (non-promotional) Lego brickbox. This one is different from the promotional version, because it is a large round cup, and now things get really interesting. Because while packing square Legos in a square box is easy, packing square Legos in a round cup is hard.

My idea was to have a set of Lego bricks, 1x1, 1x4, 2x2 and 2x4 of different colors for a group of children to toss into the promotional (square) box. We could then determine what an “average” random fill of bricks might be. I didn’t concern myself too much with optimally packing the cup; I reasoned that it was so much larger than the box (946 vs. 670 cubic centimeters) that I wouldn’t need to worry too much about optimizing. Naïvely, based solely on volume, one might estimate that the large cup holds 1,678 bricks. This is a naïve measure because it simply divides the volume of the cup by the volume of the bricks.

I turned out to be dead wrong; my brick purchase that haphazardly filled the cup only filled the box (when optimally stacked) a little more than half way! This is because it’s difficult to pack squares into a round container, even more so when you don’t try.



FIVE-MINUTE ANALYST

When analyzing putting bricks in round cups, there are two approaches one may take. The first is to consider how many squares may be packed in a circle in any arrangement, which is known as square packing [1]. The other approach is to ask how many integer lattice points may be contained in a circle of radius $r$. We usually think of building Legos on a lattice because we want to build layers on top of layers, so we choose this method. It turns out that a similar problem was studied by Gauss and is known as Gauss’ Circle Problem [2]. The key idea is to realize that the number of lattice points inside a circle of radius $r$ is the number of pairs of integers $(m, n)$ such that $m^2 + n^2 \le r^2$, which is, of course, the equation of a circle. In calculus we just take the limit as the area of the boxes tends to zero and arrive at the well-known $\pi r^2$. However, requiring $m$ and $n$ to be integers makes the problem much more complicated. Fortunately, Hilbert et al. come to the rescue, and the number of lattice points in the circle, $N(r)$, may be found by evaluating:

$$N(r) = 1 + 4\sum_{i=0}^{\infty}\left(\left\lfloor \frac{r^2}{4i+1} \right\rfloor - \left\lfloor \frac{r^2}{4i+3} \right\rfloor\right) \qquad (1)$$

where $\lfloor \cdot \rfloor$ is the Gauss bracket or floor function, which means “round down to the nearest integer.” If you restrict yourself to integer values of $r$, this formula will generate a named sequence [3].

Figure 3: Proposed bottom and top layers next to the brickbox cup. These two layers were the only ones built in the course of this analysis, and are smaller than the theoretical maximum layers that would fit in from equation (1).

Here we have used noninteger values for the radius of the circle because the cup does not have an exact radius in terms of our foundational unit (bricks). This will still tend to overestimate the number of bricks that will fit in a cup because it assumes that the lattice points have zero dimensions, and we know that our bricks have finite dimension; therefore, the calculations that follow are an upper bound.
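The lattice-point count is easy to check numerically. The following is a minimal sketch rather than the author's spreadsheet: it assumes the floor-function series for Gauss' Circle Problem described on the MathWorld page cited in note [2], and compares it with a brute-force count of integer pairs. The function names are illustrative only.

```python
import math


def lattice_points_series(r):
    """Count integer points (m, n) with m^2 + n^2 <= r^2 using the series
    N(r) = 1 + 4 * sum_i (floor(r^2 / (4i + 1)) - floor(r^2 / (4i + 3))).

    Both floors are zero once 4i + 1 exceeds r^2, so the loop stops there.
    """
    r_squared = r * r
    total = 0
    i = 0
    while 4 * i + 1 <= r_squared:
        total += math.floor(r_squared / (4 * i + 1))
        total -= math.floor(r_squared / (4 * i + 3))
        i += 1
    return 1 + 4 * total


def lattice_points_brute(r):
    """Count the same points directly, as an independent check."""
    limit = math.floor(r)
    r_squared = r * r
    return sum(
        1
        for m in range(-limit, limit + 1)
        for n in range(-limit, limit + 1)
        if m * m + n * n <= r_squared
    )


if __name__ == "__main__":
    # Radii quoted in the article for the base and top of the cup, in LC.
    for radius in (4.4, 6):
        print(radius, lattice_points_series(radius), lattice_points_brute(radius))
```

For the radii quoted in the article, 4.4 LC and 6 LC, both functions return 61 and 113 respectively, which agrees with the theoretical maxima reported for the base and top layers.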


Figure 4: Upper bound on bricks per layer, conical Lego cup, computed three different ways. The blue line is the theoretical maximum, using equation (1) and is a strict upper bound. The red line considers the size of the largest square that could be fit at each layer and should be considered a strict lower bound. The green line is the linear trend line of the brick counts of the “bottom” and “top” layers, pictured in Figure 3.

This equation may be readily implemented in Excel. Because there is a “floor” function on the summands, for our purposes the sum need only be evaluated over finitely many terms: once the denominator exceeds $r^2$, every remaining term is zero. The base of the cup has a radius of approximately 4.4 LC and has a theoretical maximum of 61 bricks. I was able to achieve a base layer of 58, but this is probably because I’m not a great builder. The top of the cup has a radius of approximately 6 LC and has a theoretical maximum of 113 bricks. I was able to achieve 98 in my build. Using these and assuming a linear trend in the cup (the sides of the cup look smooth and straight), we estimate that a theoretical maximum of 1,364 Lego bricks could fit in the round cup, with a more likely number being approximately 1,250. See Figure 4 for three different calculations of bricks-per-layer. So the round cup holds about 200 more bricks than the box if you take the time to pack it.

Real Lego enthusiasts use a greedy heuristic to fill their cups, putting large pieces in first, then filling the rest of the space with smaller pieces (“elements”), which for tractability were excluded from this analysis. The conclusion of this article is that while it didn’t really look like much, the promotional brickbox was a really nice gift.

As a final note, we observe that achieving an optimal fill of the round cup will be much more difficult than achieving an optimal fill of the square box. In fact, the amount of time that it takes a child to optimally fill a box is about the amount of time that it takes an adult to create the bottom level of the round cup.

Next time: We answer our original question of how much better off one is by packing both the round cup and the square box than by randomly tossing bricks in.

Harrison Schramm (harrison.schramm@gmail.com) is an operations research professional in the Washington, D.C., area. He is a member of INFORMS and a Certified Analytics Professional (CAP).

NOTES & REFERENCES
1. There is a community of people interested in this problem; for starters, see Erich’s Packing Center: http://www2.stetson.edu/~efriedma/squincir/
2. There is a very nice description of the problem at MathWorld: http://mathworld.wolfram.com/GausssCircleProblem.html. Additionally, Hilbert discusses the problem in “Geometry and the Imagination,” which I purchased during the course of writing this article.
3. Sloane’s A000328: 1, 5, 13, 29, 49, 81, 113, 149, 197, 253…

Join the Analytics Section of INFORMS

For more information, visit: http://www.informs.org/Community/Analytics/Membership



UPCOMING ANALYTICS CERTIFICATION EXAMS Conducted BY INFORMS, the leading professional society in analytics

Join other dedicated professionals, and become a Certified Analytics Professional (CAP®). Make plans now to take the profession's first analytics certification exam. Candidate Handbook available at www.informs.org/Build-Your-Career/Analytics-Certification.

The INFORMS Analytics Certification Program is positioned to be the defacto standard for Analytics Professionals worldwide. It will be a must-have for the analytics field in the same way PMP is for project managers. ~ Greta Roberts, CEO Talent Analytics, Corp.


BENEFITS OF CERTIFICATION
• Advances your career potential by setting you apart from the competition
• Drives personal satisfaction of accomplishing a key career milestone
• Helps improve your overall job performance by stressing continuing professional development
• Recognizes that you have invested in your analytics career by pursuing this rigorous credential
• Boosts your salary potential by being viewed as an experienced analytics professional
• Shows competence in the principles and practices of analytics

ELIGIBILITY
• BA/BS or MA/MS degree or higher
• At least five years of analytics work-related experience for BA/BS holder in an analytics related area
• At least three years of analytics work-related experience for MA/MS (or higher) holder in an analytics related area
• At least seven years of analytics work-related experience for BA/BS (or higher) holder in an area unrelated to analytics

APPLICATIONS
• Prepare to apply by reviewing Candidate Handbook now
• Arrange now to secure academic transcript and confirmation of “soft skills” from employer to send to INFORMS

COST
• $495 INFORMS Members
• $695 Non-Members
• Bundled rates with meetings and team rates available

NEXT
• CAP® Study Guide to help study and prepare
• Computer based testing: take exam at your convenience

QUESTIONS

• Email certification@informs.org

www.informs.org/Build-Your-Career/Analytics-Certification

UPCOMING CAP® EXAM SCHEDULE
• MARCH 6, 2014: Drexel University, James E. Marks Intercultural Center, Philadelphia, PA
• MARCH 29, 2014: Precedes INFORMS Conference on Business Analytics and Operations Research, March 30-April 1, 2014, Boston, MA
• MARCH 30, 2014: Precedes Gartner’s Business Intelligence & Analytics Summit, March 31 – April 2, Las Vegas, NV
• APRIL 15, 2014: Queens University School of Business, Toronto, Ontario, Canada
• MAY 22, 2014: University of Cincinnati, Lindner College of Business, Cincinnati, OH
• JUNE 21, 2014: Precedes INFORMS Conference on The Business of Big Data, June 22-24, San Jose, CA


THINKING ANALYTICALLY

Pizza delivery

BY JOHN TOCZEK

John Toczek is the senior director of Decision Support and Analytics for ARAMARK Corporation in the Global Operational Excellence group. He earned a bachelor of science degree in chemical engineering at Drexel University (1996) and a master’s degree in operations research from Virginia Commonwealth University (2005). He is a member of INFORMS.


As the owner of a pizza delivery restaurant, you are constantly looking for ways to keep costs low while maintaining quality service. Because your profit margins are thin, you’d like the number of delivery drivers to be as low as possible.

You receive orders from customers with exponentially distributed inter-arrival times averaging six minutes. A driver can pick up one order, deliver it to a customer and return to the restaurant in 20 to 60 minutes, uniformly distributed. The corporate office mandates that you must have an average order delivery time of less than 60 minutes. Some deliveries can be more than 60 minutes and some can be under 60 minutes, but on average they must be below one hour.

You may hire as many drivers as you like. But hiring too many drivers will cause your payroll to be unnecessarily high, and too few drivers will put you over the 60-minute delivery requirement. Assume that a driver can only deliver one order per round trip.

Question: How many drivers are needed in order to keep average delivery times under one hour?

Send your answer to puzzlor@gmail.com by May 15. The winner, chosen randomly from correct answers, will receive a $25 Amazon Gift Card. Past questions can be found at puzzlor.com.
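One way to explore the question is a small Monte Carlo simulation. The sketch below is not an official solution and the function name is hypothetical. It assumes orders are served first come, first served by whichever driver is free earliest, and it measures "delivery time" as the wait for a driver plus the full round trip, which is a conservative reading since the puzzle does not say when during the trip the customer receives the order.

```python
import heapq
import random


def average_delivery_time(num_drivers, num_orders=20000, seed=0):
    """Simulate the pizza shop and return the mean delivery time in minutes.

    Assumptions (not stated in the puzzle): first-come, first-served
    dispatch, and delivery time = wait for a driver + full round trip.
    """
    rng = random.Random(seed)
    # Min-heap of the times at which each driver is next available.
    free_at = [0.0] * num_drivers
    heapq.heapify(free_at)

    clock = 0.0
    total_time = 0.0
    for _ in range(num_orders):
        # Exponential inter-arrival time with a mean of six minutes.
        clock += rng.expovariate(1.0 / 6.0)
        # Round trip uniformly distributed between 20 and 60 minutes.
        trip = rng.uniform(20.0, 60.0)
        # The order leaves when it has arrived and a driver is free.
        start = max(clock, heapq.heappop(free_at))
        finish = start + trip
        heapq.heappush(free_at, finish)
        total_time += finish - clock
    return total_time / num_orders


if __name__ == "__main__":
    for drivers in range(7, 12):
        print(drivers, round(average_delivery_time(drivers), 1))
```

Because the mean round trip is 40 minutes and an order arrives every six minutes on average, fewer than seven drivers cannot keep up in the long run, so the loop starts at seven. Results vary with the seed and the run length, so several replications are advisable before settling on an answer.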



OPTIMIZATION

www.gams.com

High-Level Modeling The General Algebraic Modeling System (GAMS) is a high-level modeling system for mathematical programming problems. GAMS is tailored for complex, large-scale modeling applications, and allows you to build large maintainable models that can be adapted quickly to new situations. Models are fully portable from one computer platform to another.

State-of-the-Art Solvers GAMS incorporates all major commercial and academic state-of-the-art solution technologies for a broad range of problem types.

GAMS Integrated Developer Environment for editing, debugging, solving models, and viewing data.

PAVER 2: The next generation of the GAMS Performance Tools
PAVER 2 automates the analysis and comparison of solver performance data. The use of the Python Data Analysis Library (http://pandas.pydata.org/) ensures platform independence, simple use, high performance, and flexibility.

PAVER 2 highlights:
• Easy customization of performance metrics
• Computation and visualization of performance statistics
• Automated handling of inconsistent solver outcomes
• Integration with GAMS/EXAMINER solution point analyzer

Europe GAMS Software GmbH info@gams.de USA GAMS Development Corporation sales@gams.com

http://www.gams.com

PAVER 2 is open-source and available at: http://www.gamsworld.org/performance/paver2/

