HTTP://WWW.ANALYTICS-MAGAZINE.ORG
DRIVING BETTER BUSINESS DECISIONS
JULY/AUGUST 2014 BROUGHT TO YOU BY:
WHY ANALYTICS PROJECTS FAIL Key considerations for deep analytics on big data, learning and insights
ALSO INSIDE: • Dark side of digital world • Real-time text analytics • Data scientists’ time to shine • The future of forecasting
Executive Edge Hewlett-Packard V. P. Rohit Tandon: Six ways of value creation via E-commerce analytics
INSIDE STORY
What I learned today

One of the advantages of editing Analytics (as well as OR/MS Today, the membership magazine of INFORMS) is that I learn something new every day, thanks to the wide array of contributed articles we receive. For example, just in preparing this issue, I learned:

• Nearly 20 years ago, Amazon founder Jeff Bezos said that Amazon intended to sell books at or near cost as a way of gathering data on affluent, educated shoppers, as reported by George Packer in The New Yorker. The implication: The data, once analyzed, had more value than the loss-leader books, which proved absolutely correct when Amazon began selling everything under the sun to well-targeted consumers. Drawing on Packer’s article, as well as a couple of books (“Who Owns the Future?” and “The Ethics of Big Data”), Vijay Mehrotra explores the dark side of technology, big data and analytics – and the perceived and/or potential threat it poses – in his Analyze This! column. Don’t miss it.

• A Formula 1 pit crew, working in an optimized, well-coordinated fashion, can change a set of four tires in less than two seconds. That means that unless you’re Evelyn Wood, that crew can change 12 tires in the time it takes you to read this sentence. For the story behind the motorsports magic, check out Andy Boyd’s Forum column. Seeing is believing, so don’t miss the amazing videos referenced at the end of the article.

• We all know the digital/technical world will come to a wordy end without acronyms, but do you know what MOOC stands for? I do (“massive open online course”), thanks to an interview I did with executive search honcho Linda Burtch regarding the red-hot analytics job market.

• Finally, I also learned from Linda that in today’s dynamic world, young people should plan on three or four careers during their lifetime. “It’s not good to specialize in one thing and try to stick with one company or one industry or one vertical application for your entire career,” she says in the Q&A. “It’s incredibly dangerous, and it likely won’t carry you through a 35-year career. You need to be continuously learning something new.”

I got that last part going for me, every day.
– PETER HORNER, EDITOR
peter.horner@mail.informs.org
WWW.INFORMS.ORG
CONTENTS
FEATURES
34
REAL-TIME TEXT ANALYTICS By Aveek Mukhopadhyay and Roger Barga How a cloud-based analytical engine yields instant insight using unstructured social media data.
44
WHY DO ANALYTICS PROJECTS FAIL? By Haluk Demirkan and Bulent Dal Not just another IT project: Key considerations for deep analytics on big data, learning and insights.
54
‘IT’S THEIR TIME TO SHINE’ By Peter Horner Job prospects for data scientists and elite analytics professionals have never been better – and the future is even brighter.
62
ANALYTICS TRANSFORMS A ‘DINOSAUR’ By Brenda Dietrich, Emily Plachy and Maureen Norton The story of how industry giant IBM not only survived but thrived by realizing business value from big data.
70
THE FUTURE OF FORECASTING By Jack Yurkiewicz Making predictions from hard and fast data: Biennial survey of popular software for analytics professionals.
REGISTER FOR A FREE SUBSCRIPTION: http://analytics.informs.org
DEPARTMENTS
2 Inside Story
8 Executive Edge
14 Analyze This!
24 Healthcare Analytics
28 INFORMS Initiatives
32 Forum
82 Conference Preview
84 Five-Minute Analyst
90 Thinking Analytically
INFORMS BOARD OF DIRECTORS
President Stephen M. Robinson, University of Wisconsin-Madison
President-Elect L. Robin Keller, University of California, Irvine
Past President Anne G. Robinson, Verizon Wireless
Secretary Brian Denton, University of Michigan
Treasurer Nicholas G. Hall, Ohio State University
Vice President-Meetings William “Bill” Klimack, Chevron
Vice President-Publications Eric Johnson, Dartmouth College
Vice President-Sections and Societies Paul Messinger, CAP, University of Alberta
Vice President-Information Technology Bjarni Kristjansson, Maximal Software
Vice President-Practice Activities Jonathan Owen, CAP, General Motors
Vice President-International Activities Grace Lin, Institute for Information Industry
Vice President-Membership and Professional Recognition Ozlem Ergun, Georgia Tech
Vice President-Education Joel Sokol, Georgia Tech
Vice President-Marketing, Communications and Outreach E. Andrew “Andy” Boyd, University of Houston
Vice President-Chapters/Fora David Hunt, Oliver Wyman
INFORMS OFFICES
www.informs.org • Tel: 1-800-4INFORMS
Executive Director Melissa Moore
Meetings Director Laura Payne
Marketing Director Gary Bennett
Communications Director Barry List
Headquarters INFORMS (Maryland), 5521 Research Park Drive, Suite 200, Catonsville, MD 21228
Tel.: 443.757.3500
E-mail: informs@informs.org
ANALYTICS EDITORIAL AND ADVERTISING
Analytics (ISSN 1938-1697) is published six times a year by the Institute for Operations Research and the Management Sciences (INFORMS), the largest membership society in the world dedicated to the analytics profession. For a free subscription, register at http://analytics.informs.org. Address other correspondence to the editor, Peter Horner, peter.horner@mail.informs.org. The opinions expressed in Analytics are those of the authors, and do not necessarily reflect the opinions of INFORMS, its officers, Lionheart Publishing Inc. or the editorial staff of Analytics. Analytics copyright ©2014 by the Institute for Operations Research and the Management Sciences. All rights reserved.
Lionheart Publishing Inc., 506 Roswell Street, Suite 220, Marietta, GA 30060 USA
Tel.: 770.431.0867 • Fax: 770.432.6969
President & Advertising Sales John Llewellyn, john.llewellyn@mail.informs.org, Tel.: 770.431.0867, ext. 209
Editor Peter R. Horner, peter.horner@mail.informs.org, Tel.: 770.587.3172
Assistant Editor Donna Brooks, donna.brooks@mail.informs.org
Art Director Jim McDonald, jim.mcdonald@mail.informs.org, Tel.: 770.431.0867, ext. 223
Advertising Sales Sharon Baker, sharon.baker@mail.informs.org, Tel.: 813.852.9942
EXECUTIVE EDGE
Six ways of value creation through analytics in e-commerce
BY ROHIT TANDON AND SHRUTI UPADHYAY
Increasing popularity and access to the Internet has changed the way marketers are interacting with customers. These customers are smart, well informed and empowered, as Internet connectivity is available to them at their fingertips and on the go. It has therefore become imperative for organizations to be on the customers’ online radar with respect to new products or services and to be able to influence their choices. Not surprisingly, according to one study, 34 percent of marketers are generating leads through Twitter. India’s online retail market grew at a staggering 88 percent in 2013 to $16 billion and continues to grow. These examples are a testimony to the growth of e-commerce. The Internet deluge has opened an assortment of opportunities. Customers are able to buy high-end fashion and designer shoes, book hotels, buy movie tickets and you-name-it. Therefore, an opportunity exists for business research to capture, compile, churn and store colossal bytes of information about customers, suppliers and operations. This is what we call the age of “big data.” We believe that this age is a natural progression in online business and is here to stay. We are already seeing a surge in adoption of digital channels such as social media, e-mail marketing and display ads in e-commerce. Imagine the amount of data this
has created for marketers to lay their hands on for analysis. Despite that, in the race to utilize the online space, marketers may be focusing more on advertising and less on analysis of the data that could potentially increase sales.

In our opinion, understanding customer behavior becomes more complex in business-to-consumer companies and more so in a 24/7 e-commerce business that sells technology products in an increasingly commoditized industry. A strong analytics foundation may make e-commerce a thriving and successful channel of sales. Businesses, therefore, are increasingly creating customizable campaigns for their installed-base customers and improving sales effectiveness through e-commerce. For example, pricing and merchandising decisions need to be taken in real time, and the need to have real-time insights is ever-increasing. To make these decisions faster and better, marketers need to quickly analyze their digital marketing strategies by mining data exhaustively and cost-effectively through advanced analytics.

KEY DRIVERS OF INCREASED REVENUES

An organization’s ability to achieve its goal of increased revenues and margins depends heavily on its ability to improve three key drivers: 1) volume of customer traffic to the online store (number of visits); 2) customer conversion (percentage of visits that convert); and 3) basket size (average revenue per order). Analytics has a very important role to play in this value chain. So while organizations may have the best talent with an analytical mindset and eagerness to apply it, we need to equip data
scientists in organizations with the right tools and insights. Conversations with analytics professionals reinforce our belief in the following must-haves that will elevate an organization’s e-commerce agenda to the next level:

1. Development of best-in-class tools and techniques is a must to build scalable solutions and tackle the optimization of key drivers. Over the years various products such as SAS have provided excellent development environments, but every data
scientist had to start from scratch and depend on their “personal” techniques to tackle new problems. However, in recent years, data scientists and organizations are now moving toward using templates and building packaged models and solutions to reuse and replicate technologies with ease. One of the first such pilot solutions within HP was developed for HPDirect.com’s demand generation function, where global analytics developed V.1 of a series of demand generation models. These models also paved the way for the development of
customer targeting models. In most organizations, such initiatives, if implemented, have the potential to lay the foundation for similar opportunities with other business functions such as planning, store operations and category management. When an organization reaches such a stage of maturity, that’s when true “return on data” (ROD) is possible.

2. The three Ws: whom, what, when. Traditionally, marketers have used a uni-dimensional approach to target customers. However, results show that this approach can be sub-optimal and might have an adverse effect on customer loyalty and brand image. Answering questions such as whom to target, what to offer and when to offer it brings a paradigm shift in garnering customer interest and loyalty. The answers help rank customers on their propensity to re-purchase, and lead to preferential treatment of the right customers with the right product portfolio or allow marketers to understand when to offer discounts. Effective tools and modeling will also provide clues on the probability of customers picking one product over another and on repeat customer behaviors. This brings us back to the importance of using effective, proven analytics tools and techniques.

3. Automate and innovate. Creating and applying big data algorithms will help organizations take appropriate actions. Many of these algorithms run automatically, saving time and enabling better, faster decisions. Creating a robust tool-based ecosystem that allows creation of funnels that track visitors, bounce rates, conversions, etc., is vital to a successful Web analytics initiative.
4. Site search analytics. Tracking site search is a very useful resource that allows you to know what your visitors are looking for on your website. Is the search engine directing the customer to your website or redirecting them to the next best option in the absence of the product? Keeping tabs on this will help companies increase customer loyalty and sales. Another application of site search analytics allows you to understand what is being searched for on your website. By understanding this, marketers can influence the site layout and design so that visitors are able to easily locate answers to common queries or the most searched products.

5. Marketing spend optimization. HP’s online store uses a mix of marketing vehicles to reach different customer segments with different communication and buying preferences. Optimizing spend on various marketing vehicles is critical to optimizing demand generation efforts as well. However, determining which marketing mix is most beneficial to the business is not an easy process, requiring not only a scientific approach to analyzing spend and revenue, but also a test-learn-optimize culture. For example, ongoing analysis of the response to different types of marketing vehicles helps in identifying the best fit for a particular type of message. Based on such analysis, one can decide
whether a banner would work better than a customized landing page, or whether an e-mail campaign would be the best option.

6. Connect marketing with warehousing. In large supply chain environments, predictive analytics methodologies can produce an accurate forecast of the orders shipped out of the warehouse each day, enabling accurate warehouse space and staffing allocation in order to meet aggressive shipping timelines.

In conclusion, marketers can apply data mining and advanced analytical skills to derive key insights, to better understand the drivers of Web traffic and to produce reasonably accurate traffic forecasts for use in business planning. We sense that if companies use data accurately, they can achieve three- to five-fold growth of the online business and make analytics easily replicable across different functions of the organization.

Rohit Tandon is vice president of corporate strategy and worldwide head of Global Analytics at Hewlett-Packard. As part of HP’s corporate strategy team, he helps drive the analytics ecosystem to support HP’s vision and priorities through delivery of cutting-edge analytical capabilities across sales, marketing, supply chain, finance and HR domains. He was recently named one of the top-10 most influential analytics leaders in India for 2014 by Analytics India Magazine. Shruti Upadhyay is a manager with HP Global Analytics.
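The three revenue drivers named in this article (number of visits, conversion percentage and basket size) multiply together, so modest gains in each one compound. The Python sketch below is only an illustration of that arithmetic: the function name and every figure in it are hypothetical and are not HP data.

```python
def online_revenue(visits: int, conversion_rate: float, avg_order_value: float) -> float:
    """Revenue as the product of the three drivers: visits x conversion x basket size."""
    return visits * conversion_rate * avg_order_value


# Hypothetical baseline: 100,000 visits, 2% conversion, $150 average order.
baseline = online_revenue(100_000, 0.02, 150.0)

# Hypothetical scenario: each driver improves by 10 percent.
improved = online_revenue(110_000, 0.022, 165.0)

# Relative revenue gain from improving all three drivers together.
lift = improved / baseline - 1

print(f"Baseline revenue: ${baseline:,.0f}")
print(f"Improved revenue: ${improved:,.0f}")
print(f"Combined lift: {lift:.1%}")
```

Because the drivers multiply, a 10 percent gain in each compounds to roughly a 33 percent revenue gain (1.1 × 1.1 × 1.1 ≈ 1.33) rather than 30 percent.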
ANALYZE THIS!
Dark side of the digital world
Big data, unintended consequences: What Amazon’s domination of the book publishing industry could portend.
BY VIJAY MEHROTRA
Given my love of books, it is perhaps not surprising that Amazon.com – where, thanks to the digital technologies of today, a plethora of books can immediately be found about nearly any idea that pops into my head and be delivered (free with Amazon Prime membership!) to my doorstep with remarkable speed – is a website that I love deeply. Like many avid readers, I purport to do my best to support my local independent booksellers, but too often there is simply no denying the powerful pull of the super convenient, instantly gratifying, highly personalized Amazon.com experience. Thanks to my bi-monthly book club, I recently read “Who Owns the Future?” by Jaron Lanier, a celebrated technologist and MacArthur “genius” award winner best known for his contributions to the field of virtual reality. Lanier is known as a big thinker, and in this book – at once rambling, provocative and thoughtful – he once again shows why. “WOTF” begins with a bleak assessment of where digital technology is leading us all. The main thrust of Lanier’s argument is as follows: • Technology makes it very easy to give away for free a lot of things that people find valuable – just
think about the search engine. Being human, we are conditioned to love the chance to get something for nothing, and we have gratefully grabbed at it with both hands.
• However, the value that technology grants us is not actually free. In exchange, we tacitly give up information about ourselves, which is then stored as data.
• Thanks largely to analytics professionals, this data is then pooled and analyzed to create a variety of commercial opportunities that would not otherwise exist.
• This commercial wealth confers extraordinary power upon those who own the technologies that capture and analyze this data (Lanier calls them “Siren Servers”).
• This power in turn enables the owners of the Siren Servers to have a huge impact on the society that we live in, including employment, government, culture and ideas.
• Taken to their logical conclusions,
all of this ultimately dooms the human species to a very sad and cataclysmic ending.

Along the way, Lanier also wanders off into pleasantly intense digressions on a broad variety of somewhat related topics, including Aristotle, the tenure system, biodiversity and the concept of local optima. He too clearly loves to read.

IMPACT ON PUBLISHING

While still digesting this thought-provoking book, I came across George Packer’s recent article entitled “Is Amazon good for books?” Taking a long hard look at Amazon.com, the website that perhaps most fully embodies Lanier’s concept of a Siren Server, Packer finds that many of Lanier’s more dire predictions are already playing out there. Packer’s particular focus is Amazon’s impact on the publishing industry, and he believes that the stakes here are incredibly high: “In the book business the prospect of a single owner of both the means of production and the modes of distribution is especially worrisome; it would give Amazon more control over the exchange of ideas than any company in U.S. history. Even in the iPhone age, books remain central to American intellectual life, and perhaps to democracy.” I wholeheartedly agree.

Just as Lanier predicts, suppliers
and consumers alike had originally both rushed to embrace Amazon, for like so many technologies it seemed to magically (that is, without cost) provide all parties with something for which they hungered. As Packer writes, “When Amazon emerged, publishers in New York suddenly had a new buyer that paid quickly, sold their backlist as well as new titles, and, unlike traditional bookstores, made very few returns” – generating fresh revenues for publishers with little incremental investment. Meanwhile, we readers flocked to Amazon in droves for its convenience, its variety, and its low prices.

Amazon.com today accounts for more than 40 percent of all printed books purchased as well as 65 percent of all eBooks, so it is probably fair to say that book buyers by and large still love Amazon. For us as readers, this is fortuitous, since the number of independent bookstores in business has declined by more than 50 percent since Amazon’s founding.

However, as its share of overall book sales has ballooned, Amazon has taken advantage of its market power to aggressively push the terms of its agreements with book publishers dramatically in its own favor, often through tactics reflecting Amazon’s famously secretive and opaque corporate culture. Meanwhile, Packer reports, the many publishers large and small whose businesses are now
dependent on Amazon for much of their distribution and revenues are learning firsthand that, as Lanier sharply points out, “information supremacy for one company becomes, as a matter of course, a form of behavior modification for the rest of the world.”

Packer’s article also describes an Amazon culture that places a very low value on the human beings who are involved with the development, promotion and distribution of books, placing its faith in algorithms rather than editors and relying on volunteer (that is, free) reviewers to take the place of staff writers. All of this serves as a real illustration of Lanier’s premise that as more and more aspects of the enterprise are mediated by software, those in the business of carefully creating content (rather than digitally distributing it) will be increasingly de-valued and many forms of employment that have long-term value to our culture will subsequently perish.

ELIMINATING THE GATEKEEPERS

While Amazon’s efforts at actually serving as a publisher have so far failed, it is clear that we can expect the company to continue to pursue the holy grail of “eliminating the gatekeepers” from the world of publishing by producing its own original content. Indeed, one comes away from Packer’s article with the feeling that if Amazon’s founder and CEO Jeff Bezos could eliminate the need for authors and publishers by replacing them with automated content-generating software, he would not hesitate for an instant.
In fact, book distribution has from the outset been only a small part of Bezos’ vision. The real prize for Bezos has been the access to reams of consumer data and the ability to analyze this data for fun and profit. According to Packer, as early as 1995, Bezos had publicly stated that “Amazon intended to sell books as a way of gathering data on affluent, educated shoppers.” Indeed, today the $5.25 billion in book sales makes up only 7 percent of Amazon’s total revenues. This too is just as Lanier predicts in “WOTF,” which
may be why it was somehow not available directly from Amazon.com when I looked for it the other day (it has since been restored somehow). One book that I was able to find on Amazon.com was “Ethics of Big Data,” in which author Kord Davis asks a number of more fundamental questions about data and its place in the business world. As a longtime software/IT professional with a deep grounding in philosophy and the history of technology, Davis is equally comfortable discussing
topics as diverse as digital strategy, supply chain optimization, application development and values-based management. As such, he has a unique perspective that motivates him to take these important – and very thorny – questions seriously. As he writes in the book’s Preface, “nobody in history has ever had the opportunity to innovate, or been faced with the risks of unintended consequences, that big data now provides.”

In particular, Davis identifies four major aspects of any serious data ethics discussion:

• Identity: In the digital world, who we are is tacitly defined by the data we leave behind and indeed our own sense of self is often tightly intertwined with our online activities. Davis points out that capturing and analyzing our digital trail “provides others the ability to quite easily summarize, aggregate or correlate various aspects of our identity – without our participation or consent.”

• Privacy: Does your decision to engage in a digital interaction confer upon other entities the right to utilize data captured in the course of that specific interaction, and to link it to other sources of data that may correspond to you? As Davis asks, “Does privacy mean the same thing in both online and offline worlds?… should individuals have a legitimate ability to control data about themselves, and to what degree?”
• Ownership: Digital technology, data and analytics have given some companies the ability to turn individual users’ data into saleable assets and many others the capacity for improved decision-making and increased profitability. Intelligently utilizing data is something that we typically celebrate in our profession, but Davis again challenges this view by asking some very fundamental and thought-provoking questions: “Does our existence itself constitute a creative act, over which we have copyrights or other rights associated with creation? If it does, then how do those offline rights and privileges, sanctified by everything from the Constitution to local, state and federal laws, apply to the online presence of that same information?”

• Reputation: Davis hits the nail on the head when he points out that, thanks to the ability of data to be combined and analyzed to drive inferential and predictive judgments, “the number of people who can form an opinion about what kind of person you are is exponentially larger and farther removed…” And while these online reputations are stubbornly persistent, the accuracy of this reputational assessment is too often an afterthought.
CALL FOR ACTION

Unsatisfied with merely admiring the problem, both Lanier and Davis also call for action. Lanier proposes a technological and marketplace solution to the otherwise inevitable destiny that he believes digital technology, user data, and business analytics are rapidly leading us into, problems that are so vividly illustrated by the case of Amazon. He suggests an elaborate (though high-level) framework in which all personal data and creative works are tagged so as to enable their owner/creators to capture micropayments whenever and however their data/works are utilized. While his proposed remedy is at this stage sketchy at best, from my perspective he is to be commended for engaging us all in a conversation about a technology-enabled solution to a complex set of problems that few others are even willing to acknowledge.

Davis, like Lanier, is a technologist rather than a Luddite (as he quite rightly points out, “whereas big data is ethically neutral, the use of big data is not”). In “Ethics of Big Data,” he strongly encourages organizations that use data extensively (as well as the policy-makers who attempt to make judgments in support of social good) to have meaningful discussions about how and why we use data and what the ethical implications are of those actions. In his call for serious ethical inquiry, Davis asserts that “Organizations realize that information has value that can be extracted and turned into new products…the ethical impact is highly context-dependent. But to ignore that there is an ethical impact is to court an imbalance between the benefits of innovation and the detriment of risk.” Especially, as Lanier would be quick to add, “with technology itself enabling the risk to be pushed off onto many, while the benefits are captured by an ever smaller few.”

As Packer reports, Amazon has given very little thought to the near-term ethics or the long-term implications of the way in which it has used its customers’ data to obtain its current level of market power. But as Amazon’s current battle [1] with publisher Hachette rages on, with publishers, governments and erstwhile business partners sure to follow, it is clear that this particular story is far from over.

As analytics professionals, neither is ours. We have a significant stake in the outcomes of these conversations about ethics and the future. As such, we would be wise to actively participate in those conversations. At this particular moment, we have considerable leverage to advocate for a digital future that reflects our own values.

The world of digital business – our own personalized Siren Server – has provided us with a massive, lucrative, and free channel for our products and services. Today’s digital enterprise depends so much on our ever-expanding ability to capture, transmit, store, integrate and organize data, and our deep capacity to use this data to summarize, analyze, correlate, predict and optimize. Through no fault of our own, we have been bestowed with The Sexiest Job of the 21st Century [2], and it is indeed tempting to believe that we are an integral and indispensable part of the world in which we live and work, and that we always will be. Turns out this is exactly what the publishers thought when Amazon first appeared on the scene too.

Beware: There is no free lunch.

Vijay Mehrotra (vmehrotra@usfca.edu) is a professor in the Department of Business Analytics and Information Systems at the University of San Francisco’s School of Management. He is also a longtime member of INFORMS.
REFERENCES
1. For more on this, see http://www.nytimes.com/2014/06/21/business/booksellers-score-some-points-in-amazons-standoff-with-hachette.html and http://www.latimes.com/books/jacketcopy/la-et-jc-amazon-and-hachette-explained-20140602-story.html#page=1.
2. http://hbr.org/2012/10/data-scientist-the-sexiest-job-of-the-21st-century/ar/1
HEALTHCARE ANALYTICS

Will Apple, Google usher in new era in healthcare analytics?

The two giants have all the technology, talent and financial firepower needed to drive analytics into the consumer health space by enabling a platform play for various data generating devices and apps.

BY RAJIB GHOSH

2014 is turning out to be an interesting year for the healthcare industry. On the healthcare technology front, this year has spurred 16 acquisitions since Jan. 1. State and federal government health insurance exchanges finally started to operate at scale, offering affordable health insurance coverage to millions. Twenty-six states and Washington, D.C., expanded their Medicaid program as of May 2014, making a large number of patients eligible for the safety net. These are all good things that add to the success of the Affordable Care Act (ACA), also known as Obamacare. At the same time we are just beginning to see the impact of the new patient inflow on our health system in the form of emergency room overcrowding [1]. Opponents of the ACA argue that the expansion of coverage without expanding the primary care physician network across the nation will lead to disaster. It remains to be seen which way the pendulum will swing.

APPLE’S BIG SPLASH WITH HEALTHKIT

Meanwhile, Apple has released its HealthKit product that connects multiple devices and apps. It has shown promise to become the health data repository
for consumers. In essence this was the promise of the personal health record, or PHR, a promise that rose to the peak of inflated expectation a few years back and then fell to the trough of disillusionment quite quickly [2]. But with Apple’s foray into the space, this time it could be different. The key promise, however, is the fusion of data from multiple sources and use of analytics to generate user-facing insights. The latter, however, is not there yet.

In my last column I argued that the true empowerment of the patient consumer is waiting on the data fusion and analytics to become mainstream. Consumers do not want just a data repository like a PHR. They want actionable information that PHR does not provide. Apple’s announcement and subsequent action may expedite the health data movement in the right direction, but I am somewhat skeptical regarding data liquidity in Apple’s “walled garden” approach.

Now that Apple has taken the lead how far behind can Google be? Recently, Forbes reported that Google is planning its own version of a health platform. By the time this column goes live we will know what Google is concealing up its sleeves. These two giants have all the technology, talent and financial firepower needed to drive analytics into the consumer health space by enabling a platform play for various data generating devices and apps.

Insights for the consumer, however, will come at a price. As the insights with actionable consumer guidance increase, so too will the level of FDA scrutiny, including requirement for mandatory FDA approval. It is unclear how quickly Apple or Google
will go for that since it is an unknown territory for both companies. Having spent a decade in the medical device industry, I know firsthand the pain points of manufacturers when their products come under the FDA’s purview.

APPLE-EPIC PARTNERSHIP

Apple is also partnering with Epic Systems, the giant electronic medical record (EMR) company that controls close to 20 percent of the enterprise EMR market and covers 51 percent of the patients in the United States. This is a smart move by Apple. The ability to send user-generated data to a healthcare professional’s EMR system has always been a key requirement for providers. This “end-to-end” data channel establishes a continuum of care, which acts as the building block for analytics-driven population health management (PHM) initiatives.

Since the introduction of the iPhone, Apple products have enjoyed widespread adoption among healthcare professionals. A 2013 study by the Black Book Rankings found that among physicians who use medical apps on their smartphones, 68 percent used iPhones while 31 percent used Android devices. Also, 59 percent of physicians accessed apps from their tablet, and most of those users prefer the iPad. Among U.S. consumers, Apple has lost some ground recently to its key competitor, Google Android, but still commands a large consumer following. When a system enjoys large market share both among patients and providers and the system connects with the largest EMR company in
the country, we can expect seamless bi-directional data flow to reach critical mass. This is a prerequisite to build a cloud-based analytics solution that can leverage data hubs at both ends of the flow. This is the reason why Apple’s HealthKit introduction is a key phenomenon, although it does not do much in its early incarnation. If Google wants to become a serious player in the healthcare field beyond fitness lovers, they have to think in the same direction as well. Once that happens, imagine what sort of revolution the rivalry of these technology companies can usher in!

The health data acquisition market is still fragmented, and as a result EMR companies have not shown much interest in opening up their data repository to those players. If Apple and Google can now turn the tables and make this a true platform play using their controlling stakes in the mobile device market, then it becomes meaningful for the EMR companies to forge powerful partnerships with one or both of them. In turn that will create the unification of episodic data and continuous user-generated data – the Holy Grail! Interoperability standards will be firmed up and data security solutions will emerge. Most importantly, patients and providers will both benefit from the analytics solutions that will get a shot in the arm from a data-rich holistic picture of the patient.

So far IBM is the lone warrior creating an ecosystem around its “Watson in the cloud” analytics solution. It still lacks the health data source. So what can Apple, Google, IBM and Epic do together to shake up healthcare? I’m getting goose bumps just thinking about the possibilities.

Rajib Ghosh (rghosh@hotmail.com) is an independent consultant and business advisor with 20 years of technology experience in various industry verticals where he had senior level management roles in software engineering, program management, product management and business and strategy development. Ghosh spent a decade in the U.S. healthcare industry as part of a global ecosystem of medical device manufacturers, medical software companies and telehealth and telemedicine solution providers. He’s held senior positions at Hill-Rom, Solta Medical and Bosch Healthcare. His recent work interest includes public health and the field of IT-enabled sustainable healthcare delivery in the United States as well as emerging nations. Follow Ghosh on Twitter @ghosh_r.

REFERENCES
1. Laura Ungar, “More patients flocking to ERs under Obamacare,” http://www.courier-journal.com/story/news/2014/06/07/patients-flocking-emergency-rooms-obamacare/10181349/
2. “Hype Cycle for Healthcare Provider Applications, Analytics and Systems,” 2013, Gartner, http://www.healthcatalyst.com/health-data-analytics-hype-cycle
INFORMS INITIATIVES

CAP exam, continuing education, analytics conference cluster

Candidates for the CAP certification exam can choose from Kryterion’s global network of online secured testing locations to schedule their exam at a convenient time and place.
The Institute for Operations Research and the Management Sciences (INFORMS), the largest professional society in the world for professionals in the fields of analytics, operations research (O.R.) and management science and the publishers of Analytics magazine, announced that its Certified Analytics Professional (CAP®) exam will now be given at hundreds of computer-based testing centers worldwide through an agreement with Kryterion, the full-service provider of customizable assessment and certification products and services. Candidates for the CAP certification exam can choose from Kryterion’s global network of online secured testing locations to schedule their exam at a convenient time and place. INFORMS’ online testing center partner Kryterion, through strategic partnerships with colleges and universities, as well as testing and training companies, provides over 700 testing locations in more than 100 countries. In the United States alone, more than 400 testing centers are available. CAP exams can now be scheduled almost any day of the week and at a time and location that best suits the candidate.
Candidates can apply at www.informs.org/applyforcertification. Upon acceptance into the program, candidates receive an online voucher to present on the Kryterion site. Exam locations can be found at http://www.kryteriononline.com/host_locations/.

Introduced in the spring of 2013, the CAP program was created by subject matter experts, many of whom are INFORMS members. The CAP credential is designed for general analytics professionals in early- to mid-career, is based on a rigorous job task analysis, and is vendor- and software-neutral. Benefits of analytics certification include gaining the ability to advance one’s career by setting a professional with CAP apart from the competition and obtaining the structure to make continuing professional development an integral part of one’s job performance. The CAP program assists hiring managers in finding competent analytics talent and shows that an organization hiring CAP professionals follows best analytics practice.

NEW INFORMS CONTINUING EDUCATION COURSES

The INFORMS Continuing Education program is offering two new courses this fall: “Introduction to Monte Carlo and Discrete-Event Simulation” and “Foundations of Modern Predictive Analytics.”
The intensive, two-day, in-person courses, like the program’s popular current courses “Essential Practice Skills for Analytics Professionals” and “Data Exploration & Visualization,” provide real take-away value to implement immediately at work. Once you leave the classroom, you will be able to apply the real skills, tools and methods of analytics. The courses will give participants hands-on practice in handling real data types, real business problems and practical methods for delivering business-useful results.

In the course “Introduction to Monte Carlo and Discrete-Event Simulation,” taught by Barry Lawson, University of Richmond and Lawrence Leemis, College of William and Mary, participants will learn the basics of Monte Carlo and discrete-event simulation and how to identify real-world problem types appropriate for simulation. They’ll also develop skills and intuition for applying Monte Carlo and discrete-event simulation techniques. Topic areas covered include Monte Carlo modeling, sensitivity analysis, input modeling and output analysis. The course will be held at the INFORMS office, Catonsville (Baltimore area), Md., Sept. 12-13, and Chicago, Oct. 16-17.
The second new course, “Foundations of Modern Predictive Analytics,” will be taught by James Drew, Worcester Polytechnic Institute, Verizon (ret.). Modern predictive analytics, the science of discovering and exploiting complex data relationships, has rapidly changed in recent years, especially in today’s businesses. This course will give participants hands-on practice in handling real data types, real business problems and practical methods for delivering business-useful results. Some of the topic areas to be covered in this course are: linear regression, regression trees, logistic regression and CART (classification and regression trees). The course will be held in Washington, D.C., Sept. 15-16, and San Francisco, Nov. 7-8.

Learn more about these courses including course outlines, instructor biographies, program objectives and how to register at: www.informs.org/continuinged.

ANALYTICS CLUSTER SET FOR INFORMS ANNUAL MEETING IN S.F.

The Analytics Section of INFORMS will present the analytics cluster of sessions and presentations at the INFORMS Annual Meeting in San Francisco Nov. 9-12. The cluster encompasses
20 sessions featuring renowned analytics practitioners and leaders. Nine additional sessions will be jointly organized in collaboration with the Health Applications Society (HAS), CPMS (the Practice Section of INFORMS) and the Section on O.R. in Sports (SpORts). The sessions/presentations within the cluster cover such topics as:

• Successful application of analytics in multiple industries such as healthcare, transportation, defense and sports
• Analytics focus areas such as big data, spreadsheets and predictive analytics
• Panel discussions on understanding the connection between O.R. and analytics, building analytics programs to support organizations’ needs and business analytics in the healthcare industry
• Winners of the Innovative Applications in Analytics Award and the SAS Student Paper Competition
• Why’s, how’s and what’s of analytics certification

More information about the conference can be found at http://meetings2.informs.org/sanfrancisco2014/.
FORUM

Pit stop analytics

The idea of lifting a car, changing four tires and sending it on its way in a mere two seconds stretches the imagination.

BY E. ANDREW BOYD

Quick stop: Optimized F1 pit teams can change four tires in two seconds.
Magic shows are fun because we get to experience the impossible. Still, we know there’s trickery afoot. But what about those times when the magic isn’t magic? When we witness something that’s seemingly impossible but proves all too real? Not only real, but the result of optimization? Such is the case in the Formula 1 race car pit. If you follow F1 racing, it comes as no surprise that pit stops have been reduced to two seconds. But if you aren’t an F1 devotee, the idea of lifting a car, changing four tires and sending it on its way in a mere two seconds stretches the imagination. The role of the pit has changed dramatically over the years. For much of racing history it was assumed cars would only stop in the event of problems. Scheduled tire changes or fuel stops weren’t part of the
equation. This orthodoxy was challenged in 1982 when an analytically minded race team from the United Kingdom focused on two important facts. First, softer tires stuck to the track better during turns than their harder cousins, though they wore out more quickly. Second, less gas in the tank translated into a lighter, and therefore faster, car. Calculations showed that time spent changing tires and refilling the tank was more than offset by the improved performance of the car on the track. It’s a calculation any analytics practitioner would be proud of.

The idea quickly caught on, making pit stops – and their efficient execution – an integral part of racing. Refueling was banned in 1984 out of safety concerns, but reinstated in 1994. During that 10-year period pit crews refined their tire changing skills to the point where the fastest pit stops took a little over four seconds. When refueling was again instituted, the impetus for faster tire changes disappeared since refueling was the bottleneck. That changed in 2010 when F1 racing again reverted to a no-refueling policy, setting the stage for lightning-fast tire changes.

Achieving a two-second tire change required optimizing the entire process. Engineers took a look at everything from the design of the wheel nuts (one per wheel on F1 cars) to the special, self-positioning pneumatic guns that remove
and tighten each nut. They then turned their attention to the pit crews. Teams of three work on each wheel: one to remove the old tire, one to position the new tire and one to operate the gun. Their moves aren’t left to chance, but are choreographed down to the position of their hands and feet from start to finish. It’s not hard to imagine John and Lillian Gilbreth – progenitors of industrial engineering and pioneers of time and motion studies – standing nearby, stopwatches in hand. They’d certainly be smiling in approval. With two jack operators and scattered observers, as many as 20 people crowd around a car during a pit stop – for two seconds of work.

Optimization brings to mind models and mathematical programs. But sometimes optimization is smart without being sophisticated. And in the F1 pit, it works like magic.

Andrew Boyd, INFORMS Fellow and INFORMS VP of Marketing, Communications and Outreach, served as executive and chief scientist at an analytics firm for many years. He can be reached at e.a.boyd@earthlink.net.

NOTES & REFERENCES
1. Gray, W., “Tech Talk: Can F1 Pit Stops Get Even Quicker?” Eurosport, April 9, 2013. See also: https://uk.eurosport.yahoo.com/blogs/will-gray/gray-matter-f1-stops-even-quicker-101951154.html. Accessed May 24, 2014.
2. Examples of fast pit stops can be found at:
https://www.youtube.com/watch?v=aHSUp7msCIE
https://www.youtube.com/watch?v=Xvu0GlMa3xQ
CUSTOMER RELATIONSHIPS

Real-Time Text Analytics

Cloud-based analytical engine yields instant insight using unstructured social media data.

BY AVEEK MUKHOPADHYAY AND ROGER BARGA

Information is generated in today’s world more rapidly than ever before, and it will keep growing at an exponential rate. The rise of social media combined with increased Internet penetration has led to a significant increase in user-generated content in the form of product reviews and feedback, blogs, independent news articles, Twitter and Facebook updates. The crux of leveraging such data lies in identifying patterns from it and using the data to
generate actionable insights in real time. This article proposes a cloud-based analytical engine that analyzes comments, reviews and opinions generated by customers to understand the main underlying themes and the general sentiment so that actionable insights can be generated in real time. Algorithms such as latent Dirichlet allocation for topic modeling and the holistic lexicon-based approach for sentiment mining have been operationalized using a multi-agent framework deployed in a cloud environment. This process meets computational demands as it allows users to run virtual machines within managed data centers, freeing them from worrying about acquisition of new hardware and networks.

UNSTRUCTURED SOCIAL MEDIA DATA

According to a study by International Data Corporation (IDC), mankind created an estimated 150 exabytes of data in 2005 (one exabyte is 1 billion gigabytes), a number that jumped to 1,200 exabytes in 2010. A more recent study by IDC and EMC put the amount of data created in 2011 at 1.8 zettabytes (one zettabyte is 1 followed by 21 zeroes, in bytes), a number the study researchers expected to double every two years. Only 5 percent of this data is structured (comes in a standard format that can be read by computers). The remaining 95 percent is unstructured (photos, phone calls and free-flow texts). A large chunk of such unstructured data is in text format. Such data poses challenges owing to its sheer volume, depth and complexity, but it holds immense potential for organizations. The key lies in identifying patterns from the data and gaining relevant insights.

REAL-TIME ANALYTICS

Not long ago, analyzing data and generating business intelligence reports depended on the time-intensive ETL process (extract, transform, load). Depending upon the system and data complexity, analytics could be delayed by hours, days or even weeks while data management put it all together. In today’s business landscape, minimizing the lag between acquiring data and generating actionable insight has become the key differentiator. Acting in real time to respond to an event can result in huge profits and improved customer relationships for a firm. Real-time analytics can benefit multiple business scenarios, including:

• High-frequency trading (sophisticated algorithms to rapidly trade securities)
• Real-time detection of fraudulent transactions
• Real-time price adjustment based on competitor information
• Real-time feedback from social media for a product firm about its new launch
• Real-time recommendations by retail stores based on customer’s location
• Real-time traffic routing based on information about vehicle frequency, direction, etc.

Social media content comes from users without any vested interest, thus their opinions beget more trust. Organizations whose products and services
are mentioned in such media need to remain current on relevant discussions and be able to track the sentiment of every employee, customer and investor. To address this challenge, a cloud-based real-time ecosystem was created for analyzing comments, reviews and opinions mined from Twitter. In addition, tracking trending themes in the customer space and the evolution of these trends over time was incorporated.
TEXT MINING ALGORITHMS

Topic modeling. Topic models are statistical techniques that analyze words/phrases in textual data to understand the main themes running through them. This model algorithm is based on LDA (latent Dirichlet allocation) and uses the observed words in tweets (extracted from Twitter) to infer the hidden topic structure.

LDA is more easily understood by its generative process. This generative process defines a joint probability distribution over the observed (the words) and hidden (the topics) random variables. This joint distribution is used to compute the conditional distribution of the hidden variables given the observed variables. This conditional distribution is called the posterior distribution.

A topic is assumed to be a collection of words with different probabilities of occurrence. An individual tweet can be assumed as generated from multiple topics in different proportions. Now every word generated in a tweet can be randomly chosen in a two-step process:

• First, a topic is randomly selected from the distribution of topics.
• Second, the chosen word is randomly selected from the distribution of words over that topic.

So, the joint probability distribution of word W and topic T is:

Probability (W, T) = Probability (T) * Probability (W | T)

Now when the individual probability of occurrence of a word is known (because it has already occurred in the tweet), the posterior distribution is calculated as follows:

Probability (T | W) = Probability (W, T) / Probability (W)

Given the probabilities of observed words, latent information like the vocabulary distribution of a topic and the distribution of topics over the tweet are thus inferred.

Sentiment analysis. A holistic lexicon-based algorithm is used to analyze individual feature-level sentiments as well as cumulative sentiments over tweets.

Aggregating opinions for a feature: The algorithm parses one tweet at a time identifying the features present. A set of opinion words for each feature is identified using a lexicon. An orientation score
for each feature in the sentence is then calculated by summing up the feature-opinion scores for that sentence. (Each feature-opinion score is obtained from the sentiment polarity of the opinion word and a multiplicative inverse of the distance between the feature and opinion word. Opinion words at a distance from the feature are assumed to be less associated to the feature compared to the nearer words.)

For example, “The phone is useful and a great work of art.” Let the feature here be “phone” and the opinion words be “useful” and “great.”

Semantic orientation of useful = 1
Semantic orientation of great = 1
Distance between the words useful and phone = 2
Distance between the words great and phone = 5
score(f) = 1/2 + 1/5 = 0.7

Aggregating opinions for tweets: The sentiment score for a tweet is the summation of the scores for all opinion words present in the tweet.

For example, “The phone is useful and a great work of art.” The opinion words in the sentence are “useful” and “great.”

Semantic orientation of useful = 1
Semantic orientation of great = 1
score(t) = 1 + 1 = 2
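The two calculations described in this section, the topic posterior and the distance-weighted sentiment scores, can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the regular-expression tokenizer, the toy lexicon and the topic probabilities are all hypothetical, and a real LDA model infers its topic and word distributions from data instead of taking them as given. With the article's example sentence, the scoring functions should reproduce the values 0.7 and 2 up to floating-point rounding.

```python
import re
from typing import Dict, List


def topic_posterior(
    word: str,
    topic_prior: Dict[str, float],
    word_given_topic: Dict[str, Dict[str, float]],
) -> Dict[str, float]:
    """Return Probability(T | W) for every topic T using Bayes' rule.

    Assumes the topic and word distributions are already known
    (hypothetical inputs; LDA would have to estimate them).
    """
    # Joint: Probability(W, T) = Probability(T) * Probability(W | T)
    joint = {
        topic: prior * word_given_topic[topic].get(word, 0.0)
        for topic, prior in topic_prior.items()
    }
    # Marginal: Probability(W) is the joint summed over all topics.
    p_word = sum(joint.values())
    if p_word == 0.0:
        raise ValueError(f"word {word!r} has zero probability under every topic")
    return {topic: p / p_word for topic, p in joint.items()}


def tokenize(text: str) -> List[str]:
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z'-]+", text.lower())


def feature_score(tokens: List[str], feature: str, lexicon: Dict[str, int]) -> float:
    """Sum polarity / distance over the opinion words around a feature."""
    feature_positions = [i for i, tok in enumerate(tokens) if tok == feature]
    if not feature_positions:
        return 0.0
    score = 0.0
    for i, tok in enumerate(tokens):
        polarity = lexicon.get(tok)
        if polarity is None:
            continue
        # Distance in tokens to the nearest mention of the feature.
        distance = min(abs(i - p) for p in feature_positions)
        if distance > 0:
            score += polarity / distance
    return score


def tweet_score(tokens: List[str], lexicon: Dict[str, int]) -> float:
    """Sum the polarities of all opinion words in the tweet."""
    return float(sum(lexicon.get(tok, 0) for tok in tokens))


if __name__ == "__main__":
    # Toy lexicon: +1 for positive opinion words (hypothetical values).
    lexicon = {"useful": 1, "great": 1}
    tokens = tokenize("The phone is useful and a great work of art.")
    print("score(f):", round(feature_score(tokens, "phone", lexicon), 2))
    print("score(t):", tweet_score(tokens, lexicon))

    # Hypothetical two-topic model for the Bayes step.
    prior = {"phones": 0.5, "art": 0.5}
    likelihood = {"phones": {"battery": 0.2}, "art": {"battery": 0.05}}
    print("posterior:", topic_posterior("battery", prior, likelihood))
```

Distance here is counted in tokens between the opinion word and the nearest mention of the feature, which matches the distances of 2 and 5 in the example. A production system would also need a proper tokenizer and a full sentiment lexicon.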
Negation-rule: This identifies the negation word (which can be one or two places before the opinion word) and reverses the opinion expressed in a sentence. For example, “The phone is not good.” Here “phone” gets a negative orientation.

Context-dependent rules: For features for which no opinion words are found, context-dependent constructs are used to identify the orientation score. For example, “The phone is good but battery-life is short.” The only opinion word in the sentence is “good” (“short” is a context-dependent word). “Phone” gets a positive orientation because of “good.” “Battery-life” gets a negative orientation because the word “but” is present between “good” and “battery-life.”

Topic evolution. The next step after topic modeling is to understand how topics and trends develop, evolve and go viral over time. The algorithm maintains a fixed number of topic streams and their statistics. Each tweet is processed as it comes in and is assigned to the “closest” topic stream (the topic stream most similar to it). If no topic stream is close enough, then a new stream is created and a stale stream is killed to maintain a fixed number
of topic streams. Streams are constantly monitored for the rate of arrival of tweets. Whenever there is a burst of tweets in a particular topic stream, an alert for the trending topic is generated.

Figure 1: Real-time text mining agency.

THE REAL-TIME EDGE

A multi-agent distributed framework enables the processing of real-time data and facilitates decision-making by allowing for easy deployment of analytical tasks in the form of process flows. In this multi-agent paradigm, an agent is a software program designed to carry out one or more tasks and can communicate with other agents in the system using agent communication language. Thus, an
analytical task can be written as an agent, and the analytical process flow can be established by wiring together a set of communicating agents (an agency) that can run in sequence or in parallel. These agents were written using R to offer the analyst the benefits of a powerful and flexible statistical modeling language.

OPERATIONALIZATION IN THE CLOUD

The entire real-time platform was then deployed on a cloud ecosystem to allow for the following processes:

Efficient resource management: The cloud platform provides the necessary virtual machine, network bandwidth and other
infrastructure resources. Even when a machine goes down because of an unexpected failure, a new virtual machine is allocated for the application automatically.

Dynamic scaling and load balancing: The cloud solution allows scaling out as well as scaling back an application depending on resource requirements. Multiple services running in tandem make the whole system computationally resource intensive. As resource demands increase, new role instances can be provisioned to handle the load. When demand decreases, these instances can be removed so that payment for unnecessary computing power is not required.

Figure 2: Topic modeling treemap.
Availability & durability: The cloud storage services replicate data on three different servers, guaranteeing it can be accessed at all times, even if a server shuts down unexpectedly.

Better mobility: The application can be accessed from any place, as long as there is an Internet connection. There is no tight coupling with any physical server or machine.

RESULTS

Figure 2 shows a snapshot of the topic treemap generated in one run of the topic modeling algorithm (different topics are represented by different colors, with the areas representing occurrence frequency).
Figure 3: Trends stream graph.

Incoming tweets over a time period were captured in a stream graph visualization as shown in the Figure 3 screenshot. Each topic is represented by a stream in the visualization and is characterized by the top words in that topic. At any point in time, the top words in each topic are displayed in a topic treemap below the stream graph. It is possible to get the keyword “treemap” at any past time in history.

Successive runs of the sentiment analysis algorithm for batches of tweets are represented by the visual in Figure 4. Each bar captures the sentiment for that feature in a particular batch of tweets. The height of the bar represents the number of opinion words
for the feature in that batch. The color of each bar represents the overall sentiment level expressed in a batch of data, ranging from extremely negative (dark red) to extremely positive (dark green). The change in color of the bars across various batches can be used to identify stimuli that are driving the change.

Selection of a particular bar provides a deeper analysis of that batch. The size of a bubble indicates the number of references of a particular opinion word, and the color shows the overall sentiment score for the particular opinion word. Both the size and color are indicators of which opinion words drive the sentiment for a feature in a batch.
Figure 4: Sentiment analysis.

CLOSING THOUGHTS

Trending topics represent the popular “topics of conversation,” and when detected in real time, these hot topics are the social pulses that are usually ahead of any standard news media. Data analyzed via managed data centers can provide key insights into the evolving nature and patterns of social information and opinion and the general sentiment prevailing over such subjects.

Aveek Mukhopadhyay is an associate manager at Mu Sigma where he works with the Innovation & Development Team with a core focus on driving the adoption of advanced analytical platforms and techniques both internally and externally. He has interests in the fields of text mining, machine learning and analytics automation.

Roger Barga, Ph.D., is group program manager for the CloudML team at Microsoft Corporation where his team is building machine learning as
a service in the cloud. Barga is also a lecturer in the Data Science program at the University of Washington. He joined Microsoft in 1997 as a researcher in the Database Group of Microsoft Research (MSR), where he was involved in a number of systems research projects and product incubation efforts, before joining the Cloud and Enterprise Division of Microsoft in 2011.

NOTES & REFERENCES

1. The Economist (Feb. 25, 2010), “The Data Deluge” (http://www.economist.com/node/15579717).
2. David M. Blei, “Probabilistic Topic Models,” Communications of the ACM, April 2012, Vol. 55, No. 4 (http://www.cs.princeton.edu/~blei/papers/Blei2012.pdf).
3. Xiaowen Ding, Bing Liu and Philip S. Yu, “A Holistic Lexicon-Based Approach to Opinion Mining” (http://www.cs.uic.edu/~liub/FBS/opinionmining-final-WSDM.pdf).
THE DATA ECONOMY
Why do so many analytics projects fail? Key considerations for deep analytics on big data, learning and insights.
BY HALUK DEMIRKAN AND BULENT DAL
What is big data? Big data, which means many things to many people, is not a new technological fad. In addition to providing innovative solutions and operational insights to enduring challenges and opportunities, big data with deep analytics instigates new ways to transform processes, organizations, entire industries and even society. Pushing the boundaries of deep data analytics uncovers new insights
and opportunities, and “big” depends on where you start and how you proceed. Big data is not just “big.” The exponentially growing volume of data is only one of many characteristics that are often associated with big data, such as variety, velocity, veracity and others (the six Vs; see box).

According to Gartner Research, business intelligence and analytics will remain the top focus for CIOs through 2017 [1]. According to research [2], more than half of all analytics projects fail because they aren’t completed within budget or on schedule, or because they fail to deliver the features and benefits that are optimistically agreed on at their outset.

Today, an abundance of knowledge and experience exists for building successful data and analytics-enabled decision support systems. So why do so many of these projects fail, and why are so many executives and users still so unhappy? While there are many reasons for the high failure rate, the biggest reason is that companies still treat these projects as just another IT project. Big data analytics is neither a product nor a computer system. Instead, it should be considered a constantly evolving strategy, vision and architecture that continuously seeks to align an organization’s operations and direction with its strategic business goals and tactical and operational decisions. Table 1 includes a list of common mistakes that can doom analytics projects.

The six Vs of big data
• Volume (data at rest): terabytes to exabytes, petabytes to zettabytes; lots of data
• Velocity (data in motion): streaming data, milliseconds to seconds, how fast data is being produced and how fast the data must be processed to meet the need or demand
• Variety (data in many forms): structured, unstructured, text, multimedia, video, audio, sensor data, meter data, html, e-mails, etc.
• Veracity (data in doubt): uncertainty due to data inconsistency and incompleteness, ambiguities, latency, deception, model approximations, accuracy, quality, truthfulness or trustworthiness
• Variability (data in change): the differing ways in which the data may be interpreted; different questions require different interpretations
• Value (data for co-creation and deep learning): the relative importance of different complex data from distributed locations. Big data with deep analytics means greater insight and better decisions, something that every organization needs.
KEY CONSIDERATIONS FOR DEEP ANALYTICS

We live in an era of big data. Whether you work in financial services, consumer goods, travel, transportation, healthcare, education, supply chain, logistics or industrial products and professional services, analytics are becoming a competitive necessity for your organization. But having big data – and even people who can manipulate it successfully – is not enough. Companies need managers who can partner effectively with analysts to ensure that their work yields better strategic and tactical decisions.

Big data with deep analytics is a journey that helps organizations solve key business issues and opportunities by converting data into insights to influence business actions and drive critical business outcomes. As organizations try to take advantage of the big data opportunity, they need not be overwhelmed by the various challenges that might await them. Managers will need to start their journey by [2]:

Identifying clear business need and value. Almost everything needs to be a business rather than a technology solution. Before companies start collecting big
Going Deep & Wide on big data with deep analytics for deep learning
Table 1: Common mistakes for analytics projects.
• Failing to build the need for big data within the organization
• Islands of analytics with “Excel culture”
• Data quality and reliability related issues
• Not investigating vendor products enough, and instead blindly taking the path of least resistance
• Departmental thinking rather than looking at the big picture
• Considering this as a one-time implementation rather than a living eco-system
• Developing silo dashboards to answer a few questions rather than strategic, tactical and operational dashboards
• Not establishing company ontology and definitions for “single version of truth” culture
• Lack of vision and not having a strategy; not having a clear organizational communications plan
• Lack of upfront planning; overlooking the development of governance and program oversight
• Failure to re-organize for big data
• Not establishing a formal training program
• Ignoring the need to sell success and market the big data program
• Not having the adequate architecture for data integration
• Forgetting rapidly increasing complexities with … volume, velocity, variety, veracity, and many more
data, they should have a clear idea of what they want to do with it from a business sense. Here’s what you need to consider:

Turn over part or all of big data solution delivery to business leaders. Project management and ownership by the business (not IT) in big data solutions is the key to success. In the meantime, make sure to have clear alignment between business and IT.

Partner with business peers to identify opportunities and solutions. If we talk about big data, the impact of these projects should also be “big.” Create a cross-organization team and involve all stakeholders early in the game.

Co-creation of value with customers. The overall business objective should always be about customers. If one of the initiatives is about a big marketing outcome, then it should be about how to set up customer-centric marketing, how to provide targeted dynamic advertisement, how to engage customers and how to manage personalized shopping.

Start small – with an eye to scale quickly. While big data solutions may be quite advanced, everything else surrounding them – best practices, methodologies, org structures, etc. – is nascent. No one has all the answers, at least not yet.

Understand why traditional business intelligence and data warehousing projects can’t solve a problem.
Small, simple and scalable. When launching big data initiatives, avoid 1) getting too complicated too fast, and 2) not being prepared to scale once a solution catches on. Big data solutions can quickly grow out of control since discovering value from data prompts wanting more data.

Identify what part of the business would benefit from quick wins. Look for opportunities that will show quick wins within no more than three months. Success brings more people to the table.

This is not a one-time implementation. Understand that this is a living and evolving organism that will grow exponentially very fast. It is a culture change in the company in the way that you collect and use data, and the way you make outcome-based decisions.

Develop a minimal set of big data governance directives upfront. Big data governance is a chicken-and-egg problem – you can’t govern or secure what you haven’t explored. However, exploring vast data sets without governance and security introduces risk.

New processes to manage open source risks. Most big data solutions are being built on open source software, but open source has both legal and skill implications as firms are: 1) exposed to risk due to intellectual property issues and complex licensing agreements; 2) concerned about liability if systems built
on open source fail; and 3) required to use technology that is often early release and not enterprise-class.

New agile processes for solution delivery. Successful firms will embrace agile practices that allow end users of big data solutions to provide highly interactive inputs throughout the implementation process.

Integrate structured and unstructured data from multiple sources. Integration of data is one of the most important and also complex processes to serve efficient and effective decision-making. In terms of data, it includes machine data, sensor data, videos, audio, documents, enterprise content in call centers, e-mail messages, wikis and, indeed, larger volumes of transactional and application data.

Data sharing is key. In order for a company to build a big data ecosystem that drives business action, organizations have to share data.

Build a strong data infrastructure to host and manage data. Make sure to have secured and reliable in-house and/or hosted data (e.g., cloud) and information management infrastructure.
Think about: What information do I collect today … and what analytics should I perform that can benefit me and others?

New security and compliance procedures to protect extreme-scale data. In order to succeed with big data, new processes must be developed that recognize and protect the special nature of extreme-scale data that may be largely unexplored.

Be ready to support rapid growth. Big data solutions can grow fast and exponentially. They can start as a pilot with a few terabytes of data, then become a petabyte very quickly. Since the same data can be used in different ways and reanalyzed for new insights easily, nothing ever gets deleted.

Funding must move out of IT for big data success. Funding for these projects should come from outside of the CIO organization and move to a marketing or sales organization, for instance, so that the business has a vested stake in the game.

Create a road map that gradually builds the skills of your organization. It’s important to create a road map that allows you to gradually build the required skills within your staff, minimize risk and capitalize on previous successes to gain more support. In the organization, there will be new roles and responsibilities such as the data scientist, who possesses a
blend of skills that includes statistics, applied mathematics and computer science.

This is different from any current decision support solution. With big data, organizations should look for new capabilities, such as: using advanced analytics to uncover patterns previously hidden; using visualization and exploration to help the business find more complete answers, with new types and greater volumes of data, and to best represent the data to the user and highlight important patterns to the human eye; enabling operational decision-making with on-demand stream data by making floor employees into analytic consumers; and turning insight into action to drive a decision – either with a manual step or an automated process. And most important, be ready for rapidly increasing benefits and complexities from the six Vs.

WHAT IS NEXT IN THE DATA ECONOMY?

Organizations have access to a wealth of information, but they can’t get value out of it because it is sitting in its most raw form or in a semi-structured or unstructured format [3]. As a result, they don’t even know whether it’s worth keeping. So where is deep analytics for deep learning headed in the next few years? The exciting news is that many
organizations are already realizing the value of big data analytics today. Insight-driven, information-centric initiatives will be deployed where the ability to capitalize on the six Vs of information will create new opportunities for organizations to exploit.

By combining and integrating deep analytics, local rules, scoring, optimization techniques and machine learning with cognitive science into business processes and systems, decision management helps deliver decisions that are consistently optimized and aligned with the organization’s desired outcomes.

Social analytics will ensure businesses know how, when and where to creatively engage with individual consumers and social communities to foster trusted, one-to-one relationships and better understand and manage the way their companies are perceived. Integrating demographic and transactional data with what can be learned about attitudes and opinions allows organizations to truly understand the motivations and intents of their constituents to better serve them at the right time and place.

Deep analytics will help organizations uncover previously hidden patterns, identify classifications, associations and segmentations, and make highly accurate predictions from structured and unstructured information. Organizations will use real-time analysis of current activity
to anticipate what will happen and identify drivers of various business outcomes so they can address the issues and challenges before they occur. Many decisions will be made automatically by computers that also have deep-learning capabilities.

When you are starting a big data journey, consider this question: What should our big data with deep analytics roadmap look like to achieve our objectives?

Haluk Demirkan (haluk@uw.edu) is a professor of Service Innovation and Business Analytics, and the founder and executive director of Center for Information Based Management at the Milgard School of Business, University of Washington-Tacoma. He has a Ph.D. in information systems and operations management from the University of Florida. He is a longtime member of INFORMS.

Bulent Dal (bulent.dal@obase.com) is a co-founder and general manager of Obase Analytical Solutions (http://www.obase.com/index.php/en/obase), Istanbul, Turkey. His expertise is in scientific retail analytical solutions. He has a Ph.D. in computer sciences engineering from Istanbul University.

Acknowledgement
Part of this article is excerpted with permission of the publisher, HBR Turkey, from Demirkan, H. and Dal, B., “Big Data, Big Opportunities, Big Decisions,” Harvard Business Review Turkish Edition (published in Turkish), March 2014.

REFERENCES

1. Gartner, Inc., 2013, “Gartner Predicts Business Intelligence and Analytics Will Remain Top Focus for CIOs Through 2017,” Dec. 16, 2013, http://www.gartner.com/newsroom/id/2637615.
2. Demirkan, H. and Dal, B., “Big Data, Big Opportunities, Big Decisions,” Harvard Business Review Turkish Edition (published in Turkish), March 2014, pp. 28-30.
3. Davenport, T., 2013, “Analytics 3.0,” Harvard Business Review, December.
DATA SCIENTISTS IN DEMAND
‘It’s their time to shine’

According to executive search firm head Linda Burtch, the job prospects for data scientists and other elite analytics professionals have never been better – and the future is even brighter.

BY PETER HORNER

In April, the executive search firm Burtch Works released the results of its first-of-its-kind salary and demographics survey of data scientists, a follow-up to a survey of big data professionals conducted a year earlier. Among other findings, the 2014 survey found that data scientists are well paid, relatively young, overwhelmingly male and that almost half (43 percent) are employed on the West Coast.

Linda Burtch, managing partner of Burtch Works, has been involved in the recruitment and placement of high-end analytics talent for 30 years. She started her career with Smith-Hanley before founding her own company five years
ago. Analytics magazine editor Peter Horner interviewed Burtch in April, not long after the survey of data scientists was released. Following are excerpts from the interview.

What did you find that surprised you the most from the salary and demographics survey of data scientists?

First of all, I find it funny that everyone is interested in salaries and what data scientists and big data professionals make, but it’s such a taboo subject to actually talk about. Not to me. I talk about salaries all the time. That’s my business. What surprised me? That’s an interesting question. It actually turned out the way I thought it would – a lot of the
candidates living out on the West Coast and a higher predominance of Ph.D.s among data scientists than the general analytics population or the big data professionals, as I call them. It all pretty much made sense to me. It was interesting because it was actually quantified.
Weren’t you a little surprised by the extent of the concentration of data scientists – nearly 50 percent – on the West Coast?

That’s for the moment, for now, but watch and see what happens. Analytics has been around for a long time, yet some people still ask me, “Are you sure this isn’t a fad?” It’s not.

Analytics has become a hugely profitable specialty area within organizations as they try to optimize their operations, or target their marketing or look at return on investment issues, and that has been around for years and years.

I would argue that those issues are sort of the humdrum stuff of analytics. Data-driven decision-making is really going to explode, and that’s what we are seeing with this whole area going toward data scientists. Data storage has become so much cheaper, computing power has become much faster, nanotechnology and sensors are now becoming ubiquitous. Self-driving cars, traffic sensors, the energy grid. The list goes on and on and on.

Right now the obvious stuff is happening with understanding digital streams of data in applications related to social media. That’s pretty straightforward stuff, but wait until it hits the healthcare industry, for example. Self-driving cars are going to be a huge, huge deal. While a lot of it is being done out in California now, over the next five years we are going to see it scattered all over the United States.

Photo caption: Linda Burtch, founder and managing partner of Burtch Works.

When it comes to recruiting candidates and job placement, who are you talking to?

I recruit in analytics – people who have master’s degrees in statistics, operations research, econometrics, people who are out there working in business applications, solving problems related to marketing spend or credit worthiness
or target marketing. More recently I’ve gotten into data science. That’s a huge umbrella description.
You mentioned operations research, the heart and soul of INFORMS.

It is. When I started out in recruiting more than 30 years ago, I focused on operations research candidates. It’s grown pretty dramatically since then. They have a very fond place in my heart because that’s how I got started. It’s one of those things that I’ve really been involved with – the INFORMS group back in New York when I was living there, and I’m really excited now because the INFORMS group in Chicago is getting re-energized. It’s really exciting to watch.

When looking at the job marketplace, do you distinguish between, say, a data scientist and other analytics professionals?

Let me back up a little bit. Last summer, when I was putting together the big data salary study, I saw that data scientists were a breed apart, and that they had higher compensation levels. So I made the decision to take them out of the general big data study and hold them for later because it’s such an emerging field that’s so different. They are working with what I would call unstructured data. You could get into a lot more detail over how a data scientist is different from a big data professional, but the primary distinguishing feature, in my opinion, is that data scientists are working with data that’s unstructured. It’s something that’s going to grow as sensors become more and more prevalent and data streams become continuous in so many application areas.

How would you describe the current job market for quants, for lack of a better word?

It’s hot. A couple of months ago we did a flash survey in which we simply asked how often are you contacted about a new job opportunity through LinkedIn. We had 400 responses; 89 percent of the respondents said they were contacted at least monthly, and 25 percent said that they were contacted at least weekly. I’m working with elite data scientists, and they’re telling me that they get calls once or twice a day from recruiters, so it’s just crazy.

Our candidates are seeing a 14 percent increase in salary when they change jobs, so there’s a lot of churn out there. If they stay with their existing company, they might see an annual increase of between 2 percent and 3 percent, so the 14 percent is a nice bounce if they decide to make a change. One of my data scientists in Boston said he received 30 calls in one week after he left a job and went on the job hunt.
It’s amazing. Competing offers is another sign that the market is really hot. Sign-on bonuses are another thing that has become very commonplace in the analytics job market. Another sign that is important to note is the academic institutions have really stepped up with many of them developing master’s programs in analytics, predictive analytics and the like, so that’s something that is very new in the last two or three years.
In an interview with the New York Times, you said in reference to MBAs, and I quote, “In 15 years, if you don’t have a solid quant background, you might have a permanent pink slip.” That’s a little rough, isn’t it?

I know, I’ve become the harbinger of the permanent pink slip. Seriously, I have seen many MBAs, your general MBA, look around and say, whoa, this is a little bit scary, because they are seeing this trend toward analytical decision-making
becoming so predominant in business. Personally, I think within 10 or 15 years if MBAs don’t have a quantitative foundation, they will be prevented from promotion. We’ll see. I always said back when I was working with the operations research people that my guys are so smart, they are the ones who should be running these companies. Now I’m seeing it come true.

In an episode of the TV show “Mad Men,” the ad agency employees, circa late 1960s, were concerned that a new computer the size of a conference room would make them expendable. Your quote reminded me of that.
Right. A lot of people ask me about that. There is going to be a disruption. There already has been. Just yesterday, the Times had a visual display of analytics and quants and how it was disrupting things and what jobs were going to be eliminated, including truck drivers and airplane pilots.
Self-driving cars, robots, analytics, algorithms and all this stuff is here to stay, and it’s only going to get bigger, but it’s not going to replace the ability to read, write and think critically. While I’m a big proponent of analytics, communication will continue to be really important; human-to-human contact can’t be replaced, ever.

Just how important are communication skills to a data scientist? INFORMS, for example, now routinely holds “soft skills” workshops aimed at helping analysts explain their work to non-technical audiences in order to garner corporate buy-in.
Yes. That’s absolutely critical. The other piece that goes hand in hand with that is having the ability to understand the business at hand. Business acumen is really important. You have to have that gut check; does it make sense and how can I best monetize the situation to benefit a client or employer? It’s really important for people to understand not only what’s interesting – what a lot of quantitative people tend to gravitate toward – but also what’s important.

If a company is just starting out on the analytics journey and has no in-house expertise in this area, how can they judge a candidate’s technical abilities?
That’s an interesting problem. When I’m talking to a client, especially in this data science area that is so new, they will call me and sometimes they will have it down. They are talking the right language, they are thinking about the right things, they are asking the right questions. Other clients are floundering; they are still exploring.
I think it’s very important that they make sure they understand where their needs are before they actually bring in somebody because it’s not inexpensive to apply analytics in an organization. You really need to think very carefully what the goals are, what the road map is going to look like and so on. I can certainly help with that, and I can give the names of consultants who can help a company really understand what their plan should be before they jump in and make hires.

On the other side of that coin, what’s the best advice you can give an analytics candidate who is testing the job market?
Another flash survey we did focused on understanding what motivates people to make a job change. The number one motivation is money, but it’s quickly followed by challenging work and the opportunity to grow within an organization. Money is important to everyone, but candidates shouldn’t make decisions regarding changing jobs based on
salary alone because money isn’t going to be the factor that’s going to change their life. Rather, it’s the kind of work you will do and how engaged you will be. It’s really important to understand the challenge and the growth opportunity within whatever it is you are looking to jump into.
The third thing I think is important to analyze for any quantitative person when they’re talking to a potential new employer is to understand if analytics has a seat at the corporate table. You have to make sure that there is buy-in within the organization and the stakeholders are really actively involved and engaged in conversations about how analytics can and should be used or embedded within any organization. That’s a huge factor in understanding how happy you will be in your job and how successful you can be as a quantitative professional.

Getting back to the plight of the quant-poor MBA, how can a candidate boost analytical skills mid-career? Many colleges and universities are now offering analytics programs, often online, through their business schools, and INFORMS, for example, holds continuing education courses in the analytics area, as well as a certification program.
I get that question a lot: “I’m really interested in beefing up my analytical skills so what should I do?” As you noted, there are more opportunities than ever to do that. In addition to the formal education programs, there are plenty of good books on the topic. I get the question all the time: What books should I be looking at?
Another way that you can jump into this is through Kaggle competitions, which I recommend to people if they are interested in understanding data science and who else is out there doing this kind of work and what they are doing. There are many tools out there. Certainly what INFORMS is doing is terrific. It’s important to keep your skills fresh and make sure you continue to learn.
When it comes to giving general career advice, especially to younger candidates, my advice is this: prepare for three or four careers during your lifetime. In today’s world, it’s not good to specialize in one thing and try to stick with one company or one industry or one vertical application for your entire career. It’s incredibly dangerous, and it likely won’t carry you through a 35-year career. You need to be continuously learning something new. People should keep that in mind.

INFORMS offers an analytics certification program (CAP). Is that a differentiator in the job marketplace?
No two candidates are ever equal, but it can certainly help once there are enough employers out there who understand what it means to be CAP certified. I’m seeing people put various MOOCs (massive open online courses) on their resumes now, along with Kaggle competition results. I have a candidate who actually got his job because of a Kaggle competition. The first couple of times he submitted his solution it was totally rejected, but as he continued to study the problem and resubmitted, he climbed up the leaderboard. Then he started getting calls and job opportunities because of his Kaggle rank.

From your perspective, what does the future hold for data scientists and other analytics professionals?
In my 30 years of experience, I have never seen anything like this. The opportunities for elite analytics candidates have never been better, and I think what we’re seeing now is just the tip of the iceberg. As I said earlier, I really think that my quantitative candidates are going to be running companies one day. Certainly the CMO (chief marketing officer) is going to be coming up through the analytics ranks. Now there’s all this talk about CAOs (chief analytics officer). I think the candidates I’m working with have a very strong chance – if they have leadership ability and the ambition – to advance up the ranks and continue to climb and run organizations at some point. Their quantitative skills are going to be unique and absolutely required to be a successful businessperson. It’s their time to shine.

Peter Horner (peter.horner@mail.informs.org) is the editor of Analytics and OR/MS Today magazines.
ANALYTICS ACROSS THE ENTERPRISE
Analytics transforms a ‘dinosaur’
The story of how IBM not only survived but thrived by realizing business value from big data.
BY BRENDA DIETRICH, EMILY PLACHY AND MAUREEN NORTON
This is the story of how an iconic company founded more than a century ago, and once deemed a “dinosaur” that would not be able to survive the 1990s, has learned lesson after lesson about survival and transformation. The use of analytics to bring more science into the business decision process is a key underpinning of this survival and transformation. Now for the first time, the inside story of how analytics is being used across the IBM enterprise is being told. According to Ginni Rometty, chairman, president and chief executive officer, IBM Corporation, “Analytics is forming the silver thread through the future of everything we do.”
What is analytics? In simple terms, analytics is any mathematical or scientific method that augments data with the intent of providing new insight. With the nearly 1 trillion connected objects and devices generating an estimated 2.5 billion gigabytes of new data each day, analytics can help discover insights in the data. That insight creates competitive advantage when used to inform actions and decisions.
Data is becoming the world’s new natural resource, and learning how to use that resource is a game changer. …
Analytics is not just a technology; it is a way of doing business. Through the use of analytics, insights from data can be created to augment the gut feelings and intuition that many decisions are based on today. Analytics does not replace human judgment or diminish the creative, innovative spirit but rather informs it with new insights to be weighed in the decision process. …
Analytics for the sake of analytics will not get you far. To drive the most value, analytics should be applied to solving your most important business challenges and deployed widely. Analytics is a means, not an end. It is a way of thinking that leads to fact-based decision-making. …
This article is adapted from the book, “Analytics Across the Enterprise: How IBM Realizes Business Value from Big Data and Analytics.”
BIG DATA AND ANALYTICS DEMYSTIFIED
If analytics is any mathematical or scientific method that augments data with the intent of providing new insight, aren’t all data queries analytics? No. Analytics is often thought of as answering questions using data, but it involves more than simple data (or database) queries. Analytics involves the use of mathematical or scientific methods to generate insight from the data.
Analytics should be thought of as a progression of capabilities, starting with the well-known methods of business intelligence, and extending through more complex methods involving significant amounts of both mathematical modeling and computation.
Reporting is the most widely used analytic capability. Reporting gathers data from multiple sources, such as business automation, and creates standard summarizations of the data. Visualizations are created to bring the data to life and make it easy to interpret.
As a generic example, consider store sales data from a retail chain. The data is generated through the point of sale system by reading the product bar codes at checkout. Daily reports might include total store revenue for each store, revenue by department for each region, and national revenue for each stock-keeping unit (SKU). Weekly reports might include
the same metrics, as well as comparisons to the previous week and comparisons to the same week in the previous calendar year. Many reporting systems also allow for expanding the summarized data into its component parts. This is particularly useful in understanding changes in the sums. For example, a regional store manager might want to examine the store-level detail that resulted in an increase in revenue from the home entertainment department. She would be interested in knowing whether sales increased at most of the stores in the region, or whether the increase in total sales resulted from a significant sales jump in just a few stores. She might also look at whether the increase could be traced back to just a few SKUs, such as an unusually popular movie or video game. If a likely cause of the sales increase can be identified, she might alert the store managers to monitor inventory of the popular products, reposition the products within a store, or even reallocate inventory of the products across stores in her region.
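The drill-down described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the rows in `SALES`, the field names and the helper functions `summarize` and `week_over_week` are all hypothetical and do not come from the book or from any IBM system.

```python
from collections import defaultdict

# Hypothetical point-of-sale rows, made up for illustration:
# (week, store, department, sku, revenue)
FIELDS = ("week", "store", "department", "sku", "revenue")
SALES = [
    ("W1", "Store A", "Home Entertainment", "MOVIE-1", 1000.0),
    ("W1", "Store A", "Home Entertainment", "GAME-1", 500.0),
    ("W1", "Store B", "Home Entertainment", "MOVIE-1", 900.0),
    ("W1", "Store B", "Home Entertainment", "GAME-1", 600.0),
    ("W2", "Store A", "Home Entertainment", "MOVIE-1", 1050.0),
    ("W2", "Store A", "Home Entertainment", "GAME-1", 1500.0),
    ("W2", "Store B", "Home Entertainment", "MOVIE-1", 880.0),
    ("W2", "Store B", "Home Entertainment", "GAME-1", 620.0),
]


def summarize(rows, *keys):
    """Standard report: total revenue grouped by the named fields."""
    positions = [FIELDS.index(key) for key in keys]
    totals = defaultdict(float)
    for row in rows:
        totals[tuple(row[p] for p in positions)] += row[-1]
    return dict(totals)


def week_over_week(rows, key, previous, current):
    """Drill-down: change in revenue between two weeks for each value of `key`."""
    totals = summarize(rows, "week", key)
    names = sorted({group[1] for group in totals})
    return {
        name: totals.get((current, name), 0.0) - totals.get((previous, name), 0.0)
        for name in names
    }


if __name__ == "__main__":
    print(summarize(SALES, "week"))                      # the summarized totals
    print(week_over_week(SALES, "store", "W1", "W2"))    # which stores drove the change
    print(week_over_week(SALES, "sku", "W1", "W2"))      # which SKUs drove the change
```

In this made-up data set the weekly total rises, and expanding it by store and then by SKU shows the increase coming from a single store and a single product, which is the kind of pattern the regional manager in the example would be looking for.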
WHY ANALYTICS MATTER
Quite simply, analytics matters because it works. You can be overwhelmed with data and the value of it may be unattainable until you apply analytics to create the insights. Human brains were not built to process the amounts of data that are today being generated through social media, sensors, and more. While gut instinct is often the basis for decisions, analytically informed intuition is what wins going forward.
Several studies have highlighted the value of analytics. Companies that use predictive analytics are outperforming those that do not by a factor of five. In a 2012 joint survey by the IBM Institute of Business Value and the Said Business School at the University of Oxford of more than 1,000 professionals around the world, 63 percent of respondents reported that the use of information (including big data and analytics) is creating a competitive advantage for their organizations.
IBM depends on analytics to meet its business objectives and provide shareholder value. The bottom line is that analytics helps the bottom line. Your competition will not be waiting to take advantage of the new insights from big data. Should you?
IBM has approached the use of analytics with a spirit of innovation and a belief that analytics will illuminate insights in data that can help improve outcomes. The company hasn’t been afraid to make mistakes or redesign programs that haven’t worked as planned. Unlike traditional IT projects, most analytics projects are exploratory. For example, the Development Expense Baseline Project
explored innovative ways to determine development expense at a detailed level, thereby addressing a problem that many thought was impossible to solve. IBM analytic teams haven’t waited for perfect data to get started; rather, they have refined and improved their data along the way. … The key is to put a stake in the ground with a commitment that analytics will be woven into your strategy. That’s how IBM does it. This approach is also effective with big data. Rather than postpone the leveraging of big data, you should embrace it, establish a link between your business priorities and your information agenda, and apply analytics to become a smarter enterprise. …
PROVEN APPROACHES
Staying focused on solving business problems was the pragmatic start, and the other crucial element was having very high-level executive support from the beginning. From a governance perspective, those are two key levers to drive value: focus on actions and decisions that will generate value and have high-level executive sponsorship.
The ideal team to do analytics is a collaboration between an experienced data scientist, a person steeped in the area of the business where the challenge needs to be solved, and an IT person with expertise in the data in that particular area of the business.
A joint study by MIT Sloan and the IBM Institute for Business Value developed several recommendations. The first is that you start with your biggest and highest-value business challenge. The next recommendation is to ask a lot of questions about that challenge in order to understand what’s going on or what could be going on. Then you go out and look for what data you might have that’s relevant to that challenge. Finally, you determine which analytic technique can be used to analyze the data and solve the problem.
Because most companies have constraints on the amount of money and skills available for projects, estimating the ROI can provide a better differentiator for selecting the project with the highest potential impact than relying on instincts. Estimating an analytics project’s ROI involves both capturing the project costs and measuring the value. …
EMERGING THEMES
Relationships inferred from data today may not be present in data collected tomorrow. The relationships that you infer from data about the past do not necessarily hold in data that you collect tomorrow. You cannot analyze data once and then make decisions forever based on old analysis. It’s important to
continually analyze data to verify that previously detected relationships are still valid and to discover new ones. Fortunately, major discontinuities with data do not happen very often, so change generally happens gradually. Social media sentiment, however, has a much shorter half-life than most data. Using relationships derived from past data has been repeatedly demonstrated to work better than assuming that no relationships exist. The relationships that have been detected are likely correlation rather than causality. However, these relationships, if detected and acted upon quickly, may provide at least a temporary business advantage.
You don’t have to understand analytics technology to derive value from it. For a long time, many business leaders expressed the opinion that mathematics should be used by only those who understood the details of the computations. However, in recent years this view has been changing, and analytics is being treated like other technologies. You must learn how to use it effectively, but it is not necessary to understand the inner workings in order to apply analytics to business decisions. You have to apply analytics methods in the context of the problem that is being solved and make the results accessible to the end user. But just as the user of a car navigation system
does not need to understand the details of the routing algorithm, the end user of analytics does not have to understand the details of the math. Typically, making the results accessible to the end user involves wrapping the math in the language and the process of the end user. Also, the analytics can be embedded deep inside things so that the user does not see it, like in supply chain operations. Analytics should be usable by anyone, not just those with Ph.D.s in statistics or operations research. Some users will want to understand the algorithms and inner workings of an analytics model in order to trust the results prior to adoption, but they are the exception.
Fast, cheap processors and cheap storage make analysis on big data possible. Moore’s law has resulted in vast increases in computing power and vast decreases in the cost of storing and accessing data. With readily available and inexpensive computing, we can do what-if calculations often and test a number of variables in big data for correlation.
Doing things fast is almost always better than doing things perfectly. Often inexact but fast approaches produce enormous gains because they result in better choices than humans would have made without the use of analytics. Over time, the approximate analytics methods can be refined and improved to
achieve additional gains. However, for many business processes, there is eventually a point of diminishing returns: The calculations may become more detailed and precise, but the end results are no more accurate or valuable.
Using analytics leads to better auditability and accountability. With the use of analytics, the decision-making process becomes more structured and repeatable, and a decision becomes less dependent on the individual making the decision. When you change which people are in various positions, things still happen in the same way. You can often go back and find out what analysis was used and why a decision was made. …

Dr. Brenda L. Dietrich is an IBM Fellow and vice president. She joined IBM in 1984, and during her career she has worked with almost every IBM business unit and applied analytics to numerous IBM decision processes. She currently leads the emerging technologies team in the IBM Watson group. For more than a decade, she led the Mathematical Sciences function in the IBM Research division, where she was responsible for both basic research on computational mathematics and for the development of novel applications of mathematics for both IBM and its clients. In addition to her work within IBM, she has been the president of INFORMS, the world’s largest professional society for operations research and management sciences. An INFORMS Fellow,
she has received multiple service awards from INFORMS.
Dr. Emily C. Plachy is a distinguished engineer in Business Analytics Transformation at IBM, where she is responsible for leading an increased use of analytics across IBM. Since joining IBM in 1982, she has integrated data analysis into her work and has held a number of technical leadership roles including CTO, process, methods, and tools in IBM Global Business Services. In 1992, Emily was elected to the IBM Academy of Technology, a body of approximately 1,000 of IBM’s top technical leaders, and she served as its president from 2009 to 2011. She is a member of INFORMS.
Maureen Fitzgerald Norton, MBA, JD, is a distinguished market intelligence professional and executive program manager in Business Analytics Transformation, responsible for driving the widespread use of analytics across IBM. In her previous role, she led project teams applying analytics to IBM Smarter Planet initiatives in public safety, global social services, commerce and merchandising. Norton became the first woman in IBM to earn the designation of Distinguished Market Intelligence Professional for developing innovative approaches to solving business issues and knowledge gaps through analysis.
Note: This article is adapted from the book, “Analytics Across the Enterprise: How IBM Realizes Business Value from Big Data and Analytics,” authored by Brenda L. Dietrich, Emily C. Plachy and Maureen F. Norton, published by Pearson/IBM Press, May 2014, ISBN 978-013-383303-4, ©2014 by International Business Machines Corporation. For more information, visit: ibmpressbooks.com.
SOFTWARE SURVEY
The future of forecasting
Making predictions from hard and fast data.
BY JACK YURKIEWICZ
Here is an easy forecast to make: Forecasting will be part of our information flow for the foreseeable future. Forecasting is also a key topic in my “Decision Modeling for Management” course. In preparing the midterm exam for this past spring term, I wanted the students to analyze the enrollment figures for the Affordable Care Act and make some forecasts. The media has been talking about these enrollment figures since the rollout, and politicians have been making projections about them as well. In the course we covered various forecasting methodologies, including trend analysis. Thus, my plan for a midterm problem was to give the students the enrollment data and have them make a forecast for the May 1 enrollment deadline.
Getting those enrollment numbers became obstacle number one. Figures 1 and 2 show some typical results of an Internet search. I found graphs, some better, more worse (look at the markers on the x-axis of the graph in Figure 1), lots of opinion articles with forecasts, but no data.
Figure 1: http://www.cnn.com/interactive/2013/09/health/map-obamacare/.
Figure 2: http://www.whitehouse.gov/the-press-office/2014/04/17/fact-sheet-affordable-care-act-numbers.
I punted and decided to present the class a similar but far less-pressing problem. On March 31, the day of the midterm exam, I asked students to make forecasts for the cumulative domestic box-office gross for the recently released movie “Non-Stop.” The action film starring Liam Neeson had opened on Feb. 28, and I gave the students the daily domestic box-office gross values from opening day through March 16, or 17 days of data. The students were asked to make a time plot of these box-office figures (see Figure 3) and, after examining various trend models, get a forecast for the cumulative domestic box-office gross for a target date, midterm day, March 31.
Figure 3: Initial daily domestic box-office gross of the motion picture (“Non-Stop”).
I knew that two days later (after I had graded their exams and returned them), Universal Studios would give the actual cumulative domestic gross of the film as of March 31. It was $85.39 million. Of the various trend models we covered, the Weibull curve yielded the most accurate forecast, $86.11 million; another model was reasonably close, and the others we discussed and they tried were way off.
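To make the midterm exercise concrete, here is a minimal Python sketch of a trend-based forecast of a cumulative total. It is not the Weibull model that performed best in class; as a simpler stand-in it assumes daily gross decays exponentially, fits that curve by ordinary least squares on the logarithms, and adds the fitted future days to the observed total. The function names are hypothetical, the example series is synthetic rather than the film's actual grosses, and weekend peaks are ignored.

```python
import math


def fit_exponential_trend(values):
    """Fit y_t = a * exp(b * t), t = 0, 1, ..., by least squares on ln(y_t)."""
    n = len(values)
    logs = [math.log(v) for v in values]          # linearizing transformation
    t_mean = (n - 1) / 2
    log_mean = sum(logs) / n
    sxx = sum((t - t_mean) ** 2 for t in range(n))
    sxy = sum((t - t_mean) * (y - log_mean) for t, y in enumerate(logs))
    b = sxy / sxx
    a = math.exp(log_mean - b * t_mean)           # inverse transformation
    return a, b


def forecast_cumulative(values, last_day):
    """Observed total plus fitted daily values for days len(values)..last_day."""
    a, b = fit_exponential_trend(values)
    future = sum(a * math.exp(b * t) for t in range(len(values), last_day + 1))
    return sum(values) + future


if __name__ == "__main__":
    # Synthetic 17-day series (assumed 10% daily decline), NOT real box-office data.
    observed = [10.0 * 0.9 ** t for t in range(17)]
    # Day 0 is opening day, so day 31 is the 32nd day of release.
    print(round(forecast_cumulative(observed, 31), 2))
```

With real daily figures the choice of curve matters a great deal, which is the point of the exercise: the same 17 observations can give very different totals under different trend models.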
CATEGORIZING THE FORECAST SOFTWARE
Commercial forecasting software is available in two broad categories. Using the nomenclature from previous OR/MS Today forecasting surveys, the first category is called dedicated software. A dedicated product implies that the software only has various forecasting capabilities, such as Box-Jenkins, exponential smoothing, trend analysis, regression and other procedures. The second category is called general statistical software. This implies the product does have forecasting techniques as a subset of the many statistical procedures it can do. Thus, a product that can do ANOVA, factor analysis, etc., as well as Box-Jenkins techniques would fall into
this group. In recent years, the number of products in the second category has been growing, as statistical software firms have been adding more, and more sophisticated, forecasting methodologies to their lists of features and capabilities. However, some dedicated software manufacturers offer specific capabilities and features (e.g., transfer function, econometric models, etc.) that general statistical programs may not have.
In both software categories, forecasting software varies in the degree to which it can find the appropriate model and the optimal parameters of that model. For example, Winters’ method requires values for three smoothing constants, and Box-Jenkins models have to be specified with various parameters, such as ARIMA(1,0,1) x(0,1,2). For the purposes of this and previous surveys, each product is characterized by its ability to find the optimal model and parameters for the data.
Software is labeled as automatic if it both recommends the appropriate model to use on a particular data set and finds the optimal parameters for that model. Automatic software typically asks the user to specify some statistic to minimize (e.g., Akaike Information Criterion (AIC), Schwarz Bayesian Information Criterion (SBIC), RMSE,
etc.) and recommends a forecast model for the data, gives the model’s optimal parameters, calculates forecasts for a user-specified number of future periods, and gives various summary statistics and graphs. The user can manually overrule the recommended model and choose another, and the software finds the optimal parameters, forecasts, etc., for that one. The second category is called semiautomatic. Such software asks the user to pick a forecasting model from a menu and some statistic to minimize, and the program then finds the optimal parameters for that model, the forecasts, and various graphs and statistics. The third category is called manual software. Here the user must specify both the model that should be used and the corresponding parameters. The software then finds the forecasts, summary statistics and charts. If you frequently need to make forecasts of different types of time series, using manual software could be a tedious choice. Unfortunately, that broad advice does not apply cleanly to every product, because some products fall into two categories. For example, if you choose a Box-Jenkins model, the software may find the optimal parameters for that model, but if you specify that Winters’ method be used, the product may require that you manually enter the three smoothing constants.
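To make the "semiautomatic" idea concrete, here is a minimal Python sketch of what such a product does behind the scenes: the user picks the model (here, simple exponential smoothing) and the statistic to minimize (RMSE), and the program searches for the smoothing constant. The function names, the grid search and the sample numbers are illustrative assumptions, not taken from any of the surveyed packages, which typically use more refined optimizers.

```python
import math

def ses_rmse(series, alpha):
    """One-step-ahead RMSE of simple exponential smoothing for a given alpha."""
    level = series[0]                  # assumption: start the level at the first observation
    squared_error = 0.0
    for y in series[1:]:
        squared_error += (y - level) ** 2        # the forecast for this period is the current level
        level = alpha * y + (1 - alpha) * level  # update the level with the new observation
    return math.sqrt(squared_error / (len(series) - 1))

def best_alpha(series, steps=99):
    """Try alpha = 0.01, 0.02, ..., 0.99 and return the (alpha, rmse) pair with the lowest RMSE."""
    grid = [(i + 1) / (steps + 1) for i in range(steps)]
    return min(((a, ses_rmse(series, a)) for a in grid), key=lambda pair: pair[1])

# Made-up data, for illustration only.
demo = [12.0, 13.5, 12.8, 14.1, 15.0, 14.6, 16.2, 17.0]
alpha, rmse = best_alpha(demo)
print(f"best alpha = {alpha:.2f}, RMSE = {rmse:.3f}")
```

An "automatic" product would go one step further and repeat this kind of search across several candidate models, then recommend the one with the best value of the chosen criterion.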
When it comes to analyzing trends, most of the products I tried fall into the semiautomatic group. That is, I need to choose a trend curve, and the software finds the appropriate parameters for that model and gives forecasts, summary statistics and graphs.

WORKING WITH A SAMPLE OF PRODUCTS

In my class, students use StatTools, part of the Palisade Software Suite that comes with their textbook. Its forecasting
capabilities are regression, exponential smoothing (Brown, Holt and Winters’) and moving averages. If data followed some nonlinear function, the students could make mathematical transformations to make the data linear and then use ordinary linear regression on it, and do the inverse transformation to get the forecast. They also have several Excel templates I developed (Gompertz, Pearl-Reed, Weibull, etc.) for the course. For this article, I tried a small sample of professional
Figure 4. IBM SPSS input worksheet (showing the “Non-Stop” movie daily box-office returns).
products from different categories, specifically Minitab, IBM SPSS and NCSS on the “Non-Stop” movie data. IBM SPSS falls into the automatic forecasting category; Minitab and NCSS are semiautomatic products. A caveat: This is not meant to be a critical review of any product mentioned. I let IBM SPSS first do the analysis of the movie data via its automatic mode, called “Expert Modeler” (i.e., choose the model and its parameters and get the forecasts). Figure 4 shows superimposed screen shots of IBM SPSS’ worksheet,
showing the “Non-Stop” daily domestic box-office gross and the menu system to start the automatic forecasting procedure. The program then gave its recommended model, Brown’s method for data with linear trend, which uses one smoothing constant to estimate the intercept and slope of the fitted line (as compared to Holt’s method, which uses two independent smoothing constants) [1]. IBM SPSS’ accompanying statistics, forecast plot and additional output are shown in Figure 5.
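For readers who have not met Brown’s method, the following Python sketch shows the textbook form of the calculation: the data are smoothed twice with the same constant, and the two smoothed values together give the intercept and slope of the trend line. This is a generic illustration, not IBM SPSS’ implementation; the function name and the choice to initialize both smoothed series at the first observation are assumptions.

```python
def brown_forecast(series, alpha, horizon=1):
    """Brown's linear (double) exponential smoothing with one smoothing constant.

    Returns forecasts for 1..horizon periods past the end of the series.
    """
    s1 = s2 = series[0]                      # assumption: initialize both at the first value
    for y in series[1:]:
        s1 = alpha * y + (1 - alpha) * s1    # first smoothing of the data
        s2 = alpha * s1 + (1 - alpha) * s2   # second smoothing, applied to s1
    intercept = 2 * s1 - s2                  # estimated level at the last period
    slope = alpha / (1 - alpha) * (s1 - s2)  # estimated trend per period
    return [intercept + m * slope for m in range(1, horizon + 1)]

# Made-up series with a steady upward trend, for illustration only.
print(brown_forecast([10, 12, 13, 15, 18, 19, 21], alpha=0.4, horizon=3))
```

Holt’s method differs in that the level and the slope each get their own smoothing constant, so a program fitting it has two parameters to optimize instead of one.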
Figure 5: IBM SPSS’ results of “automatic” forecasting of the “Non-Stop” data.
Figure 6: IBM SPSS’ fitted models for three specified growth curves.
IBM SPSS does have a curve fitting feature, so I utilized it and specified three possible models to be examined – the linear, growth and logistic curves. Figures 6 and 7 give the resulting output and plots for these choices.

Figure 7: IBM SPSS’ plot of the data and growth curves.

NCSS has, in addition to the standard forecasting procedures (Box-Jenkins and exponential smoothing models), an extensive list of more than 20 nonlinear curve models under its menu label “Growth and Other Models.” The user chooses a model, and NCSS finds the appropriate parameters for the particular data set. I chose, for the “Non-Stop” data, the “Logistic(4)” model [i.e., a logistic curve with four parameters; there is a Logistic(3) model available as well], and Figure 8 shows the NCSS’ output.

Minitab is a hybrid of a semiautomatic and manual forecasting product. If you specify that a Box-Jenkins model be used, the software finds the appropriate parameters for the model. However, if you choose Winters’ method, Minitab requires
Figure 8: NCSS’ output. I chose the “Logistic(4)” from NCSS’ list of “Growth and Other Models.”
that you manually enter values for the three smoothing constants. Minitab also has, under the Time Series choice on the main menu, a Trend Analysis option. Choosing that gives the user four possible curves (linear, quadratic, exponential and Pearl-Reed logistic). Figure 9 gives the results of my choice for the “Non-Stop” data, the Pearl-Reed curve (Minitab calls it the S-Curve Trend Model).
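As a rough illustration of what fitting a logistic trend curve involves, here is a Python sketch that fits the three-parameter form y = L / (1 + a·exp(−b·t)) using only the standard library. It is not Minitab’s algorithm. It assumes a simple approach in the spirit of the classroom technique mentioned earlier: for each trial ceiling L, the data are transformed so the model becomes linear, ordinary regression gives a and b, and the ceiling with the lowest RMSE on the original scale is kept. The function name, the grid of ceilings and the sample data are all hypothetical.

```python
import math

def fit_logistic(t, y, ceilings):
    """Fit y = L / (1 + a*exp(-b*t)); returns (L, a, b, rmse) for the best trial ceiling."""
    n = len(t)
    t_bar = sum(t) / n
    sxx = sum((ti - t_bar) ** 2 for ti in t)
    best = None
    for L in ceilings:
        if L <= max(y):
            continue                               # the ceiling must lie above every observation
        z = [math.log(L / v - 1) for v in y]       # linearized: z = ln(a) - b*t
        z_bar = sum(z) / n
        slope = sum((ti - t_bar) * (zi - z_bar) for ti, zi in zip(t, z)) / sxx
        a = math.exp(z_bar - slope * t_bar)
        b = -slope
        fitted = [L / (1 + a * math.exp(-b * ti)) for ti in t]
        rmse = math.sqrt(sum((v - f) ** 2 for v, f in zip(y, fitted)) / n)
        if best is None or rmse < best[3]:
            best = (L, a, b, rmse)
    return best

# Made-up cumulative series that levels off, for illustration only.
days = list(range(1, 11))
gross = [3.1, 6.0, 10.8, 17.5, 25.2, 32.0, 37.1, 40.3, 42.2, 43.3]
print(fit_logistic(days, gross, ceilings=range(44, 61)))
```

A nonlinear optimizer such as Excel’s Solver would instead adjust all the parameters at once to minimize the RMSE directly, which generally gives a somewhat different (and often better) fit than this transform-and-regress shortcut.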
Figure 9: Minitab’s output for the Pearl-Reed logistic growth model for the Non-Stop data.
Figure 10: The four-parameter Weibull curve fit for the Non-Stop data.
Finally, Figure 10 shows the results of one of my Excel templates that uses the four-parameter Weibull trend curve and uses Solver’s nonlinear programming capability to find the optimal parameters that minimize the root mean square error for the entered data.

THE SURVEY

We e-mailed the vendors and asked them to respond to our online questionnaire so readers could see the features and capabilities of the software. The purpose of the survey is to inform the reader of a program’s forecasting
capabilities and features. We tried to identify as many forecasting vendors and products as possible and contacted all the vendors that we identified and/or that responded to the last survey in 2012. For those who did not respond, we tried gentle reminders (several e-mails and some phone calls). In addition to the features and capabilities of the software, we wanted to know what techniques or enhancements have been added to the software since our previous survey. The information comes from the vendors, and we made no attempt to verify what they gave us.
SURVEY DATA & DIRECTORY
To view the survey results as well as a directory of vendors who participated in the survey, click here.
If you use data to make forecasts, what should you look for in a vendor and the product? First, find out the capabilities of the software. Specifically, what forecasting methodologies can the product do? Does it find the optimal parameters of the procedure for your particular data set or must you manually enter those values? How extensive, useful and clear is the output? Most, but not all, vendors allow you to download a time-trial version of the software that typically expires in anywhere from a week to a month. Ideally, the trial version should allow you to work with your own data and not just “canned” data that the vendor bundles with the trial software. Verify whether the trial version limits the size of the data and, if so, whether those limits are overly restrictive. Ask about technical support, updating to a newer version when it is released and differences (if any) depending on the operating system you are using. Contact the vendor with your specific questions. Users tell me, and I have independently found, that most vendors have good and helpful technical support before and after you buy.

Jack Yurkiewicz (yurk@optonline.net) is a professor of management science in the MBA program at the Lubin School of Business, Pace University, New York. He teaches data analysis, management science and operations management. His current interests include developing and assessing the effectiveness of distance-learning courses for these topics. He is a longtime member of INFORMS.
CONFERENCE PREVIEW
S.F. conference set to capture hearts & minds
The conference will include more than 4,000 technical presentations by experts from industry, academia and government, from leading-edge advancements in operations research methodologies and analytics to applications in healthcare, energy, environmental management and supply chain management.
BY CANDACE “CANDI” YANO
Some of San Francisco’s many landmarks are mobile.
Tony Bennett sang that he “left his heart in San Francisco” – and at the 2014 INFORMS Annual Meeting in San Francisco, you will begin to understand why as you take advantage of the opportunity to fill both your heart and your mind. To fill your mind, you can attend special presentations:
• Alvin Roth, professor of economics at Stanford University and professor of economics and business administration at Harvard University who was awarded the 2012 Nobel Prize in Economics for his work in the area of Game Theory, will talk about his work.
• Richard Cottle, emeritus professor at Stanford University, will offer a commemorative and historical perspective on George Dantzig in honor of Dantzig’s 100th birthday.
• Jonathan Caulkins, professor at the Heinz School of Public Policy at Carnegie Mellon University, will discuss his work on health and drug-related policy issues.
• Anthony Levandowski of Google will talk about the Google Driverless Car project, offering his perspective as both a developer and a user of the technology.
• A panel of experts from within the INFORMS community will discuss their experience with, and offer advice on, massively open online courses (MOOCs).
If this is not enough, there will be more than 4,000 technical presentations by experts from industry, academia and government. Topics will be wide-ranging, covering the full breadth of the field, from leading-edge advancements in operations research methodologies and analytics, to applications in healthcare, energy, critical infrastructure management, environmental management and supply chain management.
If you are not already overwhelmed while filling your mind, you will have ample opportunity to fill your heart – and stomach. San Francisco is regarded as one of the most beautiful cities in the world and offers world-class cuisine from almost every ethnic heritage. The meeting will take place in two adjacent hotels, the Hilton San Francisco Union Square and the Parc 55 Wyndham. The location is in close proximity to the city’s prime shopping district and near the boarding point for cable cars to Fisherman’s Wharf – famous for
fresh seafood and Pier 39 – where you can see dozens of sea lions and walk to ferries that offer everything from simple rides across San Francisco Bay to amazingly scenic tours, as well as Ghirardelli Square, known for Ghirardelli chocolate. Venturing into other parts of San Francisco, you can visit world-class museums, including the Palace of the Legion of Honor, de Young Museum, Asian Art Museum and California Academy of Sciences. The performing arts, including the symphony, ballet, opera, jazz, theater and concerts, are all within easy reach. If you prefer the outdoors, you can take a trip to the former prison on Alcatraz (a limited number of tickets will be available to conferees for purchase), see the redwoods in Muir Woods, hike in the Marin Headlands with an unobstructed view of the Golden Gate Bridge, sign up to play a round of golf with other conferees at TPC Harding Golf Course the day before the conference, or simply wander through the haunts of the hippies in Haight-Ashbury or the Beat poets in North Beach. Just a bit farther from the city are the wine regions of Napa and Sonoma, only an hour’s drive away.
Both the meeting and the venue will have much to offer in many dimensions. We look forward to seeing you there.
Candace “Candi” Yano is general chair of the 2014 INFORMS Annual Meeting in San Francisco. She is a longtime member of INFORMS.
FIVE-MINUTE ANALYST
Probabilistic parking problems
The whole parking activity brings out the worst behaviors of humankind: hoarding, brinksmanship, scarcity mentality, irrational objective functions… and why as an O.R. professional I love parking lots: because they are so interesting to study.
Figure 1: A “smart meter” in a parking lot. This meter has a button next to the coin slot that may be pressed for a free hour of parking. Coins may be added for additional time, up to two hours.
BY HARRISON SCHRAMM, CAP
Few things make me more conflicted than parking lots. On a personal level, I loathe the whole parking activity. It brings out what I think are the worst behaviors of humankind: hoarding, brinksmanship, scarcity mentality, irrational objective functions… and now you see why as an O.R. professional I love parking lots: because they are so interesting to study. At the corner of Hades Street and Styx Ave. is (at least to me) the world’s worst parking lot. Here’s the set-up: There is an upper level with metered parking. The meter has a two-hour limit at a rate of $1.25/hour, but pressing a silver button on the meter sets the time to 60 minutes if the meter is currently at less than 60 (see Figure 1). This makes parking here free to most visitors. The lower level is
a standard parking garage, which has a flat $2 per hour fee which can be validated by the two “anchor” stores, making it essentially free for most patrons as well. While this is light and exploratory, there is serious work going on with parking problems [1]. In the sterile world of figures and mathematics, this sounds like a reasonable way to run a parking lot, and patrons who miss the upstairs free parking will simply renege and take the lower level free parking. In reality, people
“mob” the upstairs portion in search of “free parking.” My assistant and I had observed this behavior over a number of weeks, and we were interested in learning about the time parked cars spent in the lot, with an eye for simple metrics such as expected wait time for a parking spot or the expected number of cars “trolling” for a slot. This interest became action (the key for any analysis), and we chose 6:30 p.m. on a Thursday evening – a time that we knew the parking lot would be full – to collect data
Figure 2: Histogram of raw parking meter data. Note the tri-modal nature of the data. “Overtime,” i.e., flashing parking meters are represented by -1 in the red-shaded oval and constitute the large bar at the origin of the graph. Known paid parking meters are at the right and have a blue oval.
from the meters, which is displayed for anyone who wishes to see. What we found was surprising. We expected to see uncorrelated parking lot data. We did not expect to find many over-time parking spots. I hoped that the data would be exponential – which would lead to nice, clean analysis. What we discovered was, well, a mess. Of the 100 parking spots surveyed, 25 percent were “flashing” or over-time (violation). Of the parking spots that were not over-time, six showed times over one hour, implying that the persons parked there had in fact put money in the meter. We are completely discarding the possibility that someone would park in a
spot that had been previously occupied but was not vacated, i.e., showing up with 30 minutes remaining on meter and not pressing the button/inserting coins. I had hoped that the sojourn times would be exponentially distributed, but that is a case that is pretty difficult to make with this dataset (see Figure 2). Now, we don’t actually know how many patrons have paid, or how many have simply run over. However, there are 100 parking spots considered, and of these, six currently have clocks over one hour. We can (crudely) estimate [2] the true number of paid parking spots by realizing that we are observing the last hour of what may be a two-hour process. Therefore, we think approximately
Figure 3: Histogram of parking time remaining, less than 60 minutes. Approximately six of these data points are actually spill over from “paying” customers.
12 parking spots have been paid for at any given time.

YES, BUT WHAT DOES IT ALL MEAN?

So in one sense, the distributions of the data are irrelevant; there are 100 parking spots on average, and the average time that a parking spot is occupied is some time greater than 27 minutes. If we make the (not bad!) assumption that the parking spots that run over are occupied for 90 minutes, then the average occupancy is 43 minutes. In a lot with 100 spots, this means that on average,
one spot comes open every 30 seconds. This doesn’t sound so bad. If we treat the system as a queue, and use the (observed) steady state cars waiting of three, we can place a rough lower estimate [3] that a new car arrives every 30 seconds looking for a parking spot, and that they have between a 15 percent and 25 percent chance of finding an open spot. These crude estimates, however, do not agree very well with observation, because they neglect the “blocking” effect of other cars waiting for spots to open up. A better analysis of
this parking lot would involve simulation, which would go beyond our intent.
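Short of a simulation, the back-of-the-envelope numbers in this column can be checked with a few lines of Python. The sketch below uses one reading of the figures that reproduces the stated 43-minute average: the 25 percent of spots that run over are assumed to stay 90 minutes, and the remaining 75 percent are assumed to stay the 27-minute lower bound. That split, and all variable names, are assumptions made for illustration.

```python
spots = 100
overtime_share = 0.25       # fraction of meters observed flashing
overtime_stay_min = 90      # assumed stay for cars that run over
other_stay_min = 27         # lower bound on the stay for everyone else

# Weighted average time a spot stays occupied, in minutes.
avg_stay_min = overtime_share * overtime_stay_min + (1 - overtime_share) * other_stay_min

# With every spot full, one spot frees up every (average stay / number of spots).
seconds_between_openings = avg_stay_min * 60 / spots

# Six meters showed more than an hour; if that is the last hour of a
# two-hour process, roughly twice as many spots were paid for.
observed_paid = 6
estimated_paid = observed_paid * 2

print(f"average stay: {avg_stay_min:.2f} minutes")
print(f"a spot opens about every {seconds_between_openings:.1f} seconds")
print(f"estimated paid spots: {estimated_paid}")
```

Under these assumptions the average stay works out to just under 43 minutes and a spot opens a little more often than once every 30 seconds, which is consistent with the rounded figures quoted above.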
THE WORLD’S WORST PARKING LOT?

Because of the behavior of the drivers while trolling for a parking spot, it might be considered the world’s worst parking lot. Enforcement of the parking policy might help because it would decrease the sojourn times of the cars parked in the lot, but there is no guarantee, and – more importantly – no direct incentive for the parking lot owners to do so. This is because the number of “free” parking spots is fixed, and once they are filled, they are filled, regardless of by whom. From the lot manager’s point of view, it doesn’t matter if they are “long” or “short” parkers. In fact, the rate structure is such that short parkers are slightly more lucrative for the parking lot owner than parking above ground.

In conclusion, it’s probably a bit of literary hyperbole to imply that this is the world’s worst parking; I’m sure there are others that are much worse. This is because I like to make short trips to this area and visit the locations that don’t validate parking, and I really don’t like the risky behaviors aggressive parkers participate in. On the upside, there’s time to write 12 articles in a single push of the button!

I’d be interested in hearing real contenders for the “World’s Worst Parking” lot.

Update: Between the original draft of this article and its publication, the parking lot in question began installing an electronic system to help customers determine how many spots were available before entering the parking “queue.” It has yet to be determined if it will change the behaviors of the parking lot. Look forward to an update in a future column!

Harrison Schramm (harrison.schramm@gmail.com) is an operations research professional in the Washington, D.C., area. He is a member of INFORMS and a Certified Analytics Professional (CAP).

NOTES & REFERENCES
1. Fabusuyi, Hampshire, Hill and Sasanuma, 2014, “Decision Analytics for Parking Availability in Downtown Pittsburgh,” Interfaces, INFORMS, Hanover, Md.
2. This is just an estimate. More delicate techniques may be applied.
3. Using the M/M/1 queuing model to find the “lower” or optimistic estimates, and the M/G/1 queuing model to find the upper estimate.
November 9-12, 2014
Hilton San Francisco Union Square & Parc 55 Wyndham San Francisco, California
Join us in San Francisco INFORMS returns to the City by the Bay for its 2014 Annual Meeting with a rich and varied program, bridging data and decisions. Each year, the INFORMS meeting brings together experts from academia, industry and government to consider a broad range of ORMS and analytics research and applications. In 2014, we’ll offer that program excellence in one of America’s most exciting cities. Join us for INFORMS 2014!
Registration Now Open!
The Premier Conference for OR/MS Professionals offers you:
Networking – connect with colleagues, share knowledge and ideas
Top industry and academic speakers Two great receptions, Sunday and Tuesday Tutorials, exhibits and software demonstrations Extensive tracks on “hot topics” – the best in ORMS Focus on Analytics and Practice – special tracks and sessions Vibrant Interactive/Poster Sessions
meetings2.informs.org/sanfrancisco2014
THINKING ANALYTICALLY
Frog and fly
A frog is looking to catch his next meal just as a fly wanders into his pond. The frog jumps randomly from one lily pad to the next in hopes of catching the fly. The fly is unaware of the frog and is moving randomly from one red flower to another. The frog can only move on the lily pads and the fly can only move on the flowers. The interval at which both the frog and the fly move to a new space is one second. They never sit still and always move away from the space they are currently on. Both the frog and the fly have an equal chance of moving to any nearby space including diagonals. For example, if the frog were on space A1, he would have a one-in-three chance each of moving to A2, B2 and B1. The frog will capture the fly when he lands on the same space as the fly.
Figure 1: Where will the frog dine on the fly?
BY JOHN TOCZEK John Toczek is the senior director of Decision Support and Analytics for ARAMARK Corporation in the Global Operational Excellence group. He earned a bachelor of science degree in chemical engineering at Drexel University (1996) and a master’s degree in operations research from Virginia Commonwealth University (2005). He is a member of INFORMS.
QUESTION: On which space is the frog most likely to catch the fly? Send your answer to puzzlor@gmail.com by Aug. 15. The winner, chosen randomly from correct answers, will receive a $25 Amazon Gift Card. Past questions can be found at puzzlor.com.