http://www.analytics-magazine.org
DRIVING BETTER BUSINESS DECISIONS
MARCH/APRIL 2016 BROUGHT TO YOU BY:
Analytics in the cloud
It’s all about the data in the Internet Age
ALSO INSIDE: Democratization of analytics • How would Google farm? • Organizational data = strength • Consultant training strategy
Executive Edge: Frontline Systems President Dan Fylstra on cloud analytics and security concerns
INSIDE STORY
Cloud: sky’s not the limit

In the beginning, there was data. Then along came “analytics” to try to make sense of, and ascertain insight from, the data. Then in rolled “big data,” a tsunami of data of such grand proportions that most folks crowned it “Big Data,” with capital letters, in a show of respect for its massive size. Big Data, in turn, begot the “data scientist” – the golden child of the Analytics Age and the Internet of Everything. And now, behold, “cloud computing”: Internet-based, networked computing that promises to give Big Data a virtually unlimited, roomy place to call home and to give analytics a seemingly ideal environment in which to work. You might say that the combination of analytics, Big Data and cloud computing is a match made in, well, heaven.

In this month’s cover story, “Analytics in the Cloud: It’s All About the Data,” Dave Hirko, co-founder of B23, takes readers on a quick trip through the cloud and explains how to implement and make the best use of the technology. “These days, as the cloud is making storage of enterprise data easier and more affordable for companies of any size, every business is now a data business, whether they know it or not,” Hirko writes.

Dan Fylstra, president of Frontline Systems, also takes a look at cloud analytics, but from a somewhat different perspective
in his Executive Edge column, “Cloud analytics: a disruptive technology.” Writes Fylstra: “Growth in cloud-based analytics is happening both because it enables new uses and benefits, and because old objections are fading.”

One of those “old objections” is data security, a topic of widespread concern at a time when hackers are a major global threat. Does cloud computing put data at greater risk than internal data warehousing? Hirko and Fylstra address the question.

The widespread availability of data and of the analytical tools to turn it into valuable insight has given rise to the concept of “democratization” of analytics, another trend explored by Hirko and Fylstra. A trio of contributors from Accenture Digital – Srujana H.M., Sanjay S. Sharma and Amitava Dey – dives deeper into the topic in their article, “Democratization of data analytics: New frontier of data economy.” “By freeing themselves from data silos and the traditional practice of data collection, storage and access, agile businesses can not only improve their dynamic decision-making, but they can also expedite enterprise data integration and decentralization,” they write. ❙
– PETER HORNER, EDITOR
peter.horner@mail.informs.org
www.informs.org
CONTENTS
DRIVING BETTER BUSINESS DECISIONS
MARCH/APRIL 2016 Brought to you by
FEATURES
34
PREDICTIVE ANALYTICS IN THE CLOUD By Dave Hirko It’s all about the data: Exploiting the natural synergy between the cloud and analytics in the Internet Age.
42
DEMOCRATIZATION OF DATA ANALYTICS: NEW FRONTIER By Srujana H.M., Sanjay S. Sharma and Amitava Dey Making data available to the people who need it and giving them the skills and tools to derive meaningful insights from it.
50
HOW WOULD GOOGLE FARM? By Alex Thomasson, Gabe Santos and Atanu Basu Lessons from self-driving cars may help win the race to feed the world via the Internet of Agriculture and analytics.
56
HOW TO TURN ORGANIZATIONAL DATA INTO CORPORATE STRENGTH By Rupert Morrison Understanding and analyzing the organization as a system sets the platform for value-adding analysis and company success.
62
TRAINING STRATEGY IN THE ANALYTICS CONSULTING DOMAIN By Chandrakant Maheshwari An alternative training approach that benefits trainers and trainees equally during the teaching/mentoring process.
66
VEHICLE ROUTING SOFTWARE SURVEY By Randolph Hall and Janice Partyka Increasing market demands and high expectations drive transformation and innovation in a dynamic industry.
XLMINER®: Data Mining Everywhere Predictive Analytics in Excel, Your Browser, Your Own App
XLMiner® in Excel – part of Analytic Solver® Platform – is the most popular desktop tool for business analysts who want to apply data mining and predictive analytics. And soon it will be available on the Web, and in SDK (Software Development Kit) form for your own apps.
Forecasting, Data Mining, Text Mining in Excel. XLMiner does it all: Text processing, latent semantic analysis, feature selection, principal components and clustering; exponential smoothing and ARIMA for forecasting; multiple regression, k-nearest neighbors, and ensembles of regression trees and neural networks for prediction; discriminant analysis, logistic regression, naïve Bayes, k-nearest neighbors, and ensembles of classification trees and neural nets for classification; and association rules for affinity analysis.
XLMiner.com: Data Mining in Your Web Browser. Use a PC, Mac, or tablet and a browser to access all the forecasting, data mining, and text mining power of XLMiner in the cloud. Upload files or access datasets already online. Use the same Ribbon and dialogs you have in Excel, and generate the same reports, displayed in your browser or downloaded for local use.

XLMiner SDK: Predictive Analytics in Your App. Access all of XLMiner’s parallelized forecasting, data mining, and text mining power in your own application written in C++, C#, Java or Python. Use a powerful object API to create and manipulate DataFrames, and combine data wrangling, training a model, and scoring new data in a single operation “pipeline”.

Find Out More, Start Your Free Trial Now. Visit www.solver.com to learn more, register and download Analytic Solver Platform or XLMiner SDK. And visit www.xlminer.com to learn more and register for a free trial subscription – or email or call us today.
The Leader in Analytics for Spreadsheets and the Web Tel 775 831 0300 • info@solver.com • www.solver.com
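The SDK copy above describes chaining data wrangling, model training, and scoring into a single operation “pipeline.” As a generic, dependency-free sketch of that idea in Python – the class and function names below are invented for illustration and are not the XLMiner SDK’s actual object API – a pipeline is just a sequence of steps, each consuming the previous step’s output:

```python
# Dependency-free sketch of the "single operation pipeline" idea:
# data wrangling, model training, and scoring chained as one unit.
# All names here are illustrative, not the XLMiner SDK object API.

class Pipeline:
    def __init__(self, *steps):
        self.steps = steps

    def run(self, rows):
        # Each step receives the previous step's output.
        for step in self.steps:
            rows = step(rows)
        return rows

def drop_missing(rows):
    """Wrangling step: discard records with missing fields."""
    return [r for r in rows if None not in r]

def train_mean_model(rows):
    """Training step: fit a toy model (here, the mean of the target)."""
    mean = sum(y for _, y in rows) / len(rows)
    return [(x, y, mean) for x, y in rows]  # carry the fitted value forward

def score(rows):
    """Scoring step: apply the fitted model to produce predictions."""
    return [(x, pred) for x, _, pred in rows]

pipeline = Pipeline(drop_missing, train_mean_model, score)
data = [(1, 2.0), (2, None), (3, 4.0)]
print(pipeline.run(data))  # -> [(1, 3.0), (3, 3.0)]
```

A real SDK wraps the same pattern around DataFrames and fitted model objects; the value is that one call executes the whole chain reproducibly.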
REGISTER FOR A FREE SUBSCRIPTION: http://analytics.informs.org

INFORMS BOARD OF DIRECTORS
DEPARTMENTS
2 Inside Story
8 Executive Edge
14 Analyze This!
18 Healthcare Analytics
22 INFORMS Initiatives
26 Forum
30 Viewpoint
74 Conference Preview
78 Five-Minute Analyst
82 Thinking Analytically
Analytics (ISSN 1938-1697) is published six times a year by the Institute for Operations Research and the Management Sciences (INFORMS), the largest membership society in the world dedicated to the analytics profession. For a free subscription, register at http://analytics.informs.org. Address other correspondence to the editor, Peter Horner, peter.horner@mail.informs.org. The opinions expressed in Analytics are those of the authors, and do not necessarily reflect the opinions of INFORMS, its officers, Lionheart Publishing Inc. or the editorial staff of Analytics. Analytics copyright ©2016 by the Institute for Operations Research and the Management Sciences. All rights reserved.
President: Edward H. Kaplan, Yale University
President-Elect: Brian Denton, University of Michigan
Past President: L. Robin Keller, University of California, Irvine
Secretary: Pinar Keskinocak, Georgia Tech
Treasurer: Sheldon N. Jacobson, University of Illinois
Vice President-Meetings: Ronald G. Askin, Arizona State University
Vice President-Publications: Jonathan F. Bard, University of Texas at Austin
Vice President-Sections and Societies: Esma Gel, Arizona State University
Vice President-Information Technology: Marco Lübbecke, RWTH Aachen University
Vice President-Practice Activities: Jonathan Owen, CAP, General Motors
Vice President-International Activities: Grace Lin, Institute for Information Industry
Vice President-Membership and Professional Recognition: Susan E. Martonosi, Harvey Mudd College
Vice President-Education: Jill Hardin Wilson, Northwestern University
Vice President-Marketing, Communications and Outreach: Laura Albert McLay, University of Wisconsin-Madison
Vice President-Chapters/Fora: Michael Johnson, University of Massachusetts-Boston

INFORMS OFFICES
www.informs.org • Tel: 1-800-4INFORMS
Executive Director: Melissa Moore
Meetings Director: Laura Payne
Director, Public Relations & Marketing: Jeffery M. Cohen
Headquarters: INFORMS (Maryland), 5521 Research Park Drive, Suite 200, Catonsville, MD 21228
Tel.: 443.757.3500 • E-mail: informs@informs.org

ANALYTICS EDITORIAL AND ADVERTISING
Lionheart Publishing Inc., 506 Roswell Street, Suite 220, Marietta, GA 30060 USA Tel.: 770.431.0867 • Fax: 770.432.6969
President & Advertising Sales: John Llewellyn, john.llewellyn@mail.informs.org, Tel.: 770.431.0867, ext. 209
Editor: Peter R. Horner, peter.horner@mail.informs.org, Tel.: 770.587.3172
Assistant Editor: Donna Brooks, donna.brooks@mail.informs.org
Art Director: Alan Brubaker, alan.brubaker@mail.informs.org, Tel.: 770.431.0867, ext. 218
Advertising Sales: Sharon Baker, sharon.baker@mail.informs.org, Tel.: 813.852.9942
Advertising Sales: Aileen Kronke, aileen@lionhrtpub.com, Tel.: 770.431.0867, ext. 212
Supports Power BI, Tableau, Apache Spark, Excel Big Data
ANALYTIC SOLVER PLATFORM ®
From Solver to Full-Power Business Analytics in Excel
Solve Models in Desktop Excel or Excel Online.
Plus Forecasting, Data Mining, Text Mining.
From the developers of the Excel Solver, Analytic Solver Platform makes the world’s best optimization software accessible in Excel. Solve your existing models faster, scale up to large size, and solve new kinds of problems. Easily publish models from Excel to share on the Web.
Analytic Solver Platform samples data from Excel, PowerPivot, and SQL databases for forecasting, data mining and text mining, from time series methods to classification and regression trees and neural networks. And you can use visual data exploration, cluster analysis and mining on your Monte Carlo simulation results.
Fast Monte Carlo Simulation and Decision Trees. Analytic Solver Platform is also a full-power tool for Monte Carlo simulation and decision analysis, with 50 distributions, 40 statistics, Six Sigma metrics and risk measures, and a wide array of charts and graphs.

Conventional and Stochastic Optimization. Fast linear, quadratic and mixed-integer programming is just the starting point in Analytic Solver Platform. Conic, nonlinear, non-smooth and global optimization are just the next step. Easily incorporate uncertainty and solve with simulation optimization, stochastic programming, and robust optimization – all at your fingertips.

Find Out More, Download Your Free Trial Now. Analytic Solver Platform comes with Wizards, Help, User Guides, 90 examples, and unique Active Support that brings live assistance to you right inside Microsoft Excel. Visit www.solver.com to learn more, register and download a free trial – or email or call us today.
The Leader in Analytics for Spreadsheets and the Web Tel 775 831 0300 • info@solver.com • www.solver.com
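The Monte Carlo capability described above – sampling uncertain inputs, computing an output, and summarizing risk statistics – can be illustrated with a minimal simulation in plain Python. The profit model, distributions and parameters below are invented for illustration; Analytic Solver Platform’s own Excel-based interface is not shown here:

```python
import random
import statistics

# Generic Monte Carlo sketch: sample uncertain inputs, compute an
# output, and summarize risk statistics over many trials.
# The profit model and its parameters are invented for illustration.
random.seed(42)

def one_trial():
    demand = random.gauss(1000, 150)       # uncertain demand (normal)
    unit_cost = random.uniform(4.0, 6.0)   # uncertain cost (uniform)
    price = 9.0
    return demand * (price - unit_cost)    # profit for this trial

profits = [one_trial() for _ in range(10_000)]

mean = statistics.mean(profits)
p5 = sorted(profits)[int(0.05 * len(profits))]  # rough 5th percentile
print(f"mean profit ~ {mean:,.0f}; 5% worst case ~ {p5:,.0f}")
```

Commercial tools add many more distributions, correlation between inputs, and ready-made charts, but the core loop is exactly this: repeat, collect, summarize.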
EXECUTIVE EDGE
Cloud analytics: a disruptive technology Growth in cloud-based analytics is happening both because it enables new uses and benefits, and because old objections are fading.
BY DANIEL FYLSTRA
At the fall 2010 INFORMS Annual Meeting, I organized an INFORMS Roundtable session focused on cloud computing. Speakers from Microsoft, IBM, RightScale, Gurobi Optimization and Frontline described their work with cloud-based analytics. Almost five years later, in May 2015, Gartner issued a “Market Guide for Optimization Solutions,” which asserted that, “despite very few cloud deployments today, most platform vendors are moving to cloud.” Was this accurate? In early 2016, just how prevalent – and how practical – are cloud-based analytics applications?

As president of an analytics software vendor who’s in close contact with leading-edge customers as well as other analytics vendors, I’m aware of both usage statistics and specific cases, from us and others, that I cannot cite – customers often consider their cloud analytics solutions a competitive advantage, and won’t permit us to name them. But I can say with certainty that there were far more than “a very few cloud deployments” of analytics tools in early 2015, and growth over the last 12 months has accelerated. One statistic I can cite: At the time of the Gartner report, Frontline had 45,000 users of its cloud spreadsheet analytics tools for Excel Online and Google Sheets,
Your Analytics App – Everywhere
Use Solver, Risk Solver, XLMiner in Excel Online, Google Sheets Or Turn YOUR Excel Model into a Web or Mobile App in Seconds
The easiest way to build an analytic model – in Excel – is now the easiest way to deploy your analytic application to Web browsers and mobile devices – thanks to the magic of Frontline Solvers® and our RASON® server.
Use our Analytics Tools in your Web Browser. Solve linear, integer and nonlinear optimization models with Frontline’s free Solver, and run Monte Carlo simulation models with our free Risk Solver® tool, in Excel Online and Google Sheets. Use our free XLMiner® Analysis ToolPak tool for statistical analysis, matching the familiar Analysis ToolPak in desktop Excel.
Build Your Own Apps with RASON Software. RASON – RESTful Analytic Solver® Object Notation – is a new modeling language for optimization and simulation that’s embedded in JSON (JavaScript Object Notation). With support for linear, nonlinear and stochastic optimization, array and vector-matrix operations, and dimensional tables linked to external databases, the RASON language gives you all the power you need.
Your Excel Model Can Be a Web/Mobile App. The magic begins in Excel with Frontline Solvers V2016: Our Create App button converts your Excel optimization or simulation model to a RASON model, embedded in a Web page, that accesses our cloud servers via a simple REST API. You’re ready to run analytics in a browser or mobile device! Or if you prefer, run your RASON model on your desktop or server, with our Solver SDK®. Either way, you’re light-years ahead of other software tools.
Find Out More, Sign Up for a Free Trial Now. Visit www.solver.com/apps to learn more, and visit rason.com to sign up for a free trial of RASON and our REST API. Or email or call us today.
The Leader in Analytics for Spreadsheets and the Web Tel 775 831 0300 • info@solver.com • www.solver.com
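RASON’s core idea, as described above, is an optimization model expressed as a JSON document and shipped to a solver over a REST API. The sketch below illustrates that idea generically in Python: the field names and the toy brute-force “solver” are invented for illustration, and this is not actual RASON syntax or Frontline’s REST API.

```python
import json

# Hypothetical JSON model document illustrating the general idea of an
# optimization model embedded in JSON. NOT actual RASON syntax; the
# field names here are invented.
model_doc = json.dumps({
    "variables": {"x": {"lower": 0, "upper": 4, "type": "integer"},
                  "y": {"lower": 0, "upper": 4, "type": "integer"}},
    "constraints": [{"expr": "x + y", "upper": 4}],
    "objective": {"expr": "3*x + 2*y", "sense": "maximize"},
})

def solve(doc):
    """Toy solver: brute-force the small integer model above.
    (It hard-codes the two expressions rather than parsing them.)"""
    m = json.loads(doc)
    vx, vy = m["variables"]["x"], m["variables"]["y"]
    best, best_val = None, float("-inf")
    for x in range(vx["lower"], vx["upper"] + 1):
        for y in range(vy["lower"], vy["upper"] + 1):
            if x + y <= m["constraints"][0]["upper"]:
                val = 3 * x + 2 * y
                if val > best_val:
                    best, best_val = {"x": x, "y": y}, val
    return best, best_val

print(solve(model_doc))  # -> ({'x': 4, 'y': 0}, 12)
```

In a REST deployment the JSON document would be POSTed to a server endpoint and the solution returned as JSON, which is what makes the model portable between a web page, a mobile app, and a desktop client.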
and that number has nearly tripled in less than a year, as this is written.

A DISRUPTIVE TECHNOLOGY

Analytics in the cloud is following the pattern of other disruptive technologies (including personal computers), described so well in Clayton Christensen’s classic book “The Innovator’s Dilemma.” The early users of cloud-based analytics are not always, or even usually, the traditional large BI or analytics users – they are both new users of analytics per se, and existing users pursuing new applications that weren’t practical before. Some cloud-based analytics tools are less mature and lack all the features of desktop tools, but this isn’t what’s important for the new users and their applications (meanwhile, the gaps are rapidly closing). Growth in cloud-based analytics is happening both because it enables new uses and benefits, and because old objections are fading.

SECURITY AND PRIVACY: FADING CONCERNS

In the discussion that followed the speakers at the 2010 INFORMS Roundtable meeting, representatives of nearly every company cited concerns about information security and privacy. The analytics group leaders on the Roundtable, and even more so their upper
management and IT staffs, could not imagine that public clouds such as Amazon Web Services and Microsoft Azure could be a secure place to store their sensitive data and analytics results. Several reps stated that their company would “never” make use of cloud services for their proprietary applications. (I wouldn’t hold them to these statements in 2016.)

But one need only read a newspaper to realize that the alternative to cloud computing – a firm’s own data centers – is hardly a protection against loss of valuable proprietary data. In 2013, Target lost 40 million credit and debit card accounts, plus data on 70 million customers. In 2014, Home Depot had a breach of 56 million credit card accounts, JP Morgan lost sensitive financial information on 76 million households and 7 million small businesses, and eBay had a breach of 145 million customer accounts. And in 2015, Anthem lost 80 million patient and employee records, Ashley Madison had 33 million names of affair-seekers exposed, the IRS disclosed full tax returns of 500,000 people to hackers, and the Office of Personnel Management lost sensitive records – including security clearance details – on 22 million federal employees. Not one of these breaches occurred in a cloud computing system.

In 2013, the Central Intelligence Agency chose the “newcomer” Amazon
Web Services to build a secure cloud service for the intelligence community. According to Fortune magazine in mid-2015, the “intelligence community loves its new Amazon cloud.” If it’s secure enough for the CIA, is it secure enough for you?

ADVANTAGES: SCALABILITY, ACCESSIBILITY, COLLABORATION, AGILITY

An advantage often cited for cloud computing is scalability – the ability to “spin up or down” virtual machines in
minutes, when physical machines can take weeks to months to install, and usually must be written off when not needed. Though we’re unable to cite customer names, GAMS Optimization, Gurobi Optimization and Frontline Systems have seen this happening in practice – we’re aware of multiple customers scaling up to hundreds of virtual machines in the cloud 24x7, just for their own applications. Thanks to scalability, Frontline offers use of its Apache Spark Big Data cluster, running on Amazon Web Services,
to academic customers at minimal to no cost. These are new applications in the spirit of “The Innovator’s Dilemma” – they weren’t economically practical before.

Accessibility of analytics tools using only a browser has opened up opportunities for new users, again in the spirit of “The Innovator’s Dilemma” – especially in foreign countries where powerful computers and desktop or server software sold on permanent licenses are not nearly as common as in the United States and Europe. Many of Frontline’s new users of its cloud-based Solver (optimization), Risk Solver (simulation) and XLMiner (statistics/data mining) tools are from those countries, and are using the free versions of Excel Online or Google Sheets. Their applications may be modest compared to big company IT-sponsored analytics initiatives, but their numbers are large and rapidly growing.

Easy collaboration is a major advantage of tools such as Microsoft’s Power BI, Tableau Online, SAS Cloud Analytics, FICO Analytic Cloud and Frontline’s XLMiner.com and Rason.com services. Companies of all sizes are now operating with geographically distributed teams, and members of those teams collaborate every day via online meetings and live chat. They need, and now they can get, immediate access to results from
analytics applications – not just slide presentations from projects where key insights were discovered months ago. The significant impact on agility in decision-making is increasingly important in a competitive world with few geographic barriers.

Analytics in the cloud isn’t just a future trend – it’s happening now. Based on our data, I’m confident that while you’ve been reading this article, someone new, somewhere in the world, whom you never thought of as a “data scientist” or “management scientist,” has started using advanced analytics tools in the cloud. They’re learning how to make a difference with analytics. Are you ready? ❙

Daniel Fylstra is founder and president of Frontline Systems Inc. (www.solver.com), a software vendor with a full range of “advanced analytics” tools, from data mining, text mining and predictive analytics to Monte Carlo simulation and risk analysis, decision analysis and large-scale optimization. Early in his career, Fylstra co-founded Personal Software, the company that brought to market the first spreadsheet, VisiCalc, for the Apple II. He has a B.S. degree in electrical engineering and computer science from MIT and an MBA from Harvard Business School. He is a longtime member of INFORMS.
Subscribe to Analytics It’s fast, it’s easy and it’s FREE! Just visit: http://analytics.informs.org/
THE NATION’S FIRST Associate in Applied Science (A.A.S.) degree in Business Analytics – on campus or online.
Credential options • Enroll in one or several:
• AAS degree
• Certificates: Business Intelligence, Business Analyst, Finance Analytics, Marketing Analytics, and Logistics Analytics

Why Study Business Analytics?

The Business Analytics curriculum is designed to provide students with the knowledge and the skills necessary for employment and growth in analytical professions. Business Analysts process and analyze essential information about business operations and also assimilate data for forecasting purposes. Students will complete course work in business analytics, including general theory, best practices, data mining, data warehousing, predictive modeling, project operations management, statistical analysis, and software packages. Related skills include business communication, critical thinking and decision making. The curriculum is hands-on, with an emphasis on application of theoretical and practical concepts. Students will engage with the latest tools and technology utilized in today’s analytics fields.

Accelerated Executive Program

Our accelerated learning options allow students to complete certificate credentials in two semesters part time or one semester full time. Accelerated options are available for the Business Intelligence and the Business Analyst certificates.

Questions? Tanya Scott, Director, Business Analytics
919-866-7106 • tescott1@waketech.edu
Flexibility
• Open-door enrollment
• Courses are offered in the fall and spring
• Courses can be taken online or on campus
• Competitively priced tuition
Gain skills in:
• Data gathering
• Collating
• Cleaning
• Statistical Modeling
• Visualization
• Analysis
• Reporting
• Decision making
• Presentation

Use data and analysis tools:
• Advanced Excel
• Tableau
• Analytics Programming
• SAS Enterprise Guide
• SAS Enterprise Miner
• SPSS Modeler
• MicroStrategy
Funded in full by a $2.9 million Dept. of Labor Trade Adjustment Assistance Community College & Career Training (DOL TAACCCT) grant.
businessanalytics.waketech.edu
ANALYZE THIS!
Quiet storm brewing outside ivory tower The administration encouraged us to “package” our courses into an MBA specialization that could be presented to the outside world as a coherent program of study. This is when I suddenly became nervous.
BY VIJAY MEHROTRA
All of our MBA students at the University of San Francisco are required to take a course entitled “Spreadsheet Modeling and Business Analytics” [1] that has become very popular. A few years ago, students began to clamor for more analytics courses to follow up on this one. In response, my colleagues and I began to create an ad hoc collection of electives on topics that were interesting to us (business statistics, data mining, data visualization, data strategy). Given that the vast majority of our MBA students do not have technical backgrounds, we have avoided requiring the students to write code, opting instead for user-friendly tools such as Excel (the workhorse tool for MBAs for more than a generation), JMP and Tableau.

Over time, these electives have grown in popularity. Recently, the administration took notice and encouraged us to “package” our courses into an MBA specialization that could be presented to the outside world as a coherent program of study. This is when I suddenly became nervous.

Implicitly or explicitly, we were about to start making a promise to students to help them prepare for specific types of careers – but since I did not know what the profile for those careers looked like, I also lacked an honest and rigorous way to examine how well our electives were meeting this goal.
A N A LY T I C S - M A G A Z I N E . O R G
W W W. I N F O R M S . O R G
To deal with my newfound anxiety, I reached out to many of the students who had taken our electives over the past few years. To my surprise, I found that their careers had ventured into a wide variety of industries and a diverse set of functional roles, ranging from financial analyst to technology consultant. All of them agreed that their experience in studying analytics during business school had been professionally valuable, but there seemed to be myriad explanations as to how and why.

Brian Liou helped me understand my data more clearly. Liou is CEO and co-founder of Leada (www.teamleada.com), a Y Combinator-backed Bay Area startup focused on developing data training software and services. Liou spends a lot of his time talking to business executives – and business faculty members – around the country to understand the kinds of data skills that today’s business professionals need. I had met Liou through a former student of mine, and recently decided to use Leada’s excellent online tutorials on SQL, R and Python as part of a popular new elective in our business school called “Coding for Analytics.”

During a recent conversation, Liou proceeded to explain why so many of my MBA students were so eager to take this class. To begin with, he pointed out that data analytics groups have (under various
names and charters) been around for quite a long time in most industries. Over time, functional groups such as marketing, finance and operations have come to lean on these groups for their data needs, ranging from routine reports to ad hoc “deep dives” to answer specific questions. However, as executives have become more aware of how valuable predictive and prescriptive analytics can be, there has been more focus by (and much more pressure on) such groups to not only provide operational support to different parts of the enterprise but also to deliver their own data-driven business insights and innovations. I nodded at this part of his explanation, as it was consistent with the findings of a recent study that Jeanne Harris and I had conducted [2].

As our conversation continued, I slowly began to realize that just outside my ivory tower office, drowned out by the relentless buzz associated with “big data” and “data scientists,” there is an equally powerful storm taking place, albeit one that is much quieter. Advanced analytics are indeed being recognized as more valuable than ever. And demand for high-end analytic talent far exceeds the supply, which means that highly skilled analytic specialists remain rare and are increasingly expensive. Meanwhile, the pressure from management on all departments to quickly make better, more data-driven decisions is
ANALY ZE TH I S putting stress on business analysts of all stripes – marketing analysts, supply chain analysts, operations analysts, financial analysts – to access and effectively utilize data more quickly and creatively than ever. While the proliferation of business intelligence software has helped to automate much of the repetitive data manipulation, analysts nevertheless often need to get their hands on a more customized chunk or summary of data to answer a constantly growing list of questions whose answers have not yet been codified. So, Liou explained, analysts can put in requests to a centralized group and expect to wait days or weeks or longer for the desired data. Alternately, they can try to figure it out by writing their own code to access the data (usually via SQL) and analyzing it themselves. Given these two choices, many analysts were increasingly keen to try their hand at writing basic queries and analyses, simply because they lacked the time and patience to wait for someone else to do it for them. Think of this as a choice to learn how to fish rather than waiting around for someone to perhaps someday give you a fish when if they get a minute. “Speed to decision-making is what is making these business professionals require a different skillset,” Liou said. “And we’re also seeing demand for data visualization skills and a better understanding of how to design experiments.” Liou was also
quick to point out the importance of teaching business students and business professionals relevant technical concepts and tools within the context of their job functions. “With us, regardless of whether you are learning coding skills or analytic techniques, it is very case-based and always incorporates business data and language.”

Teaching analytic concepts and tools within a business context is something we have long been doing with our MBAs at USF, so this final point also resonated with me. And Liou’s overall summary – better decisions at faster speed + overworked specialized resources = increased demand for business analysts with a modicum of technical competency – really helped give me a much better understanding of why enrollment in our MBA electives is rising.

Final thought: Given that these basic technical skills are an increasingly large part of so many analyst jobs, how long until basic programming skills become a required part of every MBA program? I suspect that day will be here sooner than we think. ❙

Vijay Mehrotra (vmehrotra@usfca.edu) is a professor in the Department of Business Analytics and Information Systems at the University of San Francisco’s School of Management and a longtime member of INFORMS.

NOTES & REFERENCES
1. To learn more about this class, see http://dx.doi.org/10.1287/ited.2015.0149
2. http://sloanreview.mit.edu/article/getting-value-from-your-data-scientists/
BUSINESS ANALYTICS
Online
MBA
BECOME MORE AT THE
Beacom School of Business
Best Value MBA: Ranked Top 10 for Affordability & Accreditation by Best Master’s Degree
Top Rated College by Forbes & Princeton Review
Online MBA: Ranked Top 25 in the World by Princeton Review
MBA – General
MBA – Business Analytics
MBA – Health Services Administration
Get started at
www.usd.edu/onlinemba cde@usd.edu • 800-233-7937
HEALTHCARE ANALYTICS
Quiet period in healthcare analytics: lull before the storm? The mission is to combine data on the health of their employees using analytics to discover the cost vs. benefit from their employer-sponsored health insurance plans.
BY RAJIB GHOSH
A somewhat quiet period is continuing in the healthcare analytics market. At the end of February the annual conference of the Healthcare Information and Management Systems Society (HIMSS) will take place in Las Vegas. Health IT companies and tens of thousands of attendees will congregate around the key theme of the event, “healthcare analytics and the rise in big data in healthcare.” I will provide coverage of HIMSS in my next column. In this article I will highlight a few new developments in the healthcare analytics space, which are both interesting and promising.

BIG EMPLOYERS COME TOGETHER TO TRANSFORM HEALTHCARE

Twenty major corporations including behemoths like IBM, Coca-Cola, Verizon and American Express recently announced a new alliance called the Health Transformation Alliance. The mission is to combine data on the health of their employees, using analytics to discover the cost vs. benefit from their employer-sponsored health insurance plans. The project is expected to launch in 2017. Together those corporations spend $14 billion a year on healthcare for about 4 million people. That’s a lot of money and data!
In the United States there are two major payer groups: the government (Centers for Medicare & Medicaid Services, or CMS) and employers. Current data suggests that 55 percent of “non-elderly” people (less than 65 years of age) in the U.S. are covered by employer-sponsored health insurance. Employers usually buy group insurance from health plans, or they are self-insured; in the latter case they pay directly for the healthcare of their employees. Clearly hospitals and other care delivery organizations have a lot at stake when it comes to employers. This is a big initiative, and more details are yet to emerge. Does this alliance have the potential to create transformational pressure on the healthcare system? We will have to wait a year or two for a conclusive answer.
HEALTH INSURANCE EXCHANGES ARE NOT IN DANGER
In my last column I highlighted four mega trends to watch for in 2016. One of those was consumerism in healthcare, which is poised to rise and influence healthcare in the coming years. A key aspect of that is the sustainability of the health insurance marketplaces. UnitedHealthcare created a lot of noise last year with the announcement that it was leaving the health insurance exchange marketplace because of losses it suffered in that business. Many predicted that this
could be the beginning of the end for insurance exchanges and Obamacare in general. But a recent report from the Urban Institute suggests that United’s departure had practically no impact on exchanges. Why is this important? If the insurance marketplaces stay, then consumers will have the opportunity to pick their own health plans. As expected, they will shop based on transparency of cost and value. Taking charge of one’s own health is the key driver for individuals being invested in healthcare. Therefore, a consumer-centric platform – and analytics that track consumer behavior to make the experience more engaging – will continue to emerge and get better over time. There is no barrier in sight for the rise of consumerism in healthcare.
MORE HOSPITALS TO INVEST IN DATA WAREHOUSING AND MINING
Technology procurement and adoption within hospitals is also changing. HIMSS Analytics published a new report showing that, compared to 2015, about 500 percent more hospitals have plans to deploy clinical data warehousing and data mining technologies in 2016. This is a huge increase. Clearly, care delivery organizations have come to the realization that the electronic health record (EHR) systems that they have bought
and deployed during the past years are just data capture and billing systems. Organizations need to go beyond that to add sophistication to their decision-making process and be successful in the new world of payment reform.
TECH COMPANIES OPEN SOURCING ARTIFICIAL INTELLIGENCE ENGINES
Some interesting developments that are worth paying attention to have started to happen in the technology world. Microsoft recently announced that it is open sourcing its deep learning source code stack, dubbed CNTK, for developers. A couple of months ago Google open sourced its own artificial intelligence stack, called TensorFlow. There were other open source initiatives in the artificial intelligence (AI) world in the past, but Google and Microsoft’s foray into this is significant. Many pundits believe that those two technology giants, especially Google, are possibly many years ahead of other organizations in this area. This trend could open up more possibilities. Image and speech recognition are two major areas where AI engines are powerful. IBM also offers deep learning in the cloud with its very own Watson, though it is not open source. In fact, IBM is banking on its Watson technology for various businesses and its own internal corporate transformation.
Undoubtedly “deep learning” and AI are about to usher in a new era in analytics that will have a profound impact on all industries. Healthcare will be no exception. Having said that, I would also like to caution all AI enthusiasts. As with any other new and transformational technology, let’s make sure we understand that it is neither the mathematics nor the coolness of the technology that transforms our everyday interactions. It is the usability at the point of use that does the trick. Can we interact with AI the way we do with a human? That jump takes a while – possibly many years after the technology or the mathematical model emerges or matures. It will be a while before AI can become a significant part of our lives and healthcare. But one thing is for sure: Owing to Google and Microsoft’s actions, we may be able to reach that future faster by leveraging the power of the crowd. ❙
Rajib Ghosh (rghosh@hotmail.com) is an independent consultant and business advisor with 20 years of technology experience in various industry verticals, where he has held senior-level management roles in software engineering, program management, product management and business and strategy development. Ghosh spent a decade in the U.S. healthcare industry as part of a global ecosystem of medical device manufacturers, medical software companies and telehealth and telemedicine solution providers. He’s held senior positions at Hill-Rom, Solta Medical and Bosch Healthcare. His recent work interests include public health and IT-enabled sustainable healthcare delivery in the United States as well as emerging nations.
INFORMS INITIATIVES
ACB, continuing education & more
MITCHELL-GUTHRIE, LEVIS TO LEAD ANALYTICS CERTIFICATION BOARD
Polly Mitchell-Guthrie of SAS and Jack Levis of UPS will serve as chair and vice chair, respectively, of the 2016 Analytics Certification Board (ACB) following their election by INFORMS members and CAP designees. They will be joined on the ACB board by: Aaron Burciaga, CAP, Accenture Digital; Tom Davenport, Babson College; Bill Franks, Teradata; Esma Gel, Arizona State University; Jeanne Harris, Columbia University of New York; Lisa Kart, CAP, Gartner; Kathy Kilmer, Disney; Jonathan Owen, CAP, GM; Greta Roberts, Talent Analytics; Jim Williams, CAP, FICO; and Melissa Moore, INFORMS executive director. The ACB will meet on a quarterly basis to provide oversight and guidance to the Certified Analytics Professional and the Associate Certified Analytics Professional programs. For more about the CAP® program, visit https://www.certifiedanalytics.org/about.php
INFORMS CAREER CENTER OFFERS SUPPORT FOR INTERNSHIPS
Just in time for the annual internship season, INFORMS now offers support for both students and employers through the INFORMS Career Center. Students can access analytics internships posted through the INFORMS Career Center and elsewhere on the Web. To start your search, select Job Seekers > Job Search from the menu at careercenter.informs.org, apply the “Internship” filter under JOB FUNCTION and click “Update Results.” Employers can post internships through the INFORMS Career Center for free. To post your position, select Employers > Post a Job from the menu at careercenter.informs.org, click the appropriate continue button (INFORMS member or not), log in and then select the appropriate “30-Day Internship Posting.”
INFORMS LAUNCHES EDUCATION RESOURCE LIBRARY
INFORMS recently launched the INFORMS Education Resource Library, a
comprehensive new site about university analytics programs. It offers a rich set of resources, including a listing of university programs and their courses, syllabi, case studies and research conducted by INFORMS committees about the intersection of university analytics programs and industry. Whether you’re directing a corporate analytics department and reaching out to job candidates, an analytics expert seeking to work with university researchers, an academic or a student, you’ll find a wealth of resources at https://education.informs.org/home.
ANALYTICS CONFERENCE + CONTINUING ED WORKSHOP = BUNDLED SAVINGS
INFORMS Continuing Education is offering special savings on its two-day professional development workshops when “bundled” with registration for the 2016 INFORMS Conference on Business Analytics & Operations Research in Orlando, Fla.
The INFORMS Education Resource Library is a comprehensive site about university analytics programs.
On April 13 and 14, INFORMS will conduct its “Essential Practice Skills for High-Impact Analytics Projects” and “Foundations of Modern Predictive Analytics” courses at the University of Central Florida Executive Development Center in downtown Orlando. Register before March 21 and pay $1,950 for registration to the 2016 Conference on Analytics, as well as one of the two continuing education courses mentioned above. For more information on the bundled pricing visit the Continuing Education page for the 2016 Analytics Conference.
INFORMS AIDS DATA SCIENCE BOWL INFORMS will once again support the National Data Science Bowl, an online, three-month-long (ending March 14) competitive event sponsored by Booz Allen Hamilton and Kaggle. Held in conjunction with the National Heart, Lung and Blood Institute (part of the National Institutes of Health), this year’s challenge is to develop an algorithm to empower doctors to more easily diagnose dangerous heart conditions and help advance the science of heart disease treatment. For more information, visit www.datasciencebowl.com/ ❙
FORUM
Data deluge problem for the oil and gas industry Too much data is a matter of perspective, and this challenge can be viewed as an opportunity – an opportunity to explore new avenues of growth.
BY SANDEEP BHAGAT
Can too much data be a problem? I believe we all can agree that an abundance of anything, especially data, can become an issue at some point. The oil and gas industry, a veteran data generator, is no exception, and data growth is posing serious questions. However, now that the issue is identified, the next step is to find a way to deal with it as efficiently as possible. Too much data is a matter of perspective, and this challenge can be viewed as an opportunity – an opportunity to explore new avenues of growth. Long before big data was making headlines in the tech cosmos, huge oil and gas companies were hoarding loads of data: 3D earth models, videos, historical seismic data from deep probes, well log data, temperature registers, chemical analysis reports, sensor data from machines and production data that aid technical decisions and drilling strategy. Despite technology and cost constraints, the oil and gas industry has built mature business intelligence
systems. However, many companies often performed business intelligence analysis in isolation on a comparatively smaller set of enterprise data, which limits a company’s ability to come up with meaningful and actionable insights for better business outcomes. With recent advancements in big data technology, enterprises can build cost-effective, scalable platforms to process and analyze the ocean of data captured in real time from various sources. Big data and advanced analytics technologies have ample scope in both the upstream and downstream oil and gas sectors. Understanding and leveraging data in the upstream sector – exploration and production – has helped uncover new business opportunities and improve performance through fast, agile analytics aimed at modeling and simulating business performance. Similarly, in the downstream sector, big data has improved planning and forecasting, maximizing production and enhancing maintenance. Oil and gas companies can leverage the broad spectrum of big data technologies in areas such as the digital oil field, exploration and production, preventive maintenance and compliance with environmental, health and safety regulations. By analyzing data generated by sensors during
oil drilling, exploration, production and refining – along with market and social data – oil and gas enterprises gain competitive advantage in many ways:
• Exploration and production of oil reserves. Seismic monitors generate vast amounts of data that can potentially identify new oil and gas reserves and trace signatures previously overlooked. The massive amount of data generated can be analyzed along with other variables, such as weather and soil data, to help predict in real time the success of drilling operations during exploration of new reserves.
• Preventive maintenance of equipment. Using big data technologies on the variety of data captured from a diverse set of sources and formats, oil and gas enterprises can quickly analyze safety and environmental anomalies in drilling, well problems, etc., and shut down drilling before it turns risky. Sensors at the drill heads can also be monitored to collect data about the behavior of the equipment and predict when it is going to fail or needs maintenance. Combined with historical data and the mechanics of the equipment, it becomes possible to predict the life of the equipment well in advance.
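The preventive-maintenance bullet above can be sketched in a few lines of Python: compare each new sensor reading against a rolling baseline and flag readings that drift too far from it. The readings, window size and threshold below are invented for illustration; a production system would use far richer models and real telemetry.

```python
from statistics import mean, stdev

def flag_anomalies(readings, window=5, threshold=3.0):
    """Flag sensor readings that deviate from a rolling baseline.

    A reading is anomalous when it falls more than `threshold`
    standard deviations from the mean of the preceding `window`
    readings -- a deliberately simple stand-in for the predictive
    models described above.
    """
    alerts = []
    for i in range(window, len(readings)):
        baseline = readings[i - window:i]
        mu, sigma = mean(baseline), stdev(baseline)
        if sigma > 0 and abs(readings[i] - mu) > threshold * sigma:
            alerts.append(i)
    return alerts

# Simulated drill-head vibration readings (invented values); the
# spike at the end is the pattern that would trigger a shutdown
# or a maintenance check.
vibration = [1.0, 1.1, 0.9, 1.0, 1.05, 1.1, 0.95, 1.0, 9.5]
print(flag_anomalies(vibration))  # -> [8]
```

The same logic scales from a single sensor to a fleet; the hard part in practice is streaming the data in and tuning the thresholds per piece of equipment.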
• Compliance with environmental, health and safety regulations. Big data and oil and gas are a powerful combination that can deliver massive benefits even in the compliance space around environmental, health and safety regulations. Shutting down the drills in case of abnormality, determined through big data analytics, can prevent environmental disasters. In addition, data from smart video cameras can be used to record what is happening in real time and avoid any scenario of security breach at the sites. Data collected from geo-mechanical state-of-earth models helps in understanding the earth’s subsurface to develop safe and sustainable drilling methodologies.
Currently, the oil and gas industry is at varying stages of big data and analytics adoption. A maturity curve is emerging, and some early adopters are visible. One of the earliest stages in that maturity curve is connecting operating assets, performance monitoring and problem diagnosis. By introducing analytics and more flexible production techniques, the industry can boost productivity by as much as 30 percent.
The bottom line is that the oil and gas industry has an opportunity to re-imagine how it approaches data-driven intelligence. Doing so requires new integrated planning and operations tools to coordinate and refocus resources. The first step is to create and operationalize data science as a way to standardize analytics and data platforms internally. This will enable companies to industrialize their use of analytics. ❙
Sandeep Bhagat is a big data technology strategist, product and platform visionary at Wipro. A data science practitioner with more than 18 years of experience, his core technology expertise includes big data & analytics, machine learning, business intelligence, information architecture, product engineering, performance engineering and cloud computing.
VIEWPOINT
Is analytics really new? I believe there may be some sub-optimization and duplication of analytics resources being used without having a more centralized CAO function.
BY JAY LIEBOWITZ
Data analytics is now the craze. From the “sexiest” job (“data scientist”), as previously indicated by the Harvard Business Review, to the overabundance of demand vs. supply of data scientists, organizations are clamoring to tap the expertise of business analytics professionals and data scientists to look for insightful trends and improve organizational decision-making. I am certainly part of this fan club, as evidenced by launching a new book series on data analytics applications to be published by Taylor & Francis. (We already have 12 books signed, ranging from sports, education, business and healthcare to government, law, cybersecurity and beyond.) However, in stepping back and contemplating this emerging field, I wonder if “analytics” is really something new. With my background in operations research (O.R.) and artificial intelligence (AI), I see many of the underlying foundations and techniques being applied in analytics coming from these disciplines and other fields (such as social psychology, if looking at intuition-based decision-making and analytics). Whether looking at structured or unstructured data, the database management, AI and OR/OM (operations research/operations management) communities have certainly contributed greatly to the field of analytics. Of course, with some of the newer languages like R
and Python and Hadoop architectures, the field of data analytics has some potentially unique elements that have evolved from computer science, statistics, mathematics and other fields. But is this field of analytics so new, and what’s driving it? If you look at industry research like Gartner’s, you see that business intelligence (BI) and analytics are focused on hot topics such as modern BI, bimodal BI, advanced analytics, automated decisions, self-service, the Internet of Things, data lakes, chief analytics officers, algorithmic business and the cloud. And if you look at the Forrester Wave for BI and Analytics, you see that predictive analytics will continue to be an important capability needed by organizations, with prescriptive analytics the hope for the future. Organizations continue to be inundated with data, both internal and external, and they need improved ways of making sense of this onslaught of both structured and unstructured data. This is where analytics seems to be playing an important role, along with filling other niches such as customer analytics and healthcare analytics. But again, is this really new? There is a growing trend for a relatively new position in organizations, namely the chief analytics officer (CAO). However,
in my experience, many companies have decentralized their analytics expertise within the various functional/business units in the company. I believe there may be some sub-optimization and duplication of analytics resources without a more centralized CAO function. The annual SAS Analytics Summit typically attracts more than 150 CAOs as part of its executive component, and there seems to be a growing trend toward this type of position. This reminds me of my knowledge management days and the chief knowledge officer (CKO) position. Unfortunately, though, the CKO position didn’t rise to as much prominence as many people thought it would. I wonder if this may be the same fate for CAOs. Well, whether new or old, the analytics field will speedily grow in the coming years. As a shortage of data scientists looms and the influx continues of varied types of data of great quantity and velocity, it probably is a safe bet to encourage those individuals who have some technical orientation to pursue analytics. Of course, cybersecurity would be another great opportunity for those with similar inclinations. More analytics and data science programs are being developed at universities worldwide in order to meet the growing demand for graduates with these types of skill sets. In our MS in Analytics program, we already
have about 300 students in this degree at Harrisburg University of Science and Technology. In a way, analytics is taking a similar path as knowledge management. Knowledge management is a multidisciplinary field with some of its roots in AI, organizational learning, human capital strategy and business process innovation. Analytics has its underlying foundations in the computer science, AI, statistics, O.R. and management disciplines. As knowledge management
initiatives are fading a bit in organizations, hopefully analytics will pick up the slack and ensure a greater longevity for improved decision-making and insights. If not, we will wait until the “next great thing” in order to make our organizational lives easier, and still debate whether this “next great thing” seems new or not. ❙ Jay Liebowitz (Jliebowitz@harrisburgu.edu) is the distinguished chair in applied business and finance at Harrisburg University of Science and Technology in Harrisburg, Pa. He is a member of INFORMS.
CONGRATULATIONS TO THE 2016 EDELMAN FINALISTS
360i, for “360i’s Digital Nervous System”
BNY Mellon, for “Transition State and End State Optimization Used in the BNY Mellon U.S. Tri-Party Repo Infrastructure Reform Program”
Chilean Professional Soccer Association (ANFP), for “Operations Research Transforms Scheduling of Chilean Soccer Leagues and South American World Cup Qualifiers”
The New York Police Department (NYPD), for “Domain Awareness System (DAS)”
UPS, for “UPS On-Road Integrated Optimization and Navigation (ORION) Project”
U.S. Army Communications-Electronics Command (CECOM), for “Bayesian Networks for U.S. Army Electronics Equipment Diagnostic Applications: CECOM Equipment Diagnostic Analysis Tool, Virtual Logistics Assistance Representative”
Join us at the Edelman Gala, April 11 in Orlando, Fla., when the 2016 winner is announced! http://meetings.informs.org/analytics2016
AUTOMATED ANALYTICS
Predictive analytics in the cloud: It’s all about the data
BY DAVE HIRKO
During the 1992 presidential election, the Clinton team coined the phrase “it’s the economy, stupid,” as an easy way to remember one of the most important platforms of the campaign. For the cloud – and especially predictive analytics in the cloud – it’s not the economy, but the data, that makes all the difference. These days, as the cloud is making storage of enterprise data easier and more affordable for companies of any size, every business is now a data business, whether they know it or not. And that will be truer still as the Internet of Things begins to collect and contribute data to enterprise systems from nearly
every household and business device or appliance. You have to assume that the volume of enterprise data will increase (possibly exponentially) every year. Most organizations already are overwhelmed with data and can’t process it fast enough. Enter the cloud. There’s a natural synergy between the cloud and analytics. The cloud allows you to scale out horizontally easily and quickly, which in turn enables you to look across silos of data to identify developing trends. Most companies that are struggling with a move to the cloud are concerned in particular with how to migrate data to this new computing environment – and that’s where they’re going wrong. New
technology makes it much more practical to scratchbuild their data repositories in the cloud rather than migrate data to the cloud. After that, complex data analysis can be underway in minutes rather than months (if at all). Let’s look back at the cloud, and ways to make the best use of the technology when putting predictive analytics to work on an enterprise scale.
“POWER COMPANY” OF THE INTERNET AGE
In the past, on-premise data collection and management was limited because IT resources were finite and expensive. That has changed with the cloud. Think of it as analogous to the power company at the turn of the 20th century. In the late 1880s, when electricity was just coming on the scene in industry, every business built its own generating capacity at great expense to the company. When local electric utilities came along to manage capacity and distribution, businesses no longer needed their own power generators, and the use of electrically powered equipment became more commonplace. Datacenters in the 1990s and 2000s were a lot like that: Every company had one, and they were very expensive. But the cloud has changed the landscape. And just as power utilities mean we no longer have to worry about running out of power, so the cloud means we no longer have to worry about running out of space and computational power to analyze our data.
With the power company, you plug into an outlet. With the cloud, you plug into a software application program interface (API).
ANALYTICS IN THE CLOUD
For most enterprise purposes, the cloud has practically infinite capacity for storing data. It is finite, of course, but for all intents and purposes, like the power company, it doesn’t ever run out. It does, however, have its own set of challenges.
THE MYTH OF CLOUD’S SECURITY PROBLEMS
The most common challenge for organizations moving to the cloud is embracing a new security paradigm for storing their data. Legacy technology companies whose business model is threatened by the cloud have for years been perpetuating the notion that the cloud is less secure than enterprise data systems. That’s mostly a marketing pose, but it has taken its toll on cloud adoption. In reality, cloud security is implemented differently, but when it’s done correctly, the cloud can actually be more secure than on-premise solutions for data storage. The problem lies in a skills gap in the market for technical professionals. Comparatively few IT professionals have both security and cloud skills, which is, to an extent, hampering the adoption of the cloud for data-intensive applications like predictive analysis. It will take time for customers to fully trust storing their sensitive data in the
cloud, but it will happen. It’s not a question of “if,” but of “when.”
DATA MIGRATION, NO; SCRATCHBUILDING DATA, YES
Contrary to popular belief, many organizations are not “migrating” their analytical applications to the cloud. Rather, they are building new cloud-based systems from scratch and writing off the legacy systems. As early as the end of 2013, Piper Jaffray enterprise software analyst Mark Murphy surveyed a number of IT professionals and concluded that by 2018 one-third of all computing workloads will be running on the cloud as opposed to on-premise. In a separate study, the analyst firm IDC noted that the cloud represents one-third of all new IT infrastructure spending, and that cloud spending is expected to increase steadily through 2019. Our own experience in helping businesses move to the cloud is that enterprises are building many times more applications in the cloud than had existed previously in legacy on-premise datacenters. Migrating data to the cloud is hard; it takes money and time. Expensive circuits are needed to move the data. Consequently, it may take months – or even years – to move terabyte- and petabyte-scale data to the cloud.
The better strategy is to start collecting data right to the cloud, and to analyze that data locally in the cloud. Migrate data only as a last resort – and be prepared to spend a considerable amount of time, money and effort to do so.
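A minimal sketch of the collect-straight-to-the-cloud strategy: incoming records are serialized and written under date-partitioned keys in an object store, so analysis can run where the data already lives. The key layout, record fields and bucket name below are hypothetical, and the commented-out boto3 call simply marks where a real cloud client (here assumed to be AWS S3) would plug in.

```python
import datetime
import json

def object_key(source, ts):
    """Partition incoming records by source and date -- a common
    layout when collecting data directly in cloud object storage."""
    return f"raw/{source}/{ts:%Y/%m/%d}/{ts:%H%M%S}.json"

def to_payload(record):
    """Serialize one record for upload."""
    return json.dumps(record, sort_keys=True).encode("utf-8")

# With a real cloud client, the upload itself is a single call,
# e.g. with AWS S3 (bucket name is illustrative):
#   import boto3
#   boto3.client("s3").put_object(
#       Bucket="my-data-lake", Key=key, Body=payload)
ts = datetime.datetime(2016, 3, 1, 12, 0, 0)
key = object_key("device-42", ts)
payload = to_payload({"pressure": 101.3, "rpm": 1450})
print(key)  # -> raw/device-42/2016/03/01/120000.json
```

Because records land in the cloud as they are produced, there is never a terabyte-scale backlog to migrate later.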
OPEN SOURCE ANALYTICS: THE DEMOCRATIZING SOFTWARE TREND
After creating the appropriate scheme for data collection in the cloud comes the challenge of analysis in that environment. Fortunately, open source analytics software is allowing an exponentially larger number of analytic solutions to be used in an enterprise environment. Traditionally, there were only a handful of proprietary analytical software solutions, from companies like Oracle, SAS, SAP and so on. These solutions were very expensive, so only a few “elite” organizations (big banks, government agencies and so on) could afford them. Open source software has “democratized” the ability for small and large companies to build their own analytical applications. In our experience, no one is building analytics applications meant exclusively for on-premise use any longer. There are more net-new analytical applications today than there were five or 10 years ago, but all of them are being born in the cloud. In the early days of enterprise computing, massive computing power scaled up vertically, by adding more CPUs, servers, etc. The cloud takes a different approach, scaling from one node to hundreds with a single API call. Analytics software can be distributed
over thousands of servers, scaling horizontally rather than vertically. With this distributed processing technology, you can send a job to each node in a system, which lets you do very sophisticated processing in parallel.

As for analytics applications themselves, modern open source analytics software has its roots in Google’s search technology, specifically the Google File System. By the mid-2000s, Yahoo announced an open source version of Google’s technology called Hadoop. Hadoop relies on many (on the scale of hundreds or thousands) cheap commodity hardware servers, which made it a perfect fit to run in the cloud. Over the past 10 years, the Hadoop ecosystem has grown significantly, and it has now been adopted by almost every enterprise or Fortune-level business doing work in the cloud.

A chief advantage of the cloud is its scalable, “pay as you go” nature. It allows you to pay for only what you use, and then turn it off. By extension, analytics applications in the cloud are similarly scalable and even “ephemeral.” They exist only when they’re needed – and like the power company, you can turn the service on and off when you need it (thus eliminating the fees of massive and cumbersome enterprise applications).
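The scatter-gather pattern described above – split the data into partitions, send a job to each node, then combine the results – can be sketched in miniature with Python’s standard library. Local worker processes stand in for cluster nodes here; all names are illustrative, not any specific cloud API.

```python
# A minimal sketch of horizontal scaling: local worker processes stand in
# for cluster nodes. Names are illustrative, not a real cloud or Hadoop API.
from multiprocessing import Pool

def process_partition(records):
    """The 'job' sent to each node: count words in one data partition."""
    return sum(len(line.split()) for line in records)

def run_distributed(dataset, nodes=4):
    # Scatter: split the dataset into one partition per node.
    partitions = [dataset[i::nodes] for i in range(nodes)]
    # Each partition is processed in parallel on its own worker.
    with Pool(processes=nodes) as pool:
        partial_counts = pool.map(process_partition, partitions)
    # Gather: combine the per-node results into a single answer.
    return sum(partial_counts)

if __name__ == "__main__":
    data = ["the cloud scales out", "not up"] * 1000
    print(run_distributed(data))  # prints 6000
```

Adding "nodes" is one argument change; in a real cluster the same pattern holds, with the scatter and gather steps handled by a framework such as Hadoop rather than by hand.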
“SOFTWARE IS EATING THE WORLD” – SORT OF

That well-known quote comes from Marc Andreessen, an original developer of the modern Web browser and present-day venture capitalist. Internet-style applications are transforming other industries, as Uber has done for ride-hailing and Seamless for food service. In trying to reach that same level of software and industry integration, most organizations have not fully embraced how best to leverage cloud software APIs to maximize performance potential and cost savings. They’re still treating the cloud like their datacenter. Or, to use the power company analogy, they’re keeping the lights on all the time, incurring 24/7 expenses.

And while open source analytics software packages are very capable, they’re not yet mature enough to deploy easily. These packages still require very specialized (and expensive) staff who are hard to find and hard to retain. Consequently, many organizations are building their own cloud-based analytics software manually, one step at a time. This manual approach introduces the risk of project failures and delays. A deployment may take weeks, months or sometimes years to complete, making analytics software one of the most fragile systems in the organization. Once they work, no one wants to touch them.
This, of course, stifles innovation; after all, what’s the point of analysis if you can’t be flexible? In fact, most organizations are spending too much time on the systems, and not enough time on the data and analytics. One IT executive from a well-known hedge fund told us, “I hired mathematicians and data scientists from Princeton and Harvard to analyze my portfolio performance. I didn’t hire them to fix the Hadoop cluster.” It’s the same complaint, regardless of the industry. Customers say, “It’s taking
too long to analyze data; I can’t afford to have my IT department take that long to build a new analytics application. We need applications and data ready immediately.” Responding to that pain from business leaders, some firms – including ours – are now using software automation to orchestrate cloud and data APIs.

SOFTWARE CLOUD AUTOMATION CUTS DEPLOYMENT TO MINUTES

With software automation, analytical applications can be developed and
launched not in months or years but in minutes, ready to ingest data almost immediately. The approach is taking hold in a range of sectors, from financial services to healthcare, retail and technology.

Take the example of harvest.ai, a San Diego, Calif.-based provider of next-generation cyber security solutions for companies using platforms such as Google for Work, Office 365, Box or Dropbox. harvest.ai uses cloud-based analytics and natural language processing to help organizations identify and stop data breaches from targeted attacks, insider threats and stolen credentials in near real time, while still allowing their employees to efficiently collaborate and share data. The company works at the crossroads of computer science, artificial intelligence and computational linguistics to monitor the interactions between computers and human (or “natural”) languages.

By deploying multiple open source analytical software applications, harvest.ai was able to automate a cloud-based predictive analytics and data system. This automated system was launched literally in minutes with a few clicks of the mouse. Once automated, harvest.ai was able to offer multiple deployment modes of natural language processing based on
its customer needs. This deployment reduced the time required to serve customers to a fraction of a typical client engagement. Previously, it took the company weeks or even months to offer new analytical products to its clients; by automating its systems, the company cut that turnaround time to a matter of days.

As we’ve seen, the push to the cloud for predictive analysis can be backbreaking labor for companies trying to apply the traditional data center model to their business. Migrating data to the cloud – and building specific applications for your business alone – is a strategy that’s fallen out of step with the latest advancements in cloud technology. By automating your cloud strategy and using open source analytical applications, you can start using your enterprise data right away. And making use of that data quickly and easily is the point, isn’t it? Because when it comes to predictive analytics in the cloud, remember – it’s all about the data. ❙

Dave Hirko (dave@b23.io) is a co-founder of B23, a consulting company that provides comprehensive services for companies moving to the cloud, from application development to big data.
EMPOWERING PEOPLE
Democratization of analytics: New frontier of data economy “Each business is a victim of Digital Darwinism, the evolution of consumer behavior when society and technology evolve faster than the ability to exploit it. Digital Darwinism does not discriminate. Every business is threatened.” – Brian Solis
BY (l-r) SRUJANA H.M., SANJAY S. SHARMA AND AMITAVA DEY

Analytics influence every aspect of our existence, from the way we think and behave to how we work and do business. Every second, millions of records are generated by organizations, on social media and in the consumer space, leading to a humongous explosion of data of at least 2.5 quintillion
bytes [1] produced every day. By the year 2020 we are expected to see a 4,300 percent increase in the annual data generation [2]. Google alone processes 3.5 billion searches per day and houses 10 exabytes of data. Data is generated by multiple nodes and exchanged seamlessly between channels, ranging from digital to voice
Photo Courtesy of 123rf.com | Kheng Ho Toh
Many organizations are sitting on a gold mine of data but do not have effective resources to analyze the data in order to frame effective policies.

to mobile. Mobile data is experiencing exponential growth, especially in the developing world, where it already boasts 5 billion users. Individuals generate 70 percent of the overall data, but enterprises store 80 percent of this data [3]. MIT Technology Review indicates that even though digital data created by consumers is doubling every two years, almost all of it remains unused or unanalyzed. Research indicates that 99 percent of
new data is never used, analyzed or transformed. Of what use is the data if it is trapped in silos and not analyzed effectively? Many organizations are sitting on a gold mine of data but do not have effective resources to analyze the data in order to frame effective policies. A study by IBM found that key executives globally spend 70 percent of their time finding data and only 30 percent analyzing it. This challenge becomes
more complicated because data is not just obtained in a structured format. The vast majority of data exists in semi-structured or unstructured forms as social media turns all of us into data-generating agents. Most of the data in cyberspace remains untapped because there aren’t enough data scientists to analyze the data and derive meaningful information from it. To perform impactful analysis, a data scientist should have a deep understanding of mathematical concepts, proficiency in computational programming and sound domain knowledge. Unfortunately, relatively few data scientists possess all three skill sets, creating a scarcity of data science talent across the globe. A study by McKinsey & Company corroborates this, noting a grave shortage of analytics professionals with the needed skill sets; the United States alone faces a shortage of 190,000 data scientists. Consequently, businesses are constantly exploring alternatives for making data available to businesses and academia to unleash the value and derive meaningful insights from this data. In this light, “democratization of data and analytics” is the next promising frontier for business success. Democratization of data and analytics is the phenomenon of making data available to people who need it and have the
skill sets to derive meaningful insights from it. By freeing themselves from data silos and the traditional practice of data collection, storage and access, agile businesses can not only improve their dynamic decision-making, but they can also expedite enterprise data integration and decentralization. While a plethora of analytics and data visualization tools have opened up new possibilities for sharing data across a business, they have also introduced a new set of challenges for business owners and analytics teams.

DEMOCRATIZATION OF BUSINESS DATA AND ANALYTICS

In a business context, data democratization primarily entails making data and analytics available to all layers of the organization while transcending departmental boundaries. For example, inventory data and forecasts may be of importance to finance teams in planning the budget allocation for purchases. Historically, in most companies, useful business data has been confined to IT staff and a handful of senior executives who need to make decisions based on that data. Such a data handling and analysis process, compounded with different and complex data analysis tools, poses severe limitations for management to get a unified and single version of the true
story across all the domains of the enterprise. This further impedes effective decision-making. Democratization of data and analytics is emerging as a provident solution to this problem. Democratization of analytics has necessitated the use of open source platforms such as Hadoop and programming languages like R. Artificial intelligence and machine learning further add to the accessibility and applicability of the democratization process.
Effective democratization of data analytics across the enterprise demands a three-pronged strategy: 1. Enterprise cross-functional resource architects. A marketing team can build a campaign for a product by working alongside the pricing team, which can take input from the product development team. The product development team can then take input from the market research team about the demand for such a product, the competition, etc., thus building a cross-functional business
intelligence resource with enterprise-wide access. Along with creating greater access, the organization needs to breed a culture of grooming “resource connectors,” people who are involved in connecting employees at all levels to BI (business intelligence) resources and tools. These connectors play a pivotal role in mobilizing democratization of data and analytics in an organization.

2. Effective governance of seamlessly integrated data. Data governance is a crucial aspect that determines the success of democratization of data and analytics. Data governance entails two aspects: 1) upstream data processes consisting of data sourcing, transformation, storage, etc.; and 2) downstream data processes consisting of usage and consumption of data. Traditionally, data governance was viewed as a means to control data access, but now there is a paradigm shift toward performing data governance to drive business agility. The major data governance challenge confronted by businesses today is striking a balance between borderless data and data security. There is constant concern about business-sensitive data falling into the wrong hands, and about competitors making undue use of confidential data in ways that may hamper business growth. Most businesses
do not want to take the risk of providing wider access to their business-critical data. This results in a “do nothing” attitude that hampers innovation, discovery and business growth. Masking data could provide a way out of this problem. Data masking involves changing the actual figures in the data while maintaining its original characteristics, thereby ensuring compliance with data privacy norms while simultaneously promoting democratization of data. Data can be masked dynamically at the application layer or permanently at the source, depending on the sensitivity of the data.

A good data governance strategy focuses on building a futuristic business state through:
• assessment of leading practice and high performance model fit,
• identification of future state objectives and operating models,
• conducting a gap analysis, and
• creation of an implementation plan and roadmap.

3. Striking a balance between data quality, scalability and agility. As scale, complexity and availability increase, so do the challenges associated with maintaining quality. Data profiling carried out to determine anomalies, inconsistencies and redundancies in content, structure and relationships help fix challenges and
help maintain relevant versions at the source, thereby facilitating meaningful insight generation. Defining data quality capability through appropriate business rules and key performance indicators is the first step toward enabling a clear vision of data through all phases of data management. Such a strategy will prove to be the next frontier of competitive advantage, facilitating democratization of data and analytics.

Figure 1: Process of effective governance of seamlessly integrated data.

DEMOCRATIZATION OF NON-BUSINESS DATA AND ANALYTICS

Despite the skeptics of democratization of data and analytics in the corporate sector, these practices are already receiving great acceptance in the research, academic and public sectors (primarily governmental organizations). Governments, through censuses and surveys, collect lots of data on an annual basis, but this data is buried in governmental websites in the form of zip files and is not used as often as it should be. Dynamic data analysis to frame relevant policies remains a farfetched dream in this context. This data, however, can be used by researchers and academicians to perform analytics and derive meaningful inferences, which can be fed back to governments in the form of reports, proceedings of symposiums and conferences, thereby enabling effective and relevant policy formulation and smart governance.

Websites such as “FindTheData” are facilitating such democratization of analytics by organizing publicly available government data into more structured compilations and adding relevant filters, pivots and charts so that researchers can slice and dice the data for meaningful analytics. “Socrata” is another such public data discovery platform; it gives access to massive amounts of data, enabling users to create different views and analyze hidden patterns in
the governmental data that were previously not looked at to decipher policy impacts. Socrata gives consumers access to public data and also equips users with the tools they need to draw insights from it, taking democratization of data analytics to the next level.

SUMMARY

Data has become the new raw material for business and a fundamental basis for social existence in this century. Many studies in recent years (EMC, etc.) have indicated that, on a global scale, data and information are inexorably doubling in less than two years, yet an estimated 99 percent of the overall data in the world is not used or analyzed. Unleashing this data to derive meaningful information would require people with deep analytics skills. Unfortunately, a huge gap in the demand and supply of analytics professionals is impeding the data analysis potential of corporate and governmental organizations. Democratization of data and analytics seems to be a promising solution to this problem. As the next frontier of competitive differentiation, the democratization of data and analytics will help businesses become more agile, productive, adaptable and profitable, while helping governmental organizations make real-time policies that can
better cater to the relevant needs of all stakeholders. ❙

Srujana H.M. is a marketing analytics manager at Accenture Digital, an executive member of the Analytical Society of India and an internationally certified Project Management Professional. Sanjay S. Sharma is managing director & India lead of Advanced Analytics in Accenture Digital. He leads a multi-functional, multi-industry team of about 500 analytics professionals. He is a member of INFORMS. Amitava Dey is a data science principal director for marketing analytics India at Accenture Digital. He has 12 years of experience working across different functional domains and industries with a keen interest in learning new techniques and applying them to real-life analytical problems.

REFERENCES
1. http://www.vcloudnews.com/every-day-big-data-statistics-2-5-quintillion-bytes-of-data-created-daily/
2. https://www.linkedin.com/pulse/4300-increase-annual-data-generation-2020-calls-change-yaron-haviv
3. http://www.csc.com/big_data/flxwd/83638-big_data_just_beginning_to_explode_interactive_infographic
SOURCES
• “The Democratization of Big Data,” Journal of National Security, Law & Policy, Vol. 7, p. 325.
• OECD report on data-driven innovation for growth and well-being, 2014.
• “Big Data, Big Analytics: Emerging Business Intelligence and Analytic Trends,” Wiley Publishers, Michael, Michele & Ambiga.
• Accenture white paper on “Data Governance and Data Management,” 2011, 2012.
• https://www.betterbuys.com/bi/business-intelligence-stats/
• http://www.csc.com/insights/flxwd/78931-big_data_universe_beginning_to_explode
AGRICULTURE & ANALYTICS
How would Google farm? The race to feed 9 billion people through the Internet of Agriculture and analytics.
BY (l-r) ALEX THOMASSON, GABE SANTOS AND ATANU BASU
A solution to world hunger might be found on the streets of Mountain View, Calif., and Austin, Texas. We’re talking about Google’s self-driving cars, now being test driven, and the technology behind them. Although semi-autonomous farm equipment has existed for several years, the link between the tech giant and feeding the masses has more to do with sensors, data and analytics and less to do with just machines. In the decades to come, the intersection of these informational technologies will become increasingly crucial to feeding the world.

The need to rethink food production has never been more urgent. Anticipated
population gains in developing countries and shifting demographics, particularly the expansion of the middle class, provide the biggest clues to what is coming. World population is predicted to grow from 7 billion today to 9.6 billion by 2050 and plateau at around 11 billion in 2100. That’s a lot more hungry mouths. Clearly, the demand side of the equation is daunting; the supply side also looks problematic. Urbanization, road construction and potential climate effects reduce the amount of arable land available for farming. Taken together, those issues will require that food production per acre of land be doubled to meet the burgeoning demand by the end of this century.
Photo Courtesy of 123rf.com | Sandra Cunningham

Farmers have used new technologies in an attempt to increase crop yields while using fewer resources.

Current agricultural technologies offer little hope for a solution. Growth in agricultural productivity is declining worldwide, according to the U.S. Department of Agriculture. Growth in productivity of grains and oilseeds, for example, averaged 2.4 percent per year from 1970 to 1990. But between 1990 and 2010, productivity growth fell to 1.6 percent annually, and the downhill slide is continuing. By 2021, it’s expected to be down to 1.5 percent.

Farmers in recent years have used new technologies in an attempt to increase crop yields while using fewer resources. But those technologies have been adopted piecemeal, resulting in less-than-desired outcomes. That’s where Google comes in.

WHAT DOES GOOGLE HAVE TO DO WITH FARMING?

Google’s big leap forward in the race to autonomous cars is its success at developing a fully integrated system. That system relies not only on sensors, but also on data – static and dynamic, and in many forms – along with algorithms from different scientific disciplines. If we are to meet the challenges of feeding an expanding global population, then adopting an integrated approach that takes into account advances in informational technologies and complementary scientific disciplines to improve crop productivity is essential.

Google’s blueprint for transforming our mobility is predicated on sensors, data and software. Like a human driver, the computer-driven car is designed to detect the driver’s location, understand what’s happening around the driver, predict what might happen next, prescribe what to do, and then implement this prescription. At the same time, Google must design cars with the capability to navigate varied and complicated scenarios and obstacles, including pedestrians and cyclists.

Crop production operates under a similar system of varied field and plant
conditions. Farmers must tailor their practices to address the vast differences that can exist on the same farm across topography, soil type, fertility, moisture content, plant health, weeds, insects and diseases. New precision agriculture technologies that take advantage of guidance systems and sensors potentially allow farmers to customize their practices by the square meter rather than by the acre. These tools have led to some advances, but those improvements have been only incremental, as they often focus on one factor at a time. Such point solutions increase efficiency but won’t solve world hunger.

Nor can precision agriculture technologies alone be sufficient. A scientific understanding of the genetic structure of the plants growing in the field is also required. That scientific knowledge must be fully integrated with the precision agriculture technologies to predict how a plant will respond to certain inputs in its environment (soil, moisture, temperature) and its various stressors (nutrients, weeds, insects, diseases). Making advancements in precision agriculture without accounting for plant genetics is akin to Google testing its self-driving cars on actual roads without paying attention to vehicle performance. For driverless cars to move from a science experiment to an accepted reality, they must not only function in actual traffic conditions on city streets and highways, but they must simultaneously
provide optimal performance in terms of efficiency, speed and comfort. The same premise applies to farming.

PRECISION AGRICULTURE POINT SOLUTIONS

In agriculture today, the main avenues for increasing productivity are optimizing farming practices through precision agriculture and accelerating crop improvements through plant breeding and genetic selection. While both of these avenues approach different aspects of agriculture – farming methods vs. the capabilities of the plants themselves – they are not being used together. Even so, they have a great deal of overlap in that they both rely on sensor information, analytics and data-driven action. The solution to doubling farm productivity must fuse the two. How do we get there? Let’s stroll down the two avenues and see how they might intersect.

OPTIMIZATION OF FARMING PRACTICES

Precision agriculture uses a comprehensive set of information technologies that rely on site-specific field information to vary production and management practices across the entire farm. The agriculture industry has been developing site-specific optimization techniques since GPS and satellite imagery became available. Robots have recently been introduced for tasks like
cultivation and plant thinning in high-value crops. Unmanned aerial vehicles (drones) are now being used to gather images of crop fields. Some of these technologies enable the possibility of prescriptive solutions that can improve yields, build and maintain nutrients over time, and reduce costs. Although some achievements have been made, further advances in precision agriculture have been limited by a lack of adoption and clear-cut benefits. Farmers have avoided potential precision agriculture technologies because of the time it takes to learn how to use them, the difficulty in managing the changes and, most importantly, a lack of obvious return on investment. Moreover, despite our tremendous data-collection capabilities, field and yield information is only valuable to farmers if it informs a management decision or agronomic practice. More sophisticated analytical tools that can synthesize all forms of data must be developed to enable the next step change in optimizing farming practices.

ACCELERATED CROP IMPROVEMENTS

Over the last century mechanization has increased farming efficiency, irrigation has provided precious water to crops, and more land has been brought into production. But perhaps the greatest productivity gains in agriculture over the
last several decades have come about through breeding. Plant breeders grow small plots of different types of plants within a crop type. While genetic technologies have advanced tremendously in recent decades, it has long been known that the different plants within a crop type (genotypes) were different because of their genetic makeup. When plant breeders grow these plots they look for particular traits of interest (phenotypes) such as high yield and tolerance to drought, insects and diseases. Crop improvements through breeding continue, but the pace of adopting those changes has been slowing. Fortunately, crop breeders and geneticists now have new capabilities that can advance the field. They can more scientifically select critical genes to bring about crops with physically measurable traits that confirm the plant has been improved. Known as phenotypes, these newly developed traits can bring on higher and faster yields, greater resistance to the numerous sources of stress in nature, and even new unique properties such as high levels of specific nutrients to reduce malnourishment in atrisk populations. Genetic measurement techniques have become fast and inexpensive. But genetic improvement has been slowed by the small number of plant varieties that can be included in a study because
of the amount of manual labor involved in measuring phenotypic traits. High-throughput phenotyping is a new field that capitalizes on recent technological advances. It uses sensors and automated delivery platforms, including robots and drones, to dramatically increase the number and the quality of phenotypic measurements that can be made. This allows many more varieties to be included in individual studies and provides a deeper pool for selecting appropriate genotypes.

FUSING FARMING PRACTICES WITH CROP KNOWLEDGE

Taken independently, precision agriculture and high-throughput phenotyping can provide sizable benefits. But how can these two technologies be merged to maximize agricultural production? Precision agriculture technologies provide farmers with ways to exert some control over the environment at the individual plant level, such as through the regulation of soil moisture and nutrient content. But, historically, precision agriculture has lacked detailed knowledge of how plants will respond in a multitude of situations. Crop models are evolving, but they are imprecise. We now have the potential to understand plant and field conditions on a single-plant or square-meter basis and relate them to physiological responses that are based on detailed knowledge of plant genetics.
Together, precision agriculture and high-throughput phenotyping give us a fairly high level of control over the desired phenotypes of a particular crop, which means we potentially have a great deal of control over yield and plant stress responses in the field. The genius of Google’s driverless car lies in its comprehensive analysis of data, which is aimed at creating a safe and efficient environment for vehicle travel. Compared to other fields where analytics have begun to be applied, agriculture is ripe for this type of thinking. Like transportation, it tends to operate over large areas with a great deal of spatial and temporal variability. Agriculture faces tremendous variability in crop and field status from location to location, not to mention weather.

BEYOND POINT SOLUTIONS – THE INTERNET OF AGRICULTURE

The technological backbone of Google’s automation project is the Internet of Things, which involves equipping objects with sensors and software so they can collect and exchange data. According to a recent report from the McKinsey Global Institute, the Internet of Things offers a potential economic impact of $4 trillion to $11 trillion a year in 2025. Opportunities to tap into the Internet of Things abound in agriculture, as a number of proximal sensors are already being implemented
WWW.INFORMS.ORG
as components of original farm equipment or separate systems by other providers. Remote sensing is seeing a resurgence in agriculture with the advent of drones. Some companies are providing solutions involving unmanned aerial vehicles, multispectral cameras and cloud-based analytics capabilities that can estimate a plant’s health, among other things. Increasingly, these sensors are connected through wireless communications to the Internet, making their data available for inclusion in farm databases and mapping systems, and, more importantly, for analysis. The real-time availability of the data and analytical capabilities to the farmer is steadily increasing and is ushering in a new era: the Internet of Agriculture. Advanced analytics, using data from the Internet of Agriculture, will shed new light on our understanding of what affects what, why, when and how, and in the process reinvent the agricultural sector. This understanding will evolve with improvements in analytics and as associated scientific disciplines advance. The key to doubling agricultural productivity – the fusion of accelerated crop improvement with optimization of farm practices – involves analyzing numerous data streams together. Doing so will accelerate progress toward the holy grail of prescriptive farming based on plants with specific genotypes.
ANALYTICS
We believe the roadmap to rebalancing the world’s food supply-and-demand equation will encompass several other near-term advances, including standardization of agricultural data types and metadata; vastly improved analytical tools that provide actionable decision support to farmers and their equipment; and early-stage fusion of precision agriculture with data from high-throughput phenotyping. If Google were to enter farming, it would likely start by finding and amassing all available data – without biases or preconceived notions – and let the data dictate how to proceed. A self-driving car can see, hear, read, understand, decide and act – just like human drivers do. As Google revolutionizes how we think about mobility, we have the same opportunity in agriculture. We are learning how to combine the latest advancements in precision agriculture with those in plant genetics. Their merger is imperative. A hungry world awaits. ❙

Alex Thomasson (thomasson@tamu.edu) is a professor of biological and agricultural engineering at Texas A&M University. He teaches and conducts research in subject areas related to remote and proximal sensing, including precision agriculture and high-throughput phenotyping. Gabe Santos is managing partner of Homestead Capital, a private equity fund investing in U.S. farmland. Atanu Basu (atanu.basu@ayata.com) is the CEO of Ayata. Based in Austin, Texas, the company’s customers include Fortune 500 operators in the oil & gas industry and in high tech. Ayata invented the technology behind prescriptive analytics with hybrid data over 10-plus years of research in artificial intelligence and related disciplines.
MARCH/APRIL 2016 | 55
HUMAN RESOURCES ANALYTICS
How to turn organizational data into corporate strength and success
BY RUPERT MORRISON
We are constantly reminded that people are an organization’s most valuable resource, and yet they are one of the most mismanaged. The problem is pervasive, eroding performance at many levels. At an individual level, too many employees are unclear about what their roles in the organization really are. At a higher level, too many organizations are still struggling to understand where their people are and what they are doing, let alone realizing their plans and aspirations for the future. To add to this mix, the people function often finds itself at the bottom of the pile when it comes to budgeting
priority and investment. So how can organizations turn this around? The answer lies in organizational data (OD) and analytics.

FUNCTION OVER FORM: AGENTS OF CHANGE

To appreciate the business value and opportunity of organizational data we need to understand why businesses are struggling to build effective organizations. Firstly, the practice of organization design and business transformation is often shortsighted and superficial. Businesses are overly concerned with form over function. There is an obsession with
organizational structures rather than the work that structure needs to do or the skills that structure needs in order to deliver. Secondly, organizations find it difficult to track the delivery and evolution of an organization design. Transformation projects are left to evolve rather than being led, resulting in superficial change instead of an impactful transformation to the way work is being done within an organization. There is a huge opportunity for businesses to use their organizational data to gain competitive advantage. As companies look to become more agile, organizational data and analytics will play a key role
in accelerating business transformation in today’s fast-moving market. In this article, I will walk through the transformation journey and the role organization data will play in delivering greater business impact. REPORTING LINES – IT’S WHAT LIES BENEATH THAT COUNTS The idea of treating an organization as a system is not novel, but the thinking is rarely applied in practice. Too often people start with what the organizational structure looks like, and then never get beyond thinking about reporting lines and how they could change. People are the central unit in organizational data, but they are
ORGANIZATIONAL DATA
connected to a range of information. For example, people fulfill roles, which follow processes and require specific competencies or skills in order to hit objectives and organizational goals. Only by fragmenting the organization into its core components
can we then start to focus on how and where the real organizational change is needed (see Figure 1). To understand the role that organizational data plays in linking the organizational system together, think of people as a unit of analysis connected to many components. Each component of the system (roles, processes, etc.) contains its own cluster of information and data, and each component will be affected if one of the others changes. For example, if you change a process, that will affect a role that a person or a group of people are performing.
Figure 1: The core components of an organization’s system and how they are linked to each other.
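One way to picture these linked components is as a small directed graph in which an edge means “a change here affects.” The component names and links below are hypothetical, used only to show how a change to one component ripples outward through the system:

```python
# Illustrative sketch of an organizational system as a graph of linked
# components. An edge u -> v means "changing u affects v"; all names here
# are invented for illustration.
from collections import deque

links = {
    "process:invoicing": ["role:accounts-clerk"],
    "role:accounts-clerk": ["person:alice", "competency:bookkeeping"],
    "competency:bookkeeping": ["objective:close-books-on-time"],
}

def impacted(start):
    """Breadth-first walk: everything downstream of a changed component."""
    seen, queue = set(), deque([start])
    while queue:
        node = queue.popleft()
        for nxt in links.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return sorted(seen)

print(impacted("process:invoicing"))
```

Changing the invoicing process reaches the role that performs it, the person in that role, the competency it requires and, ultimately, the objective the competency serves.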
MAKING THE COMPLEX SIMPLE: BREAKING DOWN THE ORGANIZATIONAL SYSTEM

To help understand the many-to-many connections within organizational data it is helpful to think of each component as a set of hierarchies. The obvious example is where people are represented in a hierarchical org chart. The real value, though, lies in breaking down each component into a hierarchy to help categorize them into logical chunks of information. For example, processes can be broken down into
a number of levels to detail work that needs to be done (see Figure 2). Creating these hierarchies across the organization then allows you to connect the relationships people have to other organizational elements. An example of this relationship can be seen in an accountability matrix (Figure 3). Here you bring together your process hierarchy and people or role hierarchy to assign accountabilities, providing clarity of what’s required for each activity and role. Doing so will uncover which competencies are most in demand and where the greatest skill gaps are. Ultimately, this will help you in prioritizing and designing your workforce training and recruitment plan.

Figure 2: An example of how value chain and support processes can be broken down into hierarchies.

FASTER INSIGHTS FOR FASTER CHANGE: VISUALIZING ORGANIZATIONAL DATA

To understand the complex many-to-many links and connections of the organizational system, data must be visualized in an intuitive way. If people can see what’s going on in the “as-is” and where they’re going with the “to-be,” then it’s easier to get them engaged in making organizational changes real.
Figure 3: Accountability Matrix gives practitioners visibility of the impact on structure, headcount, costs and skills of multiple organizational scenarios.
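The process-by-role pairing behind an accountability matrix can be sketched as a simple lookup; the processes, roles and markings below are invented for illustration and are not OrgVue output:

```python
# Minimal sketch of an accountability matrix: processes down the side, roles
# across the top, cells marking who is Accountable/Responsible. All names
# here are hypothetical.

processes = ["hire", "onboard", "payroll"]
roles = ["HR manager", "Team lead", "Finance"]

matrix = {
    ("hire", "HR manager"): "A",   # Accountable
    ("hire", "Team lead"): "R",    # Responsible
    ("onboard", "Team lead"): "A",
    ("payroll", "Finance"): "R",   # responsible, but nobody accountable
}

def unowned(processes, roles, matrix):
    """Processes with no accountable ('A') role: candidate ownership gaps."""
    return [p for p in processes
            if not any(matrix.get((p, r)) == "A" for r in roles)]

for p in processes:
    print(p.ljust(8), " ".join(matrix.get((p, r), "-") for r in roles))
print("no accountable role:", unowned(processes, roles, matrix))
```

Scanning for processes with no accountable role is exactly the kind of gap analysis the article says the matrix enables: it surfaces where ownership, and by extension skills, are missing.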
Figure 4: Spreadsheet vs. sunburst colored by engagement, generated from OrgVue. The power of visualization in accelerating insights from organizational data.

In practice, this is easier said than done. Time and time again I hear practitioners say their data is too bad to get good insights from analytics. The irony is that human resources (HR) and OD teams
possess not only some of the most valuable data in the business, but also some of the most complete. Think about payroll, human capital management (HCM) and enterprise resource planning (ERP) – each one holds valuable people data, and that’s just for starters, before we’ve gone hunting for process or performance data. Even if the data sits in a spreadsheet, bad data is better than no data. The trick is twofold. First, simply start using the data and start engaging people through visualization tools and dashboards. The moment you play back data that’s wrong to business owners, they will correct it to make sure it’s useful and reflects the true picture. Second, use gamification to help people keep their data up to date. For example, visualize data completeness so managers of different business functions can compete on where they are in a journey of data completeness and cleanliness.
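The completeness leaderboard idea can be sketched in a few lines of code; the required fields and records below are hypothetical examples, not a real HCM schema:

```python
# Sketch: score data completeness per business function so managers can
# compare progress. Records and required fields are illustrative.

REQUIRED = ["name", "role", "manager", "start_date", "salary_band"]

def completeness(records):
    """Share of required fields that are filled in, across all records."""
    total = len(records) * len(REQUIRED)
    filled = sum(1 for rec in records for f in REQUIRED if rec.get(f))
    return filled / total if total else 0.0

by_function = {
    "Sales": [{"name": "A", "role": "AE", "manager": "B",
               "start_date": "2015-01-05", "salary_band": "3"}],
    "Ops": [{"name": "C", "role": "Analyst", "manager": None,
             "start_date": None, "salary_band": "2"}],
}

# Rank functions by completeness: the "leaderboard" managers compete on.
leaderboard = sorted(by_function, key=lambda f: completeness(by_function[f]),
                     reverse=True)
print([(f, completeness(by_function[f])) for f in leaderboard])
```

Feeding scores like these into a dashboard is one way to turn data hygiene into the friendly competition the article suggests.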
ENGAGING IN CHANGE: BEYOND ORG CHARTS, POWERPOINT & SPREADSHEETS

Organizations are fluid and constantly evolving. This means visualizations and
analytics need to reflect that. Interactive images that can tell a story within seconds are much more powerful than static rows of data in a spreadsheet. Compare the two images in Figure 4: spreadsheet vs. sunburst visualization. Although both package the same data, such as reporting lines, engagement index, depth and spans of control, the latter communicates the information better. Visualization makes the complex simple. By accelerating the process of getting ad hoc insights from data, visualization can facilitate faster and better decisions in every step of the OD process – be it objective setting, rightsizing or workforce planning. One of the biggest traps in OD is when people only think of it as an exercise of moving org charts around a PowerPoint slide. With interactive visuals, you can view all the elements of organizations at once and model the impact of changes in different future scenarios. Since OD projects are rife with budget overruns, process disruptions and risks of losing good employees, having a nimble data visualization tool that lets you track headcounts, performance against plans and forecasts is hugely valuable. It will accelerate the implementation of changes and improve their odds of success.
PLANNING FOR THE FUTURE: MODELING THE IMPACT OF CHANGE

Understanding and analyzing the organization as a system sets the platform for value-adding analysis, especially when doing scenario modeling in both macro- and micro-design. For example, by having a clear map of how the organization is linked, you can easily trace the impact of improving the efficiency of a given activity and see who would be affected and what the full-time equivalent reduction would be. The ultimate outcome of treating an organization as a system is this powerful ability to explore the impacts of various actions when modeling the “to-be” organization. Investing in the right skills and technology to make the most of organizational data and analytics should be an important focus area for 2016. At the end of last year, Ernst & Young showed that the mergers and acquisitions market is at its strongest in six years, with six in 10 companies pursuing acquisitions. This implies that transformation exercises are becoming the norm, and organizations must treat them as an ongoing process rather than a one-off project. With this outlook, functions that hold organizational data will be responsible for driving the end-to-end transformation of their business and increasing their odds of success. ❙
Rupert Morrison is CEO of Concentra Analytics and the author of “Data-Driven Organization Design: Sustaining the Competitive Edge Through Organizational Analytics.”
LEARNING CULTURE
Training strategy in the analytics consulting domain

BY CHANDRAKANT MAHESHWARI
The greatest asset for any organization involved in analytics consulting is its people and the willingness of its consultants to learn new ideas, to enhance their capabilities and to innovate. To ensure that its people remain on top of the latest trends in analytics, consultancies need to emphasize a learning culture, supported by appropriate and ongoing training. Typically, such training falls into four categories:

1. New skill/technology. Academicians or developers of a particular technology typically deliver this training. The technologies may be accepted by the general analytics community as effective, but they are
not widely implemented due to the lack of experts in the field.

2. Domain knowledge. In this case, the training focuses on domain areas such as finance, pharmaceuticals and marketing. Trainers tell trainees about the importance of these industries to the economy, how businesses operate in these sectors, and so on. Trainers talk about general business problems in the respective domains and how these problems are solved, and equip the trainees with generic solutions.

3. Technical skills. Training around technical skills focuses on programming techniques (SAS, R, VBA, etc.) that are common tools in the analytics industry, as well as quantitative techniques
(statistics, economics, artificial intelligence, neural networks, etc.).

4. Soft skills. Training around soft skills generally focuses on communication, particularly training analytics professionals how to communicate their ideas/analysis to stakeholders.

For training related to particular new skills/technology, domain experts are best equipped to lead the way. For the other three, companies generally organize classroom sessions. These sessions involve face-to-face training given by practitioners, or via webinars if the teams are scattered in various locations. With increased globalization, webinars are common.

Professionals hired for analytics work generally have advanced degrees in quantitative techniques; for example, they are trained in computer programming at a university. The question then arises: Why do they need training on technical skills? In fact, technical training consumes a majority of a typical organization’s training budget. The reason is that trainees need to interact with experts who have first-hand experience in the industrial implementation of those techniques. The trainers explain the nuances of the implementation by introducing the trainees to real case studies that they, the trainers, have encountered.

The limitation of this approach is that it is active-passive interaction. The trainer is usually the active one, and the trainee is the passive one. In face-to-face classroom interaction, the trainer can read the faces of the trainees and impart his or her knowledge in a more effective manner. In webinars, it is quite possible that trainees may lose interest, and the trainer might not have a clue that he or she has lost the audience. The biggest drawback of this approach is that there is a lack of accountability. Trainers may feel that they have given their best, and trainees may be left wanting more.

Given this limitation, consider a new approach suggested by various practitioners called “project-based training,” in which senior practitioners act as mentors to trainees. The senior practitioner creates
artificially constructed business problems and guides the trainee to solve them in predefined timelines. Trainees will gain domain knowledge and technical skills while exploring and solving the problem. The trainees then present the project process/output to their peers and seniors, thus also developing their communication skills. Such a training process is not as wide-ranging as classroom training, but it has depth. The training develops the trainee’s independent thinking, judgment and confidence, which are key qualities for consultants.

There are two challenges in this approach. First, the trainer does not gain much professionally because the whole process soon gets repetitive. Though he or she gains experience in mentoring and also receives due recognition, the trainer’s own development plateaus. Second, from the trainee’s perspective, it is apparent that they are trying to solve an already solved problem, so all their exploration pulls toward the expected results. This limits critical thinking.

SUGGESTED APPROACH

Given these concerns, it’s appropriate to consider an alternate training approach, but before that let’s understand what is expected from an analytics consultant. In the current complex business world, most of the time business owners are not clearly aware of the problems they are facing. The geniuses among
them are those who are able to determine that there is a business problem in spite of not knowing exactly what it is. When they hire consultants, they are not necessarily looking for quick solutions, but they expect that the consultant understands their pain, thinks through the business process and then works with them to formulate the business problem – a major part of the job. Such a consultant will be a person who knows how to traverse a business process, understands the nuances of each activity and is not hesitant in asking questions. Asking questions is an art that develops after years of experience and an exploratory attitude. The purpose of training should be to direct professionals in that direction.

So the first purpose of training should be that both trainer and trainee equally benefit from the process. For trainers, gains should be continuous and not limited. Almost all analytics professionals who have worked for more than 10 years have seen some business problem in their career where they felt curious and wanted to do more than what was required, but could not proceed because of time constraints. This business problem may be the result of a thought process developed while solving a similar problem, or an extension of an existing problem, but the additional work did not match the requirements of their client.
Ideally, such professionals should volunteer as trainers, make that unsolved business problem a training project for trainees, and team up with the trainees in solving it. In this manner, trainers will satisfy their curiosity, get a helping hand to see their thought process in action, and receive the benefit of enhancing their business understanding. Meanwhile, trainees will get the opportunity to work on a real-world unsolved problem. Asking questions and thinking outside of the box will enhance their critical reasoning. Since they will be looking for a new target, they will not be inclined to converge their answers to a particular given target. The benefits of such an approach are numerous for all concerned. For example:

1. Trainers get to see their idea from a fresh perspective, as well as its practicality. As professionals become more senior, they develop not only by doing work but by developing their ability to get work done.

2. Trainees get to work on an unsolved problem. In the process, they develop business understanding skills, technical skills and communication skills.

3. For organizations, this approach will develop a culture within the company where new ideas are created as well as nurtured. Better relationships will be developed between senior and junior resources.
4. The approach leads to higher accountability. The team will have to come out with a white paper or report about their activities and final findings. This will result in the elevation of the overall thought process and the leadership of the organization in the industry.

PARTING THOUGHTS

In the current dynamic world, where technology and ideas are moving in the blink of an eye, the purpose of training in analytics should not only be skill-based; it should also develop a professional who can think through the business process. The consultant should be able to create the relevant questions around it so that he or she can opine on the future relevance of the process. The training should develop the consultant into a mature, confident individual who is able to judge what constitutes the data and understand what knowledge it can provide and what its limitations are. Finally, given this understanding, the consultant can make a judgment about which analytical steps should be taken next to answer the problem in hand. ❙

Chandrakant Maheshwari (chandrakant721@yahoo.com) is a subject matter expert in the risk and regulatory practice at Genpact LLC. He leads consulting engagements with global financial institutions, advising clients on financial risk management. He is an avid blogger and blogs at https://chandrakant721.wordpress.com/.
VEHICLE ROUTING
Higher expectations drive transformation Biennial survey of vehicle routing software reveals many innovations in response to market demands.
BY RANDOLPH HALL AND JANICE PARTYKA
In the two years since the last survey, vehicle routing has begun a transformation that mirrors changes occurring throughout the software industry, pushed forward by expectations set in consumer markets for transportation and retail. For instance, Waze (and its owner Google) has seen an explosion of followers in app-based, crowd-sourced navigation, residing on the mobile phone. Rather than relying on static maps that may be older than your
car, Waze navigates and updates from information gathered from its users, and, as more users gravitate to its platform, the data becomes increasingly valuable (see accompanying sidebar article). For retail, Amazon’s same-day deliveries along with user-friendly interfaces for tracking and ordering have set new standards for customer empowerment. We spoke with representatives of Omnitracs Roadnet, Descartes, DNA Evolutions, ALK and Appian TMW, and also surveyed vendors, to get the pulse
of the industry. All of these companies offer software to solve variations of the “vehicle routing problem” – finding an optimal assignment of customers to vehicles, as well as the optimal sequence and schedule of customers served by each vehicle. The aim is to minimize transportation costs while satisfying feasibility constraints as to when and where stops are visited, what can be loaded in each vehicle, and what routes drivers can serve. Solutions are usually generated in advance and executed as planned, though sometimes routes
are dynamically updated throughout the day. Routing software is used to plan deliveries from central locations, pick-ups from shippers, routes of service fleets (e.g., appliance repair), and bus and taxi schedules. The companies that use routing software vary greatly in size, ranging from small businesses with a fleet of 10 vans or fewer, to large corporations routing thousands of trucks. What these companies have in common is the need to coordinate and sequence tasks across multiple drivers and stops, ensuring
predictable and expedient customer service at the lowest cost.

THE CLOUD

Routing software emerged in the 1980s, when it resided on personal computers or on mainframes. While these options still exist, the direction has been toward cloud-based solutions or software-as-a-service (SaaS). As Cyndi Brandt of Omnitracs Roadnet tells us, “When customers have different versions of the software or only install part of the solution, it’s hard for us to support them. Our customers use 10 to 15 versions, and they don’t always update. By moving to SaaS, Omnitracs Roadnet can manage data better and offer better features.” According to Brandt, half of Omnitracs Roadnet customers are using SaaS. Other companies have observed similar trends. “The cloud has removed the barrier of infrastructure and systems are easier to deploy,” says James Stevenson at Appian TMW. “This is more important in large enterprises, such as those with multiple branches.” Ken Wood of Descartes indicates that the majority of its customers are cloud based, but he sees potential for future hybrid solutions to enable integration. “There are too many components outside the firewall that will be used as inputs to solutions,” he says.
SMARTPHONES

With pervasive consumer adoption of smartphones, it is not surprising that the technology is affecting how vehicle routes are conveyed to drivers. Marc Gerlach of DNA Evolutions has found that “the availability of cheap and powerful mobile devices will step-by-step replace fixed installed units.” In addition, he says, “most of our customers are providing telematics systems to communicate with the drivers on apps running on smartphones and tablets.” Nevertheless, as Brandt indicates, rugged devices installed on vehicles still have a place “for proof of delivery, mobile forms or tracking how a truck is being driven, such as idling or speeding.” Device use also varies among transportation segments. As Stevenson tells us, “Long haul is less likely to use smartphones since they are required to comply with hours of service rules that require automated data logging connected to the engine bus.” However, tablets or “phablets” (a cross between tablet and phone) can sometimes serve this purpose. Recent regulatory changes mandating automated data logging have created a push in this direction. What we did not find in our surveys and interviews is a big move toward integrating Waze-style crowd-sourced data with fleet routing. One challenge is that travel time estimates produced for cars are not
very accurate for trucks, which travel more slowly and must observe height and weight restrictions on roads. ALK Technologies is a provider of mapping data specifically for trucks, as well as truck navigation products, such as “CoPilot Truck.” As Dan Popkin from ALK indicates, ALK has long provided mechanisms for customers to identify improvements in maps, which are then implemented through ALK’s quality assurance process. More recently, ALK has offered “MapSure” as a more automated
way for customers to edit their own map files and submit changes into ALK’s data sets. As far as crowd-sourced navigation for trucks, Popkin says, “These are things we are looking into. It’s an exciting opportunity for the future.”

INTEGRATION

“Routing used to be just about creating a plan, but now it is about execution.” That’s the view of Omnitracs Roadnet’s Brandt, who lists proof of delivery, tracking and compliance as supplemental
needs that demand system integration. Omnitracs, long a leader in the truck telematics industry, acquired Roadnet at the end of 2013, and Descartes acquired Airclic in 2014, with an eye toward these forms of integration. Another emerging form of integration is “self-scheduling,” which Descartes’ Wood describes as “a self-chosen delivery time.” Wood also mentioned the need to satisfy “increased expectations of delivery as a continued expansion of the Amazon model.” In addition, Appian TMW’s Stevenson relates, “companies now want full end-to-end solutions. They want the day’s activities fed into a complete feedback loop.” This means providing information on actual performance that can be used to improve future routing. Gerlach of DNA Evolutions sees the future challenge as the “interfacing of all systems along the supply chain. This will lead to more cloud- and web-based offerings on one hand, and the dispatching process will have to consider more aspects and data will thus become more complex.” One example is integration with the energy industry, where routing and production planning need to be optimized together.

THIS YEAR’S SURVEY

Twenty-two companies participated in this year’s online survey, ranging from
small vendors (less than 100 customers) to large corporations (1,000+ customers). We asked for demographic information on the companies (such as contact information and date of introduction), platform (hardware, operating system, driver devices, maps), features and capabilities, and installations. We also asked several open-ended questions, inviting comments on recent and expected industry changes, innovations and impacts of the economy on the industry. The print edition provides an excerpted set of questions for all of the respondents. Keep in mind that results are all self-reported and unverified. What’s notable in this year’s survey?
Operating systems: Almost everyone offers a SaaS solution, most provide a Windows solution, and half have solutions implemented on mobile devices (iOS or Android). Digital maps: Solutions are diverging. HERE, TomTom, ALK, Google Maps and OpenStreetMaps were some of those mentioned. Special features and innovations: These included integrated telematics, bulk load routing, integrated workforce management, integrated call centers and ultra low-latency route optimization. Installations: Most common are private fleets transporting consumer goods
(think of Home Depot, Walmart, Coca-Cola, Walgreens), but for-hire carriers were also mentioned (DHS, R+L), as well as transporters of industrial goods. It was less common for companies to support taxi or bus fleets, but some do. As to where the industry is heading, predictions include a move toward “connected fleet solutions,” integration of real-time information, cloud offerings and mobile offerings via smartphones. “Amazon Now” was mentioned as an influence, creating a standard for coordinated
immediate delivery. Descartes emphasized the need to harmonize the consumer’s delivery experience across multiple channels of transportation. And, as James Stevenson explained, “We are now seeing same-day delivery that bypasses distribution centers. It is the ultra last-minute transportation, a bit like Uber, that’s setting the direction.” Amazon, Waze and Uber – all software-driven companies that depend on routing – are setting new standards for the industry. In selecting a vehicle routing product, look for vendors that have experience
Which way to go
By Randolph Hall
serving similar industries to your own, and test the software on a representative data set to assess the quality and speed of solutions. Ask for references and determine whether any prior customers have switched to another product and why. And look ahead to see whether the company has the capability to maintain and update the software to meet your future needs. Consider total cost of ownership, including license costs, staff support and future upgrades and maintenance. ❙ Randolph Hall (rwhall@usc.edu) is vice president of research for the University of Southern California, as well as professor in the Epstein Department of Industrial and Systems Engineering. He is a member of INFORMS. Janice Partyka (jpartyka@jgpservices. net) is principal of JGP Services (www. jgpservices.net), a consulting group that helps companies with product strategy, market research and communications, specializing in M2M, telematics, logistics and the connected vehicle industries.
Survey Directory & Data To view the vehicle routing software survey products and results, along with a directory of statistical software vendors, click here.
72
|
As a new graduate student at U.C. Berkeley in the early 1980s, I was intrigued by the possibility of empowering travelers with information as a way to improve transportation. My first publication, titled “Habituality of Traveler Decisions and Travel Time Reliability,” proposed a method motivated by three theories. First, travel can be more reliable and faster when exploiting the full diversity of a network, utilizing different routes at different times (i.e., breaking habits). Second, for stochastic and time-dependent networks, the fastest path from point A to B is not a path in the classic sense, but instead an adaptive strategy that permits changes based on information learned while en route. And last, when initiating a journey, one should not only care about the travel time along individual links of a path, but whether they offer multiple options along the way, thus permitting changes as you acquire new information. Thirty-five years later, however, it was as though I had forgotten my own ideas, choosing the same route almost every day for 20 years traveling to work, and the same (but different) route traveling home. Then I discovered the mobile phone app Waze. Crowd-sourcing travel time data by tracking the movements of its users’ mobile phones, Waze offered me dynamic choices, as well as an estimated time of arrival that reflected current travel conditions. Soon I was navigating through back streets of Echo Park, Silver Lake and downtown Los Angeles that I would have never considered. I had gotten out of my rut, but had my journeys become better? Waze has several challenges in getting its algorithms to produce the best choices. First, because travel times constantly change (especially around the start of rush periods), it is not sufficient to have a good estimate of travel times at time of departure; a forward projected travel times along all points of a route is also needed (a USC spinoff company, TallyGo, is working on this issue). 
Second, travel times vary significantly along route segments depending on where you are heading next and which lane you have selected, and thus precise lane- and destination-based measurement is important. Third, travel time is partly a reflection of roadway congestion, but also a reflection of driver behavior, making the fastest route at least partly dependent on the individual. And last, owing to Waze’s own success, the system has the power to over-saturate streets (particularly the obscure ones) with traffic, resulting in unanticipated “Waze-induced congestion.” So am I still using Waze? Absolutely. It has broken my habits, made me aware of routes I had never considered, and given me information on traffic congestion at the moment I’m traveling. But I do often ignore its choices because my experience tells me alternate routes are likely to be better, and I am fully aware that its ETA will be overly optimistic at certain times of day and overly pessimistic at other times of day. And no matter how many times Waze tells me otherwise, I do know that my office is not situated on a freeway on-ramp. Our survey of fleet routing software tells us that the crowdsourcing revolution has not spread throughout industry. But vehicle routing is ripe for disruption, as fleet drivers, like the rest of us, are ready to break habits to get places faster. Once perfected, smartphones, coupled with crowd sourcing and cloud computing, provide the perfect platform to do so.
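Hall’s second theory, that the best “route” through a stochastic network is really an adaptive strategy, is easy to illustrate with a toy simulation. The two-route network and all the numbers below are invented purely for illustration, and Python stands in as a neutral sketch language:

```python
import random

def simulate(days=100_000, seed=1):
    """Average commute times for two habitual drivers vs. one adaptive
    driver on a made-up two-route network."""
    rng = random.Random(seed)
    fixed_fwy = fixed_streets = adaptive = 0.0
    for _ in range(days):
        jammed = rng.random() < 0.2        # incident on the freeway today?
        t_fwy = 60 if jammed else 20       # freeway: usually fast, sometimes awful
        t_streets = 30                     # surface streets: slow but steady
        fixed_fwy += t_fwy                 # habit: always take the freeway
        fixed_streets += t_streets         # habit: always take the streets
        adaptive += min(t_fwy, t_streets)  # check (perfect) traffic info, then choose
    return fixed_fwy / days, fixed_streets / days, adaptive / days
```

With these numbers the habitual freeway driver averages about 28 minutes and the streets driver 30, while the adaptive driver averages about 22 – better than either fixed habit, which is precisely the value of breaking habits with en-route information.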
REGISTER TODAY!
EXPERIENCE THE PREMIER ANALYTICS CONFERENCE OF 2016 Attend Analytics 2016 to share ideas, network, and ask critical questions. Register today to experience and negotiate the ups and downs, and twists and turns of analytics.
Enjoy the Ride!
OPENING KEYNOTE:
“TRANSFORMING THE AUTOMOTIVE INDUSTRY WITH DATA AND ANALYTICS”
Paul Ballew, Global Chief Data & Analytics Officer, Ford Motor Company
TUESDAY KEYNOTE:
"O.R. IN TODAY'S DYNAMIC BUSINESS ENVIRONMENT"
Samuel K. Eldersveld, Ph.D., Lead, Director of Operations Research, Uber Technologies, Inc.
http://meetings.informs.org/analytics2016
CONFERENCE PREVIEW
INFORMS Conference on Business Analytics & O.R.

The INFORMS Conference on Business Analytics and Operations Research will be held in Orlando, Fla., at the Hyatt Regency Grand Cypress on April 10-12. Early registration rates expire on March 21, and the hotel is filling up fast, so make your reservations as soon as possible to take advantage of the INFORMS group rate.

The conference opens on Sunday, April 10, with technology workshops, a newcomers reception, a career fair and a welcome reception. Monday, April 11, will kick off with an opening plenary given by Paul Ballew, global chief data & analytics officer of Ford Motor Company. Samuel Eldersveld, director of operations research at Uber, will give the second plenary on Tuesday, April 12.

April 11 and 12 will offer tracks arranged in the following categories: Marketing Analytics, Health Care & Life Science, Revenue Management & Pricing, Supply Chain, Analytics Process, Predictive Analytics/Forecasting, Optimization, Analytics Leadership & Soft Skills, Fraud Detection & Cyber Security, Internet of Things, Decision & Risk Analysis, Sports & Entertainment, Transportation and Data Mining.

[Photos: Paul Ballew of Ford; Samuel Eldersveld of Uber.]
The highlight of the program will be the 2016 Franz Edelman Awards Gala, set for the evening of April 11. The 2016 finalists are 360i for “360i’s Digital Nervous System,” BNY Mellon for “Transition State and End State Optimization Used in the BNY Mellon U.S. Tri-Party Repo Infrastructure Reform Program,” the Chilean Professional Soccer Association for “Operations Research Transforms Scheduling of Chilean Soccer Leagues and South American World Cup Qualifiers,” the New York City Police Department for “Domain Awareness System (DAS),” UPS for “UPS On Road Integrated Optimization and Navigation (Orion) Project” and the U.S. Army Communications-Electronics Command (CECOM) for “Bayesian Networks for U.S. Army Electronics Equipment Diagnostic Applications: CECOM Equipment Diagnostic Analysis Tool, Virtual Logistics Assistance Representative.” The winner will be announced at the Edelman Awards Gala and Banquet following a daylong series of presentations.

[Photo: The Hyatt Regency Grand Cypress in Orlando, Fla., will host the conference.]

Special programs within the conference are designed for future analytics leaders. The Early Career Connection provides early career professionals with new perspectives into some of the most critical problems facing industry today, enabling them to broaden their research agendas. The INFORMS Professional Colloquium is designed to help practice-oriented master’s and Ph.D. students transition into successful careers. Participants in both programs can register for the full conference at a discounted rate but must be nominated and selected to attend.

The Analytics Career Fair is INFORMS’ premier professional career event that allows top analytics employers and seasoned professionals the ability to connect in a casual atmosphere. The career fair is included in the registration for this conference.

For more information and to register, visit the conference website at: http://meetings2.informs.org/wordpress/analytics2016/. ❙
CONFERENCE PREVIEW
INFORMS International Conference heading to Hawaii

The 2016 INFORMS International Conference will take place on June 12-15 at the Hilton Waikoloa Village Resort on the Kohala Coast in Waikoloa, Hawaii. The invited tracks will span the full range of emerging topics, from global supply chains to social networks, and all aspects in between. This informative program is designed to educate attendees on current advances that are at the cutting edge of the field anywhere in the world. Through a series of diverse speakers, panels, tutorials and structured networking, this conference will offer attendees a forum for rich intellectual exchange on a broad range of OR/MS applications.

[Photo: Hilton Waikoloa Village Resort, site of the 2016 International Conference.]

Gang Yu will deliver the plenary talk on Sunday, June 12. Yu is the co-founder and executive chairman of New Peak Group. Prior to founding New Peak Group, he was the co-founder and chairman of Yihaodian, a leading e-commerce company in China.

In addition to the technical tracks, the program also includes two receptions primarily focused on networking with colleagues and other international attendees. Conference Chair Saif Benjaafar and the rest of the conference committee will host a welcome reception on Sunday evening. The Tuesday evening general reception will feature an authentic Hawaiian luau that is sure to be a feast for all of your senses. Men and women in ornate costumes will perform a festival drum dance from the islands of Tahiti. A traditional Polynesian luau feast will be served as well. This event is included in the conference registration fee.

Registration fees for this event start at $630 for INFORMS members. Discounted student/retired rates are available. Early registration rates will expire on May 20. For more information or to register, visit meetings.informs.org/2016international. ❙
Register Now and Save
2016 INTERNATIONAL CONFERENCE HAWAII June 12–15, 2016 Hilton Waikoloa Village
NETWORK AND SHARE YOUR EXPERIENCES ENHANCED WITH AN AUTHENTIC HAWAIIAN LUAU BY THE BEACH!
http://meetings.informs.org/2016international/register/

Hawaii 2016 delivers an impressive lineup of keynote and plenary speakers interspersed with invited tracks on emerging topics. Listen to an impressive array of speakers presenting their latest research on:
• OR in Medicine
• Entrepreneurship & Innovation
• Applied Probability
• Service Operations
• Big Data Analytics
• Inventory Systems
• Cloud Computing
• Practice-Focused Operations
• Global Supply Chains
• Mechanism Design & Game Theory
• Business Strategy
• Robust Optimization
• Scheduling and Project Management
• and much more
Take this opportunity to network and collaborate with colleagues across the globe from both academia and industry.

PLENARY SPEAKER:
GANG YU, Executive Chairman & Co-Founder of New Peak Group
REGISTER at http://meetings.informs.org/2016international
FIVE-MINUTE ANALYST
St. Powerball Paradox One who considers arithmetical methods of producing random digits is, of course, in a state of sin. – John von Neumann
BY HARRISON SCHRAMM
I’ve been somewhat swept away with “lottery fever” the past few weeks. It hits at my house doubly, because we’re all interested in dreaming about “hitting it big,” but it’s professionally interesting for me as well. I always seem to find myself at the intersection of human behavior and advanced mathematics.

I bought a lottery ticket. Many of my co-workers think that I am opposed to playing the lottery, but this is not true. The pleasure that we derived from our one ticket was “on par” with other things (I replaced my wife’s weekly flowers with lotto tickets at her request). It also led to a fantastic discussion with my daughter and chief assistant of three “extensions” to the lottery. Fast forward: Lottery tickets -> sequential lottery -> St. Petersburg Paradox -> How many heads in a row can RNG (random number generation) in R produce?

I was in the middle of discussing the St. Petersburg Paradox with my daughter and thought it might be interesting to demonstrate it with computer simulation, but I awoke in the middle of the night realizing that I couldn’t!

Warning: The rest of this article is about something we use a lot but don’t think about very often – RNGs are the “gateway drug” to the forbidden pleasures of Number Theory. It can lead to late nights huddled over old texts and contemplating 2^n − 1.

What I initially thought of doing was this: I would code up the St. Petersburg Paradox, run it a few million times, and then see how I did “on average.” Flipping a coin is random enough for me to call it random: small,
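The simulation in question takes only a few lines. The sketch below is ours (in Python rather than the column’s R), and it assumes one common payout convention: the pot starts at $2 and doubles on every head, paying out at the first tail.

```python
import random

def play_once(rng=random):
    """One St. Petersburg game: the pot starts at $2 and doubles on each
    'heads'; the first 'tails' ends the game and pays out the pot."""
    pot = 2
    while rng.random() < 0.5:  # heads -- keep flipping, pot doubles
        pot *= 2
    return pot

def average_payout(n_games, seed=0):
    """Sample-average winnings over many games. The average never settles
    down as n_games grows, because the game's true expectation is infinite."""
    rng = random.Random(seed)
    return sum(play_once(rng) for _ in range(n_games)) / n_games
```

Running `average_payout` for ever-larger `n_games` shows the paradox empirically: rare, enormous payouts keep dragging the sample mean upward.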
unpredictable changes in air temperature, pressure and currents. There is no physical limit to the number of consecutive heads, and we expect the next flip to be “heads” with a probability of 50 percent regardless of the history of heads to date. Using an RNG is not like flipping a coin, and the RNG is not random! The default RNG in R, which is a pretty good one, is the Mersenne Twister and is statistically random by virtue of passing randomness tests known as the “Diehard” tests.

[Cartoon caption: “Hello? Good news – your algorithm is statistically random.”]

We can convert the uniform random number that lives on (0,1) by interpreting a “head” as runif(1) > .5.

We know that there is a limit to the longest sequence that can be produced in R; several estimates are possible. Staying within our five-minute allocation, we know for certain that it is less than (see following equation). The full cycle of the RNG in R is unimaginably huge. Consider this: The estimated time until the universe suffers “heat death” under closed universe theories is 10^90 years from now. Iterating the RNG at one per nanosecond until the end of time won’t get you the full cycle – not even close. (I was tempted to see if there was enough silicon in the universe to build enough computers to do the processing in parallel, but it is too depressing to think about.)

One interesting feature of R is that the first time the RNG is invoked it pulls the seed from the CPU clock, so almost all of us in the world are running a different instance of the R RNG with non-overlapping streams. In my experiments looking for runs of “heads” a billion (a paltry ≈ 2^30) or so flips at a time, the longest runs are laughably small (Figure 1). The longest stream of “heads” I was able to produce in R was 33, which equates to winnings of “only” $537 million – which is large, but not infinite.

I will give bragging rights, and a bottle of Laphroaig Cask Strength (provided it is legal and ethical to do so), for the reply that most substantially improves on the 33-tuple of heads that I found in R without using degenerate seeds (see below). They will need to send me both the string of random numbers and the .Random.seed vector for verification.

[Figure 1: Batch size vs. largest string of consecutive “heads” returned in R, and a log/log plot of same.]
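Python’s default generator happens to be the same Mersenne Twister, so the batch experiment translates directly outside R. This is our sketch, not the author’s code; the R idiom runif(1) > .5 becomes rng.random() > 0.5:

```python
import random

def longest_run(n_flips, rng=None):
    """Longest run of consecutive 'heads' (uniform draw > .5) in n_flips."""
    rng = rng or random.Random()
    best = run = 0
    for _ in range(n_flips):
        if rng.random() > 0.5:      # a 'head' extends the current run
            run += 1
            best = max(best, run)
        else:                       # a 'tail' resets it
            run = 0
    return best
```

Since the expected longest run in n fair flips grows only like log2(n), batches of 10 million flips typically top out in the low-to-mid 20s, consistent with the author needing multi-billion-flip experiments to reach a run of 33.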
A few things to note:

1. Very important: When you start to manipulate .Random.seed, you are meddling with powers you cannot possibly comprehend. I strongly recommend beginning with safeseed = .Random.seed to store the current value, so that your RNG doesn’t become degenerate. Should you mess up your RNG, you can use .Random.seed = safeseed to restore operations. (Note: this is a good rule of practice when playing with the inner settings of any computer.)

2. Speaking of degeneracy, a vector of NAs will produce a long string of .5314 in R (see below). Using the command .Random.seed = as.integer(c(403, 444, rep(.Machine$integer.max-1, 624))) will produce a long sequence of .9333. I’m defining these as degenerate because I think (without proof) that they are unreachable from a “normal” starting state. This doesn’t win a bottle.

The second point here is interesting. When I first started playing with this, I thought that the string of constant output was unending. It is not. The RNG eventually recovers (remember that the seed length is a 626-tuple, not a single “long” like many others). Consider the three graphs shown in Figure 2.

[Figure 2: Different zoom levels of a “degenerate” starting point in the Mersenne Twister, showing a long run (top), bifurcation (middle) and finally approach to randomness (bottom).]

A “real” number theorist would derisively call this an exercise in “number factualism.” I don’t necessarily disagree. I am not aware of any theory that could be brought against the problem of “how long will an RNG stay in a particular region,” but of course, this doesn’t mean it doesn’t exist. ❙

Harrison Schramm (harrison.schramm@gmail.com), CAP, is an operations research professional in the Washington, D.C., area. He is a member of INFORMS and a Certified Analytics Professional.
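The save-and-restore discipline in point 1 has a direct analogue in any Mersenne Twister implementation. In Python, for instance (our sketch, not from the column), the same snapshot idea looks like this:

```python
import random

# analogue of  safeseed = .Random.seed  ...  .Random.seed = safeseed  in R
safeseed = random.getstate()                  # snapshot the generator state
first = [random.random() for _ in range(3)]   # consume some draws
random.setstate(safeseed)                     # restore the snapshot
replay = [random.random() for _ in range(3)]  # the exact same draws replay
```

Restoring the snapshot replays the identical stream, which is also what makes the “send me your .Random.seed vector” verification above workable.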
DATA ANALYTICA CEREBRUM

Understanding the underlying methodology and mindset of how to approach and handle data is one of the most important things analytics professionals need to know. INFORMS intensive classroom courses will help enhance the skills, tools and methods you need to make your projects a success.
UPCOMING CLASSES:
Essential Practice Skills for High-Impact Analytics Projects
& Foundations of Modern Predictive Analytics

BOTH COURSES TO BE HELD: April 13-14, 2016 | 8:30am-4:30pm
University of Central Florida Executive Development Center, Orlando, Florida
Register at www.informs.org/continuinged
THINKING ANALYTICALLY
Toy builder

[Figure 1: What is the optimal mix of toy cars, planes and helicopters?]

As a toy builder you enjoy making toys for both fun and profit. For your latest production batch, you need to decide how many of each toy to make. The three types of toys you make are airplanes, helicopters and cars, as shown in Figure 1. To build an airplane you need three blue blocks, two green rods and one red wheel. To build a helicopter you need two blue blocks, four green rods and one red wheel. To build a car you need one blue block, two green rods and four red wheels.

Your profit margins for each toy are as follows: airplane, $7; helicopter, $8; car, $5. The parts available to you are as follows: 25 blue blocks, 29 green rods and 30 red wheels. It is OK to have leftover parts.

Question: What is the maximum profit you can achieve?

Send your answer to puzzlor@gmail.com by May 15. The winner, chosen randomly from correct answers, will receive a $25 Amazon Gift Card. Past questions and answers can be found at puzzlor.com. ❙
BY JOHN TOCZEK
John Toczek is the assistant vice president of predictive modeling at Chubb in the Decision Analytics and Predictive Modeling department. He earned his BSc. in chemical engineering at Drexel University (1996) and his MSc. in operations research from Virginia Commonwealth University (2005).
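For readers who prefer to sanity-check an answer by machine, the search space is small enough to enumerate outright. This brute-force sketch is ours, not part of the puzzle, and it deliberately prints nothing, so it only spoils the answer if you run it:

```python
from itertools import product

def best_mix():
    """Enumerate every feasible (airplanes, helicopters, cars) count and
    return (profit, mix) for the most profitable one."""
    best = (0, (0, 0, 0))
    # upper bounds follow from each part supply: 25//3, 29//4, 30//4
    for a, h, c in product(range(9), range(8), range(8)):
        if (3*a + 2*h + 1*c <= 25 and   # blue blocks
            2*a + 4*h + 2*c <= 29 and   # green rods
            1*a + 1*h + 4*c <= 30):     # red wheels
            best = max(best, (7*a + 8*h + 5*c, (a, h, c)))
    return best
```

The same model drops straight into any integer-programming solver, but at this size exhaustive search is instant.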
OPTIMIZATION: GENERAL ALGEBRAIC MODELING SYSTEM

High-Level Modeling
The General Algebraic Modeling System (GAMS) is a high-level modeling system for mathematical programming problems. GAMS is tailored for complex, large-scale modeling applications, and allows you to build large maintainable models that can be adapted quickly to new situations. Models are fully portable from one computer platform to another.
State-of-the-Art Solvers GAMS incorporates all major commercial and academic state-of-the-art solution technologies for a broad range of problem types.
GAMS Integrated Developer Environment for editing, debugging, solving models, and viewing data.
Optimizing to Combat Climate Change: CO2 Capture, Utilization, Transport, and Storage

The electricity generation sector in the U.S. is a major contributor of CO2 emissions. Thus, reductions from this sector will play a central role in any coordinated CO2 emission reduction effort aimed at combating climate change. One technology option that may help the electricity generation sector meet this challenge is Carbon Capture and Storage (CCS). The U.S. Department of Energy uses GAMS to analyze potential CO2 emission reduction scenarios in which CCS may play a role in meeting emission goals.

[Figure: Graphical representation of the NETL CO2 CTUS model and NEMS integration.]

The NETL CO2 CTUS model developed by the DOE National Energy Technology Laboratory is written in GAMS. It optimizes on a least-cost basis potential networks of CO2 pipelines and storage infrastructure amenable to handling the transport and storage of captured CO2 from CCS systems. When integrated into the National Energy Modeling System (NEMS), a detailed portrayal of CCS in energy economy projections is rendered. A version of CTUS has been modified and incorporated into the U.S. Energy Information Administration's (EIA's) version of NEMS, and is in turn used to produce the Annual Energy Outlook.

For detailed information please contact Charles A. Zelek - Charles.Zelek@netl.doe.gov.
www.gams.com