Analytics July/August 2017


http://www.analytics-magazine.org

DRIVING BETTER BUSINESS DECISIONS

JULY/AUGUST 2017

Shedding light on dark data • Leveraging new business asset • Best practices to tap potential

ALSO INSIDE:

Executive Edge: Moblize CEO Amit Mehta on making humans, not machines, smarter with big data analytics

• Monetizing the IoT and health apps • Agriculture analytics: farmland value • Risk management amid global terrorism • Data storytelling: Framework for success


INSIDE STORY

Data scientists’ salaries

Burtch Works, an executive recruitment agency specializing in big data and data science talent, recently released a couple of surveys that offer interesting insight into the data science job market, as well as the preferred modeling language/statistics tool for analytics professionals.

In its 2017 salary survey of data scientists, Burtch Works reported that salaries for early career data scientists decreased over the past 12 months. That may come as a surprise to companies struggling to find qualified candidates, as well as to freshly minted data scientists with high expectations, but there’s a logical reason for the finding. According to the study, big data hype is causing more data science hopefuls to enter the field, and the increase in supply is decreasing salaries at the junior end. Meanwhile, some data scientists are opting to skip the Ph.D. and go for a master’s degree as a faster route to the workplace, to capitalize on the numerous opportunities available.

Burtch Works also conducted a “flash survey” of more than a thousand data scientists and predictive analytics professionals to ascertain their preferred stat tool among SAS, R and Python. Among the findings:

• Open source (specifically Python) support is highest in the tech/telecom sector, which employs 41 percent of data scientists.
• SAS preference is higher in more regulated industries such as pharma and financial services.
• Data scientists (working with unstructured data) prefer Python at 69 percent.
• Predictive analytics pros (structured data) prefer R (42 percent) and SAS (39 percent) almost equally.

I met Linda Burtch, founder and managing director of Burtch Works, at the INFORMS Conference on Business Analytics & O.R. in Las Vegas earlier this year. She was a panelist on “Growing an Analytics Team,” one of many great sessions at the conference. Linda offered plenty of insight, including this gem: More and more data scientists and other analytics professionals are listing “storytelling” among key skills on their resumes.

Which brings us to this issue of Analytics magazine. Along with articles on dark analytics, dark data, agriculture analytics and risk management amid global terrorism, the issue includes a lighter fare entrée by Esther Choy, president and chief story facilitator of the business communication training and consulting firm Leadership Story Lab. The title: “Data storytelling: No more criticism sandwiches.” Enjoy. ❙

– PETER HORNER, EDITOR
peter.horner@mail.informs.org


Add Speed to Your Operations to Fuel Growth Find out how a leading U.S. food manufacturer freed up cash to invest in R&D with The AIMMS Prescriptive Analytics Platform.

This food company skyrocketed in popularity to become the best-selling brand in its category in the U.S. Through What-If Analysis and reviewing over 20 different scenarios, the team drove cost reductions that ranged from 10-25%, which amount to millions of dollars annually. These cost savings helped the company further invest in R&D and subsequently launch several new product lines. Eager to get these kinds of results?

The AIMMS Prescriptive Analytics Platform helps you evaluate and identify the best options to tackle your most pressing challenges with sophisticated analytics that leverage mathematical modeling and scenarios while pulling from multiple data sources. You can immediately gauge not just what is likely to happen, but what you should do about it for the best possible outcome. That’s why teams at J&J, Shell, GE, Heineken and many more fire up AIMMS every day.

Read the case study.


CONTENTS

DRIVING BETTER BUSINESS DECISIONS

JULY/AUGUST 2017

FEATURES

28 SHEDDING LIGHT ON DARK ANALYTICS
New business asset: Leveraging advanced technologies to explore unstructured and “dark” data reveals hidden insights.
By Nitin Mittal

32 DARK DATA: TWO SIDES OF SAME COIN
Lost opportunity and security risk: Dark data can be tapped to generate more opportunities or remain in the dark, forever.
By Ganesh Moorthy

36 SURVIVING GLOBAL TERROR
How businesses can use analytics to better track risks and mitigate the cost of damage, disruption and cyber attacks.
By Virág Fórizs and Shane Latchman

40 AGRICULTURE ANALYTICS: FARMLAND VALUE
Along with increased productivity and profits, data analytics can produce more precise, stable and higher agriculture land values.
By Joseph Byrum

46 SOFT SKILLS: DATA STORYTELLING
No more ‘criticism sandwiches’: New framework for garnering feedback and connecting with the people who matter most.
By Esther Choy

52 THE UNIVERSE OF HEALTH DATA THINGS
IoT and health apps: monetizing the value of ‘data as data’ via real-time biometrics and associated mobile phone apps.
By Aaron Lai



Welcome to Analytic Solver ® Cloud-based Optimization and Simulation that Integrates with Excel

Everything in Predictive and Prescriptive Analytics Everywhere You Want, from Concept to Deployment.

The Analytic Solver® suite makes the world’s best optimization software and the fastest Monte Carlo simulation and risk analysis software available in your web browser (cloud-based software as a service), and in Microsoft Excel. And you can easily create models in our RASON® language for server, web and mobile apps.

Linear Programming to Stochastic Optimization. It’s all point-and-click: Fast, large-scale linear, quadratic and mixed-integer programming, conic, nonlinear, non-smooth and global optimization. Easily incorporate uncertainty and solve with simulation optimization, stochastic programming, and robust optimization.

Comprehensive Risk and Decision Analysis. Use a point-and-click Distribution Wizard, 50 probability distributions, automatic distribution fitting, compound distributions, rank-order correlation and three types of copulas; 50 statistics, risk measures and Six Sigma functions; easy multiple parameterized simulations, decision trees, and a wide array of charts and graphs.

Forecasting, Data Mining and Text Mining. Analytic Solver is also a full-power, point-and-click tool for predictive analytics, from time series methods to classification and regression trees, neural networks, and access to SQL databases and Spark Big Data clusters.

Find Out More, Start Your Free Trial Now. In your browser, in Excel, or in Visual Studio, Analytic Solver comes with everything you need: Wizards, Help, User Guides, 90 examples, even online training courses. Visit www.solver.com to learn more or ask questions, and visit analyticsolver.com to register and start a free trial – in the cloud, on your desktop, or both!

Tel 775 831 0300 • Fax 775 831 0314 • info@solver.com


DRIVING BETTER BUSINESS DECISIONS

REGISTER FOR A FREE SUBSCRIPTION: http://analytics.informs.org


DEPARTMENTS

2 Inside Story
8 Executive Edge
12 Analyze This!
16 Healthcare Analytics
22 Newsmakers
60 INFORMS Annual Meeting
64 Conference preview: Healthcare
66 Winter Simulation Conference
68 Five-Minute Analyst
72 Thinking Analytically

Analytics (ISSN 1938-1697) is published six times a year by the Institute for Operations Research and the Management Sciences (INFORMS), the largest membership society in the world dedicated to the analytics profession. For a free subscription, register at http://analytics.informs.org. Address other correspondence to the editor, Peter Horner, peter.horner@mail.informs.org. The opinions expressed in Analytics are those of the authors, and do not necessarily reflect the opinions of INFORMS, its officers, Lionheart Publishing Inc. or the editorial staff of Analytics. Analytics copyright ©2017 by the Institute for Operations Research and the Management Sciences. All rights reserved.


INFORMS BOARD OF DIRECTORS

President: Brian Denton, University of Michigan
President-Elect: Nicholas Hall, Ohio State University
Past President: Edward H. Kaplan, Yale University
Secretary: Pinar Keskinocak, Georgia Tech
Treasurer: Michael Fu, University of Maryland
Vice President-Meetings: Ronald G. Askin, Arizona State University
Vice President-Publications: Jonathan F. Bard, University of Texas at Austin
Vice President-Sections and Societies: Esma Gel, Arizona State University
Vice President-Information Technology: Marco Lübbecke, RWTH Aachen University
Vice President-Practice Activities: Jonathan Owen, CAP, General Motors
Vice President-International Activities: Grace Lin, Asia University
Vice President-Membership and Professional Recognition: Susan E. Martonosi, Harvey Mudd College
Vice President-Education: Jill Hardin Wilson, Northwestern University
Vice President-Marketing, Communications and Outreach: Laura Albert, University of Wisconsin-Madison
Vice President-Chapters/Fora: Michael Johnson, University of Massachusetts-Boston

INFORMS OFFICES
www.informs.org • Tel: 1-800-4INFORMS
Executive Director: Melissa Moore
Director, Public Relations & Marketing: Jeffrey M. Cohen
Headquarters: INFORMS (Maryland), 5521 Research Park Drive, Suite 200, Catonsville, MD 21228
Tel.: 443.757.3500 • E-mail: informs@informs.org

ANALYTICS EDITORIAL AND ADVERTISING

Lionheart Publishing Inc., 1635 Old 41 Hwy, Suite 112-361, Kennesaw, GA 30152, USA
Tel.: 770.431.0867 • Fax: 770.432.6969

President & Advertising Sales: John Llewellyn, john.llewellyn@mail.informs.org, Tel.: 770.431.0867, ext. 209
Editor: Peter R. Horner, peter.horner@mail.informs.org, Tel.: 770.587.3172
Assistant Editor: Donna Brooks, donna.brooks@mail.informs.org
Art Director: Alan Brubaker, alan.brubaker@mail.informs.org, Tel.: 770.431.0867, ext. 218
Advertising Sales: Aileen Kronke, aileen@lionhrtpub.com, Tel.: 678.293.5201


Welcome to Analytic Solver ® Cloud-based Data and Text Mining that Integrates with Excel

Everything in Predictive and Prescriptive Analytics Everywhere You Want, from Concept to Deployment. The Analytic Solver® suite makes powerful forecasting, data mining and text mining software available in your web browser (cloud-based software as a service), and in Microsoft Excel. And you can easily create models in our RASON® language for server, web and mobile apps.

Full-Power Data Mining and Predictive Analytics. It’s all point-and-click: Text mining, latent semantic analysis, feature selection, principal components and clustering; exponential smoothing and ARIMA for forecasting; multiple regression, logistic regression, k-nearest neighbors, discriminant analysis, naïve Bayes, and ensembles of trees and neural networks for prediction; and association rules for affinity analysis.

Simulation/Risk Analysis, Powerful Optimization. Analytic Solver is also a full-power, point-and-click tool for Monte Carlo simulation and risk analysis, with 50 distributions, 50 statistics and risk measures, rank-order and copula correlation, distribution fitting, and charts and graphs. And it has full-power, point-and-click optimization, with large-scale linear and mixed-integer programming, nonlinear and simulation optimization, stochastic programming and robust optimization.

Find Out More, Start Your Free Trial Now. In your browser, in Excel, or in Visual Studio, Analytic Solver comes with everything you need: Wizards, Help, User Guides, 90 examples, even online training courses. Visit www.solver.com to learn more or ask questions, and visit analyticsolver.com to register and start a free trial – in the cloud, on your desktop, or both!

Tel 775 831 0300 • Fax 775 831 0314 • info@solver.com


EXECUTIVE EDGE

Making humans, not machines, smarter with big data

BY AMIT MEHTA

Today’s real take-away in getting value from big data analytics is the capability of making humans, not machines, smarter first. This may be heresy to proponents of big data analytics, but it’s time to rethink big data’s potential. Good insights from smart analytics are of scant value if there is an insight deficit in big judgment enablement. Insight deficit? It’s the gap in the ability to find and analyze relevant information to drive actions and decisions effectively. If humans can do their daily jobs more efficiently, not only will their daily quality of life improve, they will have more time to apply big judgment that steers clear of overlooking opportunities or underestimating risks.

LESSONS FROM CONSUMER APPLICATIONS

Let’s draw parallels from consumer apps. Apple’s legendary Steve Jobs discovered how to make consumers “smarter” by combining the ability to make phone calls, take photos, send text messages, look for driving directions, search the vast Internet and more in one platform. The aptly named



smartphone has accomplished nothing less than changing the world by making us more productive and enhancing our quality of life. Once Apple “hooked” consumers on smartphones, Jobs and his Silicon Valley experts made users even smarter by reducing all possible daily inefficiencies that occur (let’s call them obstacles to human efficiency). With the smartphone, users theoretically became more productive and, as a result, have more time to think, be more inventive and launch new, innovative initiatives.

Apple, of course, does not hold a monopoly on changing the electronic marketplace. To the contrary, the consumer application world is filled with similar success examples, such as Facebook getting users hooked to its platform via a better user experience, then gathering human intelligence via human-generated data, and now actively applying actionable insight to deliver, for example, more focused advertising to users, among other new experiments it is conducting with artificial intelligence (AI).

ENTERPRISE USER ENGAGEMENT WILL DRIVE BIG DATA ADOPTION

Enterprises must draw parallels from these consumer world successes if they want to see big data become a competitive differentiator and adopted companywide. How can this be done? Simply by finding all human efficiency problems, i.e., finding all possible dead times, idle times and wait times in the current enterprise workflows from data to decisions. Let’s look at some examples of human inefficiencies in the enterprise: upload/download data, prepare data, aggregate data from a variety of sources, make Excel charts, make management presentations, and separate facts from fiction in operational and business decisions.

Subsequently, as with an “aha” reaction of discovery, if big data can be applied to eliminate or reduce these human inefficiencies, it will free users from drowning in data, from being busy in an essentially make-work way and from being less productive. Once enterprise users are liberated from the above and become more productive in their respective enterprises, their quality of life will improve. Virtually overnight, they will have enough time to apply real insights and intelligence to make the machines for which they are responsible function even smarter. Once users are engaged through this scenario, cognitive and other intelligent analytics could be added, as humans will most likely generate good judgment via human data in such a platform effortlessly.




DATA DECISIONS WORKFLOWS: ADDRESSING THE INSIGHT DEFICIT

Given that perspective, what should an organization’s executives and management look for to make humans, not machines, smarter first? Data decisions workflows have three primary pillars: data quality, actionable insight and human capability. Let’s define, then analyze, the effects:

• Data quality: ensuring all data from a variety of sources/data lakes is of good quality, whether real-time or historical.
• Actionable insight (information): turning good data into information/insight that is useful and can be trusted.
• Knowledge worker capability (engineer): individual competency, which can vary based on experience and expertise.

All three must intersect (see Figure 1) to leverage real value from big data analytics, reduce the insight deficit, help enterprise users enable big judgment and eventually increase human efficiency.

Figure 1: The three pillars of data decisions workflows must intersect.

As shown in Figure 1, let’s look at three scenarios to explain the point:

1. If data quality and actionable insight are excellent but human capability is low, knowledge workers will overlook opportunities or underestimate risks.
2. If data quality and human capability are excellent but actionable insight cannot be trusted, knowledge workers cannot translate information to action.
3. If data quality is poor, but human capability and insight are excellent, knowledge workers will not trust the analytics.

Hence all three must intersect to ensure the highest business performance and make humans smarter, which will set the foundation to apply real complex machine learning/AI.

REAL CHANGE MANAGEMENT

Since the above is not rocket science, why are central data analytics enterprise groups seemingly so opposed and focused on only applying data analytics to machines first? The simple answer is, “Machines can be fixed, machine data has been gathered for years, and they are relatively expensive, so optimizing them in theory makes sense.” Humans, on the other hand, have their own perspectives and competencies that make everyone unique. Each individual has their own habits, which need considerable influencing. That is particularly the case when organizational change is required.

Typically, individuals offer resistance to change and need “hand holding” to move from old to new ways. And that’s simply not easy. The dilemma in making humans smarter is: Data analytics groups budget for data scientist hiring, but will they budget to hire personnel who specialize in influencing habit changes? If “yes” to the latter, the results can be dynamic across the business world. ❙

Amit Mehta is CEO of Houston-based Moblize (www.moblize.com).

“This program has helped me develop some great tools for my analytics belt.” – Spring 2016 Exit Survey
Ivan B., Class of ’18, Oil and Gas
MORE INFORMATION: ANALYTICS.STAT.TAMU.EDU



ANALYZE THIS!

Lessons learned: The ‘get-with-the-program’ problem

The importance of building relationships with client staff members and allies.

BY VIJAY MEHROTRA


Early in my career, I was working on a project team at a very large software company that had conducted a preliminary analysis of some newly captured data. Our executive sponsor “Chuck,” who was the company’s vice president of Operations, asked us to meet individually with a few key managers in one of the business units to discuss our initial results. More than a decade later, I still have a very vivid memory of one of those meetings.

“Troy” was in his mid-forties and had been at the company for several years. We arrived at his office well prepared, with a meeting agenda, charts and talking points. Nevertheless, the meeting did not go well. Troy did not bother to understand our newly captured data or our statistical estimation methods. After barely a cursory look at our handouts, he quickly discounted the new data, questioned the validity of our analysis method, and disregarded the insights that we offered. Throughout our conversation, he was extremely condescending.

Later that week, while giving a quick update to our executive sponsor, we took the opportunity to describe our meeting with Troy. Unexpectedly, Chuck immediately picked up the phone and asked Troy to come to his office. As we looked on, Chuck proceeded to berate Troy for how he had interacted with us and told him in no uncertain terms to “get with the program.” While this was a somewhat uncomfortable experience, we could not help but gloat, for suddenly we felt like we had decisively won this battle.

This turned out, however, to be a hollow victory. Within Troy’s business unit, news spread quickly about what had happened in Chuck’s office. Thereafter, Troy and his peers were always careful to comply with requests from our team – but none of them ever volunteered anything beyond the bare minimum that we asked for. Moreover, we were never able to establish a modicum of trust with any of those managers, and our inability to tap into their knowledge and experience made our work much harder. Eventually, we also realized that we had used up some valuable political capital with our executive sponsor.

Looking back, I can certainly understand why we had handled the situation in this way. We were young, ambitious and eager. This was the first project with this new client, and we knew that we had Chuck’s ear. At some deep level, we were confident that our academic training and our intelligence would lead to success in the business world. At the same time, we were also feeling real pressure to deliver results in order to help establish a lucrative long-term consulting relationship. We were also frustrated and angry at how Troy had treated us – and we did not hesitate to express those feelings to Chuck or to anyone else who would listen.

If you are an experienced analytics professional, you are likely already familiar with this kind of story. If you are a student – or perhaps just starting out in the business world – you should expect to encounter people like Troy and situations like this early in your career.

But let’s also take a minute to consider Troy’s viewpoint. Troy and most of his peers had been at this company for several years. They had their own sense of what was working, what was going wrong, and how things might be improved but had not felt empowered or engaged to make significant changes. Meanwhile, management had chosen to bring in consultants to gather data, build models and conduct analysis. In addition, like most executive sponsors, Chuck had an enormous array of responsibilities and pressures, and so he had had little time to explain the motivations and goals for our project to middle managers like Troy, who may well have been concerned about losing their jobs (the business unit was going through a significant transition at the time).

Troy was surely confused and concerned when we came rolling into his office. He saw us as inexperienced kids 15+ years younger than him, armed with little more than our fancy academic credentials. Undoubtedly, we also brought our own kind of “pros from Dover” swagger [1] as we casually described our statistical methods, though I am sure we thought we were successfully hiding our condescension behind our tight professional smiles. As Troy saw it, we did not know him, his peers or their business – and neither did the executive who had sent us in to start poking around. We were taking up some of his valuable time, and he already had too much to do. Our project was surely going to be intrusive and disruptive, and we almost surely would not be around to clean up the mess afterwards. Is it any wonder that he did not really want to bother with us?


My experience with Troy taught me quite a few lessons about the important process of starting to build relationships with client staff members and would-be allies. Lessons learned:

1. Be as humble as possible about who you are and as transparent as you can be about why you are there. Note that this is a way of being rather than an affected posture, as most people have very strong instincts for how others actually feel about them.

2. Do as much as you possibly can to try to understand who the different players are, and what their roles and responsibilities and incentives are. This knowledge can be very valuable in helping you proactively appreciate concerns and/or avoid sensitive topics.

3. Come prepared with open-ended questions, rather than just an agenda or a presentation. Do ask “why?” a lot, but do your best to express wonderment and curiosity rather than condescension and ridicule.

4. Be thoughtful about the specific language that you choose to use, for this makes a surprisingly big difference in how you are perceived. In particular, the more concrete and familiar (rather than abstract or mathematical) your terminology is to those you are speaking with, especially non-technical people, the more likely you are to be perceived as empathetic and trustworthy [2].

In almost all analytics projects and roles, you will need to interact with – and rely on – people whose backgrounds, responsibilities and world views are quite different than your own. From my experience, you will find it much easier to achieve your goals if you are able to earn their trust. Often, it will be impossible to succeed without it. ❙

REFERENCES 1. From http://www.urbandictionary.com/: “An American slang term for outside consultants who are brought into a business to troubleshoot and solve problems. The term comes from the 1968 book “M*A*S*H” by Richard Hooker. In the book, the character Hawkeye is described as using the guise of being the ‘pro from Dover’ to obtain free entrance to golf courses.” 2. For example, see: Hansen, J., and Wänke, M., 2010, “Truth From Language and Truth From Fit: The Impact of Linguistic Concreteness and Level of Construal on Subjective Truth,” Personality and Social Psychology Bulletin, Vol. 36, No. 11, pp. 1,576-1,588.

Vijay Mehrotra (vmehrotra@usfca.edu) is a professor in the Department of Business Analytics and Information Systems at the University of San Francisco’s School of Management and a longtime member of INFORMS.

Prizes & Awards Deadlines

Each year INFORMS grants several prestigious institute-wide prizes and awards for meritorious achievement. These prizes and awards celebrate wide-ranging categories of achievement, from teaching, writing and practice to distinguished service to the institute and the profession and contributions to the welfare of society.

Case and Teaching Materials Competition Submission: August 26

Judith Liebman Award Nominations: August 25

Frederick W. Lanchester Prize Submission: June 15

Moving Spirit Award for Chapters Nominations: August 25

George B. Dantzig Dissertation Award Submission: June 30

Moving Spirit Award for Fora Nominations: August 25

George E. Kimball Medal Submission: July 31

Prize for the Teaching of the OR/MS Practice Nominations: June 30

George Nicholson Student Paper Competition Submission: June 15

Saul Gass Expository Writing Award Submission: July 1

INFORMS O.R. & Analytics Student Team Competition Register: September 30

Undergraduate Operations Research Prize Submission: June 15

John von Neumann Theory Prize Nominations: July 1

Volunteer Service Award Submission: June 30

https://www.informs.org/Recognizing-Excellence/INFORMS-Prizes



HEALTHCARE ANALYTICS

Tighten your belt as healthcare industry slows down … but stay hopeful

BY RAJIB GHOSH


In my last article, I mentioned that the Affordable Care Act (ACA) dodged the bullet but the attack would continue after the first incarnation of the American Health Care Act (AHCA) failed on the House floor. It didn’t take long for the House to get the second incarnation ready and get it passed, albeit without a Congressional Budget Office (CBO) score. The Senate now has it, but apparently senators are working on their own bill that pundits expect to be a complete rewrite of the House bill. Meanwhile, the CBO came out with its scoring of the new incarnation of the House bill and found it to have a large, negative impact on the healthcare of the elderly and the poor, with a projected 23 million people expected to lose health insurance over a 10-year period, including 14 million from Medicaid.



On the positive side, the bill is expected to reduce the deficit by saving overall healthcare costs, which is intuitive since a smaller number of high-risk or high-utilizer people would be covered by the bill. With all the drama going on in Washington these days, it is unclear when a real and final bill would come out of the Senate for public debate and subsequent implementation. It is unfortunate that during this time of uncertainty access to healthcare for many people will recede as the insurance companies, being unable to figure out their upcoming risks, would play it safe and move out of many state health insurance exchanges. Medicaid expansion states would not be affected during this period, but they would be the first to feel the brunt should the Senate bill follow the same line of thinking as the House bill.

Healthcare politics, health policy and healthcare technology, which includes analytics, go so much together in our country that it is impossible to separate them. Politics determine federal and state policies, which in turn dictate provider and payer behavior and their subsequent adoption of technology solutions. If we return to the era of volume-based care, many healthcare technology solutions such as population health management or analytics to reduce hospital readmission won’t remain a priority, albeit the trailblazers would surely continue on that path. Providers want to do things better, but they might not focus on doing better things. In other words, the transformational change that healthcare analytics usher in would lose momentum. So, it is important to track healthcare politics and policies and analyze their impact so we understand the future direction of healthcare analytics.

IMMINENT SLOWDOWN IN THE HEALTHCARE INDUSTRY

After several years of rapid expansion of the healthcare industry, which analysts say was fueled by high debt growth, the growth rate is expected to slow down in the coming years. The 308 percent debt growth since 2009 is not supported by the demand growth. Large investments went into brick and mortar acquisitions, job growth and rapid mergers/acquisitions. A recent report published by John Burns Real Estate Consulting identified the risk of the industry pulling back and its impact (see Figure 1). The risk has already started to manifest itself in a few news items that came out recently, including the large sell-off of hospitals by Community Health Systems. Coupled with the uncertainty in the rise of uncompensated care with the AHCA’s attack on Medicaid, it is quite clear that brick and mortar expansion of the provider network will slow down in the coming months and years.

Figure 1.

However, the load of chronic disease management won’t go away as the population continues to age. The brick-and-mortar-based healthcare delivery channel will transform to a lower cost, virtual-care model. This transformation won’t be possible without digital technology, advanced data analytics and, in the near future, use of artificial intelligence. That’s the good news for my colleagues in the field.

RAPID RISE OF AI COMPANIES IN HEALTHCARE

The scope and the use of artificial intelligence in healthcare is increasing. The technology is making progress at a rapid rate. Innovators and investors are both building and funding new companies to accelerate the growth. Six years ago, I was exposed to


the digital nurse avatar created at Northeastern University. I was fortunate to meet Professor Timothy Bickmore, whose research found that patients waiting for hospital discharge resonated well when the nonhuman “Elizabeth” interacted with the patients instead of a human nurse. The technology has only become smoother since then, as natural language processing (NLP) powers have increased. As a result, the “avatar with digital empathy” is gaining momentum. CB Insights recently reported on 106 companies that are working on AI and machine learning in many areas of healthcare, from virtual nurses to drug discovery (see Figure 2). A recent global trend study published by Tata Consultancy Services stated that 86 percent of provider organizations, life sciences companies and technology vendors are using AI in their solution stack to improve business operations. This trend will continue through 2020 and beyond.

Figure 2.

The road for AI is not free of challenges, however. For example, patient privacy is a big issue; an uproar ensued after the data sharing deal between Alphabet’s DeepMind AI company and the National Health Service (NHS) of the United Kingdom became known. Nonetheless, consulting firm Frost & Sullivan predicts that by 2025 AI systems could be involved in everything from population health management to digital avatars like what Dr. Bickmore showed me in 2011.

AHCA in its current House incarnation or after the Senate rewrite is expected to bring forward several of ACA’s provisions. Nonetheless, the threat of uncompensated care for hospital-based systems is real. The complexity of the American healthcare system value chain is such that it is hard to find a silver bullet that ensures access to care for all while reducing the cost of care delivery. The only option is rapid transformation with digital health and AI. Hospital bed days and pharmaceutical costs, two of the biggest contributors to the overall cost of care, will have to be reined in through efficient post-acute care management, reduction in drug discovery cost and policy changes. On both fronts technology, data and analytics would be the game-changers, but they need to be embraced with an open mind. The cost of these technologies also needs to come down so that providers can achieve better ROI, faster. At the beginning, many things would not be perfect, but with rapid iterations, technologies would deliver on their promises. While I am issuing a word of caution for my colleagues in this field expecting an imminent slowdown, I am also encouraging them to feel hopeful about the field for the next five to 10 years. ❙

Rajib Ghosh (rghosh@hotmail.com) is an independent consultant and business advisor with 20 years of technology experience in various industry verticals where he had senior-level management roles in software engineering, program management, product management and business and strategy development. Ghosh spent a decade in the U.S. healthcare industry as part of a global ecosystem of medical device manufacturers, medical software companies and telehealth and telemedicine solution providers. He’s held senior positions at Hill-Rom, Solta Medical and Bosch Healthcare. His recent work interest includes public health and the field of IT-enabled sustainable healthcare delivery in the United States as well as emerging nations.

Want to deploy more applications in 2017? Now, you can build and deploy ANY optimization or analytic application 80% faster with FICO® Optimization Modeler (powered by Xpress). This platform delivers scalable and high-performance algorithms, a flexible modeling environment, and unmatched rapid application and reporting capabilities.

FICO is a proud sponsor of the 2017 INFORMS Analytics Conference. Email us at FICOAmericas@fico.com to schedule a demo, or visit Booth 32 in the Exhibit Hall for your chance to win a mini drone and other great prizes. © 2017 Fair Isaac Corporation. All rights reserved.



REGISTER NOW

Healthcare 2017 OPTIMIZING OPERATIONS & OUTCOMES

Attendees can network and meet with fellow “healthcare analytics” academic researchers and industry stakeholders who are applying and sharing research to improve the delivery of effective healthcare.

NETWORK WITH MEMBERS

JOIN US IN ROTTERDAM, JULY 26–28 Healthcare 2017 offers a platform that will help you take the next step on the path toward optimizing healthcare while advancing the theory and practice of operations research and analytics. Technical sessions focus on data analytics & machine learning, scheduling & planning, supply chains, medical decision making, disease & treatment modeling, emergency medical services, health operations, policy & systems, simulation, eHealth & telemedicine, quality & safety, insurance, and much more.

LEARN BEST PRACTICES

Access the full Program Schedule: http://meetings.informs.org/healthcare2017

Keynote Speakers Dimitris Bertsimas

Operations Research Center Massachusetts Institute of Technology

Brian Denton

Department of Industrial and Operations Engineering University of Michigan

Dr. Eric de Roodenbeke

CEO, International Hospital Federation

Margaret Brandeau

Stanford University Philip McCord Morse Lecture: "Public Health Preparedness: Answering (Largely Unanswerable) Questions with Operations Research"

REGISTER today http://meetings.informs.org/healthcare2017

HEALTHCARE 2017

TAKE SOMETHING BACK


NEWSMAKERS

Salary study, stretch goals, overtreatment & GDPR compliance

STUDY: SALARIES FOR EARLY CAREER DATA SCIENTISTS DECREASE FOR FIRST TIME

Salaries for early career data scientists decreased year over year for the first time in four years, as did the percentage of early career data scientists with a Ph.D., while demand for data scientists continued to increase, according to the recently released Burtch Works 2017 salary study of data scientists. Salaries for more experienced data scientists generally held steady or increased slightly depending on an individual’s focus area, responsibility and geographic base, according to the report. The study includes data on salaries and bonuses by industry, region, education, residency status and gender, as well as data science hiring market trends and predictions for the future. The study sample included more than 400 U.S.-based data scientists.

The demand for data scientists continues to increase. Photo Courtesy of 123rf.com | © Natalia Romanova

Burtch Works defines “data scientist” as a “specific type of predictive analytics professional who applies sophisticated quantitative and computer science skills to both structure and analyze massive stores or continuous streams of unstructured data, with the intent to derive insights and prescribe action.” To download the free, complete report click here.

SETTING ‘STRETCH GOALS’ CAN UNDERMINE ORGANIZATIONAL PERFORMANCE

While the general consensus regarding “stretch goals” is that they boost drive and innovation and improve organizational performance, new research in the INFORMS journal Organization Science shows that this is the exception and not the rule. For many organizations, stretch goals can serve to undermine performance. The study, “Stretch Goals and the Distribution of Organizational Performance,” was conducted by Michael Shayne Gary of UNSW Business School in Sydney, Miles Yang of Curtin University, Philip Yetton of Deakin University and John Sterman of the Sloan School of Management at MIT. They examined the impact of assigning stretch or moderate goals to managers.

About 80 percent of participants failed to reach the assigned stretch goals. Photo Source: Thinkstock

Study participants were assigned moderate or stretch goals to manage the widely used interactive, computer-based People Express business simulation. The researchers found that about 80 percent of participants failed to reach the assigned stretch goals. Compared with moderate goals, stretch goals improved performance for a few, but many abandoned the stretch goals in favor of lower self-set goals or adopted a survival goal when faced with the threat of bankruptcy. Consequently, stretch goals generated higher variation in performance across organizations, created large performance shortfalls that increased risk taking, undermined goal commitment and generated lower risk-adjusted performance.

The authors suggest that whether boards or top management should adopt stretch goals in their organization depends on their attitudes toward risk. Those with large appetites for risk may still prefer stretch goals.

In venture capital or private equity, the value created by “big winners” can more than offset the poor returns or losses on the majority of organizations in the portfolio. These organizations may also be more able to absorb the poor returns or losses created by aggressive goals. For the full study, click here.

NEW RESEARCH ON HOW THE BRAIN MAKES PREFERENCE-BASED DECISIONS

Researchers have found a direct window into the brain systems involved in making everyday decisions based on preference.

The study, led by a team of neuroscientists at the University of Glasgow’s Institute of Neuroscience and Psychology and recently published in Nature Communications, offers crucial insight into the neural mechanisms underlying our decision-making process, opening up new avenues for the investigation of preference-based choices in humans. Whether we decide to opt for a piece of apple or a piece of cake is, for example, a preference-based decision. How our brains arrive at such decisions – as well as choices that rely on our subjective valuation of different alternatives – is currently a popular research topic.

How our brains arrive at decisions is a popular research topic. Photo Courtesy of 123rf.com | © Natalia Romanova

Previously it was unclear where the brain implements preference-based choices and whether it uses a mechanism similar to when we make decisions purely based on the perceptual properties of the alternatives (like choosing the bigger of two items). The study presented participants with pairs of snacks, like a chocolate bar and a pack of crisps, and asked them to choose their preferred item. To identify the brain areas involved in these decisions, the team used a state-of-the-art multimodal brain imaging procedure. Volunteers wore an EEG cap (to measure their brain electrical activity) while being simultaneously scanned in an MRI machine. The EEG revealed that decision activity unfolds gradually over time and persists until one commits to a choice.


This EEG activity was then localized with functional MRI in the posterior medial frontal cortex of those who participated in the study, a brain region that has not been previously linked directly with preference-based decisions.

FIVE ESSENTIAL PILLARS OF BIG DATA GDPR COMPLIANCE

Less than a year from now, on May 25, 2018, the General Data Protection Regulation (GDPR) will come into effect in the European Union. GDPR represents a significant change in how data will be handled around the world. In the United States, the 2017 GDPR Preparedness Pulse Survey conducted by PricewaterhouseCoopers polled C-suite executives from large American multinationals and showed that U.S. companies are overwhelmingly aware of, and concerned with, GDPR regulations. Over half of survey respondents cited GDPR as a “top” priority, and 38 percent named it “among” their top priorities. And rightfully so, given that fines are applicable to U.S. businesses as well and that the new regulations are relatively complicated and will require significant preparation, not just as an afterthought. GDPR applies to any enterprise in the world that targets the European market in offering goods or services or profiles European citizens.

For companies in big data (or any data, for that matter), one of the most daunting things about the GDPR is that organizations have already accumulated massive amounts of data and the regulations apply not just going forward, but retroactively as well. The path toward GDPR compliance for big data organizations begins by identifying the five critical challenges: 1. data storage, 2. aligning teams, 3. accommodating data subject requests, 4. data governance and 5. adaptability.

GENEROUS HEALTH INSURANCE PLANS ENCOURAGE OVERTREATMENT

Offering comprehensive health insurance plans with low deductibles and co-pay in exchange for higher annual premiums seems like a good value for the risk averse and a profitable product for insurance companies.

But according to a forthcoming study in the INFORMS journal Marketing Science, such plans can encourage individuals with chronic conditions to turn to needlessly expensive treatments that have little impact on their health outcomes. This in turn raises costs for the insurer and future prices for the insured.

Why do patients pick more generous insurance plans and expensive treatments? Photo Source: Thinkstock

The study, “A Dynamic Model of Health Insurance Choices and Health Care Consumption Decisions,” is coauthored by Nitin Mehta of the University of Toronto, Jian Ni of Johns Hopkins University, Kannan Srinivasan of Carnegie Mellon University and Baohong Sun of the Cheung Kong Graduate School of Business. The authors examined data from an unnamed health insurer in the United States on the insurance plan and treatment options for 3,000 chronically ill patients over a three-year period. Treatments vary widely in terms of cost and impact. Expensive “frontier” treatments provide the best outcome for only the seriously ill, while cheaper, established treatments prove effective for most other patients.

The authors examined the underlying reasons why patients chose the more generous insurance plans and expensive treatments.

They found that not only price, but also the lack of information and uncertainty about effectiveness of alternative treatments, drove choice. Chronically ill patients could be uncertain about the severity of their illness and how they would respond to alternative treatments. Faced with uncertainty, they asked doctors for the “best” treatment available and chose generous plans with lower copay and deductibles, which in turn made them more likely to choose the expensive treatment.

2018 IAAA COMPETITION ACCEPTING ENTRIES

Entries are being accepted for the 2018 Innovative Applications in Analytics Award (IAAA) competition, sponsored by Caterpillar and the Analytics Society of INFORMS. The IAAA provides a forum for enterprising teams and organizations to receive recognition for applications that combine different types of analytics – descriptive, predictive and prescriptive – in creative and unique ways to achieve real-world impact. The award committee is finalizing an opportunity for this year’s best submissions to be published in a special issue of a flagship INFORMS journal. Apply now and receive the credit you deserve for your innovative analytics application. For more information, contact IAAA committee chair Dr. Juan Jaramillo at jaramijr@farmingdale.edu. ❙


BIG DATA OPTIMIZED

ODHeuristics is designed with massive models in mind. It’s created by OD experts – global leaders and innovators in the field of optimisation. It runs under IBM’s CPLEX and takes advantage of the multi-core processing power of modern computing by breaking models into multiple threads. The combined innovation of Optimization Direct with the power of CPLEX gives effective results on packing problems, supply chain and telecoms as well as scheduling applications. On large-scale MIPs it provides solutions that are often beyond the reach of traditional optimization methods. Find out more and challenge the optimization experts with your large matrices: find us at optimizationdirect.com

The IBM logo and the IBM Member Business Partner mark are trademarks of International Business Machines Corporation, registered in many jurisdictions worldwide. *IBM ILOG CPLEX Optimization Studio is trademark of International Business Machines Corporation and used with permission.


UNCOVERING HIDDEN VALUE

Dark analytics: Shedding light on a new business asset

BY NITIN MITTAL

Deep within the astonishing volumes of raw information generated by business transactions, social media, search engines, IoT and countless other sources, valuable intelligence about customers, markets and organizations lies waiting to be discovered. Leveraging advanced technologies to explore this expansive universe of unstructured and “dark” data reveals hidden insights to inform decision-making and chart new paths to the future. Machine learning, robotic process automation, visualization, natural language processing, and image and video analysis allow questions to be answered and opportunities brought to light that were unimaginable only a few years ago.

The influence of dark data is inescapable, increasingly driving innovation and value to customers and stakeholders. Photo Courtesy of 123rf.com | © Kriangkrai Wangjai

We’re only beginning to explore the digital universe, yet significant business value lies within. In fact, International Data Corporation (IDC) predicts that organizations that can analyze all relevant data and deliver actionable information could achieve an extra $430 billion in productivity gains over their peers by 2020 [1]. The influence of dark data is inescapable, increasingly driving innovation and value to customers and stakeholders. From financial services to healthcare, we’re starting to see the powerful impact of evolving technologies, new types of data and their potential to drive competitive advantage and shape entire industries. In the automotive industry, connected cars could help make usage-based insurance possible and accurately predict when vehicles need maintenance. Some automakers are beginning to experiment with an augmented reality dashboard.


DEFINING DARK DATA

With a digital universe expected to reach 44 zettabytes by 2020, and 90 percent of it being unstructured data from the IoT and non-traditional sources [2], how is dark data defined? Today’s dark analytics efforts usually focus on three dimensions:

1. Untapped data already in your possession. Most organizations have large collections of structured and unstructured data sitting idle. On the structured side, it’s often due to the difficulty of making connections between disparate data sets. Insights lie waiting to be discovered: One company mapped the addresses of employees against workplace satisfaction ratings and retention data, finding that one of the biggest factors fueling voluntary turnover was commute time. Unstructured data is often text-based, and until recently, the tools and techniques needed to leverage it efficiently did not exist. Today, scanned patient records could hold the key to better understanding disease and prescribing effective treatments; executive emails and communications could unearth wisdom needed to pass along to a younger generation of workers.

2. Nontraditional unstructured data. Another dark analytics dimension focuses on data such as audio and video files and still images that could not be explored until now. Now, computer vision, advanced pattern recognition, and video and sound analytics allow companies to mine this data to better understand customers, employees, operations and markets. This insight can be illuminating; retailers have a window into customer sentiment by analyzing in-store posture and facial expressions or online browsing patterns. Oil and gas companies can use acoustic sensors to monitor pipelines and algorithms to provide visibility into flow rates and fluid composition.

3. Data in the deep web. The deep web may offer the largest body of untapped information – data curated by academics, government agencies, communities and other third-party domains – that is often hidden behind firewalls. Organizations may soon be able to curate competitive intelligence using search tools designed to help target specific data types. Analysis of this “deep web” data is especially promising in its potential to allow better prevention, detection and response to cyber threats.

LIGHT UP THE DARK SIDE

Developing a strategy for discovering value in unstructured data can help your organization generate insights today and prepare for even greater opportunities in the years ahead. How can your organization get the most value out of the mountains of data that it has created, owns or has access to? To help optimize the value of this business asset, consider these practical steps:

Ask the right questions. Work with business teams to identify specific questions that need to be answered. Then identify the sources of data that make the most sense for your analytics efforts. To boost sales of sports equipment, analytics teams can focus on sales transactions, inventory and product pricing in a specific geographic area. Supplementing the data with unstructured data, such as in-store analysis of foot traffic or social media trends, generates more nuanced insights.
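Both of the examples above – mapping employee addresses against retention data, and enriching sales transactions with foot-traffic or social signals – come down to joining data sets that already exist but have never been connected. The following is a minimal sketch of that kind of join, assuming hypothetical file names and columns; it is an illustration only, not a reference to any tool discussed in this article.

```python
# Hypothetical sketch: connect two idle, structured data sets to test a
# "dark data" question - does commute time track voluntary turnover?
# File names and column names are illustrative assumptions.
import pandas as pd

commutes = pd.read_csv("employee_commutes.csv")    # employee_id, commute_minutes
retention = pd.read_csv("employee_retention.csv")  # employee_id, left_voluntarily (0 or 1)

joined = commutes.merge(retention, on="employee_id", how="inner")

# Average commute time for leavers vs. stayers - a first, coarse signal.
print(joined.groupby("left_voluntarily")["commute_minutes"].mean())

# Simple correlation; a real analysis would control for role, tenure,
# compensation and location before drawing any conclusions.
print(joined["commute_minutes"].corr(joined["left_voluntarily"]))
```

The same pattern applies to the sports-equipment example: merge transaction, inventory and pricing extracts on store and week, then layer in the unstructured signals once the structured join is in place.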


Look outside of your organization. Augmenting your data with publicly available demographic, location and statistical information helps put insights in context. For example, when a physician makes recommendations to an asthma patient about how to manage symptoms, he can also provide short-term solutions to help her deal with flare-ups during pollen season.

Expand analytics talent for impact. Optimizing dark data analytics depends on assembling the right teams to meet specific needs and identify emerging opportunities. Build a cohesive team that encompasses organizational, business and technical knowledge.

View analytics as a business-driven effort. Plan your project with your organization’s business goals in mind to determine the value that must be delivered, define the questions to ask, and decide how to harness data to generate the right answers. Data analytics then becomes an insight-driven advantage that wins support throughout the organization and fuels future endeavors.

Think broadly. As new strategies and capabilities for using dark analytics are developed, consider how they can be extended across the organization, as well as to customers, vendors and business partners.

By tapping into dark data, organizations have an opportunity to turn previously hidden or unknown patterns and correlations into powerful insights. Uncovering this business value leads to new opportunities, reduced risk and increased analytics ROI. The potential is exciting, but staying grounded with specific business questions that have a defined scope and measurable value is critical. Focus your dark analytics efforts on areas that matter to your business – and avoid getting lost in the increasingly vast unknown. ❙

Nitin Mittal is a principal in Deloitte Consulting LLP and leads Deloitte Consulting’s Analytics & Information Management practice. His team works with companies from many industries to fully leverage the potential of analytics and emerging technologies.

REFERENCES

1. “IDC FutureScape: Worldwide big data, business analytics, and cognitive software predictions,” International Data Corporation, 2016.

2. EMC Digital Universe with research and analysis by IDC, “The digital universe of opportunities: Rich data and the increasing value of the Internet of Things,” April 2014, https://www.emc.com/leadership/digital-universe/2014iview/index.htm



DIGITAL UNIVERSE

Dark data: The two sides of the same coin

BY GANESH MOORTHY oday, we live in a digital society. Our distinct footprints are in every interaction we make. Data generation is a default – be it from enterprise operational systems, logs from web servers, other applications, social interactions and transactions, research initiatives and connected things (Internet of Things). In fact, according to a Digital Universe study, 2.2 zettabytes of data was generated in 2012. This grew by 100 percent in 2013, and is

T

32

|

A N A LY T I C S - M A G A Z I N E . O R G

This grew by 100 percent in 2013, and is slated to grow to 44 zettabytes worldwide by 2020. The study further states that only 0.5 percent of the data generated is actually being analyzed. The study goes on to estimate that about 25 percent of the data, if properly managed, tagged and categorized, could be consumed for other purposes.

Enterprises have been collecting and storing data since the age of computers; dark data has always existed.



Organizations need to understand that any data left unexplored is an opportunity lost and a potential security risk. Photo Courtesy of 123rf.com | © aleksanderdn

But its close correlation to big data has made it a buzzword (or buzzkill, depending on your point of view) in current times. The challenge, though, is that we are simply not equipped to deal with this constant deluge of data. Compounding this effect is the fact that most of this unanalyzed data is unstructured, so it takes more preprocessing and transformation effort to make it ready for analytical consumption.

So then, how do we manage dark data? Organizations need to understand that any data left unexplored is an opportunity lost and a potential security risk.


Based on an organization’s intent and investment appetite, dark data can either be tapped to generate more opportunities or remain in the dark forever – the two sides of the same coin. We cannot, however, manage it like a coin toss, with a 50 percent probability of achieving heads or tails. Four best practices to keep in mind:

1. Make it a conscious investment. Tapping into the potential of dark data requires organizations to make strategic decisions and investments in information protection, retention and mining. These efforts need to be owned by a centralized team that can formulate information management policies and guidelines. If possible, federate the process of executing those guidelines to business functions or departments.




2. Fetch your information from data lakes. Set up centralized data lakes or reservoirs, along with the required encryption and access controls. Employ automated data classification and categorization processes for information management.

3. Metadata-fication. Some enterprise units have started employing advanced machine learning to encrypt, tag and classify data at the transport level – data in motion rather than data at the source. Here, it is important to differentiate between raw source data and processed data, and to store them separately with appropriate controls in place.

4. Deep diving and data mining. While data retention and management cater to information controls for compliance, data mining generates newer opportunities. There is no denying that data can be useful in one form or another. However, data mining must have a business case associated with it. For example, if I am to provide appropriate recommendations to a customer, I will need to consider the customer’s past buying trends. Toward this end, I need customer data from the past three years to generate accurate models. Rather than sifting through a vast repository, combining prioritized business problems, automated data classification and workflow systems makes it possible to generate quick results, as the sketch below illustrates.
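As a rough illustration of best practice No. 4, the sketch below assumes a hypothetical store of three years of purchase transactions and a single prioritized business problem (recommending follow-on products to buyers of one device). The product names, customer IDs and the simple co-purchase counting are all invented for illustration; the article does not prescribe a specific algorithm.

```python
from collections import Counter, defaultdict

# Hypothetical transaction records: (customer_id, product) pairs covering
# the past three years, pulled from the classified data lake.
transactions = [
    ("c1", "glucose_monitor"), ("c1", "test_strips"),
    ("c2", "glucose_monitor"), ("c2", "lancets"),
    ("c3", "glucose_monitor"), ("c3", "test_strips"), ("c3", "lancets"),
    ("c4", "bp_cuff"), ("c4", "bp_log_app"),
]

# Step 1: restrict the vast repository to the prioritized business problem,
# here customers who bought a glucose monitor.
target_product = "glucose_monitor"
baskets = defaultdict(set)
for customer, product in transactions:
    baskets[customer].add(product)
target_customers = {c for c, items in baskets.items() if target_product in items}

# Step 2: count what else those customers bought (simple co-purchase counts).
co_purchases = Counter()
for customer in target_customers:
    for product in baskets[customer] - {target_product}:
        co_purchases[product] += 1

# Step 3: recommend the most frequent co-purchases.
print(co_purchases.most_common(2))  # e.g., [('test_strips', 2), ('lancets', 2)]
```

The point of the sketch is scoping: by filtering to one business question first, the mining step runs against a small, relevant slice of the repository rather than the whole lake.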


This cognizance requires education, and it requires business augmentation units to employ data mining toward improving customer satisfaction, increasing operational efficiency or creating new growth channels.

WELL-ROUNDED CONSUMPTION

Dark data can contain important information about an entity, be it an individual or an organization. From an intra-organizational point of view, this information can be used for management – information containment, fraud detection and threat prevention. From an external perspective, much of the information contained in dark data can be used for a customer 360 view.

One point to keep in mind is that dark data does not need to be the elephant in the room. All it takes is a data-first mind-set, leading to an analytics-first and finally an AI-first mind-set. This cause is further propelled by an implementable approach to solving the dark data problem. There is light at the end of the tunnel. Hopefully, you are in the right tunnel to start with! ❙

Ganesh Moorthy is an associate director at Mu Sigma, where he serves as program manager/senior solution architect for R&D engagements. He has more than 16 years of experience in leading enterprise solution development for Fortune 500 clients. He is currently involved in building industrial internet, augmented reality, and analytics and visualization platforms for both descriptive and predictive analytics.



ASSOCIATE CERTIFIED ANALYTICS PROFESSIONAL
Analyze How aCAP Can Help Launch Your Analytics Career

The Associate Certified Analytics Professional (aCAP) is an entry-level analytics professional who is educated in the analytics process but may not have practice experience. This prestigious program begins a career pathway leading to the CAP designation.

www.certifiedanalytics.org


RISK MANAGEMENT

Surviving global terror: How businesses can better track risks

BY VIRÁG FÓRIZS AND SHANE LATCHMAN

In 2015 and 2016, extremist groups carried out 33 successful terrorist attacks in Western countries – up from an average of three per year in the preceding decade. The overwhelming majority of terrorist attacks take place in a small handful of countries, with attacks in the West representing only 0.3 percent globally in 2016. But the increase in the frequency and fatality rates associated with these attacks means that terrorism is now firmly back on the corporate risk register, regardless of where a company’s assets may be located.


Managing this terror risk has become an essential part of any business strategy that strives to protect personnel, property, information and privacy. In preparing a strategy, relevant sets of data and analytics can be employed to estimate the potential effects of terrorist attacks – and help predict the risks that companies and communities may be facing.

COUNTING THE COSTS: DAMAGE, DISRUPTION, CYBER

The most obvious threat that businesses face from terrorist attacks comes from physical damage to assets. Explosives – which were used in 53 percent of the attacks in the West in 2015 and 2016 – can cause substantial physical damage to their targets and to commercial property caught in the blast zone.


While such costs are borne in the first instance by insurers, they can ultimately trickle down to increased policy premiums. For all but the most catastrophic attacks, however, these costs are typically outweighed by the economic costs associated with disruption to business. Disruptions can occur when locations are sealed off while public authorities conduct investigations and repairs are made. For instance, though the physical property damage caused by the attacks in Paris on Nov. 13, 2015, was limited, large parts of the city were effectively shut down for several days, shuttering all businesses in the affected areas.

Those costs can be compounded in areas that are reliant on the ongoing confidence of tourists, which can be shaken where the threat from terrorism rises. Declines in visitors to Paris following the attacks there in 2015 are estimated to have cost €750 million. Countries such as Kenya, Tunisia, Turkey and Egypt are suffering from reduced tourism revenues following attacks by Islamist extremist militants, including the June 2015 attacks on two hotels in the Tunisian city of Sousse, which left 38 tourists dead and 39 others injured. While the steady recovery of tourist numbers following the 2005 attacks in the Red Sea resort of Sharm el-Sheikh, Egypt, demonstrates that revenue levels can eventually rebound, the current terrorist threat is likely to remain for the foreseeable future, proving potentially costly to businesses located in the affected regions.

Companies are also increasingly at risk of politically motivated cyber attacks, otherwise known as “hacktivism.” While many attacks may cause malicious damage to digital infrastructure, a small number of attacks have demonstrated the capacity to have physical impacts.

Companies are increasingly at risk of politically motivated cyber attacks. Photo Courtesy of 123rf.com | © sangoiri




Although some attacks are suspected to originate from hackers working with the support of foreign governments, hacking into critical national infrastructure, industrial control systems or company networks is within the capability of non-state actors.

TOOLS TO TRACK THE TERRORISM THREAT

Precise prediction of terrorist attacks is notoriously difficult. By their nature, terrorist organizations largely operate in the shadows to avoid detection and will often look to change their methods once security services begin to gain the upper hand. However, risk managers and insurers aren’t operating entirely in the dark. A number of resources and tools can help them better track their exposure and mitigate risk appropriately.

A good starting point is to investigate and review databases of historic events. Such databases can include hundreds of thousands of incidents from multiple decades and can capture a range of important attributes about attacks, such as the date, location, attack type, fatality count and perpetrator group, among others. While terrorists intend their attacks to be difficult to predict and prevent, their tactics are often recycled, including the weapons used and the types of targets.


For instance, a local terrorist organization whose strategic rationale involves the targeting of Western economic interests can be broadly expected to conduct further attacks on similar targets. By referencing terrorism data to identify terrorist actors that predominantly strike Western targets, risk managers are better equipped to prepare for an attack and put a response protocol in place in the event of a successful attack.

In addition to data on history and outcomes, it’s vital to have an assessment of the level of terrorist threat and the ability of security forces to conduct counterterrorism. Any truly useful quantification of terrorism risk therefore requires information not solely dependent on historical data. Expert judgment can also be deployed to signal potential shifts in strategy by a terrorist organization ahead of the data showing it in action. For example, some analysts produce metrics that provide clients with a more holistic view of the true extent of the risk they face.

Modeling tools play a vital role in developing a more robust view of potential losses. The current approach of approximating a bomb’s blast as concentric rings, with a damage percentage applied to the total value in each ring, is often overly conservative. Using a terrorism model that simulates losses from blasts could allow insurance companies to build a more robust view of potential losses.
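To make the contrast concrete, here is a minimal sketch, not a description of any vendor’s terrorism model, that compares the deterministic concentric-ring approximation with a crude Monte Carlo simulation in which blast location and severity vary. All radii, damage percentages and property values are invented for illustration.

```python
import random

# Hypothetical insured properties: (distance from the assumed blast point in meters, insured value).
properties = [(50, 5_000_000), (150, 12_000_000), (400, 8_000_000)]

# Deterministic concentric-ring approach: a fixed damage percentage per ring.
rings = [(100, 1.00), (250, 0.50), (500, 0.10)]  # (outer radius in meters, damage fraction)

def ring_loss(distance, value):
    for radius, damage in rings:
        if distance <= radius:
            return value * damage
    return 0.0

deterministic_loss = sum(ring_loss(d, v) for d, v in properties)

# Crude Monte Carlo alternative: vary blast location and charge size, and use a
# smooth distance-decay damage curve instead of hard rings.
def simulated_loss(trials=10_000):
    total = 0.0
    for _ in range(trials):
        offset = random.uniform(-75, 75)      # uncertainty in blast location (m)
        severity = random.uniform(0.5, 1.5)   # uncertainty in charge size
        for distance, value in properties:
            d = max(1.0, distance + offset)
            damage = min(1.0, severity * (100.0 / d) ** 2)  # toy decay curve
            total += value * damage
    return total / trials

print(f"ring approximation:  {deterministic_loss:,.0f}")
print(f"simulated mean loss: {simulated_loss():,.0f}")
```

Even in this toy version, the simulated estimate reflects the spread of plausible outcomes rather than a single worst-case ring assignment, which is the sense in which the ring method tends to be conservative.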


Analytics and modeling can also be used to help spot weak links and bottlenecks in a company’s supply chain. Companies would be able to overlay their supply chain (for example, a network of suppliers, distributors and routes) on maps representing terrorism risk as a way to potentially identify areas of greatest vulnerability.
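One very simple way to operationalize that overlay is sketched below. The node names, volume shares and location risk scores are made up; in practice the risk scores would come from a terrorism risk index rather than being hand-entered.

```python
# Hypothetical supply chain nodes: the share of annual volume that passes
# through each node and an illustrative location risk score on a 0-10 scale.
nodes = {
    "supplier_A":  {"volume_share": 0.60, "risk_score": 2.0},
    "port_B":      {"volume_share": 0.85, "risk_score": 6.5},
    "warehouse_C": {"volume_share": 0.40, "risk_score": 7.8},
}

# A crude vulnerability measure: exposure = how much flow depends on the node
# times how risky its location is. High values flag bottlenecks to diversify.
exposure = {name: n["volume_share"] * n["risk_score"] for name, n in nodes.items()}

for name, score in sorted(exposure.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: exposure {score:.2f}")
```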

MOVING TO BETTER RISK MANAGEMENT

Natural hazards and other perils often follow long-term patterns and trends that allow insurers and risk managers to develop a view of what is likely to happen in the future – and to prepare for those possibilities. Conversely, terrorist groups intentionally seek to do the unexpected. While patterns and trends can be identified, the desire of terrorists to advance and escalate their campaigns – combined with their need to avoid detection by security forces – means that predicting terror events probably won’t become an exact science. But relevant data and analytics, together with informed judgment, can lead to more realistic risk estimates and better risk management. ❙

Virág Fórizs is a terrorism and security analyst at Verisk Maplecroft, a Verisk Analytics business. Shane Latchman is assistant vice president at AIR Worldwide, a Verisk Analytics business.

INFORMS CAREER FAIR

The INFORMS Career Center offers employers expanded opportunities to connect to qualified O.R. & analytics professionals. INFORMS offers a complete line of services to be used alone or in conjunction with the Career Fair at the 2017 Annual Meeting, giving job seekers and employers a convenient venue to connect. The Career Fair is free to INFORMS attendees.

EMPLOYERS PARTICIPATING IN THE CAREER FAIR ACTIVITIES AT THE 2017 ANNUAL MEETING WILL BE ABLE TO:
• Provide their recruitment materials in a fun and energetic Career Fair setting
• Schedule their own on-site interviews at reserved tables or interview booths
• Promote their organization and meet highly qualified, diverse candidates

FOR CAREER FAIR REGISTRATION & INFORMATION:

http://meetings.informs.org/houston2017



CROPLAND REAL ESTATE

Agriculture analytics: Solutions reflect farmland’s true value

BY JOSEPH BYRUM

Agricultural land has long been considered an exceptionally stable source of value, but that is changing. The price of farms has recently declined, but this devaluation may be the result of investors acting on insufficient data. Old-fashioned cropland valuation techniques are perhaps painting an inaccurate picture of the nation’s food production capacity, turning a “sure thing” investment into something less reliable than it once was. Data analytics can help correct the imbalance.


Save for a brief dip at the end of the Great Recession [1] in 2009, farm prices in the Midwest have steadily climbed year-over-year since the 1980s. This is not a particularly surprising result, as a rapidly growing global population – the world will have to feed two billion more people by 2050 [2] – guarantees a steady and strong increase in demand for agricultural products. Consequently, many institutional investors have seen farmland as the sort of reliable asset that can withstand all but the most severe economic downturns [3].

FALLING FARM VALUES

So why have farm values begun sliding now [4] in a time of relative prosperity?


Each percentage point shift in farmland value moves the national ledger to the tune of $26 billion. Photo Courtesy of 123rf.com | © Péter Gudella

Nationwide, the average cropland real estate value is $4,090 per acre [5], the lowest this figure has been since 2013. Surveys sent to farmers indicate that cash rental rates in Illinois, Indiana, Iowa and Wisconsin were down 7 percent in 2017 compared to the previous year [6]. Even prices for the highest quality farmland remained stagnant, while less attractive parcels have dropped in value.

Why should you care about how much someone pays for a few acres of land in Iowa? It matters because agriculture contributes nearly a trillion dollars to the U.S. gross domestic product [7]. Add up the value of all farmland (including the buildings that sit on it), and you are talking about $2.6 trillion in wealth [8].

So each percentage point shift in farmland value moves the national ledger to the tune of $26 billion. No wonder the Federal Reserve banks pay such close attention to farm values.

Economists at the Chicago Fed point to collapsing commodity prices as the prime culprit in the recent decline: “This downturn has hit the Midwest hard, as seen in lower farmland values and cash rental rates for cropland” [9]. Compared to two years ago, the inflation-adjusted price for wheat has dropped 23 percent and corn is down 8 percent. Soybean has nearly recovered and is down just 2 percent over this period.

For farmers operating on the edge of profitability, a commodity price downturn is devastating.


While it can mean the difference between success and failure for an individual grower’s operation, the land will outlast the commodity price trend. This suggests the measure of value for a long-term asset like land should not just be the short-term fluctuation in commodity price, but the land’s inherent productive capacity. But this should be determined with greater precision than can be had from a glance at the latest commodity futures price chart.

DATA ANALYTICS = MORE PRECISE VALUATIONS

The use of data analytics for more precise valuations could recapture billions in lost value. Land is, of course, the foundation of food production. Aside perhaps from fish, just about everything we eat depends on cropland, whether that means the field that grows corn for the supermarket or the soybean field that supplies the feed for cattle.

The United States is a net food exporter [10], and agriculture is the only major U.S. industry that has consistently enjoyed a trade surplus [11]. That is largely due to the Midwest serving as the world’s primary supplier of corn and soybean, a status that is not likely to change any time soon. We have the ideal climate, experienced farmers and the best technology. That links the demand for the output of American farms to global demand for food.


Despite fluctuations related to weather and stockpile levels, that global demand can only head in one direction: up. According to the United Nations, the world population will increase by 82 million within the next year [12]. That is the equivalent of adding an entire country the size of Germany to global demand. Now extend the time horizon to 16 years from now, and global producers will have to increase output to cover the needs of the population of India – over 1.2 billion people – more than the entire world population at the middle of the 19th century.

Of course, farmers going about their business are not making plans based on what they think might happen in the year 2033. They tend to move from harvest to harvest, taking into account factors such as market conditions, expected weather, cash on hand and how much a given field can produce in adjusting their choice of crop for planting. This is how they make the most of the supply and demand situation, and it is how they have done things for decades.

But modern operations research techniques have opened new opportunities for increasing the productive output of agriculture [13]. The same insights that increase the yield per acre of land can also provide the insights needed to improve land valuation.


Rather than just raising and lowering values based on market conditions in isolation, the worth of a particular parcel should be based on its expected productive output under the present market conditions. Data analytics can play a big role in enhancing not just the estimate of productive output, but also in increasing that productive output itself. For instance, if the price of corn stays the same over five years, or maybe even dips a bit, the value of the land would probably drop. But if yield were to increase enough to offset the price drop, the change in the land’s true value ought to be positive, not negative. Thus, the increased value one can expect from the land ought to be factored into the price.

Data analytics have only recently begun to play a part on the farm, thanks to the development of new technologies. Remote sensing and satellite technologies offer unprecedented insight into the layout of fields and their chemical and biological condition, all in real time. Combined with models of weather patterns, water, nutrient and other resource availability, it is possible to generate an expected range of output from a given field. That knowledge can be combined with forecasted commodity values, as well as property taxes and other regulatory costs, to achieve a far more precise expression of a particular field’s monetary value.
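One way to read “value the land on its expected productive output” is as the discounted stream of expected net income per acre. The sketch below uses invented yield, price and cost figures and a flat discount rate; it illustrates the idea in the text and is not the valuation model of any company mentioned in this article.

```python
# Illustrative per-acre projection for a corn field: expected yield (bushels),
# expected price ($/bushel) and production cost ($/acre) for each of five years.
# All numbers are made up for the sketch.
years = [
    {"yield_bu": 180, "price": 3.60, "cost": 520},
    {"yield_bu": 184, "price": 3.55, "cost": 525},
    {"yield_bu": 188, "price": 3.55, "cost": 530},
    {"yield_bu": 192, "price": 3.50, "cost": 535},
    {"yield_bu": 196, "price": 3.50, "cost": 540},
]
discount_rate = 0.06

# Present value of expected net income per acre over the horizon. A rising
# yield can offset a flat or slightly falling price, which is the point the
# text makes about productivity feeding back into land value.
pv_per_acre = sum(
    (y["yield_bu"] * y["price"] - y["cost"]) / (1 + discount_rate) ** (t + 1)
    for t, y in enumerate(years)
)
print(f"present value of net income per acre: ${pv_per_acre:,.0f}")
```

Swapping in remote-sensing-based yield estimates or different commodity price paths is then just a matter of changing the inputs, which is where the precision gain from field-level data would show up.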

FARM MANAGEMENT SOFTWARE

Already, some startup companies are integrating data-backed farm management software with systems that track farm values. The company Granular has raised $18.7 million [14] to develop the AcreValue software package, which tracks 40 million farm parcels in a database containing three years’ worth of land transaction records, plus public sources of crop rotation, soil and environmental data.

The software uses two different models to create its valuation estimates. The first considers the individual characteristics of a piece of land. “This controls why a parcel of land on one side of the road might have a very different value than a parcel of land on the other side,” the company explains [15]. The second model uses all of the available financial data affecting the overall market, including interest rates and commodity prices. Granular says the combination of the two models is what provides the most powerful estimate of land value.

These pioneers of data-backed valuation have also pointed toward many of the relevant factors that are not – at least for now – part of the analysis. There is the potential for commercial development, mineral rights, special leasing arrangements, soil erosion, the quality of tillage and drainage, and the history of past land use that may have contaminated the location.


Those are some of the big-picture data that would refine the overall market analysis, but more can and should be done to boost the precision of the first type of individual analysis. Unfortunately, that is much more easily said than done. Farmers would need to spend the up-front cash to install the latest remote sensing equipment. While many have done so, not all have been convinced of the value of data. Farming is an industry steeped in tradition, where it can be hard to break the old analog habits – why change what works?

Of course, show farmers the value that can be derived from data analytics systems, and the farmer’s practical nature turns the tide in your favor. Demonstrating a return on the up-front investment in technology systems is the key to convincing farmers of the value of operations research. In addition to showing that the productivity increases from data analytics will drive greater profits, we could also show that greater precision in data collection will result in more stable and higher land values as the long-term adoption rates of data-gathering technologies rise.

Ultimately, I believe this is where the agricultural industry is headed.


Operations research is the future of our industry, and that is why the data analytics community will be needed more than ever to help guide the way. ❙

Joseph Byrum, Ph.D., MBA, PMP, is senior R&D and strategic marketing executive in Life Sciences – Global Product Development, Innovation and Delivery at Syngenta. He writes about agricultural innovation. Connect on Twitter @ByrumJoseph. He is a member of INFORMS.

REFERENCES

1. https://www.federalreservehistory.org/essays/great_recession_of_200709
2. http://www.un.org/en/development/desa/news/population/2015-report.html
3. http://www.economist.com/news/finance-andeconomics/21637379-hardy-investors-areseeking-way-grow-their-money-barbarians-farmgate
4. http://www.kiplinger.com/tool/business/T019S000-kiplinger-s-economic-outlooks/; https://www.conference-board.org/data/usforecast.cfm
5. Page 4: http://usda.mannlib.cornell.edu/usda/current/AgriLandVa/AgriLandVa-08-05-2016.pdf
6. https://www.chicagofed.org/publications/agletter/2015-2019/may-2017
7. https://www.ers.usda.gov/data-products/ag-andfood-statistics-charting-the-essentials/ag-andfood-sectors-and-the-economy.aspx
8. Page 17: http://usda.mannlib.cornell.edu/usda/current/AgriLandVa/AgriLandVa-08-05-2016.pdf
9. https://www.chicagofed.org/events/2016/agconference
10. https://www.ers.usda.gov/data-products/foreignagricultural-trade-of-the-united-states-fatus/usagricultural-trade-data-update/
11. https://www.usitc.gov/research_and_analysis/trade_shifts_2014/us_trade_by_industry_sector.htm
12. https://esa.un.org/unpd/wpp/Download/Standard/Population/
13. https://www.informs.org/ORMS-Today/PublicArticles/June-Volume-42-Number-3/EdelmanAward-Syngenta-earns-2015-INFORMS-EdelmanAward
14. https://techcrunch.com/2015/07/22/granular-rakesin-18-7-million-to-manage-big-farms/
15. https://www.acrevalue.com/faq/#valuationestimates



• Network With Your Professional Peers and Those Who Share Your Interests
• INFORMS Connect, Our Online Community, Helps You Network With Your Colleagues Quickly
• INFORMS Communities and Meetings Provide Unsurpassed Networking Opportunities
• INFORMS Certification for Analytics Professionals (CAP®)
• Build Your Professional Profile With a Leadership Role in INFORMS
• INFORMS Career Center Provides You With the Industry’s Leading Job Board

Join Today! http://join.informs.org


SOFT SKILLS

Data storytelling
No more ‘criticism sandwiches’: A new framework for getting feedback.

BY ESTHER CHOY

The clock was ticking. Nitin had 24 hours before an important presentation. In it, he would summarize data from a recent project to persuade his boss to act on his recommendations. But without much time to make the presentation the best it could be, Nitin was getting frustrated with his colleagues Gwen and Alan, who had volunteered to give him feedback on a practice session. Their feedback felt nitpicky and personal.

“I don’t like the way you transition between slides,” Gwen said. “It’s weird for it to take so long. Speed it up.”

“You’re speaking at a good pace,” offered Alan, “but I really don’t like your choice of font on the deck.”


This is going nowhere, thought Nitin, racking his brain for a polite way to end the unprofitable session.

WHY FEEDBACK GOES AWRY

Asking for feedback doesn’t always end well – even when your test audience fully intends their feedback to be helpful. Since feedback can turn personal and unproductive, it’s clear that we need a new feedback framework. The usual model starts with what the critic liked or didn’t like, generally delivered in the form of a “criticism sandwich”:
• Start positive
• Deliver the bad news
• End positive


What do you want your audience to remember after they hear your presentation – even if they forget everything else? Photo Courtesy of 123rf.com | © kasto

The “like and dislike” framework is subjective. How do you know your real audience will have the exact same pet peeves your mock audience did? And delivering feedback in a “criticism sandwich” makes the recipient brace for the distasteful layer of bad news in the middle, and not enjoy the “bread” of good news on the outside. There’s a better way – a new framework for getting feedback on how well the stories you tell around data are really working to communicate that information.

MAKE A WISH LIST

In my forthcoming book “Let the Story Do the Work” [1], I emphasize the importance of doing essential prep work before you seek feedback. If you start with a presentation that is on target, you won’t have to do as much work afterward.

So, make yourself answer this question: What do you want your audience to remember after they hear your presentation – even if they forget everything else? Start with the end in mind! This has kept my clients (and me) from generating draft after draft that doesn’t get to the point.

Keep your “wish list” to three points, and keep those points to 10 words each (or fewer). For instance, a presentation on applying data analytics to real estate [2], given to an audience of Realtors, could have as its wish list:

1. Analytics can flag suspicious patterns and protect homeowner/Realtor (9 words).
2. Data analysis makes realty transactions more personalized – not less (9 words).


3. Data analysis gives Realtors more control over decision-making (9 words).

Often, you can weave these points into a three-act formula for structuring stories (more on that in “Let the Story Do the Work”).

KNOW YOUR AUDIENCE

When you want these three points to resonate, it’s essential to understand your audience. Indeed, understanding your audience is essential to any kind of communication. Here are five audiences you are likely to encounter during data presentations [3].

1. Intelligent outsiders: While these “outsiders” won’t have in-depth training in data analytics, they are, nonetheless, intelligent, and oftentimes are well-educated and demanding audience members who are familiar with your industry and do not appreciate material being dumbed down. For example, financial advisors sit through new product presentations by asset management firms. While they have gone through extensive training and passed rigorous licensing exams, they do not manage assets, are not portfolio managers and may not understand the complex valuation models asset-management firms use to curate investment products.


2. High-level cross-functional colleagues: These are peers from other departments who contribute different expertise but who are familiar with your topic. They seek more refined understanding and especially knowledge about how your topic could impact their areas.

3. Your boss: This is your direct manager, the person who not only has to understand but also stand by your work. This is the person who will forward your recommendations to higher-ups as if those recommendations had originated from her. In short, the boss may well be taking a chance on her career based on your work. Therefore, she would like to have “in-depth, actionable understanding of intricacies and interrelationships with access to detail” [4].

4. The head cheese(s): Your manager’s managers (or even higher). These important executives are very busy and must make numerous important decisions on a daily basis. Because of this, they prefer, and often require, conciseness, and they may need to be reminded why someone is presenting on a given topic and which important decision it relates to.

5. Fellow experts: Especially in academia, think tanks or research organizations, it is possible that those in the audience seats are fellow experts who know just as much about your topic as you do, if not more.


In this case, explanation, especially in the form of storytelling, takes a back seat. Instead, this audience may prefer to explore and even critique your methodologies and results.

SEPARATE YOUR MOTIVATIONS FROM YOUR AUDIENCE’S

Before we can truly connect with any of these five audiences, we have to admit that what motivates us as presenters is not what motivates our audience. As presenters, we want to impress and prove our value.

We obsess over whether the presentation shows how knowledgeable and qualified we are. But whenever we’re members of an audience, our motivations are totally different. When was the last time you sat down in a lecture hall or conference room and thought, “Boy, I really hope this presenter doesn’t screw up. I hope they don’t stumble over any words. I hope they really prove that they’re qualified.” Instead, the thought process is more like: “What will I learn that will improve my life? I know this person is an expert – but will he bore me to tears?”

VISIT THE UPDATED INFORMS VIDEO LIBRARY & WATCH CASE STUDIES OF AWARD-WINNING ANALYTICS PROJECTS, SUCH AS:
• EDELMAN WINNER: Revenue Management Provides Double-Digit Revenue Lift for Holiday Retirement
• Implementation of Platform-Based Product Development at Barco
• The DICE Simulation Model Unlocks Significant Value for a Large Greenfield Mining Project
• A Novel Movement Planning Algorithm for Dispatching Trains
• The Off-Hours Delivery Project in New York City
• American Red Cross Uses Analytics-Based Methods to Improve Blood Collection Operations
• George D. Smith Prize, 2017 Winner
• Special Panel Session: Supply Chain in the Age of Drones and Self-Drive Vehicles
• And many more!

https://www.informs.org/Resource-Center/Video-Library






Having this self-awareness to acknowledge your own worries and then shift into the audience’s point of view is an essential part of prep.

HOW TO INTERACT WITH YOUR MOCK AUDIENCE

Once you’ve assembled your mock audience, tell them about your target audience. Which of the five categories do they fall into? What do they worry about at work?

Second, follow the CLEAR framework:

Clarify your intentions and goals to set parameters for the feedback. For example, if your intent is to explore whether a story idea is compelling, ask for that. Otherwise, others may automatically focus on your grammatical issues!

Listen and take notes. Understandably, getting feedback can make the most confident people self-conscious. And when we are self-conscious, we don’t pay attention to others and their messages as much as we should. So, to make sure that all feedback is captured, record it in writing.

Evaluate feedback in 24 hours. Stepping away from the feedback will give you a whole new perspective and appreciation.

Ask questions. Feedback giving and receiving is about having a dialog. So instead of taking feedback as-is, ask questions to clarify. Here are three important questions to ask:


What fact(s) can they recall? You may have your own sense of what your audience should pay attention to. But only from their feedback will you really see what is “sticky.”

How does the presentation make them feel? The ultimate goal is to drive change and prompt action. Without understanding the emotional impact your presentation has, it is hard to gauge how effective your presentation is.

What action, if any, would they be likely to take after listening to your presentation?

Resist the urge to defend. It is nearly impossible to perfectly align our intention with our action all the time. You may feel others have gotten it wrong and feel the need to defend yourself. Please resist this temptation. It will easily shut down the feedback and end the dialog. Instead, say thank you and ask if you could come back and ask more questions once you have a chance to evaluate the feedback after 24 hours.

With this new feedback framework behind you, you can be confident your next data storytelling presentation will truly connect with the people who matter most. ❙

Esther Choy (esther@leadershipstorylab.com) is the president and chief story facilitator of the business communication training and consulting firm Leadership Story Lab. Her book, “Let the Story Do the Work” (published by AMACOM), is available for pre-order on Amazon. This article contains an excerpt from the chapter entitled, “Telling Stories with Data.”



DATA ANALYTICA CEREBRUM understanding the underlying methodology and mindset of how to approach and handle data is one of the most important things analytics professionals need to know. informs intensive classroom courses will help enhance the skills, tools, and methods you need to make your projects a success.


UPCOMING CLASS:

essential practice skills for high-impact analytics projects september 26–27, 2017 | 8:30am–4:30pm ama washington d.c. area executive conference center arlington, va

limited seating available. Register at www.informs.org/continuinged



DATA ELEMENTS

Health data things: monetizing IoT & health apps

Time present and time past
Are both perhaps present in time future
And time future contained in time past.
– “Four Quartets,” T.S. Eliot

BY AARON LAI

The Internet of Things (IoT) is considered to be the next revolution that touches every part of our daily life, from restocking ice cream to warning of pollutants. Analytics professionals understand the importance of data, especially in a complicated field such as healthcare. This article offers a framework for integrating different data sources, as well as a way to unleash the full potential of data to estimate customer lifetime value (CLV).


The ultimate goal – monetizing the value of “data as data” – is one of the few things that work against the Law of Diminishing Marginal Return. We’ll illustrate the concept with a real-time biometrics monitoring device and associated mobile phone apps.

BIG LITTLE DATA: VARIETY, VELOCITY AND VOLUME

Big data is generally understood as the 3Vs: variety (different data nature), velocity (rapid arrival of data) and volume (massive quantity of data).


Healthcare is a venue for big data, given the volume of data from EHRs (electronic health records) and genomic data. Healthcare also has very rapid data; a pacemaker can generate heartbeat data in real time. Meanwhile, machine-generated data, user-inputted data (e.g., patient-reported outcomes) and observation data (e.g., prescription notes) add considerable variety to healthcare data.

Big data provides big opportunities for the healthcare industry. Previous efforts in terms of health devices and health apps (for mobile phones) have mostly focused on data collected via those machines, as well as on the business model that could be modified based on data from these and other data sources, both internal and external.

TYPES OF DATA ELEMENTS

Data sources can be grouped by their nature.

Membership data. This includes user information, service utilization (e.g., call center), user-reported data (e.g., patient-reported outcomes such as mood), system-recorded data (e.g., login IP, mobile model) and other information collected from or supplied by the users. Other than restrictions imposed by regulations, the company is free to use this data.

Physically recorded data. This refers to the data collected by devices, including biometrics, heartbeat, blood pressure, blood glucose, physical movement, etc.

Depending on the device, some data is highly accurate (e.g., blood glucose) and some data is just “directionally correct” (e.g., sleep tracker). It is important not to over-interpret the less reliable data.

Third-party data. Many “data brokers” provide individual- or household-level data for further analysis or marketing purposes. A typical data set could easily reach thousands of attributes and millions of records. This data could be used for prospect acquisition and data enrichment.

Clinical data. This data usually resides in the EHR system hosted by the payer (i.e., insurance company), the provider (i.e., hospital) or the pharmacist. With the permission of the patients, the company can combine the data to uncover previously unknown relationships or to serve the users better. For example, suppose someone has a cardiovascular disease and is taking a particular type of drug. The activity tracker could monitor and report her physical activity level, while a patient-reported outcome app could account for mood change. It would then be possible to correlate the impact of the drug on the level of her physical activities, as well as its effect on her mood.

Competitive data. Many vendors sell a wide variety of competitive or market-related data. For example, some sell anonymized prescription data while others sell market insights.


The data comes from industry survey results and other means.

Public data. The most obvious such source is census data. Other government agencies release data such as hospital discharge rates, disease data (Centers for Disease Control and Prevention) and physician data (Centers for Medicare & Medicaid Services).

Combining all this data provides clinical insight, behavioral insight and technical insight that can be fed into the R&D, diagnosis and prescription work. Eventually, this process can act as a feedback loop to the user database.

Figure 1: Integrated data flow.

CIRCLE OF LEVELS: MULTI-LEVEL MODELING

In statistics, multi-level modeling is used to estimate the parameters of a model if the explanatory variables have both individual and group data.


In the following example, we use the same concept but in a slightly different context. Suppose we have all the data directly collected from the devices and apps as described above for a particular person. At that point, we know the user, and we can add third-party data and clinical data, providing a “full” information set for that individual.

The next step is to integrate the anonymized data such as third-party prescription data, third-party procedure data, etc. The true identity of the person is not known since some of the demographics have been removed or masked due to privacy concerns. However, we could then use a simple Bayesian approach to estimate the probability that this is a similar enough match for analytics purposes. It would be a group-based match, i.e., the likelihood that this known user will share the behavior of this type of anonymous people. It can be written as:


Probability (this user shares the same characteristics of this group of anonymous people) =
Probability (this user shares the same characteristics of this group of anonymous people, given the pre-defined characteristics) × Probability (pre-defined characteristics)


For example, you may have some users who take a specific type of high blood pressure drug and a specific type of diabetes drug for a certain period of time.

You may then be able to extract a population from the third-party prescription database based on those two characteristics. You can then estimate the likelihood that the two groups are the same and set a subjective confidence cut-off for matching. It should be noted that the final likelihood also depends on the probability of occurrence. Since this is a binary outcome, you can estimate it with logistic regression or other techniques such as decision trees or support vector machines. Once you are satisfied with the model, you can assign those known users the additional information obtained from the prescription database.
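Here is a minimal sketch of that matching step. It assumes scikit-learn is available, uses made-up features (two drug flags and months on therapy), and trains on pairs whose match status is already known before applying a subjective probability cut-off, as described above. A decision tree or SVM could be dropped in at the same point.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Made-up training data: each row describes a (known user, anonymous record)
# pair via shared pre-defined characteristics, e.g.
# [takes_bp_drug, takes_diabetes_drug, months_on_therapy].
# y = 1 if the pair was confirmed to represent the same kind of member, else 0.
X_train = np.array([
    [1, 1, 24], [1, 1, 30], [1, 0, 12], [0, 1, 18],
    [0, 0, 6],  [1, 0, 3],  [0, 1, 2],  [1, 1, 20],
])
y_train = np.array([1, 1, 0, 0, 0, 0, 0, 1])

model = LogisticRegression().fit(X_train, y_train)

# Score new known-user / anonymous-group pairs and apply a subjective cut-off.
X_new = np.array([[1, 1, 26], [1, 0, 8]])
match_probability = model.predict_proba(X_new)[:, 1]
cutoff = 0.8  # confidence threshold chosen by the analyst

for prob in match_probability:
    decision = "attach prescription-based attributes" if prob >= cutoff else "leave unmatched"
    print(f"match probability {prob:.2f} -> {decision}")
```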


Another type of data is aggregate data. Much government data and competitive data is in aggregated form, usually geographically based (e.g., county level). We can follow the aforementioned approach to estimate the likelihood that a group of known users should share the same information as another group from another data source. Of course, we are not saying that everyone in Orange County, Calif., is the same. But we could argue that users in Orange County will resemble other users in Orange County more than they will resemble those in Cook County, Ill. The same Bayesian estimation could be used.

ESTIMATING THE CUSTOMER LIFETIME VALUE

The journey of a customer can be illustrated with the diagram shown in Figure 2.

Figure 2: Customer lifecycle.


The customer lifecycle starts with a prospect, someone who will potentially use your product. Once a prospect contacts you, she becomes a lead. If she has bought (or used) your product, she is now a customer. She will then experience your product and services, and those customer experiences will affect her likelihood to continue. When the renewal moment comes up, she could follow the attrition path to leave or the renewal path to stay. A former customer could also be won back and become a customer again.

Companies often want to know how much they should spend to acquire a customer. One approach is to calculate the customer lifetime value (CLV). If the cost per acquisition (CPA) is lower than the CLV, then this is a positive investment. An equivalent way is to calculate the return on investment (ROI), which is defined as CLV/CPA − 1. A positive ROI means a positive investment.

However, estimating CLV is not a simple task. As we can see from Figure 2, a customer can follow many paths, and many people use the average tenure as a shortcut.


Here we propose a more systematic and data-driven approach to CLV estimation. If we look closer at the customer lifecycle diagram, it is obvious that each path could be considered in a probabilistic way. For example, the transition from prospect to lead is governed by the probability that a person will respond to a solicitation. In other words, this is the response rate of a marketing campaign.

Figure 3: Sample customer lifecycle process.

We can look at this problem using a Markov chain approach. Table 1 is a sample transition matrix, i.e., the probability matrix that shows how people could move from one stage to another. For the most accurate results, it should be person-specific and time-specific. To estimate the equilibrium (or long-term) transition probabilities, we multiply the transition matrices together. Therefore, any errors will be amplified through the multiplications.

The CLV will then be the total revenue and cost at each stage for each person over his or her complete tenure. The value of the additional data that is collected could be estimated by how the improved accuracy will increase CLV. You can also perform if-then analysis using this framework to estimate the value of a new feature for the medical device.

From \ To     Prospect   Lead   Customer   Attrition
Prospect         50%      50%       0%         0%
Lead             20%      30%      50%         0%
Customer          0%       0%      80%        20%
Attrition        10%      10%      10%        70%

Table 1: Sample transition matrix for a certain person at a certain time.
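A minimal sketch of how Table 1 could feed a CLV estimate follows. It assumes a monthly time step, invented per-stage net margins and a flat discount rate; the transition matrix is the one from Table 1, which in practice would be person- and time-specific as noted above.

```python
import numpy as np

# Transition matrix from Table 1 (rows: from-state, columns: to-state),
# states ordered as Prospect, Lead, Customer, Attrition.
P = np.array([
    [0.50, 0.50, 0.00, 0.00],
    [0.20, 0.30, 0.50, 0.00],
    [0.00, 0.00, 0.80, 0.20],
    [0.10, 0.10, 0.10, 0.70],
])

# Invented monthly net margin (revenue minus cost) earned while a person sits
# in each state; only the Customer state is profitable in this toy example.
margin = np.array([-2.0, -5.0, 40.0, 0.0])
monthly_discount = 0.01
horizon_months = 60

state = np.array([1.0, 0.0, 0.0, 0.0])  # start as a Prospect
clv = 0.0
for t in range(horizon_months):
    clv += (state @ margin) / (1 + monthly_discount) ** t
    state = state @ P  # multiply by the transition matrix, one step at a time

print(f"estimated CLV over {horizon_months} months: {clv:.2f}")
```

Raising the matrix to a power (np.linalg.matrix_power) gives the long-term transition probabilities mentioned above, and swapping in a different matrix or margin vector supports the if-then analysis of a new device feature.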


Suppose your medical device can give a five-minute warning of an imminent heart attack. Then you can calculate the value of this feature via a longer tenure (since the patient will live longer), potential revenue from higher reimbursement from insurance companies (the payers) or higher sales. The difference between this new value and the status quo will be the value of this feature. This can also be used to estimate the fair price in a value-based contract or value-based reimbursement.

Another way to look at it is from a real-option framework. The availability of the data enables us to do things: if you neither collect nor acquire the data, you cannot have that feature at all – even though, with the data in hand, you still might not be able to make the feature work. The cost of the data is the option premium you pay for the chance to do something more. This is also related to how you can monetize the data. For instance, probably few people will be interested in buying the demographics of your user profiles. However, many people will be interested in buying a lead list of people who are physically active and are taking a particular class of drug. As a result, your user list is suddenly worth a lot more to advertisers.


By the same token, insurance companies may not be willing to pay for your medical device if they are skeptical about the value you add. But if you can prove that your device can monitor patients’ health properly and get them to take proper action to avoid adverse and costly events, e.g., hospital admission, then you can build a better case for payment or reimbursement. This is the idea behind real-world outcome research.

CONCLUSION: WE MUST PART NOW

Data is the beginning and the end. The IoT provides unprecedented big data and big opportunities for those who can appreciate it. We have explained what data one could and should collect, illustrated the issues and techniques in integrating those data, described the customer lifetime value (CLV) calculation, and extended the framework to accommodate strategy development. The past and future are both fixed on the present; one has to seize the opportunities presented to unleash the value of data – and the monetary value of data. ❙

Aaron Lai (aaron.lai@st-hughs.oxon.org) is the senior manager of analytics for Blue Shield of California. He also serves on the advisory board of the Business Information and Analytics Center of Purdue University. He has bachelor’s degrees in finance from City University of Hong Kong and management studies from the University of London, an MBA from Purdue and master’s degrees from Oxford in sociology and evidence-based healthcare. All opinions expressed in this article are his own and do not necessarily reflect those of his employer or his affiliations.



2017 INFORMS ANNUAL MEETING OCTOBER 22–25 | HOUSTON, TEXAS

Join us in Houston, the healthcare and energy center of the U.S., for a unique opportunity to connect and network with 6,000 of your colleagues who compose the INFORMS community. Listen to intriguing plenaries, panels, and select sessions of interest from the 100s of tracks. Sessions will include topics such as:
• Making Decisions Considering Uncertainty
• Internet of Things: Promises, Challenges, and Analytics
• Data Mining in Churn Decision Analytics
• New Advancements in Using Data Analytics for Healthcare Applications
• Retail Analytics
• Analytics in Medical Decision Making and Population Health
• PMU-based Data Analytics and Machine Learning in Power Grids
• Food Supply Chain Analytics
• Cyber Analytics and Optimization
• Data-Driven Approaches to Predictive Analytics
• Application of O.R. and Analytics in the Energy Sector
• Revenue Management with Consumer Analytics
• Psychology Neural Cognitive Computing
• Sports Analytics
• Large-Scale Data Analytics in Urban Transportation Modeling

REGISTER TODAY
http://meetings.informs.org/houston2017

IMPORTANT NOTICE:

Presenters not registered by September 1 will be removed from the conference program.



CONFERENCE PREVIEW

Houston, we solve problems
2017 INFORMS Annual Meeting prepares for launch in ‘Space City’

With a little of something for everyone, Houston lends itself as the perfect conference location.


The 2017 INFORMS Annual Meeting will be held in Houston, Texas, on Oct. 22-25. Houston is world renowned for its economy based in aeronautics, energy, manufacturing and transportation. Houston’s energy industry is recognized worldwide for its renewable energy sources, including wind and solar power. Houston is the most diverse city in Texas and has a large and growing international community. Because it is home to the NASA Lyndon B. Johnson Space Center and has strong ties to the aeronautics industry, it has earned the nickname “Space City.” With a little of something for everyone, Houston lends itself as the perfect location for the next INFORMS Annual Meeting.

The meeting will take place at the George R. Brown Convention Center and the Hilton Americas, with all technical sessions taking place at the Convention Center.



The 2017 INFORMS Annual Meeting will light up Houston in October. Source: Thinkstock

INFORMS also has group rates at the Marriott Marquis Houston and the DoubleTree by Hilton Houston Downtown. Only a limited number of rooms are blocked and they will sell out quickly, so please make your reservations as soon as possible.

PRE-CONFERENCE WORKSHOPS

This year INFORMS is introducing a new pre-conference workshop in conjunction with the Annual Meeting. The Academic Leadership Workshop will take place on Oct. 21, the day before the start of the Annual Meeting.


The all-day event is designed for faculty of all ranks with an interest in every level of academic leadership. The workshop provides information that can be useful for becoming efficient and effective leaders. Panel speakers are highly visible, world-class current and former academic administrators. Participants will have an opportunity to ask questions and network with peers in academic leadership positions. All participants must be nominated by a department head/chair or college dean.

Another pre-conference workshop being introduced this year is the INFORMS Workshop on Data Science.


Sponsored by the INFORMS College on Artificial Intelligence, this workshop is a premier research event dedicated to developing novel data science theories, algorithms and methods to solve challenging and practical problems that benefit business and society at large. The conference invites innovative data science research contributions that address business and societal challenges from the lens of statistical learning, data mining, machine learning and artificial intelligence. Contributions on novel methods may be motivated by insightful observations on the shortcomings of state-of-the-art data science methods in addressing practical challenges, or may propose entirely novel data science problems. Research contributions on theoretical and methodological foundations of data science, such as optimization for machine learning and new algorithms for data mining, are also welcome.

NETWORKING

Along with the abundance of educational opportunities, the conference will offer several opportunities for connecting and networking. The Welcome Reception will be held on Oct. 22. Subdivision meetings will be held predominantly on the evening of Oct. 23.


On Oct. 24, INFORMS will host the General Reception at Minute Maid Park, home of the Houston Astros.

Another unique networking opportunity for student members is the Coffee with a Member program. This wonderful program connects INFORMS students with some of INFORMS’ most enthusiastic members for 15-minute impromptu meetings and some sage INFORMS advice. We know the Annual Meeting can be a bit overwhelming and hope these casual meetings will make students more comfortable, knowledgeable and enthusiastic about both the meeting and INFORMS. Space is limited and open to first-time attendees/participants only. Students may enroll for this program when they register for the meeting.

CAREER CENTER

A huge benefit of the INFORMS Annual Meeting for employers and job seekers is the INFORMS Career Center. The Career Center and its activities provide employers with the opportunity to meet and collect resumes from numerous job seekers in a short period of time, early in the meeting, and to schedule and set up private interviews later in the meeting. Career Center activities are free for all individual meeting registrants.

Job seekers should register in the INFORMS Career Center so employers will know you are attending.


Be sure to post your resume or an anonymous career profile that will lead employers to you. Job seekers should review the Career Center resources prior to attending the job fair to make sure their resumes and interviewing skills are in tip-top shape.

We hope you will attend this unique opportunity to connect and network with more than 5,000 INFORMS members, students, prospective employees, and academic and industry experts. We look forward to seeing you in Houston!

For more information on any of the events listed in this article or other activities at the Annual Meeting, visit www.meetings.informs.org/houston2017.

Request a no-obligation INFORMS Member Benefits Packet For more information, visit: http://www.informs.org/Membership

JOB SEEKERS:

Find Your Next Career Move INFORMS Career Center contains the largest source of O.R. and analytics jobs in the world. It’s where professionals go to find the right job in industry or academia and where employers go to find the right talent.

CAREERCENTER.INFORMS.ORG

JOB SEEKER BENEFITS:
» POST multiple resumes and cover letters, or choose an anonymous career profile that leads employers to you
» SEARCH & APPLY to hundreds of O.R. & analytics jobs on the spot by using robust filters
» PERSONALIZED job alerts notify you of relevant job opportunities right in your inbox
» ASK the experts advice, resume critiques & writing, career assessment test services & more!

www.informs.org



1-800-446-3676




CONFERENCE PREVIEW

INFORMS Healthcare 2017: Optimizing operations & outcomes
Biennial conference set for Rotterdam, Netherlands, on July 26-28.

BY EDWIN ROMEIJN AND JORIS VAN DE KLUNDERT

Since the first edition in 2011, the biennial INFORMS Healthcare Conference has quickly developed into the leading global conference in its field. This is further underlined by the decision of the INFORMS Health Applications Society to cross the ocean for the fourth edition, which will be held in Rotterdam, Netherlands, on July 26-28. The Netherlands is widely renowned for its world-class health system, which has topped the European Consumer Health Index since 2008. While Rotterdam may be best known for being the world's largest port for more than four decades (1962-2004), it is presently developing into a modern health and life sciences hot spot. It is already home to an internationally oriented health and life sciences community of almost 70,000 people and 4,000 companies. According to Lonely Planet, "this metropolitan jewel of the Netherlands is riding a wave of urban development, redevelopment and regeneration." The largest academic hospital in the Netherlands, Erasmus Medical Center, and Erasmus University are also located in Rotterdam.

As a global logistics hub and health and life sciences hot spot, Rotterdam forms a perfect location for the INFORMS Healthcare 2017 conference focused on "optimizing operations & outcomes." The conference offers a platform for the INFORMS community to take the next step on the path of optimizing health through advancing the theory and practice of operations research, management science and analytics. The conference will offer a wide selection of contributions from the United States and Europe (in particular, the Netherlands), as well as from developing countries where outcome improvements are needed most. To this purpose, the conference covers areas such as disease and treatment modeling, personalized medicine, medical decision-making, healthcare analytics and machine learning, health information technology and management, health operations management, health and humanitarian systems, disparities in health and global health, and public health and policy-making.

The program will include plenary lectures from leading scientists in the field, including Dimitris Bertsimas of MIT and INFORMS President Brian Denton of the University of Michigan. Other plenary lectures will be delivered by health system leaders such as Dr. Eric de Roodenbeke, president of the International Hospital Federation, who will present his view on how information technology is driving the transformation of hospitals. In addition, Secretary General Erik Gerritsen of the Dutch Ministry of Health, Welfare & Sports will address the Dutch view on health system advancement, with a special emphasis on the role of information technology and big data. On behalf of the European Commission, Gisele Roesems-Kerremans of the Unit of Well Being will outline European policies and research priorities.

The Markthal (Market Hall) is an example of Rotterdam's fine architecture. Source: Rotterdam.Branding. Photographer: Ossip van Duivenbode




Of course, the conference is primarily a place for the many participants to meet, present and discuss their work. In addition to the poster presentations, there will be more than 100 sessions and 400 talks. Discussions may continue in the open spaces of the downtown conference center De Doelen, on the terraces of Rotterdam during lunch, dinner or nightlife, or at the conference reception on a historic paddle-wheel steamer cruising the Port of Rotterdam.

On the day before the conference starts, the conference offers organized visits to Dutch healthcare best practices. For example, it offers site visits to: the newly designed Erasmus Medical Center, the leanest hospital in the Netherlands; Reinier de Graaff Gasthuis in Delft; and the Dutch Healthcare Quality Institute, the hub of the Dutch health data network. For the day after the conference, attendees can take advantage of organized sightseeing tours to nearby sites (the UNESCO world heritage site Kinderdijk, the Delta works) or cities (Amsterdam, Gouda), before some may catch a plane, train or boat to Bath (United Kingdom), where the EURO Working Group on Operational Research Applied to Health Services (ORAHS) will hold its annual meeting starting July 30. This is yet another good reason to join us in optimizing operations and outcomes from July 26-28 in Rotterdam.

For more information, visit the conference website: http://meetings2.informs.org/wordpress/healthcare2017/ ❙

Edwin Romeijn is the program chair and Joris van de Klundert is the conference chair of INFORMS Healthcare 2017.

Winter Simulation Conference

The 2017 Winter Simulation Conference (WSC 2017) will be held Dec. 3-6 in Las Vegas at the Red Rock Resort. WSC is celebrating 50 years. In addition to a technical program of unsurpassed scope and quality, WSC is the central meeting place for simulation practitioners, researchers and vendors working in all disciplines in industry, service, government, military and academic sectors.

Submissions are still being accepted for case studies, poster sessions, the Ph.D. Colloquium and the vendor track. Complete paper deadlines and requirements are available at www.wintersim.org.

WSC 2017 is sponsored by ACM/SIGSIM, IISE (Institute of Industrial and Systems Engineers), INFORMS-SIM and SCS (Society for Modeling and Simulation International), with technical co-sponsorship from ASA (American Statistical Association), ASIM (Arbeitsgemeinschaft Simulation), IEEE/SMC (Systems, Man and Cybernetics) and NIST (National Institute of Standards and Technology).

For more information, visit the WSC website: http://meetings2.informs.org/wordpress/wsc2017/



SPECIAL ADVERTISING SECTION

CLASSIFIEDS

Disney

At Walt Disney Parks and Resorts, we have multiple teams analyzing all forms of data to directly impact the business. Every day we use data to bring stories to life and influence the future of guest experiences across the globe. Imagine being part of a collaborative team leading worldwide revenue strategies, impacting growth through comprehensive business and consumer insights to drive innovative analytical solutions. A team that implements and leverages new technological systems, while managing dynamic, multi-faceted relationships and partnerships across a global company. In a rapidly changing consumer and market environment, with ever-increasing touch points and data, our emphasis is on leveraging innovative and optimal solutions to provide actionable insights. Join one of our innovative, diverse teams to drive business results and strategies using advanced analytics to help tell our story at Walt Disney Parks and Resorts.

OUR DATA ANALYTICS TEAMS INCLUDE:

• REVENUE MANAGEMENT & ANALYTICS
• CONSUMER INSIGHT MEASUREMENT & ANALYTICS
• EXPERIENCE INSIGHTS & ANALYTICS
• MERCHANDISE PLANNING & INSIGHTS
• WORKFORCE MANAGEMENT & DECISION SUPPORT

Parks and Resorts has more than 30 resort hotels, navigates 4 cruise ships and fills over 37,000 rooms daily.

WALT DISNEY WORLD® RESORT | 18 RESORT HOTELS – 23,000 ROOMS

HONG KONG DISNEYLAND RESORT | 2 HOTELS – 1,000 ROOMS

DISNEYLAND® RESORT | 3 HOTELS – 2,400 ROOMS

SHANGHAI DISNEY RESORT | 2 HOTELS – 1,220 ROOMS

DISNEYLAND® RESORT PARIS | 7 RESORT HOTELS – 5,800 ROOMS

DISNEY CRUISE LINE® | 4 SHIPS – 4,274 STATEROOMS

BE A PART OF OUR STORY

Disney Careers

DisneyDataJobs.com/Informs


EOE • DRAWING CREATIVITY FROM DIVERSITY • ©DISNEY

The Walt Disney Company



FIVE-MINUTE ANALYST

Rainfall and reference years
The original question – how to incorporate data from years that include "leaps" – started us down an interesting path.

BY HARRISON SCHRAMM

This installment comes from a discussion I've been having with longtime friend and U.S. Naval Academy classmate Cara Albright. Her problem revolves around determining the "most representative" year of precipitation (rain) data from a large set. The original question – how to incorporate data from years that include "leaps" (i.e., Feb. 29) – started us down an interesting path. This is a fun story about collaboration and thinking about problems. To make this concrete, consider a graph of two separate years' raw rainfall data (Figure 1). From this plot, it is unclear what the best method for measuring the "distance" between these two years would be. One current approach to this problem is to measure the similarity of the years "pointwise." Now, those of us who have been alive for a few years (or have seen "The Pirates of Penzance") know that not every year is the same; most years have 365 days, but roughly a quarter have 366. The approaches to dealing with the problematic Feb. 29 are:
a. Ignore it, thus throwing away ~0.3 percent of the data.
b. Lump it in with March 1.


Neither of these is particularly satisfactory to us. Instead of trying to measure the distance pointwise – which is highly sensitive to hourly and daily "breakpoints" – we propose to measure the difference between cumulative precipitation curves, normalized to 365 days (and thus overcoming the leap year problem). To measure the difference between years, we follow a simple process of renormalizing the data to a 365-day "standard year." We then sum the squared differences between the two years. For those who prefer math over words, we do this:
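In symbols (our notation, offered as a sketch of what the text describes rather than the column's exact formula): letting $C_y(t)$ denote year $y$'s cumulative rainfall interpolated onto day $t$ of the 365-day standard year, the distance between years $y_1$ and $y_2$ is

$$ d(y_1, y_2) = \sum_{t=1}^{365} \bigl( C_{y_1}(t) - C_{y_2}(t) \bigr)^2 . $$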

Figure 1: Graph of two separate years' raw rainfall data.

The year with the minimum summed distance over all other years is the "reference" year.

APPLICATION

Our current data set consists of 100 years of rainfall data from Philadelphia, as shown in Figure 2. We determine the "representative year" starting in 1965 and running to the year chosen; in other words, the 1993 point is 1989-1993, 2000 is 1989-2000 and so on. Using this "moving right-hand reference approach," we see the years chosen as depicted in Figure 3.


Figure 2: Cumulative rainfall over 25 years.

Figure 3: 1973 chosen as the most frequently representative year.

1973 is chosen as the most frequently representative year, based on minimum distance and a normalized year length. One might argue, as we did, that this approach tends to favor years whose total rainfall is closest to the average. To overcome this minor difficulty, we may simply normalize the rainfall over the year as well, scaling each year's total to 1. This "variance only" approach produces the graph shown in Figure 4, which tends to favor 1956 and, later, 1991 as representative years. A plot of these two candidates is shown in Figure 5.

Figure 4: Variance only approach.

Figure 5: Plot of two candidates.

In conclusion, we have applied a few more than five minutes' worth of analysis in this installment. What is more important than the results is that the basic ideas of calculus and statistics, which we don't always use every day, continue to echo in practice far beyond our basic schooling.

Technical note: This analysis made ample use of the R base function approxfun(), which interpolates between values of a given empirical data set. This made numerical integration quite straightforward. ❙

Harrison Schramm (Harrison.schramm@gmail.com), CAP, PStat, is a principal operations research analyst at CANA Advisors, LLC, and a member of INFORMS.
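As a concrete illustration, here is a minimal R sketch of the procedure; the helper names, the data layout (a named list of daily rainfall vectors) and the simulated example are our own assumptions, not the column's actual code. It relies on approxfun() for the interpolation mentioned in the technical note.

```r
# Distance between two years: renormalize each cumulative rainfall curve onto a
# 365-day "standard year" via interpolation, then sum the squared differences.
year_distance <- function(rain_a, rain_b, n_std = 365) {
  cum_a <- cumsum(rain_a)                                     # daily -> cumulative
  cum_b <- cumsum(rain_b)
  f_a <- approxfun(seq_along(cum_a) / length(cum_a), cum_a)   # handles 365 or 366 days
  f_b <- approxfun(seq_along(cum_b) / length(cum_b), cum_b)
  grid <- seq_len(n_std) / n_std                              # common fraction-of-year grid
  sum((f_a(grid) - f_b(grid))^2)
}

# Reference year: the year whose summed distance to all the other years is smallest.
reference_year <- function(rain_by_year) {
  yrs <- names(rain_by_year)
  D <- sapply(yrs, function(a)
    sapply(yrs, function(b) year_distance(rain_by_year[[a]], rain_by_year[[b]])))
  yrs[which.min(rowSums(D))]
}

# Example on simulated data (the real input would be the Philadelphia record).
set.seed(1)
rain_by_year <- setNames(
  lapply(1989:2000, function(y) rexp(ifelse(y %% 4 == 0, 366, 365), rate = 2)),
  1989:2000)
reference_year(rain_by_year)

# "Moving right-hand reference": recompute as the window grows one year at a time.
sapply(5:length(rain_by_year),
       function(k) reference_year(rain_by_year[seq_len(k)]))
```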




Connecting with the right audience isn’t rocket science.

Exhibit & Sponsor at the world's largest Operations Research & Analytics Conference. For additional details, contact Olivia Schmitz at Olivia.Schmitz@informs.org or visit:

http://meetings.informs.org/houston2017
Houston, Texas | October 22-25, 2017


THINKING ANALYTICALLY

A long walk
Danger ahead: A one-mile walking commute.

Walking in large cities poses some risk to pedestrians. There are poorly designed intersections, mistimed lights and commuters in cars who are anxious to get to their destinations. My new commute to work includes a one-mile walk, which is represented in the accompanying map. Each line represents a segment of that walk, and the line color represents the level of risk for that segment. Red indicates a high-risk section, orange indicates a medium-risk section, and green indicates a low-risk section. Move from circle to circle in any direction you like from start to finish while trying to minimize the total risk. To calculate the total risk of the walk, add up points for all segments as follows: green = 0 points, orange = 1 point, red = 2 points.

QUESTION: What is the minimum total risk, in points, that can be achieved for the walk? Send your answer to puzzlor@gmail.com by Sept. 15. The winner, chosen randomly from correct answers, will receive a $25 Amazon Gift Card. Past questions and answers can be found at puzzlor.com. ❙
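For readers who would like to check a route systematically, the walk can be modeled as a shortest-path problem once the map is transcribed into an edge list. Below is a minimal sketch in R; the node names and edge risks are hypothetical placeholders, not the actual map from the figure.

```r
# Model the walk as a weighted graph: one edge per segment, with risk weights
# green = 0, orange = 1, red = 2. The edge list below is made up for illustration.
edges <- data.frame(
  from = c("start", "start", "A", "A",      "B"),
  to   = c("A",     "B",     "B", "finish", "finish"),
  risk = c(0,        2,       1,   2,        0),
  stringsAsFactors = FALSE
)

# Plain Dijkstra over the undirected edge list: returns the minimum total risk.
min_risk <- function(edges, source, target) {
  nodes <- union(edges$from, edges$to)
  dist <- setNames(rep(Inf, length(nodes)), nodes)
  dist[source] <- 0
  visited <- character(0)
  while (!(target %in% visited)) {
    unvisited <- setdiff(nodes, visited)
    u <- unvisited[which.min(dist[unvisited])]   # closest unvisited node
    visited <- c(visited, u)
    # Relax every segment touching u (segments can be walked in either direction).
    nb_to   <- c(edges$to[edges$from == u], edges$from[edges$to == u])
    nb_risk <- c(edges$risk[edges$from == u], edges$risk[edges$to == u])
    for (i in seq_along(nb_to)) {
      dist[nb_to[i]] <- min(dist[nb_to[i]], dist[u] + nb_risk[i])
    }
  }
  unname(dist[target])
}

min_risk(edges, "start", "finish")   # minimum total risk points on this toy map
```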

BY JOHN TOCZEK


John Toczek is the senior manager of analytics at NRG in Philadelphia. He earned his BSc. in chemical engineering at Drexel University (1996) and his MSc. in operations research from Virginia Commonwealth University (2005).



GAMS-RELATED COURSES AND WORKSHOPS IN 2017

Whether you are new to GAMS or already an experienced user looking to deepen or expand your knowledge in a certain area – take a look at our diverse list of GAMS-related courses. Learn advanced, state-of-the-art techniques in a focused and interruption-free setting using the professional's choice in modeling software – GAMS. Domain experts will be teaching the following courses at locations worldwide:

June
Online Course: Introduction to Practical Global CGE Modeling with GAMS
Prague, Czech Republic: Practical General Equilibrium Modeling with GAMS • Energy and Environmental CGE Modeling with GAMS • Advanced Techniques in General Equilibrium Modeling with GAMS • Overlapping Generation General Equilibrium Modeling with GAMS

August
Annapolis, MD, USA: Single Country General Equilibrium Modelling with GAMS and STAGE • Global CGE Modelling with GAMS and GLOBE
Frisco, CO, USA: Basic GAMS Modeling – An Introductory Class • Advanced GAMS Modeling

September
Essen, Germany: Trade Policy Analysis with GAMS and MPSGE

November
Weisenheim a.B., Germany: Modeling and Optimization with GAMS (basic) • Modeling and Optimization with GAMS (advanced)

Continuous
Online: Practical General Equilibrium Modeling with GAMS • Advanced Techniques in General Equilibrium Modeling with GAMS

Take a look at our YouTube channel with instructional videos youtube.com/GAMSLessons

Further information and registration www.gams.com/courses

