Analytics March/April 2017


HTTP://WWW.ANALYTICS-MAGAZINE.ORG

DRIVING BETTER BUSINESS DECISIONS

MARCH/APRIL 2017

Getting down to business • Disarming ‘Weapons of Math Destruction’ • Account-based marketing and lead scoring • Software survey: joys and perils of statistics • Human resources: talent acquisition strategy

ALSO INSIDE: • Elections & analytics • Analytics in action • Five-minute analyst • Healthcare analytics • Analytics conference

Executive Edge Emil Eifrem, Neo Technology CEO, on graph databases and the ability to make sense of terabytes of connected data


INSIDE STORY

Weapons of mass instruction

Cathy O’Neil’s provocative book, “Weapons of Math Destruction: How big data increases inequality and threatens democracy,” created quite a stir in the analytics community when it was released last fall. In the book, O’Neil, a data scientist who holds a Ph.D. in mathematics from Harvard, explores the dark side of big data and data science, including the fairness, power and risks of mathematical models and their potential negative impact.

In its review, The New York Times wrote that “O’Neil’s book offers a frightening look at how algorithms are increasingly regulating people,” while Reuters opined that the book “… is the big data story Silicon Valley proponents won’t tell … [It] pithily exposes flaws in how information is used to assess everything from creditworthiness to policing tactics… A thought-provoking read for anyone inclined to believe that data doesn’t lie.” For its part, Amazon described the book as follows: “A former Wall Street quant sounds an alarm on the mathematical models that pervade modern life – and threaten to rip apart our social fabric.” Yikes!

In this issue of Analytics magazine, we offer two more viewpoints on O’Neil’s book from a couple of fellow quants: Vijay Mehrotra and Eric Siegel. Vijay is perhaps best known to readers of Analytics magazine as the author of the popular “Analyze This!” column, while Eric is a household name in the worldwide analytics community as the founder of the Predictive Analytics World series of conferences. Both are educators (Vijay as a professor at the University of San Francisco, Eric as a former instructor at Columbia University), both are entrepreneurs (Vijay with multiple high-tech start-ups and angel funding, Eric with PAW and other initiatives), both are accomplished authors and speakers, both have incredible knowledge of and passion for high-end analytics, and both enthusiastically recommend “Weapons of Math Destruction” – but for somewhat different reasons.

As educators, Vijay and Eric offer commentary that provides instructive insight not just on “Weapons of Math Destruction,” but also on the state of analytics and on the challenges and ethics issues that everyone involved in the profession now faces.

In the interest of full disclosure, I have not yet read “Weapons of Math Destruction,” but I’m looking forward to doing so, and Vijay has offered to send me his copy. In the meantime, I highly recommend you read both Vijay’s Analyze This! column and Eric’s Viewpoint article in this issue of Analytics. ❙




CONTENTS

FEATURES

ABM AND PREDICTIVE LEAD SCORING Account-based marketing, and the related technology of predictive lead scoring, is dramatically changing the face of sales and marketing.

By Megan Lueders

SOFTWARE SURVEY: JOYS, PERILS OF STATISTICS Trends, developments and what the past year of sports and politics taught us about variability and statistical predictions.

By James J. Swain

DISARMING ‘WEAPONS OF MATH DESTRUCTION’ Cathy O’Neil’s provocative best-selling book explores the dark side of data science, but data science is largely good. Here’s why.

By Eric Siegel

PREDICTIVE TALENT ACQUISITION STRATEGY As the cost of failed new hires grows, so does the importance of pre-hire assessment. How to find the right talent assessment vendor.

By Greta Roberts

VOTER MOTIVES AND MESSAGES The analytics story behind the Scottish secession vote, “Brexit” referendum and U.S. presidential election.

By Douglas A. Samuelson




REGISTER FOR A FREE SUBSCRIPTION: http://analytics.informs.org

DEPARTMENTS

• Inside Story
• Executive Edge
• Analyze This!
• Healthcare Analytics
• INFORMS Initiatives
• Newsmakers
• Analytics in Action
• Conference preview: Analytics & O.R.
• Q&A: Analytics conference chairman
• Conference preview: Marketing Science
• Conference preview: Healthcare
• Five-Minute Analyst
• Thinking Analytically

Analytics (ISSN 1938-1697) is published six times a year by the Institute for Operations Research and the Management Sciences (INFORMS), the largest membership society in the world dedicated to the analytics profession. For a free subscription, register at http://analytics.informs.org. Address other correspondence to the editor, Peter Horner, peter.horner@mail.informs.org. The opinions expressed in Analytics are those of the authors, and do not necessarily reflect the opinions of INFORMS, its officers, Lionheart Publishing Inc. or the editorial staff of Analytics. Analytics copyright ©2017 by the Institute for Operations Research and the Management Sciences. All rights reserved.

INFORMS BOARD OF DIRECTORS

President: Brian Denton, University of Michigan
President-Elect: Nicholas Hall, Ohio State University
Past President: Edward H. Kaplan, Yale University
Secretary: Pinar Keskinocak, Georgia Tech
Treasurer: Michael Fu, University of Maryland
Vice President-Meetings: Ronald G. Askin, Arizona State University
Vice President-Publications: Jonathan F. Bard, University of Texas at Austin
Vice President-Sections and Societies: Esma Gel, Arizona State University
Vice President-Information Technology: Marco Lübbecke, RWTH Aachen University
Vice President-Practice Activities: Jonathan Owen, CAP, General Motors
Vice President-International Activities: Grace Lin, Asia University
Vice President-Membership and Professional Recognition: Susan E. Martonosi, Harvey Mudd College
Vice President-Education: Jill Hardin Wilson, Northwestern University
Vice President-Marketing, Communications and Outreach: Laura Albert McLay, University of Wisconsin-Madison
Vice President-Chapters/Fora: Michael Johnson, University of Massachusetts-Boston

INFORMS OFFICES

www.informs.org • Tel: 1-800-4INFORMS
Executive Director: Melissa Moore
Director, Public Relations & Marketing: Jeffrey M. Cohen
Headquarters: INFORMS (Maryland), 5521 Research Park Drive, Suite 200, Catonsville, MD 21228
Tel.: 443.757.3500
E-mail: informs@informs.org

ANALYTICS EDITORIAL AND ADVERTISING

Lionheart Publishing Inc., 506 Roswell Street, Suite 220, Marietta, GA 30060 USA
Tel.: 770.431.0867 • Fax: 770.432.6969

President & Advertising Sales: John Llewellyn, john.llewellyn@mail.informs.org, Tel.: 770.431.0867, ext. 209
Editor: Peter R. Horner, peter.horner@mail.informs.org, Tel.: 770.587.3172
Assistant Editor: Donna Brooks, donna.brooks@mail.informs.org
Art Director: Alan Brubaker, alan.brubaker@mail.informs.org, Tel.: 770.431.0867, ext. 218
Advertising Sales: Aileen Kronke, aileen@lionhrtpub.com, Tel.: 678.293.5201




EXECUTIVE EDGE

Graph databases, journalists & the Panama Papers

Mining huge data sets: The powerful technology behind one of the biggest data leaks in history.

BY EMIL EIFREM


The Panama Papers, the unprecedented leak of 11.5 million files from the database of the global law firm Mossack Fonseca, opened up the offshore tax accounts of the rich, famous and powerful – laying bare how they have exploited secretive offshore tax regimes for decades. At 2.6 terabytes of data, the Panama Papers is the biggest data leak in history, towering over the U.S. diplomatic cables released by WikiLeaks in 2010, or more recently, intelligence documents handed over by Edward Snowden. The investigation into the Panamanian law firm’s dealings and those of its elite clients was the direct result of work carried out by journalists at The International Consortium of Investigative Journalists (www.icij.org). More than 370 reporters from 80 countries worked on the data for a year, such was its


scale. As part of its endeavors, the ICIJ also released a searchable database of 300,000 entities harvested from the Panama Papers and its offshore leaks investigation.

KEY TAKEAWAYS

The Panama Papers displayed the murky side of offshore accounts, identifying high-ranking government and public officials and pushing some out of office. But another major aspect that stands out is the power of the data itself and how it was sifted. It wasn’t searched and manipulated by experienced data scientists, but by a team of journalists, many of whom would not identify themselves as very technical. How did the journalists manage to pick out meaningful data from such huge, unstructured files? The answer is graph database technology, which enabled journalists to surface connections between the data, much like joining the dots, to form a picture. Mar Cabra, head of the data and research unit at the ICIJ, has described graph database technology as “a revolutionary discovery tool that’s transformed our investigative journalism process.” The unique skill of graph databases is their ability to spot and understand relationships between data at huge scale. Graph databases utilize structures

made up of nodes, properties and edges to store data, unlike relational databases, which store the information in rigid tables. Graph databases then map the links between the entities of interest. This is a boon for investigative journalists, but it is also a powerful tool for any business looking to tackle big data and connected data issues.

GRAPH CONNECTIONS

Graph databases are an excellent way to make sense of terabytes of connected data efficiently. Why? Because unlike relational databases, which break data down into tables, graph databases use a notational structure that mimics the way humans intuitively look at information. Once the data model is coded in a scalable architecture, a graph database is unbeatable at analyzing the connections in large, complex data sets. This enables any business to build and manipulate big data structures easily. Tech giants such as Google, Facebook and LinkedIn have recognized the power of graph databases for some time. For example, Facebook’s and LinkedIn’s tools for mapping real-time networks and connections, which let us walk through social networks, are founded on graph technology. Now that graph database technology has started to go mainstream, this highly scalable

connected data analysis is available to all organizations, from startups to blue chips and government. Graph databases are set to come into their own with the Internet of Things (IoT), where billions of connected devices mean dealing with petabytes of data. Graph databases will enable enterprises to mine data in ways that just aren’t possible using data warehouses and relational database technology. Graph technology is increasingly becoming the tool of choice for international agencies,

governments, financial services companies and enterprises looking to make real-time connections between data and discover the patterns that make up their relationships. We will undoubtedly be hearing more about the power of graph databases in the business world as more and more organizations latch on to the unique capabilities they offer. ❙

Emil Eifrem is co-founder and CEO of Neo Technology (http://neo4j.com/), developers of the graph database Neo4j.
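The article's description of a graph database (nodes, properties and edges, with connections surfaced by "joining the dots") can be illustrated with a minimal sketch in plain Python. This is not Neo4j's API and not the ICIJ's actual tooling; the `PropertyGraph` class, the node identifiers and the property names below are all hypothetical, chosen only to show how a chain of relationships between two people might be found by walking edges rather than joining tables.

```python
from collections import defaultdict, deque


class PropertyGraph:
    """Toy in-memory property graph (hypothetical, for illustration only)."""

    def __init__(self):
        self.nodes = {}                    # node id -> dict of properties
        self.adjacent = defaultdict(list)  # node id -> [(neighbor id, edge properties)]

    def add_node(self, node_id, **props):
        self.nodes[node_id] = props

    def add_edge(self, src, dst, **props):
        # Store the edge in both directions so a link can be followed either way.
        self.adjacent[src].append((dst, props))
        self.adjacent[dst].append((src, props))

    def shortest_path(self, start, goal):
        """Breadth-first search: node ids linking start to goal, or None."""
        previous = {start: None}
        queue = deque([start])
        while queue:
            current = queue.popleft()
            if current == goal:
                path = []
                while current is not None:
                    path.append(current)
                    current = previous[current]
                return path[::-1]
            for neighbor, _edge_props in self.adjacent[current]:
                if neighbor not in previous:
                    previous[neighbor] = current
                    queue.append(neighbor)
        return None


# Invented example data: two people linked through two companies.
graph = PropertyGraph()
graph.add_node("person_a", kind="Person", name="Person A")
graph.add_node("shell_co_1", kind="Company", jurisdiction="X")
graph.add_node("shell_co_2", kind="Company", jurisdiction="Y")
graph.add_node("person_b", kind="Person", name="Person B")
graph.add_edge("person_a", "shell_co_1", role="shareholder")
graph.add_edge("shell_co_1", "shell_co_2", role="owner")
graph.add_edge("person_b", "shell_co_2", role="director")

print(graph.shortest_path("person_a", "person_b"))
# ['person_a', 'shell_co_1', 'shell_co_2', 'person_b']
```

In a relational design the same question would typically need one join per hop, and the number of hops would have to be known in advance; here the traversal simply follows edges until the two entities meet. A production graph database adds persistence, indexing and a query language on top of this basic idea.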



ANALYZE THIS!

Victimized by WMD

“They do not listen. Nor do they bend. They’re deaf not only to charm, threats and cajoling but also to logic.”

BY VIJAY MEHROTRA


My family and I are spending the second half of my sabbatical in Madrid. We arrived in Spain a couple of weeks ago, and the day we arrived I tried to log in to my bank’s online site to pay my credit card bill. Quite unexpectedly, I received a message informing me that before logging on I would need to enter a verification code that had just been sent in a text message to my U.S. cell number. Alas, this number is temporarily turned off while I am overseas, so I was unable to log in. I wound up on the phone for more than an hour – at international telecomm rates! – with a customer service representative (who could not successfully help me) and then with her supervisor (who did ultimately help me get on to the online banking site only after having to change my login name). Feeling triumphant, I logged in to my bank’s site, paid my credit card bill and went to bed feeling that the matter had been resolved.



When I woke up the next morning, however, I had received an email from my bank’s fraud detection department saying that it: (a) suspected that the payment I made was fraudulent; (b) was canceling the payment; and (c) was suspending my online access until I called them. After spending another hour on another international phone call, I was informed that the payment could not be re-submitted and that my online access could not be restored due to a technical issue (“a known bug in the system,” the agent told me sheepishly).

Dr. Cathy O’Neil, author of the book “Weapons of Math Destruction” [1], would be quick to say that I had been victimized by classic WMDs, a term that she coined to describe applied mathematical models with the following attributes:

Opacity: It was not at all clear why my access to online banking had been revoked. Yes, I was accessing my account from outside of the country, but I had informed the bank of my travel plans several weeks before leaving. Not only could I not figure it out, but neither could the many bank employees whom I spoke with over the phone. Could it have been my foreign last name? Note that my wife had no trouble accessing her accounts at the same bank from Spain. To O’Neil, this is typical: “… these mathematical models were opaque, their workings invisible to all but the highest priests in their domain,” she writes.

Scale: My bank has tens of millions of customers worldwide. So, the same models that were being used to prevent me from accessing my account online and paying my bills are undoubtedly blocking other customers in other circumstances from accessing their accounts as well. As O’Neil points out, “scale is what turns WMDs from local nuisances into tsunami forces.”

Damage: The potential cost to me of a late credit card payment, especially on a bill that included international plane flights for my entire family, was substantial. And despite an almost endless stream of apologies from the various bank employees over the phone, there was absolutely nothing either they or I could do about it. This is just as infuriating to O’Neil as it was to me: “You cannot appeal to a WMD. That’s part of their fearsome power. They do not listen. Nor do they bend. They’re deaf not only to charm, threats and cajoling but also to logic.”

O’Neil’s book has a special focus on social justice (its subtitle is “How Big Data Increases Inequality and

Threatens Democracy”). This focus is evident in her text and in many of the examples of WMDs that she describes, including credit scoring, payday lending, predictive policing [2], criminal recidivism, predatory online marketing, employee selection and staff scheduling. She convincingly argues that many of these models perniciously combine to reduce social mobility for the poor (while also providing the wealthier classes with increasingly personalized options for all kinds of products and services). She sadly observes that, “Being poor in a world of WMDs is getting more and more dangerous and expensive.”

Two of O’Neil’s key criticisms of WMDs are the narrowness of their objectives and the absence of feedback. For example, when a for-profit university successfully targets students who then take out huge government-guaranteed loans before leaving school with no appreciable improvement in their job prospects, the institution views these models as “successful,” even though the costs to society – and to the unsuspecting students who bought into the sales pitch – are substantial. Moreover, information about situations where models have failed – for example, when an employee screening model filters out an individual who turns out

to be extremely successful elsewhere – is too often simply not considered, resulting in models that codify what was reflected in their initial assumptions and inputs. Meanwhile, the book also describes many instances in which data is being captured and aggregated across multiple sources and through various middlemen (all of whom have their own narrow objectives and lack of feedback), propagating data errors and biases across models and industries.

Overall, O’Neil does an excellent job of describing very serious problems that have too often been ignored or discounted by the chorus of analytics cheerleaders, this writer included. Solutions, however, are harder to pin down. In the book’s final chapter, she offers some suggestions, including self-policing, richer models that measure broader metrics, changes in laws surrounding the use of data (including updates to the Fair Credit Reporting Act, the Americans with Disabilities Act, and the Health Insurance Portability and Accountability Act to prohibit discrimination that is driven by predictive models), calls for businesses to make their models more transparent, and academic research using agent-based simulations to help provide an understanding of the logic underlying WMDs and of the results that they produce.


Ultimately, O’Neil clearly believes that these problems can only be addressed with the active participation of analytics professionals, exhorting us to “come together to police these WMDs, to tame and disarm them.” And my sense is that it will require not only a great deal of intellectual firepower, but also a clear understanding of one’s own values and a great deal of courage. Kudos to O’Neil – a mathematician whose career has included stints as a college professor, Wall Street quant and data scientist – for sounding the alarm. We are all challenged to answer its call. ❙

Vijay Mehrotra (vmehrotra@usfca.edu) is a professor in the Department of Business Analytics and Information Systems at the University of San Francisco’s School of Management and a longtime member of INFORMS.

REFERENCES & NOTES
1. https://www.goodreads.com/book/show/28186015-weapons-of-math-destruction
2. For more on this, see http://analytics-magazine.org/analyze-this-a-silver-lining-for-election-blues/



HEALTHCARE ANALYTICS

Analytics in the post-ACA era

In search of a silver lining in the current healthcare environment.

In the last three years, data and analytics found tremendous support within many healthcare organizations.

BY RAJIB GHOSH


Following the 2016 presidential election, the first two months of the new year were quite tumultuous for the U.S. healthcare system. Interestingly, the Affordable Care Act (ACA) has survived so far, although the process to repeal the law has begun. Congress is divided on the best approach to repeal and replace the ACA; an array of proposals is on the table. We have also heard from the president that the replacement of the law may not happen until 2018. Still, various facets of the law could be changed or removed without going through the full congressional voting process. Pundits fear that such disjointed attempts to repeal the law could have a dramatic impact on both the insurance market and the lives of people who are currently covered by the law. Clearly this uncertainty does not bode well for the healthcare marketplace, and we anticipate a slowdown in the industry’s overall job growth. Leadership has changed at both the Department of Health and Human Services (HHS) and the Centers for Medicare & Medicaid Services (CMS).


The latter change particularly could have a profound impact on how the senior and underserved populations of the country receive healthcare in the future. Congress is considering converting Medicaid into a “block grant” program, which would effectively limit the amount of federal money allocated to states for delivering healthcare to the underserved. Medicaid grants to states are currently proportional to the number of Medicaid-covered lives. That could change under a “block grant” program.

Figure 1: Will the steady growth of employment in the healthcare industry continue in the post-ACA era? Source: https://www.bloomberg.com/news/articles/2016-12-02/trump-s-health-care-reform-uncertainty-could-see-jobs-shrink

States that opted for Medicaid expansion during the previous administration now face a stiff challenge. If the grant is capped while a state’s underserved population increases, will a state leave the newly Medicaid-eligible people out of coverage, or will the state reduce payments to providers to allocate the same money over a larger pool of enrollees? For example, one in three Californians is now covered by Medicaid. With a limited pool of federal grant money for Medicaid, what will California do? If the state reduces payments to providers (or reduces per capita payment for enrollees), then access to healthcare will deteriorate as fewer provider organizations (both primary care and specialists) show interest in seeing patients. That will also crowd emergency rooms with non-emergency and uncompensated patient care. This could potentially have a serious impact on the bottom line of rural community hospitals, which could close their doors for good.

IS THERE A SILVER LINING?

For a long time, healthcare futurists talked about utilizing data, analytics and eventually artificial intelligence at all levels within healthcare organizations. In the last three years, data and analytics found tremendous support within many healthcare organizations. Electronic health record (EHR) vendors built population health analytics products or services to

augment their revenue streams. Many analytics companies, from startups to industry behemoths like IBM, put healthcare front and center in their business expansion strategy. At last year’s Healthcare Information and Management Systems Society (HIMSS) annual conference, more than a hundred analytics companies exhibited their products to over 40,000 global attendees. Clinical analytics claimed the spotlight; operational analytics did not make much noise. I anticipate that the current environment will make operational analytics great again as organizations look inward to build efficiency and minimize waste. This will also spread from hospitals to health centers to payers, in other words, across the entire value chain of healthcare.

In this environment, will artificial intelligence (AI) become the new moonshot in healthcare? AI and machine learning have made significant progress toward maturity during the last two years. Many industries are looking at AI as their future. While AI is expected to grow at a compound annual growth rate (CAGR) of 62.9 percent during the next five years, a new report suggests that healthcare will serve as a key area for that growth. Given that not enough money may be available to take care of the underserved of the

nation, we could think of using AI as a tool to tackle regular primary care and chronic conditions. If human longevity continues to rise as predicted, the burden of chronic diseases will be ever-increasing. Can AI become a savior in such a scenario? Clearly, AI technology needs to go beyond what it can do now, but imagine what could happen if IBM, Google, Facebook and Microsoft further democratize their AI software stack and invest in the rapid growth of AI-based expert systems. Community health centers or retail clinics could then use such systems to serve the uninsured population at a fraction of the cost. If the current congressional proposals become the new norm, then the uninsured population will increase again. As a society, shouldn’t we accelerate AI technology to prevent catastrophe in the lives of so many people? I consider that a greater societal good than, for example, developing autonomous vehicles. I think self-driving cars and trucks are a great idea, but we need to rethink our priorities.

PERSONAL WELLNESS AND PRECISION MEDICINE

While precision medicine is the future of medical science, personal


wellness will need a renewed focus in the near term. Both use data but with different goals. Precision medicine targets personalized intervention to cure a disease such as cancer based on genomic data; it is very expensive and still in the realm of research. Personal wellness tries to prevent such diseases from happening or delay their onset. It could be accomplished with the data that is available to us now. The data can be easily acquired and computed cheaply by today’s technology such as cloud, sensors and smart phones. We could do low-cost, preemptive intervention within the underserved population, enable people with technology and guide them to make better decisions in their daily lives. I would hope that more small- and large-scale analytics and wearable analytics companies focus on this population segment to prevent high-cost interventions like emergency room visits. States facing a deficit in their Medicaid funding could drive such initiatives in partnership


with technology companies, and that can happen now. It is unclear what will happen a year from now. I see states like California gearing up to resist new health policies from Washington. I see private technology companies in Silicon Valley resisting the travel ban. I hope all political parties will sit down together to create a comprehensive plan to work in lockstep and creatively address the funding deficit and care delivery challenges of the near future. Human society can then become what it should be – humanitarian. ❙

Rajib Ghosh (rghosh@hotmail.com) is an independent consultant and business advisor with 20 years of technology experience in various industry verticals, where he has held senior-level management roles in software engineering, program management, product management, and business and strategy development. Ghosh spent a decade in the U.S. healthcare industry as part of a global ecosystem of medical device manufacturers, medical software companies, and telehealth and telemedicine solution providers. He has held senior positions at Hill-Rom, Solta Medical and Bosch Healthcare. His recent work interests include public health and the field of IT-enabled sustainable healthcare delivery in the United States as well as emerging nations.



INFORMS INITIATIVES

Edelman Award, IAAA, Wagner Prize, pro bono and more

INFORMS ANNOUNCES 2017 EDELMAN AWARD FINALISTS

INFORMS announced six finalists for the 46th annual Franz Edelman Award for Achievements in Operations Research and the Management Sciences, the world’s most prestigious award for achievement in the practice of analytics and O.R. The 2017 Edelman Award will be presented at the INFORMS Conference on Business Analytics & Operations Research in Las Vegas on April 2-4. The finalists for the 2017 Edelman Award include: the American Red Cross, Barco (a global technology company that manufactures products for the entertainment, healthcare and enterprise markets), BHP Billiton (one of the world’s largest producers of major commodities including iron ore, coal and other metals and minerals), General Electric, Holiday Retirement (the largest private owner and operator



of independent senior living communities in the United States) and the New York City Department of Transportation. For more, see page 62.

ANALYTICS SOCIETY ANNOUNCES FINALISTS FOR 2017 IAAA HONORS

The Analytics Society of INFORMS announced three finalists for the 2017 Innovative Applications of Analytics Award (IAAA), sponsored by Caterpillar and the Society. Scott Grasman of Rochester Institute of Technology chaired the judging committee. The finalists will present their projects at the 2017 INFORMS Conference on Business Analytics & Operations Research in Las Vegas in April. The finalists include:

• “Audience Targeting Solutions Powered by Advanced Analytics,” Turner Broadcasting System, Inc.
• “Combining Multi-Criteria Analysis with Interactive Visualization and Real Time Sensitivity Simulations to Support the Regeneration Process of Disused Railways,” London School of Economics and Political Science
• “Assurance of Supply Center – Excellence through Supply Network Optimization,” Caterpillar

The flagship competition of the Analytics Society, the award recognizes the creative and unique application of


a combination of analytical techniques in a new area. The award promotes the awareness and value of the creative combination of analytics techniques in unusual applications to provide insights and business value.

ESSENTIAL PRACTICE SKILLS WORKSHOP SET FOR SEATTLE IN APRIL

The INFORMS Continuing Education program will be back on the road in 2017, with the first stop in Seattle on April 5-6, when Patrick Noonan will conduct a two-day “Essential Practice Skills for High-Impact Analytics Projects” workshop at Seattle Pacific University. Attendees will learn practical frameworks and systematic processes for addressing complex, real-world problems and how to facilitate effective action. To learn more about the workshop and to register, click here.

APPLICATIONS FOR 2017 DANIEL H. WAGNER PRIZE NOW OPEN

Applications for the 2017 Daniel H. Wagner Prize for Excellence in Operations Research Practice are now open. A two-page abstract in English that provides evidence of mathematical development, solution, unique new algorithm or series of coherent advances developed in


conjunction with an application is due by May 1. To learn more, click here.

PRO BONO ANALYTICS PROGRAM ACCEPTING VOLUNTEERS

Make a difference in underserved communities by volunteering your time and talents to the INFORMS Pro Bono Analytics program. The program gives non-profit organizations the opportunity to work with analytics professionals on a volunteer basis to help solve challenges and create new opportunities for success with the scientific process of transforming data into insight. The initiative matches INFORMS’ analytics professional volunteers with non-profit organizations that would benefit from advanced analytics and operations research training and techniques. By focusing on current analytics issues as they relate to non-profit organizations, the Pro Bono Analytics team will be able to take the necessary steps to help solve even the most complex issues. Volunteer opportunities are constantly being added. To learn more, click here. ❙

Apply for 2017

Apply to win this prestigious practice prize that rewards professionals who devise innovative analytical methods, utilize those methods in a verifiably successful O.R./analytics project, and describe their work in a clear, well-written paper. A two-page abstract is due by May 1, 2017. This top INFORMS practice prize spans all O.R. and analytics disciplines and application fields. Any work presented in an INFORMS section or society practice-oriented competition is eligible as long as the work did not result in a published paper. The Wagner Prize competition is high-profile, with its own track at the INFORMS Annual Meeting. Presentations are widely distributed via streaming video. Finalist papers are published as a special issue in INFORMS’ respected practice journal Interfaces. Last year’s competition was held at the INFORMS Annual Meeting, November 13-16, 2016, in Nashville, Tennessee. The first-place prize will be awarded to Mikael Rönnqvist, Gunnar Svenson, Patrik Flisberg, and Lars-Erik Jönsson at the Edelman Gala during the April 2017 Conference on Business Analytics and O.R. in Las Vegas, Nevada. Don’t miss your chance to win this illustrious award for 2017.

Daniel H. Wagner

www.informs.org/wagnerprize





NEWSMAKERS

NAE electees, Electoral College, social media

Mark S. Daskin, Arkadi Nemirovski and Sridhar R. Tayur have been elected to the NAE.

THREE INFORMS MEMBERS ELECTED TO NATIONAL ACADEMY OF ENGINEERING

Three INFORMS members are among those elected to the National Academy of Engineering’s class of 2017. They include:

Mark S. Daskin, the Clyde W. Johnson Collegiate Professor and chair, department of Industrial and Operations Engineering, University of Michigan, Ann Arbor, for “leadership and creative contributions to location optimization and its application to industrial, service and medical systems.”


Arkadi Nemirovski, John Hunter Chair and Professor, H. Milton Stewart School of Industrial and Systems Engineering, Georgia Institute of Technology, Atlanta, for “the development of efficient algorithms for large-scale convex optimization problems.”



Sridhar R. Tayur, Ford Distinguished Research Chair Professor of Operations Management, Tepper School of Business, Carnegie Mellon University, Pittsburgh, for “developing and commercializing innovative methods to optimize supply chain systems.”

Election to the National Academy of Engineering is among the highest professional distinctions accorded to an engineer. Academy membership honors those who have made outstanding contributions to “engineering research, practice or education, including, where appropriate, significant contributions to the engineering literature” and to “the pioneering of new and developing fields of technology, making major advancements in traditional fields of engineering or developing/implementing innovative approaches to engineering education.”

ELECTORAL COLLEGE PUT TO THE MATH TEST FOLLOWING 2016 U.S. PRESIDENTIAL ELECTION

While political pundits continue to rehash the respective 2016 U.S. presidential campaigns, almost all of us


can agree on two things: Hillary Clinton won the national popular vote by nearly three million votes, and Donald Trump won the Electoral College vote and thus the presidency. The discrepancy caused some people to take a closer look at the Electoral College and its state-by-state, largely winner-take-all format and question whether there could be a better method for selecting the president of the United States. MIT professor Arnold Barnett and Yale University professor Ed Kaplan, both longtime members of INFORMS (Kaplan served as president of the organization in 2016), were among those with inquiring minds. In a Dec. 16, 2016 article (“How to cure the Electoral College”) in the Los Angeles Times, Barnett and Kaplan proposed an “electoral vote equivalents” (EQV) system in which electoral votes are allocated “in direct proportion to each candidate’s share” of each state’s popular vote. The authors argue that not only would the EQV system make every vote in every state important, but it would also increase the importance of


less-populated states, which was one of the main objectives of the Founding Fathers when they created the Electoral College in the first place. At the time of the nation’s birth, a large percentage of the population was concentrated in a handful of urban areas such as New York, Boston and Philadelphia. In their wisdom, the Founding Fathers created the Electoral College to give voters in the less-populated, largely agricultural states a modestly weighted say in the presidential election. Rather than repealing the Electoral College, Barnett and Kaplan say the EQV system would strengthen it by bringing every state into play in the general election, including electoral vote-rich California, Texas and New York, which are mostly ignored during presidential campaigns today because the winners of those states are a foregone conclusion. According to the authors, the EQV system would also increase the clout of small-population states because they tend to vote more lopsidedly than the nation as a whole, which pays off when the percentage of the margin of victory within a state matters. Would the EQV system have made a difference in this election? Yes, it would have awarded the presidency to Clinton, say the authors. But as Trump noted, if the rules


of the game had been different, he would have changed his campaign strategy.

HOW SOCIAL MEDIA DATA MINING COULD SHAPE THE PRODUCTS OF TOMORROW

Researchers at Nottingham Trent University have developed a way to analyze online consumer reviews and social media to help designers create better-informed products. Led by design engineering expert professor Daizhong Su, a research team used data mining techniques to produce an algorithm that identifies the most liked and disliked features of existing products, according to thousands of consumer comments on websites such as Amazon, eBay, Facebook, YouTube and online stores. The approach – developed using big data, data mining and related Internet technologies – detects keywords using automated online searches and informs designers of the successes and flaws of any given product. “At our fingertips is an array of data which tells us the strengths and weaknesses of almost every product in the world,” Su says. “We’ve developed a way to harness this valuable information and created a powerful approach that could change the way we think about design. It has the potential to


make tomorrow’s products more innovative, user-friendly, sustainable and better informed of user requirements.” After keywords are entered, the computer program learns on its own and categorizes reviews, giving a breakdown of positive and negative comments on various products and their features. It also learns to disregard spam comments. To test the technology, the team designed a desk lamp as a case study based on more than a thousand comments on existing designs on the market. The results showed that consumers liked desk lamps that were small, adjustable and bright but with a dimmer, and that included a touch function and more. Dislikes included unstable bases, poor reliability, dullness and excessive heat. This feedback was used to set a range of specifications for a design to achieve. The final design included an on/off switch that controlled brightness, a sustainable bamboo base and LED casing, and a brushed aluminium neck with an adjustable arm. ❙
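For readers curious about what a keyword-driven breakdown of reviews might look like in practice, here is a minimal Python sketch. It is an illustration only, not the Nottingham Trent team’s algorithm (which, per the description above, learns on its own and filters spam): the feature keywords, sentiment word lists and sample reviews are all hypothetical, and a review is simply labeled by whichever kind of sentiment word it contains more of.

```python
import re
from collections import Counter

# Hypothetical keyword lists, chosen by hand for this illustration.
FEATURE_KEYWORDS = {
    "base": {"base", "stand"},
    "brightness": {"bright", "brightness", "dimmer"},
    "heat": {"heat", "hot"},
}
POSITIVE_WORDS = {"love", "great", "perfect", "sturdy"}
NEGATIVE_WORDS = {"unstable", "poor", "wobbly", "excessive", "dull"}


def tokenize(text):
    """Lowercase the text and split it into alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())


def feature_breakdown(reviews):
    """Count positive and negative reviews that mention each feature."""
    tally = {feature: Counter() for feature in FEATURE_KEYWORDS}
    for review in reviews:
        words = set(tokenize(review))
        positives = len(words & POSITIVE_WORDS)
        negatives = len(words & NEGATIVE_WORDS)
        if positives == negatives:
            continue  # no clear sentiment, so skip this review
        label = "positive" if positives > negatives else "negative"
        for feature, keywords in FEATURE_KEYWORDS.items():
            if words & keywords:
                tally[feature][label] += 1
    return tally


# Made-up desk lamp reviews.
SAMPLE_REVIEWS = [
    "Love the dimmer, the light is bright and great for reading.",
    "The base is unstable and wobbly.",
    "Gets hot after an hour, excessive heat.",
    "Sturdy base, perfect for my desk.",
]

if __name__ == "__main__":
    for feature, counts in feature_breakdown(SAMPLE_REVIEWS).items():
        print(feature, dict(counts))
```

A hand-built word list like this breaks down quickly on real reviews (negation, sarcasm, misspellings), which is presumably why the researchers rely on a program that learns rather than on fixed lists.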

CONGRATULATIONS TO THE 2017 EDELMAN FINALISTS

• American Red Cross, for “Analytics-based Methods Improve Blood Collection Operations”
• Barco, for “Platform-based Product Development Enhances Barco’s Healthcare Division”
• BHP Billiton, for “The DICE Simulation Model Unlocks Significant Value for the Jansen Potash Project”
• General Electric, for “RailConnect 360 Revolutionizes Approach to Train Scheduling”
• Holiday Retirement, for “Senior Living Rent Optimizer™ (SLRO™), an Innovative Revenue Management System, Achieves Significant Revenue Gain”
• New York City Department of Transportation, for “The New York City Off-Hours Deliveries Project: A Business and Community-Friendly Sustainability Program”

Join us at the Edelman Gala, April 3 in Las Vegas, Nevada when the 2017 winner is announced!

http://meetings.informs.org/analytics2017



ANALYTICS IN ACTION

Illinois Tech hoopsters get an assist from data science program

“The data science students gave us invaluable help,” says Coach Todd Kelly. Source: IIT

With Illinois Tech’s men’s basketball team on a roll this season, one assist is coming from the university’s Master of Data Science program. The Division III basketball team went from winning no games in 2013, the year before head coach Todd Kelly came on board, to winning two games in 2014 and four in 2015, to achieving a record of 19-5 as of Feb. 21. Their Massey rating has gone from around 375-400 to 117. On Jan. 4, they cracked the 100-point mark for the first time since 2007. And for the first time in program history, the Illinois Tech men’s basketball team has qualified for the U.S. Collegiate Athletic Association’s National Championships. Although he says athlete talent is the No. 1 predictor of success, Kelly attributes about 20 percent to 25 percent of the turnaround to analytics from students in Illinois Tech’s Master of Data Science program.

“They provided two key kinds of information. First was the players’ adjusted



plus/minus,” Kelly says. “Plus/minus basically calculates how many points a team has gained or lost during that player’s time on the court. Secondly, they provided me with my best five-man lineups for specific situations: defense, three-point shooting, rebounding and late game.”

Malik Howze and the Illinois Tech men’s basketball team have qualified for the U.S. Collegiate Athletic Association’s National Championships for the first time in school history. Source: Stephen Bates, WCS Photography

PULLING IN DATA SCIENCE HELP

The data science program teaches students how to explore data using high-level mathematics, statistics and computer science. Coach Kelly approached program directors in March 2016 for help. Students Denis Bajic and Larry Layne took on the project as their practicum in summer 2016, with data science program director Shlomo Argamon. The three worked closely with Kelly throughout the summer, delivering final results in August 2016. They provided Kelly with each player’s plus/minus, PER (player efficiency rating) and what the best lineups would be for specific situations. They also calculated each player’s offensive and defensive win shares, adjusted PER, game simulations, best five-man lineups for a variety of situations and several other useful metrics.

Bajic, who played basketball in junior high and high school and who follows the Illinois Tech team, says, “Advanced analytics can provide a significant advantage when incorporated into a game


plan. Seeing that many Division III teams either didn’t take regular statistics, or didn’t utilize data to their advantage, we thought we could gain valuable insights not only on our team, but on other teams, which would help the coaches plan strategies more effectively.

“We looked at all the teams Illinois Tech played in the 2015-2016 season and scraped their season statistics, along with IIT’s statistics, to get a feel for what other teams were doing better. We utilized all these statistics to build new analytics based on what’s currently used in the NBA (player efficiency ratings, win shares, adjusted plus/minus, etc.).

“We then looked at these newly generated statistics for the Illinois Tech players and, after comparing them to performances against other teams, gained insights on how the program could improve. This knowledge translated into coming up with many different lineups that would be deployed as the situation called for, as well as tweaks the team could make (e.g., defensive scheme alterations). We used Python for scraping and analysis and Tableau for visualization.”

GOING THE EXTRA MILE

Layne led construction of the simulations, as well as working on the plus/minus calculations.
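The raw plus/minus statistic that Kelly describes is simple enough to sketch in a few lines of Python. The version below is an illustration under stated assumptions, not the students’ code: the stint log and player names are made up, and it computes only raw plus/minus for players and for five-man lineups. The adjusted plus/minus the students reported additionally tries to account for who else is on the floor, typically with a regression model, which is beyond this sketch.

```python
from collections import defaultdict

# Hypothetical stint log: (five players on court, points scored, points allowed).
STINTS = [
    (("Ann", "Bo", "Cy", "Di", "Ed"), 12, 8),
    (("Ann", "Bo", "Cy", "Di", "Flo"), 5, 9),
    (("Ann", "Bo", "Cy", "Di", "Ed"), 7, 6),
]


def raw_plus_minus(stints):
    """Sum the team's point differential over each player's time on court."""
    totals = defaultdict(int)
    for players, scored, allowed in stints:
        for player in players:
            totals[player] += scored - allowed
    return dict(totals)


def lineup_plus_minus(stints):
    """Sum the point differential for each distinct five-player lineup."""
    totals = defaultdict(int)
    for players, scored, allowed in stints:
        totals[frozenset(players)] += scored - allowed
    return dict(totals)


if __name__ == "__main__":
    for player, value in sorted(raw_plus_minus(STINTS).items()):
        print(f"{player}: {value:+d}")
    lineups = lineup_plus_minus(STINTS)
    best = max(lineups, key=lineups.get)
    print("Best lineup:", sorted(best), f"{lineups[best]:+d}")
```

Ranking lineups by raw differential, as the last lines do, is only a starting point: with few minutes played together the numbers are noisy, which is one reason situational lineups and simulations are usually built on richer statistics.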


“The idea for these simulations was to see if the team was playing to expectation, and if not what areas they could improve,” Bajic adds. “We also provided him with a dashboard of all the analytics we developed so he could easily make his decisions. I watch the sport daily and have been keeping tabs on the Illinois Tech team this season, and it’s great to hear of the success that they’re having.”

“We intended for the simulations to use Denis’ top line-up calculations,” Layne notes. “We would try different lineups with different teams in the simulator in order to determine if certain players or play styles would work better against different opponent teams or styles. Unfortunately, we only reached the proof of concept stage by the end of the practicum.”

“At the end of the day, our players make it work,” Kelly says. “They’re the ones that make it happen. But the data science students gave us invaluable help and allowed us to leverage one of the things that makes IIT unique – it’s a STEM school strong in quantitative problem-solving. Most Division III schools are liberal arts schools and don’t have a data science and analytics master’s program.

“I plan to do this again next year,” Kelly continues. “In the meantime, I encourage everyone to come out and see this team!” ❙



SALES & MARKETING

ABM and predictive lead scoring – a primer for success

BY MEGAN LUEDERS

Account-based marketing (ABM) – and the related technology of predictive lead scoring – is dramatically changing the face of sales and marketing. The difference is like spearfishing when all you’ve known before is dragging a net. It’s much more precise and uses analytical processes to ensure that your efforts are focused on the results your organization needs and your leads are more efficiently converted to sales. For enterprise sales, ABM is critical because B2B sales are much more tightly focused than B2C sales. In Zenoss’ case, we provide open source IT network monitoring solutions to corporations; consequently, our enterprise market is narrow.


Because small or even medium-sized businesses may not be appropriate, we may only be looking in the range of a few thousand targets. Having a more fine-tuned approach to go after targeted accounts – as opposed to casting a wide net – allows a company to streamline its traditional marketing efforts. You’re going after fewer fish, but they’re the ones you want. You’re enabling your sales team (and a segment of your marketing team) to pursue the best-targeted prospects with the very best tools at your disposal. Having recently implemented ABM and predictive lead scoring, I now understand the benefits of this new approach to marketing analytics. I’d like to share what


ABM and related technology: “You’re going after fewer fish, but they’re the ones you want.” Photo Courtesy of 123rf.com | kritchanut

you need to do to make the right decision on vendors if you’re just getting started yourselves.

FIVE COMPONENTS OF PREDICTIVE LEAD SCORING

According to the research firm SiriusDecisions, there are five minimum requirements any predictive lead scoring service must offer:

• Statistical methodology. An appropriate solution must have an analytics engine that uses statistical techniques to correlate data with such variables as offer responses, lead conversions and closed deals.
• Scoring. The solution must be able to look at behaviors and statistically understand how a prospect will behave under those behavioral parameters.
• Integration with marketing automation platforms and/or sales force automation. For access to lead, contact and account information, the solution must be able to integrate with at least one MAP or SFA system. Integration must be bidirectional, with data and lead scores being received from and returning to the SFA or MAP system.
• Continuous learning. The solution must allow for adding new lead attributes and must improve its scoring of leads based on the results of previously scored leads.
• Reporting. The solution must be capable of generating standard reports such as conversion rates, lead quality and trend analysis.
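To make the “statistical methodology” and “scoring” requirements concrete, here is a minimal Python sketch that fits a logistic regression to past leads and turns the predicted conversion probability into a 0-100 score. It is an assumption-laden illustration, not any vendor’s engine: the two lead attributes (web visits and whether a demo was requested), the eight historical records and the training settings are all hypothetical.

```python
import math

# Hypothetical history: (web visits, demo requested 0/1, converted 0/1).
HISTORY = [
    (1, 0, 0), (2, 0, 0), (3, 1, 0), (4, 0, 1),
    (5, 0, 0), (6, 1, 1), (8, 1, 1), (9, 0, 1),
]


def sigmoid(z):
    """Map any real number to a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(rows, steps=5000, rate=0.05):
    """Fit weights [bias, w_visits, w_demo] by batch gradient descent."""
    weights = [0.0, 0.0, 0.0]
    for _ in range(steps):
        gradient = [0.0, 0.0, 0.0]
        for visits, demo, converted in rows:
            features = (1.0, visits, demo)
            predicted = sigmoid(sum(w * x for w, x in zip(weights, features)))
            error = predicted - converted
            for i, x in enumerate(features):
                gradient[i] += error * x
        weights = [w - rate * g / len(rows) for w, g in zip(weights, gradient)]
    return weights


def lead_score(weights, visits, demo):
    """Return the predicted conversion probability as a 0-100 score."""
    z = weights[0] + weights[1] * visits + weights[2] * demo
    return round(100 * sigmoid(z))


if __name__ == "__main__":
    model = fit_logistic(HISTORY)
    print("Cold lead (1 visit, no demo):", lead_score(model, 1, 0))
    print("Warm lead (9 visits, demo):", lead_score(model, 9, 1))
```

“Continuous learning” in this toy setting would amount to appending newly closed or lost leads to the history and refitting; a production service would also need validation, many more attributes and far more data than eight rows.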


I don’t intend to lead you to any single provider. Rather, I’d like to tell you a bit more about what you need to do beforehand to ensure success once you’ve settled on your own provider.

HOW CLEAN IS YOUR DATA?

Whenever I speak with my peers about what we’re doing with ABM and predictive lead scoring, their No. 1 question is: “How good is the data that you’re getting back from your vendors?” The right groundwork for an ABM structure means having a solid database of people to go after. You can compile this database through multiple tools and vendors (which is what we have done) and by making sure you have more sophistication and segmentation in the database. With the tools we are using, we have the capability of adding a great deal of intelligence to our data – not just content information, but other layers as well. Typically, these layers of data can be broken down as follows:

• Firmographic layer: How firms are aggregated into meaningful market segments (basically like demographics for businesses).
• Technographic layer: How consumers are categorized by ownership, use and attitudes toward information.


• Demographic layer: How the target market is divided along lines of age, ethnicity, education, etc.
• Intent layer: How social signals indicate a prospect is interested in purchasing a particular technology.

Of course, because of changes in any given business, the data that comes from ABM vendors is never completely clean. People move from division to division, from one region to another, and so on. Keep in mind, though, that these companies make their livelihood on having the most accurate data possible. While you can never say with 100 percent certainty that data is perfect, it is certainly as sound as can reasonably be expected. Ultimately, it is up to your internal team to keep your database 100 percent accurate. At some point, you need to rely on the human touch, which means routinely reaching out to companies in your databases to ensure you have the most accurate information. At a minimum, though, ABM as a tool enables marketers to take advantage of data that’s presented.

TOP BUSINESS CONSIDERATIONS BEFORE STARTING

As you decide on a vendor, keep in mind there are some aspects of your business that can influence which ABM


and predictive lead scoring company will be the best fit for your company. Here are a few things to consider:

Choose the right vendor for your size of business. Some vendors may be too big for you. You may not have enough records in your database for them to consider. You would be paying for a premium product when you may only need a midlevel solution.

Know what kind of information you want or need. Are you looking for account-level information (just the company information) or specific contacts within accounts? Most companies’ databases haven’t been tended very well. Names and contacts may have been added over the years, but the database itself may not have been cleaned since it was first developed. In that case, you know you need contact-level information first, to verify the basic information related to specific contacts within the organization. The accuracy of your database to start with may push you in the direction of a particular vendor.

Have a solid partnership between sales and marketing within your organization. Buy-in to ABM can’t just be lip service. The VP of sales must agree that he or she is ready to shift how sales is done. This goes all the way down to the tactical level of which salespeople are calling, how they are

calling and what they are saying in the sales call. Most importantly, they all must be comfortable with the marketing department influencing some part of that tactical process.

Be sure your sales and marketing structures are set up for success. As you develop your ABM strategy, you may need to rethink your sales force and the type of people you hire. You may also need to ensure that while you have real buy-in across the board, you still have a manageable group of decision-makers, so that you can move quickly once you’re ready to do so. Which takes us to our next point.

Don’t rush for results. It will take some time for marketing to get its ducks in a row – understanding the database, the tools and how to implement the tools. You need to establish and follow a joint timeline. Marketing is responsible for a lot of the prep work, and sales carries a real burden in making sure that ABM can take off once the prep work is done. It can take months of cooperative effort before you start to see the fruits of these efforts. It’s a journey, not a race. ❙

Megan Lueders (mlueders@zenoss.com) is vice president of marketing at Zenoss, a provider of hybrid IT monitoring, infrastructure monitoring and analytics software for physical, virtual and cloud-based IT infrastructures.



SURVEY: STATISTICAL ANALYSIS SOFTWARE

The joys and perils of statistics

Trends, developments and what the past year of sports and politics taught us about variability and statistical predictions.

“It is difficult to make predictions, especially about the future.” – Danish saying, variously attributed to Niels Bohr or Yogi Berra

BY JAMES J. SWAIN

We were reminded several times last year that variability can confound statistical predictions and that unlikely events do occur. Upsets in sports and politics are always news, since having the underdog beat the “sure thing” is surprising and noteworthy. What is exciting in sports is unexpected in politics, since we expect our predictions to do better when the business is serious. We certainly don’t expect to see another


“Dewey Wins!” headline, but both the Brexit vote and Trump’s election clearly confounded consensus predictions. In the latter case, the actual margins in several key states were very small – but in politics as in sports a win is a win. It was also noteworthy that while data-savvy campaign teams seemed to be the story in the previous election cycle, Trump’s campaign seemed to demonstrate that they weren’t essential. The savvy predictions may have been



The goal in any statistical investigation is to bring forth some insight from the data. Photo Courtesy of 123rf.com | Thananit Suntiviriyanon

correct, yet an 80 percent chance of winning is not a certainty, and the less likely outcome is still possible. Upsets in statistical prediction were not the only big story in statistics this year. The inability of researchers to replicate published experiments in several fields, such as psychology, has called published experimental results into question. It has also led to revisions in thinking about the old standby, the p-value. For instance, in one study of 100 articles in top psychology journals, only about 36 percent of the significant results were successfully replicated. Last May the American Statistical Association issued a statement condemning the use


of any single measure, such as p-values, as a substitute for scientific reasoning. One journal, Basic and Applied Social Psychology, has eliminated their use altogether. Problems with an overreliance on the p-value have been known for years. In traditional hypothesis testing, the p-value is the probability, under the null hypothesis, of observing a statistic at least as extreme as the one actually observed. The null hypothesis is rejected when the p-value is sufficiently small, under the assumption that the alternative is the more likely explanation. Of course, in any large number of experiments, a “significant” result (i.e., one with a low p-value) is increasingly likely to


occur, as quantified by the Bonferroni inequality. That is why running many experiments and reporting only the “significant” ones distorts the actual p-value. One way to deal with the uncertainty about what a p-value means is through experimental replication, which can either confirm the noteworthy result or fail to do so. In the latter case, the lack of a significant result in the replication suggests that the first was simply a “false positive.” Since journals generally prefer novel results to replication of existing results, there is little incentive for independent replication.

SOFTWARE FOR STATISTICS

The goal in any statistical investigation is to bring some insight forth from the data, whether confirmation of a research hypothesis, or the reassurance that some process is still ticking along at the proper precision and regularity, or in building a usable model. To obtain these useful results, software must be able to perform a variety of functions including data acquisition and editing, presentation of results or relations among variables, transformations as needed, and computations to support the analysis. Computers were once human, as the recent hit film “Hidden Figures” illustrates. At Langley, the best computers were prized for their insight into the underlying analysis and physical processes as well


as computations [1]. The best modern software should provide the same assistance: both the computations that we choose and tools that enable the further analyses those results suggest. The investigation is usually iterative, using one result to suggest alternative approaches and further experiments. Software will also include the ability to compute critical values from reference sampling distributions such as the normal, t and F, from which p-values (for instance) can be computed. In fact, many of our critical mathematical and statistical tables were first computed by human computers in the early part of the last century. This is noted in another book about human computers, “When Computers Were Human” [2].

Software offers more than simply computations. Exploratory analysis was in part designed to generate pictures of the data that could be assembled quickly and by hand – dot plots, stem-and-leaf plots and box plots, for instance – minimizing the complexity of computation needed for insight. Increasingly, multiple plots are provided in arrays or at the margins of other plots. For instance, box plots or histograms display the marginal distributions while the central plot provides the scatter plot. In multivariate investigations, a two-dimensional array of two-dimensional scatter plots helps the analyst visualize higher dimensional relationships. The


best software provides the ability to manipulate plots interactively to identify points or sets of points that are noteworthy (e.g., outliers) or to transform the variables within a graph. This is a particular strength of the JMP software.

Software provides a greatly enhanced range of graphical displays. Graphics are an excellent way to visualize data – to see distributions and commonalities across variables or in location. Data can also be summarized geographically. A recent popular-interest article in The New York Times is representative of the possibilities. In the 2016 presidential election results, voting for Donald Trump was more highly correlated with the popularity of certain television shows than with presidential voting in the last election. The cultural divide remarked upon during the election was mapped using the popularity of 50 television shows across the counties of the United States, which was then correlated with election results. The correlation is more easily understood graphically than numerically [3].

Finally, good statistical software can assist in the design of experiments. A good analysis, often in the context of the old PDCA cycle of “plan, do, check and act,” begins with a question and a plan for the collection of experimental data. Software can be used to assist in sample size computations through power analysis, or

provide specialized designs for experiments in one or more variables.

Modern software has the additional advantage that it opens analysis to a wider circle of individuals who would not be able to perform the analyses themselves. Since computations are less of a requirement, introductions to statistics are available to a wide array of individuals. The American Statistical Association sponsors teacher clinics for classes and poster competitions at the K-12 level, and AP statistics courses are growing quickly as well.

SOFTWARE SURVEY PRODUCTS

This year’s biennial statistical software survey provides capsule information about 19 products selected from 13 vendors. The products range from general tools that cover the important techniques of inference and estimation to tools for specialized activities such as nonlinear regression, forecasting and design of experiments. The product information contained in the survey was obtained from product vendors and is summarized in tables to highlight general features, capabilities and computing requirements, and to provide contact information. Many of the vendors have their own websites for further, detailed information, and many provide demonstration programs that can be downloaded from these sites. No attempt was made to evaluate or rank


the products, and the information provided comes from the vendors themselves. The survey data is available online (see Editor’s Note). Vendors that were unable to make the original publishing deadline are added to the online survey as they complete the online questionnaire.

Products that provide statistical add-ins for use with spreadsheets remain popular and provide enhanced specialized capabilities for spreadsheets. The spreadsheet is the primary computational tool in a wide variety of settings, familiar and accessible to all. Many procedures of data summarization, estimation, inference, basic graphics and even regression modeling can be added to spreadsheets in this way. An example is the Unistat add-in for Excel. The functionality of products for use with spreadsheets continues to grow, including risk analysis and Monte Carlo sampling, such as Oracle Crystal Ball.

Dedicated general and special purpose statistical software generally has a wider variety and depth of analysis than is available in the add-in software. For many specialized techniques such as forecasting, design of experiments and so forth, a statistical package is appropriate. In general, statistical software plays a distinct role on the analyst’s desktop, and provided that data can be freely exchanged among applications, each part of an analysis can be made with


the most appropriate (or convenient) software tool. An important feature of statistical programs is the importation of data from as many sources as possible, to eliminate the need for data entry when data is already available from another source. Most programs have the ability to read from spreadsheets and selected data storage formats. Within the survey we observe several specialized products, such as STAT::FIT, which are more narrowly focused on distribution fitting than general statistics, but of particular use to developers of models for stochastic systems, reliability and risk. ❙

James J. Swain (swainjj@uah.edu) is professor in the Department of Industrial and Systems Engineering and Engineering Management at the University of Alabama in Huntsville. He is a longtime member of INFORMS, as well as ASA, IIE and ASEE.

EDITOR’S NOTE: SURVEY DIRECTORY & DATA

To view the statistical software survey products and results, along with a directory of statistical software vendors, click here.

REFERENCES
1. Margot Lee Shetterly, 2016, “Hidden Figures,” William Morrow.
2. David Alan Grier, 2005, “When Computers Were Human,” Princeton University Press.
3. Josh Katz, 2016, “‘Duck Dynasty’ vs. ‘Modern Family’: 50 Maps of the U.S. Cultural Divide,” The New York Times, The Upshot, Dec. 27. Available online at: https://www.nytimes.com/interactive/2016/12/26/upshot/duck-dynasty-vs-modern-family-television-maps.html?_r=0.
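The article's point about multiple experiments can be illustrated with a short simulation. The sketch below is not taken from any surveyed product; it assumes that every null hypothesis is true, that the tests are independent, and that each p-value is therefore uniform on [0, 1]. The function names are hypothetical and chosen only for this illustration. It estimates how often at least one "significant" result appears by chance and compares that with the bound given by the Bonferroni inequality.

```python
import random


def family_wise_error_rate(num_tests, alpha=0.05, trials=10_000, seed=0):
    """Estimate P(at least one 'significant' result) when every null is true."""
    rng = random.Random(seed)
    runs_with_false_positive = 0
    for _ in range(trials):
        # Assumption: with a continuous test statistic and a true null,
        # each p-value is uniformly distributed on [0, 1].
        if any(rng.random() < alpha for _ in range(num_tests)):
            runs_with_false_positive += 1
    return runs_with_false_positive / trials


def exact_rate_if_independent(num_tests, alpha=0.05):
    """Exact probability of at least one false positive for independent tests."""
    return 1.0 - (1.0 - alpha) ** num_tests


def bonferroni_bound(num_tests, alpha=0.05):
    """Bonferroni inequality: the rate is at most num_tests * alpha (capped at 1)."""
    return min(1.0, num_tests * alpha)


if __name__ == "__main__":
    # Columns: number of tests, simulated rate, exact rate, Bonferroni bound.
    for m in (1, 5, 20):
        print(m,
              round(family_wise_error_rate(m), 3),
              round(exact_rate_if_independent(m), 3),
              bonferroni_bound(m))
```

Under these assumptions the chance of at least one spurious "significant" result grows quickly with the number of experiments, which is why reporting only the significant ones misstates the evidence.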



DATA ANALYTICA CEREBRUM

Understanding the underlying methodology and mindset of how to approach and handle data is one of the most important things analytics professionals need to know. INFORMS intensive classroom courses will help enhance the skills, tools, and methods you need to make your projects a success.

SAMPLING BIAS PRESCRIPTIVE PREDICTIVE STOCHASTIC MODELS NON-TECHNICAL DECISION MAKERS

UNSTRUCTURED PROBLEMS

REGRESSION OPTIMIZATION vs. SIMULATION DISPARATE INFORMATION

UPCOMING CLASS:

Essential Practice Skills for High-Impact Analytics Projects
April 5–6, 2017 | 8:30am–4:30pm
Seattle Pacific University, Seattle, WA

Limited seating available. Register at www.informs.org/continuinged



VIEWPOINT

Disarming ‘Weapons of Math Destruction’

BY ERIC SIEGEL

Cathy O’Neil, an industry insider and experienced expert, thoroughly covers the sociological downside of data science in her New York Times bestseller and first-of-its-kind book, “Weapons of Math Destruction.” In the world of big data, there’s a lot of music to be faced. With all its upside, data science’s deployment risks being prejudicial, predatory, exploitative, buggy, blindly trusted and secretive. And it has the potential to magnify the consumer’s personal economic struggle rather than remedy it. These risks permeate the field. The book’s broad coverage includes examples from all the main business application areas to which predictive models commonly apply: marketing,


online ads, credit scoring, insurance, workforce analytics, law enforcement and political campaigns. By providing such a uniquely comprehensive treatment of data’s downside, the book addresses two dire needs: increasing awareness and opening the door to prolific discussion. When exercising the power of analytics, what could be more important than that? Establishing and managing ethical data science is as important as including the brakes in a state-of-the-art automobile. O’Neil’s book pioneers much-needed first steps.

RISKS TO SOCIAL JUSTICE

“Weapons of Math Destruction” covers a range of ethical dilemmas. The sociological risks of data science include:


Magnifying the social divide. Predicting a poor credit risk is, to some degree, a self-fulfilling prophecy that can hurl those less financially secure into a vicious cycle. This pitfall applies analogously when prison sentencing may be guided in part by a felon’s neighborhood, and when a university places low on U.S. News & World Report college rankings. More generally, data science may amplify capitalism’s tendency to further disempower the disenfranchised, i.e., punish the poor.

Racial prejudice. Law enforcement models that predict crime and recidivism have been shown to intrinsically enact biases against minority groups. They are swayed in part by who you are rather than what you’ve done.

Predatory micro-targeting. Highly targeted online ads are more adept than ever at exploiting vulnerable consumers and separating them from their money.

Opaque, overly trusted and sometimes buggy. Whether buggy (in some demonstrable cases) or unjust, predictive models often are not disclosed transparently – hidden from scrutiny – and in some cases are overly trusted by those who rely on their predictions.

Cathy O’Neil’s best-selling book explores sociological downside of data science.

BUT DATA SCIENCE IS LARGELY GOOD

The book’s main shortcoming to keep in mind while you’re (hopefully) reading it is that – in most places – it appears to sweepingly indict and vilify data science. Ultimately, that’s going too far. A little extra copy could have easily clarified as much up front in the introduction by saying something like, “The risks are dire, measures must be taken to protect social justice when deploying data science, and here are several examples along with suggested adjustments to implement such protective measures.”

Instead, the author begins the book by recounting her personal story of defecting completely from data science’s commercial practice as a generally unethical endeavor, and the remainder of the book covers a list


of examples that will sound to many readers like no less than a diatribe. The book’s opening anecdote – on a flawed teacher evaluation metric – leaves out that, instead of doing away with the entire system, such a bug could in fact be remedied. Most mentions of micro-targeting tacitly imply it always necessarily serves greed or otherwise enacts some forms of injustice. The book focuses almost exclusively on the negative – depicting one train wreck after another – with relatively little to balance that out in the way of constructive advice that could actively mitigate risks to social justice within existing practices. The implicit advice is to basically turn it all off. But, if you read the full book in detail, you will indeed come across this intelligent author stating that, no, math is not inherently evil, and it can also serve for the greater good. She goes further with the following two quotes, each relatively buried within the book (the latter is literally the last sentence of the last chapter before the Conclusion): “... mathematical models can sift through data to locate people who are likely to face great challenges, whether from crime, poverty or education. It’s up to society whether to use that intelligence to reject and punish them – or to reach out to them with the resources they need. We can use the scale and efficiency that make ‘weapons of


math destruction’ so pernicious in order to help people. It all depends on the objective we choose.”

“... with most ‘weapons of math destruction,’ the heart of the problem is almost always the objective. Change that objective from leeching off people to helping them, and a WMD is disarmed – and can even become a force for good.”

I would go further than O’Neil and claim that many deployments of predictive models, even when designed to pursue profit as the objective rather than social justice, do more good than bad. Profit by way of efficiency is not always a bad thing. Moreover, consumers gain value by way of predictive models: less junk mail (which is also better for the environment), more relevant ads, better movie, music and book recommendations, effective email spam filters, better Google search results, more engaging Facebook feed content, more robust healthcare, and increased safety by more effectively targeting the inspection of buildings and manholes.

This technology is like a knife: Its power can be used for good or for evil. That means it can be dangerous, but the idea of completely eliminating it – or even just its profit-driven deployment – is not on the table. It’s important not to go too far and criticize the entire field of data science in


absolute terms. By painting it completely black, you compromise your own credibility and weaken your valuable voice in the call for social justice. Having said that, I strongly support O’Neil’s inspirational motion that you go out and pursue social justice rather than – or at least in addition to – profit.

CONCLUSION

While “Weapons of Math Destruction” conveys an oversimplifying, “black-and-white” position, that’s just one aspect of what amounts to a broad treatment of a multi-faceted, critical topic. I encourage you to look past that aspect in order to gain from this important book. I anointed it five stars on Amazon, and I strongly recommend you read it as well and pass on the word. ❙

Eric Siegel, Ph.D., is founder of the Predictive Analytics World conference series and executive editor of The Predictive Analytics Times. A renowned speaker, educator (he’s a former Columbia University professor) and leader in the field, Siegel is the author of the award-winning book, “Predictive Analytics: The Power to Predict Who Will Click, Buy, Lie, or Die,” the second chapter of which is devoted to ethical issues that arise in predictive analytics’ deployment.

THE ODDS ARE IN YOUR FAVOR AT ANALYTICS 2017! Exhibitor and Sponsorship Opportunities at the 2017 INFORMS Conference on Business Analytics and Operations Research. This premier Analytics conference draws 800+ practitioners for three days of networking, professional development & intensive learning. For Sponsorship and Exhibitor details visit: http://meetings.informs.org/analytics2017

Contact: Olivia Schmitz INFORMS Exhibit & Sponsorship Sales Manager Olivia.Schmitz@informs.org 443.757.3539

CAESARS PALACE, LAS VEGAS APRIL 2-4, 2017



ANALYTICS & HUMAN RESOURCES

Predictive talent acquisition strategy

As the cost of failed new hires grows, so does the importance of pre-hire assessments. So how do you find the right talent assessment vendor? Ask these questions.

BY GRETA ROBERTS

Over the past 30+ years, businesses have spent billions on talent assessments. Many of these are now being used to understand job candidates. Increasingly, businesses are asking how (or if) a predictive talent acquisition strategy can include the use of pre-hire assessments. As costs of failed new hires continue to rise, recruiters and hiring managers are looking for any kind of pre-hire information to increase the probability of making a great hire.


Despite the marketing hype, all predictive analytics projects have three very simple steps:

1. A system reads “input” data – perhaps assessment scores or CV information.
2. The system does some math to apply a “predictive model” to the input data.
3. The results of the model are shown as “output” data of the model, perhaps the likelihood of the candidate achieving a certain level of sales performance or another key performance indicator (KPI).

For most companies, current pre-hire talent assessments are wasted data. Photo Courtesy of 123rf.com | rawpixel

At heart, a predictive system takes “inputs” and turns them into “outputs” or predicted business outcomes. But to build and validate a model, you need a healthy, logical set of both input and output data for that role in your company. If you are using a talent assessment alone, this is just input data. To be predictive you need to include the other two steps.
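The three steps can be sketched in a few lines of Python. This is a minimal illustration, not any vendor's method: the `Candidate` fields, the coefficients and the function name are all hypothetical, and a real model would have its coefficients fitted to and validated against your own outcome data.

```python
import math
from dataclasses import dataclass


@dataclass
class Candidate:
    """Step 1: the "input" data (hypothetical fields for illustration)."""
    name: str
    assessment_score: float   # assumed 0-100 pre-hire assessment score
    years_experience: float   # assumed CV field


# Hypothetical coefficients. In practice these would be estimated from
# historical inputs paired with real outcomes (attrition, KPIs).
INTERCEPT = -4.0
WEIGHTS = {"assessment_score": 0.05, "years_experience": 0.2}


def predict_hits_sales_target(candidate):
    """Steps 2 and 3: apply a logistic model and return a probability."""
    z = (INTERCEPT
         + WEIGHTS["assessment_score"] * candidate.assessment_score
         + WEIGHTS["years_experience"] * candidate.years_experience)
    return 1.0 / (1.0 + math.exp(-z))


if __name__ == "__main__":
    for c in (Candidate("A", 60.0, 1.0), Candidate("B", 85.0, 4.0)):
        print(c.name, round(predict_hits_sales_target(c), 2))
```

Without outcome data there is nothing to estimate the coefficients from, which is the sense in which an assessment score alone is only an input.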


For most companies, current pre-hire talent assessments are wasted data. Results are delivered in an individual report that often cannot be analyzed or aggregated. For most “legacy” talent assessments, it’s difficult or impossible to determine what positive (or negative) business effect the assessments are having. It often comes down to the question of “how much the HR person believes the results” vs. “how much the business is able to document and realize


real results.” It can be daunting to figure out which vendors will actually deliver a predictive solution. To help, here are some important questions to ask of your talent assessment vendor:

Consider the predictive company itself. Are you dealing with an assessment company that is trying to learn how to be predictive? Or is it a predictive company that also uses assessment data? How long have they been doing predictive work?

Consider the predictive team. Ideally the company will have data scientists on staff as well as industrial organizational (IO) psychologists. This is important because data scientists tend to utilize more modern and rigorous methods for prediction and validation. IO psychologists tend to be focused on the instrument, while data scientists tend to be concerned with predictive validity and business results.

Are they predicting for your company or for everyone? Some companies create “industry benchmarks,” that is, general performance predictions for general industry categories such as retail sales or customer service. These predictions are significantly less accurate, because they are based on companies different from your own, with different cultures, goals and regions. Not all “customer service” is the same. Modern


computing methods enable leading providers to create and validate predictive models for your roles in your own company alone, and to continuously update the model over time.

Do they care about your outcome data? Generally, these solutions predict attrition or performance for a candidate or employee. Has the assessment company asked you for the attrition or KPI data for your employees in your target role? If they don’t know your employee outcomes, how can they predict your outcomes? They can’t. Most job roles have multiple KPIs that describe performance. Do they predict each of these separately? For KPIs that naturally contradict each other, e.g., speed vs. accuracy, how does the predictive solution resolve the contradiction? Just getting a “green light” isn’t good enough in many cases.

What sample size did they ask for? Real predictions require a reasonable sample to properly validate that you aren’t being fooled by randomness. If they only ask for 15 top performers, your sample is too small to create a real prediction.

Does the solution base predictions on outcome data or a job fit, job match or job blueprint survey? Data science predicts what you ask it to predict. If you want lower attrition or higher KPIs, the models must be trained and validated


with those data alone. The process looks for fact-based patterns to drive your business. Surprisingly, many solutions don’t use this approach, but fall back to managerial bias. These solutions ask well-meaning committees of managers to list competencies that they believe are needed for success in a role. The resulting criteria are not predictive at all; they just find candidates that match the laundry list of beliefs and biases held by that committee. Nowhere in this process is a connection to actual attrition or KPI outcomes. Again, if the system doesn’t know about your outcomes, how can the process predict them? Start with data, not bias.

Does the solution use machine learning to recalibrate your predictive models? How often? Business needs, role descriptions and culture change over time. Local labor conditions change. For example, service representatives may be incentivized to cross-sell related products, or new regulations may require new compliance tasks to be performed. It is important to update and revalidate your predictive models two to four times a year to keep up to date with seen and unseen trends. Some solutions have not changed their models for 30 years; do you expect these to find great sales reps for you?
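The earlier sample-size question, about being fooled by randomness, can be made concrete with a small simulation. The sketch below is an illustration under stated assumptions, not a validation procedure: performance and every assessment trait are pure, unrelated noise, and the function names and the choice of 50 traits are hypothetical. It reports the strongest correlation found by chance alone.

```python
import random
import statistics


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


def best_spurious_correlation(n_employees, n_traits=50, seed=1):
    """Largest |correlation| between performance and traits that are pure noise."""
    rng = random.Random(seed)
    performance = [rng.gauss(0, 1) for _ in range(n_employees)]
    best = 0.0
    for _ in range(n_traits):
        # Each simulated trait is generated independently of performance.
        trait = [rng.gauss(0, 1) for _ in range(n_employees)]
        best = max(best, abs(pearson(trait, performance)))
    return best


if __name__ == "__main__":
    for n in (15, 100, 500):
        print(n, round(best_spurious_correlation(n), 2))
```

With only 15 people, some meaningless trait will usually look strongly related to performance; with several hundred, the strongest chance correlation shrinks considerably. That is the practical reason to ask a vendor what sample size they need.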

The new validation question: criterion validation? HR has been taught to ask if the assessment is validated. The first level of validation checks whether the assessment measures are self-consistent. Continue to ask this question. But ultimately you care about whether the assessment feeds predictions that accurately correspond to improved business outcomes. That is, are the predictions actually working? This level is called “criterion validation” and is a high bar that is not commonly reached by vendors. A top tier predictive talent assessment vendor will perform criterion validation for the solutions several times a year. Criterion validation is the highest level of validation possible, and is the most preferred by regulatory agencies.

Can you easily access/download your company’s talent assessment data? Talent assessment data is a critical data set for your company. If your talent assessment vendor makes it difficult or impossible to access your talent assessment data, this is a good indication they are using pre-predictive technology and that they don’t appreciate that this data is your asset. True predictive solutions know that your workforce data scientists will want to use your talent assessment data to find correlations and predictions in many


areas of your business. You need to insist on easy and direct access to the raw assessment scores.

How easy is it to deploy the solution into the talent acquisition process and use the predictions? How much training is required? Do your talent acquisition professionals need to read long text reports or get out a calculator to use the predictions? The complexity of a prediction should be kept out of the way of daily operations. If your team still needs to “think” about what the answer is, it is probably not a predictive solution.

Is there a different assessment for every role? Or one assessment with multiple predictive models? Multiple assessments make it impossible to predict one candidate’s performance against multiple roles. This may also be a signal that you are working with an older, legacy (less predictive) talent assessment supplier.

Is there an answer key for their solution on the web? For many assessments, there are answer keys and guides on how to fool or pass the test, which means two things: 1) the test is easily fooled, lacking internal controls to prevent spoofing, and 2) you are looking at an “industry benchmark” with one clear set of answers. A data science-driven model would be custom to your role in your company,


continuously evolving and therefore very difficult for answer keys and spoofing to catch.

Ask to see the company policy on employee predictive modeling, discrimination, disparate impact and fairness. It is important that a predictive solution has thought through the specific outcomes of its models and how they fit into creating fair opportunity for all applicants. In particular, it is vital for the solution to satisfy or exceed any government requirements for hiring and selection.

Do your own (internal) data scientists approve of the predictive solution? Ask one of your own data scientists (from HR, marketing or another area inside your own company) to accompany you in your evaluation. They should know what is a rigorous approach and what is marketing fluff.

How does the predictive solution regularly prove to you that the models are working? Ideally the company you select will be able to show you two to four times a year how your predictions are working (i.e., turnover is going down, sales are going up, calls are going up, errors are going down, etc.). ❙

Greta Roberts is CEO of Talent Analytics, Corp., which helps to solve employee performance & attrition challenges through its predictive talent analytics software platform.



Healthcare 2017 OPTIMIZING OPERATIONS & OUTCOMES INFORMS Healthcare 2017 brings together academic researchers in “healthcare analytics” and industry stakeholders who are applying and sharing research to improve the delivery of effective healthcare.

Paper Submission Deadline is March 24 Take advantage of this opportunity to present your research, to showcase existing methods, and to deliver innovative techniques for addressing emerging challenges at all levels of healthcare delivery. You still have a chance to be a part of Healthcare 2017 and capitalize on the increased interest to achieve implemented solutions and testable outcomes.

Keynote Speakers Dimitris Bertsimas Operations Research Center Massachusetts Institute of Technology Brian Denton Department of Industrial and Operations Engineering University of Michigan Dr. Eric de Roodenbeke CEO International Hospital Federation

SUBMIT YOUR ABSTRACT

Deadline for submission is Monday, March 24, 2017

http://meetings.informs.org/healthcare2017

HEALTHCARE 2017 Rotterdam, Netherlands | July 26–28, 2017


HUMAN BEHAVIOR MODELS

Voter motives and messages

The analytics story behind the Scottish secession vote, “Brexit” referendum and U.S. presidential election.

BY DOUGLAS A. SAMUELSON

Amidst the ongoing debates about the U.S. 2016 elections, the most plausible explanation comes to us from a Greek OR/MS analyst who started off looking at medical research. As often happens, truth is stranger than fiction, and powerful insights tend to emerge from seemingly unrelated lines of thinking.

[Photo: Dimitris Vayenas]

Several years ago, Dimitris Vayenas’ father was diagnosed with amyotrophic lateral sclerosis (ALS), also known as the disease that killed baseball great Lou Gehrig. Since the disease is highly heritable, this implied that Vayenas himself had about a one-in-six chance of developing it. Naturally, he became curious about it and interested in bringing together the Greek and the Israeli ALS Patient Associations. One thing led to another, and in 2011 he found himself sitting behind two leading French researchers at a conference on ALS at the Weizmann Institute of Science in Israel while a British researcher from the University of Oxford convincingly argued that the French researchers’ line of



Models of human behavior provide a basis for powerful emotional appeals that can influence elections and other actions. Photo Courtesy of 123rf.com | rawpixel

inquiry was wrong. Why, he asked the French researchers, didn’t they offer to join the efforts? He was promptly informed of the incentives, in grant-funded research, to compete rather than collaborate. This got him thinking about incentives against useful information sharing in many areas of endeavor. In particular, he shared his concerns with his mentors at the RAND Corporation, professors James A. Dewar and Warren Walker, on transparency and communication in information networks. That led to his matriculation at the Department of Computer Science at the University of Oxford


in June 2012, aiming toward the “modeling and objective quantification of transparency” that may help the quantification of each party’s contribution in knowledge-based collaborative efforts. He noted that current formal access control models (ACM) are mainly involved with data instead of users. To their detriment, these models assume that the users will always fulfill their intended roles and obligations, so the access control is unable to deal with threats such as insider misdeeds. He realized that many protection policies and practices focus on single-user, single-purpose access, but the real evolving threats involve



many-to-many access. The protective focus, he recommended, needs to shift from one user access at a time to collaborations of users – or, as he put it, from vetting access rights to monitoring ongoing information flows and adapting responses. He also realized that understanding information requires combining content and context, incorporating deeper and more complex representations of the background and intentions behind the communications.

In addressing the question of transparency, the detection of “true” and “false” statements in interactions is of primary importance. However, it is not a necessary and sufficient condition; for example, one party can state only “truths” but can be questioned on why it opted to state these “truths” and not other “truths” in the given timeframe and context.

He tested his theories about information protection with an analysis of attempted and potential intrusions into the 2011 United Kingdom census [1] and invented a verb-based access control method (ACM) in order to offer direct integration of behavioural content with the ACM. This included and incorporated provenance (information sources and paths of origin) in the model. In this way, he transformed the ACM to focus on activity-experience content, rather than just static access records, to ensure more efficient monitoring of experiences [1,2].


This research led him, in turn, to assessing how adversaries think. The Soviet Union, based on extensive and intensive study, had developed a simple but powerful model of human motivations. They saw people in terms of four basic motivational groups: conflict (power)-based, need-based, joy-based and affection-based. Table 1 summarizes their view.

Using this overview of human motivators, the Soviet propagandists had exerted considerable influence over their own public as well as other countries’ policies – and elections. This led to the realization that the Russians, among others, are still engaged in influence-seeking activities along these lines – and if they can do it, so can many others.

In his predictive model, Vayenas modeled the Brexit/Trump campaigns as addressing primarily the conflict motive, pivoted by the joy motive, with only implicit references to the need and affection (altruistic) motives. Correspondingly, the Bremain/Clinton campaigns were modeled as addressing primarily the need motive, pivoted by the affection motive, with only implicit references to the conflict and joy motives.

(The question of whether the Russian government did, in fact, actively intervene in the U.S. election is the subject of ongoing congressional investigation and is well beyond the scope of this article. It is noteworthy, however, that

W W W. I N F O R M S . O R G


Motive based on:

CONFLICT

JOY

NEED

AFFECTION

War, Politics, Diplomacy, Related Life Activities: Shipping, Aviation, Exploration, Sports

Love & Sex, The Arts, Entertainment, Fashion, Travel, Relaxation

Agriculture, Trade & Industry, Family, Spirituality, Philosophy, Engineering, Finance, Medicine, Philanthropy, Cooking, Immigration Pedagogy, Pacifism

Representative Age:

Child

Youth

Middle

Senior Citizens

Representative Nationality:

Greek

French

English

Indian

Trade Unions, Standard of Living, Technical Achievements, Trade Shows, Financial Aid

Humanistic and Pacifistic Ideals, Ethics, Antidefamation, Anti-racism, Charity, Internationalism

Nationalistic Ideals, Political Festivals and Carnivals, Indicated Propaganda Debate, Militant Rallies, Literature, Tourism, Mechanisms: Competitive Sports Educational Exchanges

Table 1: Human motives and optimal communication according to propaganda specialists in former USSR. Source: Georgalas, G., 1967, “Propaganda: The Methods and Techniques of Educating the Masses,” new thesis. (in Greek, translated from an obscure Russian source)

those who unarguably did influence the election did so in a way the Soviets would have found quite familiar.) Hence, Vayenas turned his attention to analyzing recent and pending elections and concluded that identifying small groups of voters, understanding their motivations and crafting multiple-mode messages that combined the right motivational elements is the way to if not win, to at least accurately predict, the outcome. The effort started by examining the expression of “truth” in the Scottish Referendum where it sensed that the “no” (to independence) voters were bullied by their communities, and therefore the likelihood of their lying in polls was probable. It is worthwhile to note that “undecideds” were 23 percent in person-to-person polling, 14 percent on telephone polling and less than 10 percent in Web polling; these disparities led him to consider these elections as an ideal


proving ground for his approach of determining what the result “ought to be” based on the motives of the public as addressed by the campaigns. This idea is not new, of course, but the explosive advances in big data, big computing and data collection have made astonishing new things possible. He proceeded to predict, more accurately than most other analysts, the outcomes of the Scottish secession vote, the “Brexit” referendum on whether Britain should leave the European Union and the U.S. presidential election [3] (see Table 2). Vayenas concluded, “It is hard to avoid the parallels with the myth of Cassandra and the tragedy that the rational forces seem to be unaware of these fundamentals of human motives and their impact in electoral outcomes as it appears that the voters, by and large, remain unaffected by the theatrics of the campaigns in terms of



policies presented. Moreover, it appears that, contrary to popular belief, more than 96 percent of the voters make up their mind as soon as the elections are announced; they just either don’t know it or don’t admit it in public by giving the impression to the pollsters that their vote is negotiable up to the moment they reach the ballot box. The polemic against “populism” is a polemic against the essence of democracy as a means of expressing one’s motives as experienced in ancient Athens. The forces of reason, rather than intensifying their polemic against populism, need to take into account all these motives if they are to succeed in avoiding unnecessary surprises with unintended consequences and a way to ensure that any future computer network hacking, from Russia or elsewhere, will be doomed to irrelevance.”

Vayenas is the first to concede that his predictions were shared only among his friends and followers on his Facebook profile, were not widely disseminated in advance and are therefore not as fully tested as analysts would prefer. Still, his analysis is sufficiently similar to other studies and less formal after-action reports that it appears to merit serious consideration.

CONTEXT FROM OTHER SOURCES

Readers of Analytics magazine and OR/MS Today may recall that building databases via social media, to help assess what messaging would work, was a key aspect of the 2008 [4] and 2012 [5] Obama campaigns, and that Obama also used social media to shape his agenda and messaging after the election [6]. Even in 2008, the main ideas of the micro-targeting method were far from new, as political scientist Eugene Burdick had depicted them in best-selling novels in 1956 [7] and, with more technical detail, in 1964 [8]. Burdick also discussed some of the troubling moral issues that could arise from an unscrupulous or even malevolent candidate using micro-targeting, especially if the conflict and fear motives dominated. George Reedy, former press secretary to President Lyndon Johnson, pointed out that excessively negative campaigning could undermine the legitimacy of the resulting government [9]. More recently, a RAND study found strong evidence of polarization of the American electorate, both along geographic lines and along interest lines, with serious adverse consequences for the ability of the U.S. House of Representatives to rise above partisan squabbles and get anything done [10].

Election | Predicted | Time of Notice | Actual | Previous Election | Poll of Polls Average (on the date of the prediction)
2014 Scottish Referendum | 54.7% | Two days | 55.3% | N/A | 49%
2015 UK General Elections: Conservatives | 36.5% & outright majority | Five days | 36.9% | 32.4% | 33.6% & hung Parliament
2015 UK General Elections: Labour | 31.5% | | 30.4% | 28.9% | 33.5%
2015 UK General Elections: Liberal Democrats | 6% | | 7.9% | 22% | 10%
2015 Greek Referendum | 59.5-63.5% | Ten days | 61.3% | N/A | 47.5%
2015 Greek Elections: Syriza | 36% | Three weeks | 35.46% | 36.3% | 31%
2016 Brexit Referendum | 52% | One month | 51.9% | N/A | 48%
US Elections: Popular Vote Winner | H. Clinton +2% @ 54% turn-out | July 2016 | +2.1% @ 54.6% | N/A | 14%
US Elections | D. Trump due to wins in FL, OH, MI, PA, WI, | August 2016 | | | H. Clinton

Table 2: All predictions as published in D. Vayenas’s Facebook timeline.
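The rows of Table 2 that give a single percentage in each column allow a quick arithmetic check of the claim that the predictions beat the polls. The sketch below is only an illustration, not part of Vayenas's analysis: it assumes that the predicted, actual and poll-of-polls columns all measure the same quantity in each row, and it leaves out the Greek referendum (predicted as a range) and the U.S. rows (not single percentages). The names `ROWS`, `abs_errors` and `mean` are introduced here for illustration.

```python
# Rows transcribed from Table 2 that have one numeric value per column:
# (election, predicted %, actual %, poll-of-polls average %).
# Assumption: all three figures in a row refer to the same vote share.
ROWS = [
    ("2014 Scottish Referendum", 54.7, 55.3, 49.0),
    ("2015 UK: Conservatives", 36.5, 36.9, 33.6),
    ("2015 UK: Labour", 31.5, 30.4, 33.5),
    ("2015 UK: Liberal Democrats", 6.0, 7.9, 10.0),
    ("2015 Greek Elections: Syriza", 36.0, 35.46, 31.0),
    ("2016 Brexit Referendum", 52.0, 51.9, 48.0),
]


def abs_errors(rows):
    """Return (election, |predicted - actual|, |poll - actual|) per row."""
    return [
        (name, abs(predicted - actual), abs(poll - actual))
        for name, predicted, actual, poll in rows
    ]


def mean(values):
    """Arithmetic mean of an iterable of numbers."""
    values = list(values)
    return sum(values) / len(values)


errors = abs_errors(ROWS)
mean_pred_error = mean(e[1] for e in errors)
mean_poll_error = mean(e[2] for e in errors)

if __name__ == "__main__":
    for name, pred_err, poll_err in errors:
        print(f"{name}: prediction off by {pred_err:.2f}, polls off by {poll_err:.2f}")
    print(f"Mean absolute error: prediction {mean_pred_error:.2f}, polls {mean_poll_error:.2f}")
```

On these six rows the mean absolute error works out to roughly 0.8 percentage points for the predictions against roughly 3.9 for the poll averages. Six self-reported data points are far too few to settle the question, which is consistent with the caveat above that the predictions are not as fully tested as analysts would prefer.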




Last but not least, it turns out that credit for both the Brexit vote and the Trump victory can be claimed, at least in part, by a British consultancy, Cambridge Analytica, which appears to have employed pretty much the methods and analysis Vayenas had independently developed. In its case as in his, there is a basis for some skepticism about how good its predictions really were and how much it was just lucky [11]. Still, if they’re right, and data-driven polarization-based campaigns will now be the ones that succeed, the implications deserve careful consideration by both researchers and political leaders.

CONCLUSIONS

There is substantial evidence, from Dimitris Vayenas and others, that models of human behavior based on relatively simple deep motivations provide a basis for powerful emotional appeals that can influence elections and other actions. These motivators seem capable of swamping appeals to reason alone, rendering much policy-based advocacy irrelevant and ineffective. In addition, such motivators explain threats of intrusion to general information and communication systems better than traditional forms of assessment, indicating a different approach to information security. Both the political and information security applications call for more of a focus on many-to-many network communications and less on one-to-many mass messaging. In short, the changing communication methods of our time are having profound effects on our governmental structures, effects we are only beginning to recognize. ❙

Doug Samuelson (samuelsondoug@yahoo.com) is president and chief scientist of InfoLogix, Inc. in Annandale, Va. He is a longtime member of INFORMS.

REFERENCES

1. Vayenas, Dimitris, 2015, “A Policy Analysis Approach to the Convergence of Formal Methods for Content and Context Modelling: A Verb-based Access Control Model,” Technical Report submitted to Department of Computer Science, University of Oxford and presented at MANCEPT, 2015.
2. Gonzalez-Manzano, Lorena, Slaymaker, Mark, de Fuentes, J. M., and Vayenas, Dimitris, “SoNeUCONABCPro: an access control model for social networks with translucent user provenance,” submitted to Lecture Notes in Computer Science: ACNS 2016.
3. Vayenas, Dimitris, 2016, unpublished communication to the Financial Times of London.
4. Samuelson, Douglas A., 2008, “Election 2008: How to Predict the Winner and How He’ll Do,” OR/MS Today, October.
5. Samuelson, Douglas A., 2013, “Analytics: Key to Obama’s Victory,” OR/MS Today, February.
6. Samuelson, Douglas A., 2009, “Change We Can Blog In: Obama’s Use of Social Media to Help Him Govern,” OR/MS Today, February.
7. Burdick, Eugene, 1956, “The Ninth Wave,” Houghton Mifflin.
8. Burdick, Eugene, 1964, “The 480,” McGraw-Hill.
9. Reedy, George, 1970, “The Twilight of the Presidency,” New American Library, Cleveland, Ohio.
10. Sussell, Jesse, and Thomson, James, 2015, “Are Changing Constituencies Driving Rising Polarization in the U.S. House of Representatives?” RAND Corporation Report RR896.
11. Wood, Paul, 2016, “The British Data-Crunchers Who Say They Helped Donald Trump to Win: Are Cambridge Analytica Brilliant Scientists or Snake-oil Salesmen?” The Spectator, UK, December. http://www.spectator.co.uk/2016/12/the-british-data-crunchers-who-say-they-helped-donald-trump-to-win/.




CONFERENCE PREVIEW

Business Analytics Conference ready to roll in Las Vegas

The conference will bring together nearly a thousand leading analytics professionals and industry experts to share ideas, network and learn.


The 2017 INFORMS Conference on Business Analytics & Operations Research will take place in Las Vegas on April 2-4 at Caesars Palace. Analytics 2017 will bring together nearly a thousand leading analytics professionals and industry experts to share ideas, network and learn through real-life examples of data-based analytical decisions. Long-format talks offer a chance to hear the complete story of successful analytical projects from inception through implementation. The Oscar-like Edelman Awards Gala on Monday evening promises to be a conference highlight. The gala honors the world’s best applications of operations research and advanced analytics (see the “Edelman Awards Gala” sidebar). This conference also offers substantial networking opportunities, making it the analytics event of the year for anyone who works in the analytics, operations research or management science fields.

Caesars Palace, site of the 2017 INFORMS Conference on Business Analytics & O.R., sits on the famed Las Vegas Strip. Image © Thickstock

Handpicked topics and speakers. The Analytics Conference has seen huge growth and success year after year due in part to the conference program committees. They develop the topic tracks, select speakers and organize the presentations that make up the heart of the conference. Maher Lahmar, IBM solutions executive, chairs the Analytics 2017 Committee (see the interview with the conference chair). The 38 members of the program committee include analysts and managers from companies such as Accenture, BNSF Railway, Chevron, Deloitte, Gartner, Google, Innovative Decisions, Intel, InterContinental Hotels Group, Kroger, Lockheed Martin, Mayo Clinic, The MITRE Corp., SAS, Schneider and Walt Disney Company, as well as leading universities and government agencies.

Jim Diamond, managing director of operations research and advanced analytics at American Airlines, will be the plenary speaker. His topic: “From Flying Machines to Machine Learning: Advanced Analytics at American Airlines.”

The conference committee has designated nine topical tracks for the 2017 invited speaker program: Analytics Leadership and Soft Skills, Analytics on Unstructured Data, Decision and Risk Analysis, Emerging Analytics, Entertainment and Gaming, Internet of Things, Marketing Analytics, Revenue Management and Pricing, and Supply Chain Applications. Leading analytics professionals from companies such as American Airlines, Amazon, Boeing, Caterpillar, Disney, General Motors, Google, Netflix, UPS and many others, as well as operations researchers from top universities and government organizations, have signed on to speak at the premier conference on business analytics and operations research. The program will be rounded out by six tracks of handpicked member-contributed talks, software tutorials from vendor sponsors and poster presentations.

Caesars Palace: a classic, Roman-themed casino hotel. Source: Caesars Palace

Conference venue. The conference will be held at one of the most prestigious resorts in the world, Caesars Palace. This iconic resort is one of the largest on the Las Vegas Strip. Caesars includes five towers of rooms, a pool garden, numerous bars, restaurants and other nightlife locations, and much more. It is connected to The Forum Shops, which include approximately 160 specialty stores, fine restaurants and attractions. If you venture outside of the hotel, you will find yourself in the middle of a global crossroads. The Las Vegas Strip is the entertainment capital of the world, with shows and restaurants galore. The Hoover Dam is a short 30-minute ride away. If you are feeling adventurous, you can venture a bit farther and take a 2.5-hour drive to the Grand Canyon. Las Vegas really is fabulous and offers a little bit of something for everyone.

The conference promises to be the best (analytics) show in Las Vegas. Image © Thickstock

Organizations can take advantage of the $1,070 team discount rate when they send three or more attendees to the conference. A $1,070 newcomer rate is also offered. This special rate applies to any INFORMS member who is attending the conference for the first time. All meals for two days are included in all registration fees. For more information regarding conference registration or submitting a presentation, visit: meetings.informs.org/analytics2017. ❙

Edelman Awards Gala

One of the highlights of the INFORMS Conference on Business Analytics & O.R. is the Edelman Awards Gala, an Oscar-like evening celebrating the world’s best examples of applied operations research and high-end analytics work. This year’s gala will be held on Monday, April 3, at Caesars Palace in Las Vegas.

The six finalists for the 46th Annual Franz Edelman Award for Achievements in Operations Research and the Management Sciences, the world’s most prestigious award for achievement in the practice of analytics and O.R., are:

• The American Red Cross, in partnership with researchers from the Georgia Institute of Technology, developed a new blood collection model to reduce the expenses associated with the 44,000 blood donations needed in the United States each day.

• Barco, a global technology company that manufactures products for the entertainment, healthcare and enterprise markets, partnered with researchers from the IESEG School of Management in France and KU Leuven and Vlerick Business School in Belgium to support the assessment and development of product platforms for the company’s production and development of high-tech screens used in medical devices.

• BHP Billiton, one of the world’s largest producers of major commodities including iron ore, coal and other metals and minerals, partnered with AMEC Foster Wheeler to prepare the company to expand into the production of bulk fertilizer commodities. The project sought to create a model that incorporated every component of mining, hoisting and ore processing, while decreasing capital and operating expenses, and increasing production capacity.

• General Electric (GE) partnered with Norfolk Southern (NS) Railroad to create and implement an algorithm to dispatch thousands of trains in real time, increase their average speed, and realize annual savings in the hundreds of millions.

• Holiday Retirement, the largest private owner and operator of independent senior living communities in the United States with more than 300 facilities and $1 billion in annual revenue, partnered with Prorize LLC to revise its outdated pricing model, replacing it with a new revenue management system that would increase revenue and improve customer satisfaction.

• New York City Department of Transportation, in conjunction with researchers from Rensselaer Polytechnic Institute’s VREF Center of Excellence for Sustainable Urban Freight Systems, developed the Off Hours Delivery (OHD) Project. This project, which began in 2002 to transform urban freight policy by transitioning delivery times to the off-hours of 7 p.m.-6 a.m. when New York City streets are less congested, is currently in use by more than 400 establishments and is being implemented internationally.

All six finalists will present their work in a series of sessions at the conference before a panel of judges. The winner will be announced at the Awards Gala. Along with the Edelman, the following awards and prizes will also be presented during the Edelman Awards Gala:

• INFORMS Prize for effectively integrating analytics into organizational decision-making.

• UPS George D. Smith Prize aimed at strengthening ties between academia and industry by rewarding institutions of higher education for effective and innovative preparation of students to be good practitioners of operations research.

• Daniel H. Wagner Prize for Excellence in Operations Research Practice emphasizes quality and coherence of the analysis used in practice.

• Innovative Applications of Analytics Award recognizes the creative and unique application of a combination of analytical techniques in a new area.

• INFORMS O.R. & Analytics Student Team Competition recognizes outstanding solutions to real-world problems developed by undergraduate and master’s student teams.

• Syngenta Crop Challenge in Analytics offers a $5,000 prize for the best solution to the following problem: How can a farmer make seed variety decisions that optimally reduce risk and increase yield?


Meeting of Analytics Program Directors

The inaugural Meeting of Analytics Program Directors (MAPD) will be held on April 1 at Caesars Palace in Las Vegas, the day before the start of the 2017 INFORMS Conference on Business Analytics & Operations Research in the same locale.

Sponsored by INFORMS, the meeting is designed to provide program directors of college analytics programs with a forum for discussion, networking and the sharing of best practices. While master’s level programs are the majority, this first-of-its-kind meeting is open to program directors representing bachelor’s and doctoral level programs as well. The goal is to provide a platform for the types of exchanges that will raise the bar for all analytics programs and ultimately lead to better outcomes for students. Additionally, by supporting this emerging cadre of academics who lead analytics programs, INFORMS hopes that this meeting will help foster new ideas and new energy going forward.

J. David Dittman, director of Business Intelligence & Analytics Services at Procter & Gamble, will deliver the opening plenary at MAPD. The topic: “An Industry Connection: An Industry View of Essential Components of a Graduate Analytics Program.”

The all-day event is free of charge. For more information, contact Bill Griffin, manager, INFORMS Continuing Education Program (bgriffin@informs.org).

Trio of keynote speakers announced

Analytics thought leaders Bill Groves, Andrew Boyd and Jim Diamond will be the keynote speakers at the 2017 INFORMS Conference on Business Analytics and Operations Research. The conference will be held April 2-4 at Caesars Palace in Las Vegas.

Bill Groves, chief data and analytics officer, Honeywell, will discuss the ever-changing industry of analytics in his keynote: “The Changing Landscape of Analytics.”

E. Andrew Boyd, former chief scientist and senior vice president, PROS, Texas A&M University and the University of Houston, will share some analogies between analytics and games of chance to illustrate proven educational methods leading to project success in his keynote: “Analytics: A Winning Gamble.”

Jim Diamond, managing director, Operations Research and Advanced Analytics, American Airlines, will provide insights on how advanced analytics are being used to solve some of American Airlines’ most challenging and complex problems in his keynote: “From Flying Machines to Machine Learning: Advanced Analytics at American Airlines.”


BUSINESS ANALYTICS & O.R.

Conference chair Maher Lahmar:

Welcome to the best show in Vegas!

Following is an interview with Maher Lahmar, solution executive, Watson Customer Engagement at IBM, and chair of the 2017 INFORMS Conference on Business Analytics & Operations Research:

Can you tell us why the 2017 Business Analytics Conference is a must-attend event?

The analytics conference is an annual event that focuses entirely on real-world applications of analytics, presented by industry and university leaders. The conference includes keynote speeches, invited talks, panels, poster sessions, a career fair and an executive forum. This event also hosts the Edelman competition presentations and the Edelman Gala. The size and format of the conference allow attendees to easily take conversations beyond the scheduled sessions, network and advance their careers, whether they are young professionals, executives or academics. This year’s conference will take place in Las Vegas on April 2-4. We want it to be the best show in Vegas. Analytics show, I mean. For more information, visit our website: http://meetings2.informs.org/wordpress/analytics2017/.

As the conference marks its 17th anniversary, how do you see the state of analytics?

Actually, it is impressive how much analytics practice has evolved to become an essential pillar of today’s organizations. Whether it is a well-established company in full transformation, a young firm that is disrupting its industry or a government organization seeking efficiency, analytics and technology are becoming pervasive across all industries. We are in an era where the analytics practice is not solely the job of dedicated teams, but pervasive all across the organization and executive ranks. I believe the days when we had to raise awareness of our discipline and convince executives of the value we can add are behind us. INFORMS and its analytics conference in particular were pioneers in elevating the role of analytics in business and society and have contributed immensely to this achievement. Job well done, but far from complete. Analytics still has a huge untapped potential that we have to continue to promote.

What would you consider the main theme of the analytics conference this year?

That was the first question I asked myself when I was handed the baton. We would definitely want the conference to be an opportunity to celebrate achievement and recognize talent, but we would also like it to be a reminder that our discipline is going through major changes. Analytics practice is shifting gears from insights to action. This does not necessarily mean that insights are no longer needed; it just implies that we see more and more analytics applications trying to go beyond handing in an insight to recommending a decision and taking action. Change comes with challenges but also creates opportunities. We would like our attendees to get a glimpse of those challenges and potential opportunities to take their business further and advance their careers.

Can you tell us more about what triggered that change?

There is no question that the perceived value of analytics is growing, but so are the expectations. Technological innovations have pushed the boundaries of what is possible with analytics. Executives are recognizing the importance of new sources of data and advantages of automation, and decision-makers are demanding that analytics professionals be more engaged in the decision process. As a result, a client who could have been content with a visual dashboard deliverable a decade ago would be less impressed with anything but a decision agent curating external data feeds today. I may be painting a dramatic picture here, but my point is that things are changing, and we would like to make sure that analytics professionals are aware of it and ahead of the curve.

How will the analytics conference help practitioners adapt to these changes?


Obviously, meeting these expectations requires practitioners to expand their knowledge, sharpen their skills and develop new ones. This is reflected in the mix of talks we have planned this year. As you know, over the last few years, we introduced new tracks on Unstructured Data Applications and Big Data. This year we introduced tracks on IoT and Emerging Analytics applications. We also wanted to bring a flavor of Las Vegas to the conference rooms by dedicating a track to Gaming, Entertainment and Sports Analytics. This is in addition to the talks on new techniques and applications in traditional tracks such as Marketing Analytics, Supply Chain Management and Decision Analysis. We also arranged for machine-learning technical sessions such as the one on Deep Learning and scheduled sponsored workshops on the use of analytical tools. We hope to offer our attendees the opportunity to reflect on all the changes in our discipline, and what that means to their careers and our profession as a whole.

There is a surge of interest in machine learning. How does the conference embrace that?

Actually, machine learning has been part of our conference for a long time under different labels and tracks. This event is no exception. For instance, we will hear about how ML techniques can help optimize eCommerce merchandising at Home Depot and learn about the algorithms that drive surge pricing at Uber. We will also get a glimpse into IBM Watson research on how to add personality and emotion dimensions to conversational agents such as Apple’s Siri or Amazon’s Alexa. This is in addition to a variety of other applications that range from IoT to supply chain management and marketing.

Many are still skeptical about the hype surrounding machine learning. What is your perspective on that?

I am among those who believe that machine learning is just at the beginning of the journey. Many of the emerging applications such as driverless cars and conversation assistants are real, but would have been considered science fiction just a decade ago. I think what is feeding that sense of hype is the lack of understanding of the limitations and complexities involved in deploying machine learning algorithms, and in some cases the obsession with the technical aspect over the business ROI. I would like to quote Jean Utke, an Allstate data scientist and a speaker at the conference, who said, “there are no off-the-shelf solutions.” While there are no ready solutions out there, there is a tendency to make algorithms and emerging applications easily available for consumption to the data scientist and developer community. For instance, we will hear firsthand how a start-up called Satalia offers scalable optimization-as-a-service solutions bridging the gap between academia and industry. On a similar topic, we will learn how Deloitte consultants overcome the challenges of scaling NLP implementations in large organizations. We are also dedicating an HBR special panel session where we will raise provocative questions to help the audience distinguish the promise from the current state of machine learning.

The interests of analytical professionals evolve with their careers. What can analytical leaders look forward to at the conference?

Thanks for asking this question. We typically tend to focus on the scientific side of the analytics practice. Many of our attendees hold mid-management and senior leadership positions in their respective organizations and see the conference as an opportunity to exchange experience with peers on how to build and manage analytical teams. In all conversations, the question of how to attract, recruit and retain analytical talent continues to be one of the top challenges they face. This year, we will hear from business leaders about the importance of soft skills in delivering successful analytics, how they managed to build high-performing teams, and what causes of failure to avoid. In addition, we are organizing a panel that brings leaders from a variety of companies and recruiting firms to discuss “How to Grow Analytical Teams.”

Anything else to add?

I want to highlight that many of our committee members are volunteering practitioners who have busy schedules and travel commitments, but are very passionate about the analytics discipline. This meant that the INFORMS staff had to adjust to different schedules and pace of work. For these reasons, I would like to extend my sincere gratitude to the conference committee members and INFORMS team. They are the ones who went the extra mile to make it happen. ❙

ISMS Marketing Science Conference

The 39th Annual ISMS Marketing Science Conference will be held June 7-10 at the University of Southern California in Los Angeles. The ISMS Marketing Science Conference is an annual event that brings together leading marketing scholars, practitioners and policymakers with a shared interest in rigorous scientific research on marketing problems.

Topics include but are not limited to branding, segmentation, consumer choice, competition, strategy, advertising, pricing, product, innovation, distribution, retailing, social media, Internet marketing, global marketing, big data, machine learning, choice models, game theory, structural models and randomized control trials.

Early registration ends March 15.


Ready for Rotterdam:

2017 INFORMS Healthcare Conference

Photo Courtesy of 123rf.com | rudi1976

All over the world, healthcare organizations are being challenged by obstacles related to the aging of the population and other global trends. It is vital that the complex and growing demands in healthcare delivery are tackled swiftly and optimally. With the abundance of data available, operations researchers across the globe can pool resources and share solutions to come up with innovative ideas to remedy shortcomings within current healthcare systems. The 2017 INFORMS Healthcare Conference, being held July 26-28 in Rotterdam, Netherlands, is the ideal forum for O.R. professionals to come together to optimize health service operations and outcomes.

The conference will be led by conference chair Joris Van de Klundert, Institute of Health Policy and Management, Erasmus University, along with program chairs Edwin Romeijn, Georgia Institute of Technology, and Sandra Sulz, Erasmus University Rotterdam. The program is organized into nine tracks of top issues that are impacting the healthcare industry today. These tracks are Disease and Treatment Modeling, Healthcare Data Analytics and Machine Learning, Health and Humanitarian Logistics, Health Information Technology and Management, Health Operations Management, Health Systems in Low and Middle Income Countries, Medical Decision-Making, Personalized Medicine, and Public Health and Policy-Making.

When attendees are not engaged in high-level talks they can take part in some of the other events that round out the program for the conference. The INFORMS Health Applications Society sponsors a student paper competition. Students are asked to submit either oral or poster presentations that will be evaluated by leading healthcare scholars on quality, novelty and importance of methodology, contribution to healthcare research and potential for impact on practice. The finalists must present their work during a special session at the conference. There are also poster sessions during the conference that are not related to the student paper competition. These poster session presentations allow authors to present projects that are in the early stages of development, and thus benefit from the interactive critique, suggestions and encouragement of colleagues working in similar areas.

The conference will take place at the De Doelen International Congress Centre, in the heart of Rotterdam, which is an emerging world leader in the healthcare and medical industry. A group rate is available at the Rotterdam Marriott, which is linked to the Congress Centre and just a two-minute walk away from Rotterdam Central Station. Only a limited number of rooms are booked at the INFORMS group rate, so make your reservations as early as possible. For more information on this conference, including registration, the venue or the program, visit: http://meetings2.informs.org/wordpress/healthcare2017/.



FIVE-MINUTE ANALYST

Voter fraud

No, it’s not about the presidential election. It’s about a model car contest for kids.

BY HARRISON SCHRAMM


This article is a true story about detecting voting fraud in a charitable auction, using no tools save a pencil, paper and smart phone. The setup is as follows: A group of kids entered model cars into a contest in which the cars were voted on by the other contestants. Each category to be voted on was an attribute of the models, such as color, creativity, dangerousness, etc. Each participant was given a strip of 10 tickets and told that they could vote for “one car per category.” When I arrived at the event, I was asked to tally the votes. The organizer, having no idea what he was about to unleash on the problem, assured me that “my judgment was absolute,” with a telling wink that said, “expect foolishness,” but did not elaborate. As I started tallying up the votes by hand (40 participants × 10 tickets each = 400 tickets total), I realized that some of the votes were off: there were way more tickets for some of the cars than there should have been. But how could I adequately prove (to myself) that there was cheating going on?



PILOTS AND STATISTICIANS

Before all this happened, I was a Navy pilot. I took my first round of training flying the T-37 “Tweety Bird” at the 37th Flying Training Squadron, Vance AFB, in Enid, Okla. I had a former fighter pilot instructor who was very fond of saying: “If you don’t know what to do, wind the clock, and by the time you’re finished, something useful will probably come to you.”

T-37 Cockpit: The clock is on the upper left of the center instrument console.

Years later, teaching and then practicing statistics, I had a similar mantra for my students: “If you are working a problem, and you don’t know what to do, you should compute the ‘marginal [1] distributions [2],’ and by the time you are finished, something useful may come to you.”

In this instance, I took my own advice and computed the marginals by hand (see Figure 1; the figures shown here were made by computer, but I assure you it was the same process). Now, because we know that there were 40 participants with 10 tickets each, and each participant could cast one vote per category, we infer that E[C] = 40. Once the marginals {1} are computed, it’s an easy exercise to compute the variance by hand using the definition, Var[C] = E[C²] − E[C]², which can easily be done using our smart phone. We arrive at a standard deviation, s = 9.5. A purely statistical approach would be to be suspicious of any car that received more than +2s.

Figure 1: Votes per category: something strange going on.

This approach is, of course, wrong. Let’s recall the original question, which is to find where participants had pathologically voted for themselves. Because each participant only has 10 tickets, looking for 19 extra tickets (2s = 2 × 9.5) would imply that more than one kid colluded, a very unlikely scenario. We identify “strangest” and “fastest looking” as potential areas for cheating, with “strangest” being the most interesting.

Figure 2 shows a plot of the distribution of votes for “strangest car” (the solid line indicates the average number of votes in this category). Interestingly, while car No. 17 received attention for having the most votes in a single category, it did not have the most votes overall. In fact, car No. 17 did not receive more than an average number of votes. By using the conditional distribution of votes for “strangest,” we see that car No. 17 has an abnormally high number of votes.

Figure 2: Plot of the distribution of votes for “strangest car.”

Figure 3: An abnormally high number of votes for “strange” car No. 17.

Votes for car No. 17: Note the ticket serial numbers.
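For readers who would rather not do this on a phone, the same marginal-then-conditional check can be sketched in a few lines of Python. This is only a sketch under stated assumptions: every tally below is a hypothetical stand-in rather than the contest's actual counts, only the first five category names come from the article, and the helper names `mean_and_sd` and `flag_high` are made up for illustration.

```python
from math import sqrt

PARTICIPANTS = 40
TICKETS_EACH = 10

# Hypothetical votes per category (NOT the contest's real tallies).
# Only the first five category names appear in the article.
category_votes = {
    "color": 36, "creativity": 39, "dangerousness": 34,
    "strangest": 62, "fastest looking": 50,
    "category 6": 35, "category 7": 37, "category 8": 33,
    "category 9": 38, "category 10": 36,
}


def mean_and_sd(counts):
    """Mean and standard deviation via Var[C] = E[C^2] - E[C]^2."""
    n = len(counts)
    mean = sum(counts) / n
    mean_of_squares = sum(c * c for c in counts) / n
    variance = max(mean_of_squares - mean ** 2, 0.0)  # guard against rounding
    return mean, sqrt(variance)


def flag_high(tally, k=2.0):
    """Return the keys whose count exceeds mean + k * sd of the tally."""
    mean, sd = mean_and_sd(list(tally.values()))
    return [key for key, count in tally.items() if count > mean + k * sd]


# One ticket per participant per category gives the expected marginal.
expected_per_category = PARTICIPANTS * TICKETS_EACH / len(category_votes)

# Marginal view: which categories drew far more votes than the rest?
suspect_categories = flag_high(category_votes)

# Conditional view: votes per car within "strangest" (hypothetical again).
strangest_by_car = {car: 1 for car in range(1, 41)}
strangest_by_car.update({5: 2, 30: 2, 17: 21})
suspect_cars = flag_high(strangest_by_car)

print(expected_per_category, suspect_categories, suspect_cars)
```

With these made-up numbers, the +2s rule on the marginals picks out only “strangest” and misses the smaller excess in “fastest looking,” which echoes the point that a purely statistical threshold is too blunt when each participant holds just 10 tickets. The conditional tally within the category is what singles out an individual car.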



Do I think that car No. 17 received an abnormally high number of votes from one source? I’ll let you be the judge.

AN INTERESTING OBSERVATION

You will notice that the number of votes increases by car number. The participants were handed tickets at the entry of the judging line, which is in front of car No. 1, and as they neared the end of the line, found themselves voting for the “later” cars. Interestingly, as the saying goes, a rising tide floats all boats, and being later in the judging did not affect the distribution of prizes.

A final thought: This type of fraud was easy to catch because it was poorly executed. Had the owner of car No. 17 adopted a more moderate strategy, such as “stuffing” the box by only five votes, he might have won and his fraud gone undetected. This type of padding can be detected by statistical methods, but not ones that are likely to be employed by hand on a Sunday afternoon! ❙

Harrison Schramm (Harrison.schramm@gmail.com), CAP, PStat, is a principal operations research analyst at CANA Advisors, LLC, and a member of INFORMS.

INFORMS O.R. & ANALYTICS STUDENT TEAM COMPETITION

Recognizing outstanding solutions to real-world problems developed by undergraduate and master’s student teams.

THANKS TO OUR SPONSORS!

HOST & FOUNDING SPONSOR

Generous funding, plus provision of the problem statement and data.

SOFTWARE SPONSORS

Complimentary access to software for student teams.

Finalist presentations & winners announced at INFORMS Analytics Conference, Las Vegas. First Prize: $7,500.


http://connect.informs.org/oratc/home



THINKING ANALYTICALLY

Any port in a storm

There is significant danger to boats caught out in the open sea during a storm. Ideally, boats will dock before the storm hits and wait it out. The map shows 20 orange boats out at sea. With a storm approaching, each boat needs to be directed to one of three docks. Docks have a limited number of spaces available for boats (indicated by the rectangular spaces). Altogether, there are 20 boat spaces available. The boats are clustered into three areas, and each area varies in distance to the docks (as indicated by the black arrows). All boats must be assigned to one unique space in a dock.

Twenty boats and 20 dock spaces: Which boat goes where?

QUESTION: What is the minimum possible total distance traveled by all boats to the docks?

Send your answer to puzzlor@gmail.com by May 15. The winner, chosen randomly from correct answers, will receive a $25 Amazon Gift Card. Past questions and answers can be found at puzzlor.com. ❙
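One way to frame the puzzle is as a small transportation problem: each cluster supplies boats, each dock has a capacity, and the goal is the cheapest flow from clusters to docks. The sketch below assumes that framing and simply enumerates every feasible split, which is practical at this size. The cluster sizes, dock capacities and distances are invented for illustration (the real values are on the puzzle's map), and the function names `splits` and `min_total_distance` are hypothetical.

```python
def splits(total, caps):
    """Yield every (a, b, c) with a + b + c == total and each part within caps."""
    for a in range(min(total, caps[0]) + 1):
        for b in range(min(total - a, caps[1]) + 1):
            c = total - a - b
            if c <= caps[2]:
                yield (a, b, c)


def min_total_distance(supply, capacity, dist):
    """Brute-force the cheapest plan for three boat clusters and three docks."""
    best_cost, best_plan = None, None
    for row0 in splits(supply[0], capacity):
        caps1 = [capacity[j] - row0[j] for j in range(3)]
        for row1 in splits(supply[1], caps1):
            caps2 = [caps1[j] - row1[j] for j in range(3)]
            for row2 in splits(supply[2], caps2):
                plan = (row0, row1, row2)
                cost = sum(
                    plan[i][j] * dist[i][j]
                    for i in range(3)
                    for j in range(3)
                )
                if best_cost is None or cost < best_cost:
                    best_cost, best_plan = cost, plan
    return best_cost, best_plan


# Hypothetical data: boats per cluster, spaces per dock, and
# dist[i][j] = distance from cluster i to dock j.
supply = (7, 6, 7)
capacity = (8, 5, 7)
dist = ((3, 5, 9), (6, 2, 4), (8, 7, 3))

best_cost, best_plan = min_total_distance(supply, capacity, dist)
print(best_cost, best_plan)
```

Because all boats in a cluster share the same distances, it is enough to decide how many boats each cluster sends to each dock rather than tracking individual boats. For larger instances a linear programming or min-cost-flow solver would be the more usual choice than enumeration.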

BY JOHN TOCZEK


John Toczek earned his BSc. in chemical engineering at Drexel University (1996) and his MSc. in operations research from Virginia Commonwealth University (2005). He is a member of INFORMS.



OPTIMIZATION

GENERAL ALGEBRAIC MODELING SYSTEM

High-Level Modeling
The General Algebraic Modeling System (GAMS) is a high-level modeling system for mathematical programming problems. GAMS is tailored for complex, large-scale modeling applications, and allows you to build large maintainable models that can be adapted quickly to new situations. Models are fully portable from one computer platform to another.

State-of-the-Art Solvers
GAMS incorporates all major commercial and academic state-of-the-art solution technologies for a broad range of problem types.

GAMS Integrated Developer Environment for editing, debugging, solving models, and viewing data.

SmartEnergyHub

How can operators of critical infrastructure optimize their energy management in the context of a rapidly changing energy market? The research project SmartEnergyHub deals with this question on the basis of a smart data platform that combines and analyzes sensor data and forecasts for weather and energy prices.

Prospective clients like airports or municipal utilities can work out an innovative and efficient energy management solution using their existing building infrastructure. They can discover energy saving potentials, predict fluctuations and better compensate them, and act as stability anchor in the grid.

Fichtner IT Consulting operates as joint venture leader for the research project, while Stuttgart Airport is the project’s application partner. The SmartEnergyHub project is financed by the Federal German Ministry for Economic Affairs.

Modeling and optimization for SmartEnergyHub is done in GAMS. The first part of the project integrates sensor data, process- and control systems, forecasts, and an internal optimization. The second part analyzes real-time optimization of energy consortiums. The core optimization system is comprised of LP and MIP models written in GAMS. The project benefits from the variability of GAMS to compare various open source and commercial solvers with respect to specific applications and market segments.

For further information please contact Armin Gauss - Armin.Gauss@fit.fichtner.de

www.gams.com

