#thisisOR
DRIVING IMPROVEMENT WITH OPERATIONAL RESEARCH AND DECISION ANALYTICS
AUTUMN 2019
MODELLING PRODUCES THE OPTIMUM ADVERTISING BUDGET SPLIT FOR MERCEDES-BENZ NEW CARS
Estimated savings were 15-30% of the overall spend
Learn from the experts, share your knowledge and develop your network at these year-round OR Society events: • Annual Conference • Analytics Summit • Computational Modelling • Regional & Special Interest Groups • Simulation Workshop • Specialist Lectures
Operational research (OR) is the science of better decision-making.
Discover all our events and conferences at www.theorsociety.com/events
www.theorsociety.com
@theorsociety
Conferences and events to build your skills and network

PREPARING SHIFT SCHEDULES FOR GERMAN TRAIN CONDUCTORS
Powerful algorithms enable high quality decision support and provide considerable cost reductions

ANALYSIS GIVES BETTER RESULTS – IN SPORT
A wide variety of teams benefit from performance analysis
JOURNAL OF THE OPERATIONAL RESEARCH SOCIETY
JORS is published 12 times a year and is the flagship journal of the Operational Research Society. It is the aim of JORS to present papers which cover the theory, practice, history or methodology of OR. However, since OR is primarily an applied science, it is a major objective of the journal to attract and publish accounts of good, practical case studies. Consequently, papers illustrating applications of OR to real problems are especially welcome.
The journal's scope includes:
• Real applications of OR – forecasting, inventory, investment, location, logistics, maintenance, marketing, packing, purchasing, production, project management, reliability and scheduling
• A wide variety of environments – community OR, education, energy, finance, government, health services, manufacturing industries, mining, sports, and transportation
• Technical approaches – decision support systems, expert systems, heuristics, networks, mathematical programming, multicriteria decision methods, problem structuring methods, queues, and simulation
Editors-in-Chief: Thomas Archibald, University of Edinburgh; Jonathan Crook, University of Edinburgh

THE EUROPEAN JOURNAL OF INFORMATION SYSTEMS (EJIS)
Editor-in-Chief: Dov Te’eni

Explore more today… bit.ly/2ClmiTY
T&F STEM: @tandfSTEM @tandfengineering
EDITORIAL

I’m writing this in the middle of the Rugby World Cup. As an England fan, I’m pleased that we have won the first two matches. Readers will be able to judge whether that continued to be the case. I am also pleased to be able to include in this issue an account of the work of Insight Analysis, who have supported the England team with their sports performance analysis.

The purpose of Impact is to demonstrate the usefulness of Operational Research/analytical approaches by reporting, from a wide variety of organisations, successful applications of these methods. You may, therefore, be surprised to find in this issue a story of failure: not failure of the analysis, but failure to get the (unpalatable?) results accepted. Dennis Sherwood was commissioned in 2013 by Ofqual, the organisation which regulates qualifications, examinations and assessments in England, to investigate the systems within which they operate. His work highlighted the fundamental problem caused by attempting to map fuzzy marks into grades. His subsequent statistical analysis led to an estimate that a quarter of the grades awarded to GCSE, AS and A level students are wrong, a result which was rejected by Ofqual. In 2016, however, Ofqual changed the rules for appeals, making it harder for a candidate to request a re-mark. So perhaps his analytical work was not ignored!

There are several accounts of successful applications of analytical methods in this issue. Of note are two from Germany. Our cover features a Mercedes-Benz car, illustrating the account of Marc Fischer’s work to model the effects of their advertising and produce an optimal advertising budget split for their new cars. The result: estimated savings of around €2m per campaign. The lead article concerns German Rail and tells of Janis Neufeld’s development of an algorithm that gives significant cost reductions and simplifies the planning process for crew shift schedules.

I hope you enjoy reading all the reports of how O.R. and analytics continue to make an impact. Electronic copies of all issues are available at https://issuu.com/orsimpact. For future issues of this free magazine, please subscribe at http://www.getimpactmagazine.co.uk/.
The OR Society is the trading name of the Operational Research Society, which is a registered charity and a company limited by guarantee.
Seymour House, 12 Edward Street, Birmingham, B1 2RX, UK
Tel: +44 (0)121 233 9300, Fax: +44 (0)121 233 0321
Email: email@theorsociety.com
Secretary and General Manager: Gavin Blackett
President: John Hopes
Editor: Graham Rand (g.rand@lancaster.ac.uk)
Print ISSN: 2058-802X | Online ISSN: 2058-8038
www.tandfonline.com/timp
Published by Taylor & Francis, an Informa business. All Taylor and Francis Group journals are printed on paper from renewable sources by accredited partners.
Graham Rand
OPERATIONAL RESEARCH AND DECISION ANALYTICS

Operational Research (O.R.) is the discipline of applying appropriate analytical methods to help those who run organisations make better decisions. It’s a ‘real world’ discipline with a focus on improving the complex systems and processes that underpin everyone’s daily life – O.R. is an improvement science.

For over 70 years, O.R. has focussed on supporting decision making in a wide range of organisations. It is a major contributor to the development of decision analytics, which has come to prominence because of the availability of big data. Work under the O.R. label continues, though some prefer names such as business analysis, decision analysis, analytics or management science. Whatever the name, O.R. analysts seek to work in partnership with managers and decision makers to achieve desirable outcomes that are informed and evidence-based.

As the world has become more complex, problems have become tougher to solve using gut-feel alone, and computers have become increasingly powerful, so O.R. continues to develop new techniques to guide decision-making. The methods used are typically quantitative, tempered with problem structuring methods to resolve problems that have multiple stakeholders and conflicting objectives.

Impact aims to encourage further use of O.R. by demonstrating the value of these techniques in every kind of organisation – large and small, private and public, for-profit and not-for-profit. To find out more about how decision analytics could help your organisation make more informed decisions see www.scienceofbetter.co.uk. O.R. is the ‘science of better’.
Benchmark your expertise

Professional accreditation for your analytics career. Choose the right pathway to edge ahead of the competition.

GET THE RECOGNITION YOU DESERVE

Full details are available at www.theorsociety.com/accreditation

Please note: membership of a professional society (like The OR Society) is universally recognised as a key component of certified professional competence.

Operational research (OR) is the science of better decision-making.

#thisisOR
www.theorsociety.com
@theorsociety
CONTENTS

7
EFFICIENT RAILWAY CREW SCHEDULING IN GERMAN REGIONAL PASSENGER TRANSPORT Janis S. Neufeld explains how an algorithm gave significant cost reductions and simplified the planning process for train crew shift schedules
14
DACHSHUNDS: GETTING BACK TO BASICS Ian Seath demonstrates how statistical analysis has been used to make recommendations to vets and dog owners to reduce the risk of back problems
25
KEEP IT CLEAR Lauren Knight discusses work for the Anglian Water team responsible for keeping sewers clear of blockages, to improve the impact data has on their campaign
28
INSURANCE IN THE IOT ERA Neil Robinson reports Ine Steenmans’ work with Lloyd’s to reimagine insurance for a superconnected society
33
THE AD MAN’S DILEMMA Brian Clegg explains how Marc Fischer’s analytical work enabled Mercedes-Benz to make considerable savings in advertising when launching new car models
39
USING O.R. TO GAIN INSIGHTS TO HELP BRITISH ATHLETES SUCCEED Nick Lade describes the work of Insight Analysis, the sports performance consultancy, which has supported the England team’s preparations for the 2019 Rugby World Cup
43
LIES, DAMNED LIES, AND ... GCSE RESULTS
Dennis Sherwood tells us about what he has called “the perfect crime”, with 1.5m victims each year: the grading of school examinations
4 Seen Elsewhere
Analytics making an impact
11 The Data Series – Solving the data privacy problem using synthetic data
Louise Maynard-Atem discusses synthetic data: what it is, how it’s generated and potential applications
19 Pretty Persuasion: the advantages of data visualisation
Robert Grant demonstrates how visualising models created by analysts, the data that go in, and the predictions that come out, is a powerful tool for effective communication
24 Universities making an impact
Brief report of a postgraduate student project
46 Truthiness or Factfulness?
Geoff Royston challenges us to consider how much our view of the world is based on evidence whilst reflecting on two recent books: Truth or Truthiness, by Howard Wainer, and Factfulness, by the late Hans Rosling
DISCLAIMER The Operational Research Society and our publisher Informa UK Limited, trading as Taylor & Francis Group, make every effort to ensure the accuracy of all the information (the “Content”) contained in our publications. However, the Operational Research Society and our publisher Informa UK Limited, trading as Taylor & Francis Group, our agents, and our licensors make no representations or warranties whatsoever as to the accuracy, completeness, or suitability for any purpose of the Content. Any opinions and views expressed in this publication are the opinions and views of the authors, and are not the views of or endorsed by the Operational Research Society or our publisher Informa UK Limited, trading as Taylor & Francis Group. The accuracy of the Content should not be relied upon and should be independently verified with primary sources of information. The Operational Research Society and our publisher Informa UK Limited, trading as Taylor & Francis Group, shall not be liable for any losses, actions, claims, proceedings, demands, costs, expenses, damages, and other liabilities whatsoever or howsoever caused arising directly or indirectly in connection with, in relation to or arising out of the use of the Content. Terms & Conditions of access and use can be found at http://www.tandfonline.com/page/terms-and-conditions
Reusing Articles in this Magazine
All content is published under a Creative Commons Attribution-NonCommercial-NoDerivatives License which permits noncommercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited, and is not altered, transformed, or built upon in any way.
SEEN ELSEWHERE

ARTIFICIAL TASTE AND SMELL
Much work is being done to develop artificial intelligences that can taste and smell with more sophistication than mere humans. An artificial tongue developed at the University of Glasgow could be used to detect poisons or counterfeit whiskies. The Scottish researchers’ paper, “Whisky tasting using a bimetallic nanoplasmonic tongue” (Nanoscale, 2019, 11, 15216-15223), suggests that their reusable artificial tongue – a tiny glass wafer composed of millions of tiny artificial metallic taste buds about 500 times smaller than human ones – could be used to detect fake whiskies. That would be a boon when the high-end single malt market is reportedly awash in counterfeits. In New Zealand, a wine dealer has partnered with the artificial intelligence company Spacetime to create an AI sommelier more discriminating, and ideally less snooty, than real ones to assist customers in matching wines to their palates and their food. The US food business McCormick & Company and IBM are trying to develop an AI application that uses data to learn about users’ taste preferences and produce new, appealing flavour combinations for them. As far as smell is concerned, Cyrano Sciences (now where did that name come from?) is commercialising an electronic nose invented at the California Institute of Technology, not just to enhance human smell, but also possibly to replace sniffer dogs in the detection of drugs and bombs. Whilst there may be benefits for counterfeit detectors and sniffer dogs, will the culture of customised shopping – “You liked that? You’ll love this” –
be extended, as machine learning extrapolates our future tastes from our past ones? There’s a thought to savour – or not!
TO BE OR NOT TO BE
Researchers say they have developed an algorithm that predicts with 85% accuracy whether an actor is yet to have their most productive year, or whether they have already peaked. While actors and agents alike might be keen to get their hands on such a crystal ball, the researchers say there is little mystery: an actor’s best year is usually preceded by a steady rise in the amount of work they are getting. Oliver Williams from Queen Mary University of London said: “If I were to give a piece of advice based on my findings, I would say just do more jobs and you’ll get more jobs.”
ANALYTICS CAN GIVE FINANCIAL MARKETING SUCCESS

“In a competitive industry disrupted by rapidly evolving consumer behaviour, financial services institutions are under immense pressure to maintain market share. On the marketing front, this means focusing on the consumer with a well-informed, customized omnichannel strategy. Despite the challenges, analytics plays a central role in optimizing the delivery of a positive consumer experience in real time across a wide variety of channels. Financial institutions that invest in the very best data analytics platform that delivers processing power, real-time insights, unified measurement, person-level data and brand insights will produce exceptional consumer experiences that drive success. The digital consumer demands this level of experience, and the future of banking depends on meeting this challenge.” Andy Cheong, product marketing director at Marketing Evolution. See https://doi.org/10.1287/LYTX.2019.03.03 for the full article in Analytics.

MAKING A SPLASH IN BATH

The City of Bath (or “Bath Spa” as the railways call it) was a very popular health resort in Roman and Georgian times. Recently, a large area of the shopping centre was demolished and rebuilt (in Georgian style). By collecting and collating data on the movement of people, their credit card transactions and their use of the public transport system, Simon Babes from Movement Strategies and Allison Herbert from Bath Business Improvement District (BID) have been able to move ahead with a major business improvement and regeneration project which has
delivered tangible benefit to 700 stakeholders. What they are delivering in Bath is adding value to the work the council does. The rich data sets that Movement Strategies generates are used to produce a web-based dashboard for BID which provides a pictorial visualisation of the impacts the regeneration strategy is achieving.
IMMIGRATION CONTROL ALGORITHMS
According to research published in an article for the Financial Times, algorithms used to determine whether people migrating to the UK can stay in the country may be biased. Concerns about an over-dependency on the algorithms used have been raised by independent bodies and lawyers who believe the ‘streaming’ processes utilised in visa processing could disadvantage certain groups of people. The government’s Science and Technology Committee (see bit.ly/VisaBias) has also told the government that it needs to be transparent in its means, so members of the public can be made aware of its impact.
PREDICTIVE POLICING
At least 14 police forces in the UK are either using algorithm programs for policing, have previously done so or have conducted research and trials into them. West Midlands police are at the forefront, leading on a £4.5m project funded by the Home Office called National Data Analytics Solution (NDAS), the long-term aim of which is to analyse data from force databases, social services, the NHS and schools to calculate where officers can be most effectively used. An initial trial combined data on crimes, custody, gangs and criminal records to identify 200 offenders “who were getting others into a life on the wrong side of the law.” However, a report by West Midlands police’s ethics committee raised concerns about the project. It says that no privacy impact assessments had been made available, and there was almost no analysis of how it impacted rights. The new tool will use data such as that linked to stop and search, and the ethics committee noted this would also include information on people who were stopped with nothing found, which could give rise to police bias.
SKIPPING APPOINTMENTS
At University College Hospital in London an algorithm has been developed to predict which patients are most likely to miss appointments. Using records from 22,000 appointments for MRI scans allowed identification of 90% of those patients who would turn out to be no-shows. The machine intelligence is not perfect – it also incorrectly flags about half of the patients who do attend as being at risk of not showing. But as patients missing their appointments cost the NHS £1bn last year, even an imprecise indication of which patients will attend could save hospitals vast sums of money and help cut waiting times.
HOW AIRBNB IMPACTS TRADITIONAL HOTELS

New research published in Marketing Science (https://doi.org/10.1287/mksc.2018.1143) suggests that in some cases, the presence of Airbnb can help attract more demand in certain markets while challenging traditional hotel pricing strategies. The study, “Competitive Dynamics in the Sharing Economy: An Analysis in the Context of Airbnb and Hotels,” is authored by Hui Li and Kannan Srinivasan of Carnegie Mellon University. “Our analysis gleaned a number of insights,” Li says. “In the end, we arrived at four conclusions. First, Airbnb cannibalizes hotel sales, especially for lower-end hotels. Second, Airbnb can help stabilize or even increase demand during peak travel seasons, offsetting the potential for higher hotel prices, which can sometimes be a deterrent. Third, the flexible lodging capacity created by Airbnb may disrupt traditional pricing strategies in some markets, actually helping to minimize the need for seasonal pricing. And finally, as Airbnb targets business travellers, higher-end hotels are most likely to be affected.”
O.R. TACKLING THE MISUSE OF OXYTOCIN

Pakistan has one of the highest maternal mortality rates in South Asia, and the highest new-born mortality rate in the world, with one in 22 babies dying during the first month of life. LuxOR, the Operational Research unit of Médecins Sans Frontières (MSF), conducted a study in Timergara, Pakistan, which assessed how unregulated use of oxytocin was linked to maternal and neonatal health complications. The study provided strong evidence of associated health risks to women and has led to the establishment of a ‘modular’ training course on the correct use of oxytocin, piloted from December 2018. More at bit.ly/PakistanMSF.

LOOKING INTO POTHOLES

To meet the needs of local communities, maintain safety, support the changing requirements of businesses, and optimise transport to and from locations, local highway authorities have a duty to keep public highways in good repair and to fix hazards or defects when they occur. Repair of such defects is an essential item in all councils’ budgets, because local highway networks are, without doubt, the most valuable publicly owned assets. Research on Modelling the Deterioration Processes of Pavement Systems, funded by ESRC and undertaken by Dr Shaomin Wu, University of Kent, in collaboration with a local authority, focussed on highway infrastructure to produce algorithms that could track and present trends in asset deterioration, as well as analyse and understand the impact and interdependency that one highway asset has on another. See http://www.blgdataresearch.org/tag/pavement/.

GETTING STARTED WITH ANALYTICS

There are many buzzwords such as analytics, data science, and machine learning. Understandably, leaders of organizations want to utilize these techniques to anticipate business needs, recognizing that this capability eventually leads to optimized business operations, improved products and services, and better risk management. An ebooklet, “How Organizations Can Get Started with Analytics”, for leaders who are interested in deploying new analytics capabilities or enhancing existing analytics capabilities in their organization, is available from INFORMS, the US sister organisation to the OR Society, at bit.ly/BuildORteams.

WAITING ON

We’ve all been there… you need the bill in a restaurant and a waiter refuses to catch your eye. New research, conducted by Tom Fangyun Tan of the Cox Business School at Southern Methodist University and Serguei Netessine of the Wharton School at the University of Pennsylvania, shows that restaurants should introduce tabletop technology to improve service and satisfaction. Tabletop technology allows customers to view menu items, re-order beverages, pay for the meal, play games and browse news content. The technology is meant to assist waiters, not replace them. See https://dx.doi.org/10.2139/ssrn.3037012.

OPTIMISING TRAFFIC SIGNALS

Researchers at the University of Technology Sydney and DATA61 have developed a genetic algorithm for optimising the timing of traffic signals in urban environments under severe traffic conditions. Traffic control signals are the most widespread tools to control and manage road traffic in densely populated urban environments. The method uses phase durations as decision variables and the total travel time in the network as the objective function. The authors claim an improvement of over 40%. Their paper can be found at arxiv.org/abs/1906.05356.
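To give a flavour of the approach, here is a toy genetic algorithm in Python that evolves the durations of four signal phases to minimise a stand-in travel-time function. The network model, demand figures and GA parameters are all invented for illustration and are not taken from the paper, where evaluations would come from a proper traffic model.

```python
import random

random.seed(0)

# Relative traffic demand per signal phase (invented numbers).
DEMAND = [0.9, 0.4, 0.7, 0.2]

def travel_time(phases):
    # Stand-in objective: penalise phases that are short relative to
    # their demand, plus a small cost for overall cycle length.
    return sum(d / max(p, 1) for d, p in zip(DEMAND, phases)) * 600 \
           + 0.5 * sum(phases)

def mutate(phases):
    # Nudge each green time by up to 8 s, clamped to 5-90 s.
    return [min(90, max(5, p + random.randint(-8, 8))) for p in phases]

def crossover(a, b):
    cut = random.randrange(1, len(a))
    return a[:cut] + b[cut:]

# Population of candidate timing plans (green seconds per phase).
pop = [[random.randint(5, 90) for _ in range(4)] for _ in range(30)]
for _ in range(200):                      # generations
    pop.sort(key=travel_time)
    survivors = pop[:10]                  # elitist selection
    pop = survivors + [mutate(crossover(random.choice(survivors),
                                        random.choice(survivors)))
                       for _ in range(20)]

best = min(pop, key=travel_time)
print(best, round(travel_time(best), 1))
```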
EFFICIENT RAILWAY CREW SCHEDULING IN GERMAN REGIONAL PASSENGER TRANSPORT
JANIS S. NEUFELD
SINCE THE GERMAN RAILWAY REFORM IN 1996, federal states and transport associations have been responsible for regional rail passenger transport. Usually, transport associations determine lines and timetables as well as all requirements for running a transportation network. Based on this, they invite tenders for operating services, to which railway companies respond with bids as operators. In order to be
able to make a realistic, cost-effective and thus promising offer, a detailed plan for deploying material and personnel is necessary. Therefore, the railway companies have to plan vehicles, local services and personnel. Each of these planning problems is very complex and difficult to solve, so that it is hardly possible to generate efficient schedules without appropriate decision support systems.
Since personnel costs make up a substantial part of the overall operating costs of a local transport network, the preparation of shift schedules for train drivers and conductors is of particular importance. First of all, anonymous shifts are created, i.e. sequences of tasks which must meet the requirements of labour laws and collective agreements but are not yet assigned to a specific person. In addition to staffing trains, the shifts also include preparation and closing services, breaks, guest journeys and necessary foot journeys, for example for a change of trains. In a later planning step, these shifts are then assigned to specific employees (personnel rostering).
CHALLENGES
As a conductor is able to change trains at many stations, the number of possible shifts increases exponentially with the size of the planned network and can easily amount to several hundred million per day (!) for a medium-sized network. Finding a (near) optimal schedule therefore represents a major challenge,
especially as the usual planning horizon for tactical planning is at least one (standardized) week. This week can then be rolled out for a longer planning period. At the same time, the creation of efficient shifts can have a considerable influence on the employees’ productivity and thus on the resulting costs.

Another major complexity driver is the high number of requirements that have to be met by the generated schedule. These derive from the German Working Hours Act, collective agreements and operational demands. Among these are regulations on breaks, working hours and shift length. In addition, preparatory train services, such as times for viewing the working documents at the beginning of the shift, final services at the end of a covered trip, or times for accounting for ticket sales, have to be considered. Transitions between two subsequent trips must be organised in such a way that a changeover is possible. Furthermore, requirements that take the entire schedule into account lead to a significant increase in the complexity of crew scheduling.
For example, these can be a limited personnel capacity of crew bases (the only places where shifts can start or end), a given percentage of full-time and part-time employees, or bounds on the average paid working time.

While each train must always be assigned to a train driver, shift planning for conductors often has a specific characteristic: so-called attendance rates. These are often required in the tender documents by the transport association to reduce costs, and imply that only a specified percentage of train kilometres must be attended by a conductor. In order to avoid penalisation by the transport association, it is necessary for the operating railway company to fulfil these specified rates. Attendance rates can vary, for example, depending on the type of train, the line, the time of day (e.g. from 7 p.m. more personnel may be required to ensure safety) or even within a train (e.g. a change of rate at the border of a federal state). Frequently, several attendance rates have to be considered in a regional transport network.

However, attendance rates are usually not integrated in conventional IT systems for railway planning. Therefore, especially for more complex rates (e.g. those depending on the time of day), planning a transport network can quickly require several days of manual effort. This task often has to be repeated several times throughout the year, e.g. due to construction sites or changes of the timetable. Because of the complexity of the problem, in practice shift plans from previous periods are often used as a basis and adapted manually by the planner. However, the lack of IT support leads to a time-consuming ongoing check of compliance with the attendance rates, and the efficiency
of the generated schedules is often questionable.

One way of dealing with this planning problem without IT support is to create separate shifts for trips with different rates. In this case, with the support of conventional systems, all trips can first be assigned to shifts (i.e. with a rate of 100%). Afterwards, depending on the respective attendance rate, only a part of the generated shifts is included in the final shift plan (e.g. with a rate of 25%, only a quarter of the generated shifts is actually used). However, this procedure is difficult to implement for some regulations (for example, in the case of attendance rates that are dependent on the time of day). In addition, synergy effects from joint planning of trips with different rates cannot be realised.

Since there were no available approaches in research or practice to tackle this task automatically, the research project Sina was established by the German railway company DB Regio AG and TU Dresden. A new algorithm had to be developed, making use of several Operational Research methods. Finally, it was integrated in a new software-based decision support system.
SOLUTION METHOD

In order to solve complex planning problems, the use of Operational Research is often essential. In the past, railway crew scheduling problems could already be solved successfully, but only without taking attendance rates into account. Therefore, it was necessary to extend and adapt existing approaches.

For the railway crew scheduling problem, a solution is created in two phases. In the first phase, a large number of feasible shifts are generated, which must meet the requirements mentioned above. Theoretically, all possible shifts could be generated. However, due to the high number of possible changes in regional transport networks, even for smaller practical problems the number of shifts exceeds the capacity of a planner. It is, therefore, important to limit the number of shifts without restricting the solution space and wasting optimization potential. Column generation is a modern mathematical solution method of Operational Research and has already been successfully used to solve highly complex problems. By generating only promising shifts, it keeps the number of considered shifts low and speeds up the solution process significantly.

In the second phase, a set of shifts is selected from all generated shifts. This schedule has to cover all trips that have to be attended, meet attendance rates and, at the same time, should minimize the number of necessary conductors (i.e. costs). This second step can be modelled mathematically as a so-called Set Covering Problem and solved with the help of a mathematical solver (e.g. IBM ILOG CPLEX or Gurobi).
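To make the second phase concrete, the following is a minimal sketch of a set-covering-style model with a single attendance-rate constraint, written in Python with the open-source PuLP modeller. The trips, shifts, costs and the 50% rate are invented for illustration; this is not the Sina formulation, which handles far richer rules and uses column generation to supply the candidate shifts.

```python
import pulp

# Toy data: four trips and four candidate shifts (all invented).
km = {"t1": 30, "t2": 45, "t3": 20, "t4": 55}           # train-km per trip
shifts = {
    "s1": {"trips": {"t1", "t2"}, "cost": 480},
    "s2": {"trips": {"t2", "t3"}, "cost": 430},
    "s3": {"trips": {"t3", "t4"}, "cost": 510},
    "s4": {"trips": {"t1", "t4"}, "cost": 520},
}
rate = 0.5    # required share of train-km attended by a conductor

model = pulp.LpProblem("crew_scheduling", pulp.LpMinimize)
x = pulp.LpVariable.dicts("use", shifts, cat="Binary")  # shift selected?
y = pulp.LpVariable.dicts("att", km, cat="Binary")      # trip attended?

# Objective: total cost of the selected shifts.
model += pulp.lpSum(shifts[s]["cost"] * x[s] for s in shifts)

# A trip only counts as attended if a selected shift covers it.
for t in km:
    model += y[t] <= pulp.lpSum(
        x[s] for s in shifts if t in shifts[s]["trips"])

# Attendance rate: attended train-km must reach the required share.
model += pulp.lpSum(km[t] * y[t] for t in km) >= rate * sum(km.values())

model.solve(pulp.PULP_CBC_CMD(msg=False))
print([s for s in shifts if x[s].value() > 0.5])        # e.g. ['s1']
```

Even in this toy version, the attendance rate changes the structure of the model: the solver may legitimately leave some trips unattended, which a plain set-covering formulation cannot express.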
APPLICATION

The column generation algorithm has been implemented in a new software solution, Sina, to provide decision support. It is designed as a client-server architecture so that necessary input data can be prepared locally by the individual planner, while a powerful calculation server solves the underlying mathematical problem. The results can then be downloaded and further processed by each planner. Figure 1 shows the user interface with a schematic shift schedule generated by Sina.

FIGURE 1 USER INTERFACE OF SINA AND A GENERATED SHIFT SCHEDULE

The results for three real regional transportation networks illustrate the applicability and performance of the solution algorithm. Table 1 shows some properties and the results for these examples. It should be noted that the number of train stations per network counts only train stations where a change of trains is possible.
              TRIPS      TRAIN     CONVENTIONAL  SINA        SINA      COST
              PER WEEK   STATIONS  OBJECTIVE     OBJECTIVE   TIME [S]  REDUCTION
Network I     4918       13        8,170,890     4,944,530    955      −50.5%
Network II    3767       11        4,795,500     2,896,590   1568      −39.6%
Network III   4099       16        4,940,625     3,777,000   2079      −23.5%

TABLE 1 RESULTS FOR THREE REAL REGIONAL TRANSPORTATION NETWORKS
A trip is defined as each part of a train journey that runs between two of these stations. Due to their size and structure, the presented networks can be regarded as typical networks for regional rail passenger transport. In the generated schedules all essential requirements are met, such as attendance rates, labour regulations and technical requirements.

In order to assess the solution quality of Sina, its shift schedules were compared to the approach where trips with different attendance rates are planned separately and subsequently aggregated to form a feasible schedule. This approach corresponds to planning with conventional planning systems (as presented above), which cannot take attendance rates into account directly.
The objective function values are composed of fixed costs per shift and a cost rate per working minute of each conductor. Therefore, they represent the actual personnel costs very well. It is evident that considerable reductions of personnel costs can be achieved by using the integrated planning approach of Sina. For the networks presented, the improvement lies between 23.5% and over 50%. It should be noted, however,
that in practice the shift schedules of the conventional approach could possibly be improved by manual post-processing by an experienced planner. Nevertheless, it is clear that the high complexity of the planning problem cannot be adequately overseen by a human planner. Taking into account all the shift schedules of DB Regio AG that have been considered during the project, the improvements of the objective function ranged between 2% and 38% as a result of Sina. Depending on network size and structure, the computation times of Sina are between 15 minutes (Network I) and 35 minutes (Network III). Since the railway crew scheduling is performed on a tactical level, these times are totally acceptable and enable automated shift planning even for larger local transport networks in practice. This is especially the case as the required computation time is nearly exclusively necessary on the calculation server. Furthermore, it is possible to test several variants of shift schedules in parallel to check the influence of changes in the conditions by sensitivity analyses. This includes, for example, strategic issues such as the opening of new crew bases or considerations as to whether tickets should be sold by ticket vending machines or an increased number of personnel in the trains. The results show the potential for improvement that can be achieved through algorithm-based, integrated planning using modern methods of Operational Research.
CONCLUSION
The Sina project showed that the integration of attendance rates in regional passenger transport requires new solution methods for an efficient deployment of conductors. The developed algorithm proved to be suitable for use in practice. It resulted in significant cost reductions and enabled a considerable simplification of the planning process. This is another example of both the necessity and the potential of modern methods of Operational Research. Summing up, the project manager at DB Regio, Volker Jacobsen, stated: ‘The powerful algorithms and, in particular, the consideration of the numerous requirements that are essential for practical application, enable high-quality decision support. This leads to a considerable reduction of our planning effort and considerable cost reductions. Sina is a great benefit for our planning processes.’
Janis S. Neufeld is leader of the research group Operations Management at Technische Universität Dresden, where he received his PhD in 2016. He has worked on several Operational Research projects with partners from industry.
THE DATA SERIES – SOLVING THE DATA PRIVACY PROBLEM USING SYNTHETIC DATA
LOUISE MAYNARD-ATEM
WE ARE ALL BECOMING INCREASINGLY AWARE OF THE VALUE OF OUR DATA, and the desire to share it without the concept of a value exchange is dwindling. A true and widely accepted model for the value exchange has yet to be developed, and as a result organisations’ limited ability to share data is slowing down data innovation. From an organisation perspective, regulations like GDPR and an increased desire for privacy among consumers are driving this cautionary approach when it comes to data. As a result, these organisations are keen to embrace technological advances that mean they can share data and derive insights whilst maintaining compliance with the demands of both consumers and regulators. There are a number of ways in which this problem is being approached, but the one that I want to discuss in this penultimate article of the data series is synthetic data. As synthetic data is anonymous and exempt from data protection regulations, this opens up a whole range of opportunities for otherwise locked-up data, resulting in faster innovation, less risk and lower costs. This article covers what it is, how it’s generated and the potential applications.

WHAT ARE SYNTHETIC DATA AND HOW ARE THEY GENERATED?

Synthetic datasets are any production data applicable to a given situation that are not obtained through direct measurement, and are generated to meet specific needs or conditions. This is very useful when either privacy needs limit the availability or usage of the data, or when the data needed for a test environment do not exist.

There are three main types of synthetic data:
• Fully synthetic data – contains no original data.
• Partially synthetic data – only selected sensitive values are replaced with synthetic data.
• Hybrid synthetic data – generated using both synthetic and original data.

There are two primary methods used to generate synthetic data (a toy sketch of the first method follows below):
• Drawing numbers from a distribution – The principle is to observe real-world statistical distributions from the original data and reproduce artificial data according to these distributions. This can also include the creation of generative models.
• Agent based modelling (see Figure 1) – The principle is to create a model that explains the observed behaviour, then reproduce random data using this model. It is generally agreed that observed features of a complex system can often emerge from a set of simple rules.

Historically, synthetic data generation has mostly been developed within academia; however, there are now a number of commercial organisations bringing these capabilities to market (see Hazy – www.hazy.com and Mostly – www.mostly.ai as two great examples).
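As a toy illustration of the first method, the sketch below fits a simple joint distribution to an invented ‘original’ dataset and samples a synthetic replacement. The variables, parameters and the multivariate-normal assumption are all choices made for this example, not a recommendation of a particular generator.

```python
import numpy as np

rng = np.random.default_rng(42)

# Invented "original" dataset: age and income for 1,000 people,
# with income loosely dependent on age.
age = rng.normal(45, 12, 1000)
income = np.exp(9 + 0.02 * age + rng.normal(0, 0.3, 1000))

# Fit a joint distribution (multivariate normal on age and log-income),
# then sample brand-new synthetic records from it.
original = np.column_stack([age, np.log(income)])
mean = original.mean(axis=0)
cov = np.cov(original, rowvar=False)

sample = rng.multivariate_normal(mean, cov, size=1000)
synthetic_age, synthetic_income = sample[:, 0], np.exp(sample[:, 1])

# No synthetic row corresponds to a real person, but aggregate
# statistics are preserved to a good approximation.
print(age.mean(), synthetic_age.mean())
print(np.corrcoef(age, income)[0, 1],
      np.corrcoef(synthetic_age, synthetic_income)[0, 1])
```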
FIGURE 1 SIMPLIFIED REPRESENTATION OF HOW SYNTHETIC DATA IS CREATED
FIGURE 2 THE ODI DATA SPECTRUM (SHOWING HOW DIFFERENT LEVELS OF ANONYMISATION CAN MAKE CLOSED DATA, MEDICAL RECORDS, SHARED AND/OR OPEN DATA) © The Open Data Institute
SYNTHETIC DATA VS ANONYMISATION: WHAT’S THE DIFFERENCE?
To understand the need for anonymisation (see Figure 2) and synthetic data, we must first establish three definitions:

Personal data – defined by GDPR as any information relating to an identified or identifiable natural person (‘data subject’); an identifiable natural person is one who can be identified, either directly or indirectly, in particular by reference to an identifier such as a name, an identification number, location data, an online identifier, or to one or more factors specific to the physical, physiological, genetic, mental, economic, cultural or social identity of that natural person.

Private information – defined by the Office for National Statistics as:
• Relating to an identifiable legal or natural person;
• Not in the public domain or common knowledge;
• Information that, if disclosed, would cause the subject damage, harm or distress.

Sensitive information – defined as a sensitive asset that, if compromised, can cause serious harm to an organisation.

For data to be shared between organisations it must be modified; for sensitive data in general this is done
by redaction, and for personal data this is done by anonymisation. Anonymisation is a process that alters a dataset to reduce the risk of re-identification as much as possible. Modern anonymisation techniques tend to fall into three categories (a small worked example follows below):
• Suppression – removing identifiers or pieces of information that may lead to re-identification;
• Generalisation – aggregating data points into a coarser granularity, or otherwise removing details to obfuscate data about people on an individual basis;
• Disruption – adding noise and changing values to the extent that it is increasingly difficult to know how, or whether, information about specific individuals can be recovered or inferred.

Synthetic data methodology can best be described as a subset of anonymisation created by an automated process such that it holds similar statistical patterns to the original dataset. Each individual record may have no relation to reality but, when viewed in aggregate, the dataset is still useful for certain analyses and for testing software. If done correctly, synthetic datasets can contain no personal data, which eliminates the risk of re-identification.
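Here is a minimal sketch of the first two techniques, suppression and generalisation, applied with pandas to an entirely fictional table; the right choice of identifiers and granularity is context-dependent in practice.

```python
import pandas as pd

# An invented personal dataset; all names and values are fictional.
df = pd.DataFrame({
    "name": ["Ada", "Ben", "Cara"],
    "age": [34, 57, 41],
    "postcode": ["B1 2RX", "CV4 7AL", "LA1 4YX"],
    "diagnosis": ["asthma", "diabetes", "asthma"],
})

anonymised = (
    df.drop(columns=["name"])                  # suppression: drop identifier
      .assign(
          age=(df["age"] // 10) * 10,          # generalisation: 10-year bands
          postcode=df["postcode"].str.split().str[0],  # coarser geography
      )
)
print(anonymised)
```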
POTENTIAL USE CASES
The applications and potential use cases for synthetic data are vast, including analytics, machine learning, and data modelling. Fraud detection and healthcare are described in more detail here as prominent areas where considerable work is taking place, but these are just two examples. The important thing to remember is that being able to share granular, statistically representative synthetic data with external organisations (including start-ups and researchers) can lead to truly disruptive innovations in a much shorter time frame.
Financial services and fraud detection

Standard anonymisation is often insufficient for financial transaction data; it has been demonstrated that 80% of credit card owners could be re-identified by only three transactions, even when just the merchant and date of the transaction are revealed. Synthetic data maintains the majority of the valuable information and statistical integrity of the original data but eliminates the risk of re-identification.

The ability to generate synthetic fraud data is of great benefit to the financial services sector because it allows for collaboration amongst organisations. This collaboration, which is inherent between fraudulent actors, can give organisations a valuable speed advantage which allows them to close fraudulent loopholes more quickly. Patterns of fraudulent behaviour can not only be recreated in synthetic data, but the signal can also be amplified, allowing machine learning models to be trained more quickly and comprehensively. The PaySim mobile money simulator (https://www.kaggle.com/ntnu-testimon/paysim1) is an example of using aggregate data from a private dataset to generate a synthetic equivalent that resembles the normal operation of transactions. Malicious behaviour was injected into the synthetic dataset to evaluate the performance of fraud detection models.
Healthcare

There has been considerable interest in using synthetic data within the healthcare sector to help researchers answer important questions about diseases, and to improve the efficiency and effectiveness of services in the sector. The Simulacrum is a synthetic database that imitates some of the cancer data held securely by the National Cancer Registration and Analysis Service (NCRAS). Access to this data can give analysts and researchers the ability to answer questions that will broaden their understanding of cancer, whilst protecting the confidentiality of the patient. Because this data is entirely synthetic, it removes the need for many of the essential controls that are required to access the real data held by Public Health England (PHE). Although the data is synthetic, the Simulacrum maintains most of the properties of the original data with a high degree of accuracy. It is important to remember that the more complex the data query, the more approximate the results, which is a limitation of using synthetic data. However, because the data model (but not the data) is the same as the real model in the Cancer Analysis System in PHE, researchers can use the Simulacrum to plan and test their hypotheses before making a formal request to PHE to analyse the real data. This database was developed by Health Data Insight in partnership with IQVIA and AstraZeneca. The Open Data Institute are also currently looking to release a synthetic dataset of A&E admissions.

THE FUTURE OF SYNTHETIC DATA

Sample-free synthetic population generation – It is possible to generate synthetic populations in all geographies as long as there is widely available public data. Using sample-free synthetic reconstruction methods gives us the capability to generate synthetic populations without any sample data to use as a basis. Research has shown that the sample-free method of synthetic reconstruction can give globally better results than sample-based methods. However, it should be noted that although the sample-free method is much less data demanding, the data require much more pre-processing before use. This method of population generation is currently being used in academia and there is definite commercial opportunity that could take advantage of this technology.
Louise Maynard-Atem is an innovation specialist in the Data Exchange team at Experian. She is the co-founder of the Corporate Innovation Forum and is an advocate for STEM activities, volunteering with the STEMettes and The Access Project.
DACHSHUNDS: GETTING BACK TO BASICS
IAN SEATH
DACHSHUNDS ARE ONE OF THE UK’S MOST POPULAR PEDIGREE DOG BREEDS and, although many people might think they are lap dogs that don’t take much exercise due to their short legs, they actually originate
from a very active working breed. Their short legs are a result of a genetic mutation that is thought to have occurred around 4000 years ago, but this mutation also results in premature ageing of the disc material in their
spines, making them more prone to back disease than other breeds. As a breed, they are around 10 times more likely to suffer a back problem than the average breed of dog and about 1 in 4 dachshunds will suffer some degree of back problems during their life. Back disease is more correctly described as Intervertebral Disc Disease (IVDD) or Herniation (IVDH). Given this background, the Dachshund Breed Council (DBC), a Kennel Club body that represents the UK’s breed clubs, has been proactive in assessing the health of the breed and conducting research to help improve health. I am Chairman of the DBC and have been able to apply my O.R. skills to our work on breed health improvement. In 2015, with the help of researchers at the Royal Veterinary College (RVC), we designed and analysed a breed survey to identify lifestyle factors that were associated with the risk of back disease.
NOVEL FINDINGS
The design of the study included factors such as body proportions (length:height, weight), diet, exercise and neuter status. The survey ran for three months and gathered just over 2000 responses. Initial analysis looked at individual variables to calculate Odds Ratios (e.g. the ratio of longer-bodied dogs with IVDD to those without IVDD). While most people would assume that longer dogs would be more prone to IVDD, the evidence did not support that theory. Dietary factors also were not associated with IVDD risk.

Several surprising findings emerged:
• Dachshunds that had been neutered had an increased risk of IVDD, and those neutered under 12 months were at a higher risk;
• Those that lived with more than one other dachshund had a lower risk.

Additionally, and unsurprisingly, those that had more exercise and were more physically active had a lower risk. I carried out this initial analysis and then the dataset was further analysed by the team at the RVC. They were able to carry out more sophisticated multivariate analyses to look at the interactions between several factors and IVDD prevalence. The whole study was published in the peer-reviewed journal Canine Genetics and Epidemiology in 2016.
FROM ANALYSIS TO IMPACT
The Breed Council had been active in promoting the importance of dachshunds having an active lifestyle to keep them fit and healthy, but the 2015 study provided further evidence that could be used to help reduce the risk of IVDD. Around half of all dogs in the UK are neutered and many vets recommend this is carried out between 6 and 12 months. Neutering is carried out for a range of reasons, including population control and to prevent conditions such as mammary tumours in bitches. Recent studies have shown that neutering can also have adverse effects, and it is increasingly apparent that the evidence for benefits and disbenefits needs to be considered on a breed-by-breed basis, not at a species level. The findings therefore had the potential to make a significant impact on the breed’s health if they could be translated into different decisions being made by owners and vets.
So began a two-pronged communications strategy targeted at owners and vets. At the end of 2016, the DBC launched a website (www. dachshund-ivdd.uk) and a Facebook Group focussed on providing IVDD advice and guidance to owners. They also began work to raise awareness of the association between neutering and IVDD among veterinary surgeons.
COLLABORATION AND MORE ANALYSIS!
Veterinary surgeon Dr Marianne Dorn joined the DBC team and, aided by her husband (an actuary), went to work on a more detailed analysis of the neutering data to provide an up-to-date review of the evidence on IVDD in Dachshunds, linked to the risks associated with neutering. This study set out to investigate the possible relationship between neuter status and risk of IVDH in dachshunds. The aims of the study were twofold: to investigate (a) whether neuter status is associated with increased risk of IVDH in either male or female dachshunds, and (b) whether neutering before 12 months old, as opposed to after 12 months old, is associated with increased risk of IVDH.

Our hypothesis was that neutered dachshunds, and especially those neutered before 12 months old, would have a higher incidence of IVDH. The first task was to clean the data into a set of cases and controls. Of the original 2031 survey responses, 1964 ended up in the sample for analysis. The cases set were dogs that had been diagnosed as having IVDH by a veterinary surgeon. We used Power Analysis to determine the sample sizes required to be able to detect an effect and to avoid false positives and false negatives. It was calculated that at least 100 animals were required per group, so this number was easily achieved from the cases and controls.

For dachshunds that were ≥ 3 years and < 10 years old at the time of the survey (1073 individuals), incidence of IVDH was compared between early-neutered (<12 months), late-neutered (>12 months) and entire animals of each gender. Figure 1 shows the histogram of age at IVDH diagnosis for 313 dachshunds in the survey; most dogs are first affected between the ages of 4 and 6.

The original analysis (2015) had reported prevalence data and Odds Ratios. For this follow-up, incidence rates were estimated to evaluate the rate of IVDH onset using dog-years at risk (DYAR): the number of cases of IVDH was divided by DYAR for the age range 36–120 months to calculate mean incidence per DYAR.
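For readers who want to see the arithmetic, the sketch below computes an odds ratio from a 2×2 table and an incidence rate per DYAR in Python. All counts and observation times are invented for illustration; they are not the DachsLife figures.

```python
import numpy as np

# Illustrative counts only -- not the published DachsLife data.
# 2x2 table: neutered vs entire, IVDH cases vs non-cases.
a, b = 60, 240    # neutered: cases, non-cases
c, d = 40, 360    # entire:   cases, non-cases
odds_ratio = (a / b) / (c / d)
print(f"odds ratio: {odds_ratio:.2f}")        # 2.25 with these counts

# Incidence per dog-year at risk (DYAR): each dog contributes the months
# it was observed between 36 and 120 months of age, censored at diagnosis.
months_at_risk = np.array([84, 60, 84, 30, 84])   # five hypothetical dogs
cases = 2
dyar = months_at_risk.sum() / 12
print(f"incidence: {cases / dyar:.3f} cases per dog-year at risk")
```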
RESULTS AND CLINICAL SIGNIFICANCE
Neutered females had nearly twice the risk of IVDH compared with entire females. For neutered males, incidence of IVDH was also slightly increased, but this was not statistically significant. For both genders, there was a significantly increased risk of IVDH in dachshunds neutered before 12 months old as compared with those neutered after 12 months: for females the risk was 2.1 times and for males it was 1.5 times.

The study concluded: ‘Results from this retrospective study suggest that gonadectomy, especially if performed before 12 months old, increases risk of IVDH in this breed. Decisions regarding neutering should be made on an individual basis, taking a range of pros and cons into account. Considering the high prevalence, morbidity and mortality of IVDH in dachshunds, increased IVDH risk associated with neutering is a key factor to consider in deciding whether and when to neuter’.
FIGURE 1 AGE AT FIRST OCCURRENCE OF IVDH
COMMUNICATIONS CAMPAIGN
The paper was published in the open access journal Canine Genetics and Epidemiology in November 2018. It was the most highly accessed paper in that journal (nearly 18,000 accesses) in the first quarter of 2019. In addition to this paper, Marianne Dorn wrote an article on IVDD for first opinion vets which was published in In Practice magazine. Veterinary surgeons are one target audience for the information in this study; the other key group is dachshund owners. The DBC has publicised the study widely via social media and created an infographic (Figure 2) to communicate the findings visually. The aim is to provide the most up-to-date information on IVDD and influence owners’ decisions about neutering, with a view to reducing prevalence of the condition. A further breed survey was conducted in 2018 and replicated the 2015 findings showing the
association between neutering and IVDD. A quick calculation showed that the overall breed prevalence of IVDD would have been reduced from 25% to 17% if the neutered dogs had been left entire. The DBC will conduct its next breed survey in 2021 and expects that the impact of their research and communication will translate into a reduction in IVDD prevalence.
Ian Seath, is a member of the OR Society and Chairman of the Dachshund Breed Council, a not-for profit organisation working for the benefit of Dachshunds and their owners. He is an independent consultant with more than 25 years’ experience of working with private, public and third sector clients. His work includes strategy development, process improvement and project management. He and his wife have owned dachshunds since 1980 and, this year, one of their dogs was a class winner at the world-famous Crufts dog show.
FIGURE 2 DACHSHUND BREED COUNCIL INFOGRAPHIC
FOR FURTHER READING
Packer, R.M.A., I.J. Seath, D.G. O’Neill, S. De Decker and H.A. Volk (2016). DachsLife 2015: An investigation of lifestyle associations with the risk of intervertebral disc disease in Dachshunds. Canine Genetics and Epidemiology 3: 8.

Dorn, M. and I.J. Seath (2018). Neuter status as a risk factor for canine intervertebral disc herniation (IVDH) in dachshunds: A retrospective cohort study. Canine Genetics and Epidemiology 5: 11.
PICK YOUR COURSES
BUILD YOUR SKILLS
Top up your talent with the latest training in analytics, OR & data science
WIDEN YOUR NETWORK
LEARN NEW TECHNIQUES
Gain new skills and techniques with our year-round training programme, whatever your career stage
#thisisOR
Operational research (OR) is the science of better decision-making.
To find out more visit: www.theorsociety.com/training
www.theorsociety.com
@theorsociety
PRETTY PERSUASION: THE ADVANTAGES OF DATA VISUALISATION ROBERT GRANT
DECISION-MAKERS NEED TO UNDERSTAND CAUSE AND EFFECT, and predict the future. Others analyse and provide the information that helps good decisions to be made. Either way, they will use mathematical models of the world, based on some previously collected data. If the calculations are sound, but the communication poor, there may be misunderstandings, or the decision-maker, finding it confusing and onerous, will simply revert to their prejudices.
Visualising those models created by analysts, the data that go in, and the predictions that come out, is one of the most powerful tools for effective communication. Not only do visual representations of the findings capture attention and interest, they are also more quickly absorbed than a dry table of numbers, and, if done correctly, the message they carry will be remembered and shared.
For the analyst too, visualisations can help them see when their model is providing accurate representations of the world and acceptably accurate predictions, and when it is failing. Knowing this allows them to revisit and refine the model, and get better results. There may be undetected problems with the input data, which have to be accounted for in the model. A model that works well in most situations but fails in some specific circumstances suggests that more complexity needs to be added. To find these model or data problems, we may have to scan millions of observations over thousands of variables. Unusual patterns might not be obvious from summary statistics or from graphs of one variable at a time. The only efficient way to scan the data on this large scale is to do so visually. Figure 1 shows election data from 12 national general elections, where each dot represents a polling station, and higher densities of polling stations are shown as colours moving towards red. Suggested irregularities are circled in red, where certain stations reported 100% of the electorate voting and 100% voting for the winner, unlike the majority of polling stations. In the Ugandan and the two Russian elections, there is a group of stations at, or close to, this perfect turnout and winner-support.
FIGURE 1 ELECTION DATA FROM 12 NATIONAL GENERAL ELECTIONS
From ‘Statistical Detection of Systematic Election Irregularities’ by Peter Klimek, Yuri Yegorov, Rudolf Hanel and Stefan Thurner. Reproduced under open access licence from the Proceedings of the National Academy of Sciences.
Notably, there is then a gap between this and the bulk of the country. The authors of the paper identified this as implausible. Given the huge number of polling stations involved in each part of this figure, it would have been almost impossible to detect peculiar patterns by examining a table of data. Our brains are excellent tools, fine-tuned over millions of years, to spot things that don’t fit in with their surroundings, so we should use them!
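This kind of visual scan is quick to sketch in code. Below is a minimal illustration with simulated polling-station data; the values and the injected 100%/100% cluster are invented, not the dataset from the paper:

```python
# Hedged sketch: scan simulated polling-station results for the kind of
# 100%-turnout/100%-winner cluster described above. Illustrative data only.
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(42)
turnout = np.clip(rng.normal(0.65, 0.12, 20000), 0, 1)
winner = np.clip(rng.normal(0.55, 0.15, 20000), 0, 1)
turnout[:200] = 1.0  # injected irregularity, for illustration
winner[:200] = 1.0

fig, ax = plt.subplots(figsize=(6, 6))
# hexagonal binning shows density where thousands of dots would overplot
hb = ax.hexbin(turnout, winner, gridsize=60, cmap="inferno", mincnt=1)
ax.set_xlabel("Turnout")
ax.set_ylabel("Share voting for winner")
fig.colorbar(hb, ax=ax, label="Polling stations per bin")

# flag the implausible corner for closer inspection
suspect = (turnout > 0.99) & (winner > 0.99)
print(f"{suspect.sum()} stations report ~100% turnout and ~100% winner support")
plt.show()
```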
There are some generic tips that I think are crucial for innovative, effective visualisation:
• produce lots of sketches and user-test them;
• use metaphor, grounded in context (time spent in traffic jams is more impactful than a rate ratio, for example);
• use annotation, but sparingly;
• use natural frequencies rather than percentages and rates wherever possible;
• show uncertainty in statistical estimates;
• keep familiar ‘landmark’ variables, such as location and time, to the fore.

INTERACTIVE VISUALISATION
One of the most significant developments in data visualisation has been interactive content, delivered in the web browser. This is typically constructed with some programming in JavaScript, the language that gives instructions to web browsers such as Chrome, Safari or Firefox. However, recently there has been a welcome addition of packages that translate output from general-purpose analytical software such as R, Python or Stata into JavaScript. Tableau, Google Sheets and Mapbox are also excellent starting points for generating basic interactive output, and all are free in their basic versions.
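As a flavour of how little hand-written JavaScript this can involve, here is a minimal sketch using the open-source plotly package for Python, one family of tools in the spirit of those mentioned above; the data frame and column names are invented:

```python
# Hedged sketch: turn a small analytical result into an interactive web
# page without writing JavaScript by hand. Data and names are invented.
import pandas as pd
import plotly.express as px

df = pd.DataFrame({
    "region": ["North", "South", "East", "West"],
    "spend": [120, 95, 140, 80],
    "outcome": [3.1, 2.4, 3.8, 2.0],
})

# Hovering over each point pops up the underlying values in the browser.
fig = px.scatter(df, x="spend", y="outcome", hover_name="region",
                 title="Outcome vs spend by region")
fig.write_html("report.html")  # a self-contained page; open in any browser
```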
It is possible to use interactive graphics to great effect for understanding the analysis as well as communicating the results. It is even possible to let the user re-run analyses and simulations under their own what-if scenarios through the browser. Two choices here are to run the analysis on a server (which keeps data confidential but requires some investment in infrastructure and maintenance) or on the user’s computer (‘client-side’). The latter can provide a smoother and more flexible experience, but relies on fast internet connections and up-to-date browsers, and of course hands over at least some of the data to the user. The interaction via the web browser can take many forms:
• Pop up more information when the mouse pointer hovers over a region;
• Click on a region and more information appears in an adjacent part of the screen;
• Click to have the visualisation show only data from that region;
• Click and drag to zoom in on the selected rectangle;
• Scroll down to change the images and text to the next layer of detail, or step in the story;
• Move sliders and click on tick boxes to control some aspects of the visualisation;
• Toggle between showing or hiding some aspect;
• Animate the content unless the user pauses it and moves back manually;
• Move through layers/steps as arrow buttons are pressed;
• Scroll around and zoom in on a map or a virtual globe.
This has brought a new mindset to data analysts: the concept of giving the audience not just a static report, but a
tool they can use. It could be conceived of as a graphical user interface (GUI), webapp or application programming interface (API). Although it is seen in the web browser, it need not be stored remotely and delivered via the internet, an important consideration where the data being processed are sensitive or confidential. A team will need programming and design skills in addition to mathematical modelling, computing or statistics if they are to deliver this kind of output. Some ability to program in JavaScript will be essential, although packages like D3 and Leaflet save the developer a lot of time reinventing the wheel. Even more than with static images, user-testing is essential. Analysts are often so familiar with the data, models and conclusions that it becomes very hard to say what is understandable to the audience; the only way to really find out is to convene a small group of people who can give you honest feedback. This is one of the most important resources that an organisation or team can have if they intend to communicate quantitative findings.
INTERPRETABILITY AND EMERGING TECHNIQUES
‘Interpretability’ has become a hot topic in the world of machine learning
in recent years, following a concern that methods such as neural networks can give reasonably good results but without human users understanding why that is the case – the problem of the so-called black box. Visualisation is clearly a powerful tool when seeking interpretation. Not only is there a legal obligation through GDPR to be able to explain to a court how a decision was arrived at, but interpretable predictions are more likely to influence decision-makers, and can be more robust to hacking. Two very promising new ways of thinking have emerged from efforts to boost interpretability. Firstly, adversarial approaches perturb part of the input data and monitor the impact this has on conclusions. When a large impact is observed, we can conclude that the model places considerable emphasis on those values. Figure 2 is a depiction of a complex model (a convolutional neural network) detecting the presence of a flute with 99.73% certainty in the left image. The computer is tasked with finding regions that, when blurred, alter the prediction. The centre image shows a good match, reducing flute certainty to 0.07%, and this ‘mask’ is shown in the right image. This assures us humans that the model is indeed responding to flutes and not some other artefact that happens to appear
FIGURE 2 DETECTING THE PRESENCE OF A FLUTE
From the open access pre-print paper ‘Interpretable Explanations of Black Boxes by Meaningful Perturbation’ by Ruth Fong and Andrea Vedaldi (arxiv.org/abs/1704.03296).
in the flute photographs it was trained on (for an example of such artefacts causing problems, read John Zech’s article ‘What are Radiological Deep Learning Models Actually Learning?’ at https://medium.com/@jrzech/what-are-radiological-deep-learning-models-actually-learning-f97a546c5b98). A more open-ended approach is that of the generative adversarial network (GAN), where one model generates data, which is in turn classified by another. The second, classifying, model has already been trained on real-life data. The first, generative, model is programmed to focus on configurations of data that produce incorrect classifications from the classifying model. In this way, the theory goes, we can find combinations of data that our classifying model gets wrong, and from this, we can learn how to make it more robust, as well as how to interpret what it is really detecting and responding to. GANs are also responsible for the majority of eye-catching images where pairs of neural networks, by trying to trick each other, make uncanny images that look like dogs but not quite, or convert holiday snaps into the style of Van Gogh. It’s hard to imagine making sense of this if the data were simply massive arrays of numbers. By visualising them, we humans can keep up with the enormously complicated models at work.
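The perturbation idea can be sketched as a simple occlusion loop: grey out each patch of the input, re-score it, and record how much the confidence drops. This is a toy sketch with a hypothetical stand-in for the classifier, not the convolutional network or the optimised masks from the paper:

```python
# Toy occlusion-style perturbation: the confidence drop when a patch is
# greyed out indicates how much the model relies on that region.
# 'predict_flute' is a hypothetical stand-in, not a trained CNN.
import numpy as np

def predict_flute(image: np.ndarray) -> float:
    """Stand-in model: pretend confidence depends on the centre pixels."""
    h, w = image.shape
    return float(image[h // 3: 2 * h // 3, w // 3: 2 * w // 3].mean())

def occlusion_map(image: np.ndarray, patch: int = 8) -> np.ndarray:
    """Importance map: confidence drop when each patch is blurred out."""
    base = predict_flute(image)
    h, w = image.shape
    heat = np.zeros((h // patch, w // patch))
    for i in range(0, h, patch):
        for j in range(0, w, patch):
            perturbed = image.copy()
            perturbed[i:i + patch, j:j + patch] = image.mean()
            heat[i // patch, j // patch] = base - predict_flute(perturbed)
    return heat  # large values mark regions the model relies on

print(occlusion_map(np.random.rand(64, 64)).round(2))
```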
Secondly, researchers have had some success in communicating how these complex methods work by providing simple examples with an interactive front-end in the web browser (e.g., playground.tensorflow.org). The user can tweak the parameters of the model and watch it get trained on the data without any need for technical training, simply by moving sliders, ticking boxes and watching the visualisations evolve. This seems to be the realisation of a long-held ambition of O.R. and statistics people alike: to create engaging computer-based visual content that helps students learn about our more obscure techniques. Figure 3 shows a screenshot of the TensorFlow Playground, an interactive website demonstrating how simple neural networks can learn very complex and challenging patterns in data.
BIG DATA
When datasets become very large, visualisation is challenging because the page or screen will become one amorphous blob of colour. Making lines or points semi-transparent can help us to see areas of highest density,
but patterns will be impossible to discern when there are too many lines or points. One useful framework for approaching this problem is called “bin-summarise-smooth”, proposed by Hadley Wickham of RStudio (https://vita.had.co.nz/papers/bigvis.pdf). Split the space in the plot into bins (a rectangular or hexagonal grid is most common), allocate observations to bins and accumulate a relevant statistic in each bin (counts and means are easy to program in this way, while medians and other quantiles turn out to be demanding). If desired, lumpy patterns can then be smoothed out using any acceptable method. Figure 4 shows all 14 million yellow taxi journeys in New York in 2013. Because of the distinctive outline of
the land, no background map is really necessary. This is a 1000-by-1000 pixel image, so there are one million bins, most of which are empty. The shade of each pixel is determined by the number of journeys that start there, according to GPS data, so the summary statistic is a simple count. There is no smoothing, allowing us to see the individual major roads. A similar approach, called windowing, applies when data are arriving too quickly to be displayed in any comprehensive way. Statistics are accumulated for consecutive time intervals or windows. It is helpful to have more than one of these series of windows, with edges offset so they partly overlap. This gives some scope for smoothing or zooming in and out after the data reduction has happened.
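Both ideas fit in a few lines. The sketch below uses simulated points rather than the taxi data, and a simple moving average as the optional smoothing step:

```python
# Hedged sketch of 'bin-summarise-smooth' plus toy windowing for streams.
# Simulated data; not the taxi pipeline itself.
import numpy as np
import matplotlib.pyplot as plt
from scipy.ndimage import uniform_filter

rng = np.random.default_rng(0)
x, y = rng.normal(size=(2, 1_000_000))  # stand-in for millions of GPS points

# Bin into a 1000-by-1000 grid (as in the taxi figure); summarise: counts.
counts, _, _ = np.histogram2d(x, y, bins=1000)

# Smooth (optional): a small moving average over the binned counts.
smooth = uniform_filter(counts, size=3)

# Log-scaled shading keeps sparse outer bins visible next to dense ones.
plt.imshow(np.log1p(smooth).T, origin="lower", cmap="magma")
plt.axis("off")
plt.show()

# Windowing: accumulate a statistic over two offset series of windows, so
# there is scope for smoothing or re-zooming after the data reduction.
def window_counts(timestamps, width, offset=0.0):
    bins = np.arange(offset, timestamps.max() + width, width)
    return np.histogram(timestamps, bins=bins)[0]

t = rng.uniform(0, 1000, size=100_000)               # stand-in event times
series_a = window_counts(t, width=10.0)              # edges at 0, 10, 20, ...
series_b = window_counts(t, width=10.0, offset=5.0)  # offset by half a window
```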
FIGURE 3 A SCREENSHOT OF THE TENSORFLOW PLAYGROUND
By Daniel Smilkov, Shan Carter, Martin Wattenberg and Fernanda Viégas at Google, reproduced under the Apache 2.0 licence.
SHOWING UNCERTAINTY
One of the questions I am asked most often as a trainer is how best to show not just the best prediction from a model, but also the uncertainty around that. This is something that statisticians like me deal with all the time – it’s perhaps our only trick – but we rarely think about how to show it, beyond putting those ubiquitous error bars on the charts. The trouble with adding more lines and shaded regions is that it clutters the chart quickly. Again, sketching and user-testing is vital to find what makes sense to the audience. Whenever we use simulations and agent-based models to model the real world, we have a large number of potential outcomes, which we might then average or summarise in some way to get a bottom line. Those individual simulations are also a valuable resource for visualisation, because simply by showing them together, the uncertainty in the outputs is made evident. A lot of effort has been made in recent years with weather forecasting, especially for dangerous events like hurricanes. Multiple predictions can appear all together, perhaps with semi-transparent lines or points. This gives a simple interpretation: one of these lines will be the one the hurricane takes, but we don’t know which. Alternatively, they
FIGURE 4 14 MILLION YELLOW TAXI JOURNEYS IN NEW YORK IN 2013
The starting locations are counted into a 1000-by-1000 grid and shown as shading for each pixel.
can be summarised by a boundary at a given level of probability (determined by methods like quantiles or convex hulls). The interpretation here is that there is only an x% chance of being hit by the hurricane if you live outside the boundary. Finally, we could also have a series of boundaries: effectively, contours in a probability surface for the prediction. Figure 5 shows three approaches to showing uncertainty around a trajectory (perhaps a hurricane path): multiple draws from the model, best estimate with a region of fixed probability, and multiple contours without a best estimate.
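All three displays can be built from the same set of simulation draws. A minimal sketch, with random walks standing in for ensemble output from a real model:

```python
# Hedged sketch: three ways of showing uncertainty from simulation draws.
# Random-walk trajectories stand in for real model/ensemble output.
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(1)
steps = np.arange(50)
draws = np.cumsum(rng.normal(0.1, 0.5, size=(200, 50)), axis=1)

fig, ax = plt.subplots()

# (1) spaghetti: every draw, semi-transparent ('one of these will happen')
ax.plot(steps, draws.T, color="steelblue", alpha=0.05, lw=1)

# (2) best estimate with an 80% band from pointwise quantiles
lo, mid, hi = np.percentile(draws, [10, 50, 90], axis=0)
ax.fill_between(steps, lo, hi, color="orange", alpha=0.3, label="80% band")
ax.plot(steps, mid, color="black", lw=2, label="best estimate")

# (3) a series of nested bands: contours in a probability surface
for p in (20, 35):
    inner_lo, inner_hi = np.percentile(draws, [50 - p, 50 + p], axis=0)
    ax.fill_between(steps, inner_lo, inner_hi, color="orange", alpha=0.15)

ax.legend()
plt.show()
```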
© Images by the author, Copyright CRC Press
FIGURE 5 THREE APPROACHES TO SHOWING UNCERTAINTY AROUND A TRAJECTORY
CONCLUSION
Working with data visualisation requires a peculiar mixture of analytical skills and design thinking. It is increasingly in demand among public and private sector employers. The open-ended nature of the challenge, with no right or wrong answer, makes it stimulating and a source of constant learning. It is also helping us to tackle some of the most resistant problems in modelling and prediction: poor data quality, black box models, communication to a non-technical audience and assisting decision-makers.

Robert Grant of Bayes Camp Ltd, UK (robert@bayescamp.com) is a freelance trainer and coach for people working in data analysis, Bayesian models and data visualisation. His background is in medical statistics and he is the author of ‘Data Visualization: charts, maps and interactive graphics’, published by CRC Press and the American Statistical Association.
UNIVERSITIES MAKING AN IMPACT

EACH YEAR STUDENTS on MSc programmes in analytical subjects at several UK universities spend their last few months undertaking a project, often for an organisation. These projects can make a significant impact. This issue features a report of a project recently carried out at one of our universities: Cardiff University. If you are interested in availing yourself of such an opportunity, please contact the Operational Research Society at email@theorsociety.com

FORECASTING THE COST OF BODILY INJURY CLAIMS USING HOUSEHOLD DATA: AN APPLICATION OF XGBOOST MACHINE LEARNING ALGORITHMS (Emma McCarthy, Cardiff University, MSc Operational Research and Applied Statistics)
Evaluating a driver’s riskiness and probability of making a claim is essential for a motor insurance company: it allows an appropriate price to be offered to the driver, reflective of their risk to the company. Statistical models have been widely applied in the insurance industry, but the accuracy of claim prediction remains a big challenge due to limited modelling complexity. With improving computing power and big data analytics, insurers need to stay on top of what data is available in the market. Emma, a student in Cardiff’s School of Mathematics, undertook her project for Admiral Insurance to explore machine learning techniques to predict clients’ potential claims in car insurance based on household-level data. Admiral receives millions of claims: the occurrences of such claims are highly unpredictable under the existing risk modelling framework. The unpredictability of huge claims can lead to unexpected losses for the company. Motivated by the need to improve the claims and pricing models, Emma aimed at building bodily injury
claim prediction models with high-dimensional household data. The project highlighted the use of state-of-the-art machine learning techniques, allowing great flexibility to create gradient boosting models which naturally handled missing values and nonlinear relationships. Bayesian optimisation was applied to tune the hyperparameters so that computational complexity and overfitting were kept under control. Overfitting was further guarded against by using cross-validation to test the accuracy of the model’s evaluation metrics. The final model performed well on the dataset and created useful predictions. In terms of applications and industrial implementation, the model results were imported into a pricing model provided by Admiral, which is used to determine the price of a premium. This was justifiable because the model generated promising results and highlighted customer groups with similar trends across some key features, such as average claims, cost per vehicle year, and loss ratios.
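As a flavour of the approach, here is a minimal, hedged sketch: a gradient boosting classifier trained on synthetic data with missing values and scored by cross-validation. The household features, tuning ranges and the Bayesian optimisation loop from the project are not reproduced:

```python
# Hedged sketch in the spirit of the project: gradient boosting that
# tolerates missing values, with cross-validation to guard against
# overfitting. Synthetic data; hyperparameters are illustrative only.
import numpy as np
from sklearn.model_selection import cross_val_score
from xgboost import XGBClassifier

rng = np.random.default_rng(7)
X = rng.normal(size=(5000, 20))
X[rng.random(X.shape) < 0.1] = np.nan       # XGBoost handles NaNs natively
y = (X[:, 0] > 0.5).astype(int)
missing = np.isnan(X[:, 0])
y[missing] = rng.integers(0, 2, missing.sum())  # random labels where unseen

model = XGBClassifier(n_estimators=300, max_depth=4, learning_rate=0.05,
                      subsample=0.8, eval_metric="logloss")
auc = cross_val_score(model, X, y, cv=5, scoring="roc_auc")
print(f"cross-validated AUC: {auc.mean():.3f} (+/- {auc.std():.3f})")
```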
The project proposed an innovative way for Admiral to price car insurance premiums by introducing the use of machine learning techniques to investigate alternative rating factors. Rhodri Charles, Admiral’s Head of UK Motor Pricing & Analytics was impressed with Emma’s results: “Dealing with so many variables can be challenging, but Emma was able to apply a variety of techniques to isolate useful variables and build predictive models to help us estimate the propensity of each postcode to make different claim types, as well as how large these claims are likely to be when they do occur. On a broader note, Emma’s work demonstrated that the dataset was valuable to Admiral. It has generated interest in the data and has spurred further work in this area. It’s important to note that Emma’s work also serves as a comparison of the different techniques in data science. It has helped us be better informed about how different methods work under different scenarios, and will help us to work more efficiently going forward.”
KEEP IT CLEAR LAUREN KNIGHT
ANGLIAN WATER IS THE LARGEST WATER AND WASTEWATER COMPANY in England and Wales by geographic area. The volume of waste we receive daily is enough to fill over 300 Olympic-sized swimming pools, travelling through 76,624 km of sewers, so it is important to keep our sewers clear of blockages. Most blockages are caused by fats, oil and grease (FOG) or non-flushable items, so to combat this, in 2010 we launched our Keep it Clear campaign, featuring the fatberg monster. Keep it Clear is our pioneering behavioural change programme focusing on key areas with high numbers of blockages across our region. We survey households, attend community events, work with schools
and local food premises, with the aim of changing behaviours around what ends up in our sewers. Since the programme started in 2010 there has been an average reduction in blockages of 58% across 24 targeted areas.
THE CHALLENGE
Across our business we have a large amount of data and we use Excel to cleanse, manipulate and analyse data sets. This is fine if you are the person who built the spreadsheet and understands the logic, but most of the time it’s someone else trying to make a decision or drive an action from the insight produced. Using solutions such as Excel is a very manual process, with analysts spending time maintaining
these processes rather than doing new value-adding activities. It can also take a long time to process the data, as Excel isn’t built for the large amounts of data we are trying to process daily. The Keep it Clear team challenged us to do something different: create
something which they could use to visualise, understand and share with their stakeholders, to improve the impact data has on their campaign.
OUR COLLABORATIVE APPROACH
Often when you give a problem to a data analytics team they will take it, spend two weeks trying to solve it and then give you a result which is probably close to what you asked for but not quite right. At Anglian Water we are doing work with our Analytics Community to support analysts in working with their stakeholders to fully understand the problem and communicate progress. This improves the rate of work completed right first time and keeps the stakeholders engaged throughout the process. This is the approach we used when building our Operations Dashboard solution and Survey123, both technologies provided by ESRI, the global market leader in Geographic Information Systems; we had not used either before.
FIGURE 1 THE OPERATIONS DASHBOARD
We worked with the Keep it Clear team each step along the way: firstly understanding what insight helps them drive decisions, then building the dashboard or survey, and reviewing progress regularly. This approach helped the Keep it Clear team achieve the end result they wanted, as well as giving them an understanding of the tools, how they have been configured and therefore how to use them. The collaborative approach between the teams was a success and is something we will look to recreate on future projects.
OPERATIONS DASHBOARD SOLUTION
Figure 1 is an example of the Operations Dashboard solution. At first glance it is easy to identify whether blockages are more or less than last month with the indicator in the top right-hand corner being either green or red. The dashboard is interactive, with the map being the main feature. You can search the areas or
different sewer types and interact with the dashboard graphs, which in turn changes the view on the map. This tool allows the users to work with the data in a visual manner and understand the data.
SURVEY123 SOLUTION
Figure 2 is an example of Survey123, another ESRI product, which has allowed us to digitise our survey solution. Previously our employees had to attend with a laptop, using SurveyMonkey and Google Maps, with paper as a backup in case they didn’t have signal. Then, once they got back to the office, they would update their spreadsheet. The Survey123 solution allows the user to complete the whole task using one
app and has offline functionality. You can log in to the app when you are back at the office to see the results and download them in an accessible format. We survey people multiple times to understand whether their behaviours change following the campaign being in their area. Rachel Dyson, Programme Manager, Anglian Water Services Limited, said that “Survey123, which can easily be downloaded on to mobile phones, doesn’t require Wi-fi connectivity when out and about undertaking door knock interventions in hotspot blockage streets, meaning that behavioural and attitudinal survey responses can still be recorded on mobile phones and then downloaded direct to Excel when back in full Wi-fi connectivity. Previously the team would have to rely on pen and paper as a backup and carrying bulky I-pads around”.
BUSINESS VALUE
The Keep it Clear programme has achieved reduced call-outs, reduced blockages and therefore reduced disruption to customers in the areas it has been targeting. The Operations Dashboard solution has enabled engagement with local stakeholders and supports informed decision-making within the Keep it Clear team.
FIGURE 2 SURVEY 123
The solutions have also benefited the data analysts – increasing the amount of time they have to complete value-adding activities rather than maintaining current processes and cleansing data.
Rachel Dyson sums it up: “Survey123 and the KIC dashboard has greatly improved the efficiency of the Keep it Clear team, as well as better targeting and analysis of hot spot blockage areas. Using the dashboard to visually engage local stakeholders and influencers on the local issues helps their understanding of underlying causes, which in turn helps these influencers communicate the local problems to residents and engage residents in helping prevent blockages, flooding and pollutions caused by fats, oils and grease (FOG) and unflushables such as wipes”. Over the next five years we will be looking to target new areas across our region to keep driving down blockages. We will also look for new opportunities to use our data analytics skills to improve processes across our business. With a background in IT, Lauren Knight has worked in Strategy and Commercial, focusing on strategy for Operational Technology for Anglian Water. When responsible for the GIS and Performance Data team, Lauren brought added insight to the organisation through the power of analytics. In her new role as Digital Partner, with her experience in both IT and data, she is supporting the business in its drive to become more digitally focussed.
INSURANCE IN THE IOT ERA NEIL ROBINSON
LIKE ANY DISRUPTIVE INNOVATION, the Internet of Things brings opportunities and risks alike. These are set to have a significant impact on individuals, industries and society as a whole. The insurance sector was only recently said to have adopted a “wait and see” attitude towards the potential challenges, but O.R. suggests that there is no more time to waste. In 2006, in an operation codenamed Olympic Games, US intelligence experts created a computer virus that infiltrated and infected the systems controlling the centrifuges at Iran’s Natanz industrial plant. This
cyber-attack – to use a term that was seldom heard at the time but has since entered everyday parlance – successfully derailed the Iranian regime’s alleged efforts to enrich uranium. Iran eventually realised what was happening only when IT firms captured and reverse-engineered the virus after a programming error allowed it to replicate elsewhere on the internet. Olympic Games has since been described as the cyber-warfare equivalent of the dropping of the first atomic bomb. According to various high-level sources, it was only a minor component of a far bigger operation,
Nitro Zeus, which, had it been carried out, would have devastated Iran’s air defences, power grid, communications and other vital infrastructure.
What does any of this have to do with Operational Research? The story is relevant because such discreet, yet ultimately spectacular, disruption has been made possible by the ever-increasing connectivity of internet-linked, data-driven devices – and because O.R. can help to explain the myriad implications of this global, all-encompassing trend. Levels of interconnectedness were still in their comparative infancy when Olympic Games was devised. Not so now. Today, with “smart” sensors becoming omnipresent, we have the Internet of Things (IoT) – a near-boundless, immensely intricate system of appliances and objects able to communicate and interact with each other. It has been estimated that the IoT already consists of around 25 billion devices, many of them – including refrigerators, lightbulbs and other fixtures and fittings in “smart homes” – not necessarily immediately associated with the concept of “big data.” The figure is expected to reach 125 billion by 2030, by which time the transformation may have become truly ubiquitous. It therefore seems reasonable to infer that the IoT’s continued growth will affect almost
every aspect of our lives, whether professional or personal. The challenges for insurers are particularly notable. Thirteen years on from Olympic Games and Nitro Zeus, in an age when even a household fridge could fall victim to a cyber-attack, how might the sector adapt to the novel opportunities and – perhaps more importantly – the unfamiliar risks that the IoT era is likely to bring? This is where O.R. enters the picture in earnest.
USING THE FUTURE TO INFORM THE PRESENT
Dr Ine Steenmans, Lecturer in Futures, Analysis and Policy at UCL, recently worked with one of the industry’s foremost specialists, Lloyd’s, to reimagine insurance for a super-connected society. The collaboration – which also involved PETRAS, an IoT research consortium comprising 11 UK universities – resulted in a major report entitled Networked World: Risks and Opportunities in the Internet of Things. The report clearly divides the insurance implications of the IoT into opportunities and risks. The former include improved data collection and management and the scope, thanks to innovations such as real-time monitoring, to enhance risk assessment and product flexibility. The latter include the emergence of new kinds of “harm” – for example, data loss, exploitation and information asymmetries – and their prospective effects, not least the threat of a single security breach spiralling into systemic failure. As the authors observe: “The difficulties in securing very simple but connected devices like sensors can lead to a vastly expanded cyber-attack vector, with many weakened access
points to critical systems.” This, they add, may be especially worrying when more and more data-driven decisions are likely to entail no direct human intervention whatsoever. It was with such concerns in mind that Steenmans was asked to “unpack different risk trajectories” by using techniques from the field of technology foresight, which the United Nations Industrial Development Organisation (UNIDO) calls “the most upstream element of the technology development process.” The principal aim of technology foresight, according to UNIDO, is to provide “inputs for the formulation of policies and strategies that guide the development of technological infrastructure.”
This form of “horizon scanning” is by no means an exact science, but it can still prove very useful if approached correctly. “Plausibility is essential,” says Steenmans. “Studies often invite only one question: ‘So what?’ This tends to happen when research generates future impact stories that are noticeably different from the status quo but which place only limited focus on the fundamental nature of change. That’s why we tried to outline situations that might actually come about and which could be realistically associated with the massive economic and societal shifts that the IoT is producing.”
Steenmans eventually assembled 10 scenarios, spreading them across four themes: critical water infrastructure, agriculture, marine and smart homes. The intention was to illustrate not only how the IoT is dramatically remoulding the landscape of risk but also, crucially, how the future might help to reshape the present.
CONFRONTING THE UNPRECEDENTED
The construction of “extreme but plausible” stress-testing scenarios is a well-known method of future risk management in the insurance sector. In this instance, given the scale and speed of disruption, a balance between the radical and the conceivable was especially desirable. “We expect to see profound changes between now and 2030,” says Steenmans, of UCL’s Department of Science, Technology, Engineering and Public Policy (STEaPP), “so we wanted insurers to take the leap and envision the immediate, mid-term and long-term effects of the IoT. It
was a case of exploring the likely impacts with a view to preparing for fundamental changes before they happen.” Steenmans and her fellow researchers, Dr Leonie Tanczer, Dr Irina Brass and Professor Madeline Carr, first reviewed the literature on IoT risk projections. They then surveyed dozens of experts – including PETRAS associates, IoT industry figures and specialists in cyber-security and technological disruption – to ensure that the scenarios chosen for the report would be suitably robust, representative and complex. There were also workshops with academics and practitioners, as well as interviews with Lloyd’s underwriters and actuaries. The team used the same basic format for each scenario: a description of IoT devices and technologies that could be deployed by 2030 or beyond; a pathway explaining how events and risks might materialise; an appraisal of critical uncertainties; and an examination of the classes of insurance likely to be affected. A further constant was the severity of the possible outcomes, with the predicted consequences ranging from mass bankruptcies and collapsing infrastructure to street riots and food shortages.
FIGURE 1 CRITICAL WATER INFRASTRUCTURE SCENARIO: ALGORITHMIC SUPPLY BIAS.
Consider, for example, the scenario detailed in Figure 1. Here a cyber-attack on water companies’ communication networks sparks serious supply problems, dangerous miscalculations and, as the crisis escalates, public disorder. The initial security breach is exacerbated by an artificial intelligence system that has been trained on historical usage patterns and so prioritises wealthier neighbourhoods. Businesses fail. Hospitals are rendered unsafe. Entire communities take legal action. Consider, too, the scenario shown in Figure 2. Here the world’s seas are the preserve of uncrewed, autonomous vessels that are loaded with containers whose perishable cargos are kept in optimum condition by temperature sensors. Using hacks illegally sold online, criminals block the ships’ GPS signals and swiftly usher in a new age of global piracy – prompting substantial losses for maritime businesses. “These scenarios reveal numerous critical uncertainties for insurers,” says Steenmans. “There’s the issue of unclear attribution – how do we identify the intentional or unintentional nature of harm in an IoT setting? There’s the issue of cascading and aggregated risk, which is practically inevitable in light of the IoT’s interdependencies. There’s the issue of legislative and regulatory frameworks and how these need to evolve. There’s the ever-present, overarching issue of whether data can be trusted… It’s a lengthy list – and a thought-provoking one.”
FIGURE 2 MARINE SCENARIO: HACKING AND VESSEL PIRACY.
ADAPTING TO A “NEW NORMAL”
The findings of Networked World echo Steenmans’ observations. The report concludes: “‘Cyber’ is a broad concept in insurance, and it will need to be broken down into much greater detail to be effective in modelling risk and exposure, shaping coverage and guiding capital reserve requirements… Much closer partnership with the tech sector is necessary. Expertise and cutting-edge developments in data science could be integrated into the insurance sector through collaborative relationships.” This message has already been welcomed within the industry. “We will know so much more with billions
of additional pairs of ‘eyes’ on the world,” says Trevor Maynard, Head of Innovation at Lloyd’s, whose own Innovation Lab worked on the project. “The challenge is to be ready to make best use of what these eyes see. New forms of insurance may arise. Smart contracts with parametric, sensor-driven triggers may improve the claims process. Advanced data talent will be required to engage with this new world, and actuaries are well placed to play a key role – provided they upskill as necessary.” Policymakers are also taking note. The Parliamentary Office of Science and Technology has referenced Networked World in a briefing entitled ‘Cyber Security of Consumer Devices’.
TABLE 1 OPERATIONAL BENEFITS REQUIREMENTS FOR THE INSURANCE SECTOR
Lord Clement-Jones, the Liberal Democrats’ House of Lords spokesperson for the digital economy, has commended the authors for drawing attention to “a technological advancement regrettably often overlooked,” adding: “UCL and Lloyd’s have championed the emerging leadership role of the insurance sector in the domain of ethical data usage… [Insurers] now need to take on the challenge of developing standards for businesses and consumers.” Praising Steenmans’ scenarios for “illustrating some of the fundamental changes likely to be encountered across multiple sectors,” Lord Fox, the Liberal Democrats’ House of Lords spokesperson for business, energy and industrial strategy, has acknowledged the report’s “contribution to ongoing debates on future codes of practice and models of cyber-security governance.” As shown in Table 1, Networked World offers a simple breakdown of what the IoT is likely to allow insurers to do and what it is likely to require them to do. Steenmans believes that there is no more time to waste in responding to both.
“Ernst & Young referred to the IoT as a ‘futuristic concept’ as recently as 2016 and suggested that many insurers were adopting a wait-andsee attitude,” she says. “But things have progressed rapidly in the few years since then. Usage of the IoT has increased exponentially, compelling insurers to examine the risks and value
propositions to which the IoT is giving rise. It’s no longer a matter of ‘wait and see’. It’s a matter of acting now.”
The O.R. community, too, should take heed of what is happening, since
technology foresight and horizon scanning will undoubtedly play a continued role in recognising and adapting to the “new normal” of unparalleled connectivity. “This was very much an O.R. project,” says Steenmans, Chair of the OR Society’s Special Interest Group on Policy Design. “Historically, O.R. has made great contributions to the development of scenario methodologies, and this work built on that tradition. Our scenarios to date have already highlighted significant considerations for new business models, the operational future of underwriting, insurance claims and modelling. Now the sector is going to have to move quickly to capitalise on this transformation – and we can help it to do so.”
Neil Robinson is the managing editor of Bulletin Academic, a communications consultancy that specialises in helping academic research have the greatest economic, cultural or social impact. The figures are taken from ‘Networked World: Risks and Opportunities in the Internet of Things’, published in November 2018. This can be downloaded at https://www.lloyds.com/news-and-risk-insight/risk-reports/library/technology/networked-world.
THE AD MAN’S DILEMMA BRIAN CLEGG
THE 19TH CENTURY AMERICAN DEPARTMENT store owner John Wanamaker is credited with the remark ‘Half the money I spend on advertising is wasted; the trouble is, I don’t know which half.’ It’s a problem that remains to this day, though there are now many more channels for advertising than Wanamaker had available. The choice of how to spread an advertising budget across TV, print and online poses distinct problems, never more so than in the product launch of a new car.
QUANTIFYING PRODUCT LAUNCHES
Producing the optimum advertising budget split was the challenge taken on by Marc Fischer of the University of Cologne and University of Technology,
Sydney, helping Mercedes-Benz with campaigns for new car models. Mercedes was particularly concerned with product launches as it greatly expanded its introduction of new models, following stagnating sales in the 1990s. This revived the brand but meant that major sums were being spent on repeated launch campaigns – typically several million euros per car – making it essential that the budget was effectively deployed. Fischer puts his interest in O.R. and quantitative measures in marketing down to the need to show evidence for decision-making: ‘Many practical problems require that you compare several decision options. It is not sufficient to evaluate them with gut feeling. O.R. helps with getting to the
IMPACT © THE OR SOCIETY
33
fundamentals and provides a fact base that is transparent. It increases your credibility in the company a lot when you suggest solutions backed up by sound quantitative analysis.’ Because sales usually don’t pick up significantly until well after a launch campaign, something had to be used as a measure to stand in for sales figures. The best indicators available proved to be those around recognition – how much the advertising penetrates the awareness of potential customers. Fischer notes: ‘TNS (a subsidiary of the British multinational market research agency Kantar) and I did some modelling for another automotive client. We found a very powerful relationship between recognition and future car sales. There is other research that has been published in our top marketing journals that provides further support for this relationship and
for other mind-set metrics such as consideration, likability, etcetera.’ Fischer developed a mathematical model, a so-called advertising production function, linking advertising spend to level of recognition and related indicators. The model works by taking data from previous campaigns on relationships between spend in different media and recognition among target customers; this information is used to predict these relationships in future campaigns. Feeding into this model is an online survey.
In the initial implementation in 2012 and 2013, Mercedes used Fischer’s model in four campaigns bringing new cars to the German market. These were integrated campaigns across TV, newspapers and magazines, and online banner advertising, with a budget each time of several million euros. The advertisers had particular target customers in mind each time – for example, one campaign focussed on buyers in the 35 to 69 age range with a net monthly income of €3,000 and over.

RECOGNITION, INVOLVEMENT AND MOTIVATION

The project used online surveys of potential customers, selected to match the target customers for the particular new model, aiming to measure three key performance indicators (KPIs): recognition, involvement and motivation, as illustrated in Figure 1.
FIGURE 1 MEASUREMENT OF AD KPIS - A HIERARCHY OF ADVERTISING EFFECTS (Illustrative numbers)
Recognition reflects how well the survey participant recalls the advertisement; involvement measures how effective the participant thought the advertisement was; and motivation tries to uncover whether the advertisement made the brand relationship with the potential customer stronger. The data from the survey drove the allocation of budget to the different media, giving the decision-makers the ability to see, for example, how many more target customers would be reached with a specific increase in budget, or how to modify the media mix to achieve the same result with a reduced budget. Figure 1 demonstrates how well the campaign performed with respect to these advertising KPIs. In the illustrative example, 7m target customers out of a total population of 10m correctly recognised the advertisement in at least one of the media channels used. Of these 7m customers, 6.3m felt involved by the ad and 3.15m showed a stronger relationship with the brand; 0.85m target customers will eventually buy the advertised new car. The approach taken was a ‘calibrated ordered logit model’. Fischer explains: ‘Assume you want to study how important preparation is for the performance of a marathon runner. Preparation can be measured in terms of number of runs, distances covered, elapsed time since the start of the preparation, etcetera. Performance is measured in time to cover the marathon distance. Assume now you don’t observe the time but only the top ten places. This is a ranking that tells you the winner was faster than the second place and the second place faster than the third, and so on. What we do not know is how much faster the runners were. It is very unlikely
that the runners were evenly spaced in time. We call this ranking an ordered variable which needs different models to analyse it, because we can’t quantify the distances between the ranks. ‘For the Mercedes campaigns, the main objective was to improve the depth of knowledge about the new car among potential new buyers. Mercedes used three major channels: TV, print, and online advertisements, to deliver the message. Obviously, knowledge about the new product is deeper if a target customer remembers commercials both from TV and print. However, we cannot quantify exactly how much “deeper” this is. But what we can say is that it is deeper the more channels that were remembered.’ Compared with some optimisation models this is not a particularly complex approach, but there is considerable uncertainty in the data – the model is driven, after all, by indirect opinions from surveys, not measurable facts. The more complex a model when using uncertain data, the more that uncertainty can be driven up, so care had to be taken to avoid producing a GIGO (‘garbage in, garbage out’) scenario, hidden by the complexity of the model.
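To make the model family concrete, here is a hedged sketch of an ordered logit fitted with statsmodels, on invented data where the ordered outcome is how many channels (0 to 3) a respondent remembered. It illustrates the technique only; it is not Fischer's calibrated model or data:

```python
# Hedged sketch of an ordered logit on invented data: the outcome is how
# many media channels (0-3) a respondent remembered; the exposure columns
# are hypothetical. Not Fischer's calibrated model.
import numpy as np
import pandas as pd
from statsmodels.miscmodels.ordinal_model import OrderedModel

rng = np.random.default_rng(3)
n = 2000
exposure = pd.DataFrame({
    "tv": rng.uniform(0, 5, n),      # exposure proxies per medium
    "print": rng.uniform(0, 5, n),
    "online": rng.uniform(0, 5, n),
})
latent = (0.6 * exposure["tv"] + 0.4 * exposure["print"]
          + 0.3 * exposure["online"] + rng.logistic(size=n))
channels = np.digitize(latent, [1.5, 3.0, 4.5])  # ordered outcome: 0..3

model = OrderedModel(channels, exposure, distr="logit")
result = model.fit(method="bfgs", disp=False)
print(result.params)  # medium effects plus threshold parameters
```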
FROM SURVEYS TO DATABASE
Accordingly, a significant amount of effort was put into data gathering and the quality of the resultant database, which was built up over a range of
campaigns. The underlying surveys were sent out by TNS. The aim was to have over 800 responses per campaign from a random sample, with the outcomes weighted to better represent the target customers. If selected, a respondent was first asked what car they owned. (This was necessary to not repeat an infamous misuse of statistics, where a US car company claimed a huge percentage of the public liked their new car without pointing out that they only surveyed their existing customers.) Participants were then asked for their ‘sympathy’ with competitive car brands – effectively asking how close they feel to the different brands – to balance out any existing bias for or against Mercedes. This is a good example of the kind of question that has to be carefully handled, as many respondents may not feel close to such brands at all, so put in an arbitrary response which might not reflect reality. To test recognition, participants were shown different adverts with the branding removed and asked which brand they were for. The subsidiary indicators of involvement and motivation were then brought in if the advertisement was recognised by asking yes/no questions. To finish off, responders were asked about their media consumption and demographic details. One essential was to minimise the contribution of those who lost the will to contribute meaningfully. The average time to complete this survey was 40 minutes – a very significant commitment of time. While TNS ‘made sure that the survey was executed in a way that minimizes respondents’ fatigue’, it was inevitable that some would not give the questions sufficient consideration. As a result, data from
anyone who spent less than 12 minutes on the survey was not used.
OPPORTUNITIES TO SEE
Inevitably there is a significant gap between the data gathered in the survey and reality. Ideally, the model would be driven by, for example, knowledge of exactly which adverts an individual watched. However, there is no practical way to discover this – particularly as consumers increasingly use technology such as personal video recorders and ad blockers which mean they don’t see adverts – so instead ‘opportunities to see’ (OTS) were calculated. For example, if someone said they watched a particular channel at a particular time twice every four weeks, during a period the advert was run over eight weeks, their OTS would be 2/4 × 8 = 4. This figure was modified with a probability adjustment depending on participants’ recall of the most recent equivalent
period to try to counter potential bias in their memory. A similar approach was taken to estimate OTS for newspaper and online advertising. In the case of newspapers, this was relatively simple, though with online banner adverts there was considerably more complexity because of the way that online advertising exposure is driven by complex search engine algorithms, so a Mercedes advert is not presented every time someone visits a page where it might appear.
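The OTS arithmetic is easy to express as a function. In this small sketch the recall-based probability adjustment is reduced to a single multiplier, since the exact adjustment used in the study is not described here:

```python
# The worked example from the text: watching a slot twice every four
# weeks, during an eight-week campaign, gives 2/4 * 8 = 4 opportunities
# to see. The recall adjustment is a simplifying placeholder.
def opportunities_to_see(views_recalled, recall_weeks, campaign_weeks,
                         recall_adjustment=1.0):
    return (views_recalled / recall_weeks) * campaign_weeks * recall_adjustment

print(opportunities_to_see(2, 4, 8))  # -> 4.0
```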
Starting with a target level for recognition and other indicators, Fischer used the model to derive a budget and media mix that achieved
that target at the lowest possible cost, based on data from previous campaigns. As one output from the model, decision-makers were given a table of predicted recognition, involvement and motivation given different budgets in €500,000 increments, with variants reflecting a campaign that is average, underperforming or overperforming. Towards the end of the campaign, there were sufficient new data to be able, if necessary, to modify the budget and media mix to reflect how the advertising was being received. Figure 2 shows two curves that connect total campaign advertising expenditures with the predicted KPI level, measured in number of target customers. The lower curve reflects the chosen media mix. Management can use this curve to predict the increase in target customers by increasing the overall budget but without changing the media mix. The second curve is based on the optimal media mix.
FIGURE 2 EVALUATE ADVERTISING EFFECTIVENESS – PREDICT ADVERTISING KPIS (Illustrative example: the ad KPI, e.g. motivated target customers in millions, plotted against total advertising expenditures in million €, for the actual and the optimised media mix. A €1.5m higher budget reaches 0.5m more target customers at the optimised media mix; an optimised media mix achieves the same effect at a €1.5m lower budget.)
Management can use this curve in combination with the first one to answer various questions. For example, assume the campaign budget is set at €6m and the KPI target at 2m motivated target customers. By allocating the budget in an optimal way, there is a savings potential of €1.5m without affecting the KPI target.
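That budget question can be framed as a small constrained optimisation. Purely as an illustration, the sketch below minimises total spend subject to a KPI target under an assumed concave response per medium; the response parameters are invented, and the published model is the calibrated ordered logit described earlier:

```python
# Illustrative only: minimise total spend subject to a KPI target, with
# an assumed concave response a_i * (1 - exp(-b_i * x_i)) per medium.
# Parameter values are invented, not taken from the Mercedes work.
import numpy as np
from scipy.optimize import minimize

a = np.array([4.0, 2.0, 1.5])  # saturation levels (millions of customers)
b = np.array([0.4, 0.6, 0.9])  # diminishing-returns rates per million euros
TARGET = 2.0                   # required motivated target customers (m)

def kpi(x):                    # x = spend per medium (TV, print, online), in millions
    return np.sum(a * (1.0 - np.exp(-b * x)))

res = minimize(fun=lambda x: x.sum(),             # total budget to minimise
               x0=np.array([2.0, 1.0, 1.0]),
               constraints=[{"type": "ineq", "fun": lambda x: kpi(x) - TARGET}],
               bounds=[(0, None)] * 3,
               method="SLSQP")
print(f"budget {res.x.sum():.2f}m, mix {res.x.round(2)}, KPI {kpi(res.x):.2f}m")
```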
CUTTING COSTS
An important outcome of running the model was to help with the difficult decision of how to split the budget across the types of media. The expectation had been that TV ads would have become less significant compared with internet advertising. In practice, though, TV had not lost its appeal – if anything it grew slightly – though budget was shifted from print to online. It’s easy to speculate about causes – that, for example, Mercedes customers tend to be older than most and are reading fewer newspapers than
they once did – however, without the evidence provided by the survey this would be a gut feel approach, rather than data-driven. As shown in Figure 3, over the four campaigns where the model was used in 2012/2013, there was a clear improvement in the way that the campaigns were set up, reflected in a reducing potential to save more by changing the mix of media used. Reflecting John Wanamaker’s dilemma, the most important contribution of the model would be if it could bring down the expenditure on advertising without reducing its impact. A senior executive, heavily involved in the project at Mercedes, estimated that they saved around €2m per campaign – 15 to 30% of the overall spend. While it wasn’t certain what would have been spent had the model not been used, data from 108 past launch campaigns were used to set a baseline. He also noted there was a ‘significant improvement of
ad KPIs across campaigns from the recommended media shift’.
The model continued to be used in many more launches after the end of the initial project. Although specifically built for new car launches, Fischer believes that his model has the potential to apply in other areas where brand is central to product launches. There is no doubt that advertising remains something of a dark art, where it is difficult to quantify benefits. However, models such as Fischer's are beginning to allow rationality to penetrate the darkness, providing guidance that improves outcomes and reduces costs.
FOR FURTHER READING
Fischer, M. (2019). Managing Advertising Campaigns for New Product Launches: An Application at Mercedes-Benz. Marketing Science 38: 343–359.
FIGURE 3 SAVING POTENTIAL FROM MEDIA MIX OPTIMISATION
Brian Clegg is a science journalist and author who runs the www.popularscience.co.uk and his own www.brianclegg.net websites. After graduating with an MA in Operational Research from Lancaster University in 1977, Brian joined the O.R. Department at British Airways. He left BA in 1994 to set up a creativity training business. He is now primarily a science writer: his latest title, Conundrum, features 200 challenges in the form of puzzles and ciphers, requiring a combination of analytical and lateral thinking plus general knowledge.
CORMSIS Centre for Operational Research, Management Science & Information Systems
Helping people and organisations make better decisions through advanced mathematical and analytical modelling and enhanced problem understanding
• One of the largest OR/MS groups in the UK
• World-class research with demonstrable real-world impact for 50 years
• 70 researchers and circa 170 UK and international MSc students annually
• Strong links with industry, commerce, public sector and non-governmental organisations
• Dedicated CORMSIS Industry Liaison Officers
• Extensive UK & international alumni network
Areas of Expertise: Optimisation • Healthcare • Simulation • Predictive and Prescriptive Analytics • Transportation and Logistics • Risk and Uncertainty
Engage with us: • Study • Training programmes and courses • Case studies and events • Research • Sponsorship (MSc Summer Projects and PhD students) • Scholarships and awards • Consultancy
Contact us: www.soton.ac.uk/cormsis | CORMSIS@southampton.ac.uk | @cormsis | CORMSIS @ University of Southampton
© Insight Analysis
USING O.R. TO GAIN INSIGHTS TO HELP BRITISH ATHLETES SUCCEED
NICK LADE
INSIGHT ANALYSIS PROVIDES SPORTS PERFORMANCE ANALYSIS AND CONSULTANCY to help professional sports organisations maximise efficiencies, achieve KPIs and deliver insights to improve individual and team performance. Since its creation in 2003 by ex-Bath and England player and coach John Hall, INSIGHT has used its knowledge of and experience with data in
sports to work with an impressive range of clients including England Rugby, Bath Rugby, the Football Association (FA), British Athletics, British Sailing, GB Snowsport, the Lawn Tennis Association (LTA) and Team Sky. Owen Farrell, England, British Lions and Saracens: ‘When in camp for England I have always found the analysis provision provided by INSIGHT to be excellent. They leave
no stone unturned in trying to provide players with whatever video or data they require in order to maximise off-field learning'.
METHODS
Our methods are based on a chain of data collection, data management and data analysis, although our relationship and communication with our clients remains key throughout, be they performance analysts, backroom staff, coaches or the athletes themselves.
Data collection is carried out using whatever methods are needed to achieve the level of detail required for relevant insights to be extracted. Occasionally, clients provide the data in an accessible format, though more often than not a certain amount of creative endeavour is required. With many of the sports we cover, there are elements of manual data collection, or 'coding', of videos of the sports events in question. This might be to focus on certain key instances in competitions which are not easily extracted from the data using other methods. In rugby, we code entire matches from the domestic and international calendar in order to build the most detailed and useful dataset of the English game available. This ensures we have a robust, trusted dataset that delivers much richer insights than the common data feeds.
Data management is an important part of the process, since data inaccuracy leads to errors in the analyses. We manually check data in specific and general cases for clients who require the highest level of accuracy. When speed of delivery is a higher priority for the client, the data is cleaned for general cases as well as the most obvious specific cases of
inaccuracy. Data in sports can often be messy, and the politics of sports organisations often provides a further challenge to working with the best possible data available to aid the athletes' performances.
The methods of data analysis we adopt are driven by the challenges of the problem and the desires of the client in question. At one extreme, we have sports-specialist performance analysts embedded in teams and organisations who are able to use the data and their knowledge of the sport to provide meaningful insights in preparation for and during sports events. At the other end of the scale, we perform data-driven tasks using methods from Operational Research and Artificial Intelligence when necessary. Naturally, there are many tasks which lie between these extremes. Across our services, we aim to use our knowledge of sports and data to achieve the best possible outcomes for our clients and athletes.
Communication is so central to our success that it takes many forms. We maintain consistent and clear communication throughout, using all the 'usual' channels over our entire data process, but in order to keep clients engaged we are constantly trying new methods of delivery. Verbal channels are complemented by multimedia reports and annotated video, so that a story can be told with the data and analysis. Increasingly, we are moving to web-based delivery, making such stories far more interactive and engaging for ourselves and our clients.
Dan Hunt, Performance Director, GB Snowsport: 'I can honestly say that their manipulation of data and dissemination of information is second to none in sport in this country'.
CASE STUDY: PREDICTING PLAYER RANKING PROGRESSION
In sports, it is of interest to be able to map out the potential career journey of a particular team or athlete and then decide how best to support them over their career. Many sports administrators and fans want to know who the best teams and players are, and so ranking tables are created. By using the information in these ranking tables over time, it is possible to look at the progression of given teams and athletes of interest.
Typically, a new athlete or team will start low in the ranking and then, as they start to succeed in their sport, their ranking will improve. This is followed by a period of maturity, when the team or athlete stays near the same rank, before a period of decline in which their rank gets worse and/or they stop competing. Of course, there are exceptional cases of athletes and teams which do not strictly follow this pattern, but the data as a whole will often display this overall trend. An example of an athlete's career progression is presented in Figure 1, with date on the x-axis and rank on the y-axis. By observing Figure 1, it is clear
that for this particular athlete their career can be broken into growth (before 2002), maturity (2002–2006) and decline (2006 onwards).
Within this overall pattern of growth, maturity and decline, it is interesting to understand the progress of each athlete or team. For example, we can explore questions relating to athlete rank and rates of achievement. The answers will be different for each athlete and team, and so, using the data, it may be of interest to model the probability of an athlete or team reaching given threshold ranks over time, such as breaking into the Top 100, Top 50 and Top 20. By using data from similar teams and athletes at comparable points in their own progressions, it is possible to predict how likely they are to break into the top ranks over given timelines. The results of such calculations can be displayed graphically, with future time on the x-axis and the cumulative probability of breaking the given threshold ranks on the y-axis. An example with some data is shown in Figure 2.
Figure 2 shows that, in this particular example, the athlete is more likely than not to break into the Top 100 within 3–4 years. The athlete also has a nearly 40% chance of breaking into the Top 50 in this time. However, their probability of breaking into the Top 20 is only around 10%. This is another indicative pattern found in sports: with each ranking success, it becomes increasingly difficult for an up-and-coming athlete to break into the higher ranks of that sport.
FIGURE 1 EXAMPLE OF AN ATHLETE’S RANKING OVER THEIR CAREER
FIGURE 2 EXAMPLE OF AN ATHLETE’S PREDICTED PROBABILITY OF BREAKING INTO THE TOP 100, TOP 50 AND TOP 20 RANKS OVER THE NEXT 4 YEARS
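To give a flavour of the kind of calculation behind Figure 2, the sketch below estimates cumulative breakthrough probabilities empirically. The 'comparable athlete' histories are invented placeholders, not INSIGHT data, and INSIGHT's production method is undoubtedly richer.

```python
# Minimal sketch: each entry is the number of years until a comparable
# athlete first entered the Top k (None if they never did within the
# observation window). All numbers are made up for illustration.
years_to_break = {
    100: [1, 2, 2, 3, 3, 4, None, None],
    50:  [2, 4, 5, None, None, None, None, None],
    20:  [5, None, None, None, None, None, None, None],
}

def cumulative_break_prob(times, horizon):
    """Fraction of comparable athletes who broke the threshold within `horizon` years."""
    return sum(1 for t in times if t is not None and t <= horizon) / len(times)

for k, times in years_to_break.items():
    probs = [cumulative_break_prob(times, h) for h in range(1, 5)]
    print(f"Top {k}: " + ", ".join(f"year {h}: {p:.0%}" for h, p in enumerate(probs, 1)))
```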
In addition to graphical analysis, further insight can be delivered to our clients through a bespoke website with full interactivity, allowing them to consider general cases as well as specific cases, to help them understand the nature of ranking progression and to monitor specific athletes/teams of interest. Web delivery has advantages over other methods because the interactivity allows each user to have a personalised, non-linear journey in which they can view and extract the most relevant insights. It is easy for us to customise the web platform to allow for extra data or requests by the users, with the possibility of offering a service to be the 'hub' for their data management and analysis requirements.
Chris White, Performance Intelligence and Analysis Lead, Lawn Tennis Association (LTA): ‘INSIGHT provides bespoke delivery, a flexible approach and constant innovation as data and technology grows. Moreover, their analysts know how to integrate and deliver impact to their clients. INSIGHT is the benchmark for Performance Analysis across all team sports’.
SPORT INSPIRING INNOVATIVE RESEARCH
When completing the ranking progression work, it was necessary to explore many possible methods to make such ranking progression
predictions. One of the challenges was the naturally hierarchical, nested structure of the data. For example, if an athlete has broken into the Top 50, they have also broken into the Top 100. Some of the more traditional methods available, although mathematically sound, failed to respect this fact. Occasionally, the model produced predictions for a Top 50 break that were higher than for a Top 100 break at certain time points, which makes little sense, since the Top 50 ranks are contained within the Top 100. As the baseline method provided such favourable results, we are currently developing the driving mathematics to allow for the hierarchical, nested nature of the events. This works by exploiting known results in probability theory and modifying their application to deconstruct the probability calculation into constituent parts which can themselves be calculated and which implicitly respect the nested nature of the event space. Although the calculated predictions will be less biased, applying this 'deconstruction' method introduces extra levels of uncertainty which have to be understood and accounted for. We are currently working on a submission to an academic journal using these results and developing the methodology further.
Building on our extensive work in rugby with the England team, we worked flat out in support of the team this year with the final preparations for the 2019 Rugby World Cup in Japan in the autumn.
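The snippet below is a minimal sketch of the nesting idea described above, with hypothetical numbers (the forthcoming paper's exact formulation may well differ). Because the Top 20, Top 50 and Top 100 events are nested, modelling the conditional factors and multiplying them together means a Top 50 prediction can never exceed a Top 100 prediction.

```python
# Chain-rule decomposition: since Top 20 is a subset of Top 50, which is a
# subset of Top 100,
#   P(break Top 50) = P(break Top 100) * P(break Top 50 | broke Top 100)
#   P(break Top 20) = P(break Top 50) * P(break Top 20 | broke Top 50)
# Keeping each conditional factor in [0, 1] enforces the nesting.
p_top100 = 0.80                 # hypothetical model outputs
p_top50_given_100 = 0.48
p_top20_given_50 = 0.26

p_top50 = p_top100 * p_top50_given_100   # 0.384, never exceeds p_top100
p_top20 = p_top50 * p_top20_given_50     # ~0.100, never exceeds p_top50
print(p_top100, round(p_top50, 3), round(p_top20, 3))
```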
LOOKING TO THE FUTURE
INSIGHT continues to work to develop its methods to help support the successes of British athletes. We continue to work with exciting new clients and new sports, developing our methods to help provide them with the best possible results.
Many of our other British clients work on a four-year Olympic cycle, which will culminate in the Summer Olympic Games in Tokyo in 2020 and the Winter Olympic Games in Beijing in 2022. Just like the athletes, our work in support of the athletes and teams has these major goals and timelines in mind. So, as you watch British athletes compete in future competitions, remember they may well have been assisted by operational research and the work we do at INSIGHT.
Nick Lade has spent the last year as a data analyst within INSIGHT, driving a data-orientated approach to sports analysis and insights. He has recently moved to a data science role within the parent company MyLife Digital, with a central focus on productising this expertise in sports and other industries to help improve outcomes for clients.
LIES, DAMNED LIES, AND … GCSE RESULTS
DENNIS SHERWOOD
YOU WERE PROBABLY EXPECTING to read ‘statistics’ in the title, and you are probably thinking ‘Oh no! Not that old chestnut again!’. But if you are still reading this, you might also be thinking ‘Why GCSE results? What’s going on?’. Well, the answer is that on average, across all subjects, about 25% of GCSE grades, as originally awarded each August, are wrong. And the same applies to AS and A level grades too. To make that real, out of the (about) 6,000,000 GCSE, AS and A level
grades announced for the summer 2019 exams, about 1,500,000 were wrong. So about 1 grade in 4 was, in the words of the title, ‘a lie’; or to a student who needed a particular grade to win a coveted university place, or an apprenticeship, or just for personal self-esteem, and didn’t get that grade, it was indeed ‘a damned lie’. Especially so, since no one knows which specific grades are wrong, or which specific candidates are the victims of having been ‘awarded’ the wrong grade. And now that the school exam regulator
Credit: Monkey Business Images/Shutterstock.com.
Ofqual has changed the rules for appeals, deliberately making it harder to query a result, any candidate who thinks that a wrong grade might have been awarded has a very high mountain to climb to get the script re-marked.
WHAT'S THE EVIDENCE?
Contains public sector information licensed under the Open Government Licence V3.0 http://www.nationalarchives.gov.uk/doc/ open-government-licence/version/3/
‘1 grade in 4 is wrong’ is quite a claim. What’s the evidence? In fact, the evidence is from Figure 12 on page 21 of a report, Marking Consistency Metrics – an update (see bit.ly/ConsistencyMetrics), published by Ofqual in November 2018, reproduced here (with some additional narrative) as Figure 1. This chart shows the results of an extensive study in which very large numbers of scripts for 14 subjects were each marked by an ‘ordinary’ examiner,
and also by a ‘senior’ examiner, whose mark was designated ‘definitive’.
out of the (about) 6,000,000 GCSE, AS and A level grades announced for the summer 2019 exams, about 1,500,000 were wrong
We all know, especially for essay-based subjects, that different examiners can legitimately give the same script different marks. So, suppose that an ordinary examiner gives a particular script 54 marks, and a senior examiner, 56. If grade C is all marks from 53 to 59, then both marks result in grade C. But if grade C is 50 to 54, and grade B is 55 to 59, then the ordinary examiner’s
FIGURE 1 GRADE RELIABILITIES FOR 14 SUBJECTS, AVERAGED OVER GCSE, AS AND A LEVEL
mark results in grade C, and the senior examiner’s mark results in grade B. These grades are different, and if the senior examiner’s mark (and therefore grade) is ‘definitive’, or in everyday language ‘right’, then the ordinary examiner’s mark (and therefore grade) must be ‘non-definitive’, which to me means ‘wrong’.
That’s the explanation of Figure 1. For each of the subjects shown, the heavy vertical line within the darker-blue box defines the percentage of scripts for which the grade awarded by the ordinary examiner was the same as that awarded by the senior examiner. This percentage is therefore a measure of the average reliability of the grades awarded for that subject. So, for example, about 65% of Geography scripts were awarded the right grade and, by inference, 35% the wrong grade. If the subject percentages shown are weighted by the corresponding numbers of candidates, the average reliability over the 14 subjects is 75% right, 25% wrong.
The headline ‘1 grade in 4 is wrong’ applies to the average across only the 14 subjects studied by Ofqual, as shown in Figure 1. Some modelling, however, suggests that the reliability averaged across all examined subjects is likely to lie between 80/20 and 70/30, so 75/25 is a sensible estimate. But whether the truth is 85/15 or 65/35 just doesn’t matter. What other process do you know with such a high failure rate? Has the examination industry not heard of Six Sigma? How many young people’s futures are irrevocably damaged when about 1,500,000 wrong grades were ‘awarded’ last summer alone?
And yes, grades are wrong both ways, so ‘only’ about 750,000 grades are ‘too low’, potentially damaging those candidates’ life chances. By the same token, 750,000 grades are ‘too high’, and
the corresponding candidates might be regarded as ‘lucky’. But is it ‘lucky’ to be under-qualified for the next educational programme, to struggle, perhaps to drop out, and maybe lose all self-confidence? Unreliable grades are indeed a social evil.
Importantly, this unreliability is not attributable to sloppy marking. Yes, in a population of over 6 million scripts, there are bound to be some marking errors. But they are very few. Rather, grades are unreliable because of the inherent ‘fuzziness’ of marking. A physics script, for example, does not have ‘a’ mark of, say, 78; rather, that script’s mark is better represented as a range such as 78 ± 3. Similarly, a history script marked 63 by a single examiner is more realistically represented as, say, 63 ± 10, where the range of ± 10 for history is wider than the ± 3 for physics because history marking is intrinsically fuzzier.
In practice, the vast majority of scripts are marked just once, and the grade is determined directly from that single mark: the physics script receives a grade determined by the mark of 78, and the history script by the mark of 63. Fuzziness is totally ignored, and that’s why grades are unreliable: the greater the subject’s fuzziness, the greater the probability that the grades resulting from the marks given by an ordinary examiner and a senior examiner will be different, and the more unreliable that subject’s grades – hence the sequence of subjects in Figure 1.
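To see how fuzziness alone produces wrong grades, here is a small Monte Carlo sketch. The grade boundaries and fuzziness values are illustrative assumptions, not Ofqual’s, and marks are modelled crudely as normally distributed around a ‘true’ mark.

```python
# Two examiners' marks for the same script are modelled as independent
# draws around the script's 'true' mark; the wider the fuzziness, the more
# often the marks straddle a grade boundary and yield different grades.
import random

random.seed(0)
boundaries = [0, 40, 50, 55, 60, 70, 101]   # hypothetical grade boundaries

def grade(mark):
    mark = min(max(mark, 0.0), 100.0)       # clamp to the mark scale
    return max(i for i, b in enumerate(boundaries) if mark >= b)

def disagreement_rate(true_mark, fuzziness, trials=100_000):
    hits = 0
    for _ in range(trials):
        m1 = true_mark + random.gauss(0, fuzziness)   # ordinary examiner
        m2 = true_mark + random.gauss(0, fuzziness)   # senior examiner
        hits += grade(m1) != grade(m2)
    return hits / trials

print(f"physics-like (± 3):  {disagreement_rate(57, 3):.0%} of grades differ")
print(f"history-like (± 10): {disagreement_rate(57, 10):.0%} of grades differ")
```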
WHAT’S THE REACTION?
You might think that Ofqual would be working flat out to fix this – especially
since the evidence, as illustrated in Figure 1, is from their own research. Unfortunately, this is not the case. If anything, Ofqual are in ‘defensive mode’, if not outright denial.
A few days before the summer 2019 A level results were announced, The Sunday Times ran a front-page article (bit.ly/WrongResults) with the headline ‘Revealed – A level results are 48% wrong’. That headline is journalistic drama – as shown in Figure 1, it is only the combined A level in English Language and Literature that is 52% right, 48% wrong – but the text of the article is substantially correct.
That same Sunday morning, Ofqual posted a two-paragraph announcement (see bit.ly/ResponseToGrades) under the title ‘Response to Sunday Times story about A level grades: Statement in relation to misleading story ahead of A level results’. Here is the final sentence: ‘Universities, employers and others who rely on these qualifications can be confident that this week’s results will provide a fair assessment of a student’s knowledge and skills.’
Phew! What a relief! That grade, as declared on the certificate, must be right after all! Those interpretations of the statistics must be lies, if not damned lies! But, in the first paragraph, we read this: ‘… more than one grade could well be a legitimate reflection of a student’s performance…’
You might like to read that again. Yes, it does say ‘more than one grade could well be a legitimate reflection of a student’s performance’. More than one grade? Really? But why is only one grade awarded? Why does only one grade appear on the candidate’s certificate? And if ‘more than one grade could well be a legitimate reflection…’, how can that other statement, that ‘Universities ... can
be confident that this week’s results will provide a fair assessment of a student’s knowledge and skills’, be simultaneously true?
Please read that ‘Universities, employers…’ sentence once more, very carefully. Did you notice that oh-so-innocuous indefinite article, ‘a’? Yes, the grade on the certificate is indeed ‘a’ fair assessment of a student’s knowledge and skills. BUT NOT THE ONLY FAIR ASSESSMENT. There are others. Others that Ofqual know exist, but that are not shown on the candidate’s certificate. Others that might be higher. Both statements in Ofqual’s announcement are therefore simultaneously true. But, to my mind, that last sentence is very misleading. And I fear deliberately so. Whoever drafted it is clearly very ‘clever’. So perhaps statisticians, or those who use (or misuse!) statistics, are not the only community to whom that cliché ‘Lies…’ might apply.
I think that this is an outrage. Do you?
Dennis Sherwood, Managing Director, The Silver Bullet Machine Manufacturing Company Limited, is an independent consultant who, in 2013, was commissioned by Ofqual to compile causal loop diagrams of the systems within which Ofqual operates. Dennis has also worked closely with several stakeholder communities, particularly schools, to seek to influence Ofqual to publish relevant data (as happened in November 2018), and continues to campaign to get the problem described in this article resolved. See Dennis’s website: https://www.silverbulletmachine.com/. Dennis featured in an interview on BBC Radio 4’s More or Less broadcast on Friday 24th August 2019 (bit.ly/ExamClip).
TRUTHINESS OR FACTFULNESS?
Geoff Royston
For this issue, my column draws upon two books, one whose title includes what was once declared the American “word of the year”, and the other a book that so impressed Bill Gates (“one of the most important books I’ve ever read”) that last year he made it available free to every graduating student in the USA. The first book is Truth or Truthiness, by Howard Wainer; the second is Factfulness, by the late Hans Rosling, renowned for his public lectures about global health and related statistics. (I have previously recommended his TED talks, which have had over 35 million views; see the website https://www.gapminder.org.)
TRUTHINESS
“Truthiness” was a word popularised by the American comedian and TV host Stephen Colbert, who defined it as “a quality characterising a ‘truth’ that a person making an argument or assertion claims to know intuitively … without regard to evidence, logic, intellectual examination, or facts”. In Truth or Truthiness, Howard Wainer argues that resistance to using evidence in decision making derives in part from a lack of understanding of how analytical thinking works (another factor he mentions is conflict between what is true and what is wished to be true: “It is difficult to get someone to understand something when their salary depends on them not understanding it”). The book is described as “a primer in thinking like a data scientist”. Its first part provides a basic introduction to key analytical ideas of causal inference, controlled experiments and so on. The second part of the book focuses on how best to communicate analytical material, emphasising the importance of good graphical displays (see my last column) that empathise with the intended recipients by recognising their knowledge, interests and concerns. While Wainer appreciates that improving the analytical literacy of the population will not change the appeal of truthiness-based arguments to those who like using them, he hopes it will make such arguments less attractive to others. Which brings me on to the second book on which this column draws.
FACTFULNESS
© Hodder
This is Hans Rosling’s swansong, published last year with input from his son Ola and his daughter-in-law Anna. Let’s follow the presentation in Factfulness and start with a quiz. Here are a few questions taken from the beginning of the book; jot down your answers, A, B or C, to each question before reading on (the answers are at the end).
1. What is the life expectancy in the world today? A. 50 years B. 60 years C. 70 years
2. In all low-income countries across the world today, how many girls finish primary school? A. 20 percent B. 40 percent C. 60 percent
3. In the last 20 years, the proportion of the world population living in extreme poverty has: A. Roughly doubled B. Remained roughly the same C. Roughly halved
4. There are 2 billion children (age 0 to 15 years) in the world today. How many will there be in the year 2100, according to the UN? A. 4 billion B. 3 billion C. 2 billion
5. How many people in the world have some access to electricity? A. 20 percent B. 50 percent C. 80 percent
6. How many of the world’s 1-year-old children today have been vaccinated against some disease? A. 20 percent B. 50 percent C. 80 percent
7. How has the number of deaths per year from natural disasters changed over the last hundred years? A. More than doubled B. Remained about the same C. Decreased to less than half
How did you do? Hans Rosling found that only about 7% of people got the question on world poverty right, and 13% the one on vaccination. Over a set of such questions, the average (of 12,000 people in 14 countries) was to get about 20% right. As Rosling points out, this is worse than ignorance: chimpanzees choosing at random would be expected to get 33% right!
Factfulness is subtitled “Ten reasons we’re wrong about the world – and why things are better than you think”. Rosling’s diagnosis is that a lot of people’s knowledge suffers from what he calls “an overdramatic worldview”. His book sets out various “instincts toward drama” that can lead people astray in thinking about global facts and figures. Here are a couple of these “instincts”.
THE GAP INSTINCT
This is the temptation to divide things into two distinct and often conflicting groups, with an imagined gap in between: for example, that the countries of the world are divided into rich and poor, or developed and undeveloped. In Factfulness, Rosling gives, for example, a chart (Figure 1) showing, for each country, survival rates for children to age five plotted against numbers of babies per woman; we can observe exactly that sort of division. But Rosling then points out that that chart was for 1965. Charts from the past can be a very poor guide to navigating the world of today. The chart for 2017 (Figure 2) looks quite different. The gap has been replaced by a continuum, with 85% of countries now inside the box that was named “developed” and only 6% still inside the “developing” box. Similar change can be seen in charts of income, education or electricity; most countries are in the middle, and the world is no longer divided into two.
FIGURE 1 A GAP BETWEEN TWO WORLDS?
FIGURE 2 THE DISAPPEARING GLOBAL GAP
THE NEGATIVITY INSTINCT
This is our tendency (aided and abetted by the media) to notice the bad more than the good, and to romanticise the past, leading to the view that the world is getting worse (“Warning: objects in your memories were worse than they appear”). Information about bad events is much more likely to reach us; good news is rarely reported, especially if it concerns gradual change. Some things are indeed worsening – the climate crisis being perhaps the prime global example – but Rosling argues that awareness of the bad is crowding out knowledge of the good. Look, for example, at Figures 3 and 4 from Factfulness, which show the trends over the last two hundred years in the proportion of the world population living in extreme poverty and in average life expectancy. Even just over Rosling’s lifetime the changes have been dramatic: in 1948 sixty percent of the world’s population lived in extreme poverty, but now the figure is under ten percent. And in the same period average global life expectancy has risen from around forty to over seventy years. Rosling was no Dr Pangloss; he said he was not even an optimist but rather a “possibilist”. His stance was to recognise that things can be simultaneously both bad and better, and that it is important to be able to hold facts about levels (e.g. bad) and about direction and rates of change (e.g. getting better, slowly) in one’s head at the same time.
FIGURE 3 POVERTY TRENDS
FIGURE 4 LIFE EXPECTANCY TRENDS
A short overview cannot provide the full flavour of Factfulness (which goes on to discuss eight more “instincts toward drama” that can lead people astray). Not only, as Bill Gates noted, is it an important book; it is also a great read, its text threaded through with fascinating stories of memorable incidents in Hans Rosling’s professional life around the world. Fortunately, there is a simple solution to this deficiency: read the book.
HEALTH SYSTEMS
Health Systems is an interdisciplinary journal promoting the idea that all aspects of health and healthcare delivery can be viewed from a systems perspective. The underpinning philosophy of the journal is that health and healthcare systems are characterized by complexity and interconnectedness, where “everything affects everything else”. Thus, problems in healthcare need to be viewed holistically as an integrated system of multiple components (people, organizations, technology and resources) and perspectives. The journal sees the systems approach to be widely applicable to all areas of health and healthcare delivery (e.g., public health, hospitals, primary care, telemedicine, disparities, community health). Hence, the principal aim of the journal is to bring together critical disciplines that have proved themselves already in health, and to leverage these contributions by providing a forum that brings together diverse viewpoints and research approaches (qualitative, quantitative, and conceptual).
Co-editors: Sally Brailsford, University of Southampton, UK; Paul Harper, Cardiff University, UK; Nelson King, Khalifa University, United Arab Emirates; Cynthia LeRouge, Florida International University, USA
Explore more today… http://bit.ly/2GgCKYq
Dr Geoff Royston is a former president of the OR Society and a former chair of the UK Government Operational Research Service. He was head of strategic analysis and operational research in the Department of Health for England, where for almost two decades he was the professional lead for a large group of health analysts.
Quiz Answers: They are all C.
JOURNAL OF THE OPERATIONAL RESEARCH SOCIETY
Real applications of OR: forecasting, inventory, investment, location, logistics, maintenance, marketing, packing, purchasing, production, project management, reliability and scheduling. A wide variety of environments: community OR, education, energy, finance, government, health services, manufacturing industries, mining, sports, and transportation. Technical approaches: decision support systems, expert systems, heuristics, networks, mathematical programming, multicriteria decision methods, problem structuring methods, queues, and simulation.
Editors-in-Chief: Thomas Archibald, University of Edinburgh; Jonathan Crook, University of Edinburgh
T&F STEM: @tandfSTEM @tandfengineering
Explore more today… bit.ly/2ClmiTY