Language of the Future 2nd Annual Adarga Symposium on AI Online, 22 September 2020
Hosted by Adarga
About Adarga

Adarga creates and deploys powerful AI technology that analyses huge quantities of data and returns key information to the user as actionable intelligence. Users can then understand the deep insights within their data to drive fast, better decisions – using the data as information to create knowledge.

Our vision is to empower us all to realise the full potential of available knowledge.

The ability to unlock the true potential in all your data is key to fast, effective, smart decision making.

Table of contents

About Adarga
A very warm welcome to Adarga's 2020 AI Symposium – Language of the Future
Our Core Partners
Agenda
Turning the Lens on Yourself
Information, Innovation and Integration: the Military-Strategic Challenges of the New Information Environment
Data – The Important Part of Data Science
"But It's Just a Machine!": Bias in the Age of AI
AI for Natural Language: Where We Are Today, And Where We (Might) Go From Here
Amplifying AI / ML Through Synthetic Environments
Good And Bad Events: Combining Network-Based Event Detection With Sentiment Analysis
Speakers
Our panellists
A very warm welcome to Adarga's 2020 AI Symposium – Language of the Future

The dawning of a new era can be defined by the break it represents with the past. The world certainly looks very different today than it did 12 months ago. During the course of this year we have witnessed unprecedented changes in our society and our economy. Changes that would ordinarily have unfolded over years have been compressed into a matter of months. Crisis is always a powerful catalyst for change and, through crisis, opportunities for progress beckon.

Never has the power and importance of data been more evident. We live in a data-rich world in which the potent combination of computing power, digital connectivity and data is driving technological advancement that impacts almost every aspect of our daily lives. Our ability to collect, process and analyse information from a vast range of available data sources to inform decision making is an essential activity for all organisations. But for most it also remains a uniquely human, time-intensive endeavour. The volume, complexity and interconnectedness of available information is also growing, at a pace and scale that makes this ever more challenging. This places an even greater importance on our capacity to understand, and our swiftness to harness, new technologies and to put them to best use in the real world, supporting us in our complex, everyday tasks.

The current crisis has caught many organisations midstream in their transition from the Industrial Age to the Information Age. By contrast, it has been notable how resilient and agile digitally native organisations have been, and how nimble and effective they have been in response to the current situation. There is an imperative on those organisations that were not so well prepared to regain the initiative, to adapt and evolve more rapidly or risk being rendered uncompetitive and obsolete. And for those institutions that do persist, how do they ensure they are ready for the next unforeseen challenge?
We now face a global backdrop of increased strategic uncertainty and a rapidly evolving competitive landscape. This all comes at a time when the UK is formally considering how it best defines its place in the world and how it ensures that it 'is equipped to meet the global challenges of the future'. Fundamental to this is how we can better use science, technology and data to adapt to the changing nature of the threats we face as a nation.

Data is a critical element in all of this. But data can only be fully harnessed as information, or even as knowledge, if it is effectively refined and understood through technology. As a company, Adarga's particular focus is applying Artificial Intelligence (AI) to enable us to make better use of all this available, human-generated data and information, ultimately to support and enhance all aspects of our endeavour.

AI technologies are poised to transform every industry and activity that we undertake, just as electricity did 100 years ago. AI will allow us to identify unforeseen threats and seize fleeting opportunities in this uncertain, complex and volatile world. The increasing adoption and application of AI will in turn fuel even greater velocity in its development and sophistication in what it can achieve. The pace of change we are witnessing is only going to get faster.

The advent of the Information Age has already catalysed AI's ascension into a realm previously reserved exclusively for humans. And, during the course of the next decade, AI will again make further leaps in human-like capability. Data and information processing will become fully the domain of AI. Not only will routine tasks be handed over to machines, but much of the typical decision-making that consumes people throughout their day-to-day work will also be taken over by AI.

Whilst this continued technological advancement has the potential for significant positive impact on society, for example through improved productivity, providing freer access to the sum of available human knowledge and by promoting free speech, it
can also offer opportunities for distortion, deception and disruption. Our potential adversaries have been quick to exploit the capabilities afforded by new technologies, seeking to strengthen their competitive advantage in the information dimension.

AI is, of course, now a buzzword used by every leader of every organisation. Governments and businesses alike rightly imagine AI being employed for a variety of purposes. However, very few of these leaders have an accurate or comprehensive understanding of how these algorithms and intelligent systems are built, how they operate, the full range of uses that they might be put to or the challenges that exist in applying them in the real world. All of us have a responsibility to understand the impact and importance of the drivers that are fuelling the rapid pace of technological change. Our Symposium aims to help our audience of global leaders to understand the forces behind this paradigm shift, the key technologies driving and enabling change, the opportunities, the threats and the imperative to act – this is the language of our future.

Our Symposium this year focuses on the future of AI and its potential to enhance our prosperity and resilience, and more specifically on the importance of language-based technologies. The sophistication of human language underpins our ability to reason, to comprehend, to ask questions and to communicate. It sets us apart from every other living thing on our planet. But as we move further into the Information Age, AI will encroach even further into capabilities previously only possessed by humans. Knowledge and understanding will become a domain shared by both man and machine, with language being the bridge.

If understood and harnessed effectively, AI has the potential to help us generate economic and social prosperity on a scale never seen in human history. You all have an important role to play in this journey. How will you incorporate AI in your own organisations to drive efficiencies, disrupt outdated business models, revolutionise ways of working and enhance the abilities of your human talent? Our speakers, and other contributors, will share with us their unique insights, offer some practical proposals for how we might all achieve the potential of AI and address the challenges associated with its implementation, and present case studies to bring the promise of this potential to life.

Speed is of the essence. As is the case in any other race, the spoils will go to the winners. I urge you all to become active authors of your own organisation's AI futures.

Rob Bassett Cross
Founder & CEO, Adarga
techUK is a membership organisation that brings together people, companies and organisations to realise the positive outcomes of what digital technology can achieve. We collaborate across business, government and stakeholders to fulfil the potential of technology to deliver a stronger society and more sustainable future. By providing expertise and insight, we support our members, partners and stakeholders as they prepare the UK for what comes next in a constantly changing world.
techUK.org | @techUK | #techUK
Our Core Partners

Amazon Web Services (AWS) is the world's most comprehensive and broadly adopted cloud platform, offering over 175 fully featured services from data centres globally. Millions of customers—including the fastest-growing startups, largest enterprises, and leading government agencies—are using AWS to lower costs, become more agile, and innovate faster.

Azets is an international accounting, tax, audit, advisory and business services group that delivers a personal experience, both digitally and at your door. With over 6,500 people across our office network, we help companies and organisations of all shapes and sizes, public sector enterprises and high net-worth private clients achieve their personal and business ambitions. Whether you're a start-up or a blue-chip, we save your precious time – so you can focus on what you do best.

Founded in 2012 and headquartered in London, Improbable is driven by the potential of virtual worlds to transform our societies, cultures and economies. The company's games business is therefore dedicated to providing better ways to make multiplayer games. Improbable's Defence business combines our parent company's software engineering experience with expertise in computational modelling, AI and data analytics to serve government, defence and national security organisations across the NATO alliance. Working closely with customers such as the UK Ministry of Defence as well as industry partners, our mission is to enable the most sophisticated synthetic environments ever experienced – environments that will transform planning, training and decision support.

Agenda

All hosted online at https://adarga.ai/event/symposium2020. All timings in BST.

14.50 – 15.00  Log in and countdown to start
15.00 – 15.05  Welcome: Rob Bassett Cross, CEO of Adarga
15.05 – 15.40  Opening keynote speaker: Robert O. Work
15.40 – 16.25  Panel discussion and Q&A:
· Chaired by Professor John Cunningham, Columbia University
· Dr Deborah Fish OBE, Scientific Advisor
· Dame Wendy Hall, Regius Professor of Computer Science and Executive Director of the Web Science Institute, University of Southampton
· Dr David Talby, CTO at John Snow Labs
· Dr Colin Kelly, NLP Team Lead at Adarga
16.25 – 16.50  Closing keynote speaker: Professor Sir David Omand GCB
16.50 – 17.00  Summary and close
Turning the Lens on Yourself David Gillian Managing Partner of McChrystal Group European Office
Victor Bilgen Head of McChrystal Analytics
"Humans have been the lone stewards of information for millennia. Our ability to capture, transmit, and use that information has been the key to our dominance as a species."1
In the McChrystal contribution to Adarga's 1st AI Symposium, we wrote about the role of leaders in focusing Artificial Intelligence (AI) capabilities to ensure their potential can be met. As we continue down the path to an AI-enabled world, we still contend that those leaders who overlook or misunderstand their role in guiding and focusing AI capabilities will create rather than remove challenges for their organisation. Not least of these responsibilities is to understand what these capabilities truly mean and the enduring benefits they can bring, as a means to reducing the fear and uncertainty that surround them.

In the course of the coronavirus pandemic, many organisations have found their resilience challenged, prosperity threatened and security compromised. Yet in this testing time, tremendous opportunity exists to understand the reasons why an organisation may have been challenged in its response and to chart a new course forward. From McChrystal's experience, AI, Machine Learning (ML) and Natural Language Processing (NLP) are critical tools in finding this way ahead as they can help divine meaning from the swirl of information that courses through organisations.
"Data, data everywhere, nor any drop of knowledge…"
Today's leaders are swimming in oceans of data, bombarded with insights that are often detached from business impact. Data alone does not provide a sufficient picture to drive business outcomes; it must be coupled with an understanding of the direction of the business. This connection between data and business context is paramount when making decisions in crises. Yet understanding not just your business but the environment in which it operates, with the clarity that enables sound decisions, is an inherently complex task. Indeed, if the intricacies of an organisation's hierarchy, responsibilities, functions, geographies, demographics and other factors are to be truly absorbed into an understanding of the functioning of a business, it is an increasingly incomplete, if not impossible, task without the use of AI-related tools. As a simple study of how ML and NLP, in particular, can be used to build this connection, we will outline how McChrystal uses these tools to build context and meaning from data, allowing leaders to focus their organisations on the issues that matter to their success and to identify the key players in unlocking their full potential. In other words: how can you turn the lens on yourself to drive better business outcomes?
1. McChrystal Group article in "Enhancing Human Ingenuity", 1st Annual Adarga Symposium on AI, 5 September 2019.
The key to resilience: understanding your network
Building resilience in the face of uncertainty takes the cumulative effort of an organisational network. Successful leaders will realise that they can't build this resilience alone, but they also can't impact every person in their organisation's network directly. Taking stock of how your people connect, and then taking smart, measured approaches to drive change through your key individuals, will lead to success in managing through your crises. The chart below depicts the network of a small organisation going through major change and restructuring. Each circle or node is an individual within the organisation, and each line between circles indicates that one individual goes to another as a good source of help or information. The larger the circle, the more people go to them for help or information.
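As a rough illustration of the measure behind such a chart (a hypothetical sketch using the networkx library; the names and connections are invented and this is not McChrystal's actual tooling), ranking people by in-degree in a directed "who do you go to for help?" graph surfaces the most-consulted individuals:

```python
# Hypothetical sketch of an organisational network analysis (ONA) step:
# build a directed "who goes to whom for help?" graph and rank the
# most-consulted individuals. Names and edges are invented.
import networkx as nx

edges = [  # (person seeking help, person consulted)
    ("Asha", "Bruno"), ("Carl", "Bruno"), ("Dana", "Bruno"),
    ("Bruno", "Elena"), ("Carl", "Elena"), ("Dana", "Asha"),
]
graph = nx.DiGraph(edges)

# In-degree = how many colleagues name this person as a source of help
# or information (the "larger circles" on the network chart).
ranking = sorted(graph.in_degree(), key=lambda pair: pair[1], reverse=True)
for person, count in ranking:
    print(f"{person} is consulted by {count} colleague(s)")
```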
Figure 1: Organisational Network Analysis chart. Source: ONA Survey

This data is relatively easy to collect and display. However, to fully explain 'why' it is like this, whether it is 'good or bad', or what can be done to adjust it, developing contextual understanding is essential. Only through the collection and analysis of people's observations and opinions can true understanding of these network connections be established. Yet for an unaided analyst, that task can be immense.

The essential tools for 'hearing' the voices of your people
In a typical McChrystal diagnostic of, say, 2,000 respondents in a major Fortune 500-type company, we collect enormous quantities of both quantitative and qualitative data. This data is used to build a unique picture of how a business is performing and how its internal network is functioning, relative to the business outcomes it is trying to achieve. The quantitative data points collected can be in the order of a million, while we also draw inferences about the wider context using millions more data points from our previous diagnostics. While this quantitative picture tells its own story, the real richness in our understanding comes from the qualitative data we collect, as it helps us get to the meaning and cause of the quantitative picture. This insight emerges from textual responses to 10-15 questions per respondent, and transcribed interviews of 50-100 hours of spoken word. The result is an enormous analytical challenge amounting to hundreds of thousands of rows of data.

In making sense of this collected data, both ML and NLP are our essential tools. In simple terms, ML reduces the dimensions in quantitative data, allowing us to look at much more massive data sets than could otherwise be possible. Through innumerable calculations, in a constant cycle of training and testing a statistical model, it helps us see what data looks similar to other pieces of data, and what data stands out.

NLP helps to reduce the number of responses we need to understand. It gives us an intelligent readout of the core dimensions apparent in the qualitative responses, given the similarities in their answers (based on keywords and phrases). The NLP algorithms help classify responses in a much more manageable way, such that the analysis can be done in a fraction of the time spent by someone with a highlighting pen.
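As a toy illustration of that kind of NLP-assisted sorting (a minimal sketch with invented responses, not the Adarga or McChrystal production pipeline), free-text answers can be vectorised and clustered so that an analyst reviews a handful of groups rather than every individual response:

```python
# Toy sketch: vectorise free-text survey responses and group similar
# answers so an analyst reviews clusters rather than every response.
# The responses below are invented; this is not a production pipeline.
from sklearn.cluster import KMeans
from sklearn.feature_extraction.text import TfidfVectorizer

responses = [
    "Decisions take far too long to reach the front line",
    "Approvals are slow and ownership is unclear",
    "Collaboration inside my team is excellent",
    "My team works well together day to day",
]

vectors = TfidfVectorizer(stop_words="english").fit_transform(responses)
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(vectors)

for cluster, text in sorted(zip(labels, responses)):
    print(cluster, "-", text)
```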
It's still about the leader
What is most important to understand in these processes – and a critical point for those who may think they are giving their insight over to machines – is that neither the ML nor the NLP tools present us with an answer. Through the power of sorting, cataloguing and synthesizing responses, they bring the outlines of a complex picture into view for our analysts. From this we then build the story. A story not of where clusters of people may appear on a network map or where collected scores fall on a graph – you can easily see that on the relevant map or chart. Rather, we can tell you why your organisation is operating the way it is and what can be done to capitalise on opportunities. It enables leaders to respond to the complex reality that actually exists and not what the loudest voice thinks exists. These tools enable us to provide a clarity that can be used to drive business outcomes. In reality, it taps the voices of your people to build your roadmap for the future. How these decisions may impact a business is shown below, in the before and after network map of the simple organisation described earlier.

Figure 2: Before and after network maps. Source: ONA Survey

As we said at the outset, the role of the leader remains paramount in this process of applying advanced technologies to understand the functioning of a fundamentally human endeavour – building, operating and growing a business. In today's world, the lens you use to comprehend your business challenges and the way forward can be made sharper. Understanding the value that the application of advanced AI tools can bring to your business may indeed become clearer by turning that lens on yourself.
Information, Innovation and Integration: the Military-Strategic Challenges of the New Information Environment Paul Cornish* Visiting Professor at the London School of Economics
The invention of cyberspace created unprecedented possibilities for worldwide, near-instant, mass transfer of information, data and ideas. Two decades into the 21st century, enough is known about this manufactured environment to suggest that its effects are possibly best described as revolutionary – technologically, politically, industrially, societally and strategically. Unless, given that it is difficult to discern who, or what, might be driving this supposed revolution, what direction it is taking and where, when or how it might conclude, 'evolutionary' would be more appropriate? Whatever term we use to describe this new environment, it is clear that, as with many pivotal moments in human history, it is both attractive and inspiring on the one hand, and alarming and intimidating on the other.

The attraction of cyberspace is easily explained. Cyberspace is expansive and dynamic: a global communication infrastructure promising not only to shape, but also to improve all dimensions and all levels of human life – cultural, economic, religious, diplomatic, commercial, family, individual, non-governmental and governmental, and so on. What is more, the price of access to these extensive benefits could be as little as the cost of a SIM card (for as long as SIM cards are necessary) and the rules of behaviour do not seem too onerous. But if the opportunities are great, so too are the associated threats and hazards. Cyberspace is routinely exploited by a variety of adversaries, aggressors and predators: hostile states; political extremists and terrorists; businesses practising commercial espionage and theft; individuals and criminal organisations undertaking financial fraud and trafficking in people, armaments and narcotics; and individual so-called 'nuisance' hackers. And certain of these new or evolving technologies, such as Artificial Intelligence, while presumed to be broadly benign, might themselves create unexpected hazards and governance challenges.
The fact that the information environment is developing and is being exploited so intensively – at such a fast pace and in different (often incompatible) directions simultaneously – is what makes it an especially difficult arena for policy making and strategic planning. Yet it is precisely that complexity and urgency that makes it imperative to describe the contours and boundaries of the new information environment as clearly as possible. It should come as no surprise that the information revolution provokes such intense interest at the policy level and among military strategists, commanders and analysts. In military circles any changes and developments in the information environment have always been followed closely. Sun Tzu’s The Art of War, probably written during the 4th century BCE, contains a passage which encapsulates centuries-old military wisdom:
Know the enemy and know yourself; in a hundred battles you will never be in peril. When you are ignorant of the enemy but know yourself, your chances of winning or losing are equal. If ignorant both of your enemy and of yourself, you are certain in every battle to be in peril.2
* Professor Paul Cornish is Visiting Professor at LSE Ideas, London School of Economics. He is editor of the Oxford Handbook of Cyber Security and Cyber Security: A Very Short Introduction, both to be published by Oxford University Press in 2021.
2. Sun Tzu, The Art of War, trans. & ed. S.B. Griffith (Oxford University Press, 1971), p.84.
As Sun Tzu suggests, information, or the lack of it, can decisively influence the course of conflict. Information about allied and enemy dispositions, intentions and logistics, about the nature of the terrain and the likely effects of the weather; in modern military parlance such information can at the very least enable military operations and might at best multiply the effect, say, of a numerically inferior but more cleverly led and 'situationally aware' force. Modern information and communications technologies (ICT) have been influencing military activities for several decades, in the form of encrypted radio, high-volume data transmissions, identification friend or foe (IFF) systems, the doctrine of network-centric warfare and so on. Information is, in short, considered a vitally important commodity at all levels of military activity – tactical, operational and strategic. The UK Ministry of Defence routinely acknowledges the importance of information, militarily and strategically: 'Our Armed Forces need to exploit information to a much greater extent'; 'Achieving a step change in how we exploit information will unlock a step change in our military capability, achieving what we call Information Advantage' [emphasis added]3. Information Advantage has been defined as 'the credible advantage gained through the continuous, adaptive, decisive and resilient employment of information and information systems.'4 Ironically, this is not the most helpful or informative explanation of what has become a term of art in the strategic debate in the UK and elsewhere. The importance of using language precisely and consistently is the first of four challenges that arise when considering the military-strategic implications of innovation in the information environment. The breadth, complexity and urgency of the challenges to international security and national strategy in the early 21st century can prompt a resort to language which tries either to domesticate and neuter the challenges, or to reify both the challenges and our responses to them, turning fluid generalisations into fixed, concrete commodities (sometimes signified by the use of Capitalised Nouns) which are conveniently familiar and manageable, rather than strange and unsettling.

3. UK Defence Innovation Directorate, Defence Innovation Priorities (London: Ministry of Defence, 9 September 2019), p.16.
4. UK Ministry of Defence, Information Advantage (DCDC: Joint Concept Note 2/18, November 2018), p.7.

The language can be reassuringly commonplace and anodyne (e.g., 'the era of persistent competition' or 'political warfare'), or elaborately engineered (e.g., 'Hybrid Warfare' and its cousins 'Next Generation' and 'Grey Zone'). But expressions such as these can be unclear (see 'Information Advantage' above) or confusing (when has 'competition' in international politics not been 'persistent' or 'constant'?). The language can also be misleading. The idea of 'Grey Zone' conflict, for example, tries to persuade us that the binary, monochrome understandings ('peace' versus 'war') that have for long governed our analysis of war and conflict can now be discarded in favour of a third option. But how is it possible to describe a notional no-man's land between 'peace' and 'war' other than in terms of 'peace' and 'war'? 'Hybrid Warfare' is an especially bewildering term; a hybrid animal is one that is not only descended from its parents but is also, importantly, different from them. Thus, a mule is neither a 'hybrid donkey' nor a 'hybrid horse' – it is a mule. The distinctive feature in much of what is often described as 'hybrid' warfare is that it is not 'warfare'. What then is it? If anything, it is political competition (often manifested in information and communication) that has been 'hybridised', absorbing some military methods, rather than vice versa. In general, expressions such as these also hint at the possibility that the new information environment might not, after all, be that much of a problem. Why worry about innovation when what is taking place can be described, defined and contained either within what we already know or, better still, by some soothingly sophisticated new label?

The second challenge is to ensure that the spur to innovate is not blunted by the use of over-familiar or insubstantial language and that the information revolution is recognised for the practical and conceptual challenge that it is. Innovation is not the same as invention and creativity, and neither is it a matter of forcing the products of invention and creativity to comply with current principles and practices. Innovation is concerned instead with the application of ideas and inventions; 'turning ideas into practical, reliable and affordable reality'.5 These new applications might be practical or procedural, incremental or abrupt but, in any case, they imply adaptation and change rather than compliance and continuity. For Armed Forces, particularly those well-versed in the exploitation of timely and accurate information for the successful
5. Mark Dodgson & David Gann, Innovation: A Very Short Introduction (Oxford University Press [2nd edn], 2018), p.13; Matt Ridley, How Innovation Works (London: 4th Estate, 2020), p.29.
conduct of military operations, there can be a tendency to see the information revolution as merely the latest phase in something they have known since Sun Tzu. Thus, improvements in ICT might be welcomed as a reinforcement of the doctrine known as ‘mission command’, allowing information to be distributed rapidly and in volume to very low levels, improving the precision and effectiveness of tactical decision-making. More broadly, the information/decision/action cycle, otherwise known as the ‘OODA loop’,6 could become tighter and faster, enabling improvements in intelligence reporting, target acquisition, situational awareness, force protection and so on. Yet although these could all be seen as significant adaptations, creating tactical and operational advantage over an adversary, they might also be obscuring the far larger, strategic implications of innovation in the information environment. For example, while it might strengthen the doctrine of mission command, a centrifugal approach to information distribution might have the effect of weakening the co-ordination, integration and application of information at the national strategic level. Furthermore, information is becoming much more than an enabler or multiplier of the effectiveness of military activity. Military commanders have always been conscious of the need to capture and hold what is known as vital ground – topographical features, the loss of which would jeopardise the operation. In traditional thinking, ‘vital ground’ might refer to hills overlooking probable lines of enemy advance or to river crossing points. But the implication of the information revolution is that the (awkwardly termed) ‘information space’ is becoming, figuratively speaking, strategic vital ground in its own right. If information is no longer simply one of the many components of a military campaign (together with tactics, manoeuvre, logistics and so on), but has instead become the principal focus and perhaps even the essence of strategy, then information for war has become information war. Moreover, the fixation on information as a campaign enabler might overlook a more fundamental and higher-level strategic challenge; what if some other agency, such as a non-human intelligence, is permitted to construct its own ‘OODA loop’, turning information into actionable knowledge (or, if ‘knowledge’ is too strong a term, ‘activated information’) in order to serve its own ends? In other words, we should pause before assuming that the information has run its course and revealed all it has to offer.
Information can be organised in various categories, from the commonplace to the rarefied and strategically relevant. Access to the latter has traditionally been assumed to be the exclusive preserve of governments, intelligence agencies and armed forces. What most characterises the information revolution, however, is that it commoditises information, making access to even highly specialised information more or less open to those who have the (relatively inexpensive) means to find, collate and analyse it. This widening access is the third challenge to consider. Rather than reminisce fondly (and pointlessly) about a time when strategically significant information was not so freely available, a more innovative response to the information revolution would see a shift in emphasis from controlling access to high-level information to ensuring a more integrated and effective management of it. It is here that the Armed Forces, with their extensive experience of turning information into actionable knowledge and decisions, and acting in conjunction with other government departments and agencies, could make a valuable contribution. This is not to suggest that the Armed Forces should in some way assume command of the national response to the information revolution. But if information has become a genuinely strategic commodity and if ‘information war’ does indeed touch upon all areas of society then the best response to this challenge must involve all or most functions of government and in particular those, including the Armed Forces, that have experience in the competitive exploitation of information. As well as asking “How can the information revolution improve military effectiveness?”, military commanders could also ask “What can the Armed Forces do to ensure the effective, advantageous, timely and decisive management of information at the national strategic level?” This would require a far larger vision of the implications of the information revolution, not only for the Armed Forces but also for government and society as a whole, and of the role of the Armed Forces in co-ordinating the national response. The United Kingdom already has at its disposal the UK Strategic Command, acting as the integrating authority for defence and charged with pursuing deeper integration ‘across government’, with exploiting data more efficiently and effectively through common hosting and standardisation, with standardising networks and information exchanges and with creating a single cloud environment (not unlike the US ‘war cloud’ initiative), and all in order to enable ‘faster and better decision making.’7 There
6. 'Observation, Orientation, Decision, Action': attributed to John R. Boyd. See 'The OODA "Loop" Sketch', http://www.danford.net/boyd/essence4.htm and Robert Coram, Boyd: The Fighter Pilot Who Changed the Art of War (New York: Little, Brown & Co., 2002), pp.327-344.
7. Commander Strategic Command, General Sir Patrick Sanders' Speech at the Air and Space Conference, 15 July 2020: https://www.gov.uk/government/speeches/commander-strategic-command-general-sir-patrick-sanders-speech-at-the-air-and-space-power-conference
is no obvious reason why Strategic Command could not extend this effort into UK government as a whole – if not as the integrating authority for information management across UK government, then at the very least as the integrating catalyst. The fourth and final challenge is also concerned with integration, but in this case with ensuring that the relationship between human and machine intelligence is not only effective but also balanced and appropriately governed. The information environment is already replete with technological developments and challenges: the Internet of Things (or, as some put it, rather dauntingly, the Internet of Everything); Big Data; Human-Machine Teaming; Quantum Computing; and, in the military sphere, developments such as Lethal Autonomous Weapon Systems. The emphasis now being placed on data and intelligence, on the robustness of information- and decision-making networks and particularly on automated/artificial processing and communication is unlikely to diminish. The major growth area in the near future seems certain to be in machine learning and in the progressive development of artificial intelligence from ‘basic’ to ‘augmented’ to ‘general’. As well as offering opportunities to governments and commercial enterprises, these new and evolving technologies also present risk, in terms both of their appeal to cyber aggressors, criminals and hackers and because of what has become known as the ‘problem of control.’
As well as providing practical experience both in exploiting these technologies and in defending against them, Armed Forces can make a more fundamental contribution to discussions about the ordering of the relationship between human and machine intelligence. In Human Compatible, Stuart Russell observes that 'We cannot predict exactly how [AI] will develop or on what timeline. Nevertheless, we must plan for the possibility that machines will far exceed the human capacity for decision making in the real world. What then?' Russell goes on to argue 'AI has the power to reshape the world, and the process of reshaping will have to be managed and guided in some way.'8 In the same vein, Toby Ord warns that 'The advent of military uses of AI will play a role in altering, and possibly disrupting, the strategic balance' and, more broadly, that 'unaligned', 'deep learning' AI presents 'great ethical challenges': 'There are serious concerns about AI entrenching social discrimination, producing mass unemployment, supporting oppressive surveillance, and violating the norms of war.'9

8. Stuart Russell, Human Compatible: Artificial Intelligence and the Problem of Control (Viking, 2019), pp.xi, 249.
9. Toby Ord, The Precipice: Existential Risk and the Future of Humanity (London: Bloomsbury, 2020), pp.102, 141.

The prospect of a deep and rapidly learning, non-human intelligence selecting and acquiring the information and data it considers significant, processing those inputs through an OODA loop it has designed to serve its own purposes and subsequently making decisions which might or might not be aligned to human values ought to be a cause for concern – a regulatory and governance challenge of the highest order. The task of 'managing and guiding' machine intelligence is one in which Armed Forces should play a central role, partly because of their practical experience of the benefits and hazards of these technologies – 'Full appreciation of any adverse consequences of technology is needed for its effective regulation.'10 But the most valuable military contribution would be cultural, in the insistence that whatever the national strategic mission might be, it must always have an explicit political purpose, must always be ethically governed and must always be competently commanded. To that end, some innovative use of familiar language might be permitted. The military preoccupation with distributing information horizontally and vertically, to ensure integration as well as timely and effective 'mission command', could become the basis of a government- and society-wide determination to ensure 'command of the mission' – by humans. The human management of complex problems calls for language to be used intelligibly as well as intelligently, making it possible to draw upon the considerable knowledge and wisdom – technological, strategic and ethical – readily available in the UK and elsewhere.

10. Dodgson & Gann, Innovation, pp.102, 141.
Data – The Important Part of Data Science Dr Dan Clarke Head of Applied Science, Adarga
The topic of artificial intelligence has captured human imagination for centuries – from the automatons in ancient Greece; through to the founding of Artificial Intelligence (AI) as an academic discipline in the 1950s; and up to the present-day concepts of deep learning, big data and the idea of cognitive intelligence (or artificial general intelligence). As a result of ongoing research effort in this field, artificial systems have been built which outperform their human counterparts in a number of areas. For example, AI has been shown to outperform some of the best human players in games like Go, Starcraft 2 and Dota 2; Facebook's DeepFace facial recognition software achieves a similar accuracy to humans; and Microsoft's speech recognition AI is able to transcribe audio with fewer mistakes than humans. However, the concept of a cognitive intelligence has still not been realised despite the promises of big data and deep learning and the aforementioned results AI has already achieved. The limiting factor is cognition and skill, something which is difficult to achieve in the current generation of connectionist and associational pattern recognition models which form the vast majority of modern artificial intelligence algorithms.

In almost all areas, AI is used to undertake a very specific set of tasks, governed by a well understood and logical set of rules. These systems use expert domain knowledge as part of their design process. For example, while the boardgame Go has a complex set of possible actions, the ruleset governing the possible actions and outcomes is well defined. In the context of facial recognition, the algorithm is trained using a large dataset of example faces which have been pre-labelled (though there has been some recent promising work in single-shot facial recognition). It is important to highlight at this point that the algorithm engineers have selected a data set and labelled the class (identity) that each face example belongs to. That is, the engineers have provided the solution that the AI algorithm should learn, and the AI algorithm is essentially just a numerical model representing that learning. New examples can then be classified (the identity is inferred) by simply applying that numerical model to them.
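That pattern, labelled examples in, a fitted numerical model out, can be sketched in a few lines. The example below is a hedged illustration on a small public digits dataset rather than faces, and is not the facial-recognition system described above:

```python
# Sketch of the supervised-learning pattern described above: labelled
# examples in, a fitted numerical model out, new examples classified by
# applying that model. Uses a small public digits dataset, not faces.
from sklearn.datasets import load_digits
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = load_digits(return_X_y=True)               # engineer-labelled examples
X_train, X_new, y_train, y_new = train_test_split(X, y, random_state=0)

model = LogisticRegression(max_iter=5000).fit(X_train, y_train)   # the "learning"
print("accuracy on unseen examples:", model.score(X_new, y_new))  # applying the model
```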
Expanding on this, one of the key concepts of a generalist AI is that if you select a training data set with a sufficient volume of examples to cover all of the variety the system is likely to see, then it will be able to generalise to all possible cases. As an example, consider the task of detecting and classifying traffic signs in modern cars. The system must be able to detect traffic signs in a range of lighting conditions, at different ranges, in different weather conditions and with different local variations (English and Welsh, for example). The machine learning algorithm must be sufficiently complex to allow all of the different examples of a single traffic sign to be correctly assigned to that traffic sign, no matter their variation due to locality, lighting and weather. Consider again that the majority of machine learning algorithms are connectionist and associational pattern recognition models – that is to say, the algorithm fits to the data it sees. This is an important point, as machine learning algorithms will struggle to predict precisely in areas where there are no data examples, or where the data samples are sparse, as outlined in Figure 1.
Figure 1: Fitting a neural network model. The data points shown in red have an inherent deviation which the machine learning algorithm attempts to 'fit'. However, in areas where there are few or no data points, the model's predictions diverge far from the mean.
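The effect Figure 1 describes can be demonstrated with a small, illustrative experiment (our sketch, not the figure's actual code): re-train the same network several times on noisy data that covers only part of the input range, then compare how much the models disagree inside versus outside that range. The spread is typically much larger where there is no data:

```python
# Sketch of the effect behind Figure 1: models re-trained on the same
# noisy data broadly agree where data exists and tend to disagree where
# it does not. Data, ranges and model size are illustrative.
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)
x = rng.uniform(0, 3, 80).reshape(-1, 1)           # observations only cover 0..3
y = np.sin(x).ravel() + rng.normal(0, 0.1, 80)     # noisy targets

models = [
    MLPRegressor(hidden_layer_sizes=(32, 32), max_iter=5000, random_state=seed).fit(x, y)
    for seed in range(5)
]

for point in (1.5, 6.0):                           # inside vs. outside the data range
    predictions = [m.predict([[point]])[0] for m in models]
    spread = max(predictions) - min(predictions)
    print(f"x = {point}: spread across re-trained models = {spread:.3f}")
```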
Not all events where the machine learning will operate are equal, and it is necessary to consider what the data means in the context of the application that we are trying to address. One of the most well-known applications of AI in modern society is that of highly automated or autonomous driving. To give an example, the automated taxi firm Waymo have driven over 20 million miles training their vehicles. But what does such a large volume of data actually mean from both a statistical and an application perspective? The vast majority of those data points will likely be very similar, showing standard every-day situations where nothing remarkable happens (e.g. driving on a well-lit road where nothing extraordinary happens). However, it is not the uneventful situations which are of interest to the system, but the eventful – the situations where something unusual happens. For example, if we consider that an accident happens every 300,000 miles, and that for every accident there are about 10 scenarios that could cause an accident, the 20 million miles of data is reduced to only around 667 eventful data points that are really interesting to the artificial intelligence system.
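Written out explicitly, the back-of-envelope arithmetic behind that figure of roughly 667 eventful data points (using the illustrative accident rate and scenario count quoted above) is:

```python
# The back-of-envelope numbers used above: 20 million miles of driving
# data, roughly one accident per 300,000 miles, and ~10 precursor
# scenarios per accident (all illustrative assumptions from the text).
total_miles = 20_000_000
miles_per_accident = 300_000
scenarios_per_accident = 10

eventful_points = total_miles / miles_per_accident * scenarios_per_accident
print(round(eventful_points))  # ~667 genuinely interesting data points
```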
In the highly automated driving example, the most common scenarios will be represented by millions of miles of examples and a generalised model can be developed to cover these. Those 667 data points – the edge cases – represent the most interesting and dangerous situations for the vehicle; however, they are likely too sparse for a generalist algorithm to be effectively developed for them.

Consider this thought example: there is a drunk person crawling across the road in a panda costume at Halloween – what do you see? You probably see a drunk person crawling across the road in a panda costume. But what would the machine learning algorithm 'see'? That very much depends on what data was used to train that algorithm, and it is quite certain that the algorithm wasn't trained using hand-labelled examples of drunk people in panda costumes. What would the car recognise in this scenario?

In most Data Science applications there is an interest in different aspects of the data and what the data represents. Consider the normal distribution, which represents many natural phenomena. The vast majority of example data points are distributed around the mean (that is, they are close to the average). We could split our use cases into the background use cases, which cover the vast majority of what the system will encounter and which are relatively average and stable (the vast majority of driving cases). The foreground cases then become the statistical outliers at the edge of the normal distribution, which the system may only ever observe a small number of times in its lifetime – the drunk person crawling across the road in a panda costume.

These foreground cases are often the most interesting – in the case of the highly automated car it is those dangerous scenarios that have the potential to cause an accident. In the context of an intelligence analysis example, it is the small detail that helps to correlate a broader hypothesis. For those cases which are statistical outliers, a purely data-driven machine learning approach is not sufficient. It is likely that the AI will miss them, as the connectionist and associational pattern recognition models have been trained to have the best performance across all use cases, where the background use cases dominate the model. For the statistical outliers, the machine learning algorithm must be both developed and then tuned specifically for that use case. That is, the engineering team needs to be expert not only in the algorithm development, but also in the application-specific use case and the data used within that use case.
Figure 2: The relationship between data, expert knowledge and a machine learning algorithm.
Within Figure 2, it is important to note that an effective machine learning algorithm (represented by the blue line) can be developed using a combination of data and expert/domain knowledge. What this means is that at one end of the scale we have the generalist approach, where a very large volume of training data is used with little expert knowledge to design and tune the algorithm. At the other extreme – the specialist approach – a smaller data set is used (which is heavily cleaned and pruned) and a highly specific, tuned algorithm is utilised. That is to say, if large volumes of data are available, a generalist approach to algorithm design can be undertaken.
However, where the data is sparse (i.e. the outliers discussed earlier) a far more specific approach must be followed.

So, what does this mean for those aspiring to develop effective AI and Data Science? For applications hoping to accelerate performance using artificial intelligence and data science, data is one of the fundamental building blocks. Data alone is not the accelerant. Rather, the data must be combined with an effective algorithmic approach and expert domain knowledge regarding the application. Furthermore, there cannot be a single approach to any given application. Take as an example the application of AI and Data Science to intelligence analysis, where the aim is to take large volumes of data from disparate sources and to extract information and knowledge from the data. The information level can be considered as the broad relationships between entities of interest, whereas the knowledge is what those relationships mean in the context of the intelligence questions. There is a need to map the broad relationships between data to understand the general pattern of life, while simultaneously probing the strength of different relationships to answer the specific intelligence questions being asked (presence of the abnormal).

In conclusion, the trend towards a general artificial intelligence that can learn any intellectual task requires a data set that is sufficiently large and with sufficient variety to cover all possible eventualities. While the academic study of big data and deep learning is helping to progress towards these goals, the most interesting and impactful tasks remain outliers, where there is still insufficient data (points of observation) for a generalist approach to algorithm design. There is no one-size-fits-all, and an effective data science or artificial intelligence approach must include both the generalist and specialist approaches. The specialist approach will differentiate the system and must be supported by the right data, labelled and tagged to support the application requirements. This in turn means that the data scientists and algorithm design engineers must be supported by (or even be themselves) experts in the application they wish to solve. This is not purely a software problem, and the most successful organisations will effectively integrate the subject matter experts with the data scientists during the design and development process. The data that the data science algorithms are trained on and utilise must be specific to support the use case.
“But It’s Just a Machine!”: Bias in the Age of AI Dr Tom Todd Data Science Team Lead, Adarga
Dr Colin Kelly NLP Team Lead, Adarga
AI promises to accelerate and enhance decision-making in all spheres of life, but can we trust AI algorithms to make fair decisions? Biases in AI systems can lead to discrimination, poor performance, and a lack of trust in the system's results. Tackling bias is a crucial step to enable the adoption of AI, particularly in the high-stakes applications where it can be most valuable.

Before the advent of AI, the source of algorithmic bias in an automated system was clear: it was due to the programmers who developed it. The decisions they made while solving the problem resulted (intentionally or unintentionally) in an algorithm which gave a biased result. The story with AI is a little less clear. The power of AI is its ability to learn the logic of an algorithm from data (we'll call this AI-derived algorithm a 'model'). This means it's no longer necessary for a programmer to meticulously think through each logical step of the algorithm, as would be the case in traditional software development. This is an immensely powerful technique for solving difficult problems quickly, accurately and at scale. However, it introduces an effect that we've not seen before in algorithm development. When logic is inferred from data, any bias in the data will be 'baked in' to the model produced by the AI. If AI practitioners are not careful, they can easily produce models which perpetuate the current state of the world, unwelcome biases and all.

The aim of this paper is to consider the different kinds of bias we must account for when working with AI and the mitigation strategies we can pursue to avoid them, and to look at how this situation may change in the future. Defining bias as systematic error, or one not borne out of reason, it is easy to see that bias is undesirable and should be avoided. But it's worth considering the various
sources of bias - the existence of bias is not a modern problem nor is it unique to scientific endeavours. As the saying goes, “history is written by the victors”, and within that we encounter our first form of bias – observer bias. History books tell us the ‘facts’ about our world in the past, but any budding historian learns they must work hard to control for the subjective prism of their sources’ viewpoint. When collecting data to train our algorithms, data scientists must be no less careful. Confirmation bias – placing greater weight on that which supports one’s preconceptions – can be easily capitalised upon, for good and for ill: if an algorithm has identified that a social media user engages with 5G conspiracy theory articles, who better to supply further 5G ‘evidence’ to? We data scientists and AI developers can be no less susceptible to confirmation bias. This is why the gold standard for scientific research involves double-blind experiments. After all, modern machine learning and AI platforms take a snapshot of a slice of the world through the data they ingest and then use it to characterise that world in some way. While we humans have (albeit imperfect) means to overcome our own biases, these machines do not. We can access other print, television and online news sources and possess our own ability to reason about and to test our theories – a pre-built, use-case-specific model or algorithm will not.
Indeed, some of the structural issues (in data, and in society) can inadvertently seep into entire research domains – and natural language processing (NLP) is no exception. For example, Mielke (2016) showed that the vast majority (90%+) of NLP research, as measured by submissions to ACL, has consistently focused on the English language. In second place was Chinese. Meanwhile it's estimated that there are around 6,500 languages spoken in the world, with quite the long tail. There is a clear impact of this – if there is a whole body of research supposedly focusing on language in general but in fact devoted to just one or two languages, we run the risk of missing parallels and differences between languages, biasing our methodology to the idiosyncrasies of the dominating language and failing to capture nuance in other languages. As Bender (2019) put it: "English is neither synonymous with nor representative of natural language." This structural bias in research, a product of the combination of researchers' native tongues, locations, funding sources, accessibility and countless other considerations, presents a slow-burning risk to the field as a whole. Not only that, but using vast corpora – such as the mega-corpus of almost a trillion words used to train state-of-the-art language models like GPT-3 – to 'understand' language can confer a misleading sense of accomplishment. Similar to our budding historian from earlier, it is the well-resourced English language which will dominate, potentially pushing other languages out of sight, and out of mind.
tic differences are themselves important to language research and understanding. Can we transfer learning from English onto a low-resourced language without losing the ‘essence’ of the target language? Or could this be a new form of colonialism for the AI era? There are a growing number of technical approaches that we can use to remove bias from AI systems. They can be grouped into three broad categories: data, explainability, and feedback. These approaches are all underpinned by a single factor which must also be grasped: a clear statement of what the AI system is going to do, who will be using it, and who else may be impacted by the system. The first element to consider when thinking about AI bias is the data used to build the system. We must ensure the data is representative. This means several things – firstly that it represents the behaviour that you want to learn (for example: if we are building an AI to detect the sentiment of a tweet we must collect examples that include all of the sentiments we wish to detect), secondly we need to ensure that the data we have collected is representative of the data we will want to apply it on (do we have tweets from a broad range of demographics? About a wide range of subjects? Are different dialects represented? Is our data labelling accurate across these different groups?). Here it really helps to have a clear understanding of the data that the AI will be asked to make predictions on.
Could this be a new form of colonialism for the AI era?
Not only that, but this homogenisation may well skip out on some of the dialectic disparities and nuances which exist internally to the language. In voice recognition, my Canadian friend can switch on the lights easily using Google Home, while I need to feign an American accent to turn them off. This ‘bias’ towards a specific accent seems innocuous yet is shaping my actions.
While we are used to hearing the mantra of ‘more data is better’, limiting certain kinds of data is a valuable tool to limit bias. Removing data which is unnecessary or may have a protected status (e.g. race or sexual orientation) can reduce the chance this data will become a confounding factor in the AI model. In the previous example this could mean removing the ‘geolocation’ metadata for our tweets, so that the AI model doesn’t learn an erroneous rule that simply categorises sentiment based on location.
A certain level of homogenisation may be acceptable – if not essential – for a range of downstream use cases, but we should stay keenly aware that these internal linguis-
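To make this data-limitation step concrete, the short sketch below shows one way it might look in code. It assumes a hypothetical labelled tweet dataset with the file and column names given in the comments; it is an illustration of the general idea rather than a description of any particular production system.

```python
# Illustrative sketch only: dropping metadata that could act as a confounding
# factor before training a sentiment model. The file and column names
# ("tweets.csv", "text", "sentiment", "geolocation", "user_dialect") are
# hypothetical placeholders.
import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

tweets = pd.read_csv("tweets.csv")

# Keep only the signal we actually want the model to learn from, discarding
# unnecessary or protected attributes entirely.
tweets = tweets.drop(columns=["geolocation", "user_dialect"])

model = make_pipeline(TfidfVectorizer(), LogisticRegression(max_iter=1000))
model.fit(tweets["text"], tweets["sentiment"])
```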
Explainability in AI systems is an important subject in its own right, and it also has a bearing on tackling bias. The internal logic of modern AI models is difficult to understand. Developing tools to understand why an AI has made a certain prediction gives us valuable insight into whether the model’s logic is biased. A fun example is the ‘cat or dog’ image classification task, a common open challenge used in educational contexts. Applying explainability techniques to a model trained on this data shows that one of the most important features used by the AI is whether the image has a green background. Why? Dogs are usually photographed outside, and cats are mostly photographed inside! The bias implication is that outdoor cats will not be well served by our classifier, and we should consider isolating the background before using this data.

The final broad category we will mention here is feedback. As we discussed at the beginning of this article, AI models are trained on data which represents a snapshot of the real world. As time goes by the state of the world changes and our models should adapt with it. The input data that the model is making predictions about should be monitored to ensure that the training data remains representative of it. When the input data diverges from the training data, the model should be re-trained with an updated data set. Throughout the lifetime of a deployed model it is important to continue to test performance against newly labelled data. Changes in the model’s accuracy on new data might indicate that the model is ‘drifting’ away from reality, and can act as an early-warning system for bias in the model. Perhaps the most important type of feedback comes directly from the model’s users and the other stakeholders who are affected by it. Giving users the ability to flag incorrect predictions can alert us to situations where model performance may be biased, providing us with an opportunity to act.

Having developed and adopted some of these technical approaches, it is worth acknowledging that they can only go so far – it is simply impossible to eliminate all bias. So what can and should a socially responsible organisation do to make sure that the negative consequences of bias are indeed mitigated? A technical solution only goes as far as it is deployed – and it is still a brave (foolish?) leader who fully entrusts a computer or algorithm with any significant decision-making without a human somewhere in the loop, acting as a safety valve.
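Returning to explainability, one simple, model-agnostic way to probe a trained model’s logic is to measure how much its performance degrades when each input feature is shuffled. The sketch below uses scikit-learn’s permutation importance on a hypothetical insurance-style dataset; the file and column names are invented for illustration and are not drawn from any system described in this article.

```python
# Illustrative only: inspecting which features drive a model's predictions.
# The dataset and column names ("policies.csv", "age", "annual_mileage",
# "years_licensed", "claim_made") are hypothetical.
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

data = pd.read_csv("policies.csv")
X = data[["age", "annual_mileage", "years_licensed"]]
y = data["claim_made"]

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
model = RandomForestClassifier(random_state=0).fit(X_train, y_train)

# Permutation importance: how much does accuracy drop when a feature is shuffled?
result = permutation_importance(model, X_test, y_test, n_repeats=20, random_state=0)
for name, score in sorted(zip(X.columns, result.importances_mean),
                          key=lambda pair: -pair[1]):
    print(f"{name}: {score:.3f}")
```

A surprising ranking, for instance a proxy for a protected attribute dominating the list, is exactly the kind of signal that should send us back to the training data.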
Explainable models may indicate why the newly licensed 20-something was deemed more risky than the mother of two for car insurance – but a different dataset might tell you that the 20-something works from home and avoids driving, whilst the mother recently had her licence suspended. Explainability is not a panacea, but it does help us identify gaps and assumptions in our inferences.

Dealing with bias is not a novel challenge – it is a human problem which requires ethics, experience and knowledge. What is new is the impact, and the rate at which mistakes can be made – the uplands of automation and efficiency can bring with them a ruthless and unforgiving robotic authority. “Computer says no.” Users of intelligent agents need to pose the right questions, as they would for any policy. What are the downstream impacts of our use of this model, even with mitigations? What are the unanticipated human consequences? And if these are harmful or contentious – what will we do then? Whether delegated to a committee or resting on the shoulders of an individual, these decisions, for now at least, can only and should only be taken by humans.

References
1. Mielke, Sabrina J. (2016). “Language diversity in ACL 2004–2016”. https://sjmielke.com/acl-language-diversity.htm
2. Bender, E. M. (2019). “High Resource Languages vs Low Resource Languages”. The Gradient. https://thegradient.pub/the-benderrule-on-naming-the-languages-we-study-and-why-it-matters/
AI for Natural Language: Where We Are Today, And Where We (Might) Go From Here
By Professor John Cunningham, Columbia University
Perhaps no part of machine learning and artificial intelligence has seen more exciting and rapid growth in the last five years than natural language processing. Be it machine translation, speech recognition and transcription, natural language generation, or other notable challenges, these years have taken us from early but exciting possibilities to technologies and commercializable products that are in some cases competitive with, or indistinguishable from, human experts. What has followed, of course, is a wave of hype, often including grandiose or dire prognostications about the future of humanity. In this article I will ground this hype in some reality: first, by detailing an exciting example of where we are today – GPT-3, a triumph of engineering and machine learning that has produced startling results in natural language generation; and second, by speculating on the limit case of this line of technological progress, not as a threat to humanity, but as a technical and commercial disruption analogous to the last massive change: the internet.
Today: Natural Language Generation in 2020
As an example, consider the problem of natural language generation: we train a machine to take a seed text and generate free text, perhaps of extensive length, that follows sensibly from the seed. Q&A systems, chatbots, autocomplete and similar applications have been advancing this technology for years, but the state of the art could not get beyond a few sentences before losing coherence altogether. Enter the Generative Pre-trained Transformer 3 (GPT-3), released in 2020, which can in some cases generate several paragraphs of coherent, human-indistinguishable text.
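GPT-3 itself is available only through a hosted API, but the seed-and-continue pattern described above is easy to demonstrate with an openly available predecessor. The sketch below uses the Hugging Face transformers library and the much smaller GPT-2 model; it illustrates the recipe, not GPT-3’s scale or quality, and the seed text is invented.

```python
# Minimal sketch of seeded text generation with an open model (GPT-2).
# Illustrative only: GPT-3 is far larger and is accessed via a hosted API.
from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")

seed = "In the next decade, natural language technology will"
samples = generator(seed, max_length=60, num_return_sequences=2, do_sample=True)

for sample in samples:
    print(sample["generated_text"])
    print("---")
```

The samples GPT-3 produces go far beyond what a small model like this can manage, which is precisely the point of the discussion that follows.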
First, it should be noted that the results are at times stunning, and anyone who has not seen them would do well to read some of the better samples produced by this model. Second, at a technical level, the progress made by GPT-3 is one of engineering. That is not to detract from the accomplishment, but to clarify that these results are the success of gathering the data and creating the infrastructure to train such a model at massive scale (which itself combines advances in deep learning, attention models, embeddings, and more), rather than of some revolutionary insight into artificial or biological intelligence, speech, and so on. Third, you will hear people speculating wildly on its implications, or naysayers claiming GPT-3 cannot reason, think, or similar. These comments are catchy but by and large distracting, in so much as they appeal to poorly defined (or poorly understood) notions of reasoning and thinking. Let us avoid this trap. Instead, recognize that language generation is a process of pattern recognition (understanding what has been said and written) followed by sampling forward paths (what words will I choose, what themes will I explore). A suitably engineered system, artificial or biological, with an expressive model and extensive training data should learn patterns of human text.
Tomorrow: the commoditization of understanding
With a model that can (sometimes) generate large blocks of human-indistinguishable text, it is understandable to jump to the conclusion that all human intelligence is in jeopardy. I’d like to set that aside and suggest an earlier opportunity and risk. Reflecting back on the internet, its core revolution was a commoditization of information: data that was once expensive – to gather, index, search, and transport – became essentially free. A recent and worthy book, Prediction Machines, has this as its central observation of what the internet offered to the world, both to humanity and to the commercial landscape. With this commoditization in hand, subsequent developments – search engines, targeted advertising, streaming media, social media, and more – all appear as natural consequences of information becoming essentially free. It is an imperfect summary of the internet revolution, but it frames the key question we face today: as natural language processing technologies become increasingly indistinguishable from humans, they will commoditize not information, but the understanding gleaned from that information. Twenty-five years ago, a researcher or analyst would have spent weeks gathering information and then reading and extracting understanding; today we get the information for free and spend hours or days reading and extracting knowledge. In 10 years or sooner, taken to its logical conclusion, this analytical process becomes a single query, returning a summary optimized to convey the core understanding to its reader, or an agent who can interact and teach the core concept. Consider the implications of this possibility for the professional services industry, much of which (e.g. legal) revolves around advisory work and the conveyance of understanding. Such a development may sound mundane or less than revolutionary, but I return to the internet analogy: in 1995, the claim that the budding internet would commoditize information might have sounded mundane as well, with little idea of the companies that would thrive, the skills that would be displaced, and the entire industries that would be created as a result. As we look to a future of increasingly powerful natural language technologies, we can begin to speculate on the many windfalls and challenges humanity will face from the potential commoditization of understanding.
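As a small, present-day hint of what ‘understanding as a single query’ might look like, the sketch below runs an off-the-shelf summarisation model from the Hugging Face transformers library over a document. It is a toy illustration of the direction of travel, not the end state described above; the model name is simply a publicly available default and the document text is a placeholder.

```python
# Toy illustration of "a single query returning a summary".
# Uses an openly available summarisation model; purely illustrative.
from transformers import pipeline

summarizer = pipeline("summarization", model="sshleifer/distilbart-cnn-12-6")

document = (
    "Long report text goes here. In practice this could be a contract, "
    "a news archive, or a stack of research papers gathered on a topic."
)

summary = summarizer(document, max_length=60, min_length=10, do_sample=False)
print(summary[0]["summary_text"])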
PROUDLY SUPPORTING ADARGA’S ANNUAL AI SYMPOSIUM
Celicourt is a leading communications consultancy, focused on effective engagement with key stakeholders, to ensure our clients achieve their strategic goals.
CELICOURT.UK
Amplifying AI / ML Through Synthetic Environments
Richard Warner, Director Partnerships, Improbable
Dave Culley, Product Manager, Improbable
Introduction
The current technological revolution that’s transforming our industries, economies and lifestyles is set to have a similarly transformational impact on national security. The convergence of new technologies in the fields of Big Data & Advanced Analytics, AI, Machine Learning and distributed computing promises to give those that exploit them a significant competitive edge. Currently, the availability of data is less of a constraint than the accessibility of analytical tools powerful enough to let humans realise its true value. That’s about to change.

What is a Synthetic Environment?
Improbable’s mission is to enable Synthetic Environments that are detailed, highly realistic representations of the real world that integrate data, artificial intelligence technologies, machine learning systems and constructive models such as digital twins and simulators. They’re engineered to help users explore real-world scenarios by letting them visualise, interact with and experiment on rich simulations.

Synthetic Environments are virtual representations of the real world, composed of interconnected simulation and analytical technologies that, when combined, form an ecosystem far greater than the sum of its parts.
Thanks to human-in-the-loop simulation coupled with detailed 3D rendering, users can even interact directly with the environment, enabling immersive training and rehearsal. Since hundreds or even thousands of users can experience this shared world at the same time, Synthetic Environments also promote shared understanding, improved consensus-forming, optimised decision-making and interoperation within and between organisations. Individuals at every level and across organisations will be able to improve their uniquely human attributes: using their imagination, honing their intuition and building on their experience. By running many simulations, many times, users can understand and explore the potential cascading consequences of their plans and allow critical decisions to be made with growing precision and confidence. Leaders can also employ their Synthetic Environments as a command and control tool that can help them manage the execution of those same plans that others have been developing and rehearsing. All of this builds resilience through increased preparedness, better planning and more extensive training.
Figure 1 The data revolution means that vast volumes of data are newly accessible. Those who can best interpret and exploit it will have a significant competitive advantage.
User interfaces are carefully tailored, so how users experience a Synthetic Environment depends on what they need to achieve with it. For example, users can experience it as a real-time geospatial dashboard that brings together abstract systems like human geography with physical terrain. Alternatively, they can use the Synthetic Environment as an in-silico laboratory for developing, testing, refining and rehearsing plans. Others might need to experience the Environment as a training and rehearsal space, with computer-aided wargaming at the strategic and operational levels, and fully immersive, realistically rendered first-person experiences at the tactical level.
A platform-enabled ecosystem
Improbable’s open platform for Synthetic Environments can integrate a range of models, simulations and AI/ML technologies from different sources across government, industry and academia. This open-platform approach to unifying and managing these assets inoculates against vendor lock-in and also ensures that the Environments can draw on the most reliable, relevant and up-to-date content to grow and evolve.
Figure 2 Synthetic Environments are presented to users according to their needs. An analyst may see data and dashboards, but on-the-ground personnel may experience their environment as a richly detailed virtual world in which they can train and rehearse safely, frequently, collectively and cost-effectively.
Figure 3 Integration of analytical tools and data into a Synthetic Environment allows for integration of processes across an organisation
The platform’s open standards do not, however, imply ‘open access’. The platform and deployment model have been designed with robust security at their core, and are configured and deployed to meet the security and access requirements of our customers, whether that involves a secure cloud, on-premises or bespoke solution.
Figure 4 Synthetic Environments comprise an ecosystem of technological assets that let people explore options safely and cost-effectively in a virtual world before taking action in the real one.
Development of this ecosystem of assets is, by its very nature, a multi-disciplinary and multi-stakeholder enterprise. It means bringing together the best of the UK’s capabilities and expertise from across industry, academia and government. Over time, this ecosystem will become richer, and the ability to substitute and upgrade parts of it will ensure it remains a cutting-edge tool that develops into a truly sovereign capability, fundamentally transforming the way government approaches decision making. The platform simplifies management and maintenance across the ecosystem, abstracting the complications of system integration and deployment, and delivering a broad range of technologies to users as a service. For the user, it simplifies access, ensures a coherent user experience, and thereby encourages greater collaboration across and between organisations.
AI for analysis in Synthetic Environments
Machine Learning has, in recent years, become increasingly popular, both benefiting from and fuelling an explosion in data-collection capabilities across a wide range of sectors. In this family of analytical techniques, highly flexible models are fitted to vast volumes of data, leveraging advances in computation and statistical algorithms. They are excellent at finding patterns in data, flagging trends and anomalies for further investigation, and helping us better understand the world as it really is. Powerful though they may be, however, Machine Learning technologies can only work on the data they are given. In isolation and without broader situational context, it can be hard to trust ML and to identify blind spots in the analysis it provides. Interconnecting ML technologies into a broader (synthetic) environment helps mitigate this weakness by maximising the user’s contextual understanding, thereby increasing the scope and quality of the insights.

For example, consider collaborative work between Adarga and Improbable as part of a programme to build a Synthetic Environment application to support military operational-level planning. Adarga’s natural language processing (NLP) technology (a type of Machine Learning) enriches the application’s common operating and intelligence picture (COP/CIP) by combing through vast amounts of text data such as news articles and intelligence reports. Since the tool is integrated with the Synthetic Environment, this information is presented to the user in the broader context of many other data sources. In this way, something as abstract as a population’s political sentiment might be overlaid as a layer on the geospatial dashboard alongside terrain, allied and enemy forces, critical national infrastructure and various other data layers. At a glance, an analyst or decision maker can see the broadest yet most comprehensive possible picture. Using this information, plans can be drawn up, and potential interventions can be simulated and thus evaluated. The candidate plan can then be passed to the relevant operatives, who can rehearse and perfect its execution in a 3D-rendered, fully immersive simulation of the scenario.

To achieve the best possible performance, Machine Learning techniques such as NLP need to be carefully tuned to the data available and the problem at hand. For example, because the source material is so different, analysing the content and structure of an intelligence report is a distinct challenge from extracting meaning from a tweet. This tuning requires a certain amount of domain expertise, focus and specialisation. Adarga’s technology, for example, is architected and calibrated to be particularly effective at working with the most relevant types of data, such as intelligence reports, and therefore at extracting the most useful information for defence and national security applications. Ultimately, this means a more complete and accurate understanding of the current state of the world for the user – but it also highlights the importance of a diverse ecosystem of technologies developed by expert providers who specialise in their respective domains.
Artificial intelligence for automation in Synthetic Environments
Artificial Intelligence technologies can also perform an important automation role in Synthetic Environments. A key application is driving realistic behaviour in simulated actors. Without these, Synthetic Environments would be largely empty and unrealistic places, populated only by as many real-world users as are participating. If you want to render large-scale training environments with crowds of civilians and large numbers of personnel on both red and blue teams, these ‘people’ need to be computer controlled. Without AI, you would have to find (and pay) several thousand real people to control individual avatars. Not only would this be impractical to the point of impossibility, as well as prohibitively expensive, it would also rule out faster-than-real-time simulation.
Depending on the actor being represented, these so-called non-player character (NPC) AIs will have to perform very different tasks subject to very different constraints. For example, senior actors in a given decision-making hierarchy will make strategic decisions with significant scope and influence, even though such actors may be few in number. In comparison, civilians have less individual influence but are far more numerous. In reality, every human is a complex thinking and feeling entity, but to ensure simulations run quickly it is necessary to select and tailor the AI technology to trade off sophistication against computational cost. As such, a range of AI tools is needed for the various different jobs. Thanks to the open development platform, these can be sourced cost-effectively from the broadest spectrum of sources, such as the commercial video-game industry, which is investing heavily and making huge advances in NPC AI development.
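To illustrate the sophistication-versus-cost trade-off, the sketch below implements a deliberately simple finite-state ‘civilian’ agent of the kind that can be stepped cheaply in very large numbers; higher-influence actors would warrant richer models. This is an invented toy, not a description of Improbable’s implementation.

```python
# Toy finite-state-machine civilian NPC: cheap enough to run in the thousands.
# Illustrative only; a real Synthetic Environment would use far richer models
# for high-influence actors and tune fidelity per actor type.
import random
from dataclasses import dataclass

@dataclass
class Civilian:
    state: str = "routine"   # routine -> alarmed -> fleeing

    def step(self, danger_nearby: bool) -> str:
        """Advance one simulation tick and return the action taken."""
        if self.state == "routine":
            if danger_nearby:
                self.state = "alarmed"
            return "go_about_daily_business"
        if self.state == "alarmed":
            # Cheap stochastic transition instead of costly deliberation.
            self.state = "fleeing" if random.random() < 0.7 else "routine"
            return "seek_information"
        return "move_away_from_danger"      # state == "fleeing"

# Thousands of such agents can be stepped per tick at negligible cost.
crowd = [Civilian() for _ in range(10_000)]
actions = [agent.step(danger_nearby=random.random() < 0.01) for agent in crowd]
print(actions[:5])
```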
Transforming how organisations plan, work and cooperate
We have seen how Machine Learning can provide a critical ‘sifting’ function in a Synthetic Environment, interpreting a continuous deluge of data to give users a richly detailed, accurate and up-to-date picture of the world. The application of Machine Learning to Synthetic Environments can go much deeper, however, synthesising data captured through the use of the environment itself to enrich understanding across an organisation.

One of the major benefits of computer-supported planning and training is that outputs and artefacts such as plans, policy options and records of decision-making are stored digitally. Compared to analogue archives, digital ones are easier to search and are also accessible from anywhere, at any time. Furthermore, these artefacts can be dynamic, updating automatically as variables change and new data emerges.

For example, several contingency plans may have been developed and stored on the system that pertain to potential disaster relief scenarios in a particular location. As time passes, if the system is connected to the relevant data streams, the information in the plans can be continuously updated. Even relatively simple information, like the organisational structure of the local government and contact details for key figures, can be kept up to date automatically, reducing friction in a crisis scenario.

Perhaps more powerfully, if the relevant data streams are connected to the platform, then Machine Learning technologies could be set to raise a flag when measurable underlying assumptions of a stored, digitised plan no longer hold true. The disaster relief plans from the above example may be contingent upon logistical assumptions that certain resources are available in certain places in certain quantities. A separate crisis – for example Covid-19 – may have seen this material moved or used up, in which case alert systems would notify the responsible owners of the contingency plans that a vulnerability has opened up against this set of pre-planned disaster-relief scenarios.

Synthetic training – as an augmentation of live training – facilitates similar benefits, enabling a much more complete understanding of the training level, skill sets and proficiency of operational personnel. As the Synthetic Environment is used more and more, all this data will accumulate. Thanks to Machine Learning, it is possible to extract the maximum value from it: training data can be fed back into the planning process, leading to better-trained personnel who can respond faster based on improved plans, with improved outcomes.

The common environment for plans and training can also be leveraged proactively. High-impact and/or high-likelihood scenarios (whether civil or military) can be identified and earmarked, and synthetic training exercises developed and drilled with the relevant services. This would provide a step-change towards perpetual readiness.

Finally, Machine Learning technologies can aid the transition to computer-based planning. Text-based plans that were not originally developed digitally could also be uploaded into the Synthetic Environment. NLP technologies, such as those being developed by Adarga, could help collate and classify the relevant information, converting static, hand-written or printed plans into dynamic digital artefacts.
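As a purely hypothetical illustration of that last point, the sketch below uses the open-source spaCy library to pull organisations, people and contact details out of a fragment of plain-text plan. It is not Adarga’s technology, and the plan text and field names are invented; it simply indicates the kind of extraction involved in turning a static document into a structured, updatable artefact.

```python
# Hypothetical sketch: extracting structure from a legacy, text-only plan.
# Uses the open-source spaCy library; this is NOT Adarga's technology,
# only an illustration of the kind of extraction described above.
import re
import spacy

nlp = spacy.load("en_core_web_sm")   # small general-purpose English model

plan_text = (
    "Flood response is coordinated by the County Resilience Forum. "
    "Primary contact: Jane Doe, Emergency Planning Officer, jane.doe@example.gov. "
    "Sandbag stocks are held by Acme Logistics at the Riverside depot."
)

doc = nlp(plan_text)
record = {
    "organisations": [ent.text for ent in doc.ents if ent.label_ == "ORG"],
    "people": [ent.text for ent in doc.ents if ent.label_ == "PERSON"],
    "emails": re.findall(r"[\w.+-]+@[\w-]+\.[\w.-]+", plan_text),
}
print(record)   # a first step towards a dynamic, queryable digital artefact
```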
Towards cognitive advantage and perpetual readiness
Governments and defence departments are increasingly challenged by fast-moving, evolving threats, and require transformational changes to keep pace. Stovepiped synthetics and paper-based analysis must be replaced with collaborative tools that use authoritative content and support continuous improvement.

Scalable, adaptable and ever-evolving Synthetic Environments, like that being built on Improbable’s platform for the UK Ministry of Defence, are designed to bring together planning, training and rehearsal across and between organisations, be they military forces or government departments. The benefit of this pan-organisational ethos is that different roles and functions can work collaboratively and concurrently on a project across government. This enables a much more agile and integrated response, both to seize opportunities and to respond to challenges. Use of such technology ensures that knowledge and the provenance of decisions are recorded, and that institutional understanding is both retained and made more easily accessible. This, coupled with the opportunities for cheaper and more accessible synthetic training, promises to move the UK towards a state of perpetual readiness to achieve its global and domestic objectives. Finally, the platform-enabled ecosystem approach reduces the barrier to developing and trialling new technologies, facilitating the development of a truly sovereign capability that draws on the very best national governmental, academic and industrial institutions. Such ecosystems can enhance or enable modernisation and transformation initiatives, and help to advance national data, science and technology agendas. The ability of the ecosystem to evolve and grow ensures it can be sustained as a cutting-edge capability which supports and maintains the UK’s safety and prosperity in the information age.
Good And Bad Events: Combining Network-Based Event Detection With Sentiment Analysis
Iraklis Moutidis, PhD candidate, University of Exeter
Professor Hywel T. P. Williams, University of Exeter
The volume and velocity of online news has increased dramatically in recent years. For news analysts in various domains (e.g. politics, finance, technology) this creates a need for automated methods to detect and summarize news events in real time, since doing so with human effort alone is rapidly becoming intractable. News is now consumed online directly from news platforms and aggregators, but also socially via social media, creating a complex media ecosystem. Automated methods can assist human analysts by providing alerts to emerging news events, generating brief descriptions of the detected event, extracting the sentiment of the crowd discussing it, and directing the analyst towards relevant documents. Methods that can be applied to different sources (e.g. online news and social media) are especially useful.
Adarga is developing, in collaboration with the University of Exeter, a network-based event detection system (NED) for heterogeneous news streams, such as articles, forum posts or tweets [1, 2]. The system detects emerging news and topics by utilising natural language processing methodologies such as named entity recognition (NER) and entity disambiguation, as well as time series analysis and social network analysis techniques. The main assumption is that the occurrence of a trending topic or news event can be detected through changes in how frequently named entities appear in the text and in which other entities they co-occur with. A significant entity is thus not only one that appears frequently in the incoming documents, but also one that co-appears with other significant entities.
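The core of that idea can be sketched in a few lines. The example below builds an entity co-occurrence network with the networkx library, using spaCy purely as a readily available stand-in for the named entity recognisers that NED actually uses (described below), and simple co-occurrence counts in place of the ‘gravity’-based edge weights. The input documents are invented.

```python
# Illustrative sketch: a network whose nodes are named entities and whose
# edges record co-occurrence within the same document. spaCy stands in for
# the NER classifiers NED actually uses, and plain co-occurrence counts stand
# in for the 'gravity'-based edge weights described in the article.
import itertools
import networkx as nx
import spacy

nlp = spacy.load("en_core_web_sm")

documents = [  # hypothetical input documents
    "Alice Smith met executives from Acme Corp in London to discuss trade.",
    "Acme Corp and Globex announced a joint venture in Paris.",
]

graph = nx.Graph()
for text in documents:
    entities = {ent.text for ent in nlp(text).ents
                if ent.label_ in {"PERSON", "ORG", "GPE"}}
    for a, b in itertools.combinations(sorted(entities), 2):
        if graph.has_edge(a, b):
            graph[a][b]["weight"] += 1   # count repeated co-occurrences
        else:
            graph.add_edge(a, b, weight=1)

print(graph.edges(data=True))
```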
Depending on the type of document, the system uses two different classifiers for named entity recognition. For news articles NED utilizes the Stanford NER classifier from Finkel et al. [3]. This classifier was trained on the CoNLL data set, which consists of Reuters newswire articles. For entity detection in tweets, the system uses the classifier of Ritter et al. [4]. Tweets differ from news articles in that they commonly contain different linguistic features (e.g. typos, jargon and lower case letters) and are shorter in length, so detecting entities with classifiers that were not trained on this kind of text can lack precision. According to the authors of Ritter et al. [4], this classifier produces significantly better results than the Stanford NER system for detecting named entities in tweets.

NED detects named entities in a stream of news documents and creates a series of networks where the nodes represent named entities and the edges represent the co-appearance of two entities in the document text. The edge weights are calculated using the ‘gravity’ that each entity has in the text. To achieve that, we assign a significance value to each named entity in a given document based on how frequently the entity appears in the text and the total number of entities the document contains. The time intervals for which networks are created can vary based on the medium being monitored (newspaper articles, Twitter) and the nature of the event. For example, when dealing with newspaper articles a network is typically created for each day, but if dealing with Twitter the interval might be reduced to a few hours.

Time series of the weighted degree of each node (entity) are created by analysing the network sequence. The objective of the time series analysis is the detection of significant changes caused by upcoming events. These changes are detected using a sliding time window. Initially, trends are removed from the time series by calculating their first differences; then the mean and standard deviation over a sliding window of X blocks are calculated. A ‘peaking entity’ is then identified as an entity node whose weighted degree is more than a threshold of Y standard deviations away from the rolling mean. A lower standard deviation threshold enables the system to detect more events, including less significant ones; a higher threshold returns fewer but more important events.

The NED system consists of three main stages: event detection, event characterisation or summarisation, and event sentiment extraction. Emerging topics and news are spotted by detecting peaks in the named-entity degree time series. For each window the peaking entities are gathered along with the documents containing them. Depending on factors like the source of the incoming documents and the time window size, the number of detected events may vary. For that reason, the system needs to distinguish the individual events in order to acquire useful information. Filtering out documents that do not contain peaking entities significantly prunes the amount of data the system has to process and improves the quality of the results by removing noisy input.

To identify individual events, a second generation of knowledge graphs is generated using only the filtered documents and incorporating nouns and noun phrases as well as named entities. The main idea here is that nouns and noun phrases are parts of speech that can effectively provide a comprehensive description of the event. For the task of detecting noun phrases in the text, NED uses the ToPMine algorithm from El-Kishky et al. [5]. The system then applies the Louvain community detection algorithm from Blondel et al. [6] to the knowledge graph created for each event period. Each detected community of nouns, noun phrases and named entities is considered a candidate event. For each community a bag-of-words summary is created and sorted using the weighted degree of each node, to generate an easily interpreted synopsis of the event.

Finally, in the case of tweets, sentiment classification is applied to the documents of each detected event. All the tweets that contain at least one of the entities or noun phrases given in the event summary are gathered into a collection. Sentiment analysis is then applied to all tweets in the collection to mine the overall sentiment distribution of the user crowd that posted about the event. This augments the summarisation with a simple affective judgement of the event, effectively determining whether the event is seen as ‘good’ or ‘bad’ by the authors contributing to the news stream. For this task, the VADER model of Hutto and Gilbert [7] was used, since it addresses many of the challenges of sentiment analysis in documents coming from microblogging platforms such as Twitter, Reddit or Facebook. Key challenges arise from the small size of the text, widespread use of abbreviations, informal language and slang, as well as typographic mistakes and sarcasm. Manual comparison of the VADER model with several other algorithms on a sample of tweets showed VADER to give the best performance.

Topic detection and tracking combined with sentiment analysis can benefit human analysts in many domains (e.g. political, financial), who can automatically detect emerging events in real time, understand what an event is about and observe its impact on public sentiment. Companies might use such methods to monitor the popularity of their products and understand consumer sentiment around their brands. Academic users might include social scientists of many kinds, including those who study politics, media effects and information diffusion. This methodology demonstrates one approach to the broad challenge of distilling and interpreting the rich information that is increasingly available through high-volume, heterogeneous online news streams. In future work, the method might be improved by resolving sentiment down to the article level (‘good’ or ‘bad’ news reports) or to the user level (‘happy’ or ‘sad’ users or authors). Another improvement would be to identify a greater diversity of sentiment classes, beyond the simplistic positive/neutral/negative classification used here; identification of other emotions, for example joy, trust, anger or anticipation, might enable this approach to provide greater insight into public mood around unfolding news events.
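To make the peak-detection and sentiment steps described above concrete, the sketch below applies the same recipe (first differences, a rolling mean and standard deviation, and a threshold) to an invented weighted-degree series, and scores example tweets with the open-source VADER implementation. The window size, threshold and data are placeholders; this illustrates the recipe, not the NED system itself.

```python
# Minimal sketch of the peak-detection and sentiment steps described above.
# Window size, threshold and input data are hypothetical placeholders;
# this illustrates the general recipe, not the NED implementation itself.
import pandas as pd
from vaderSentiment.vaderSentiment import SentimentIntensityAnalyzer

# Hypothetical daily weighted-degree series for one entity.
degree = pd.Series([3, 4, 3, 5, 4, 4, 21, 25, 6, 5], dtype=float)

WINDOW = 5       # X blocks in the sliding window
THRESHOLD = 2.0  # Y standard deviations from the rolling mean

diffs = degree.diff()                         # remove trend via first differences
mean = diffs.rolling(WINDOW).mean().shift(1)  # statistics from the preceding window only
std = diffs.rolling(WINDOW).std().shift(1)
peaks = diffs > (mean + THRESHOLD * std)      # 'peaking' time steps for this entity
print(degree[peaks])

# Sentiment of documents mentioning a peaking entity, scored with VADER.
analyzer = SentimentIntensityAnalyzer()
tweets = ["Great news about the announcement!", "This is a disaster for everyone."]
for text in tweets:
    print(text, analyzer.polarity_scores(text)["compound"])
```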
References
1. Moutidis I, Williams HT (2019) Utilizing complex networks for event detection in heterogeneous high-volume news streams. In: International conference on complex networks and their applications. Springer, pp 659–672
2. Moutidis I, Williams HT (2020) Good and bad events: combining network-based event detection with sentiment analysis. Social Network Analysis and Mining 10(1):1–2
3. Finkel JR, Grenager T, Manning C (2005) Incorporating non-local information into information extraction systems by Gibbs sampling. In: Proceedings of the 43rd annual meeting of the Association for Computational Linguistics (ACL ’05), pp 363–370. Association for Computational Linguistics, USA. https://doi.org/10.3115/1219840.1219885
4. Ritter A, Clark S, Mausam, Etzioni O (2011) Named entity recognition in tweets: an experimental study. In: Proceedings of the 2011 conference on empirical methods in natural language processing. Association for Computational Linguistics, Edinburgh, Scotland, UK, pp 1524–1534. https://www.aclweb.org/anthology/D11-1141
5. El-Kishky A, Song Y, Wang C, Voss C, Han J (2014) Scalable topical phrase mining from text corpora. arXiv:1406.6312
6. Blondel VD, Guillaume J-L, Lambiotte R, Lefebvre E (2008) Fast unfolding of communities in large networks. J Stat Mech Theory Exp 2008(10):10008
7. Hutto CJ, Gilbert E (2014) VADER: a parsimonious rule-based model for sentiment analysis of social media text. In: Eighth international AAAI conference on weblogs and social media
The world of technology is fast-paced, nuanced and continuously evolving. Keeping up to date with what’s new, what’s changed and how these may affect you, or your business, is of paramount importance to businesses working in the technology sector. As lawyers we help our clients to do just that. Our legal expertise, coupled with our strong understanding of the tech sector, ensures that we offer well-informed and commercial advice. Find out more www.howardkennedy.com
Speakers
Rob Bassett Cross MC
Founder and CEO, Adarga
Rob is the CEO and founder of Adarga. He is a former British Army officer, who was widely respected as one of the leading military officers of his generation and fulfilled some of the most demanding and sensitive appointments during his service as a commander on combat operations in the Middle East, Central Asia, Africa, and elsewhere. Rob was awarded the Military Cross for leadership on counter terrorism operations in Iraq in 2006. Before leaving the military, Rob led future technology development and procurement for a number of ground-breaking projects which included the first deployment of software engineers to a live war zone and the development of an intelligence analysis tool. After leaving the Army, Rob joined J.P. Morgan as an investment banker in the corporate finance division where he advised international corporate customers from a number of sectors and was involved in over $30bn worth of transactions across the spectrum of corporate finance advisory, mergers and acquisitions, and offerings roles. Specific coverage responsibility was for the natural resources and defence and aerospace sectors. Rob founded Adarga in 2016 to apply cutting-edge AI analytics technology to solve complex, real-world problems in defence, legal and other sectors.

Air Marshal Sir Chris Harper KBE MA FRAeS CMgr CCMI MIoD RAF
Chair of the Adarga Advisory Group
Air Marshal Sir Chris Harper joined the Royal Air Force in 1976 as a single-seat fast jet pilot. He commanded at all levels of the RAF and his last fulltime appointment saw him serving as the Director General of the NATO International Military Staff at HQ NATO in Brussels. Sir Chris now runs a small company, CH4C Global Ltd, which provides bespoke consultancy services to organisations working in international defence and security, aviation and aerospace. Sir Chris is a Fellow of the Royal Aeronautical Society and is a member of the Society’s governing Council. He is also a Trustee of the Air League, a Chartered Manager and Companion of the Chartered Management Institute, a member of the Institute of Directors, and a Non-Resident Senior Fellow of the Atlantic Council. He has led an in-depth study of Baltic air defence for the Estonian MoD and recently co-authored an Atlantic Council study into Baltic ISR capability requirements. He also speaks at key events such as the Munich and Berlin Security conferences. Sir Chris still serves in the RAF as the Honorary Air Commodore for No 2620 Squadron, Royal Auxiliary Air Force. Sir Chris has been flying since 1974. He has over 5000 flying hours, predominantly on aircraft such as the Jaguar, F-18 and Typhoon. Now he owns and operates a Vans RV-8 (G-NRFK) in which he and his wife (Jan) tour extensively.
Robert O. Work
Vice Chair, US National Security Commission on Artificial Intelligence
Mr. Robert Work was the thirty-second Deputy Secretary of Defense, serving alongside three Secretaries of Defense from May 2014 to July 2017. He is currently the President and Owner of TeamWork, LLC, which specializes in defense strategy and policy, programming and budgeting, military-technical competitions, revolutions in war, and the future of war. In 2001, Mr. Work retired as a Colonel in the United States Marine Corps after spending 27 years on active duty. He subsequently was a Senior Fellow and Vice President and Director of Studies at the Center for Strategic and Budgetary Assessments. In November 2008, he was asked to join the Obama Defense transition team, serving as the primary analyst for the Department of the Navy. In January 2009, he was asked to join the Obama administration as the 31st Under Secretary of the Navy, and he was confirmed in that role in May 2009. Mr. Work stepped down as the Under Secretary in March 2013 to become the Chief Executive Officer for the Center for a New American Security (CNAS). He remained in that position until he assumed the role of Deputy Secretary of Defense in May 2014. Mr. Work is now back at CNAS as Senior Counselor for Defense. He is also Senior Counselor at Telemus Group, LLC, a strategic consulting firm specializing in defense issues; a Principal at WestExec Advisors; Senior Fellow at the Johns Hopkins University Applied Physics Laboratory; and a Distinguished Visiting Fellow at the MITRE Corporation. He is on several boards of directors, including the Raytheon Technologies Corporation, and on the boards of advisors for several small technology firms. He also serves as the Vice Chairman of the Congressionally mandated National Security Commission on Artificial Intelligence.

Professor Sir David Omand GCB
Visiting Professor in War Studies, King’s College London
David is currently Visiting Professor in War Studies, King’s College London. Previous posts included Security and Intelligence Coordinator in the Cabinet Office, Permanent Secretary of the Home Office, Director GCHQ, and Deputy Under-Secretary of State for Policy in MOD. He served for 7 years on the JIC. He is the Senior Independent Director of Babcock International. He is the author of Securing the State (Hurst), Principled Spying: The Ethics of Secret Intelligence (OUP) and How Spies Think: 10 Lessons from Intelligence (Penguin Viking).
Our panellists
Dame Wendy Hall, DBE, FRS, FREng
Regius Professor of Computer Science and Executive Director of the Web Science Institute, University of Southampton
Dame Wendy Hall, DBE, FRS, FREng is Regius Professor of Computer Science, Associate Vice President (International Engagement) and is an Executive Director of the Web Science Institute at the University of Southampton. She became a Dame Commander of the British Empire in the 2009 UK New Year’s Honours list, and is a Fellow of the Royal Society and the Royal Academy of Engineering. Dame Wendy was co-Chair of the UK government’s AI Review, which was published in October 2017, and is the first Skills Champion for AI in the UK. In May 2020, she was appointed as Chair of the Ada Lovelace Institute.

John P. Cunningham
Professor of AI, Columbia University and Adarga Scientific Advisory Group Chair
John P. Cunningham, Ph.D., is a leader in machine learning / artificial intelligence and its application to industry. In his academic capacity, he is a professor at Columbia University in the Department of Statistics and Data Science Institute, and has received multiple major awards including the Sloan Fellowship and McKnight Fellowship. In industry, he has worked in, founded, exited, and consulted with multiple companies and funds in the data and AI space. His education includes an undergraduate degree from Dartmouth College, master’s and Ph.D. degrees from Stanford University, and a fellowship at the University of Cambridge.

Dr David Talby
Chief Technical Officer, John Snow Labs
David Talby is a chief technology officer at John Snow Labs, helping companies apply artificial intelligence to solve real-world problems in healthcare and life science. David is the creator of Spark NLP – the world’s most widely used natural language processing library in the enterprise. He has extensive experience building and running web-scale software platforms and teams – in start-ups, for Microsoft’s Bing in the US and Europe, and to scale Amazon’s financial systems in Seattle and the UK. David holds a PhD in computer science and master’s degrees in both computer science and business administration.

Dr Deborah Fish OBE
Scientific Advisor
Dr Deborah Fish is a Fellow at the Defence Science and Technology Laboratory, specialising in innovative modelling and the application of artificial intelligence to help Defence make better evidence-based decisions. Recent projects (many commissioned through industry and academia) include a review of the application of AI to wargames, AI for ship air defence and other aspects of maritime survivability, and AI for anti-access area denial. She believes that AI has significant potential to augment human input to improve tactics, evaluate new concepts and capabilities, and provide an intelligent adversary against which to train or plan operations. Deb’s career in Dstl has included operational deployments as a scientific advisor to UK forces in Iraq, Afghanistan and the Gulf, and she was made an OBE in 2011 for her work in Afghanistan. Prior to joining Dstl, Deb spent time in consultancy, and as an academic, following a PhD on the measurement and modelling of Arctic ozone loss.

Dr Colin Kelly
NLP Team Lead, Adarga
Colin is the NLP team lead at Adarga and will be joining the panel discussion. Colin is passionate about using natural language processing and artificial intelligence to enable better decision-making and transform knowledge-based organisations. Prior to joining Adarga, Colin worked as an analytics consultant at IBM and PA Consulting, shaping and deploying analytics solutions for financial services, energy and public sector clients. He studied mathematics and computer science at the University of Oxford and holds a Master’s and PhD in natural language processing from the University of Cambridge.
Ever wondered ‘What if?’
Today’s technology companies are fast-growing, agile and flexible. They need support teams to match. In the tech sector, new products, fast-moving markets and intense competition mean that businesses must consistently deliver big ideas, lean processes and smart solutions if they are to keep up. Our dedicated tech accountancy and advisory team share the same approach. We can help you with financing, how to protect and exploit your intellectual property and identify any Research & Development Tax reliefs you may be entitled to. Azets is an international accounting, tax, audit, advisory and business services company that delivers a personal experience, both digitally and at your door.
For further information please send an email to hello@azets.co.uk or speak to your accountant at Azets.
Follow us
azets.co.uk Azets is a trading name of Azets Holdings Limited. Registered in England & Wales. Registered No. 06365189. VAT Registration No. 320 5454 37. Registered office: Churchill House | 59 Lichfield Street | Walsall | West Midlands | WS4 2BX. Regulated by the Institute of Chartered Accountants in England & Wales for a range of investment business activities. The term ‘Board Director’ is used to refer to a statutory director and principal of the company as registered at Companies House. Any other designations that include the term ‘Partner’ or ‘Director’ or ‘Licensed Insolvency Practitioner’ are not registered statutory directors or principals of the registered company.

Synthetic Environments enable the combination of Artificial Intelligence and Machine Learning technologies in new, complementary, and powerful ways. This leads to better outcomes, because users can test and rehearse decisions in a virtual world before taking action in the real one.
Find out how Improbable and our partners are helping to transform planning and preparedness at improbable.io/defence
Delivering legal certainty in a changing world. Technology and data are ever more critical to business. We enable clients to navigate an evolving legal and regulatory landscape, offering solutions for competitive advantage. Our Tech Legal Outlook Mid-Year Update explores seven key trends likely to shape the technology sector in the second half of 2020 and considers the legal implications for businesses.
linklaters.com/technology
We are Adarga
London
Embassy Tea House, 195-205 Union Street, London SE1 0LN
Bristol
1 Victoria Street, Bristol BS1 6AA
hello@adarga.ai
+44 (0)800 8611 087
www.adarga.ai