
Improving NATO policies with AI

Simon Michell talks to Dr Marcello Piraino, Principal Engineer at NATO Headquarters’ Consultation, Command and Control (C3) Staff (NHQC3S), to find out how applying artificial intelligence (AI) techniques can help improve the quality of the Alliance’s C3 policy documents

In his capacity as Principal Engineer at NHQC3S, Marcello Piraino is responsible for C3 policy development oversight on behalf of the C3 Board, and his mission is to ensure that the policy documents and related implementation directives are coherent and fit for purpose. Until recently, the C3 Board had more than 2,000 policy documents addressing the many nuances of the IT domain, ranging from telecommunications issues to software development and lifecycle management. Not surprisingly, there were instances when a policy on one subject would be contradicted or superseded by a policy on another. Among the numerous reasons for this was that many of those policy documents were developed at different times, by different subject-matter experts, and in response to different issues. Unfortunately, the lack of sufficient coherence control led to a very low level of policy compliance throughout NATO.

ADDRESSING THE CHALLENGE

It was only by implementing a rigorous programme to reduce and align these policies that the C3 Board began to get on top of this challenge. “I was put in charge of a project to rationalize them,” explains Dr Piraino. “This effort took several iterations, but eventually we managed to reduce the 2,000 documents to a few hundred, then a few dozen, and now we have one Alliance C3 Policy document, approved by the North Atlantic Council, containing 12 policy annexes and about 10 implementation directives, with more under review, which provide sufficient coverage of our current needs.” Having achieved this feat, they needed to come up with a plan to ensure this task could be repeated in the future, as more policy documents and implementation directives were developed, or existing ones were amended.

NATURAL LANGUAGE PROCESSING

Dr Piraino discussed this requirement with the NCI Agency’s Head of Data Science and Innovation, Dr Michael Street, who suggested that a specialism within AI known as natural language processing (NLP), which leans heavily on machine learning techniques, might be able to help the C3 Board achieve their aim. Luckily, Dr Piraino already had a good knowledge of AI techniques, having completed an AI project as part of his PhD. That facilitated the acceptance of the proposed approach and broader agreement by the nations. So, together they embarked on a project using the C3 Regulatory Framework, which embodies the Alliance C3 Strategy, the Alliance C3 Policy, including the 12 policy documents as annexes, and the 10 implementation directives that the C3 Board had synthesized out of the original 2,000 documents.

PREPARING THE DOCUMENTS

Much work had to be done to prepare the documents before they could be fed through any algorithms. “In order to make any document readable by artificial intelligence tools, we had to do a lot of work ‘normalizing’ them. We had to take all the statements and put them into a very simple format – subject, verb, complement. We had to strip out the most common words, like ’NATO’ or ’C3’, and articles, prepositions and such like, to ensure the tool would not generate skewed results.”
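The normalization step described above can be sketched in a few lines of Python. This is a minimal illustration only: the stopword list and the sample statement are invented for the example, not NATO’s actual word list or policy text.

```python
import re

# Illustrative stopword list: domain-common words ('NATO', 'C3'), articles
# and prepositions are stripped so they cannot skew the similarity results.
# This list is an assumption for the example, not the team's actual list.
STOPWORDS = {"nato", "c3", "the", "a", "an", "of", "to", "in", "for", "and", "shall"}

def normalize(statement: str) -> list[str]:
    """Lower-case, tokenize, and drop stopwords from a policy statement."""
    tokens = re.findall(r"[a-z0-9]+", statement.lower())
    return [t for t in tokens if t not in STOPWORDS]

print(normalize("NATO shall manage the lifecycle of C3 software capabilities"))
# -> ['manage', 'lifecycle', 'software', 'capabilities']
```

Only the content-bearing words survive, which is what the subsequent vectorization step operates on.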

The NLP algorithm process works by assigning a numerical value to each word in relation to a certain ’concept’. The combination of values attributed to words in a sentence determines a vector. Thus, comparing two sentences’ vectors shows how similar they are. In practice, the smaller the angle between the two vectors, the more likely it is that the content of the two sentences is related. This is known as cosine similarity.
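Cosine similarity can be sketched with simple word-count vectors. The sentences and vocabulary below are invented for illustration; a production system would use learned word values rather than raw counts, but the angle comparison works the same way.

```python
import math
from collections import Counter

def vectorize(tokens: list[str], vocab: list[str]) -> list[float]:
    """Map a token list to a word-count vector over a shared vocabulary."""
    counts = Counter(tokens)
    return [float(counts[w]) for w in vocab]

def cosine_similarity(u: list[float], v: list[float]) -> float:
    """cos(theta) = (u . v) / (|u||v|); a smaller angle gives a value closer to 1."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

# Two normalized policy statements (invented examples)
s1 = ["software", "lifecycle", "management", "policy"]
s2 = ["software", "development", "policy"]
vocab = sorted(set(s1) | set(s2))
sim = cosine_similarity(vectorize(s1, vocab), vectorize(s2, vocab))
print(f"{sim:.2f}")  # prints 0.58: partial overlap, moderately related
```

Two statements sharing most of their content words would score close to 1; unrelated statements score near 0.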

A key challenge relates to NATO’s tendency to develop a specialist language of its own. That makes it difficult (if not impossible) to use existing ontologies (classifications) to analyse the policy text. In addition, as all decisions in NATO are reached by consensus, the policy language is filled with ’constructive’ ambiguities, introduced with the aim of facilitating political agreement.

As soon as Piraino’s team started putting documents through the AI tools, they got promising results. An important task they wanted to achieve was to use the NLP, particularly its ‘similarities’ function, to track the use of key concepts elaborated in the Alliance C3 Strategy – IT governance, IT architecture, interoperability and IT capabilities. In doing so, they hoped to be able to improve the relationship between the Alliance’s C3 Strategy and its supporting policy documents, and between different policies themselves. This supports the work of developing lower-level implementation directives. “The NLP approach led to the exact determination of these relationships, by using the cosine similarity function to calculate how close two policies were in accordance with the concept we were exploring. It also enabled deeper analysis.”

NEW CONNECTIONS

For example, they were able to identify a relationship between two policies not previously considered to be connected – Waveform Policy and Software Policy. “We thought that a policy that was focused on ‘waveform’, that is the way in which a radio wave is shaped, should not be very close in terms of content to a policy relating to software development. What we actually found out was nearly the opposite.” What had not been taken into account when developing the two policies prior to the AI experiments was that software is used to produce waveforms. Therefore, the software policy had more commonalities with the waveform policy than they had realized.

According to Dr Piraino, this is an excellent start, and it proves that AI tools can help NATO improve the way it develops and evolves policies. “We have been able to demonstrate that the set of documents analysed is very coherent, a validation that we were not able to achieve before. Our policy set is hierarchically organized, with the Alliance C3 Strategy on top, supported by policy documents that, in turn, are underpinned by implementation directives. This set is coherent from top to bottom – meaning that all policy is consistently derived by the strategy and that directives are coherently built on policy, and across, meaning that there are no overlaps or contradictions amongst policy documents.”

ISSUES TO OVERCOME

But this is far from the endgame. There are significant issues to overcome. The objective is to achieve a method that can predict whether a new policy statement or principle is already covered by, or contradicts, one of the existing policies. This assessment is normally carried out by human experts who are hard to find and train, even on a limited set of policy documents. Unlike humans, machines do not really understand meaning. They process individual words, but not the overall intent of a sentence. As a result, the algorithm cannot distinguish between two statements that express different meanings using a similar set of words.
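This failure mode is easy to reproduce with a bag-of-words model. In the sketch below (invented sentences, not the team’s actual tooling), two statements with opposite intent score as nearly identical because they differ by a single word:

```python
import math

def cosine(u: list[float], v: list[float]) -> float:
    """Cosine of the angle between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))

# Opposite meanings, near-identical word sets (invented example)
allow = "data sharing is permitted between domains".split()
deny = "data sharing is not permitted between domains".split()
vocab = sorted(set(allow) | set(deny))
vec = lambda tokens: [float(tokens.count(w)) for w in vocab]

sim = cosine(vec(allow), vec(deny))
print(f"{sim:.2f}")  # prints 0.93: very high similarity despite the contradiction
```

The single negation word barely moves the vector, so the angle stays small; capturing intent requires representations of meaning beyond word counts, which is exactly the gap Dr Piraino describes.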

“For the work we are doing, it is really important that the algorithm can understand the meaning in the context of the policy, as well as the instances of certain phrases or words,” says Dr Piraino. “At the moment, we are unable to represent this meaning with our tools. We are producing specialized ontologies to try and get close to something that resembles meaning, but it is not the same thing.”

INDUSTRY PERSPECTIVE

Dr Alexander Schellong

VP Global Business, Member of the Board – INFODAS

Why are cross domain solutions relevant for data as a strategic resource?

Within NATO and among NATO Member States, from the command centre to the tactical edge, large amounts of data are being produced in all domains. In fact, concepts such as Joint All-Domain Command and Control (JADC2) rely on data being moved quickly and securely.

The problem is that networks are segmented based on classification levels, as is the case with NATO’s internal networks. Some may even be completely isolated. Data is stuck in silos, and moving data manually is slow and cumbersome – hence the term ‘sneaker’ network. Data may also take suboptimal routes through multiple nodes before reaching its destination, straining bandwidth that is always scarce during missions. Moreover, any data within a particular security domain is considered to share its associated security classification – even if that is not the case. In addition, non-military data or compute sources remain outside the networks. In essence, sensors, effectors, C2 centres and other parties at different classification levels are not properly and securely connected.

The latest generation of cross domain solutions (CDS) provide the means to digitize classified domains and allow for new ways of data exchange. Direct connectivity between SECRET and RESTRICTED domains is now possible, with the protection of classified information and compliance with the highest information assurance standards remaining the guiding principle of CDS. While technology has evolved, the question is whether classified information protection regulations currently allow the latest CDS to connect domains of various classification levels.

What are cross domain solutions?

CDS are IT hardware security appliances. They are not firewalls or encryption devices, but they do complement them. Whereas a firewall protects you from unwanted elements entering from the outside world, a CDS is a forced protocol break that prevents sensitive data from exiting a network – a bit like a border control point.

They represent a niche segment within the data-centric security, data-loss prevention and network security sector. They are commonly used in defence and intelligence, but also address cybersecurity issues in critical infrastructure when connecting OT (operational technology) to IT systems. Development and evaluation, through national security authorities, of commercial-off-the-shelf (COTS) CDS products has only been taking place over the past 10 to 15 years. And, even within the cybersecurity community, many engineers are not yet aware of CDS.

There are three product categories within CDS: Data Diodes, Security Gateways/Guards/IEG, and Data Classification/Labelling solutions. Today, many people ask for data diodes when they want to solve their CDS use-case. That said, it is often the case that unidirectional data diode concepts are outdated, and bi-directional, high-assurance data-exchange and filtering capabilities are needed instead.

How do INFODAS cross domain solutions support NATO’s mission and the future role of data?

INFODAS is the only CDS vendor in the world with a product portfolio that holds triple approvals for NATO SECRET, EU SECRET and German SECRET. The Secure Domain Transition (SDoT) Product Family addresses every military use-case with structured or unstructured data, from data centres to military vehicles. Within the day-to-day activities of military personnel, it is even possible to create manual and/or automatic NATO STANAG 4774/8-compliant XML security labels that are cryptographically bound to data objects for classification (eg RESTRICTED, Releasable to, expiration date). The SDoT Security Gateway, our flagship product, will check these XML security labels for release decisions.

Our products solve many of today’s cross domain issues and, ultimately, result in time, resource and budget savings. The human factor can be removed from various processes. We also improve cybersecurity by removing reasons for military personnel to share sensitive data in an uncontrolled and less-secure fashion.

Classically, our CDS control data transfer and access in the following architecture: SECRET (HIGH) <> Gateway (SGW) <> Firewall (FW) <> RESTRICTED (LOW)

Alternatively, cascading security domain architectures can be mapped as per: SECRET (HIGH) <> SGW <> FW <> Confidential <> SGW <> RESTRICTED (LOW)

SDoT CDS were recently selected for NATO’s AWACS aircraft upgrade (led by Boeing) and the JEWCS programme (led by Leonardo). New data-sharing scenarios are frequently tested at CWIX, or recently in a multinational live-fire air defence exercise at NAMFI in Crete. The German Armed Forces have been using SDoT products for over 10 years on Navy ships, in Afghanistan (ISAF/RSP) and Mali (MINUSMA), always coming up with new requirements and use-cases to solve. The products are available as 19”, 1 U appliances or as smaller tactical versions for vehicles.

What is the advantage of a general NATO SECRET approval for CDS products?

The ability to directly connect a SECRET to a RESTRICTED domain is a game-changer for NATO and Member States. This opens up a range of digitization and inter- or intra-domain data-exchange opportunities that make a soldier’s life easier, and ultimately raises NATO’s military posture. Any evaluated and approved product removes the risk of military end-users failing to get their solution architecture accredited by the respective information security authority.

Moreover, unlike a commercial certification, which allows variations in the scope of the evaluation, NATO SECRET approvals are the result of multi-year, independent formal and technical hardware/software and supply-chain evaluations by NATO SECAN and national information security bodies, such as Germany’s Federal Office of Information Security. In addition, all of the NATO Member States confirmed the decision by NATO’s Military Committee to grant a general NATO SECRET approval. Maintaining the approval requires continuous work on the security of products and meeting new requirements of national information security authorities as they arise. Essentially, every new line of code or functionality must meet very strict security requirements.

[Figure: Connecting multiple domains to one HIGH (SECRET) domain with SDoT CDS – the SECRET (HIGH) domain is linked to RESTRICTED, Confidential, MISSION SECRET and Unclassified (LOW) domains]

Can you share any lessons learnt for those who plan to use CDS to share data across security domains?

Before going into the technical details, it’s important to understand your high-level use-case and the expected opportunities prior to changing the status quo. For example, would other data improve your mission or systems? Would third-parties, previously never considered as part of the data sharing, benefit from specific data?

You should also be fully or at least partially aware of all the data and its sensitivity level inside your domain. Is it a machine-to-machine scenario, such as a database replication, or does it involve the human end-user who wants to make HIGH to LOW queries? A lot of time and budget can be saved if you gather the following information: classification levels, data types/structure, data size, data transfer frequencies, protocols and throughput.

How will CDS evolve in the future to enable multidomain data sharing?

Due to the information security authority requirements and evaluation cycles, CDS tend to trail behind technology trends. We expect CDS will slowly evolve in areas such as:

– performance (transfer speed, latency);
– virtual CDS instances;
– data discovery and classification;
– deployment;
– easier filter/parser creation;
– voice/video streaming;
– multi-CDS-asset management in large networks from a central location;
– form factor/miniaturization.

However, some of those areas are highly contested from a cybersecurity perspective. If a security gateway is your last resort to protect classified data, wouldn’t central management increase the attack surface of a CDS and probability of classified data loss? Given the complexity and fast-paced change of today’s cloud environments, would national security agencies ever be comfortable with a virtual CDS? We don’t know, but we are certain that change is inevitable as technological advances and user demands build up the pressure to find a compromise between high-assurance and the strategic value of data and digitization.
