
Aviation Safety Analysis


MATTHIAS ATTARD | SUPERVISOR: Dr Joel Azzopardi | CO-SUPERVISOR: Mr Nicholas Mamo | COURSE: B.Sc. IT (Hons.) Artificial Intelligence

Aviation is currently a growth industry and, with an average of 400 aviation accidents occurring every month, it is imperative to determine the causes of these accidents in order to improve aviation safety. Figure 1 presents the number of aviation accidents occurring each year, according to data collected by the Aviation Safety Reporting System (ASRS).


In this research, data and text mining techniques were used to extract useful information from a large database of aviation accidents. The study drew largely on the ASRS database, which consists of over 210,000 aviation accident reports filed since 1988. The ASRS holds narratives containing a detailed account of what occurred in each accident, as well as categorical information about the flights in question, such as weather conditions and aircraft details. The study of such accident reports helps to identify the causes of these accidents, with a view to extracting similarities or differences amongst them in order to prevent fatalities and minimise the loss of resources.

This work demonstrates the use of data mining techniques to determine the primary problem of accident reports from the ASRS and predict the risk factor of these accidents. This is achieved through the use of machine learning classifiers such as naive Bayes and support-vector machines (SVMs), and deep learning techniques for both classification and prediction.

To identify the primary problem of accidents, the narratives were subjected to a preprocessing exercise, which involved reducing words to their stems, removing punctuation and stop words, and mapping synonyms and acronyms to umbrella terms. Machine learning classifiers were then used to predict the primary problem of an accident. This method achieved an accuracy of 60% on the test data with the use of SVM.
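
To make this pipeline concrete, the sketch below shows one way it could look in scikit-learn, with stemming and stop-word removal via NLTK followed by TF-IDF features and a linear SVM. The file and column names are hypothetical, and the synonym/acronym mapping is omitted; the study's exact parameters are not reproduced here.

```python
# Sketch of a narrative-classification pipeline: stemming, stop-word and
# punctuation removal, TF-IDF features and a linear SVM.
# Requires: nltk.download("stopwords"); file/column names are hypothetical.
import re

import pandas as pd
from nltk.corpus import stopwords
from nltk.stem import PorterStemmer
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

stemmer = PorterStemmer()
stop_words = set(stopwords.words("english"))

def preprocess(text: str) -> str:
    """Lower-case, strip punctuation, drop stop words and stem each token."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return " ".join(stemmer.stem(t) for t in tokens if t not in stop_words)

reports = pd.read_csv("asrs_reports.csv")      # hypothetical export of the ASRS data
X = reports["narrative"].map(preprocess)       # assumed column holding the narratives
y = reports["primary_problem"]                 # assumed label column

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)
model = make_pipeline(TfidfVectorizer(), LinearSVC())
model.fit(X_train, y_train)
print("Test accuracy:", model.score(X_test, y_test))
```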

For the task of predicting the risk factor of accidents, similar preprocessing steps were carried out on the synopses, which are brief summaries of the narratives. SVM once again proved to be the best-performing classifier, with a test accuracy of 61%. Structured data was also used to predict the risk factor of accidents: after encoding the data and labels, SVM achieved an accuracy of 66% on the test data.
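
The structured-data variant could follow a similar pattern, with the categorical flight fields one-hot encoded before being passed to an SVM. The column names below are illustrative assumptions, not the fields actually used in the study.

```python
# Sketch of risk-factor prediction from the structured (categorical) fields:
# one-hot encode the assumed columns and train a linear SVM.
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import OneHotEncoder
from sklearn.svm import LinearSVC

reports = pd.read_csv("asrs_reports.csv")                   # hypothetical file name
categorical = ["weather", "aircraft_type", "flight_phase"]  # assumed columns

encoder = ColumnTransformer(
    [("onehot", OneHotEncoder(handle_unknown="ignore"), categorical)]
)
model = make_pipeline(encoder, LinearSVC())

X_train, X_test, y_train, y_test = train_test_split(
    reports[categorical], reports["risk_factor"], test_size=0.2, random_state=0
)
model.fit(X_train, y_train)
print("Test accuracy:", model.score(X_test, y_test))
```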

The work achieved through the proposed system demonstrates that machines can reliably identify a flight’s primary problem, as well as high-risk situations in a flight.

Figure 1. Number of aviation accidents per year (as sourced from the ASRS database)

Implementations of the state-merging operator in DFA learning

MATTHEW JONATHAN AXISA | SUPERVISOR: Dr Kristian Guillaumier | CO-SUPERVISOR: Prof. John M. Abela | COURSE: B.Sc. IT (Hons.) Artificial Intelligence

DFA learning is the process of identifying a minimal deterministic finite-state automaton (DFA) from a training set of strings. The training set is comprised of positive and negative strings, which respectively do and do not belong to the regular language recognised by the target automaton.

This problem is NP-hard and is typically tackled by means of state-merging algorithms. The algorithms in this family all depend on the deterministic state-merging operation, which combines two states in a DFA to create a new, smaller automaton. These algorithms can be broadly classified as non-monotonic (e.g., SAGE, EdBeam, DFA SAT, automata teams) or monotonic (e.g., EDSM, blue-fringe, RPNI), which respectively do and do not allow backtracking. When running, these algorithms perform many millions of merges, with non-monotonic algorithms performing significantly more merges than monotonic ones. In both cases, the deterministic state-merging operation is a significant bottleneck.

This project was motivated by the need to alleviate this bottleneck through a faster state-merging operation. Achieving this would help researchers tackle harder problems, run more sophisticated heuristics, and perform more meaningful analyses by running their tests on larger, more statistically significant pools of problems.

With the main goal of identifying the fastest implementation, this project examined a number of implementations of the state-merging operation, using state-of-the-art DFA learning algorithms on a large pool of problems. To achieve this, existing frameworks such as FlexFringe and StateChum were investigated in order to evaluate typical state-merging implementations. This involved studying the extent to which it might be possible to exploit concurrency to perform many merges in parallel, and building a novel GPU implementation of the merge operation. Additionally, the process entailed deep profiling and various optimisation techniques related to memory-access patterns to further enhance performance. Finally, an equivalent solution was developed in multiple programming languages to minimise any falsely perceived performance increase that could be attributed to a better optimising compiler.
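
For context, the sketch below is an illustrative reconstruction of the deterministic merge-and-fold operation that such implementations optimise: two states are merged, transitions are redirected, and any non-determinism introduced is resolved by recursively merging the conflicting successors. It is not the project's optimised (or GPU) implementation, and the dictionary-based representation is an assumption.

```python
# Illustrative reconstruction of the deterministic merge-and-fold operation.
# A DFA is stored as {"delta": {state: {symbol: target}}, "accepting": {state: bool}};
# states absent from "accepting" are unlabelled. Not the project's optimised code.
from copy import deepcopy

def merge(dfa, q1, q2):
    """Merge state q2 into q1; return the new DFA, or None if the merge is invalid."""
    dfa = deepcopy(dfa)                 # work on a copy so a failed merge can be discarded
    return dfa if _fold(dfa, q1, q2) else None

def _fold(dfa, q1, q2):
    label1, label2 = dfa["accepting"].get(q1), dfa["accepting"].get(q2)
    if label1 is not None and label2 is not None and label1 != label2:
        return False                    # accepting/rejecting conflict: merge rejected
    if label2 is not None:
        dfa["accepting"][q1] = label2
    dfa["accepting"].pop(q2, None)
    # Redirect every transition that pointed at q2 towards q1.
    for edges in dfa["delta"].values():
        for symbol, target in edges.items():
            if target == q2:
                edges[symbol] = q1
    # Fold q2's outgoing transitions into q1; recursively merge conflicting successors.
    for symbol, target in dfa["delta"].pop(q2, {}).items():
        existing = dfa["delta"].setdefault(q1, {}).get(symbol)
        if existing is None:
            dfa["delta"][q1][symbol] = target
        elif existing != target and not _fold(dfa, existing, target):
            return False
    return True

# Toy example mirroring Figure 1: merging states 4 and 5 forces their
# successors (1 and 2) to be merged recursively as well.
dfa = {
    "delta": {0: {"a": 4, "b": 5}, 4: {"a": 1}, 5: {"a": 2}, 1: {}, 2: {}},
    "accepting": {1: True, 2: True},
}
print(merge(dfa, 4, 5))
```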

The implementation was evaluated and benchmarked on a large pool of Abbadingo-style problems. Abbadingo One is a DFA learning competition that took place in 1997 and was organised with the goal of finding better DFA learning algorithms. These problems are suitable for testing because the typical algorithms used on Abbadingo problems tend to be simple to implement and rely heavily on deterministic merge performance. The motivation behind this project was not to seek better solutions than those submitted by the winners of the competition, but to find their solutions faster.

At the evaluation stage, the implementation presented in this project proved to be significantly faster than a naïve one. The operation was tested using three DFA learning frameworks, namely: windowed breadth-first search, exhaustive search, and blue-fringe. The preliminary results indicated that the improved state-merging operation was three times as fast as the baseline.

Figure 1. A valid merge between states 4 and 5; the successors of the two states are recursively merged

Assisting a search and rescue mission for lost people using a UAV

MICHAEL AZZOPARDI | SUPERVISOR: Dr Conrad Attard | COURSE: B.Sc. IT (Hons.) Software Development

Search and rescue (SAR) missions on land are still largely carried out on foot or by manned aircraft, such as planes and helicopters. Both methods demand a significant amount of time and are not cost-efficient. This is particularly significant when considering that the main challenge in such operations is the time needed to react and take the required action.

New developments in unmanned aerial vehicle (UAV) technology could help tackle this problem through the use of drones and aerial photography. This study exploited this technology, combined with shortest-path and object-detection algorithms, to reduce the mission duration and the risk of injury to the parties involved.

In preparation for devising a solution to the aforementioned problem, existing research on UAV/drone technology and SAR missions on land was studied carefully. Particular attention was given to research focusing on lost persons living with dementia. Models and prototypes were formulated prior to development, on the basis of this research.

Figure 1. Architecture diagram of the SAR mobile application

An Android mobile application was developed to simplify the communication between a DJI drone and the operator, by making use of the DJI Mobile Software Development Kit (SDK). Given the time constraint when searching for a lost individual with dementia during a SAR mission, a shortest-path algorithm was implemented to aid the operator in navigating the drone from one waypoint to another, depending on prioritisation and the probability of finding the lost person. An object-detection algorithm received the images captured by the drone to detect persons at multiple points along the route. A separate Android mobile application was developed to efficiently gather data on the SAR mission, including personal information and locations that could prove vital during the mission. Both mobile applications used the same Firebase Realtime Database to collect and utilise the mission information.
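
As an illustration of the routing step only, the sketch below applies Dijkstra's algorithm over a small waypoint graph whose edge costs combine flight distance with a penalty for waypoints that are unlikely to contain the missing person. The actual application runs on Android through the DJI Mobile SDK, so this Python fragment and its weighting scheme are assumptions intended purely to convey the idea.

```python
# Conceptual sketch of the routing step: Dijkstra's algorithm over a waypoint
# graph whose edge cost mixes flight distance with a penalty for waypoints that
# are unlikely to contain the missing person. The weighting scheme is an assumption.
import heapq

def plan_route(graph, probability, start, goal, alpha=0.5):
    """graph: {waypoint: {neighbour: distance_km}}; probability: {waypoint: p in [0, 1]}."""
    frontier = [(0.0, start, [start])]
    visited = set()
    while frontier:
        cost, node, path = heapq.heappop(frontier)
        if node == goal:
            return cost, path
        if node in visited:
            continue
        visited.add(node)
        for neighbour, distance in graph.get(node, {}).items():
            if neighbour not in visited:
                # Favour likely locations by discounting high-probability waypoints.
                step = distance + alpha * (1.0 - probability.get(neighbour, 0.0))
                heapq.heappush(frontier, (cost + step, neighbour, path + [neighbour]))
    return float("inf"), []

# Four waypoints with pairwise distances (km) and probabilities of finding the person.
graph = {"A": {"B": 1.2, "C": 2.0}, "B": {"C": 0.8, "D": 1.5}, "C": {"D": 1.0}, "D": {}}
probability = {"A": 0.1, "B": 0.7, "C": 0.3, "D": 0.9}
print(plan_route(graph, probability, "A", "D"))
```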

Figure 2. Prototypes of the mobile applications

Analysis of police-violence records through text mining techniques

CHRISTINA BARBARA | SUPERVISOR: Dr Joel Azzopardi | CO-SUPERVISOR: Mr Nicholas Mamo | COURSE: B.Sc. IT (Hons.) Artificial Intelligence

This research applies data mining techniques to the ‘Mapping Police Violence’ dataset, which provides information on every individual killed by police in the USA since 2013. Knowledge related to police violence is extracted by profiling typical violence victims, analysing violence across different states, and predicting the trend such incidents follow.

The first task in this study involved profiling the victims, which was tackled by clustering the data and identifying the main types of reports. The typical victim belonging to each cluster was then extracted. This was done using different clustering algorithms, namely: k-means, k-medoids, and self-organising maps. The generated profiles were validated by observing how many killings in the dataset were accurately described by the different profiles.
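
A minimal sketch of this profiling step, assuming k-means from scikit-learn and a handful of illustrative attribute columns, is given below; the typical victim of each cluster is read off as the modal value of every attribute.

```python
# Sketch of the profiling step: one-hot encode assumed victim attributes,
# cluster with k-means, and describe each cluster by the modal value of every attribute.
import pandas as pd
from sklearn.cluster import KMeans

data = pd.read_csv("mapping_police_violence.csv")          # hypothetical export of the dataset
fields = ["age_group", "gender", "race", "armed_status"]   # assumed profile attributes

X = pd.get_dummies(data[fields])                           # one-hot encoding
kmeans = KMeans(n_clusters=5, n_init=10, random_state=0)
data["cluster"] = kmeans.fit_predict(X)

# The "typical victim" of each cluster: the most frequent value of every attribute.
profiles = data.groupby("cluster")[fields].agg(lambda col: col.mode().iloc[0])
print(profiles)
```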

The second task was to cluster the data in each location separately. This helped establish the most common locations where such incidents took place, and the typical victim profiles within those locations.

The third task involved using regression techniques to predict the number of future police killings, based on information related to past incidents. This entailed considering information such as the unemployment rate in each state, to establish whether including such external information would help in accurately predicting the number of killings. The results were evaluated by comparing the predicted number to the actual number of killings that took place.
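
A simple way to frame this, sketched below under the assumption of a linear regressor and a per-state yearly aggregation, is to regress each state's yearly count on the previous year's count plus the unemployment rate; the study's actual regression technique and feature set are not detailed here.

```python
# Sketch of the trend-prediction step: regress each state's yearly number of
# killings on the previous year's count plus the unemployment rate.
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error

yearly = pd.read_csv("killings_by_state_year.csv")   # hypothetical aggregated table
yearly = yearly.sort_values(["state", "year"])
yearly["previous_killings"] = yearly.groupby("state")["killings"].shift(1)
yearly = yearly.dropna()

features = ["previous_killings", "unemployment_rate"]
train = yearly[yearly["year"] < 2018]                 # assumed train/test split by year
test = yearly[yearly["year"] == 2018]

model = LinearRegression().fit(train[features], train["killings"])
predicted = model.predict(test[features])
print("Mean absolute error:", mean_absolute_error(test["killings"], predicted))
```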

This research could be extended by employing hierarchical clustering, thus allowing a more detailed examination of the generated profiles. Additionally, it would be possible to perform clustering by focusing on the years in which the killings took place, so as to follow how the generated profiles change over the years. The analyses performed in this study could also be applied to datasets that focus on more general police interactions ‒ such as stop-and-search data ‒ to observe whether there might be any similarities between the analyses of the different sets of data.

Figure 1. Sample victim profile

Figure 2. States with the highest number of killings with respect to population numbers (2013-2018). Colours range from blue (min) to yellow (max).

Using COVID-19 pandemic sentiment and machine learning to predict stock-market price direction

LUKE BEZZINA | SUPERVISOR: Prof. John M. Abela | COURSE: B.Sc. IT (Hons.) Computing and Business

The buying and selling of financial instruments, such as stocks and bonds, has long been an essential activity in maximising investors’ wealth. Stock markets, one of which is the New York Stock Exchange, facilitate this trading activity. A pertinent question in this field is whether particular securities will increase or decrease in value in the foreseeable future. Changes in the value of an equity can be described through a candlestick chart, as per Figure 1.

In financial trading, an approach that has recently gained traction is algorithmic trading, i.e., the buying and selling of financial instruments by means of algorithms. This has been made possible by exponential improvements in computational speed, along with the introduction of diverse machine learning (ML) algorithms. This study exploits ML to predict the general price direction of securities over the three days following each day in the dataset range, a task referred to as time-series forecasting.
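
By way of illustration, a three-day-ahead direction label could be derived from daily closing prices as in the pandas sketch below; the exact labelling rule used in the study is an assumption.

```python
# Sketch of deriving a three-day-ahead direction label from daily closing prices.
import pandas as pd

prices = pd.read_csv("stock_prices.csv", parse_dates=["date"])   # hypothetical file
prices = prices.sort_values("date")

# 1 if the close three trading days ahead is higher than today's close, else 0.
prices["direction_3d"] = (prices["close"].shift(-3) > prices["close"]).astype(int)
prices = prices.iloc[:-3]                # drop the last rows, which have no look-ahead
print(prices[["date", "close", "direction_3d"]].tail())
```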

The proposed solution uses data for US-domiciled stocks found in the S&P 500 index, an indicator representative of the 500 largest US-listed companies. Two stocks per S&P 500 sector were reviewed, with the purpose of obtaining a fair representation of the US market. A baseline artificial neural network (ANN) model and a long short-term memory (LSTM) model were used in parallel, as illustrated in Figure 2. The latter model has recently become popular in time-series problems owing to its ability to retain information from previous time steps.
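
A minimal Keras sketch of such an LSTM classifier over sliding windows of daily features is shown below; the window length, layer sizes and feature count are assumptions rather than the architecture of Figure 2.

```python
# Minimal Keras sketch of an LSTM direction classifier over sliding windows of
# daily features; window length, layer sizes and feature count are assumptions.
import numpy as np
from tensorflow.keras import Sequential
from tensorflow.keras.layers import LSTM, Dense

window, n_features = 20, 5                # e.g. 20 days of OHLC + volume per sample

model = Sequential([
    LSTM(32, input_shape=(window, n_features)),
    Dense(1, activation="sigmoid"),       # probability that the price rises over the next 3 days
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])

# X: (samples, window, n_features) sliding windows; y: 0/1 direction labels.
X = np.random.rand(1000, window, n_features)      # placeholder data for illustration only
y = np.random.randint(0, 2, size=1000)
model.fit(X, y, epochs=5, batch_size=32, validation_split=0.2)
```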

The COVID-19 pandemic has affected businesses globally. Hence, this work also attempts to identify whether value, in terms of prediction accuracy, can be derived from the sentiment towards the pandemic. Google Trends data involving pandemic-related terminology was used to derive additional features for use within an LSTM model, providing an effective comparison between the model implementations in this study.
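
One possible way to build such features, sketched below using the pytrends client (the study only states that Google Trends data was used, so this client and the chosen search terms are assumptions), is to align search-interest values with the daily price series and forward-fill the gaps.

```python
# Sketch of adding pandemic-related Google Trends interest as extra features,
# here via the pytrends client; search terms and alignment are assumptions.
import pandas as pd
from pytrends.request import TrendReq

pytrends = TrendReq()
pytrends.build_payload(["coronavirus", "covid"], timeframe="2020-01-01 2020-12-31")
trends = pytrends.interest_over_time().drop(columns=["isPartial"])

prices = pd.read_csv("stock_prices.csv", parse_dates=["date"]).set_index("date")
# Google Trends returns weekly values over long ranges; forward-fill them to daily.
features = prices.join(trends, how="left").ffill()
print(features.head())
```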

Figure 1. Candlestick chart of the S&P 500 index for December 2020

Figure 2. Architecture of the proposed stock prediction algorithm
