MorphoTest
DAVID VASSALLO | SUPERVISOR: Dr Claudia Borg COURSE: B.Sc. IT (Hons.) Artificial Intelligence
Learning a language is always a challenging task that requires a substantial amount of time and dedication before any progress can be seen. This holds true for the Maltese language, and for Maltese grammar in particular. Maltese has a ‘mixed’ grammar, which is influenced by its origins. For instance, words of Semitic origin follow a root-and-pattern conjugation system, whilst words of Romance origin follow a stem-and-affixation pattern. Both children and adults learning the language often find that they need to memorise which system applies to which set of words.
When compared to other languages, Maltese is considered a low-resource language, meaning that there is a lack of resources available to process Maltese computationally. This is also true in terms of educational resources that could assist Maltese-language learners in making progress. The main aim of this project is to investigate how to utilise existing natural language processing (NLP) tools to facilitate the creation of a language-learning tool for Maltese. Due to the richness of Maltese morphology (i.e., the structure of its words and the way in which they interact), the research seeks to create an application that could assist language learners to practise this grammatical aspect of the language.
The language-learning sector is vast, and nowadays many smartphone applications seek to aid language learning. However, many of these applications do not necessarily make use of NLP tools to the best advantage.
One of these applications is WordBrick [1] and it seeks to tackle the difficulty of independent language learning by displaying a jumble of words, presented in different shapes and colours, requiring the user to rearrange them to form a proper sentence. Echoing jigsaw puzzles, this is achieved by having connectors attached to the word that have a specific shape, where only another word with that shape could be joined to it.
Inspired by WordBrick, this project allows learners to build words from their different components (morphemes) and to study the meaning of each component. To achieve this, we take advantage of Ġabra [2], an open-source lexicon for Maltese. The first step was to automatically segment words into their components and to associate a label with each component. This task is referred to as morphological analysis. The components would then be presented to the user, jumbled up, and the user would have to join the pieces together again in the right order to produce the word. The language-learning component would then determine which words should be presented to which learners, according to their level. The type of exercise offered could also be varied by reversing the process, asking the learner to segment a word and to attach a meaning to each of its parts.
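The jumbled-morpheme exercise described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the segmentation of "jiktbu" and its labels are hand-written for the example and are not output from Ġabra, and the function names `make_exercise` and `check_answer` are hypothetical.

```python
import random

# Hand-made, illustrative segmentation (assumption, not Ġabra output):
# each entry is a (morpheme, label) pair, in the correct order.
SEGMENTS = [("j", "prefix"), ("iktb", "stem"), ("u", "suffix")]

def make_exercise(segments, rng):
    """Return the morphemes in a jumbled order that differs from the answer."""
    pieces = [morpheme for morpheme, _ in segments]
    jumbled = pieces[:]
    # Only reshuffle when a different ordering is actually possible.
    if len(set(pieces)) > 1:
        while jumbled == pieces:
            rng.shuffle(jumbled)
    return jumbled

def check_answer(segments, attempt):
    """An attempt is correct if joining its pieces produces the target word."""
    target = "".join(morpheme for morpheme, _ in segments)
    return "".join(attempt) == target

if __name__ == "__main__":
    rng = random.Random(0)
    print(make_exercise(SEGMENTS, rng))
    print(check_answer(SEGMENTS, ["j", "iktb", "u"]))
```

Reversing the exercise, as the text suggests, would amount to showing the joined word and asking the learner to supply the list of pieces and their labels.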
The developed application demonstrates how NLP techniques could assist Maltese-language learners. The main aim of the application is to provide a basis for the development of further exercises that use NLP as their backbone, allowing teachers to create content for exercises more easily and with more diversity.
Figure 1. A screenshot of WordBrick displaying how words are presented initially and then rearranged
REFERENCES
[1] M. Purgina, M. Mozgovoy, and J. Blake, “WordBricks: Mobile Technology and Visual Grammar Formalism for Gamification of Natural Language Grammar Acquisition”, in Journal of Educational Computing Research, vol. 58, pp. 126–159, Mar. 2020. Publisher: SAGE Publications Inc.
[2] John J. Camilleri. “A Computational Grammar and Lexicon for Maltese”, M.Sc. Thesis. Chalmers University of Technology. Gothenburg, Sweden, September 2013.
Predicting links in a social network based on recognised personalities
ANDREW AQUILINA | SUPERVISOR: Dr Charlie Abela COURSE: B.Sc. IT (Hons.) Artificial Intelligence
Link prediction has become a ubiquitous presence in online social networks (OSNs). A prominent example is the ‘People You May Know’ feature on Facebook. In information-oriented OSNs, such as Twitter, users choose whom to follow, on the basis of a number of factors. Although personality is one of the primary determiners influencing our social relationships, its impact within OSNs is often overlooked.
Personality could be represented by the ‘Big Five’ model, which identifies five dimensions: Openness to Experience, Conscientiousness, Extraversion, Agreeableness, and Neuroticism. It has been found that explicitly surveying users for these traits is not a viable option. However, by employing the relationship between language and personality, as reported by Mairesse et al. [1], a user’s personality could be recognised without the need to resort to user surveys.
This research focused on extracting personality traits from textual content, and observing how these would relate to the established links between OSN users. In order to achieve this, we considered state-of-the-art research and developed personality recognition and link prediction components.
A basic representation of the personality recognition component is offered in Figure 1. Different approaches were investigated in line with existing research, comparing a variety of machine learning models and optimisation techniques. These models were trained on datasets containing users’ micro-blog postings and their Big Five trait values. The best-performing models managed to recognise personality from text alone, achieving results that are comparable with the state of the art.
The effect of user personality on a real-life Twitter network was studied by applying the above-mentioned personality recogniser. Although correlations between the personality of followers and followees were found, it was also observed that users had their own implicit personality preferences in terms of whom they follow. An example is outlined in Figure 2. The Personality-Aware Link Prediction Boosting (PALP-Boost) component takes these preferences into account and improves accuracy across various topological-based link-prediction algorithms, highlighting the utility of the underlying psychological attributes within OSNs.
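The idea of boosting a topological score by personality proximity can be illustrated with a small Python sketch. It is not the PALP-Boost formulation, which the text does not give. It assumes, purely for illustration, that a user's preference is the mean Big Five vector of their current followees, that the topological score is a Jaccard overlap of followee sets, and that the boost is controlled by a hypothetical weight `alpha`.

```python
import math

def jaccard(graph, u, v):
    """Topological score: overlap between the followee sets of u and v."""
    a, b = graph.get(u, set()), graph.get(v, set())
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def preference(graph, traits, u):
    """Assumed preference: mean Big Five vector of u's current followees."""
    vectors = [traits[f] for f in graph.get(u, set()) if f in traits]
    if not vectors:
        return None
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]

def boosted_score(graph, traits, u, v, alpha=0.5):
    """Scale the topological score up when v lies close to u's preference."""
    base = jaccard(graph, u, v)
    pref = preference(graph, traits, u)
    if pref is None or v not in traits:
        return base  # no personality information: fall back to topology only
    proximity = 1.0 / (1.0 + math.dist(pref, traits[v]))  # in (0, 1]
    return base * (1.0 + alpha * proximity)
```

Here `graph` maps each user to the set of accounts they follow and `traits` maps each user to a five-element list of trait values. Two candidates with identical topological scores are then ranked by how closely their personality matches the preference inferred for the user.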
Figure 1. Input-output of the personality recognition component.
Figure 2. A user’s followee personality preferences and the PALP-Boost scores of potential followees, based on their proximity to such preferences. For the sake of visual clarity, only two of the Big Five dimensions are shown.
REFERENCES
[1] Mairesse, F., Walker, M., Mehl, M. and Moore, R., 2007. Using Linguistic Cues for the Automatic Recognition of Personality in Conversation and Text. Journal of Artificial Intelligence Research, 30, pp.457-500.
MATTHIAS ATTARD | SUPERVISOR: Dr Sandro Spina COURSE: B.Sc. (Hons.) Computing Science
The creation of good animation sequences is a lengthy process, which is typically carried out either using a motion-capture system or directly by content creators on 3D modelling software. To increase realism in a virtual scene, a considerable number of animations (e.g., for walking) would need to be created. This project attempts to address this problem by proposing a system which, given a set of animations, would be able to generate variations of these while preserving the purpose of the actions being animated.
Essentially, the process required mapping different humanoid animation sequences into a latent space using different clustering algorithms, and then proceeding to create variations by combining different animations and varying the influence of the chosen animations for the new animation. This would then result in new variations influenced by other animations. The dataset of animation sequences that was mapped into the latent space was sourced from the Carnegie-Mellon Graphics Lab Motion Capture Database. This consists of hundreds of different humanoid animations, such as walking, dancing, jumping and other actions that involve hand movement while sitting and squatting.
The latent space created holds the mapped animations in such a way that similar animations are closer to each other. The mapping function considers a feature vector extracted from the animation sequence. The features that were considered include the position and rotation of each component of the human body from 60 evenly spaced frames, the muscle values of 60 evenly spaced frames, and the distribution of translation and rotation values of each human body component throughout the whole animation. This was done to observe which features would best create groups in the latent space, such that animations within a group would be closer to one another than to animations in other groups.
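Sampling 60 evenly spaced frames and flattening them into one feature vector can be sketched as follows. This is a minimal Python illustration, not the project's mapping function: the per-frame layout (a flat list of floats per frame) and the names `evenly_spaced_indices` and `feature_vector` are assumptions.

```python
def evenly_spaced_indices(n_frames, n_samples=60):
    """Pick n_samples frame indices spread evenly from first to last frame."""
    if n_frames < 1:
        raise ValueError("animation must contain at least one frame")
    if n_samples == 1:
        return [0]
    step = (n_frames - 1) / (n_samples - 1)
    return [round(i * step) for i in range(n_samples)]

def feature_vector(frames, n_samples=60):
    """Concatenate the values of the sampled frames into one flat vector.

    `frames` is assumed to be a list of per-frame lists of floats,
    e.g. positions and rotations of each body component.
    """
    vector = []
    for index in evenly_spaced_indices(len(frames), n_samples):
        vector.extend(frames[index])
    return vector
```

Because every animation is reduced to the same number of sampled frames, sequences of different lengths yield vectors of equal size, which is what a distance-based grouping in the latent space requires.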
The groups formed in the latent space were identified in such a way as to produce a hierarchical structure. In practice, the space was divided into large groups; these were further divided, until no more subgroups could be identified. In the system, the first groups identified were divided in terms of movement speed and direction of the whole body. One of these groups, in particular, was further subdivided according to where the hand movement was occurring.
The new animations were created through Unity, the game engine. The humanoid animations were imported into Unity according to how they were grouped in the latent space. The user could then specify which groups should be assigned to the different limbs, together with the strength of their influence. The system then creates all possible combinations with the animations in the chosen groups, allowing the user to proceed to choosing the preferred variations.
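The enumeration of all possible combinations can be pictured with a short sketch. It is written in plain Python rather than against Unity's API, and the group names, clip names, limb names and the function `enumerate_variations` are all hypothetical.

```python
from itertools import product

def enumerate_variations(groups, limb_assignment):
    """List every combination of clips for the chosen limb-to-group mapping.

    groups: group name -> list of animation clip names in that group.
    limb_assignment: limb name -> (group name, influence strength in [0, 1]).
    """
    limbs = sorted(limb_assignment)
    choices = [groups[limb_assignment[limb][0]] for limb in limbs]
    variations = []
    for combo in product(*choices):
        # Each variation records, per limb, the chosen clip and its influence.
        variations.append(
            {limb: (clip, limb_assignment[limb][1])
             for limb, clip in zip(limbs, combo)}
        )
    return variations
```

The number of variations is the product of the sizes of the chosen groups, so larger groups quickly produce many candidates for the user to choose from.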
The correctness and quality of the results obtained were evaluated through an online survey, where 120 participants were asked to rate animations that included some that were not created by the system. Results show that some variations performed better than those of the motion-capture library. Additionally, the range of the overall ratings was not that wide, which suggests that the variations mixed well with existing animation sequences.
Figure 1. Merging of basketball dribbling and drinking (first 2 rows) into one animation sequence (third row)