related neuroscience data sources and the technical solution for graph-based data analysis. To provide web-based access to the graph data, we designed and implemented a web application and an API. The main components of our approach were: (i) a common graph model based on a native graph database management system; (ii) integration of the identified external datasets through an extract-transform-load process; (iii) graph analytics based on existing graph algorithms and visualisation methods; and (iv) a web-based interface for accessing the data through the graph model. The database generated in this study consists of 9,539 distinct nodes with 46 distinct node labels, 29,807 distinct relationships, and 66 distinct relationship types [L4].

We conducted exploratory and confirmatory analyses on the data. The former aims to obtain general information about the data, the latter to answer specific questions. For the exploratory analysis, we used community detection algorithms (Label Propagation and Louvain) to investigate the structure of the graph, a centrality algorithm (PageRank) to find influential nodes, a node similarity algorithm (available in Neo4j) to compare nodes, and graph visualisations (ForceAtlas2) to investigate the general data structure. A consultation with a neuroscience expert indicated that the results include already known findings as well as interesting expected and unexpected ones. For example, Figure 2 (a) shows that males (largest green node) are studied far more often than females.

In the confirmatory analysis, we used a node similarity algorithm to find analyses that are similar with respect to a specific criterion. For example, we searched for studies (i.e., analyses) that investigated the same cell type in the same brain region and with the same object of interest. Figure 2 (b) presents the studies (in orange) in the dataset connected to the specified nodes and species.
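The confirmatory search described above can be illustrated with a small, self-contained sketch. It assumes a Jaccard-style similarity over the sets of nodes each analysis is linked to; the text does not state which metric was used, and all analysis names, node labels, and the threshold below are hypothetical toy values rather than data from the study.

```python
from itertools import combinations

# Hypothetical toy data: each analysis maps to the set of nodes it is linked to
# (cell type, brain region, object of interest). None of this comes from the study.
analyses = {
    "analysis_1": {"cell_type:neurons", "region:striatum", "interest:cell_count"},
    "analysis_2": {"cell_type:neurons", "region:striatum", "interest:cell_count"},
    "analysis_3": {"cell_type:neurons", "region:globus_pallidus", "interest:cell_count"},
    "analysis_4": {"cell_type:astrocytes", "region:striatum", "interest:morphology"},
}


def jaccard(a, b):
    """Jaccard similarity: size of the intersection divided by size of the union."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def similar_pairs(neighbours, threshold):
    """Return (left, right, score) for every pair scoring at least `threshold`."""
    pairs = []
    for left, right in combinations(sorted(neighbours), 2):
        score = jaccard(neighbours[left], neighbours[right])
        if score >= threshold:
            pairs.append((left, right, score))
    # Highest-scoring pairs first.
    return sorted(pairs, key=lambda pair: -pair[2])


# A threshold of 1.0 keeps only analyses that share cell type, region, and
# object of interest, which mirrors the "same in all three" criterion.
matches = similar_pairs(analyses, threshold=1.0)
print(matches)
```

Lowering the threshold would also return partially overlapping analyses, for example those that share the cell type and object of interest but differ in brain region.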
In Figure 2 (b), the yellow nodes represent the two species in the dataset, and the central node is the cell type “neurons”.

The results and our experience in representing, integrating, and analysing basal ganglia data show that a graph-based approach can be an effective solution, and that it should be considered further for managing various types of neuroscience data.

Links:
[L1] https://bams1.org
[L2] https://scicrunch.org/scicrunch/interlex/dashboard
[L3] http://neuromorpho.org
[L4] https://github.com/marenpg/jupyter_basal_ganglia

References:
[1] R. Angles and C. Gutiérrez: “Survey of graph database models”, ACM Computing Surveys, vol. 40, no. 1, pp. 1–39, 2008.
[2] I.E. Bjerke, et al.: “Database of literature derived cellular measurements from the murine basal ganglia”, Scientific Data, vol. 7, no. 1, pp. 1–14, 2020.
[3] R. Cattell: “Scalable SQL and NoSQL data stores”, SIGMOD Record, vol. 39, no. 4, pp. 12–27, 2011.

Please contact:
Dumitru Roman, SINTEF AS, Norway
dumitru.roman@sintef.no

ERCIM NEWS 125 April 2021
The ICARUS Ontology: A General Aviation Ontology

by Joanna Georgiou, Chrysovalantis Christodoulou, George Pallis and Marios Dikaiakos (University of Cyprus)

A key challenge in the aviation industry is managing aviation data, which are complex and often derived from heterogeneous data sources. The ICARUS ontology is a domain-specific ontology that addresses this challenge by enhancing the semantic description and integration of the various ICARUS platform assets.

The current digital revolution has heavily influenced the way we manage data. In recent years, nearly every human activity, process, and interaction has been digitised, resulting in an exponential increase in the amount of data produced. There is huge potential for useful information to be extracted from this data, perhaps enabling us to discover new ways of optimising processes, find innovative solutions, and improve decision-making. However, managing enormous amounts of data from numerous, heterogeneous sources that do not share common schemas or standards poses many challenges: data integration and linking remain a major concern.

Integrating and linking data can be expensive, and the effort involved is often underestimated. This is especially true for small and medium-sized enterprises (SMEs), which may lack the required expertise and be unable to invest the time and resources needed to understand and share the digital information derived from their operational systems. Data models are usually designed to handle information by encoding its structure, format, constraints, and relationships with real-world entities.

The challenge of managing big data is apparent in various industry domains, including the aviation industry. Unfortunately, aviation data providers use very different data models that can vary across several dimensions [1], such as data encoding format, data field naming, data semantics, spatial and temporal resolution, and measurement unit conventions.
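To make these dimensions concrete, the following minimal sketch maps records from two imaginary providers onto one shared schema. Everything in it is an assumption for illustration: the providers, field names, timestamp formats, and the `FlightRecord` target schema are hypothetical and are not taken from the ICARUS platform or its ontology.

```python
from dataclasses import dataclass
from datetime import datetime, timezone

FEET_TO_METRES = 0.3048  # exact by definition of the international foot


@dataclass(frozen=True)
class FlightRecord:
    """Hypothetical common schema: UTC timestamps and SI units."""
    flight_id: str
    departure_utc: datetime
    altitude_m: float


def from_provider_a(raw: dict) -> FlightRecord:
    # Hypothetical provider A: camelCase field names, ISO 8601 timestamps
    # that carry a UTC offset, and altitude in feet.
    return FlightRecord(
        flight_id=raw["flightNo"],
        departure_utc=datetime.fromisoformat(raw["dep_time"]).astimezone(timezone.utc),
        altitude_m=raw["alt_ft"] * FEET_TO_METRES,
    )


def from_provider_b(raw: dict) -> FlightRecord:
    # Hypothetical provider B: upper-case field names, Unix epoch seconds,
    # and altitude already in metres.
    return FlightRecord(
        flight_id=raw["FLIGHT_ID"],
        departure_utc=datetime.fromtimestamp(raw["departure_epoch"], tz=timezone.utc),
        altitude_m=float(raw["altitude_m"]),
    )


# The same flight, described in two different ways.
record_a = from_provider_a(
    {"flightNo": "XY123", "dep_time": "2021-04-01T10:30:00+02:00", "alt_ft": 35000}
)
record_b = from_provider_b(
    {"FLIGHT_ID": "XY123", "departure_epoch": 1617265800, "altitude_m": 10668}
)
print(record_a)
print(record_b)
```

Hand-written adapters of this kind grow with every new provider, which is one reason a shared, formally defined vocabulary is attractive for integration.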
To enhance the integration of data in a data platform for the aviation industry, we designed and introduced the ICARUS ontology [2]. The ICARUS [L1] platform helps stakeholders connected to the aviation industry by providing them with a system to share, collect, or exchange datasets, knowledge, and skills. Through these services, the ICARUS project seeks to help stakeholders gain better perspectives, optimise operations, and increase customer safety and satisfaction. Additionally, it aims to offer a variety of user-friendly assets, such as datasets, algorithms, usage analytics, and tools. To manage the complex and dynamic evolution of these assets, we sought to develop an ontology that describes the various information resources that could be integrated and managed by the ICARUS platform. Ontologies can act as valuable tools to describe and define, in a formal and