Research and Innovation
Graph-based Management of Neuroscience Data: Representation, Integration and Analysis

by Maren Parnas Gulnes (University of Oslo / SINTEF AS), Ahmet Soylu (OsloMet – Oslo Metropolitan University) and Dumitru Roman (SINTEF AS)

Advances in technology have allowed the amount of neuroscience data collected during brain research to increase significantly over the past decade. Neuroscience data is currently spread across a variety of sources, typically provisioned through ad-hoc and non-standard approaches and formats, and it often has no connection to other relevant data sources. This makes it difficult for researchers to understand and use neuroscience and related data. A graph-based approach could make the data more accessible.

A graph-based approach for representing, analysing, and accessing brain-related data [1] could be used to integrate various disparate data sources and improve the understandability and usability of neuroscience data. Graph data models and associated graph database management systems provide performance, flexibility, and agility, and open up the possibility of using well-established graph analytics solutions; however, there is limited research on graph-based data representation as a mechanism for the integration, analysis, and reuse of neuroscience data.

We applied our proposed approach to a unique dataset of quantitative neuroanatomical data about the murine basal ganglia, a group of nuclei in the brain essential for processing information related to movement. The murine basal ganglia dataset consists of quantitative neuroanatomical data about the basal ganglia of healthy rats and mice, collected from more than 200 research papers and data repositories [2]. The dataset contains three distinct information types: quantitations (counts), distributions, and cell morphologies. The counts and distributions relate to either entire cells or specific parts of the cell, while the morphologies describe the cell's physical structure. The dataset's primary purpose is to let researchers find and compare neuroanatomical information about the basal ganglia brain regions.

Figure 1: An overview of initiatives investigated for overlap with the murine basal ganglia dataset.

To identify datasets that overlap with the murine basal ganglia dataset for integration purposes, we evaluated a set of related data sources, including repositories, atlases, and publicly available data, against the following criteria: (i) serves data programmatically; (ii) contains data related to the basal ganglia; and (iii) provides data that could be connected to the murine basal ganglia dataset. Figure 1 summarises the results of our investigation; Brain Architecture Management System (BAMS) [L1], InterLex [L2], and NeuroMorpho.Org [L3] matched the specified criteria.

We designed and implemented a graph model for the murine basal ganglia dataset and migrated the data from the relational database into a NoSQL graph database [3]. Further, we designed and implemented the integration of data from
Figure 2: (a) Relationship between the dataset analyses and the sex and (b) the dataset analyses with related nodes.
ERCIM NEWS 125 April 2021